• Title/Summary/Keyword: Data


Verification Algorithm for the Duplicate Verification Data with Multiple Verifiers and Multiple Verification Challenges

  • Xu, Guangwei;Lai, Miaolin;Feng, Xiangyang;Huang, Qiubo;Luo, Xin;Li, Li;Li, Shan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.2 / pp.558-579 / 2021
  • Cloud storage provides flexible data storage services that allow data owners to outsource their data remotely, reducing their data storage, operation, and management costs. Outsourced data, however, raise security concerns for the data owner because the cloud service provider may maliciously delete or corrupt them. Data integrity verification is an important way to check the integrity of outsourced data. Existing verification schemes only consider the case in which a single verifier launches multiple verification challenges, and neglect the overhead incurred when multiple verifiers launch challenges at around the same time. In this case, the duplicate data shared across multiple challenges are verified repeatedly, so verification resources are consumed in vain. We propose a duplicate data verification algorithm for multiple verifiers and multiple challenges that reduces this overhead. The algorithm dynamically schedules the verifiers' challenges based on verification time, extracts the frequent itemsets of duplicate verification data from the challenge sets by applying the FP-Growth algorithm, and computes batch proofs for those frequent itemsets. Each challenge is then split into two parts, duplicate data and unique data, according to the results of the extraction. Finally, the proofs of the duplicate data and the unique data are computed and combined to generate a complete proof for every original challenge. Theoretical analysis and experimental evaluation show that the algorithm reduces the verification cost and preserves the correctness of data integrity verification through flexible batch verification.
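
The core scheduling idea, extracting blocks that recur across concurrent challenges so their proofs are computed only once, can be illustrated with a minimal sketch. The code below is an illustration under assumed data structures (challenges as sets of block indices, a placeholder compute_proof function), not the authors' actual proof construction, and it uses a simple co-occurrence count rather than a full FP-Growth implementation.

```python
from collections import Counter

def split_and_prove(challenges, compute_proof):
    """Split each verifier's challenge into duplicate and unique blocks.

    challenges    : dict mapping verifier id -> set of challenged block indices
    compute_proof : placeholder for the proof routine (assumed to accept an
                    iterable of block indices)
    """
    # Count how many concurrent challenges request each block.
    occurrences = Counter(b for blocks in challenges.values() for b in blocks)
    duplicates = {b for b, n in occurrences.items() if n >= 2}

    # Compute one batch proof for the shared (duplicate) blocks.
    batch_proof = compute_proof(sorted(duplicates)) if duplicates else None

    proofs = {}
    for verifier, blocks in challenges.items():
        unique = blocks - duplicates
        unique_proof = compute_proof(sorted(unique)) if unique else None
        # Combine the reused batch proof with the per-challenge unique proof.
        proofs[verifier] = (batch_proof if blocks & duplicates else None, unique_proof)
    return proofs

# Example usage with a dummy proof function.
if __name__ == "__main__":
    demo = split_and_prove(
        {"v1": {1, 2, 3, 7}, "v2": {2, 3, 8}, "v3": {3, 9}},
        compute_proof=lambda blocks: f"proof({blocks})",
    )
    print(demo)
```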

Verification Control Algorithm of Data Integrity Verification in Remote Data sharing

  • Xu, Guangwei;Li, Shan;Lai, Miaolin;Gan, Yanglan;Feng, Xiangyang;Huang, Qiubo;Li, Li;Li, Wei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.2 / pp.565-586 / 2022
  • Cloud storage's elastic expansibility not only provides flexible services for data owners to store their data remotely, but also reduces the storage, operation, and management costs of data sharing. Data outsourced to the storage space of a cloud service provider, however, raise security concerns about data integrity. Data integrity verification has therefore become an important technology for checking the integrity of remotely shared data. However, allowing users without data access rights to verify data integrity causes unnecessary overhead for the data owner and the cloud service provider; in particular, malicious users who constantly launch verification requests greatly waste service resources. Since the data owner is a consumer purchasing cloud services, the owner bears both the cost of data storage and the cost of data verification. This paper proposes a verification control algorithm for data integrity verification of remotely outsourced data. It designs an attribute-based encryption verification control algorithm for multiple verifiers. The data owner and the cloud service provider jointly construct a common access structure and generate a verification sentinel to check the authority of verifiers against that access structure. Finally, since the cloud service provider cannot learn the access structure or the sentinel generation operation, it can only authenticate verifiers whose attributes satisfy the access policy to verify the integrity of the corresponding outsourced data. Theoretical analysis and experimental results show that the proposed algorithm achieves fine-grained access control over multiple verifiers for data integrity verification.
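
As a rough illustration of the access-control idea (not the cryptographic construction itself), the sketch below checks whether a verifier's attribute set satisfies a simple monotone access policy before verification is allowed. The policy format, the attribute names, and the satisfies helper are all assumptions made for this example.

```python
def satisfies(policy, attributes):
    """Check a verifier's attributes against a simple monotone access policy.

    policy     : nested tuples of the form ("AND", p1, p2, ...) /
                 ("OR", p1, p2, ...) / attribute string (hypothetical format)
    attributes : set of attribute strings held by the verifier
    """
    if isinstance(policy, str):          # leaf node: a single required attribute
        return policy in attributes
    op, *children = policy
    results = (satisfies(child, attributes) for child in children)
    return all(results) if op == "AND" else any(results)

# Example: only auditors from the owner's or a partner organisation may verify.
policy = ("AND", "role:auditor", ("OR", "org:owner", "org:partner"))
print(satisfies(policy, {"role:auditor", "org:partner"}))  # True
print(satisfies(policy, {"role:user", "org:owner"}))       # False
```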

Development of a method of the data generation with maintaining quantile of the sample data

  • Joohyung Lee;Young-Oh Kim
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.244-244 / 2023
  • Both the frequency and the magnitude of hydrometeorological extreme events such as severe floods and droughts are increasing. In order to prevent damage from such climatic disasters, hydrological models are often simulated under various meteorological conditions. In these simulations, synthetic data generated through time series models that maintain the key statistical characteristics of the sample data are widely applied. However, while synthetic data can easily maintain the average and the variance of the sample data, the quantiles are not maintained well. In this study, we propose a data generation method that preserves the quantiles of the sample data. The equations of the maintenance of variance extension (MOVE) are extended to maintain quantiles rather than the average or the variance of the sample data. The equations are derived and their coefficients are determined from the characteristics of the sample data that we aim to preserve. Monte Carlo simulation is used to assess the performance of the proposed method. A time series (length 500) is regarded as the sample data, and values are selected randomly from it to create a data set (length 30) for each simulation. The selected data set is then expanded from length 30 to 500 using the proposed method, and the differences in the average, the variance, and the quantiles between the sample data and the expanded data are evaluated with the relative root mean square error. The simulation results show that each equation preserves the characteristic it is designed to maintain, and that the expanded data preserve the quantiles of the sample data more precisely than data expanded through a conventional time series model.
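
The classical MOVE-type transfer equation extends a short record while matching a target mean and variance; the abstract's method modifies this idea to match quantiles instead. A minimal Monte Carlo sketch of the baseline variance-preserving form and the RRMSE comparison is shown below; the quantile-preserving coefficients of the proposed method are not reproduced here, and numpy plus a synthetic gamma-distributed "sample data" series are assumptions for illustration.

```python
import numpy as np

def move_extend(short, donor):
    """Baseline MOVE-style extension: rescale a donor series so that it
    matches the mean and standard deviation of the short sample."""
    return short.mean() + short.std(ddof=1) / donor.std(ddof=1) * (donor - donor.mean())

def rrmse(truth, estimate):
    """Relative root mean square error used to compare statistics."""
    return np.sqrt(np.mean(((estimate - truth) / truth) ** 2))

rng = np.random.default_rng(0)
population = rng.gamma(shape=2.0, scale=10.0, size=500)   # stand-in "sample data"

q_levels = [0.1, 0.5, 0.9]
q_errors = []
for _ in range(1000):                                     # Monte Carlo repetitions
    subset = rng.choice(population, size=30, replace=False)
    donor = rng.choice(population, size=500, replace=True)
    extended = move_extend(subset, donor)
    # Compare selected quantiles of the extended series with the sample data.
    q_errors.append(rrmse(np.quantile(population, q_levels),
                          np.quantile(extended, q_levels)))

print("mean RRMSE of quantiles:", float(np.mean(q_errors)))
```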


Development of a National Research Data Platform for Sharing and Utilizing Research Data

  • Shin, Youngho;Um, Jungho;Seo, Dongmin;Shin, Sungho
    • Journal of Information Science Theory and Practice / v.10 no.spc / pp.25-38 / 2022
  • Research data are data used or created in the course of research or experiments. Research data are very important for validating the research conducted and for use in future research and projects. Recently, convergence research across various fields and international cooperation have grown continuously, driven by the explosive increase of research data and the increasing complexity of science and technology. Developed countries are actively promoting open science policies that share research results and processes to create new knowledge and value through convergence research. Communities that promote the sharing and utilization of research data, such as RDA (Research Data Alliance) and COAR (Confederation of Open Access Repositories), are active, and various platforms for managing and sharing research data are being developed and used. OpenAIRE (Open Access Infrastructure for Research in Europe) in Europe, ARDC (Australian Research Data Commons) in Australia, and IRDB (Institutional Repositories DataBase) in Japan provide research data or related services. Korea has been establishing and implementing a national research data sharing and utilization strategy, led by the central government. Based on this strategy, KISTI has been building the Korean research data platform (DataON) since 2018 and has been providing research data sharing and utilization services to users since January 2020. This paper reviews the characteristics of DataON and shows how it is used for research through its applications.

An Empirical Study on the Effects of Source Data Quality on the Usefulness and Utilization of Big Data Analytics Results (원천 데이터 품질이 빅데이터 분석결과의 유용성과 활용도에 미치는 영향)

  • Park, Sohyun;Lee, Kukhie;Lee, Ayeon
    • Journal of Information Technology Applications and Management / v.24 no.4 / pp.197-214 / 2017
  • This study sheds light on source data quality in big data systems. Previous studies on big data success have called for further examination of quality factors and the importance of source data. This study extracted the quality factors of source data from the user's viewpoint and empirically tested the effects of source data quality on the usefulness and utilization of big data analytics results. Based on previous research and a focus group evaluation, four quality factors were established: accuracy, completeness, timeliness, and consistency. After setting up 11 hypotheses on how source data quality contributes to the usefulness, utilization, and ongoing use of big data analytics results, an e-mail survey was conducted at the level of independent departments using big data in domestic firms. The results of the hypothesis tests identified the characteristics and impact of source data quality in big data systems and yielded meaningful findings about big data characteristics.

A case study of ECN data conversion for Korean and foreign ecological data integration

  • Lee, Hyeonjeong;Shin, Miyoung;Kwon, Ohseok
    • Journal of Ecology and Environment / v.41 no.5 / pp.142-144 / 2017
  • In recent decades, as it has become increasingly important to monitor and research long-term ecological changes, worldwide attempts are being made to integrate and manage ecological data in a unified framework. In particular, domestic ecological data in South Korea must first be standardized based on predefined common protocols for data integration, since they are often scattered across many different systems in various forms. Additionally, foreign ecological data should be converted into a unified format so that they can be used alongside domestic data for association studies. In this study, our interest is in integrating ECN data with Korean domestic ecological data under our unified framework. For this purpose, we employed our semi-automatic data conversion tool to standardize foreign data and used ground beetle (Carabidae) datasets collected from 12 different observatory sites of the ECN. We believe that this systematic approach to converting domestic and foreign ecological data into a standardized format will be quite useful for data integration and association analysis in many ecological and environmental studies.
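
The kind of semi-automatic conversion described, mapping fields of a foreign record onto a common target schema, can be sketched as a mapping table plus a conversion function. All field names, the mapping, and the example record below are hypothetical placeholders, not the actual ECN or Korean schema.

```python
# Hypothetical mapping from a foreign (ECN-style) record to a common schema.
FIELD_MAP = {
    "SITECODE": "site_id",
    "DATE": "observation_date",
    "SPECIES": "scientific_name",
    "COUNT": "abundance",
}

def convert_record(foreign_record, field_map=FIELD_MAP):
    """Rename fields according to the mapping and drop unmapped ones."""
    converted = {}
    for source_field, value in foreign_record.items():
        target_field = field_map.get(source_field)
        if target_field is not None:
            converted[target_field] = value
    return converted

# Example usage with a made-up ground beetle observation.
raw = {"SITECODE": "T08", "DATE": "2015-06-14",
       "SPECIES": "Carabus problematicus", "COUNT": 3, "RECORDER": "field team"}
print(convert_record(raw))
```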

Analysis of the Current Status of Data Repositories in the Field of Ecological Research

  • Kim, Suntae
    • Proceedings of the National Institute of Ecology of the Republic of Korea / v.2 no.2 / pp.139-143 / 2021
  • In this study, data repository information registered in re3data (re3data.org), a research data repository registry, was collected. Based on the collected data, the current status of 354 repositories (approximately 14% of all registered repositories) was analyzed, identified using ecological keywords suggested by two domain experts. The major metadata formats used to describe data in ecological research data repositories include the Federal Geographic Data Committee Content Standard for Digital Geospatial Metadata (FGDC/CSDGM), Dublin Core, ISO 19115, Ecological Metadata Language (EML), Directory Interchange Format (DIF), Darwin Core, Data Documentation Initiative (DDI), and the DataCite Metadata Schema. By country, there are 102 ecological repositories in the US, 34 in Germany, 31 in Canada, and one in Korea. A total of 771 non-profit organizations and 12 for-profit organizations are involved in operating ecological research data repositories. The data versioning ratio of the ecological repositories registered in re3data (86.6%) was found to be somewhat higher than that of all registered repositories (83.9%). The results of this study can be used to establish policies for building and operating research data repositories in the ecological field.
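
The reported figures (repositories per country, versioning ratio) are simple aggregations over repository records; a minimal sketch of that aggregation is shown below. The record structure is an assumption for illustration, not the actual re3data export format.

```python
from collections import Counter

# Hypothetical repository records resembling fields available in re3data.
repositories = [
    {"name": "Repo A", "country": "USA", "versioning": True},
    {"name": "Repo B", "country": "DEU", "versioning": True},
    {"name": "Repo C", "country": "KOR", "versioning": False},
]

# Number of repositories per country.
by_country = Counter(repo["country"] for repo in repositories)

# Share of repositories that support data versioning.
versioning_ratio = sum(repo["versioning"] for repo in repositories) / len(repositories)

print(by_country.most_common())
print(f"versioning ratio: {versioning_ratio:.1%}")
```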

A Study on Big Data Analytics Services and Standardization for Smart Manufacturing Innovation

  • Kim, Cheolrim;Kim, Seungcheon
    • International Journal of Internet, Broadcasting and Communication / v.14 no.3 / pp.91-100 / 2022
  • Major developed countries are seriously pursuing smart factories to increase their manufacturing competitiveness. A smart factory is a customized factory that incorporates ICT into the entire process from product planning to design, distribution, and sales. This can reduce production costs and allow flexible responses to the consumer market. A smart factory converts physical signals into digital signals; connects machines, parts, factories, manufacturing processes, people, and supply chain partners to each other; and uses the collected data so that the smart factory platform can operate intelligently, with enhancing personalized value as the key goal. The success or failure of a smart factory therefore depends on whether big data is secured and utilized. Standardized communication and collaboration are required to smoothly acquire big data inside and outside the factory, and the value of that data can be maximized through big data analysis. This study examines big data analysis and standardization in smart factories. It first reviews manufacturing innovation by country, smart factory construction frameworks, key elements of smart factory implementation, and big data analysis and visualization. Based on this review, we propose services that can be offered when building big data infrastructure in smart factories, including the big data infrastructure construction process, big data platform components, big data modeling, big data quality management components, big data standardization, and big data implementation consulting. This proposal is expected to serve as a guide for building big data infrastructure for companies that want to introduce a smart factory.

Data Framework Design of EDISON 2.0 Digital Platform for Convergence Research

  • Sunggeun Han;Jaegwang Lee;Inho Jeon;Jeongcheol Lee;Hoon Choi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.8 / pp.2292-2313 / 2023
  • With improving computing performance, various digital platforms are being developed to make high-performance computing environments easy to use. EDISON 1.0 is an online simulation platform widely used in computational science and engineering education. As the research paradigm changes, the demand for developing the simulation-centered EDISON 1.0 platform into the data- and artificial-intelligence-centered EDISON 2.0 platform is growing. Herein, a data framework, the core module for data-centric research on the EDISON 2.0 digital platform, is proposed. The proposed data framework provides three functions. First, it provides a data repository suited to the data lifecycle to increase research reproducibility. Second, it provides a new data model that can integrate, manage, search, and utilize heterogeneous data to support a data-driven interdisciplinary convergence research environment. Finally, it provides an exploratory data analysis (EDA) service and data enrichment using an AI model, both developed to strengthen data reliability and maximize the efficiency and effectiveness of research. Using the EDISON 2.0 data framework, researchers can conduct interdisciplinary convergence research on heterogeneous data and easily perform data pre-processing through the web-based UI. It also offers the opportunity to leverage data derived through AI technology to gain insights and create new research topics.
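
An exploratory data analysis service of the kind described typically reports per-column types, missing-value rates, and summary statistics before heavier processing. The sketch below shows such a summary with pandas; it is an illustrative stand-in under assumed column names, not the EDISON 2.0 EDA service itself.

```python
import pandas as pd

def eda_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return a per-column overview: data type, missing-value ratio,
    and basic descriptive statistics for numeric columns."""
    summary = pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing_ratio": df.isna().mean(),
    })
    numeric = df.select_dtypes(include="number")
    summary = summary.join(numeric.describe().T[["mean", "std", "min", "max"]])
    return summary

# Example usage on a small made-up dataset.
df = pd.DataFrame({
    "temperature": [21.3, 22.1, None, 19.8],
    "site": ["A", "B", "B", "C"],
})
print(eda_summary(df))
```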

A Study on the Role and Security Enhancement of the Expert Data Processing Agency: Focusing on a Comparison of Data Brokers in Vermont (데이터처리전문기관의 역할 및 보안 강화방안 연구: 버몬트주 데이터브로커 비교를 중심으로)

  • Soo Han Kim;Hun Yeong Kwon
    • Journal of Information Technology Services / v.22 no.3 / pp.29-47 / 2023
  • With the recent advancement of information and communication technologies such as artificial intelligence, big data, cloud computing, and 5G, data are being produced and digitized in unprecedented amounts. As a result, data have emerged as a critical resource for the future economy, and countries overseas have been revising laws for data protection and utilization. In Korea, the 'Data 3 Act' was revised in 2020 to introduce institutional measures that classify personal information, pseudonymized information, and anonymous information for research, statistics, and the preservation of public records. Combining pseudonymized personal information is expected to increase the added value of data, and to this end the "Expert Data Combination Agency" and the "Expert Data Agency" (hereinafter, the Expert Data Processing Agency) systems were introduced. To compare these domestic systems with similar overseas systems, we examine the state of Vermont in the United States, which recently enacted the first U.S. 'Data Broker Act' as a measure to protect personal information held by data brokers. In this study, we compare and analyze the roles and functions of the Expert Data Processing Agency and data brokers, and identify differences in designation standards, security measures, and related requirements, in order to suggest ways to contribute to activating the data economy and enhancing information protection.