• Title/Summary/Keyword: Computer data processing

Search Result 4,329, Processing Time 0.032 seconds

Big Data Key Challenges

  • Alotaibi, Sultan
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.4
    • /
    • pp.340-350
    • /
    • 2022
  • The big data term refers to the great volume of data and complicated data structure with difficulties in collecting, storing, processing, and analyzing these data. Big data analytics refers to the operation of disclosing hidden patterns through big data. This information and data set cloud to be useful and provide advanced services. However, analyzing and processing this information could cause revealing and disclosing some sensitive and personal information when the information is contained in applications that are correlated to users such as location-based services, but concerns are diminished if the applications are correlated to general information such as scientific results. In this work, a survey has been done over security and privacy challenges and approaches in big data. The challenges included here are in each of the following areas: privacy, access control, encryption, and authentication in big data. Likewise, the approaches presented here are privacy-preserving approaches in big data, access control approaches in big data, encryption approaches in big data, and authentication approaches in big data.

On Efficient Processing of Continuous Reverse Skyline Queries in Wireless Sensor Networks

  • Yin, Bo;Zhou, Siwang;Zhang, Shiwen;Gu, Ke;Yu, Fei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.4
    • /
    • pp.1931-1953
    • /
    • 2017
  • The reverse skyline query plays an important role in information searching applications. This paper deals with continuous reverse skyline queries in sensor networks, which retrieves reverse skylines as well as the set of nodes that reported them for continuous sampling epochs. Designing an energy-efficient approach to answer continuous reverse skyline queries is non-trivial because the reverse skyline query is not decomposable and a huge number of unqualified nodes need to report their sensor readings. In this paper, we develop a new algorithm that avoids transmission of updates from nodes that cannot influence the reverse skyline. We propose a data mapping scheme to estimate sensor readings and determine their dominance relationships without having to know the true values. We also theoretically analyze the properties for reverse skyline computation, and propose efficient pruning techniques while guaranteeing the correctness of the answer. An extensive experimental evaluation demonstrates the efficiency of our approach.

LDBAS: Location-aware Data Block Allocation Strategy for HDFS-based Applications in the Cloud

  • Xu, Hua;Liu, Weiqing;Shu, Guansheng;Li, Jing
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.1
    • /
    • pp.204-226
    • /
    • 2018
  • Big data processing applications have been migrated into cloud gradually, due to the advantages of cloud computing. Hadoop Distributed File System (HDFS) is one of the fundamental support systems for big data processing on MapReduce-like frameworks, such as Hadoop and Spark. Since HDFS is not aware of the co-location of virtual machines in the cloud, the default scheme of block allocation in HDFS does not fit well in the cloud environments behaving in two aspects: data reliability loss and performance degradation. In this paper, we present a novel location-aware data block allocation strategy (LDBAS). LDBAS jointly optimizes data reliability and performance for upper-layer applications by allocating data blocks according to the locations and different processing capacities of virtual nodes in the cloud. We apply LDBAS to two stages of data allocation of HDFS in the cloud (the initial data allocation and data recovery), and design the corresponding algorithms. Finally, we implement LDBAS into an actual Hadoop cluster and evaluate the performance with the benchmark suite BigDataBench. The experimental results show that LDBAS can guarantee the designed data reliability while reducing the job execution time of the I/O-intensive applications in Hadoop by 8.9% on average and up to 11.2% compared with the original Hadoop in the cloud.

Privacy-Constrained Relational Data Perturbation: An Empirical Evaluation

  • Deokyeon Jang;Minsoo Kim;Yon Dohn Chung
    • Journal of Information Processing Systems
    • /
    • v.20 no.4
    • /
    • pp.524-534
    • /
    • 2024
  • The release of relational data containing personal sensitive information poses a significant risk of privacy breaches. To preserve privacy while publishing such data, it is important to implement techniques that ensure protection of sensitive information. One popular technique used for this purpose is data perturbation, which is popularly used for privacy-preserving data release due to its simplicity and efficiency. However, the data perturbation has some limitations that prevent its practical application. As such, it is necessary to propose alternative solutions to overcome these limitations. In this study, we propose a novel approach to preserve privacy in the release of relational data containing personal sensitive information. This approach addresses an intuitive, syntactic privacy criterion for data perturbation and two perturbation methods for relational data release. Through experiments with synthetic and real data, we evaluate the performance of our methods.

Data Source Management using weight table in u-GIS DSMS

  • Kim, Sang-Ki;Baek, Sung-Ha;Lee, Dong-Wook;Chung, Warn-Il;Kim, Gyoung-Bae;Bae, Hae-Young
    • Journal of Korea Spatial Information System Society
    • /
    • v.11 no.2
    • /
    • pp.27-33
    • /
    • 2009
  • The emergences of GeoSensor and researches about GIS have promoted many researches of u-GIS. The disaster application coupled in the u-GIS can apply to monitor accident area and to prevent spread of accident. The application needs the u-GIS DSMS technique to acquire, to process GeoSensor data and to integrate them with GIS data. The u-GIS DSMS must process big and large-volume data stream such as spatial data and multimedia data. Due to the feature of the data stream, in u-GIS DSMS, query processing can be delayed. Moreover, as increasing the input rate of data in the area generating events, the network traffic is increased. To solve this problem, in this paper we describe TRIGGER ACTION clause in CQ on the u-GIS DSMS environment and proposes data source management. Data source weight table controls GES information and incoming data rate. It controls incoming data rate as increasing weight at GES of disaster area. Consequently, it can contribute query processing rate and accuracy

  • PDF

Research on Fault Diagnosis of Wind Power Generator Blade Based on SC-SMOTE and kNN

  • Peng, Cheng;Chen, Qing;Zhang, Longxin;Wan, Lanjun;Yuan, Xinpan
    • Journal of Information Processing Systems
    • /
    • v.16 no.4
    • /
    • pp.870-881
    • /
    • 2020
  • Because SCADA monitoring data of wind turbines are large and fast changing, the unbalanced proportion of data in various working conditions makes it difficult to process fault feature data. The existing methods mainly introduce new and non-repeating instances by interpolating adjacent minority samples. In order to overcome the shortcomings of these methods which does not consider boundary conditions in balancing data, an improved over-sampling balancing algorithm SC-SMOTE (safe circle synthetic minority oversampling technology) is proposed to optimize data sets. Then, for the balanced data sets, a fault diagnosis method based on improved k-nearest neighbors (kNN) classification for wind turbine blade icing is adopted. Compared with the SMOTE algorithm, the experimental results show that the method is effective in the diagnosis of fan blade icing fault and improves the accuracy of diagnosis.

Privacy Enhancement and Secure Data Transmission Mechanism for Smart Grid System

  • Li, Shi;Choi, Kyung;Doh, Inshil;Chae, Ki-Joon
    • Annual Conference of KIPS
    • /
    • 2011.04a
    • /
    • pp.1009-1011
    • /
    • 2011
  • With the growth of Smart Grid technologies, security and privacy will become the most important issues and more attention should be paid. There are some existing solutions about anonymization of smart meters, however, they still have some potential threats. In this paper, we describe an enhanced method to protect the privacy of consumer data. When metering data are required by a utility or the electrical energy distribution center for operational reasons, data are delivered not with the real IDs but with temporary IDs. In addition, these temporary IDs are changed randomly to prevent the attackers from analyzing the energy usage patterns. We also describe secure data transmission method for securing data delivered. In this way, we can enhance the privacy of Smart Grid System with low overhead.

Mutable Encryption for Oblivious Data Access in Cloud Storage

  • Ahmad, Mahmood;Hussain, Shujjat;Pervez, Zeeshan;Lee, Sungyoung;Chung, Tae Choong
    • Annual Conference of KIPS
    • /
    • 2013.05a
    • /
    • pp.157-158
    • /
    • 2013
  • Data privacy and access control policies in computer clouds are a prime concerns while talking about the sensitive data. Authorized access is ensured with the help of secret keys given to a range of valid users. Granting the role access is a trivial matter but revoking user access is tricky and compute intensive. To revoke a user and making his data access ineffective the data owner has to compute new set of keys for the rest of effective users. This situation is inappropriate where user revocation is a frequent phenomenon. Time based revocation is another way to deal this issue where key for data access expires automatically. This solution rests in a very strong assumption of time determination in advance. In this paper we have proposed a mutable encryption for oblivious data access in cloud storage where the access key becomes ineffective after defined number of threshold by the data owner. The proposed solution adds to its novelty by introducing mutable encryption while accessing the data obliviously.

An Index Structure for Efficient X-Path Processing on S-XML Data (S-XML 데이터의 효율적인 X-Path 처리를 위한 색인 구조)

  • Zhang, Gi;Jang, Yong-Il;Park, Soon-Young;Oh, Young-Hwan;Bae, Hae-Young
    • Annual Conference of KIPS
    • /
    • 2005.05a
    • /
    • pp.51-54
    • /
    • 2005
  • This paper proposes an index structure which is used to process X-Path on S-XML data. There are many previous index structures based on tree structure for X-Path processing. Because of general tree index's top-down query fashion, the unnecessary node traversal makes heavy access and decreases the query processing performance. And both of the two query types for X-Path called single-path query and branching query need to be supported in proposed index structure. This method uses a combination of path summary and the node indexing. First, it manages hashing on hierarchy elements which are presented in tag in S-XML. Second, array blocks named path summary array is created in each node of hashing to store the path information. The X-Path processing finds the tag element using hashing and checks array blocks in each node to determine the path of query's result. Based on this structure, it supports both single-path query and branching path query and improves the X-Path processing performance.

  • PDF

Distributed Data Allocation Methods and Implementations using the Temporary Table (임시 테이블을 사용한 분산 데이타 할당 방법 및 구현)

  • Heo, Gye-Beom;Lee, Jong-Seop;Jeong, Gye-Dong;Choe, Yeong-Geun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.4
    • /
    • pp.1076-1088
    • /
    • 1997
  • Data repliciation techniques of distributed database a allocation methods occures to overhead to change together all repkicas for consistency maintenance at updates .In case of data migration horizontal or vertical fragment migration has caused to caused to increase data communication.In this paper we purpose batch proxessing method groups the associated items of table and stores temporary table assciated with file relication and migrations.This method increase the usagility of system buffer in distributed trancsaction processing with a number of data inputs and updates.After all,it improved the perform-ance of distributed transaction system by reducing disk I\O processing time and the cost of data cimmunication among local sites.

  • PDF