• Title/Summary/Keyword: Large data


Enhanced Hybrid Privacy Preserving Data Mining Technique

  • Kundeti Naga Prasanthi;M V P Chandra Sekhara Rao;Ch Sudha Sree;P Seshu Babu
    • International Journal of Computer Science & Network Security / v.23 no.6 / pp.99-106 / 2023
  • Nowadays, large volumes of data are accumulating in every field due to the increasing capacity of storage devices. Data mining can be applied to these large volumes of data to find useful patterns that can be used for business growth, improving services, improving health conditions, and so on. Data from different sources can be combined before data mining is applied, but the data gathered in this way can be misused for identity theft, fake credit/debit card transactions, and the like. To overcome this, data mining techniques that preserve privacy are required. Several privacy preserving data mining techniques are available in the literature, such as randomization, perturbation, and anonymization. This paper proposes an Enhanced Hybrid Privacy Preserving Data Mining (EHPPDM) technique. The proposed technique provides more data privacy than existing techniques while delivering better classification accuracy. The experimental results show that classification accuracies increased with the EHPPDM technique. (A generic perturbation step is sketched below.)
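The abstract names randomization, perturbation, and anonymization as the families that EHPPDM hybridizes, but it does not give the construction itself. As a rough illustration only, the following sketch shows a generic additive-noise perturbation step of the kind such hybrid schemes build on; the function name, noise scale, and per-attribute scaling are assumptions, not the authors' method.

```python
import numpy as np

def perturb_numeric(features: np.ndarray, noise_scale: float = 0.1,
                    rng=None) -> np.ndarray:
    """Generic perturbation step (not the EHPPDM construction itself):
    add zero-mean Gaussian noise so individual records are masked while
    the aggregate patterns a classifier relies on are largely preserved."""
    rng = rng or np.random.default_rng()
    spread = features.std(axis=0, keepdims=True)          # per-attribute scale
    noise = rng.normal(0.0, noise_scale * spread, size=features.shape)
    return features + noise

# Example: perturb a toy dataset before handing it to a classifier.
if __name__ == "__main__":
    data = np.random.default_rng(0).normal(size=(100, 4))
    private = perturb_numeric(data, noise_scale=0.2)
    print(np.abs(private - data).mean())                  # distortion introduced
```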

Shape Reconstruction from Large Amount of Point Data using Repetitive Domain Decomposition Method (반복적 영역분할법을 이용한 대용량의 점데이터로부터의 형상 재구성)

  • Yoo, Dong-Jin
    • Journal of the Korean Society for Precision Engineering / v.23 no.11 s.188 / pp.93-102 / 2006
  • In this study, an advanced domain decomposition method is suggested for constructing surface models from very large sets of points. In this method, the spatial domain of interest occupied by the input point set is divided in a repetitive manner. First, the space is divided into smaller domains where the problem can be solved independently. Each subdomain is then divided again into much smaller domains where the problem can be solved locally. The local solutions of these subdivided domains are blended together with a partition-of-unity function to obtain a solution for each subdomain, and the subdomain solutions are then merged to construct the whole surface model. The suggested method is conceptually simple and easy to implement. Since the RDDM (Repetitive Domain Decomposition Method) is efficient in computation time and memory consumption, the present study provides fast and accurate reconstruction of complex shapes from point data containing millions of points. The effectiveness and validity of the suggested method are demonstrated through numerical experiments on various types of point data. (A partition-of-unity blending sketch follows this entry.)
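The abstract specifies that local solutions are blended with a partition-of-unity function, but not which weight functions or local fits are used. The sketch below shows one common form of that blending step, with compactly supported weights over overlapping subdomains; the weight shape and the toy local solutions are assumptions for illustration.

```python
import numpy as np

def local_weight(points, center, radius):
    """Compactly supported weight for one subdomain: largest at the
    subdomain center, zero outside its radius."""
    r = np.linalg.norm(points - center, axis=-1) / radius
    return np.clip(1.0 - r, 0.0, None) ** 2

def pou_blend(points, centers, radii, local_values):
    """Partition-of-unity blend of local solutions:
        f(x) = sum_i w_i(x) f_i(x) / sum_i w_i(x)
    where local_values[i] is the i-th local solution evaluated at `points`."""
    weights = np.stack([local_weight(points, c, r)
                        for c, r in zip(centers, radii)])   # (n_sub, n_pts)
    values = np.stack(local_values)                          # (n_sub, n_pts)
    wsum = weights.sum(axis=0)
    wsum[wsum == 0.0] = 1.0            # points outside every subdomain
    return (weights * values).sum(axis=0) / wsum

# Toy example: blend two constant "local solutions" over overlapping 1D domains.
if __name__ == "__main__":
    pts = np.linspace(0.0, 1.0, 11)[:, None]
    centers = [np.array([0.25]), np.array([0.75])]
    radii = [0.5, 0.5]
    locals_ = [np.zeros(len(pts)), np.ones(len(pts))]
    print(pou_blend(pts, centers, radii, locals_))
```

In the full method, each local value would itself come from recursively solving a subdivided subdomain, and the blended subdomain solutions would then be merged into the whole surface, as the abstract describes.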

An Optimal Routing Algorithm for Large Data Networks (대규모 데이타 네트워크를 위한 최적 경로 설정 알고리즘)

  • 박성우;김영천
    • The Journal of Korean Institute of Communications and Information Sciences / v.19 no.2 / pp.254-265 / 1994
  • To solve the optimal routing problem (ORP) in large data networks, an algorithm called the hierarchical aggregation/disaggregation and decomposition/composition gradient projection (HAD-GP) algorithm is proposed. As preliminary work, we improve the performance of the original iterative aggregation/disaggregation GP (IAD-GP) algorithm introduced in [7]. The A/D concept used in the original IAD-GP algorithm and its modified version naturally fits the hierarchical structure of large data networks, so a speed-up in convergence can be expected. The proposed HAD-GP algorithm adds a D/C step to the modified IAD-GP algorithm. The HAD-GP algorithm also exploits the hierarchical topology of large data networks and achieves a significant improvement in convergence speed, especially in a distributed environment. The speed-up effects are demonstrated by numerical implementations comparing the HAD-GP algorithm with the (original and modified) IAD-GP and ordinary GP (ORD-GP) algorithms. (A minimal gradient-projection iteration is sketched below.)

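The abstract identifies gradient projection (GP) as the core update that HAD-GP accelerates, while the hierarchical A/D and D/C machinery is left to the paper. Purely as a reminder of what one GP iteration looks like for routing-type flow problems, the sketch below takes a gradient step on the path flows of a single origin-destination pair and projects back onto the feasible set; the cost values and step size are assumed for illustration.

```python
import numpy as np

def project_to_simplex(v, total):
    """Euclidean projection of v onto the set {x : x >= 0, sum(x) == total}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - total
    ind = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / ind > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.clip(v - theta, 0.0, None)

def gp_step(path_flows, path_costs, demand, step=0.1):
    """One gradient-projection iteration for a single origin-destination pair:
    move against the path costs, then project the flows back onto the
    feasible set (non-negative flows summing to the demand)."""
    return project_to_simplex(path_flows - step * path_costs, demand)

# Example: 3 candidate paths; flow shifts toward the cheaper paths.
if __name__ == "__main__":
    flows = np.array([2.0, 2.0, 2.0])     # current flows, total demand = 6
    costs = np.array([1.5, 0.5, 1.0])     # assumed marginal path delays
    print(gp_step(flows, costs, demand=6.0, step=1.0))
```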

Health Monitoring System of Large Civil Structural System Based on Local Wireless Communication System (근거리 무선통신을 이용한 대형토목구조물의 모니터링시스템)

  • Heo, Gwanghee;Choi, Man-Yong;Kim, Chi-Yup
    • Journal of the Korea Institute for Structural Maintenance and Inspection / v.3 no.4 / pp.199-204 / 1999
  • The continuing development of sensors for measuring structural safety has marked a turning point in measuring and evaluating large civil structural systems as well. However, problems remain for extremely large structures, because damage to such structures is not simple to monitor given their locational and structural conditions. One of the most significant problems is that the many cables connecting the measuring system to the analyzer are liable to distort the actual data. This paper presents a new monitoring system for large structures based on a local wireless communication technique that eliminates the possibility of data distortion by noise in the cables. The new monitoring system employs a wireless link and data-communication software, together with the strain sensors and accelerometers that have already been in use. It allows data selected by the central controlling system from the various sensors placed in a large civil structure to be delivered wirelessly and then analyzed and evaluated by the decision-making system for the structure. (A toy sensor-to-controller data path is sketched below.)

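The abstract describes readings from strain sensors and accelerometers being selected by a central controlling system and delivered wirelessly, without giving a message format or protocol stack. Purely to illustrate that data path, the sketch below defines a hypothetical reading record and sends it to an assumed controller address; the field names, transport, and addresses are illustrative, not the authors' design.

```python
import json
import socket
import time
from dataclasses import dataclass, asdict

@dataclass
class SensorReading:
    """Hypothetical record for one measurement from a strain gauge or accelerometer."""
    sensor_id: str
    kind: str          # "strain" or "acceleration"
    timestamp: float
    value: float

def send_reading(reading: SensorReading, host: str, port: int) -> None:
    """Deliver one reading to the central controller over a local wireless link
    (modelled here simply as a UDP datagram)."""
    payload = json.dumps(asdict(reading)).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))

# Example: a node reporting a strain reading to an assumed controller address.
if __name__ == "__main__":
    r = SensorReading("strain-07", "strain", time.time(), 132.4)
    send_reading(r, "192.168.0.10", 9000)   # controller address is illustrative
```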

A Hybrid Mechanism of Particle Swarm Optimization and Differential Evolution Algorithms based on Spark

  • Fan, Debin;Lee, Jaewan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.12 / pp.5972-5989 / 2019
  • With the onset of the big data age, data is growing exponentially, and how to optimize large-scale data processing has become an especially significant issue. Large-scale global optimization (LSGO) is a research topic of great interest in academia and industry. Spark is a popular cloud computing framework that can process large-scale data on a cluster, and it effectively supports iterative computation through resilient distributed datasets (RDD). In this paper, we propose a hybrid mechanism of particle swarm optimization (PSO) and differential evolution (DE) algorithms based on Spark (SparkPSODE). SparkPSODE is a parallel algorithm that employs the RDD and island models. The island model divides the global population into several subpopulations, which are mapped onto RDD partitions to reduce computation time. To preserve population diversity and avoid premature convergence, the evolutionary strategy of DE is integrated into SparkPSODE. Finally, SparkPSODE is run on a set of LSGO benchmark problems, and the experimental results show that, compared with several other algorithms, the proposed SparkPSODE algorithm obtains better optimization performance. (A per-island PSO/DE update is sketched below.)
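The abstract names the two ingredients that are hybridized within each island: a PSO velocity/position update and a DE-style mutation used to preserve subpopulation diversity, with islands mapped to Spark RDD partitions. The sketch below shows only the per-island update on plain NumPy arrays, without Spark; the inertia and acceleration coefficients and the DE/rand/1 variant are assumptions rather than the paper's exact settings.

```python
import numpy as np

def island_step(pos, vel, pbest, gbest, objective, rng,
                w=0.7, c1=1.5, c2=1.5, F=0.5, CR=0.9):
    """One hybrid iteration for a single island (subpopulation):
    a standard PSO velocity/position update, followed by a DE/rand/1
    mutation-and-crossover step to preserve diversity."""
    n, d = pos.shape

    # PSO velocity/position update.
    r1, r2 = rng.random((n, d)), rng.random((n, d))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel

    # DE/rand/1 mutation plus binomial crossover.
    idx = np.array([rng.choice(np.delete(np.arange(n), i), 3, replace=False)
                    for i in range(n)])
    mutant = pos[idx[:, 0]] + F * (pos[idx[:, 1]] - pos[idx[:, 2]])
    cross = rng.random((n, d)) < CR
    trial = np.where(cross, mutant, pos)

    # Greedy selection: keep the better of current and trial vectors.
    better = objective(trial) < objective(pos)
    pos = np.where(better[:, None], trial, pos)

    # Update personal and island-best memories.
    improved = objective(pos) < objective(pbest)
    pbest = np.where(improved[:, None], pos, pbest)
    gbest = pbest[np.argmin(objective(pbest))]
    return pos, vel, pbest, gbest

# Example on the sphere function with one small island.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sphere = lambda x: (x ** 2).sum(axis=1)
    pos = rng.uniform(-5, 5, size=(20, 10))
    vel = np.zeros_like(pos)
    pbest, gbest = pos.copy(), pos[np.argmin(sphere(pos))]
    for _ in range(100):
        pos, vel, pbest, gbest = island_step(pos, vel, pbest, gbest, sphere, rng)
    print(sphere(gbest[None])[0])
```

Under the island model, each RDD partition would run such a step on its own subpopulation and periodically exchange its best individual; that migration step is omitted here.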

Boosted Regression Method based on Rejection Limits for Large-Scale Data (대량 데이터를 위한 제한거절 기반의 회귀부스팅 기법)

  • Kwon, Hyuk-Ho;Kim, Seung-Wook;Choi, Dong-Hoon;Lee, Kichun
    • Journal of Korean Institute of Industrial Engineers / v.42 no.4 / pp.263-269 / 2016
  • The purpose of this study is to address a computational regression-type problem, namely handling large-size data, on which conventional metamodeling techniques often fail in practice. To solve such problems, regression-type boosting, an ensemble modeling technique, combined with bootstrap-based resampling is a reasonable choice. This study suggests updating the weights by the amount of the residual itself and a new error-decision criterion that builds an ensemble from models selectively chosen by rejection limits. With these ideas, we propose AdaBoost.RMU.R as a metamodeling technique suitable for handling large-size data. To assess the performance of the proposed method in comparison to some existing methods, we used six mathematical problems. For each problem, we computed the average and the standard deviation of the residuals between real and predicted response values. The results show that both the average and the standard deviation for AdaBoost.RMU.R improved over those of the other algorithms. (A schematic boosting loop with a rejection limit follows below.)
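The abstract describes two ideas: sample weights updated by the residual magnitude, and a rejection limit that decides whether a fitted base model enters the ensemble. The exact formulas of AdaBoost.RMU.R are not given in the abstract, so the loop below is only a schematic of that structure; the weight-update rule, the rejection threshold, and the scikit-learn decision tree used as the base learner are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boosted_regression(X, y, n_rounds=50, reject_limit=None, rng=None):
    """Schematic residual-weighted boosting with a rejection limit:
    models whose mean absolute residual exceeds the limit are rejected."""
    rng = rng or np.random.default_rng()
    n = len(y)
    weights = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(n_rounds):
        # Bootstrap resample according to the current weights.
        idx = rng.choice(n, size=n, replace=True, p=weights)
        model = DecisionTreeRegressor(max_depth=4).fit(X[idx], y[idx])
        residual = np.abs(y - model.predict(X))
        if reject_limit is not None and residual.mean() > reject_limit:
            continue                          # model rejected, not added
        ensemble.append(model)
        # Weight update driven by the residual magnitude itself.
        weights = residual + 1e-12
        weights /= weights.sum()
    return ensemble

def predict(ensemble, X):
    """Average the predictions of the accepted base models."""
    return np.mean([m.predict(X) for m in ensemble], axis=0)
```

Setting `reject_limit` to, for example, the standard deviation of the training targets would skip weak fits rather than averaging them into the ensemble.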

Selective Encryption Scheme for Vector Map Data using Chaotic Map

  • Bang, N.V.;Moon, Kwang-Seok;Lim, Sanghun;Lee, Suk-Hwan;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society / v.18 no.7 / pp.818-826 / 2015
  • With the rapidly growing interest in Geographic Information System (GIS) content, large volumes of valuable GIS datasets have been distributed illegally by pirates, hackers, or unauthorized users. The problem therefore focuses on how to protect the copyright of GIS vector map data during storage and transmission. GIS vector map data is very large, however, and current encryption techniques often encrypt all components of the data, so large amounts of data are encrypted, leading to long encryption times and high computational complexity. This paper presents a selective encryption scheme using a hybrid transform to protect GIS vector map data for storage, transmission, or distribution to authorized users. In the proposed scheme, the polylines and polygons in the vector map are the targets of selective encryption. We select the significant objects in the polyline/polygon layer, and they are then encrypted with key sets generated by a chaotic map before being modified in the DWT and DFT domains. Experimental results verify the effectiveness of the proposed algorithm, with approximately zero decryption error. (A chaotic-keystream sketch follows this entry.)
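The abstract states that the selected objects are encrypted with key sets generated by a chaotic map and then processed in the DWT/DFT domains; the exact map, parameters, and transform-domain operation are not given there. The fragment below shows only the common logistic-map keystream idea applied to one polyline's coordinates, using XOR on quantized values; the map parameters and the quantization scale are assumptions, not the paper's scheme.

```python
import numpy as np

def logistic_keystream(length, x0=0.3141, r=3.99):
    """Generate a keystream from the logistic map x_{k+1} = r * x_k * (1 - x_k).
    The pair (x0, r) acts as the secret key; the values here are illustrative."""
    x = x0
    stream = np.empty(length, dtype=np.uint16)
    for i in range(length):
        x = r * x * (1.0 - x)
        stream[i] = int(x * 65536) & 0xFFFF
    return stream

def encrypt_polyline(coords, x0=0.3141, r=3.99, scale=1000.0):
    """Selectively encrypt one polyline: quantize its coordinates and XOR
    them with the chaotic keystream."""
    q = np.round(coords * scale).astype(np.int64)
    ks = logistic_keystream(q.size, x0, r).astype(np.int64).reshape(q.shape)
    return q ^ ks

def decrypt_polyline(cipher, x0=0.3141, r=3.99, scale=1000.0):
    """Decryption repeats the same XOR and undoes the quantization."""
    ks = logistic_keystream(cipher.size, x0, r).astype(np.int64).reshape(cipher.shape)
    return (cipher ^ ks) / scale

# Round-trip check on a toy polyline.
if __name__ == "__main__":
    line = np.array([[127.001, 37.512], [127.004, 37.515], [127.010, 37.520]])
    enc = encrypt_polyline(line)
    print(np.allclose(decrypt_polyline(enc), line, atol=1e-3))
```

Encrypting only the selected significant polylines and polygons, rather than the whole map, is what keeps the encryption time low, which is the point the abstract emphasizes.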

A Design and Implementation of Efficient Storage Structure for a Large RDF Data Processing (대용량 RDF 데이터의 처리 성능 개선을 위한 효율적인 저장구조 설계 및 구현)

  • Mun, Hyeon-Jeong;Sung, Jung-Hwan;Kim, Young-Ji;Woo, Yong-Tae
    • The Journal of Society for e-Business Studies / v.12 no.3 / pp.251-268 / 2007
  • We design and implement an efficient storage technique to improve query processing for large RDF datasets. The proposed technique minimizes data redundancy compared to existing techniques by splitting relation information and data information out of triple-formatted RDF data. It also improves query processing speed by separating the query steps into relation and data stages and connecting them on top of the proposed storage structure. The technique can be applied in areas such as e-Commerce, the semantic web, and KMS to store and retrieve large RDF datasets. (A toy split storage layout is sketched below.)

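The abstract's key idea is storing relation information separately from data information instead of keeping full triples, which removes repeated strings. A very small sketch of that separation is shown below, using an in-memory dictionary-encoded layout (a term table plus a table of ID triples); the class and table names and the query path are assumptions, not the paper's design.

```python
from collections import defaultdict

class SplitTripleStore:
    """Toy RDF store that keeps data (terms) and relations (triples of IDs)
    in separate structures, so each URI/literal string is stored only once."""

    def __init__(self):
        self.term_to_id = {}          # data information: term -> integer id
        self.id_to_term = []
        self.triples = set()          # relation information: (s_id, p_id, o_id)
        self.by_predicate = defaultdict(set)

    def _intern(self, term):
        if term not in self.term_to_id:
            self.term_to_id[term] = len(self.id_to_term)
            self.id_to_term.append(term)
        return self.term_to_id[term]

    def add(self, s, p, o):
        t = (self._intern(s), self._intern(p), self._intern(o))
        self.triples.add(t)
        self.by_predicate[t[1]].add(t)

    def objects(self, s, p):
        """Query path: resolve terms to ids (data side), scan the relation
        side, then translate the matching ids back to terms."""
        sid, pid = self.term_to_id.get(s), self.term_to_id.get(p)
        return [self.id_to_term[o] for (s_, p_, o) in self.by_predicate.get(pid, ())
                if s_ == sid]

# Example usage.
if __name__ == "__main__":
    store = SplitTripleStore()
    store.add("ex:book1", "dc:creator", "ex:alice")
    store.add("ex:book2", "dc:creator", "ex:alice")
    print(store.objects("ex:book1", "dc:creator"))   # ['ex:alice']
```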

A Study on Light-weight Algorithm of Large scale BIM data for Visualization on Web based GIS Platform (웹기반 GIS 플랫폼 상 가시화 처리를 위한 대용량 BIM 데이터의 경량화 알고리즘 제시)

  • Kim, Ji Eun;Hong, Chang Hee
    • Spatial Information Research / v.23 no.1 / pp.41-48 / 2015
  • BIM technology captures data across a facility's life cycle through 3D modeling. As a result, a single building produces a huge file because of the massive amount of data involved. IFC is the standard format, and processing such large-scale data, which is based on the geometry and property information of each object, raises issues: it lengthens rendering time and burdens the graphics card, so large-scale data is inefficient to visualize on screen for the user. Lightweighting of large-scale BIM data therefore has to be addressed for both processing and program quality. This paper surveys and confirms lightweighting techniques from domestic and international research. To control and visualize large-scale BIM data effectively, we propose and verify a technique that optimizes the BIM characteristics. For operating large-scale facility data on a web-based GIS platform, screen-switching quality from the user's perspective and efficient memory operation were secured. (A generic geometry-reduction sketch follows this entry.)
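The abstract does not spell out the lightweighting algorithm beyond optimizing the geometry and property data of BIM objects for web visualization. As a generic illustration of the kind of reduction involved, the sketch below quantizes and merges duplicate mesh vertices and drops non-visual property sets; the precision value and the kept attribute names are assumptions, not the paper's technique.

```python
import numpy as np

def lighten_mesh(vertices, faces, precision=1e-3):
    """Reduce a triangle mesh for web visualization by quantizing coordinates
    and merging vertices that coincide at that precision."""
    quant = np.round(vertices / precision).astype(np.int64)
    uniq, inverse = np.unique(quant, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    new_vertices = uniq * precision
    new_faces = inverse[faces]              # remap face indices to merged vertices
    return new_vertices, new_faces

def strip_properties(element_props, keep=("Name", "GlobalId", "ObjectType")):
    """Drop property sets not needed for on-screen display, keeping only a
    small identifying subset (the kept keys are assumptions)."""
    return {k: v for k, v in element_props.items() if k in keep}

# Example: two vertices 0.0001 apart collapse into one at 1 mm precision.
if __name__ == "__main__":
    verts = np.array([[0.0, 0.0, 0.0], [0.0001, 0.0, 0.0], [1.0, 0.0, 0.0]])
    faces = np.array([[0, 1, 2]])
    v, f = lighten_mesh(verts, faces)
    print(len(v), f.tolist())               # 2 vertices remain
```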

Design and Implementation of Large Tag Data Transmission Protocol for 2.4GHz Multi-Channel Active RFID System (2.4GHz 다중채널 능동형 RFID시스템을 위한 대용량 태그 데이터 전송 프로토콜의 설계 및 구현)

  • Lee, Chae-Suk;Kim, Dong-Hyun;Kim, Jong-Doek
    • Journal of KIISE: Information Networking / v.37 no.3 / pp.217-227 / 2010
  • To apply active RFID technology across various industries, a large amount of data must be transmitted quickly. The ISO/IEC 18000-7 standard uses 433.92 MHz as a single-channel system, and its transmission rate of only 27.8 kbps is insufficient for transmitting large amounts of data. To solve this problem, we designed a new data transmission protocol using the 2.4 GHz band. The designed protocol not only builds data messages of more than 255 bytes using the Burst Read UDB command but also transmits them efficiently. To implement the protocol, we used Texas Instruments' SmartRF04 development kit and the CC2500 transceiver as the RF module. In an evaluation transmitting 63.75 kbytes of data, we demonstrate that the transmission time of Burst Read UDB is 17.95% faster than that of Read UDB in ISO/IEC 18000-7. (A burst-framing sketch follows below.)
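The central idea in the abstract is a Burst Read UDB that carries user data blocks larger than the 255-byte limit of a single ISO/IEC 18000-7 message by streaming them in consecutive frames. The fragment below only illustrates that framing step (splitting a large block into numbered sub-frames and reassembling them); the frame size, header fields, and names are assumptions rather than the protocol's actual layout.

```python
from dataclasses import dataclass

FRAME_PAYLOAD = 255   # assumed per-frame payload limit, in bytes

@dataclass
class BurstFrame:
    seq: int          # sequence number of this fragment
    total: int        # total number of fragments in the burst
    payload: bytes

def split_burst(udb: bytes, payload_size: int = FRAME_PAYLOAD):
    """Split one large user data block (UDB) into numbered burst frames."""
    chunks = [udb[i:i + payload_size] for i in range(0, len(udb), payload_size)]
    return [BurstFrame(seq=i, total=len(chunks), payload=c)
            for i, c in enumerate(chunks)]

def reassemble(frames):
    """Reassemble the UDB on the reader side, tolerating out-of-order arrival."""
    ordered = sorted(frames, key=lambda f: f.seq)
    assert len(ordered) == ordered[0].total, "missing frames in the burst"
    return b"".join(f.payload for f in ordered)

# Example: a 63.75 KB block splits into 256 frames of up to 255 bytes each.
if __name__ == "__main__":
    data = bytes(65280)                     # 63.75 * 1024 = 65280 bytes
    frames = split_burst(data)
    print(len(frames), reassemble(frames) == data)
```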