• Title/Summary/Keyword: distributed-data processing algorithm


A holistic distributed clustering algorithm based on sensor network (센서 네트워크 기반의 홀리스틱 분산 클러스터링 알고리즘)

  • Chen Ping;Kee-Wook Rim;Nam Ji-Yeun;Lee KyungOh
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.874-877 / 2008
  • Existing data processing systems support only simple queries over sensor networks. It is increasingly important to process the vast data streams in sensor networks and to extract useful knowledge for users. In this paper, we propose a holistic distributed k-means algorithm for sensor networks. To verify its effectiveness, we compare it with a centralized k-means algorithm on sensor network data streams. The evaluation experiments show that the proposed algorithm can process vast data streams with less computation time. Because the algorithm clusters the data streams at the distributed nodes, it substantially reduces redundant data communication compared to the centralized algorithm.
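
The abstract does not spell out the algorithm's steps, so the following is only a minimal sketch of one common way to distribute k-means, not the authors' method. It assumes 2-D readings, a single coordinator, and that each node sends only per-cluster sums and counts rather than raw points. The names `local_summary`, `merge`, and `distributed_kmeans` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def nearest(p: Point, centroids: List[Point]) -> int:
    """Index of the centroid closest to p (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
    )


def local_summary(points: List[Point], centroids: List[Point]) -> List[List[float]]:
    """Runs on a sensor node: per-cluster [sum_x, sum_y, count] of its own points."""
    stats = [[0.0, 0.0, 0] for _ in centroids]
    for p in points:
        i = nearest(p, centroids)
        stats[i][0] += p[0]
        stats[i][1] += p[1]
        stats[i][2] += 1
    return stats


def merge(summaries: List[List[List[float]]], centroids: List[Point]) -> List[Point]:
    """Runs on the coordinator: combine node summaries into updated centroids."""
    updated = []
    for i, old in enumerate(centroids):
        sx = sum(s[i][0] for s in summaries)
        sy = sum(s[i][1] for s in summaries)
        n = sum(s[i][2] for s in summaries)
        # Keep the old centroid if no point was assigned to this cluster.
        updated.append((sx / n, sy / n) if n else old)
    return updated


def distributed_kmeans(nodes: List[List[Point]], centroids: List[Point],
                       rounds: int = 10) -> List[Point]:
    """Each round, every node summarizes locally and the coordinator merges."""
    for _ in range(rounds):
        centroids = merge([local_summary(pts, centroids) for pts in nodes], centroids)
    return centroids


# Two nodes, two clusters; only 2 x 3 numbers per node travel each round.
nodes = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)]]
result = distributed_kmeans(nodes, [(0.0, 0.0), (10.0, 10.0)])
```

Because sums and counts are additive, merging the node summaries yields the same centroid update as running one k-means step over the pooled data, which is where the communication saving in this sketch comes from.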

Matrix-based Filtering and Load-balancing Algorithm for Efficient Similarity Join Query Processing in Distributed Computing Environment (분산 컴퓨팅 환경에서 효율적인 유사 조인 질의 처리를 위한 행렬 기반 필터링 및 부하 분산 알고리즘)

  • Yang, Hyeon-Sik;Jang, Miyoung;Chang, Jae-Woo
    • The Journal of the Korea Contents Association / v.16 no.7 / pp.667-680 / 2016
  • As distributed computing platforms such as Hadoop MapReduce have matured, conventional query processing techniques that were designed for a single machine need to run efficiently in distributed computing environments. In particular, similarity join query processing has been studied in distributed environments, where a similarity join retrieves all pairs of highly similar data items from two given data sets. However, existing similarity join schemes for distributed computing environments suffer from skewed computing load across clusters because they consider only the data transmission cost. In this paper, we propose a matrix-based load-balancing algorithm for efficient similarity join query processing in distributed computing environments. To balance load uniformly across clusters, the proposed algorithm estimates the expected computing cost using a matrix and generates partitions based on the estimated cost. In addition, it reduces the computing load by filtering out data that are not used in query processing in the clusters. Finally, our performance evaluation shows that the proposed algorithm outperforms the existing one in query processing performance.

Performance Optimization of Big Data Center Processing System - Big Data Analysis Algorithm Based on Location Awareness

  • Zhao, Wen-Xuan;Min, Byung-Won
    • International Journal of Contents / v.17 no.3 / pp.74-83 / 2021
  • This study proposes a location-aware algorithm to optimize the performance of distributed big data processing systems that suffer from low data reliability and application performance. Compared with previous algorithms, the location-aware data block placement algorithm uses data block placement and node data recovery strategies to improve application performance and data reliability. Simulation and actual cluster tests showed that the proposed placement algorithm could greatly improve data reliability and shorten the application processing time of I/O interfaces in real time.

Design of Distributed Computer Systems Using Tabu Search Method (Tabu 탐색 기법을 이용한 분산 컴퓨팅 시스템 설계)

  • Hong, Jin-Won;Kim, Jae-Yearn
    • Journal of Korean Society of Industrial and Systems Engineering / v.18 no.36 / pp.143-152 / 1995
  • This paper determines the allocation of computers and data files that minimizes the sum of the processing and communication costs incurred in processing jobs at each node. Optimally configuring a distributed computer system is an NP-complete problem, and the objective function in this paper is nonlinear and hard to solve. This paper seeks a solution for the distributed processing system using Tabu Search. First, it presents a method for generating a starting solution suited to the distributed processing system. Second, it develops a method for searching neighborhood solutions. Finally, it determines Tabu restrictions appropriate to the distributed processing system. According to the experimental results, the algorithm solves the problems considered in reasonable time and converges effectively. The algorithm developed in this paper is also applicable to general allocation problems in distributed processing systems.


Design of Distributed Processing Framework Based on H-RTGL One-class Classifier for Big Data (빅데이터를 위한 H-RTGL 기반 단일 분류기 분산 처리 프레임워크 설계)

  • Kim, Do Gyun;Choi, Jin Young
    • Journal of Korean Society for Quality Management / v.48 no.4 / pp.553-566 / 2020
  • Purpose: The purpose of this study was to design a framework for generating a one-class classifier based on hyper-rectangles (H-RTGL) in a distributed environment connected by a network. Methods: First, we devised an H-RTGL-based one-class classifier that can be built by distributed computing nodes, taking model and data parallelism into account. We then designed supporting components for executing the distributed processing. Finally, we validated both the effectiveness and the efficiency of the classifier obtained from the proposed framework in a numerical experiment using data obtained from the UCI Machine Learning Repository. Results: We designed a distributed processing framework capable of H-RTGL-based one-class classification in a distributed environment consisting of physically separated computing nodes. It includes components for implementing model and data parallelism, which enable distributed generation of the classifier. In the numerical experiment, statistical tests showed no significant change in classification performance, and elapsed time was reduced by distributed processing on data sets of considerable size. Conclusion: Based on these results, we conclude that applying distributed processing to classifier generation preserves classification performance while improving the efficiency of classification algorithms. We also discuss the limitations of this work and suggest directions for future research.

Implementation of a Real-time Data fusion Algorithm for Flight Test Computer (비행시험통제컴퓨터용 실시간 데이터 융합 알고리듬의 구현)

  • Lee, Yong-Jae;Won, Jong-Hoon;Lee, Ja-Sung
    • Journal of the Korea Institute of Military Science and Technology / v.8 no.4 s.23 / pp.24-31 / 2005
  • This paper presents an implementation of a real-time multi-sensor data fusion algorithm for a Flight Test Computer. The sensor data consist of positional information on the target from a radar, a GPS receiver, and an INS. The data fusion algorithm is designed as a 21st-order distributed Kalman filter based on the PVA model with sensor bias states. Fault detection and correction logic is included in the algorithm to handle bad measurements and sensor faults. The statistical parameters for the states are obtained from Monte Carlo simulations and covariance analysis using test tracking data. The designed filter is verified with real data in both post-processing and real-time processing.

Hilbert-curve based Multi-dimensional Indexing Key Generation Scheme and Query Processing Algorithm for Encrypted Databases (암호화 데이터를 위한 힐버트 커브 기반 다차원 색인 키 생성 및 질의처리 알고리즘)

  • Kim, Taehoon;Jang, Miyoung;Chang, Jae-Woo
    • Journal of Korea Multimedia Society / v.17 no.10 / pp.1182-1188 / 2014
  • Research on database outsourcing has recently been active with the growing popularity of cloud computing. Because users' data may contain sensitive personal information, such as health, financial, and location information, data encryption methods have attracted much interest. Existing data encryption schemes process queries without decrypting the encrypted databases in order to protect user privacy. Meanwhile, handling the large amounts of data in cloud computing efficiently requires a distributed index structure. However, existing index structures and query processing algorithms are limited to single-column query processing. In this paper, we propose a grid-based multi-column indexing scheme and an encrypted query processing algorithm. To support multi-column query processing, multi-dimensional index keys are generated using a space decomposition method, i.e., a grid index. To support query processing over encrypted data, we adopt the Hilbert curve when generating an index key. Finally, we show that the proposed scheme is more efficient than the existing scheme for processing exact-match and range queries.
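
To illustrate how a Hilbert curve can turn a multi-column grid cell into a single index key, here is a minimal sketch of the standard two-dimensional cell-to-distance mapping. It covers only the space-filling-curve step; the encryption and the distributed index of the paper are not modeled, and the name `hilbert_key` and the power-of-two grid size `n` are assumptions made for this example.

```python
def hilbert_key(n: int, x: int, y: int) -> int:
    """Map grid cell (x, y) on an n-by-n grid (n a power of two) to its
    distance along the Hilbert curve, in the range 0 .. n*n - 1."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        # Each quadrant at this level contributes a block of s*s positions.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Keys for a 4x4 grid, listed row by row (y = 0 first).
keys = [[hilbert_key(4, x, y) for x in range(4)] for y in range(4)]
```

Cells that are close in the grid tend to receive nearby keys, so a two-column range query can often be answered by scanning a small number of contiguous key intervals in a one-dimensional index.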

An Iterative Algorithm for the Bottom Up Computation of the Data Cube using MapReduce (맵리듀스를 이용한 데이터 큐브의 상향식 계산을 위한 반복적 알고리즘)

  • Lee, Suan;Jo, Sunhwa;Kim, Jinho
    • Journal of Information Technology and Architecture / v.9 no.4 / pp.455-464 / 2012
  • Due to the recent data explosion, methods that can meet the requirements of large-scale data analysis have been studied. This paper proposes the MRIterativeBUC algorithm, which enables efficient computation of large data cubes through distributed parallel processing with the MapReduce framework. MRIterativeBUC is designed to run the BUC method iteratively and efficiently on MapReduce, and it overcomes the storage size and processing capacity limitations caused by large data cube computation. It borrows the idea of the iceberg cube, which computes only the aspects of interest to analysts, and distributes the cube computation in parallel by partitioning and sorting. It thus reduces the amount of emitted data, which in turn reduces network overload, the processing load on each node, and ultimately the cube computation cost. The bottom-up cube computation and the iterative MapReduce algorithm proposed in this paper can be extended in various ways and can be put to use in many applications.
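
The bottom-up, iceberg-style pruning that the abstract refers to can be sketched on a single machine as follows. This is not MRIterativeBUC itself: the MapReduce partitioning and iteration are left out, and the function name `buc`, the count aggregate, and the `min_sup` threshold are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, ...]


def buc(rows: List[Row], dims: int, min_sup: int, start: int = 0,
        cell: Optional[List[str]] = None,
        out: Optional[Dict[Row, int]] = None) -> Dict[Row, int]:
    """Bottom-up cube computation with iceberg pruning (count aggregate).

    Returns a dict mapping each cube cell, with '*' marking an aggregated
    dimension, to its row count. Cells below min_sup are never expanded.
    """
    if cell is None:
        cell = ['*'] * dims
    if out is None:
        out = {}
    if len(rows) < min_sup:
        return out  # iceberg pruning: no finer cell can reach min_sup either
    out[tuple(cell)] = len(rows)
    for d in range(start, dims):
        # Partition the current rows on dimension d.
        parts: Dict[str, List[Row]] = defaultdict(list)
        for r in rows:
            parts[r[d]].append(r)
        for value, part in parts.items():
            cell[d] = value
            buc(part, dims, min_sup, d + 1, cell, out)
        cell[d] = '*'  # restore before moving on to the next dimension
    return out


rows = [("a", "x"), ("a", "y"), ("b", "x"), ("a", "x")]
cube = buc(rows, dims=2, min_sup=2)
```

With `min_sup=2`, partitions such as `("b", "*")` are dropped as soon as they are found too small, so none of their finer-grained cells are ever computed or emitted.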

Processing-Node Status-based Message Scattering and Gathering for Multi-processor Systems on Chip

  • Park, Jongsu
    • Journal of information and communication convergence engineering / v.17 no.4 / pp.279-284 / 2019
  • This paper presents processing-node status-based message scattering and gathering algorithms for multi-processor systems on chip to reduce the communication time between processors. In the message-scattering part of the message-passing interface (MPI) scatter function, data transmissions are ordered according to the proposed linear algorithm, based on the processor status. The MPI hardware unit in the root processing node checks whether each processing node's status is 'free' or 'busy' when an MPI scatter message is received. It then transfers the data to 'free' processing nodes first, thereby reducing the scattering completion time. In the message-gathering part of the MPI gather function, the data transmissions are likewise ordered according to the proposed linear algorithm before the gathering is performed. The root node receives data from whichever processing node requests a transfer first, which reduces the gathering completion time. The experimental results show that the performance gain of the proposed algorithm grows at an increasing rate as the number of processing nodes increases.
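
The effect of sending to 'free' nodes first can be illustrated with a toy software model. This is a sketch under stated assumptions, not the paper's hardware unit: the root is assumed to send one message at a time, each transfer is assumed to take a fixed time, and a busy node is assumed to accept data only after a known `busy_until` time. The names `scatter_order` and `completion_time` are hypothetical.

```python
from typing import List


def scatter_order(status: List[str]) -> List[int]:
    """Order node ids so that 'free' nodes are served before 'busy' ones.

    A single pass per group keeps the ordering linear in the node count.
    """
    free = [i for i, s in enumerate(status) if s == "free"]
    busy = [i for i, s in enumerate(status) if s == "busy"]
    return free + busy


def completion_time(order: List[int], busy_until: List[float],
                    transfer: float = 1.0) -> float:
    """Toy model: the root sends sequentially and waits for a busy node."""
    t = 0.0
    for node in order:
        t = max(t, busy_until[node]) + transfer
    return t


# Node 0 is busy until t = 3; nodes 1..3 are free from the start.
busy_until = [3.0, 0.0, 0.0, 0.0]
status = ["busy" if b > 0 else "free" for b in busy_until]

in_order = completion_time([0, 1, 2, 3], busy_until)              # waits on node 0 first
free_first = completion_time(scatter_order(status), busy_until)   # overlaps the wait
```

In this toy model the rank-order scatter finishes at time 7, while the free-first order finishes at time 4, because the transfers to free nodes overlap with the busy node's remaining work.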

A study on the Design and the Performance Analysis of Radar Data Integrating Systems for a Early Warning System (조기경보 체제를 위한 통합 레이다 정보처리 시스템의 설계 및 성능분석에 관한 연구)

  • 이상웅;라극환;조동래
    • Journal of the Korean Institute of Telematics and Electronics A / v.29A no.11 / pp.25-39 / 1992
  • With advances in computer-based data processing, early warning systems have recently evolved remarkably in function and performance as components of communication and control systems, which are in turn supported by computer communication and intelligence systems. This paper presents the design of an integrated data processing system that combines the information sent from the various radar systems constituting an early warning system. The proposed system model is divided into two structures, a centralized model and a distributed model, according to the data processing algorithm. We apply queueing theory to analyze the performance of the designed models and use the OPNET system kernel to write the analysis programs in C. By applying the analysis programs to the queueing components of the designed systems, we obtained the tendencies and characteristics of both models: fast data processing performance for the distributed model and stable data processing capability for the centralized model.
