• Title/Summary/Keyword: Parallel data processing

Extracting Maximum Parallelism for Parallel Computing (병렬 계산을 위한 최대 병렬성 추출 방법)

  • Park, Doo-Soon
    • The Journal of Korean Association of Computer Education / v.8 no.1 / pp.93-103 / 2005
  • Since most program execution time is spent in loops, extracting parallelism from sequential loop programs is critical for faster execution. Conventional studies on extracting parallelism focus mostly on uniform data dependence distances. In this paper, we propose a data dependency elimination method for nested loops and an extended data dependency elimination method that extracts parallelism from loops with procedure calls. Both methods can be applied to uniform and non-uniform data dependence distances. We compared our methods with conventional methods on a CRAY-T3E for performance evaluation. The results show that the proposed algorithms are very effective.

Development of the design methodology for large-scale database based on MongoDB

  • Lee, Jun-Ho;Joo, Kyung-Soo
    • Journal of the Korea Society of Computer and Information / v.22 no.11 / pp.57-63 / 2017
  • Big data, which has grown suddenly in recent years, is characterized by continuous generation, large volume, and unstructured format. Existing relational database technologies are inadequate for handling such data because of limited processing speed and significant storage expansion costs. Thus, big data processing technologies, which are normally based on distributed file systems, distributed database management, and parallel processing, have emerged as core technologies for implementing big data repositories. In this paper, we propose a design methodology for large-scale databases based on MongoDB by extending the information engineering methodology based on the E-R data model.

Efficient Parallel Logic Simulation on SIMD Computers (SIMD 컴퓨터상에서 효율적인 병렬처리 논리 시뮬레이션)

  • Chung, Yun-Mo
    • The Transactions of the Korea Information Processing Society / v.3 no.2 / pp.315-326 / 1996
  • As the complexity of VLSI circuits has increased, verifying their correctness has required a great deal of simulation time. This paper presents efficient parallel logic simulation protocols, data structures, and algorithms for implementing fast logic simulation on SIMD parallel processing computers. Performance results of the presented schemes on the CM-2 are given and analyzed.

A Rule Generation Technique Utilizing a Parallel Expansion Method (병렬확장을 활용한 규칙생성 기법)

  • Lee, Kee-Cheol;Kim, Jin-Bong
    • The Transactions of the Korea Information Processing Society / v.5 no.4 / pp.942-950 / 1998
  • Extracting knowledge from raw data, especially in the form of rules, is very important in data mining, whose aim is to help users who lack knowledge despite an abundance of data. Logic minimization tools derive optimized knowledge from a given ON set and DC set. The parallel expansion scheme of logic minimization is first extracted and used to obtain initial knowledge, from which the final rules are obtained; these rules are successfully applicable to real-world data. A prototype system based on this new approach has been tested with real-world data to show that it is as practical as conventional, long-studied decision tree methods such as the C4.5 system.

Implementation of Underwater Simulation of a Net using OpenMP (OpenMP 병렬프로그램을 이용한 그물의 수중형상 시뮬레이션 구현)

  • Park, Myeong-Chul;Park, Seok-Gyu
    • Journal of the Korea Society of Computer and Information / v.13 no.2 / pp.11-17 / 2008
  • The shape of a net underwater is affected by various vectors. Computing the effect of all vectors on each particle of the net increases accuracy and realism, but the time complexity grows because of the large amount of computation. Previous techniques reduced physical realism and instead implemented an underwater virtual reality that enhances visual realism through simulation. In this paper, by processing the particles in parallel, we implement a simulation that satisfies both physical realism and real-time performance. OpenMP is used for the parallel processing, and OpenGL is used for realistic graphical rendering. The proposed simulation may serve as basic data for model analysis or for expert systems in the game and marine fields.

Parallelization of Raster GIS Operations Using PC Clusters (PC 클러스터를 이용한 래스터 GIS 연산의 병렬화)

  • 신윤호;박수홍
    • Spatial Information Research / v.11 no.3 / pp.213-226 / 2003
  • With the increasing demand for processing massive geographic data, conventional GISs based on a single-processor architecture have become problematic. In particular, performing complex GIS operations on massive geographic data is very time consuming and sometimes impossible, because the growth in processor speed has not kept up with the volume of data to be processed. In the field of GIS, PC clustering is one of the emerging technologies for handling massive geographic data effectively. In this study, an MPI (Message Passing Interface)-based parallel processing approach was used to implement existing raster GIS operations that typically require massive geographic data sets, in order to improve processing capability and performance. Specifically, this research considered the four types of raster GIS operations that Tomlin (1990) introduced for the systematic analysis of raster GIS operations. A data decomposition method was designed and implemented for the selected raster GIS operations.

Implementation of high performance parallel LU factorization program for multi-threads on GPGPUs (GPGPU의 멀티 쓰레드를 활용한 고성능 병렬 LU 분해 프로그램의 구현)

  • Shin, Bong-Hi;Kim, Young-Tae
    • Journal of Internet Computing and Services / v.12 no.3 / pp.131-137 / 2011
  • GPUs were originally designed for graphics processing, and GPGPUs are general-purpose GPUs that provide high-performance numerical computation at low power consumption. In this paper, we implement a parallel LU factorization program for GPGPUs. In CUDA, the computational environment for Nvidia GPGPUs, the domain is divided into blocks, and multiple threads compute the sub-blocks simultaneously. In an LU factorization program, the computation order must be explicitly determined because of data dependences. To resolve the data dependency, we propose a parallel LU program for GPGPUs and also explain a parallel reduction algorithm for the partial pivoting of LU factorization. Finally, we present a performance analysis showing the efficiency of the multi-threaded parallel LU factorization program on GPGPUs.

A Parallel Algorithm for Merging Heaps on MasPar Machine (MasPar 머쉰상의 병렬 힙 병합 알고리즘)

  • Min, Yong-Sik
    • The Transactions of the Korea Information Processing Society / v.2 no.4 / pp.554-560 / 1995
  • In this paper, we propose a parallel algorithm for merging priority queues organized as two heaps, kheap and nheap, of sizes k and n, respectively. Employing $\max(2^{i-1}, \ulcorner (m+1)/4 \lrcorner)$ processors, the algorithm requires $O(\log(n/k) \cdot \log n)$ time on an EREW-PRAM, where i is the height of the heap and m is the sum of the sizes n and k. When run on the MasPar machine, the method achieves a 33.934-fold speedup with 64 processors when merging 8 million data items held in two heaps of different sizes. Our parallel algorithm's EPU is therefore close to 1, which is considered an optimal speedup ratio.

Support vector machines for big data analysis (빅 데이터 분석을 위한 지지벡터기계)

  • Choi, Hosik;Park, Hye Won;Park, Changyi
    • Journal of the Korean Data and Information Science Society / v.24 no.5 / pp.989-998 / 2013
  • Big data, which has recently attracted attention in industry and academia, cannot be analyzed with the batch processing algorithms developed in data mining because, by definition, it cannot be loaded and processed in the memory of a single system. A pressing issue is therefore to develop learning algorithms that can be applied to big data. In this paper, we review various algorithms for support vector machines in the literature. In particular, we introduce online and parallel processing algorithms that are expected to be useful in big data classification, and compare the strengths, weaknesses, and performance of those algorithms through simulations for linear classification.

Gene Expression Data Analysis Using Parallel Processor based Pattern Classification Method (병렬 프로세서 기반의 패턴 분류 기법을 이용한 유전자 발현 데이터 분석)

  • Choi, Sun-Wook;Lee, Chong-Ho
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.6 / pp.44-55 / 2009
  • Disease diagnosis using gene expression data obtained from microarray chips has recently been an active research area. Because such data are difficult to analyze directly, the analysis has been done with general machine learning algorithms. However, recent research indicates that analysis based on the interactions between genes is essential for gene expression analysis, which means that analysis using traditional machine learning algorithms has limitations. In this paper, we classify gene expression data using the hypernetwork model, which considers higher-order correlations between features, and compare the classification accuracies. We also present a new hypernetwork model that addresses the disadvantages of the existing model, and compare the processing performance of the existing hypernetwork model on a general sequential processor with that of the improved hypernetwork model implemented on parallel processors. The experimental results show that our model achieves improved and competitive classification performance compared with traditional machine learning methods as well as the existing hypernetwork model, and that performance is maximized when the hypernetwork model is implemented on our parallel processors.