• Title/Summary/Keyword: Parallel data processing

Task Creation and Allocation for Static Load Balancing in Parallel Spatial Join (병렬 공간 조인 시 정적 부하 균등화를 위한 작업 생성 및 할당 방법)

  • Park, Yun-Phil;Yeom, Keun-Hyuk
    • Journal of KIISE:Databases, v.28 no.3, pp.418-429, 2001
  • GIS has recently become central to major computer applications such as urban information systems and transportation information systems. These applications require spatial operations to manage large volumes of data efficiently. Among the basic operations, the spatial join has a response time that grows exponentially with the number of spatial objects involved, which makes it unsuitable for systems that demand fast responses. Efficient parallel processing of spatial joins is therefore required. This paper presents a method for creating and allocating tasks that statically balances the load on each processor in a parallel spatial join. A task graph is built in which each vertex weight is calculated by the proposed cost model, and the graph is then partitioned by a graph partitioning algorithm. In experiments on a CC16 parallel machine, the method improved the static load balance by decreasing the variance of task execution time across processors.
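The idea of statically balancing weighted tasks across processors can be illustrated with a small sketch. This is not the paper's method: the abstract describes partitioning a task graph, whereas the code below swaps in a much simpler greedy longest-processing-time (LPT) heuristic that ignores graph edges. The function name `allocate_tasks`, the task identifiers, and the cost values are all hypothetical; in the paper the costs would come from its cost model.

```python
import heapq
from statistics import pvariance


def allocate_tasks(task_costs, num_processors):
    """Greedy LPT allocation (a stand-in for graph partitioning).

    task_costs: dict mapping a task id to its estimated cost (assumed input).
    Returns (assignment, loads), both keyed by processor index.
    """
    # Min-heap of (current load, processor index): the least-loaded
    # processor is always popped first.
    heap = [(0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_processors)}

    # Place the most expensive tasks first.
    for task_id, cost in sorted(task_costs.items(),
                                key=lambda kv: kv[1], reverse=True):
        load, p = heapq.heappop(heap)
        assignment[p].append(task_id)
        heapq.heappush(heap, (load + cost, p))

    loads = {p: sum(task_costs[t] for t in tasks)
             for p, tasks in assignment.items()}
    return assignment, loads


# Hypothetical task costs, for illustration only.
costs = {"t0": 8, "t1": 7, "t2": 6, "t3": 5, "t4": 4}
assignment, loads = allocate_tasks(costs, 2)
print(assignment, loads, pvariance(loads.values()))
```

The variance of the per-processor loads is the same kind of balance measure the abstract reports, here computed over estimated costs rather than measured execution times.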

Design and Implementation of SDR-based Multi-Constellation Multi-Frequency Real-Time A-GNSS Receiver Utilizing GPGPU

  • Yoo, Won Jae;Kim, Lawoo;Lee, Yu Dam;Lee, Taek Geun;Lee, Hyung Keun
    • Journal of Positioning, Navigation, and Timing, v.10 no.4, pp.315-333, 2021
  • Due to Global Navigation Satellite System (GNSS) modernization, recently launched GNSS satellites transmit signals in various frequency bands such as L1, L2, and L5. Considering the Korean Positioning System (KPS) signal and other future GNSS augmentation signals, more complex communication techniques are likely to be applied to new GNSS signals. For this reason, GNSS receivers based on the flexible Software Defined Radio (SDR) concept need to be developed so that experimental communication techniques can be evaluated by accessing each signal processing module in detail. This paper proposes a novel SDR-based A-GNSS receiver capable of processing multi-GNSS/RNSS signals in multiple frequency bands. Its modular structure gives the proposed receiver high flexibility and expandability. For real-time implementation, A-GNSS server software is designed to deliver satellite ephemeris data immediately on demand. Because of the sampling bandwidth limitation of RF front-ends, multiple SDRs are used to process the multi-GNSS/RNSS multi-frequency signals simultaneously. To avoid overflow of sampled RF data, an efficient memory buffer management strategy is adopted. To collect and process the multi-GNSS/RNSS multi-frequency signals in real time, the proposed SDR A-GNSS receiver uses multiple threads on a CPU and multiple NVIDIA CUDA GPGPUs for parallel processing. The performance of the proposed receiver was evaluated in several experiments with field-collected data. The experiments showed that A-GNSS requirements can be satisfied using only milliseconds of samples. Continuous signal tracking was also confirmed with hundreds of milliseconds of data for multi-GNSS/RNSS multi-frequency signals and with ten seconds of data for multi-GNSS/RNSS single-frequency signals.

Implementation of the Squared-Error Pattern Clustering Processor Using the Residue Number System (剩餘數體系를 이용한 자승오차 패턴 클러스터링 프로세서의 실현)

  • Kim, Hyeong-Min;Cho, Won-Kyung
    • Journal of the Korean Institute of Telematics and Electronics, v.26 no.2, pp.87-93, 1989
  • The squared-error pattern clustering algorithm, used in unsupervised pattern recognition and image processing applications, demands substantial processing time for operations on the feature vector matrix. This paper therefore proposes a fast squared-error pattern clustering processor based on the Residue Number System (RNS), which is inherently suited to parallel and pipelined processing. In image segmentation experiments, the proposed processor showed a satisfactory error rate for cluster numbers that divide the image into meaningful regions, and it ran about 200 times faster than an 80287 coprocessor. The processor is therefore useful for real-time applications that process large amounts of data.
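Why a residue number system lends itself to parallel arithmetic can be shown with a short sketch. It is an illustration only, not the paper's processor design: the moduli in `MODULI` and the function names are assumptions chosen for the example. Each residue channel accumulates the squared error independently, with no carries between channels, and the ordinary integer is recovered at the end with the Chinese Remainder Theorem.

```python
from math import prod

# Hypothetical pairwise-coprime moduli; the result must stay below their product.
MODULI = (251, 253, 255, 256)


def squared_error_rns(a, b):
    """Accumulate sum((a_i - b_i)^2) separately in each residue channel."""
    acc = [0] * len(MODULI)
    for x, y in zip(a, b):
        for i, m in enumerate(MODULI):
            # Channels never exchange carries, so in hardware each
            # channel could run in parallel.
            d = (x - y) % m
            acc[i] = (acc[i] + d * d) % m
    return tuple(acc)


def from_rns(residues):
    """Recover the integer from its residues (Chinese Remainder Theorem)."""
    big_m = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)  # modular inverse, Python 3.8+
    return total % big_m


feature = [10, 200, 37]   # illustrative feature vector
center = [12, 180, 40]    # illustrative cluster center
print(from_rns(squared_error_rns(feature, center)))
```

The conversion back to a conventional integer is the expensive step, which is why RNS designs typically keep intermediate results in residue form as long as possible.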

Real-time Full-view 3D Human Reconstruction using Multiple RGB-D Cameras

  • Yoon, Bumsik;Choi, Kunwoo;Ra, Moonsu;Kim, Whoi-Yul
    • IEIE Transactions on Smart Processing and Computing, v.4 no.4, pp.224-230, 2015
  • This manuscript presents a real-time solution for 3D human body reconstruction with multiple RGB-D cameras. The proposed system uses four consumer RGB/Depth (RGB-D) cameras, each located at approximately $90^{\circ}$ from the next camera around a freely moving human body. A single mesh is constructed from the captured point clouds by iteratively removing the estimated overlapping regions from the boundary. A cell-based mesh construction algorithm is developed, recovering the 3D shape from various conditions, considering the direction of the camera and the mesh boundary. The proposed algorithm also allows problematic holes and/or occluded regions to be recovered from another view. Finally, calibrated RGB data is merged with the constructed mesh so it can be viewed from an arbitrary direction. The proposed algorithm is implemented with general-purpose computation on graphics processing unit (GPGPU) for real-time processing owing to its suitability for parallel processing.

An Improved Convex Hull Algorithm Considering Sort in Plane Point Set (평면 점집합에서 정렬을 고려한 개선된 컨벡스 헐 알고리즘)

  • Park, Byeong-Ju;Lee, Jae-Heung
    • Journal of IKEEE, v.17 no.1, pp.29-35, 2013
  • In this paper, we propose an improved convex hull algorithm for planar point sets that takes sorting into account. The algorithm has low computational complexity because the characteristics of extreme points reduce the amount of data to be processed. It also obtains the complete convex set in a single pass using a convex vertex discrimination criterion. The algorithm first requires the point set to be sorted, but sorting is slow because of its heavy operations. We solved this problem by replacing value and index. We measured the execution time of the algorithms on randomly generated point sets. The experimental results show that the proposed algorithm is about 2 times faster than the existing algorithm.
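A sort-then-scan convex hull can be sketched as follows. The abstract does not give the details of its discrimination criterion or its sorting trick, so this example uses Andrew's monotone chain, a well-known algorithm that likewise sorts the points first and then decides whether each vertex is convex with a cross-product test. The function names are chosen for illustration.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))  # the sort dominates the cost: O(n log n)
    if len(pts) <= 2:
        return pts

    def half_hull(seq):
        hull = []
        for p in seq:
            # Discard the last vertex while it fails the convexity test.
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    # The last point of each half is the first point of the other.
    return lower[:-1] + upper[:-1]


print(convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]))
```

After sorting, each point is pushed and popped at most once per half, so the scan itself is linear and the sort is the step worth optimizing, which matches the abstract's emphasis.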

An Implementation of a Convolutional Accelerator based on a GPGPU for a Deep Learning (Deep Learning을 위한 GPGPU 기반 Convolution 가속기 구현)

  • Jeon, Hee-Kyeong;Lee, Kwang-yeob;Kim, Chi-yong
    • Journal of IKEEE, v.20 no.3, pp.303-306, 2016
  • In this paper, we propose a method to accelerate a convolutional neural network (CNN) using a GPGPU. A CNN is a kind of neural network that learns features of images, and it suits image processing tasks that require learning from large amounts of data. The convolutional layer of a conventional CNN requires a large number of multiplications, which makes real-time operation difficult in an embedded environment. We reduce the number of multiplications through the Winograd convolution operation and process the convolution in parallel on a SIMT-based GPGPU. The experiment was conducted using ModelSim and TestDrive, and the results showed that the processing time improved by about 17% compared with conventional convolution.
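How Winograd convolution trades multiplications for additions can be seen in the smallest standard case, F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of the 6 a direct computation needs. The sketch below is a plain one-dimensional Python illustration, not the paper's GPGPU implementation, and the function names are assumptions.

```python
def direct_conv(d, g):
    """Two outputs of a 3-tap filter over 4 inputs: 6 multiplications."""
    return (d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2])


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs with 4 multiplications."""
    # Filter transform: depends only on g, so it can be precomputed once
    # per filter and reused for every input tile.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Four element-wise multiplications on the transformed input.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform: additions and subtractions only.
    return (m1 + m2 + m3, m2 - m3 - m4)


tile = [1.0, 2.0, 3.0, 4.0]      # illustrative input tile
kernel = [1.0, 0.0, -1.0]        # illustrative 3-tap filter
print(direct_conv(tile, kernel), winograd_f23(tile, kernel))
```

Each tile is independent of the others, which is what makes the scheme a natural fit for SIMT-style parallel execution.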

The Distributed Encryption Processing System for Large Capacity Personal Information based on MapReduce (맵리듀스 기반 대용량 개인정보 분산 암호화 처리 시스템)

  • Kim, Hyun-Wook;Park, Sung-Eun;Euh, Seong-Yul
    • Journal of the Korea Institute of Information and Communication Engineering, v.18 no.3, pp.576-585, 2014
  • Collecting and utilizing huge amounts of personal data has caused severe security issues such as leakage of personal information. Several encryption algorithms for collected personal information have been widely adopted to prevent such problems. In this paper, a novel algorithm based on MapReduce is proposed for encrypting such private information. Furthermore, a test environment was built to verify the performance of the distributed encryption processing method. In the test, average time efficiency improved by 15.3% compared with encryption processing on a token server and by 3.13% compared with parallel processing.

Low-Power Multiplication Processing Element Hardware to Support Parallel Convolutional Neural Network Processing (합성곱 신경망 병렬 연산처리를 지원하는 저전력 곱셈 프로세싱 엘리먼트 설계)

  • Eunpyoung Park;Jongsu Park
    • Journal of Platform Technology, v.12 no.2, pp.58-63, 2024
  • CNNs tend to take a long time to train and to consume a lot of power, because system resources with many data processing units are lacking and repetitive processing does not achieve high performance in the image field. In this paper, we propose a processing method based on a low-power bus that can increase the exchange rate of multipliers and multiplicands within the convolution mixer when it performs multiplication, which is the core operation of convolution. The design provides low-power shared processor support for convolutional neural networks and was implemented on an Intel DE1-SoC FPGA board using Verilog HDL. The experiments validated the performance by comparing it with the exchange rate of the multiplier originally proposed by Shen, using the MNIST digit image database.

Multi-Server based Distributed Coding based on HEVC/H.265 for Studio Quality Video Editing

  • Kim, Jongho;Lim, Sung-Chang;Jeong, Se-Yoon;Kim, Hui-Yong
    • Journal of Multimedia Information System, v.5 no.3, pp.201-208, 2018
  • High Efficiency Video Coding range extensions (HEVC RExt) is an extension of HEVC designed specifically for high-quality images. HEVC RExt is essential for studio editing, which handles very high-quality images of various types. Handling such massive data in studio editing raises several problems, and one of the most important concerns the re-encoding and decoding procedure during editing. Various codecs are widely used for studio data editing, but most of them share common problems with massive data: re-encoding and decoding occur frequently during editing, which consumes enormous time and degrades video quality. In this paper, we propose a new video coding structure for efficient studio video editing, called "ultra-low delay (ULD)". It has a very simple, low-delay referencing structure. Simplifying the referencing structure minimizes the number of frames that need decoding and re-encoding, and it also prevents the quality degradation caused by frequent re-encoding. Various fast coding algorithms are also proposed for efficient editing, such as tool-level optimization, multi-server-based distributed coding, and SIMD (single instruction, multiple data) based parallel processing, which reduce the enormous computational complexity of the editing procedure. The proposed method shows 9500 times faster coding speed with negligible loss of quality, and a better coding gain than the "intra only" structure. We confirm that the proposed method efficiently solves the existing problems of studio video editing.

Development of a gridded crop growth simulation system for the DSSAT model using script languages (스크립트 언어를 사용한 DSSAT 모델 기반 격자형 작물 생육 모의 시스템 개발)

  • Yoo, Byoung Hyun;Kim, Kwang Soo;Ban, Ho-Young
    • Korean Journal of Agricultural and Forest Meteorology, v.20 no.3, pp.243-251, 2018
  • The gridded simulation of crop growth, which would be useful for shareholders and policy makers, often requires specialized computation tasks to prepare weather input data and operate a given crop model. Here we developed an automated system for crop growth simulation over a region using the DSSAT (Decision Support System for Agrotechnology Transfer) model. The system consists of modules implemented in R and shell script. One module creates weather input files in plain text format for each cell. Another module, written in R, handles GIS data processing and parallel computing. A third module, implemented in shell script, launches the crop model automatically. As a case study, the automated system was used to determine the maximum soybean yield for a given set of management options in the state of Illinois in the US. The AgMERRA dataset, which is reanalysis data for agricultural models, was used to prepare weather input files for 1981-2005. It took 7.38 hours to create 1,859 weather input files for one year of soybean growth simulation in Illinois using a single CPU core. In contrast, the processing time decreased considerably, to 35 minutes, when 16 CPU cores were used. The automated system created a map, in a raster data format, of the maturity group and the planting date that resulted in the maximum yield. Our results indicate that the automated system for the DSSAT model would help spatial assessments of crop yield at a regional scale.
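The two timings reported above imply a speedup and a parallel efficiency, which the abstract does not state explicitly. The short calculation below derives both from the reported figures only (7.38 hours on one core, 35 minutes on 16 cores); the function name is an assumption for the example.

```python
def speedup_and_efficiency(serial_minutes, parallel_minutes, cores):
    """Speedup = T_serial / T_parallel; efficiency = speedup / cores."""
    speedup = serial_minutes / parallel_minutes
    return speedup, speedup / cores


serial_minutes = 7.38 * 60        # 7.38 hours on a single CPU core
speedup, efficiency = speedup_and_efficiency(serial_minutes, 35.0, 16)
print(f"speedup ~ {speedup:.2f}x, efficiency ~ {efficiency:.0%}")
```

On these figures the 16-core run is roughly 12.7 times faster, or about 79% of ideal linear scaling.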