• Title/Summary/Keyword: massive parallel system


Dynamic Load Balancing Algorithm using Execution Time Prediction on Cluster Systems

  • Yoon, Wan-Oh;Jung, Jin-Ha;Park, Sang-Bang
    • Proceedings of the IEEK Conference / 2002.07a / pp.176-179 / 2002
  • In recent years, a growing body of computer network research has focused on cluster systems as a way to achieve higher performance at lower cost. Load imbalance is the major factor that degrades the performance of a cluster system running parallel programs in SPMD (Single Program Multiple Data) form. Load imbalance is also a problem in MPP (Massively Parallel Processor) systems and distributed systems. Because a cluster system is a loosely coupled distributed system, it has higher communication overhead than an MPP. Dynamic load balancing can solve the load imbalance problem of a cluster system and reduce its communication cost. The cluster systems considered in this paper consist of P heterogeneous nodes connected by a switch-based network. The master node predicts the average task execution time of each slave node from information reported by that node. It then redistributes the remaining tasks among the nodes, taking into account the predicted execution time and the communication overhead of task migration. The proposed dynamic load balancing uses execution time prediction to optimize task redistribution. Performance factors such as the number of nodes, the number of tasks, and communication cost are considered in order to improve cluster performance. Simulation results verify the effectiveness of the proposed dynamic load balancing algorithm.
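
The redistribution step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formula, so the following rests on assumptions: the hypothetical function `redistribute` splits the remaining tasks in inverse proportion to each node's predicted per-task execution time and ignores the communication overhead of task migration, which the paper does take into account.

```python
def redistribute(remaining_tasks, avg_exec_time):
    """Split remaining_tasks across nodes in inverse proportion to each
    node's predicted average per-task execution time (assumed rule,
    not the paper's formula; migration cost is ignored)."""
    speeds = [1.0 / t for t in avg_exec_time]  # tasks per unit time
    total = sum(speeds)
    # Round each proportional share down first.
    shares = [int(remaining_tasks * s / total) for s in speeds]
    # Hand the tasks lost to rounding to the fastest nodes first.
    leftover = remaining_tasks - sum(shares)
    fastest_first = sorted(range(len(speeds)), key=lambda i: speeds[i], reverse=True)
    for i in fastest_first[:leftover]:
        shares[i] += 1
    return shares
```

With this rule, a node predicted to take twice as long per task receives roughly half as many of the remaining tasks, so all nodes are expected to finish at about the same time.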


GPU-Based ECC Decode Unit for Efficient Massive Data Reception Acceleration

  • Kwon, Jisu;Seok, Moon Gi;Park, Daejin
    • Journal of Information Processing Systems / v.16 no.6 / pp.1359-1371 / 2020
  • When transmitting and receiving a large amount of data, reliable data communication is crucial for the normal operation of a device and for preventing abnormal operation caused by errors. This paper therefore assumes that an error correction code (ECC), which can detect and correct errors by itself, is used in an environment where massive data is received sequentially. Because an embedded system has limited resources, such as a low-performance processor or a small memory, its applications must operate efficiently. In this paper, we propose an accelerated ECC-decoding technique that uses a graphics processing unit (GPU) built into the embedded system when receiving a large amount of data. In the matrix-vector multiplication underlying the Hamming code used for the ECC operation, the matrix is expressed in compressed sparse row (CSR) format, and a sparse matrix-vector product is used. The multiplication is performed in a GPU kernel, and the Hamming code computation is also accelerated so that the ECC operation can be performed in parallel. The proposed technique is implemented with CUDA on a GPU-embedded target board, the NVIDIA Jetson TX2, and its execution time is compared with that of the CPU.
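
As an illustration of the CSR-based sparse matrix-vector product mentioned in this abstract, the sketch below computes a Hamming syndrome on the CPU in plain Python. It is not the paper's CUDA kernel. The (7,4) code, the particular parity-check matrix, and the names `INDICES`, `INDPTR`, `syndrome`, and `correct` are assumptions chosen for illustration; the abstract does not say which Hamming code is used.

```python
# CSR form of an assumed (7,4) Hamming parity-check matrix H (3 x 7),
# whose column j (1-based) is the binary representation of j.
# Every nonzero equals 1, so only column indices and row pointers are kept.
INDICES = [0, 2, 4, 6,   # row 0: bit positions 1, 3, 5, 7
           1, 2, 5, 6,   # row 1: bit positions 2, 3, 6, 7
           3, 4, 5, 6]   # row 2: bit positions 4, 5, 6, 7
INDPTR = [0, 4, 8, 12]

def syndrome(received):
    """Sparse matrix-vector product H * received over GF(2)."""
    out = []
    for row in range(len(INDPTR) - 1):
        acc = 0
        for k in range(INDPTR[row], INDPTR[row + 1]):
            acc ^= received[INDICES[k]]  # addition modulo 2
        out.append(acc)
    return out

def correct(received):
    """Flip the single bit the syndrome points at, if any."""
    s = syndrome(received)
    pos = s[0] + 2 * s[1] + 4 * s[2]  # 1-based error position, 0 if clean
    fixed = list(received)
    if pos:
        fixed[pos - 1] ^= 1
    return fixed
```

Each row of the product is independent of the others, and so is each received word, which is what makes this computation a natural fit for one GPU thread per row or per word.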

The Design of Parallel Processing S/W Using CUDA for Realtime 3D Laser Ladar Imaging System (실시간 3차원 레이저 레이더 영상 생성을 위한 CUDA 기반 병렬처리 소프트웨어 설계)

  • Cho, Yong Il;Ha, Choong Lim;Yang, Ji Hyeon;Kim, Jae Hyup
    • Journal of the Korea Society of Computer and Information / v.18 no.1 / pp.1-10 / 2013
  • In this paper, we propose a CUDA (Compute Unified Device Architecture) based software design method for a parallel CPU (Central Processing Unit) and GPU (Graphics Processing Unit) structure to achieve real-time processing in a 3D laser radar (LADAR) imaging system. LADAR is a complex system that generates 3-dimensional images from laser ranging information, and it requires massive processing resources in each phase. Designing and implementing a parallel structure is therefore crucial to achieving real-time processing within limited system resources. By analyzing the processing algorithm in each phase and allocating the separable workload to the CUDA GPU, we meet the required real-time processing speed and confirm a 46% increase in processing speed.

Method for Importance based Streamline Generation on the Massive Fluid Dynamics Dataset (대용량 유동해석 데이터에서의 중요도 기반 스트림라인 생성 방법)

  • Lee, Joong-Youn;Kim, Min Ah;Lee, Sehoon
    • The Journal of the Korea Contents Association / v.18 no.6 / pp.27-37 / 2018
  • Streamline generation is one of the most representative visualization methods for analyzing the flow in a fluid dynamics dataset. Determining seed locations for effective streamline visualization, however, is a challenging problem. In addition, computing effective seed locations and streamlines on a massive flow dataset takes a long time. In this paper, we propose both an importance-based method for determining seed locations for effective streamline placement and a parallel streamline visualization method for a distributed visualization system. We also present case studies on real fluid dynamics datasets using the GLOVE visualization system to evaluate the proposed method.

Optoneural Multitarget Tracking System Based on Optical BJTC and Neural Networks (광 BJTC와 신경회로망을 이용한 광-신경망 다중 표적 추적 시스템)

  • 이상이;류충상;김승현;김은수
    • Journal of the Korean Institute of Telematics and Electronics A / v.31A no.3 / pp.1-9 / 1994
  • In this paper, as a new approach to real-time multitarget tracking, a hybrid OptoNeural multitarget tracking system based on an optical BJTC and a neural network data association algorithm is suggested. In the proposed hybrid tracking system, an optical BJTC is introduced as a preprocessor to reduce the massive input target data to a few correlation peak signals, and the neural network data association algorithm is then used for massively parallel data association between measurement signals and targets in real time. Finally, a new hybrid OptoNeural target tracking system is constructed, and some experimental results on multitarget tracking are included. A real-time implementation method for the proposed hybrid system is also discussed.


A Parallel Mode Confocal System using a Micro-Lens and Pinhole Array in a Dual Microscope Configuration (이중 현미경 구조를 이용한 마이크로 렌즈 및 핀홀 어레이 기반 병렬 공초점 시스템)

  • Bae, Sang Woo;Kim, Min Young;Ko, Kuk Won;Koh, Kyung Chul
    • Journal of Institute of Control, Robotics and Systems / v.19 no.11 / pp.979-983 / 2013
  • The three-dimensional measurement method of confocal systems is a spot scanning method with high resolution and good illumination efficiency. However, conventional confocal systems have a weak point: they must perform XY-axis scanning to cover the FOV (Field of View) through spot scanning. Several methods address this problem, including the use of a galvano mirror [1], a pin-hole array, etc. In this paper, we propose a method to improve a parallel mode confocal system using a micro-lens and pin-hole array in a dual microscope configuration. We made area scanning possible by combining an MLA (Micro Lens Array) with a pin-hole array, and used an objective lens to improve the light transmittance and signal-to-noise ratio. Additionally, we made the objective lens interchangeable so that a lens can be selected according to the reflection characteristics of the measured object and the required magnification. We conducted experiments using 5X and 2.3X objective lenses and calibrated height using a VLSI calibration target.

Design and Implementation of Big Data Platform for Image Processing in Agriculture (농업 이미지 처리를 위한 빅테이터 플랫폼 설계 및 구현)

  • Nguyen, Van-Quyet;Nguyen, Sinh Ngoc;Vu, Duc Tiep;Kim, Kyungbaek
    • Annual Conference of KIPS / 2016.10a / pp.50-53 / 2016
  • Image processing techniques play an increasingly important role in many aspects of our daily life. For example, they have been shown to improve agricultural productivity in a number of ways, such as plant pest detection and fruit grading. However, the massive quantities of images generated in real time by multiple devices, such as remote sensors monitoring plant growth, lead to the challenges of big data. Meanwhile, most current image processing systems are designed for small-scale, local computation, and they do not scale well to big data problems with their large requirements for computational resources and storage. In this paper, we propose the IPABigData (Image Processing Algorithm BigData) platform, which provides algorithms to support large-scale image processing in agriculture based on the Hadoop framework. Hadoop provides the MapReduce parallel computation model and the Hadoop Distributed File System (HDFS) module. It can also handle parallel pipelines, which are frequently used in image processing. In our experiment, we show that our platform outperforms a traditional system in an image segmentation scenario.

Multi-Sever based Distributed Coding based on HEVC/H.265 for Studio Quality Video Editing

  • Kim, Jongho;Lim, Sung-Chang;Jeong, Se-Yoon;Kim, Hui-Yong
    • Journal of Multimedia Information System / v.5 no.3 / pp.201-208 / 2018
  • High Efficiency Video Coding range extensions (HEVC RExt) is an extension of HEVC designed specifically for handling high-quality images. HEVC RExt is essential for studio editing, which handles very high-quality images of various types. Handling such massive data in studio editing raises several problems, and one of the most important is the re-encoding and decoding performed during editing. Various codecs are widely used for studio data editing, but most of them share common problems in handling massive data. In particular, re-encoding and decoding occur frequently during editing, which consumes an enormous amount of time and causes video quality loss. In this paper, we suggest a new video coding structure for efficient studio video editing, called "ultra-low delay (ULD)". It has a very simple, low-delay referencing structure. Simplifying the referencing structure minimizes the number of frames that need to be decoded and re-encoded, and also prevents the quality degradation caused by frequent re-encoding. Various fast coding algorithms are also proposed for efficient editing, such as tool-level optimization, multi-server based distributed coding, and SIMD (Single Instruction, Multiple Data) based parallel processing, which reduce the enormous computational complexity of the editing procedure. The proposed method shows 9500 times faster coding speed with negligible loss of quality. It also shows better coding gain compared to the "intra only" structure. We confirm that the proposed method can efficiently solve the existing problems of studio video editing.

GLOVE: Distributed Shared Memory Based Parallel Visualization Tool for Massive Scientific Dataset (GLOVE: 대용량 과학 데이터를 위한 분산공유메모리 기반 병렬 가시화 도구)

  • Lee, Joong-Youn;Kim, Min Ah;Lee, Sehoon;Hur, Young Ju
    • KIPS Transactions on Software and Data Engineering / v.5 no.6 / pp.273-282 / 2016
  • A visualization tool can be divided into three components: data I/O, visual transformation, and interactive rendering. In this paper, we present the requirements on these three major components of a visualization tool for massive scientific datasets and propose strategies for developing a tool that satisfies those requirements. In particular, we present how to use open source software to realize our goal efficiently. Furthermore, we study how to combine several separately developed open source software packages into a single visualization tool and optimize it for real-time visualization of massive spatio-temporal scientific datasets. Finally, we propose a distributed shared memory based scientific visualization tool called "GLOVE". We present a performance comparison between GLOVE and well-known open source visualization tools such as ParaView and VisIt.

Multi-Target Tracking System based on Neural Network Data Association Algorithm (신경회로망 데이터 연관 알고리즘에 근거한 다중표적 추적 시스템)

  • 이진호;류충상;김은수
    • Journal of the Korean Institute of Telematics and Electronics A / v.29A no.11 / pp.70-77 / 1992
  • Conventional tracking algorithms are generally very limited in practical applications because their computational load increases exponentially with the number of targets being tracked. Recently, to overcome this limitation, new tracking methods based on neural network algorithms, which have learning and parallel processing capabilities, have been introduced. By applying neural networks to multi-target tracking problems, the tracking system can be made computationally independent of the number of objects being tracked, owing to their massive parallelism and dense interconnectivity. In this paper, a new neural network tracking algorithm capable of adaptive target tracking with little increase in computation in cluttered and noisy environments is suggested, and the feasibility of a real-time multi-target tracking system based on neural networks is demonstrated through computer simulation results.
