• Title/Summary/Keyword: Parallel Processing method

GPU-ACCELERATED SPECKLE MASKING RECONSTRUCTION ALGORITHM FOR HIGH-RESOLUTION SOLAR IMAGES

  • Zheng, Yanfang;Li, Xuebao;Tian, Huifeng;Zhang, Qiliang;Su, Chong;Shi, Lingyi;Zhou, Ta
    • Journal of The Korean Astronomical Society / v.51 no.3 / pp.65-71 / 2018
  • The near real-time speckle masking reconstruction technique has been developed to accelerate the processing of solar images to achieve high resolutions for ground-based solar telescopes. However, the reconstruction of solar subimages in such a speckle reconstruction is very time-consuming. We design and implement a new parallel speckle masking reconstruction algorithm based on the Compute Unified Device Architecture (CUDA) on General Purpose Graphics Processing Units (GPGPU). Tests are performed to validate the correctness of our program on NVIDIA GPGPU. Details of several parallel reconstruction steps are presented, and the parallel implementation between various modules shows a significant speed increase compared to the previous serial implementations. In addition, we present a comparison of runtimes across serial programs, the OpenMP-based method, and the new parallel method. The new parallel method shows a clear advantage for large-scale data processing, and a speedup of around 9 to 10 is achieved in reconstructing one solar subimage of $256 \times 256$ pixels. The speedup performance of the new parallel method exceeds that of the OpenMP-based method overall. We conclude that the new parallel method would be of value, and contribute to real-time reconstruction of an entire solar image.

Performance Evaluation of PDP System Using Realtime Network Monitoring (실시간 네트워크 모니터링을 적용한 PDP 시스템의 성능 평가)

  • Song, Eun-Ha;Jeong, Jae-Hong;Jeong, Young-Sik
    • The KIPS Transactions:PartA / v.11A no.3 / pp.181-188 / 2004
  • PDP (Parallel/Distributed Processing) is an internet-based parallel/distributed processing system that uses the resources of idle hosts on the internet to run large-scale applications in parallel, thereby reducing total execution time. In this paper, we propose a method that adapts to a network environment that can change at any time, using real-time monitoring of hosts. Experiments show that parallel/distributed processing with monitoring performs better than processing without it. As an adaptive strategy, monitoring copes with task delays caused by network overload and faults, and it can be applied to the task allocation algorithm in PDP.

Study on the parallel processing algorithms with implicit integration method for real-time vehicle simulator development (실시간 차량 시뮬레이터 개발을 위한 암시적 적분기법을 이용한 병렬처리 알고리즘에 관한 연구)

  • 박민영;이정근;배대성
    • Proceedings of the Korean Society of Precision Engineering Conference / 1995.10a / pp.497-500 / 1995
  • In this paper, a program for real-time simulation of a vehicle is developed. The program uses relative coordinates and the BDF (Backward Difference Formula) numerical integration method. Numerical tests showed that the proposed implicit method is more stable than the explicit method in carrying out the numerical integration for vehicle dynamics. Hardware requirements for real-time simulation are suggested. Parallel processing algorithms are developed with a DSP (digital signal processor).
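
The stability claim in this abstract can be illustrated on a scalar test problem. The sketch below is not the paper's vehicle model: it assumes the stiff test equation y' = -lam * y and compares the explicit Euler method with the first-order implicit backward difference formula (backward Euler). The function names and the values of `lam` and `h` are chosen only for illustration.

```python
def explicit_euler(lam, h, steps, y0=1.0):
    """Explicit update: y_next = y + h * f(y), with f(y) = -lam * y."""
    y = y0
    for _ in range(steps):
        y = y + h * (-lam * y)
    return y


def implicit_bdf1(lam, h, steps, y0=1.0):
    """Implicit first-order backward difference update.

    Solves y_next = y + h * (-lam * y_next) for y_next, which for this
    linear test equation has the closed form y / (1 + h * lam).
    """
    y = y0
    for _ in range(steps):
        y = y / (1.0 + h * lam)
    return y


if __name__ == "__main__":
    lam, h, steps = 50.0, 0.1, 20  # h * lam = 5, outside the explicit stability range
    print("explicit:", explicit_euler(lam, h, steps))  # magnitude grows each step
    print("implicit:", implicit_bdf1(lam, h, steps))   # decays toward zero
```

With this step size the explicit iterate is multiplied by (1 - h * lam) = -4 at every step and diverges, while the implicit iterate is divided by 6 and decays, as the true solution does.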
A Parallel Processing of Finding Neighbor Agents in Flocking Behaviors Using GPU (GPU를 이용한 무리 짓기에서 이웃 에이전트 찾기의 병렬 처리)

  • Lee, Jae-Moon
    • Journal of Korea Game Society / v.10 no.5 / pp.95-102 / 2010
  • This paper proposes a parallel algorithm for flocking behaviors using a GPU. To do this, we used CUDA as the parallel processing architecture of the GPU and analyzed its characteristics and constraints. Based on these, the paper improves performance by parallelizing the search for an agent's neighbors, which is the most costly step in flocking behaviors. We implemented the proposed algorithm on a GTX 285 GPU and experimentally compared its performance with the original spatial partitioning method. The comparison showed that the proposed algorithm ran up to 9 times faster than the original method.
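
As a rough illustration of the spatial partitioning baseline mentioned in this abstract, the sketch below finds neighbors with a uniform grid on the CPU. It is not the paper's CUDA implementation; the 2D setting, the choice of cell size equal to the search radius, and all function names are assumptions made for the example.

```python
from collections import defaultdict


def build_grid(positions, cell):
    """Bucket agent indices by the grid cell that contains them."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(positions):
        grid[(int(x // cell), int(y // cell))].append(i)
    return grid


def find_neighbors(positions, radius):
    """Return, for each agent, the sorted indices of agents within radius.

    The cell size equals the radius, so every neighbor of an agent lies in
    the agent's own cell or one of the 8 surrounding cells.
    """
    grid = build_grid(positions, radius)
    r2 = radius * radius
    result = []
    # Each iteration of this loop is independent of the others, which is the
    # kind of per-agent work a GPU version could assign to one thread each.
    for i, (x, y) in enumerate(positions):
        cx, cy = int(x // radius), int(y // radius)
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    px, py = positions[j]
                    if j != i and (px - x) ** 2 + (py - y) ** 2 <= r2:
                        found.append(j)
        result.append(sorted(found))
    return result


if __name__ == "__main__":
    agents = [(0.0, 0.0), (0.5, 0.0), (3.0, 3.0), (0.0, 0.9)]
    print(find_neighbors(agents, 1.0))
```

The grid limits each agent's distance checks to nearby cells instead of all other agents, which is why spatial partitioning is a common starting point before parallelizing the per-agent search.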

A Fast Transmission of Mobile Agents Using Binomial Trees (바이노미얼 트리를 이용한 이동 에이전트의 빠른 전송)

  • Cho, Soo-Hyun;Kim, Young-Hak
    • The KIPS Transactions:PartA / v.9A no.3 / pp.341-350 / 2002
  • As network environments have improved and internet use has increased, mobile agent technologies have become widely used in information retrieval, network management, electronic commerce, and parallel/distributed processing. Recently, many researchers have studied parallel/distributed processing based on mobile agents. SPMD is a parallel processing method that transmits a program to all the computers participating in the parallel environment and has each of them work on different data. Transmitting the program quickly to all the computers is therefore an important factor in reducing total execution time. In this paper, we consider a parallel environment consisting of a mobile agent system and propose a new method that uses binomial trees to transmit mobile agent code quickly to all the computers, so that SPMD parallel processing can be performed efficiently. The proposed method is compared with other methods through experimental evaluation on IBM's Aglets and shows much better performance. The paper also deals with tolerance of faults that can occur while transmitting a mobile agent using binomial trees.
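
The general idea of a binomial-tree broadcast can be sketched as a schedule of send rounds. The code below is a generic textbook construction and not the paper's Aglets-based implementation: hosts are assumed to be numbered 0 to n-1, host 0 is assumed to hold the code initially, and the function name is hypothetical.

```python
def binomial_broadcast_schedule(n):
    """Return a list of rounds; each round is a list of (sender, receiver).

    In every round, each host that already holds the code forwards it to one
    new host, so the number of holders doubles until all n hosts are covered.
    """
    rounds = []
    holders = 1  # hosts 0 .. holders-1 already have the code
    while holders < n:
        sends = [(src, src + holders)
                 for src in range(holders)
                 if src + holders < n]
        rounds.append(sends)
        holders *= 2
    return rounds


if __name__ == "__main__":
    for k, sends in enumerate(binomial_broadcast_schedule(8)):
        print("round", k, sends)
```

Because the number of holders doubles each round, n hosts are reached in about log2(n) rounds, compared with n-1 sequential sends when a single host transmits to every other host in turn.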

Efficient Face Recognition using Low-Dimensional PCA: Hierarchical Image & Parallel Processing

  • Song, Young-Jun;Kim, Young-Gil;Kim, Kwan-Dong;Kim, Nam;Ahn, Jae-Hyeong
    • International Journal of Contents / v.3 no.2 / pp.1-5 / 2007
  • This paper proposes a principal component analysis (PCA) technique that raises the recognition rate for frontal faces in a low dimension by using hierarchical images and a parallel processing structure. Conventional PCA shows a recognition rate of less than 50% in a low dimension (dimensions 1 to 6) when used for facial recognition. In this paper, a face is represented as images at 3 fixed-size levels: the 1st level is a region around the nose, the 2nd level is a region including the eyes, nose, and mouth, and the 3rd level is the whole face. PCA of the 3-level images is handled by a parallel processing structure, and the resulting similarities are combined to achieve a high recognition rate in a low dimension. The proposed method underwent an experimental feasibility study with the ORL face database to evaluate its face recognition performance. The experiments were carried out with PCA and with the proposed method at each level. The proposed method showed a recognition rate of over 50% from dimensions 1 to 6.

CDN Scalability Improvement using a Moderate Peer-assisted Method

  • Shi, Peichang;Wang, Huaimin;Yin, Hao;Ding, Bo;Wang, Tianzuo;Wang, Miao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.3 / pp.954-972 / 2012
  • Fluctuating server loads in Content Delivery Networks (CDN) require the CDN to improve its service scalability, especially when the peak load exceeds its service capacity. The peer-assisted scheme is widely used to improve CDN scalability. However, CDN operators do not want to lose profit by overusing it, which may reduce CDN resource utilization. Therefore, it is necessary to improve CDN scalability moderately while keeping CDN resource utilization maximized. However, when and how to use the peer-assisted scheme to achieve such improvement remains a great challenge. In this paper, we propose a new method called Dynamic Moderate Peer-assisted Method (DMPM), which uses time series analysis to predict and decide when and how much server load needs to be offloaded. A novel peer-assisted mechanism is designed based on the prediction, which can maximize the profit of the CDN operators without affecting scalability. Extensive evaluations based on actual CDN load traces have shown the effectiveness of DMPM.

The GPU-based Parallel Processing Algorithm for Fast Inspection of Semiconductor Wafers (반도체 웨이퍼 고속 검사를 위한 GPU 기반 병렬처리 알고리즘)

  • Park, Youngdae;Kim, Joon Seek;Joo, Hyonam
    • Journal of Institute of Control, Robotics and Systems / v.19 no.12 / pp.1072-1080 / 2013
  • Today, many vision inspection techniques are used in industrial production. In the semiconductor industry in particular, the vision inspection system for wafers is very important. Inspection techniques for semiconductor wafer production must also deliver both high precision and fast inspection. To achieve these objectives, parallel processing of the inspection algorithm is essential. In this paper, we propose a GPU (Graphics Processing Unit)-based parallel processing algorithm for the fast inspection of semiconductor wafers. The proposed algorithm is implemented on GPU boards made by NVIDIA. The defect detection performance of the proposed algorithm on the GPU is the same as that of a single-CPU implementation, but its execution is about 210 times faster than on a single CPU.

Design of modified Feistel structure for high-capacity and high speed achievement (대용량 고속화 수행을 위한 변형된 Feistel 구조 설계에 관한 연구)

  • Lee Seon-Keun;Jung Woo-Yeol
    • Journal of the Korea Society of Computer and Information / v.10 no.3 s.35 / pp.183-188 / 2005
  • Parallel processing in block cryptographic algorithms is difficult because the Feistel structure, which is the basic structure of block cryptographic algorithms, is a sequential processing structure. This paper therefore modifies this sequential Feistel structure to make parallel processing possible, and applies the modified structure to design a DES with a parallel Feistel structure. The proposed parallel Feistel structure could greatly improve the performance of block cryptographic algorithms such as DES, which have been bound to a trade-off between data processing speed and data security because their structure prevents the use of pipelining. The modified Feistel structure is therefore expected to offer better security and higher processing speed than at present if the proposed approach is applied to SEED, AES's Rijndael, Twofish, and other algorithms that apply the Feistel structure.
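
The sequential dependency described in this abstract is visible in a conventional Feistel network, sketched below. This is the standard structure only; the paper's modified parallel structure is not described in the abstract and is not reproduced here. The round function is a toy placeholder (not the DES round function), and the 64-bit block size and key values are assumptions for the example.

```python
MASK32 = 0xFFFFFFFF


def round_fn(half, key):
    """Toy round function for illustration only; it offers no real security."""
    return ((half * 0x9E3779B1) ^ key) & MASK32


def feistel(block, keys):
    """Run a Feistel network over a 64-bit block with one key per round.

    Each round consumes the halves produced by the previous round, so the
    rounds cannot be computed independently of one another.
    """
    left, right = (block >> 32) & MASK32, block & MASK32
    for key in keys:
        left, right = right, left ^ round_fn(right, key)
    # Final swap of the halves, so that decryption is the same routine
    # applied with the round keys in reverse order.
    return (right << 32) | left


if __name__ == "__main__":
    keys = [0x1111, 0x2222, 0x3333, 0x4444]
    plain = 0x0123456789ABCDEF
    cipher = feistel(plain, keys)
    print(hex(cipher))
    print(hex(feistel(cipher, keys[::-1])))  # recovers the plaintext
```

Decryption works with any round function, invertible or not, because each round only XORs one half with a value derived from the other half.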
A Genetic Algorithm for Scheduling Sequence-Dependant Jobs on Parallel Identical Machines (병렬의 동일기계에서 처리되는 순서의존적인 작업들의 스케쥴링을 위한 유전알고리즘)

  • Lee, Moon-Kyu;Lee, Seung-Joo
    • Journal of Korean Institute of Industrial Engineers / v.25 no.3 / pp.360-368 / 1999
  • We consider the problem of scheduling n jobs with sequence-dependent processing times on a set of parallel-identical machines. The processing time of each job consists of a pure processing time and a sequence-dependent setup time. The objective is to maximize the total remaining machine available time which can be used for other tasks. For the problem, a hybrid genetic algorithm is proposed. The algorithm combines a genetic algorithm for global search and a heuristic for local optimization to improve the speed of evolution convergence. The genetic operators are developed such that parallel machines can be handled in an efficient and effective way. For local optimization, the adjacent pairwise interchange method is used. The proposed hybrid genetic algorithm is compared with two heuristics, the nearest setup time method and the maximum penalty method. Computational results for a series of randomly generated problems demonstrate that the proposed algorithm outperforms the two heuristics.
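
Two of the building blocks named in this abstract can be sketched in simplified form. The code below assumes a single machine and a cost equal to the total sequence-dependent setup time; it shows a greedy rule in the spirit of a nearest setup time heuristic, followed by adjacent pairwise interchange as local improvement. The paper's exact heuristic definitions, its parallel-machine handling, and its genetic operators are not given in the abstract, so the function names and the setup matrix here are hypothetical.

```python
def total_setup(seq, setup):
    """Sum of setup times between consecutive jobs in the sequence."""
    return sum(setup[a][b] for a, b in zip(seq, seq[1:]))


def nearest_setup_order(setup, start=0):
    """Greedy rule: always schedule next the job with the smallest setup time."""
    seq = [start]
    remaining = set(range(len(setup))) - {start}
    while remaining:
        last = seq[-1]
        nxt = min(remaining, key=lambda j: (setup[last][j], j))
        seq.append(nxt)
        remaining.remove(nxt)
    return seq


def adjacent_pairwise_interchange(seq, setup):
    """Swap neighboring jobs while a swap strictly reduces total setup time."""
    seq = list(seq)
    improved = True
    while improved:
        improved = False
        for i in range(len(seq) - 1):
            cand = seq[:]
            cand[i], cand[i + 1] = cand[i + 1], cand[i]
            if total_setup(cand, setup) < total_setup(seq, setup):
                seq, improved = cand, True
    return seq


if __name__ == "__main__":
    # setup[a][b]: setup time incurred when job b directly follows job a
    setup = [[0, 5, 1, 9], [5, 0, 4, 2], [1, 4, 0, 7], [9, 2, 7, 0]]
    start_seq = nearest_setup_order(setup)
    final_seq = adjacent_pairwise_interchange(start_seq, setup)
    print(final_seq, total_setup(final_seq, setup))
```

In a hybrid scheme of the kind the abstract describes, a local step like this would typically be applied to candidate sequences produced by the genetic search.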