• Title/Abstract/Keywords: Parallel computation

592 results found; processing time 0.032 s

Hologram Generation Acceleration Method Using GPGPU

  • 이윤혁;김동욱;서영호
    • 방송공학회논문지 / Vol. 22, No. 6 / pp. 800-807 / 2017
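  • Generating a hologram by computer requires an enormous amount of computation. To speed this up, many acceleration methods based on parallel programming with GPGPU (General-Purpose computing on Graphics Processing Units) have been studied. This paper proposes an acceleration method that reduces the bottleneck arising in hologram-pixel-based parallel processing and exploits common terms. It also introduces an optimization method that uses the Visual Profiler shipped with NVIDIA's CUDA to determine the optimal thread configuration. In the implementation, computation time was reduced by up to 40% compared with previous work.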
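The abstract does not say which common term the authors factor out, so the following is only a minimal sketch of the general idea under stated assumptions: a point-source hologram in the Fresnel approximation, where the part of the phase that depends only on the pixel row is computed once per row instead of once per pixel. The constants `WAVELENGTH` and `PITCH`, the function names, and the point format `(x, y, z, amplitude)` are all hypothetical. On a GPU each pixel would typically map to one thread; plain sequential Python is used here only to show the arithmetic.

```python
import math

WAVELENGTH = 633e-9  # assumed wavelength in metres (not from the paper)
PITCH = 10e-6        # assumed hologram pixel pitch in metres (not from the paper)


def pixel_naive(u, v, points):
    """Intensity of hologram pixel (u, v); every term is recomputed per pixel."""
    x, y = u * PITCH, v * PITCH
    total = 0.0
    for px, py, pz, amp in points:
        phase = math.pi / (WAVELENGTH * pz) * ((x - px) ** 2 + (y - py) ** 2)
        total += amp * math.cos(phase)
    return total


def row_common_term(v, width, points):
    """One row of pixels, with the row-invariant terms hoisted out of the pixel loop."""
    y = v * PITCH
    # Per-point terms that do not depend on the pixel column u.
    common = [
        (px, math.pi / (WAVELENGTH * pz), (y - py) ** 2, amp)
        for px, py, pz, amp in points
    ]
    row = []
    for u in range(width):
        x = u * PITCH
        total = 0.0
        for px, k, dy2, amp in common:
            total += amp * math.cos(k * ((x - px) ** 2 + dy2))
        row.append(total)
    return row
```

Both functions evaluate the same sum; the second simply avoids recomputing the wavenumber factor and the squared row offset for every pixel in a row.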

Efficient Parallel Block-layered Nonbinary Quasi-cyclic Low-density Parity-check Decoding on a GPU

  • Thi, Huyen Pham;Lee, Hanho
    • IEIE Transactions on Smart Processing and Computing / Vol. 6, No. 3 / pp. 210-219 / 2017
  • This paper proposes a modified min-max algorithm (MMMA) for nonbinary quasi-cyclic low-density parity-check (NB-QC-LDPC) codes and an efficient parallel block-layered decoder architecture corresponding to the algorithm on a graphics processing unit (GPU) platform. The algorithm removes multiplications over the Galois field (GF) in the merger step to reduce decoding latency without any performance loss. The decoding implementation on a GPU for NB-QC-LDPC codes achieves improvements in both flexibility and scalability. To perform the decoding on the GPU, data and memory structures suitable for parallel computing are designed. The implementation results for NB-QC-LDPC codes over GF(32) and GF(64) demonstrate that the parallel block-layered decoding on a GPU accelerates the decoding process to provide a faster decoding runtime, and obtains a higher coding gain under a low $10^{-10}$ bit error rate and low $10^{-7}$ frame error rate, compared to existing methods.

Parallel Implementation Strategy for Content Based Video Copy Detection Using a Multi-core Processor

  • Liao, Kaiyang;Zhao, Fan;Zhang, Mingzhu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 8, No. 10 / pp. 3520-3537 / 2014
  • Video copy detection methods have emerged in recent years for a variety of applications. However, the inefficiency of conventional retrieval systems restricts their use. In this paper, we propose a parallel implementation strategy for content-based video copy detection (CBCD) using a multi-core processor. This strategy supports video copy detection effectively, and the processing time tends to decrease linearly as the number of processors increases. Experiments show that our approach speeds up computation while maintaining performance.

Efficient m-step Generalization of Iterative Methods

  • 김선경
    • 한국산업정보학회논문지 / Vol. 11, No. 5 / pp. 163-169 / 2006
  • In order to use parallel computers in specific applications, algorithms need to be developed and mapped onto parallel computer architectures. Main memory access in shared-memory systems and global communication in message-passing systems degrade computation speed. This paper finds that the m-step generalization of the block Lanczos method enhances parallel properties by forming blocks of search direction vectors simultaneously. QR factorization, which slows computation on parallel computers, is not necessary in the m-step block Lanczos method. The m-step method minimizes synchronization points, which in turn minimizes global communication and main memory access compared with the standard methods.


Parallel Procedure and Evaluation of Parallel Performance of Impact Simulation Based on Two-Step Eulerian Scheme

  • 김승조;이민형;백승훈
    • 대한기계학회논문집A / Vol. 30, No. 10 / pp. 1320-1327 / 2006
  • The parallel procedure and performance of two-step Eulerian codes have not been sufficiently reported, even though such codes have been developed and widely used in impact simulation. In this study, a parallel strategy for a two-step Eulerian code is proposed and described in detail. Its performance was evaluated on a self-built Linux cluster. Compared with a commercial code, relatively good performance is achieved. A performance evaluation of each computation stage shows that the remap stage is the most time-consuming part, ahead of FE processing, communication, and time marching.
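The abstract does not give the metrics used, so as a minimal sketch, assuming the usual definitions (speedup = serial time / parallel time, efficiency = speedup / number of processes), parallel performance and per-stage time shares could be computed as follows. The function names and the stage timings in the usage note are hypothetical placeholders, not measurements from the paper.

```python
def speedup(t_serial, t_parallel):
    """Classical speedup: how many times faster the parallel run is."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency: speedup divided by the number of processes."""
    return speedup(t_serial, t_parallel) / n_procs


def stage_shares(stage_times):
    """Fraction of total wall time spent in each named computation stage."""
    total = sum(stage_times.values())
    return {name: t / total for name, t in stage_times.items()}
```

For example, with made-up timings `{"remap": 6.0, "fe": 2.0, "comm": 1.0, "time_marching": 1.0}`, `stage_shares` reports the share of each stage, which is the kind of breakdown needed to identify the dominant stage.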

Three-Dimensional Numerical Computation and Experiment on Periodic Flows under a Background Rotation

  • 서용권;박재현
    • 대한기계학회논문집B / Vol. 27, No. 5 / pp. 628-634 / 2003
  • We present numerical and experimental results for periodic flows inside a rectangular container under a background rotation. The periodic flows are generated by changing the speed of rotation periodically so that time-periodic body forces produce the unsteady flows. In the numerical computation, a parallel-computation technique with MPI is implemented. Flow visualization and PIV measurement are also performed to obtain velocity fields at the free surface. Through a series of numerical and experimental works, we aim to clarify the fundamental reasons, if any, for the discrepancy between the two-dimensional computation and the experimental measurement that was detected in the previous study for the same flow model. Specifically, we check whether the various assumptions required for the validity of the classical Ekman pumping law are satisfied for periodic flows under a background rotation.

Dynamic Model of PEM Fuel Cell Using Real-time Simulation Techniques

  • Jung, Jee-Hoon;Ahmed, Shehab
    • Journal of Power Electronics / Vol. 10, No. 6 / pp. 739-748 / 2010
  • The increased integration of fuel cells with power electronics, critical loads, and control systems has prompted recent interest in accurate electrical terminal models of the polymer electrolyte membrane (PEM) fuel cell. Advances in computing technologies, particularly parallel computation techniques and various real-time simulation tools, have allowed novel apparatus to be prototyped and investigated in a virtual system under a wide range of realistic conditions repeatedly, safely, and economically. This paper builds upon both advances and provides a means of optimized model construction that boosts computation speed for a fuel cell model on a real-time simulator, which can be used in a power hardware-in-the-loop (PHIL) application. A significant improvement in computation time has been achieved. The effectiveness of the proposed model, developed on Opal RT's RT-Lab Matlab/Simulink-based real-time engineering simulator, is verified using experimental results from a Ballard Nexa fuel cell system.

Efficient Parallel Visualization of Large-scale Finite Element Analysis Data in Distributed Parallel Computing Environment

  • 김창식;송유미;김기욱;조진연
    • 한국항공우주학회지 / Vol. 32, No. 10 / pp. 38-45 / 2004
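  • This paper examines the characteristics of parallel rendering techniques and, on that basis, proposes a parallel visualization algorithm that can efficiently visualize large-scale finite element analysis results. The proposed algorithm is designed around a sort-last-sparse approach to suit parallel finite element analysis, which is based on per-subdomain computation, and applies a binary-tree communication pattern to make the network communication involved in image compositing more efficient. Benchmark tests were carried out with in-house software to measure the parallel visualization performance of the proposed algorithm.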
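A binary-tree communication pattern for image compositing can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation: each process is assumed to hold a sparse partial image stored as a dictionary from pixel to `(depth, color)`, fragments are merged by keeping the nearest depth, and the function names are hypothetical. The schedule needs about log2(P) rounds for P processes.

```python
def binary_tree_schedule(n_procs):
    """Return a list of rounds; each round is a list of (sender, receiver) ranks."""
    rounds = []
    step = 1
    while step < n_procs:
        pairs = []
        for receiver in range(0, n_procs, 2 * step):
            sender = receiver + step
            if sender < n_procs:
                pairs.append((sender, receiver))
        rounds.append(pairs)
        step *= 2
    return rounds


def composite(images):
    """Merge sparse per-process images along the tree; the result ends up on rank 0.

    Each image maps pixel -> (depth, color); the smaller depth wins (assumed rule).
    """
    bufs = [dict(img) for img in images]
    for pairs in binary_tree_schedule(len(bufs)):
        for sender, receiver in pairs:
            for pixel, frag in bufs[sender].items():
                kept = bufs[receiver].get(pixel)
                if kept is None or frag[0] < kept[0]:
                    bufs[receiver][pixel] = frag
    return bufs[0]
```

In a real distributed setting each `(sender, receiver)` pair would correspond to one message per round, and pairs within a round are independent, so they can proceed concurrently.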

Numerical Solution of Equilibrium Equations

  • Jang, Ho-Jong
    • 대한수학회논문집 / Vol. 15, No. 1 / pp. 133-142 / 2000
  • We consider some numerical solution methods for the equilibrium equations $Af + E^{T}\lambda = r$, $Ef = s$. Algebraic problems of this form arise in many applications such as structural optimization, fluid flow, and circuits. An important approach to solving such problems, called the force method, involves a dimension-reducing nullspace computation for $E$. The purpose of this paper is to investigate the substructuring method for the solution step of the force method in the context of incompressible fluid flow. We also suggest some iterative methods based on the substructuring scheme.
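The dimension reduction behind the force method can be sketched as follows. The symbols $f_p$ (any particular solution of the constraint), $Z$ (a matrix whose columns span the nullspace of $E$), and $u$ (the reduced unknown) are notation assumed here for illustration; they do not appear in the abstract.

```latex
\begin{aligned}
& E f_p = s, \qquad E Z = 0, \qquad f = f_p + Z u
  && \text{(every solution of } Ef = s \text{ has this form)} \\
& Z^{T} A (f_p + Z u) + Z^{T} E^{T} \lambda = Z^{T} r
  && \text{(substitute into } Af + E^{T}\lambda = r \text{, multiply by } Z^{T}) \\
& Z^{T} E^{T} = (E Z)^{T} = 0
  && \text{(the multiplier term drops out)} \\
& (Z^{T} A Z)\, u = Z^{T} (r - A f_p)
  && \text{(reduced system for } u)
\end{aligned}
```

Once $u$ is known, $f = f_p + Zu$ follows directly, and if $E$ has full row rank $\lambda$ can be recovered from $E E^{T} \lambda = E (r - A f)$.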
