• Title/Summary/Keyword: GPU algorithm

An Analytical Model for Performance Prediction of AES on GPU Architecture (GPU 아키텍처의 AES 암호화 성능 예측 분석 모델)

  • Kim, Kyuwoon;Kim, Hyunwoo;Kim, Huijeong;Huh, Taeyoung;Jung, Sanghyuk;Song, Yong Ho
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.4 / pp.89-96 / 2013
  • The graphics processing unit (GPU) has been developed to process not only graphics data but also general-purpose data. It outperforms the CPU on 3D graphics algorithms and parallel programs. To run a CPU algorithm on a GPU, one must understand the GPU architecture and rewrite the program to account for the GPU's parallel processing capability and its different memory model. For these reasons, a performance prediction model for the algorithm, and the performance it predicts for a GPU system, are required. Such a model can anticipate problems in GPU application development or serve as the basis for a GPU performance evaluation standard. In this paper, we apply our performance model to the AES encryption algorithm and achieve highly accurate performance prediction under a heavy workload.

Designing Hybrid Sorting Algorithm for PC with GPU (GPU가 장착된 PC를 위한 혼합 정렬 알고리즘 설계)

  • Kwon, Oh-Young
    • Journal of Advanced Navigation Technology / v.15 no.2 / pp.281-286 / 2011
  • Data sorting is an important preprocessing step for making use of large data sets, but sorting itself takes a lot of time. In this paper, we present a hybrid sorting algorithm that splits an array so that it is sorted concurrently on the CPU and the GPU. To do this, we determine the most effective split of the array based on hardware performance, and then reduce the total sorting time by sorting on the CPU and the GPU concurrently. Experimental results show that hybrid sorting reduces sorting time by about eight percent compared with sorting on the GPU alone.
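
The split-sort-merge idea in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: `gpu_sort_stub` is a hypothetical stand-in that sorts on the CPU, and `gpu_fraction` is an assumed tuning parameter, whereas the paper derives the split from measured hardware performance.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge


def gpu_sort_stub(chunk):
    # Hypothetical stand-in for a GPU sort; a real version would
    # hand this chunk to a GPU sorting library instead.
    return sorted(chunk)


def hybrid_sort(data, gpu_fraction=0.5):
    """Split `data`, sort both parts concurrently, then merge them."""
    # gpu_fraction is an assumed parameter: the share of the array
    # given to the (stubbed) GPU worker.
    split = int(len(data) * gpu_fraction)
    gpu_part, cpu_part = data[:split], data[split:]
    with ThreadPoolExecutor(max_workers=2) as pool:
        gpu_future = pool.submit(gpu_sort_stub, gpu_part)
        cpu_future = pool.submit(sorted, cpu_part)
        # Merge the two sorted runs into one sorted list.
        return list(merge(gpu_future.result(), cpu_future.result()))
```

For example, `hybrid_sort([5, 3, 1, 4, 2], gpu_fraction=0.4)` gives the first two elements to the stubbed GPU worker and the remaining three to the CPU worker before merging.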

An Efficient k-D tree Traversal Algorithm for Ray Tracing on a GPU (GPU상에서 동작하는 Ray Tracing을 위한 효과적인 k-D tree 탐색 알고리즘)

  • Kang, Yoon-Sig;Park, Woo-Chan;Seo, Choong-Won;Yang, Sung-Bong
    • Journal of KIISE:Computer Systems and Theory / v.35 no.3 / pp.133-140 / 2008
  • This paper proposes an effective k-D tree traversal algorithm for ray tracing on a GPU. The previous GPU-based k-D tree traversal algorithm searches bottom-up from a leaf toward the root after failing to find a primitive intersected by the ray in the leaf node. During the bottom-up search, the algorithm decides from the parent node whether the current node has been visited. This requires revisiting parent nodes that were already visited and repeating bounding box intersection tests. The new k-D tree traversal algorithm reduces duplicate visits to sibling and parent nodes by using an efficient method that decides, during the bottom-up search, whether the sibling node has already been visited. The algorithm also performs bounding box intersection tests only for nodes that have not yet been tested. Our experiments show that the new algorithm is about 30% faster than the previous one.

Discolored Metal Pad Image Classification Based on Gabor Texture Features Using GPU (GPU를 이용한 Gabor Texture 특징점 기반의 금속 패드 변색 분류 알고리즘)

  • Cui, Xue-Nan;Park, Eun-Soo;Kim, Jun-Chul;Kim, Hak-Il
    • Journal of Institute of Control, Robotics and Systems / v.15 no.8 / pp.778-785 / 2009
  • This paper presents a Gabor texture feature extraction method for classifying discolored metal pad images using a GPU (Graphics Processing Unit). The proposed algorithm extracts texture information using Gabor filters and constructs a pattern map from the extracted information. Finally, the golden pad images are classified using feature vectors extracted from the constructed pattern map. To evaluate the performance of the GPU-based Gabor texture feature extraction algorithm, sequential and OpenMP-parallel CPU versions of the algorithm were also implemented. In addition, the proposed algorithm was implemented using global memory and shared memory on the GPU. The experimental results demonstrate that the method using shared memory on the GPU provides the best performance. To evaluate the effectiveness of the extracted Gabor texture features, an experimental validation was conducted on a database of 20 metal pad images, and the experiment showed no misclassification.

A Study on High Speed Face Tracking using the GPGPU-based Depth Information (GPGPU 기반의 깊이 정보를 이용한 고속 얼굴 추적에 대한 연구)

  • Kim, Woo-Youl;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.5 / pp.1119-1128 / 2013
  • In this paper, we propose a high-speed GPU-based algorithm to detect and track the human face. The detection algorithm uses the existing AdaBoost algorithm, but the search area is dramatically reduced by detecting movement and skin-color regions. Unlike the detection process, the tracking algorithm uses only depth information. It uses template matching, searching for the block that matches the template. To track the face quickly, the template matching is computed in parallel on the GPU. Experimental results show that the GPU implementation is up to 49 times faster than the CPU implementation.
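
The block search behind template matching can be illustrated with a small sequential sketch. The abstract does not name the matching cost, so the sum of absolute differences (SAD) used here is an assumption, and the function names are hypothetical. Each candidate position is scored independently of the others, which is what makes this kind of search a natural fit for parallel execution on a GPU.

```python
def sad(depth, template, top, left):
    """Sum of absolute differences between the template and one block."""
    total = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            total += abs(depth[top + r][left + c] - value)
    return total


def match_template(depth, template):
    """Return (score, top, left) of the best-matching block, or None."""
    th, tw = len(template), len(template[0])
    best = None
    # Every (top, left) candidate is independent of the others, so a
    # GPU version could assign one thread (or block) per candidate.
    for top in range(len(depth) - th + 1):
        for left in range(len(depth[0]) - tw + 1):
            score = sad(depth, template, top, left)
            if best is None or score < best[0]:
                best = (score, top, left)
    return best
```

Here `depth` and `template` are plain nested lists of depth values; a lower score means a closer match.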

Efficient Parallel Block-layered Nonbinary Quasi-cyclic Low-density Parity-check Decoding on a GPU

  • Thi, Huyen Pham;Lee, Hanho
    • IEIE Transactions on Smart Processing and Computing / v.6 no.3 / pp.210-219 / 2017
  • This paper proposes a modified min-max algorithm (MMMA) for nonbinary quasi-cyclic low-density parity-check (NB-QC-LDPC) codes and an efficient parallel block-layered decoder architecture corresponding to the algorithm on a graphics processing unit (GPU) platform. The algorithm removes multiplications over the Galois field (GF) in the merger step to reduce decoding latency without any performance loss. The decoding implementation on a GPU for NB-QC-LDPC codes achieves improvements in both flexibility and scalability. To perform the decoding on the GPU, data and memory structures suitable for parallel computing are designed. The implementation results for NB-QC-LDPC codes over GF(32) and GF(64) demonstrate that the parallel block-layered decoding on a GPU accelerates the decoding process to provide a faster decoding runtime, and obtains a higher coding gain under a low $10^{-10}$ bit error rate and low $10^{-7}$ frame error rate, compared to existing methods.

A dynamic analysis algorithm for RC frames using parallel GPU strategies

  • Li, Hongyu;Li, Zuohua;Teng, Jun
    • Computers and Concrete / v.18 no.5 / pp.1019-1039 / 2016
  • In this paper, a parallel algorithm for the nonlinear dynamic analysis of three-dimensional (3D) reinforced concrete (RC) frame structures on a graphics processing unit (GPU) platform is proposed. Time integration is performed using the Newmark method for nonlinear implicit dynamic analysis, and parallelization strategies are presented. Correspondingly, a parallel Preconditioned Conjugate Gradients (PCG) solver on the GPU is introduced for the repeated solution of the equilibrium equations at each time step. The RC frames were simulated using a fiber beam model to capture the nonlinear behavior of concrete and reinforcing bars. The parallel finite element program is developed using Compute Unified Device Architecture (CUDA). The accuracy of the GPU-based parallel program, in both single and double precision, was verified by comparison with ABAQUS. The numerical results demonstrate that the proposed algorithm can take full advantage of the parallel architecture of the GPU and speeds up the computation compared with the CPU.
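
For readers unfamiliar with PCG, a minimal sequential sketch is given below. It is not the paper's CUDA solver: the abstract does not specify the preconditioner, so the Jacobi (diagonal) preconditioner, the dense list-of-lists matrix, and the helper names are all assumptions made for illustration. The matrix is assumed to be symmetric positive definite.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Uses a Jacobi preconditioner (an assumption for this sketch).
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                  # residual b - A x for x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    z = [inv_diag[i] * r[i] for i in range(n)]   # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:               # converged
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x
```

The dot products and the matrix-vector product dominate the cost of each iteration, and these are the operations a GPU implementation would typically parallelize.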

GPU Implementation Techniques of Genetic Algorithm and Comparative Studies (유전 알고리즘의 GPU 구현 기법 및 비교 연구)

  • Hyeon, Byeong-Yong;Seo, Ki-Sung
    • Journal of Institute of Control, Robotics and Systems / v.17 no.4 / pp.328-335 / 2011
  • A GPU (Graphics Processing Unit) has a SIMD (Single Instruction Multiple Data) architecture and provides fast parallel processing. A GA (Genetic Algorithm), which requires a large amount of computation, is implemented on a GPU using CUDA (Compute Unified Device Architecture). Three execution models are presented, corresponding to different combinations of processing modules on the GPU. The GPU models and a CPU implementation are compared experimentally on a couple of benchmark problems, varying the population size and the problem complexity.

Localization and Autonomous Navigation Using GPU-based SIFT and Virtual Force for Mobile Robots (GPU 기반 SIFT 방법과 가상의 힘을 이용한 이동 로봇의 위치 인식 및 자율 주행 제어)

  • Tak, Myung Hwan;Joo, Young Hoon
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.10 / pp.1738-1745 / 2016
  • In this paper, we present a localization and autonomous navigation method for mobile robots using a GPU (Graphics Processing Unit)-based SIFT (Scale-Invariant Feature Transform) algorithm and a virtual force method. First, we propose a localization method that recognizes landmarks using the GPU-based SIFT algorithm and updates the position using an extended Kalman filter. We then propose the A-star algorithm for path planning and the virtual force method for autonomous navigation of the mobile robot. Finally, we demonstrate the effectiveness and applicability of the proposed method through experiments using a mobile robot with OPRoS (Open Platform for Robotic Services).

Systematic Evaluation of Island based Real-Valued Genetic Algorithm with Graphics Processing Unit (Graphics Processing Unit를 이용한 섬기반 Real-Valued Genetic Algorithm의 체계적 평가)

  • Park, Hyun-Soo;Kim, Kyung-Joong
    • Proceedings of the Korean Information Science Society Conference / 2010.06c / pp.328-333 / 2010
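  • The GA (Genetic Algorithm), one of the effective methods for finding optimal solutions, requires a lot of computation time to obtain high-quality solutions, but it can be processed quickly by exploiting the high data-parallel processing capability and strong floating-point performance of the GPU (Graphics Processing Unit). In this paper, we compare and evaluate an island-based RVGA (Real-Valued Genetic Algorithm) accelerated with a GPU against an RVGA that does not use a GPU, and also against a Simple GA that uses a GPU but is not an RVGA. The results show that using the GPU improves speed, and that the RVGA gains a larger speedup than the Simple GA.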
