A Study on GPGPU Performance Improvement Technique on GCN Architecture Using OpenCL API

Woo, DongHee;Kim, YoonHo;

doi:10.7838/jsebs.2018.23.1.037

The Journal of Society for e-Business Studies (한국전자거래학회지)

Volume 23 Issue 1 / Pages 37-45 / 2018 / 2288-3908 (pISSN) / 2765-3846 (eISSN)

Society for e-Business Studies (한국전자거래학회)

GCN 아키텍쳐 상에서의 OpenCL을 이용한 GPGPU 성능향상 기법 연구

Woo, DongHee (Graduate School of Computer Science, Sangmyung University) ;
Kim, YoonHo (Department of Computer Science, Sangmyung University)

우동희 ;
김윤호

Received : 2017.12.20
Accepted : 2018.02.20
Published : 2018.02.28

https://doi.org/10.7838/jsebs.2018.23.1.037

Abstract

The systems on which programs run have steadily expanded from conventional single-core and multi-core systems to many-core and heterogeneous systems. However, existing research has focused mostly on parallelizing programs with the CUDA framework and rarely on optimization for AMD's GCN-based GPUs. To address this gap, our study examines optimization techniques for the GCN architecture in a GPGPU environment and demonstrates a performance improvement. Specifically, using the techniques we propose, we reduced the computation time of GPGPU matrix multiplication and convolution algorithms by more than 30%. We also increased kernel throughput by more than 40%.

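The systems on which programs run are expanding beyond conventional single-core and multi-core environments to many-core, coprocessor, and heterogeneous environments. However, existing research has mainly addressed NVIDIA architectures and parallelization with CUDA, while research on improving performance on GCN, AMD's general-purpose GPU architecture, has been limited. With this in mind, this paper studies performance improvement techniques within OpenCL, the GPGPU environment of the GCN architecture, and demonstrates practical performance gains. Specifically, applying the proposed techniques to GPGPU programs that perform matrix multiplication and convolution reduced execution time by more than 30% in the best case and raised kernel utilization by more than 40%.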
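
To put the reported figure in perspective, a reduction in execution time can be converted into a speedup factor. The following is a minimal worked calculation, assuming the reported 30% compares optimized and baseline execution time for the same workload. The symbols T_base, T_opt, r, and S are introduced here for illustration and do not appear in the paper.

```latex
% Assumption: r is the fractional reduction in execution time,
% T_base the baseline time and T_opt the optimized time (illustrative symbols).
\begin{align*}
r &= 1 - \frac{T_{\text{opt}}}{T_{\text{base}}}, \\
S &= \frac{T_{\text{base}}}{T_{\text{opt}}} = \frac{1}{1 - r}, \\
r = 0.30 &\;\Rightarrow\; S = \frac{1}{0.70} \approx 1.43 .
\end{align*}
```

Under that assumption, a reduction of more than 30% corresponds to a speedup of more than roughly 1.43 times over the baseline.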
