Efficient Implementation of Convolutional Neural Network Using CUDA

Ki, Cheol-Min; Cho, Tai-Hoon

doi:10.6109/jkiice.2017.21.6.1143

Journal of the Korea Institute of Information and Communication Engineering (한국정보통신학회논문지)

Volume 21, Issue 6 / Pages 1143-1148 / 2017 / 2234-4772 (pISSN) / 2288-4165 (eISSN)

The Korea Institute of Information and Communication Engineering (한국정보통신학회)



Ki, Cheol-Min (Department of Computer Engineering, Korea University of Technology and Education) ;
Cho, Tai-Hoon (School of Computer Science and Engineering, Korea University of Technology and Education)

기철민 ;
조태훈

Received : 2017.05.26
Accepted : 2017.06.01
Published : 2017.06.30

https://doi.org/10.6109/jkiice.2017.21.6.1143

Abstract

Artificial intelligence and deep learning have recently drawn wide public attention, and these technologies are being applied in many fields. Among the algorithms used in artificial intelligence, the Convolutional Neural Network (CNN) is one of the most prominent. A CNN is a multilayer neural network to which convolution layers have been added. When a CNN is trained on a small amount of data, or when its layer structure is simple, training time is short and speed is not a major concern. When the training data is large and the layer structure is complex, however, training takes considerably longer. For this reason, GPU-based parallel processing is frequently used. In this paper, we implement a Convolutional Neural Network using CUDA and show that it trains faster, and handles large training data more efficiently, than the frameworks and programs used for comparison.


