The Implementation of Fast Object Recognition Using Parallel Processing on CPU and GPU

Kim, Jun-Chul; Jung, Young-Han; Park, Eun-Soo; Cui, Xue-Nan; Kim, Hak-Il; Huh, Uk-Youl

doi:10.5302/J.ICROS.2009.15.5.488

Journal of Institute of Control, Robotics and Systems (제어로봇시스템학회논문지)

Volume 15 Issue 5 / Pages 488-495 / 2009 / 1976-5622 (pISSN) / 2233-4335 (eISSN)

Institute of Control, Robotics and Systems (제어로봇시스템학회)


CPU와 GPU의 병렬 처리를 이용한 고속 물체 인식 알고리즘 구현

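Kim, Jun-Chul (Department of Information Engineering, Inha University);
Jung, Young-Han (Department of Information Engineering, Inha University);
Park, Eun-Soo (Department of Information Engineering, Inha University);
Cui, Xue-Nan (Department of Information Engineering, Inha University);
Kim, Hak-Il (Department of Information Engineering, Inha University);
Huh, Uk-Youl (Department of Electrical Engineering, Inha University)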

Published: 2009.05.01

https://doi.org/10.5302/J.ICROS.2009.15.5.488


Abstract

This paper presents a fast feature extraction method for autonomous mobile robots that uses parallel processing based on OpenMP, SSE (Streaming SIMD Extensions), and CUDA programming. For the CPU version, the algorithms and code are first optimized and then parallelized. The parallel algorithms are debugged to maintain the same level of performance, and the steps for extracting key points and obtaining the dominant orientation of each key point are parallelized. After extraction, a parallel descriptor is constructed using SSE instructions. The GPU version, also based on SIFT, is implemented with parallel processing using CUDA. The GPU-parallel descriptor achieves a speed-up of up to five times over the CPU-parallel descriptor, but it shows lower performance than the CPU version. The CPU version also achieves a speed-up of four and a half times over the original SIFT while maintaining robust performance.
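The key-point-level parallelism mentioned in the abstract can be illustrated with a minimal C++ sketch using an OpenMP `parallel for`. This is not the authors' implementation: the `Image` and `Keypoint` types, the function names `dominantOrientation` and `assignOrientations`, the window radius, and the 36-bin histogram are all assumptions made for illustration, and the Gaussian weighting and peak interpolation of the original SIFT are omitted. The sketch only shows that each key point's orientation can be computed independently, which is what makes the loop safe to parallelize.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not taken from the paper.
struct Image {
    int width;
    int height;
    std::vector<float> pixels;  // row-major grayscale values

    float at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

struct Keypoint {
    int x;
    int y;
    float orientation;  // dominant orientation in radians, filled in below
};

// Dominant gradient orientation in a (2*radius+1)^2 window around (cx, cy),
// using a magnitude-weighted histogram with `bins` bins over [0, 2*pi).
float dominantOrientation(const Image& img, int cx, int cy, int radius, int bins) {
    assert(bins > 0 && radius >= 0);
    const float twoPi = 6.28318530718f;
    std::vector<float> hist(bins, 0.0f);  // local to this call, so thread-safe
    for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
            const int x = cx + dx;
            const int y = cy + dy;
            // Skip pixels whose central difference would read outside the image.
            if (x < 1 || y < 1 || x >= img.width - 1 || y >= img.height - 1) continue;
            const float gx = img.at(x + 1, y) - img.at(x - 1, y);
            const float gy = img.at(x, y + 1) - img.at(x, y - 1);
            float angle = std::atan2(gy, gx);
            if (angle < 0.0f) angle += twoPi;
            // The modulo guards against angle rounding up to exactly 2*pi.
            const int bin = static_cast<int>(angle / twoPi * bins) % bins;
            hist[bin] += std::sqrt(gx * gx + gy * gy);
        }
    }
    int best = 0;
    for (int b = 1; b < bins; ++b) {
        if (hist[b] > hist[best]) best = b;
    }
    return (best + 0.5f) * twoPi / bins;  // centre of the strongest bin
}

// Each iteration writes only to its own key point, so the loop has no data race.
// Without OpenMP enabled, the pragma is ignored and the loop runs serially.
void assignOrientations(const Image& img, std::vector<Keypoint>& kps,
                        int radius = 4, int bins = 36) {
    const int n = static_cast<int>(kps.size());
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; ++i) {
        kps[i].orientation = dominantOrientation(img, kps[i].x, kps[i].y, radius, bins);
    }
}
```

Calling `assignOrientations(image, keypoints)` fills in the `orientation` field of every key point. With GCC or Clang the loop runs in parallel only when the code is compiled with `-fopenmp`. The `schedule(dynamic)` clause is a design choice for this sketch, on the assumption that key points near the image border do less work than interior ones.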


