Analysis of GPU-based Parallel Shifted Sort Algorithm by comparing with General GPU-based Tree Traversal

Kim, Heesu; Park, Taejung

doi:10.9728/dcs.2017.18.6.1151

Journal of Digital Contents Society (디지털콘텐츠학회 논문지)
Volume 18, Issue 6, Pages 1151-1156, 2017
pISSN 1598-2009, eISSN 2287-738X

Digital Contents Society (한국디지털콘텐츠학회)

Kim, Heesu (Department of Digital Media, Duksung Women's University);
Park, Taejung (Department of Digital Media, Duksung Women's University)

Received: 2017.09.01
Reviewed: 2017.10.25
Published: 2017.10.31

https://doi.org/10.9728/dcs.2017.18.6.1151

Abstract

GPU-based tree traversal usually yields a smaller parallel speedup than one would expect. In this paper, we analyze the cause of this lower-than-expected performance and show that it lies in warp divergence: the branch instructions ("if${\ldots}$ else") that appear commonly in tree traversal CUDA code make threads diverge within a warp, the smallest physical unit of thread execution in the GPU's parallel hardware architecture. We also compare a kd-tree CUDA implementation with the parallel shifted sort algorithm, which minimizes warp divergence, to show that shifted sort is structurally better suited to the GPU than general tree traversal. In our analysis, the shifted sort search, which incurred no warp divergence, ran about 16 times faster or more than the GPU-based kd-tree search for $2^{23}$ query points and $2^{23}$ data points in $R^3$ space. The performance gap tends to widen as the number of query points and data points increases.
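The warp divergence effect described in the abstract can be illustrated with a small CPU-side model. This is only a sketch under stated assumptions, not the CUDA code evaluated in the paper: it assumes 32 threads per warp, treats one "if ... else" as costing one pass per distinct path taken inside a warp, and uses a hypothetical split value of 0.5 to stand in for a single kd-tree node. The function names `serialized_passes` and `divergent_warps` are made up for this illustration. Sorting the queries here serves only to show what coherent branching looks like; it is not the shifted sort algorithm itself.

```python
import random

WARP_SIZE = 32  # assumed warp width (threads that execute in lockstep)


def serialized_passes(branch_taken, warp_size=WARP_SIZE):
    """Sum, over all warps, the number of branch paths each warp must execute.

    branch_taken[i] is True if thread i takes the 'if' path, False for 'else'.
    A warp whose threads all agree executes 1 path; a divergent warp executes 2.
    """
    total = 0
    for start in range(0, len(branch_taken), warp_size):
        warp = branch_taken[start:start + warp_size]
        total += len(set(warp))  # 1 if uniform, 2 if the warp diverges
    return total


def divergent_warps(branch_taken, warp_size=WARP_SIZE):
    """Count the warps in which threads disagree on the branch."""
    return sum(
        1
        for start in range(0, len(branch_taken), warp_size)
        if len(set(branch_taken[start:start + warp_size])) > 1
    )


if __name__ == "__main__":
    rng = random.Random(0)
    n_threads = 1024
    split = 0.5  # hypothetical split value at one kd-tree node
    queries = [rng.random() for _ in range(n_threads)]

    # Tree-traversal style branch: each thread goes left or right of the split.
    unsorted_branch = [q < split for q in queries]
    # Same branch after sorting the queries, so neighbours in a warp agree.
    sorted_branch = [q < split for q in sorted(queries)]

    n_warps = n_threads // WARP_SIZE
    print(divergent_warps(unsorted_branch), "of", n_warps, "warps diverge (unsorted)")
    print(divergent_warps(sorted_branch), "of", n_warps, "warps diverge (sorted)")
```

In this model, randomly ordered queries make almost every warp execute both paths, while sorted queries leave at most one divergent warp (the one that straddles the split). Real traversal repeats such a branch at every tree level, so the serialized work accumulates.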

