Architecture Exploration of Optimal Many-Core Processors for a Vector-based Rasterization Algorithm

  • Received : 2013.07.02
  • Reviewed : 2013.10.01
  • Published : 2014.02.28

Abstract

In this paper, we implement a vector-based rasterization algorithm for 3D graphics on a SIMD (single instruction, multiple data) many-core processor architecture and evaluate its performance. In addition, we evaluate the impact of the data-per-processing-element (DPE) ratio, defined as the amount of data directly mapped to each processing element (PE) within the many-core processor, on performance, energy efficiency, and area efficiency. For the experiments, we use seven PE configurations obtained by varying the DPE ratio (or the number of PEs), all implemented in the same 130 nm CMOS technology with a 500 MHz clock frequency. Experimental results indicate that the optimal PE configurations are those with a DPE ratio between 16,384 and 256 (that is, between 16 and 1,024 PEs), which meet the requirements of mobile devices in terms of performance and efficiency.
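The relationship between the DPE ratio and the number of PEs can be illustrated with a minimal sketch. It assumes that a fixed data set is split evenly across the PEs, and it infers the total data size from the two endpoint pairs quoted in the abstract (16 PEs at a DPE ratio of 16,384, and 1,024 PEs at a DPE ratio of 256, both of which multiply to 262,144). The names `TOTAL_DATA`, `dpe_ratio`, and `PE_COUNTS`, and the intermediate PE counts in the sweep, are hypothetical and are not taken from the paper.

```python
# Assumption: total number of data elements, inferred from the abstract's
# endpoint pairs (16 PEs x 16,384 and 1,024 PEs x 256 both give 262,144).
TOTAL_DATA = 16 * 16_384


def dpe_ratio(num_pes: int, total_data: int = TOTAL_DATA) -> int:
    """Return the number of data elements mapped to each PE.

    Assumes the data is divided evenly among the PEs.
    """
    if num_pes <= 0 or total_data % num_pes != 0:
        raise ValueError("num_pes must be positive and divide total_data evenly")
    return total_data // num_pes


# Hypothetical sweep: only the endpoints (16 and 1,024 PEs) come from the
# abstract; the intermediate values are illustrative powers of four.
PE_COUNTS = [16, 64, 256, 1024]

for n in PE_COUNTS:
    print(f"{n:>5} PEs -> DPE ratio {dpe_ratio(n):>6}")
```

Under these assumptions, quadrupling the number of PEs divides the DPE ratio by four, which is why the abstract can describe the same range either by DPE ratio or by PE count.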
