Exploration of an Optimal Two-Dimensional Multi-Core System for Singular Value Decomposition

  • Park, Yong-Hun (School of Electrical, Electronics, and Computer Engineering, University of Ulsan) ;
  • Kim, Cheol-Hong (School of Electronics and Computer Engineering, Chonnam National University) ;
  • Kim, Jong-Myon (School of Electrical, Electronics, and Computer Engineering, University of Ulsan)
  • Received : 2014.06.18
  • Reviewed : 2014.08.28
  • Published : 2014.09.30

Abstract

Singular value decomposition (SVD) is widely used for feature extraction, where it identifies distinctive characteristics of data sets in various fields. However, the complex matrix operations of SVD require substantial computation time. This paper accelerates the one-sided block Jacobi algorithm, a representative SVD algorithm, by implementing it efficiently in parallel on a two-dimensional (2D) multi-core system. In addition, the paper explores the optimal multi-core configuration for each matrix size ($128{\times}128$, $64{\times}64$, $32{\times}32$, $16{\times}16$) by implementing the algorithm on 2D multi-core systems with different numbers of processing elements (PEs), all at the same 400 MHz clock frequency in TSMC 28 nm technology, and analyzing their performance and energy. Finally, the paper demonstrates the potential of the 2D multi-core approach by comparing the performance of the selected multi-core configuration with that of a commercial high-performance graphics processing unit (GPU) on the one-sided block Jacobi algorithm for the same matrix sizes.
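The abstract names the one-sided Jacobi method without describing it. As a rough illustration only, the sketch below shows a plain scalar (non-block, sequential) one-sided Jacobi iteration in pure Python: pairs of columns are repeatedly rotated until they are mutually orthogonal, at which point the column norms are the singular values. This is not the authors' block-parallel implementation on a 2D PE array; the function name `one_sided_jacobi_svd`, the tolerance, and the sweep limit are assumptions made for this example.

```python
import math


def one_sided_jacobi_svd(a, tol=1e-12, max_sweeps=60):
    """Return the singular values of `a` (a list of rows) in descending order.

    Hypothetical helper for illustration: scalar one-sided Jacobi,
    not the block variant studied in the paper.
    """
    m, n = len(a), len(a[0])
    # Store the matrix by columns, since rotations act on column pairs.
    cols = [[a[i][j] for i in range(m)] for j in range(n)]

    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(x * x for x in cols[p])                   # ||a_p||^2
                beta = sum(x * x for x in cols[q])                    # ||a_q||^2
                gamma = sum(x * y for x, y in zip(cols[p], cols[q]))  # a_p . a_q

                # Skip pairs that are already orthogonal to working precision.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True

                # Rotation angle that makes the two columns orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t

                new_p = [c * x - s * y for x, y in zip(cols[p], cols[q])]
                new_q = [s * x + c * y for x, y in zip(cols[p], cols[q])]
                cols[p], cols[q] = new_p, new_q

        # A full sweep without any rotation means all columns are orthogonal.
        if not rotated:
            break

    return sorted((math.sqrt(sum(x * x for x in col)) for col in cols), reverse=True)


if __name__ == "__main__":
    print(one_sided_jacobi_svd([[4.0, 0.0], [3.0, -5.0]]))
```

Because rotations on disjoint column pairs are independent of one another, this kind of iteration lends itself to distribution across many processing elements, which is the property a parallel implementation can exploit.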
