Enhanced NOW-Sort on a PC Cluster with a Low-Speed Network

  • Published: 2002.10.01

Abstract

Running a parallel external sort on a cluster computer requires not only fast in-memory computation (partial sorting and merging) but also careful scheduling of disk I/O and of the interprocessor communication that exchanges data among nodes. The reason is that disk I/O and communication, rather than pure computation, account for a large share of the total execution time. In this paper, we aim to maximize sorting throughput (the amount of data sorted per unit time) on a cluster of PCs connected by a low-speed network. We develop a new algorithm that distributes the load evenly among processors, minimizes the latency incurred in communication during the sort, and adjusts the size and number of disk transfers so that disk reads and writes overlap as much as possible with computation and communication. Experimental results show that, on the same PC cluster, the new algorithm reduces execution time by up to about 45% compared with the earlier NOW-sort algorithm [1], and that it also scales better as computing nodes are added to the cluster.

References

  1. B. Ahn and D. Kim, 'External sort on a cluster of PCs.' 2000 Int'l Conf. Parallel and Distributed Processing Techniques and Applications, pp. 1443-1448, Las Vegas, Nevada, USA, June 25-29, 2000
  2. W.A. Martin, 'Sorting.' ACM Computing Surveys, Vol. 3, No. 4, pp. 147-174, 1971 https://doi.org/10.1145/356593.356594
  3. http://research.microsoft.com/barc/SortBenchmark, Sort Benchmark Home Page
  4. Y.C. Kim, M. Jeon, D. Kim, and A. Sohn, 'Communication-efficient bitonic sort on a distributed memory parallel computer.' Proc. Int'l Conference on Parallel and Distributed Systems (ICPADS 2001), pp. 165-170, Kyung-Ju, Korea, June 26-29, 2001
  5. S.-J. Lee, M. Jeon, A. Sohn, and D. Kim, 'Partitioned Parallel Radix Sort.' Journal of Parallel and Distributed Computing, Vol. 62, pp. 656-668, Academic Press, April 2002 https://doi.org/10.1006/jpdc.2001.1808
  6. A. Sohn and Y. Kodama, 'Load balanced parallel radix sort.' Proc. the 1998 International Conference on Supercomputing, pp. 305-312, 1998 https://doi.org/10.1145/277830.277903
  7. K.E. Batcher, 'Sorting networks and their applications.' Proc. AFIPS Conference, pp. 307-314, 1968
  8. T.E. Anderson, D.E. Culler, and D.A. Patterson, 'A Case for NOW (Networks of Workstations).' IEEE Micro, Feb. 1994 https://doi.org/10.1109/40.342018
  9. A.C. Arpaci-Dusseau, R.H. Arpaci-Dusseau, D.E. Culler, J.M. Hellerstein, and D.A. Patterson, 'High-Performance Sorting on Networks of Workstations.' ACM SIGMOD '97, Tucson, Arizona, May 1997 https://doi.org/10.1145/253260.253322
  10. J. Wyllie, 'SPsort: How to sort a terabyte quickly.' Technical Report, IBM Almaden Lab., Feb. 1999, http://www.almaden.ibm.com/cs/gpfsspsort.html
  11. L. Rivera, X. Zhang, and A. Chien, 'HPVM Minutesort.' Sort Benchmark Home Page, http://research.microsoft.com/barc/SortBenchmark/
  12. D. Taniar and J.W. Rahayu, 'Sorting in parallel database systems.' Proc. High Performance Computing in the Asia Pacific Region, 2000: The Fourth Int'l Conf. and Exhibition, Vol. 2, pp. 830-835, 2000 https://doi.org/10.1109/HPC.2000.843555
  13. L.M. Wegner and J.I. Teuhola, 'The external heapsort.' IEEE Trans. Software Engineering, Vol. 15, No. 7, pp. 917-925, July 1989 https://doi.org/10.1109/32.29490
  14. C. Cerin, 'An out-of-core sorting algorithm for clusters with processors at different speed.' Proc. 2002 Parallel and Distributed Processing Symp., April 15-18, Fort Lauderdale, FL, USA https://doi.org/10.1109/IPDPS.2002.1015576
  15. A.C. Arpaci-Dusseau, R.H. Arpaci-Dusseau, D.E. Culler, J.M. Hellerstein, and D.A. Patterson, 'Searching for the sorting record: experiences in tuning NOW-Sort.' Proc. the SIGMETRICS Symposium on Parallel and Distributed Tools, pp. 124-133, 1998
  16. F. Popovici, J. Bent, B. Forney, A.A. Dusseau, R.A. Dusseau, 'Datamation 2001: A Sorting Odyssey.' Sort Benchmark Home Page, http://research.microsoft.com/barc/SortBenchmark/
  17. 김지형, 'Improving the performance of parallel external sort through communication and disk I/O optimization.' Master's thesis, Korea University, Jan. 2002
  18. Anon et al., 'A Measure of Transaction Processing Power.' Datamation, Vol. 31, No. 7, pp. 112-118. Also in Readings in Database Systems, M.J. Stonebraker, ed., Morgan Kaufmann, San Mateo, 1989
  19. Anon et al., 'A Measure of Transaction Processing Power.' Datamation, Vol. 31, No. 7, pp. 112-118. Also in Readings in Database Systems, M.J. Stonebraker, ed., Morgan Kaufmann, San Mateo, 1989
  20. C. Nyberg, T. Barclay, Z. Cvetanovic, J. Gray, D. Lomet, 'AlphaSort: A Cache-Sensitive Parallel External Sort.' ACM SIGMOD Record, Proceedings of the 1994 ACM SIGMOD international conference on Management of data, Volume 23 Issue 2, 1994
  21. LAM/MPI Parallel Computing, http://www.lammpi.org
  22. The Beowulf Project, http://www.beowulf.org