Design and Implementation of an SCI-Based Network Cache Coherent NUMA System for High-Performance PC Clustering

  • Published : 2004.12.01

Abstract

Minimizing network access time is critical when building a high-performance PC cluster system. In PC cluster systems, network access time can be reduced by maintaining a network cache in each cluster node. This paper presents a Network Cache Coherent NUMA (NCC-NUMA) system that makes a network cache usable by locating the shared memory on the PCI bus, and describes the development of the NCC-NUMA card, the core module of the system. The NCC-NUMA card plugs directly into a PCI slot of each node and contains shared memory, a network cache, a shared memory control module, and a network control module. The network cache is maintained for the shared memory located on the PCI bus of the cluster nodes. Coherency between the network cache and the shared memory is based on the IEEE SCI standard. In SPLASH-2 benchmark experiments, the NCC-NUMA system improved performance by up to 56% compared with an SCI-based cluster without a network cache.

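Minimizing network access time is important when building a high-performance PC cluster system. In an SCI-based PC cluster system, network access time can be reduced by maintaining a network cache in each node. In this paper, network cache support is made possible by placing the shared memory on the PCI bus; on this basis, we propose the Network Cache Coherent NUMA (NCC-NUMA) system and develop its core module, the NCC-NUMA card. The NCC-NUMA card plugs into a PCI slot of each node and contains shared memory, a network cache, a shared memory control module, and a network control module. Coherency between the shared memory and the network cache is maintained according to the IEEE SCI standard. To measure the performance of the NCC-NUMA system, we ran the SPLASH-2 benchmarks and found that it achieves up to 56% better performance than a NUMA-based cluster system that does not use a network cache.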
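The effect of a per-node network cache described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the NCC-NUMA card's hardware logic: the class names `HomeMemory` and `Node` are hypothetical, the sharing list is kept as a plain Python list at the home node (the SCI standard distributes it as a doubly linked list across the caches), writes go straight through to home memory, and no latencies are modeled.

```python
"""Toy model: a network cache per node, kept coherent by an SCI-style sharing list.

All names are hypothetical and the protocol is heavily simplified.
"""


class HomeMemory:
    """Shared memory on the home node, plus one sharing list per address."""

    def __init__(self):
        self.data = {}     # address -> value
        self.sharers = {}  # address -> node ids holding a copy, head first


class Node:
    """A cluster node with a network cache in front of remote shared memory."""

    def __init__(self, node_id, home):
        self.node_id = node_id
        self.home = home
        self.cache = {}        # network cache: address -> value
        self.remote_reads = 0  # reads that had to cross the network

    def read(self, addr):
        if addr in self.cache:
            # Network cache hit: served locally, no network access.
            return self.cache[addr]
        # Miss: fetch from home memory and join the sharing list at its head.
        self.remote_reads += 1
        value = self.home.data.get(addr, 0)
        self.cache[addr] = value
        self.home.sharers.setdefault(addr, []).insert(0, self.node_id)
        return value

    def write(self, addr, value, nodes):
        # Purge every other cached copy, then become the only list entry.
        for nid in self.home.sharers.get(addr, []):
            if nid != self.node_id:
                nodes[nid].cache.pop(addr, None)
        self.home.data[addr] = value
        self.cache[addr] = value
        self.home.sharers[addr] = [self.node_id]
```

In this model, repeated reads of the same remote address by one node count as a single remote read until another node writes that address, which is the kind of saving the network cache is meant to provide.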
