A VIA-based RDMA Mechanism for High Performance PC Cluster Systems

Implementation of a VIA-Based RDMA Mechanism for High-Performance PC Cluster Systems

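  • 정인형 (Samsung Electronics, Wireless Division)
  • 정상화 (Pusan National University, Department of Computer Engineering)
  • 박세진 (Pusan National University, Department of Computer Engineering)
  • Published: 2004.12.01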

Abstract

Traditional communication protocols such as TCP/IP are poorly suited to PC cluster systems because of their high software processing overhead. To eliminate this overhead, industry leaders defined the Virtual Interface Architecture (VIA). VIA provides two data transfer mechanisms: a traditional Send/Receive model and the Remote Direct Memory Access (RDMA) model. RDMA is an extremely efficient way to reduce software overhead because it bypasses the OS and uses the network interface controller (NIC) directly for communication, and it also bypasses the CPU on the remote host. In this paper, we implement a VIA-based RDMA mechanism in hardware. Compared to the traditional Send/Receive model, the RDMA mechanism improves both latency and bandwidth, and it communicates without using remote CPU cycles. Our experimental results show a minimum latency of 12.5 $\mu\textrm{s}$ and a maximum bandwidth of 95.5 MB/s. As a result, our RDMA mechanism gives PC cluster systems a high-performance communication method.

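The Virtual Interface Architecture (VIA) was proposed as an industry standard in an effort to eliminate the high software overhead that conventional communication protocols such as TCP/IP incur on PC clusters. Among the communication methods that VIA provides, Remote Direct Memory Access (RDMA) enables communication without the intervention of the kernel or the remote node, and thereby offers an efficient communication method for PC cluster systems. In this paper, we implement a VIA-based RDMA mechanism in hardware. Compared to the conventional send/receive method, the RDMA mechanism implemented here enables zero-copy communication without kernel intervention, and it can also communicate without using the CPU of the remote node. In our experiments, implementing RDMA on a hardware VIA-based network adapter yielded a minimum latency of 12.5 $\mu\textrm{s}$ and a maximum bandwidth of 95.5 MB/s. Consequently, the VIA-based RDMA mechanism implemented in this paper provides an efficient communication method for PC cluster systems.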
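
The difference between the two VIA transfer models described above can be illustrated with a small toy model. The sketch below is not the hardware implementation reported in this paper, and it does not use any real VIA library: the names `Node`, `post_receive`, `send`, and `rdma_write` are hypothetical, and `cpu_actions` is only a counter standing in for host CPU involvement. It assumes the usual VIA semantics, in which Send/Receive requires the receiver to pre-post a receive descriptor, whereas an RDMA write names the remote buffer on the initiator side and consumes no descriptor on the remote host.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy host (hypothetical): a registered memory region plus a CPU-action counter."""
    name: str
    memory: bytearray = field(default_factory=lambda: bytearray(64))
    posted_receives: list = field(default_factory=list)  # queue of (offset, length)
    cpu_actions: int = 0  # counts descriptor posts made by this host's CPU

    def post_receive(self, offset: int, length: int) -> None:
        # Send/Receive model: the receiving host must post a descriptor in advance.
        self.cpu_actions += 1
        self.posted_receives.append((offset, length))


def _check_bounds(node: Node, offset: int, length: int) -> None:
    if offset < 0 or offset + length > len(node.memory):
        raise ValueError("transfer falls outside the registered memory region")


def send(sender: Node, receiver: Node, data: bytes) -> None:
    """Send/Receive: data lands in the buffer named by the receiver's descriptor."""
    sender.cpu_actions += 1  # sender posts a send descriptor
    if not receiver.posted_receives:
        raise RuntimeError("no receive descriptor posted on " + receiver.name)
    offset, length = receiver.posted_receives.pop(0)
    if len(data) > length:
        raise ValueError("message larger than the posted receive buffer")
    _check_bounds(receiver, offset, len(data))
    receiver.memory[offset:offset + len(data)] = data


def rdma_write(sender: Node, receiver: Node, remote_offset: int, data: bytes) -> None:
    """RDMA write: the initiator names the remote buffer; the remote CPU does nothing."""
    sender.cpu_actions += 1  # initiator posts a single RDMA descriptor
    _check_bounds(receiver, remote_offset, len(data))
    receiver.memory[remote_offset:remote_offset + len(data)] = data


if __name__ == "__main__":
    a, b = Node("A"), Node("B")

    b.post_receive(offset=0, length=16)   # receiver-side work for Send/Receive
    send(a, b, b"hello")
    print("after send:       B cpu_actions =", b.cpu_actions)

    rdma_write(a, b, remote_offset=16, data=b"world")
    print("after rdma_write: B cpu_actions =", b.cpu_actions)
```

In this toy model the receiver's counter increases only on the Send/Receive path, which mirrors the claim that RDMA communicates without using remote CPU cycles. The model says nothing about latency or bandwidth.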
