Acknowledgement
This research was supported by the Supercomputer Development Leading Program of the National Research Foundation of Korea (NRF), funded by the Korean government (Ministry of Science and ICT) [Project No. 2021M3H6A1017683].
References
- D.K. Panda et al., "The MVAPICH project: Transforming research into high-performance MPI library for HPC community," J. Comput. Sci., vol. 52, 2021, article no. 101208.
- K.S. Jin, S.M. Lee, and Y.C. Kim, "Adaptive and optimized agent placement scheme for parallel agent-based simulation," ETRI J., vol. 44, no. 2, 2021.
- B. Andjelkovic et al., "Grid-enabled parallel simulation based on parallel equation formulation," ETRI J., vol. 32, no. 4, 2010, pp. 555-565. https://doi.org/10.4218/etrij.10.0109.0197
- M. Gao et al., "Proteome-scale deployment of protein structure prediction workflows on the Summit supercomputer," arXiv preprint, CoRR, 2022, arXiv: 2201.10024.
- A. Acharya et al., "Supercomputer-based ensemble docking drug discovery pipeline with application to COVID-19," J. Chem. Inf. Model., vol. 60, no. 12, 2020, pp. 5832-5852. https://doi.org/10.1021/acs.jcim.0c01010
- M. Tolstykh et al., "SL-AV model: Numerical weather prediction at extra-massively parallel supercomputer," in Russian Supercomputing Days, Springer, Cham, Switzerland, 2018, pp. 379-387.
- V. Khryashchev et al., "Comparison of different convolutional neural network architectures for satellite image segmentation," in Proc. Conf. Open Innov. Assoc. (FRUCT), (Bologna, Italy), Nov. 2018, pp. 172-179.
- M.P. Katz et al., "Preparing nuclear astrophysics for exascale," in Proc. SC20: Int. Conf. High Perform. Comput., Netw., Storage Analysis (Atlanta, GA, USA), Nov. 2020, pp. 1-12.
- Wikipedia, Shared Memory, https://en.wikipedia.org/wiki/Shared_memory
- Wikipedia, Distributed Memory, https://en.wikipedia.org/wiki/Distributed_memory
- Wikipedia, Distributed Shared Memory, https://en.wikipedia.org/wiki/Distributed_shared_memory
- Z. Huang et al., "VODCA: View-oriented, distributed, cluster-based approach to parallel computing," in Proc. IEEE Int. Symp. Cluster Comput. Grid (CCGRID'06), (Singapore), May 2006.
- Argonne, MPICH: A High-Performance, Portable Implementation of MPI, https://www.anl.gov/mcs/mpich-a-highperformance-portable-implementation-of-mpi
- The Open MPI Project, Open MPI: Open Source High Performance Computing, https://www.open-mpi.org/
- The Ohio State University, MVAPICH: MPI over InfiniBand, Omni-Path, Ethernet/iWARP, and RoCE, http://mvapich.cse.ohio-state.edu/
- IBM, IBM Spectrum MPI, https://www.ibm.com/products/spectrum-mpi
- Microsoft, Microsoft MPI, https://docs.microsoft.com/en-us/message-passing-interface/microsoft-mpi
- Intel, Intel MPI Library, https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpilibrary.html#gs.pcbhj9
- S.H. Cho and Y.H. Kim, "A fast transmission of mobile agents using binomial trees," The KIPS Trans.: Part A, vol. 9A, no. 3, 2002, pp. 341-350. https://doi.org/10.3745/KIPSTA.2002.9A.3.341
- H. Zhao and J. Canny, "Butterfly mixing: Accelerating incremental-update algorithms on clusters," in Proc. SIAM Int. Conf. Data Min. (SDM), SIAM, Philadelphia, PA, USA, 2013, pp. 785-793.
- J.-H. Lee and D.-S. Han, "Improving MPICH-G2 collective communication performance via packet-level parallel data transfer," in Proc. KIISE Fall Conf., vol. 30, no. 2, 2003.
- R. Thakur et al., "Optimization of collective communication operations in MPICH," Int. J. High Perform. Comput. Appl., vol. 19, no. 1, 2005, pp. 49-66. https://doi.org/10.1177/1094342005051521
- M. Chaarawi et al., "A tool for optimizing runtime parameters of Open MPI," in Recent Advances in Parallel Virtual Machine and Message Passing Interface, vol. 5205, Springer, Berlin, Heidelberg, Germany, 2008, pp. 210-217.
- E. Nuriyev and A. Lastovetsky, "Accurate runtime selection of optimal MPI collective algorithms using analytical performance modelling," arXiv preprint, CoRR, 2020, arXiv: 2004.11062.
- J. Pjesivac-Grbovic et al., "MPI collective algorithm selection and quadtree encoding," Parallel Comput., vol. 33, no. 9, 2007, pp. 613-623. https://doi.org/10.1016/j.parco.2007.06.005
- S. Hunold et al., "Predicting MPI collective communication performance using machine learning," in Proc. IEEE Int. Conf. Clust. Comput. (CLUSTER), (Kobe, Japan), Sept. 2020.
- J.M. Hashmi et al., "Design and characterization of shared address space MPI collectives on modern architectures," in Proc. IEEE/ACM Int. Symp. Clust., Cloud Grid Comput. (CCGRID), (Larnaca, Cyprus), May 2019.
- Google, XPMEM: Cross-Process Memory Mapping, 2011, https://code.google.com/archive/p/xpmem/
- Google, Google Code Archive XPMEM, https://code.google.com/archive/p/xpmem/
- S. Chakraborty et al., "SHMEMPMI: Shared memory-based PMI for improved performance and scalability," in Proc. IEEE/ACM Int. Symp. Clust., Cloud Grid Comput. (CCGrid), (Cartagena, Colombia), May 2016.
- P. Balaji et al., "PMI: A scalable parallel process-management interface for extreme-scale systems," in European MPI Users' Group Meeting, Springer, Berlin, Heidelberg, Germany, 2010, pp. 31-41.
- R.L. Graham et al., "Scalable hierarchical aggregation protocol (SHArP): A hardware architecture for efficient data reduction," in Proc. Int. Workshop Commun. Optim. HPC (COMHPC), (Salt Lake City, UT, USA), Nov. 2016.
- NVIDIA, NVIDIA Mellanox Scalable Hierarchical Aggregation and Reduction Protocol (SHARP), https://docs.nvidia.com/networking/display/sharpv214
- J. Stern et al., "Accelerating MPI_Reduce with FPGAs in the network," in Proc. Workshop on Exascale MPI, 2017.
- P. Haghi et al., "FPGAs in the network and novel communicator support accelerate MPI collectives," in Proc. IEEE High Perform. Extreme Comput. Conf. (HPEC), (Waltham, MA, USA), Sept. 2020.
- MVAPICH, OSU Collective MPI Benchmarks, http://mvapich.cse.ohio-state.edu/benchmarks/
- S. Kumar et al., "Optimization of MPI collective operations on the IBM Blue Gene/Q supercomputer," Int. J. High Perform. Comput. Appl., vol. 28, no. 4, 2014, pp. 450-464. https://doi.org/10.1177/1094342014552086
- J. Liu, A.R. Mamidala, and D.K. Panda, "Fast and scalable MPI-level broadcast using InfiniBand's hardware multicast support," in Proc. Int. Parallel Distrib. Process. Symp., (Santa Fe, NM, USA), Apr. 2004.
- T. Hoefler, C. Siebert, and W. Rehm, "A practically constant-time MPI broadcast algorithm for large-scale InfiniBand clusters with multicast," in Proc. Int. Parallel Distrib. Process. Symp., (Long Beach, CA, USA), Mar. 2007, pp. 1-8.
- S. Aga et al., "Compute caches," in Proc. IEEE Int. Symp. High Perform. Comput. Archit. (HPCA), (Austin, TX, USA), Feb. 2017.
- S. Jung et al., "A crossbar array of magnetoresistive memory devices for in-memory computing," Nature, vol. 601, 2022, pp. 211-216. https://doi.org/10.1038/s41586-021-04196-6
- J. Huang et al., "Active-routing: Compute on the way for near-data processing," in Proc. IEEE Int. Symp. High Perform. Comput. Archit. (HPCA), (Washington, DC, USA), Feb. 2019.
- M. Torabzadehkashi et al., "Catalina: In-storage processing acceleration for scalable big data analytics," in Proc. Euromicro Int. Conf. Parallel, Distrib. Netw.-Based Process. (PDP), (Pavia, Italy), Feb. 2019.
- GitHub, Faiss, https://github.com/facebookresearch/faiss
- Texmex, Datasets for approximate nearest neighbor search, http://corpus-texmex.irisa.fr
- S.W. Jun et al., "BlueDBM: An appliance for big data analytics," in Proc. ACM/IEEE Annu. Int. Symp. Comput. Archit. (ISCA), (Portland, OR, USA), June 2015.
- B. Gu et al., "Biscuit: A framework for near-data processing of big data workloads," ACM SIGARCH Comput. Archit. News, vol. 44, no. 3, 2016, pp. 153-165. https://doi.org/10.1145/3007787.3001154
- S.C. Kim et al., "In-storage processing of database scans and joins," Inf. Sci., vol. 327, 2016, pp. 183-200. https://doi.org/10.1016/j.ins.2015.07.056
- Xilinx, SoCs, MPSoCs & RFSoCs, https://www.xilinx.com/products/silicon-devices/soc.html
- Apache Hadoop, Hadoop MapReduce Tutorial, https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html
- S.Y. Kim et al., "Trends in CCIX interconnect and memory expansion technology," Electron. Telecommun. Trends, vol. 37, no. 1, 2022, pp. 42-52. https://doi.org/10.22648/ETRI.2022.J.370105