Study on Multiple sparse matrix-matrix multiplication hardware accelerator

Tae-Hyoung Kim; Yeong-Pil Cho

doi:10.3745/PKIPS.y2024m05a.47

Proceedings of the Korea Information Processing Society Conference

Korea Information Processing Society 2024 Annual Spring Conference / Pages 47-50 / 2024 / 2005-0011 (pISSN) / 2671-7298 (eISSN)

Korea Information Processing Society

Tae-Hyoung Kim (Dept. of Computer and Software (Automotive-Computer Convergence), Hanyang University);
Yeong-Pil Cho (Dept. of Computer Software, Hanyang University)

Published: 2024.05.23

https://doi.org/10.3745/PKIPS.y2024m05a.47

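Abstract

A sparse matrix is a matrix in which most elements are zero. When sparse matrix-matrix multiplication is performed naively, the zero-valued entries are multiplied as well, which produces unnecessary computation. To address this problem, research on matrix compression algorithms and on reducing the number of partial sums in the multiplication is actively under way. However, existing work focuses mainly on single-matrix operations, so FPGAs (Field Programmable Gate Arrays) and application-specific accelerators cannot fully utilize their resources and are therefore inefficient. This study proposes an architecture that uses all of the FPGA's resources to perform multiple sparse matrix multiplications.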
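As a minimal software sketch of the redundancy described in the abstract (it is not the FPGA architecture proposed in this paper), the following Python code compresses matrices into CSR (Compressed Sparse Row) form and multiplies them row by row, counting scalar multiplications against a dense triple loop. The function names `to_csr`, `spgemm`, and `dense_matmul`, the choice of CSR as the compression format, and the 3x3 example matrices are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Dense = List[List[float]]
CSR = Tuple[List[float], List[int], List[int]]  # (values, col_idx, row_ptr)


def to_csr(m: Dense) -> CSR:
    """Compress a dense matrix into CSR, storing only nonzero entries."""
    values: List[float] = []
    col_idx: List[int] = []
    row_ptr: List[int] = [0]
    for row in m:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end offset of this row in `values`
    return values, col_idx, row_ptr


def dense_matmul(a: Dense, b: Dense) -> Tuple[Dense, int]:
    """Naive dense product; multiplies every pair, including zeros."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(m):
            for t in range(k):
                out[i][j] += a[i][t] * b[t][j]
                mults += 1
    return out, mults


def spgemm(a: CSR, b: CSR, n_cols: int) -> Tuple[Dense, int]:
    """Row-wise product of two CSR matrices that touches only nonzeros.

    Returns the result (expanded to dense for easy comparison) and the
    number of scalar multiplications performed.
    """
    a_val, a_col, a_ptr = a
    b_val, b_col, b_ptr = b
    out: Dense = []
    mults = 0
    for i in range(len(a_ptr) - 1):
        acc: Dict[int, float] = {}  # partial sums for output row i
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_col[p]
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_col[q]
                acc[j] = acc.get(j, 0.0) + a_val[p] * b_val[q]
                mults += 1
        out.append([acc.get(j, 0.0) for j in range(n_cols)])
    return out, mults


if __name__ == "__main__":
    # Hypothetical 3x3 inputs with one nonzero per row.
    a = [[1, 0, 0], [0, 0, 2], [0, 3, 0]]
    b = [[0, 4, 0], [5, 0, 0], [0, 0, 6]]
    _, dense_mults = dense_matmul(a, b)
    _, sparse_mults = spgemm(to_csr(a), to_csr(b), n_cols=3)
    print(dense_mults)   # 27
    print(sparse_mults)  # 3
```

For these inputs the dense loop performs 27 multiplications while the compressed version performs 3, which is the kind of saving that compression-based approaches aim for. Running several independent products of this kind side by side is the software analogue of the multi-matrix setting the abstract targets, although how the proposed hardware schedules them is not described on this page.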

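Acknowledgments

This research was funded by the Ministry of Science and ICT and supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) (No. 2020-0-01840, Analysis of techniques for accessing and protecting internal smartphone data) and the National Research Foundation of Korea (NRF) (No. NRF-2022R1A4A1032361, Development of Processing-in-Memory security technology).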

