Trends of Compiler Development for AI Processor

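  • 김진규 (Artificial Intelligence Processor Research Section)
  • 김혜지 (Artificial Intelligence Processor Research Section)
  • 조용철 (Artificial Intelligence Processor Research Section)
  • 김현미 (Artificial Intelligence Processor Research Section)
  • 여준기 (Artificial Intelligence Processor Research Section)
  • 한진호 (Artificial Intelligence Processor Research Section)
  • 권영수 (Intelligent Semiconductor Research Division)
  • Published: 2021.04.01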

Abstract

The rapid growth of deep-learning applications has spurred research and development of artificial intelligence (AI) processors. Achieving maximum processor performance requires a dedicated software framework, such as a compiler and runtime APIs. Various compilers and frameworks exist for AI training and inference. In this study, we present the features and characteristics of AI compilers, training frameworks, and inference engines. In addition, we focus on the internals of compiler frameworks, which are based on either basic linear algebra subprograms (BLAS) or an intermediate representation (IR). For in-depth insight, we present the compiler infrastructure, internal components, and operation flow of ETRI's "AI-Ware." The significant role of the software framework is evidenced by the optimized neural processing unit (NPU) code that the compiler produces after various optimization passes, such as scheduling, architecture-aware optimization, schedule selection, and power optimization. We conclude the study with thoughts on the future of state-of-the-art AI compilers.
