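Project Information
Research project: Development of Neuromorphic Computing SW Platform Technology for Artificial Intelligence Systems
Managing institution: Institute for Information & Communications Technology Promotion (IITP)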