Acknowledgement
This work was supported by the ICT R&D program of MSIT/IITP [2018-0-00195, Artificial Intelligence Processor Research Laboratory].
References
- A. Radford et al., "Improving language understanding by generative pre-training," OpenAI Blog, 2018.
- J. Devlin et al., "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv preprint, CoRR, 2018, arXiv:1810.04805.
- A. Radford et al., "Language models are unsupervised multitask learners," OpenAI Blog, 2019.
- C. Raffel et al., "Exploring the limits of transfer learning with a unified text-to-text transformer," arXiv preprint, CoRR, 2019, arXiv:1910.10683.
- T.B. Brown et al., "Language models are few-shot learners," arXiv preprint, CoRR, 2020, arXiv:2005.14165.
- W. Fedus, B. Zoph, and N. Shazeer, "Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity," arXiv preprint, CoRR, 2021, arXiv:2101.03961.
- A. Vaswani et al., "Attention is all you need," in Proc. Conf. Neural Inf. Process. Syst., (Long Beach, CA, USA), Dec. 2017, pp. 5998-6008.
- https://paperswithcode.com/sota/image-classification-on-imagenet
- https://images.nvidia.com/aem-dam/en-zz/Solutions/data-center/nvidia-ampere-architecture-whitepaper.pdf
- N. Wang et al., "Training deep neural networks with 8-bit floating point numbers," in Proc. Conf. Neural Inf. Process. Syst., (Montreal, Canada), Dec. 2018, pp. 7686-7695.
- X. Sun et al., "Hybrid 8-bit floating point (HFP8) training and inference for deep neural networks," in Proc. Conf. Neural Inf. Process. Syst., (Vancouver, Canada), Dec. 2019, pp. 4900-4909.
- N.J. Higham, "The accuracy of floating point summation," SIAM J. Sci. Comput., vol. 14, no. 4, 1993, pp. 783-799. https://doi.org/10.1137/0914050
- J. Choi et al., "PACT: Parameterized clipping activation for quantized neural networks," arXiv preprint, CoRR, 2018, arXiv:1805.06085.
- S.K. Esser et al., "Learned step size quantization," in Proc. Int. Conf. Learn. Represent., (Addis Ababa, Ethiopia), Apr. 2020.
- D. Zhang et al., "LQ-Nets: Learned quantization for highly accurate and compact deep neural networks," in Proc. Eur. Conf. Comput. Vis. (ECCV), (Munich, Germany), Sept. 2018, pp. 365-382.
- X. Sun et al., "Ultra-low precision 4-bit training of deep neural networks," in Proc. Conf. Neural Inf. Process. Syst., (Vancouver, Canada), Dec. 2020.
- A. Agrawal et al., "A 7nm 4-core AI chip with 25.6 TFLOPS hybrid FP8 training, 102.4 TOPS INT4 inference and workload-aware throttling," in Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC), (San Francisco, CA, USA), Feb. 2021, pp. 144-146.
- S. Venkataramani et al., "RaPiD: AI accelerator for ultra-low precision training and inference," in Proc. ACM/IEEE Annu. Int. Symp. Comput. Archit. (ISCA), (Valencia, Spain), June 2021, pp. 153-166.
- J. Park, S. Lee, and D. Jeon, "A 40nm 4.81 TFLOPS/W 8b floating-point training processor for non-sparse neural networks using shared exponent bias and 24-way fused multiply-add tree," in Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC), (San Francisco, CA, USA), Feb. 2021, pp. 1-3.
- J. Lee et al., "LNPU: A 25.3 TFLOPS/W sparse deep-neural-network learning processor with fine-grained mixed precision of FP8-FP16," in Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC), (San Francisco, CA, USA), Feb. 2019, pp. 142-144.
- N. Shah et al., "PIU: A 248 GOPS/W stream-based processor for irregular probabilistic inference networks using precision-scalable posit arithmetic in 28nm," in Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC), (San Francisco, CA, USA), Feb. 2021, pp. 150-152.