Technical Trends in On-device Small Language Model Technology Development

  • G. Kim (Edge Computing Application Service Research Section)
  • K. Yoon (Edge Computing Application Service Research Section)
  • R. Kim (Edge Computing Application Service Research Section)
  • J. H. Ryu (Edge Computing Application Service Research Section)
  • S. C. Kim (Edge Computing Application Service Research Section)
  • Published: 2024.08.01

Abstract

This paper introduces technological development trends in on-device small language models (SLMs). Large language models (LLMs) based on the transformer architecture have attracted global attention since the emergence of ChatGPT, providing detailed and sophisticated responses across a wide range of knowledge domains and thereby extending their influence throughout society. While major global technology companies continue to announce new LLMs or enhance the capabilities of existing ones, the development of SLMs, which are lightweight versions of LLMs, is also progressing rapidly. Because SLMs can run as on-device AI on smartphones and edge devices with limited memory and computing resources, they can be applied to a wide range of fields from a commercialization perspective. This paper examines the technical characteristics of SLM development, lightweighting techniques, semiconductor technology trends for on-device AI, and potential applications across various industries.

Acknowledgement

This research was supported by the Ministry of Trade, Industry and Energy (MOTIE) and the Korea Institute of Energy Technology Evaluation and Planning (KETEP) [No. 2021202090053B].
