Evaluation Standard for Performance of Artificial Intelligence Systems: ISO/IEC TR 24029-1

  • Seongsoo Lee (School of Electronic Engineering and Department of Intelligent Semiconductor, Soongsil University)
  • Received : 2023.08.28
  • Accepted : 2023.09.18
  • Published : 2023.09.30

Abstract

This paper describes ISO/IEC TR 24029-1, an international standard for evaluating the performance of artificial intelligence systems. ISO/IEC TR 24029-1 defines performance measures for artificial intelligence systems in two categories: interpolation and classification. Performance measures in the interpolation category indicate how close the values predicted by an artificial intelligence system are to the real values. Performance measures in the classification category indicate how well the classes predicted by the system match the real classes. Using these measures, the performance of an artificial intelligence system can be evaluated, and the performance of different artificial intelligence systems can be compared.
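
The two categories can be illustrated with a minimal Python sketch. The abstract does not name specific metrics, so the ones below (root-mean-square error and maximum absolute error for interpolation; accuracy, precision, and recall for classification) are common illustrative choices assumed here, not necessarily the measures defined in ISO/IEC TR 24029-1. The function names and sample data are hypothetical.

```python
import math
from collections import Counter


def interpolation_measures(actual, predicted):
    """How close predicted values are to real values (assumed metrics)."""
    errors = [p - a for a, p in zip(actual, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    max_abs_error = max(abs(e) for e in errors)
    return {"rmse": rmse, "max_abs_error": max_abs_error}


def classification_measures(actual, predicted, positive):
    """How well predicted classes match real classes, treating one class
    as positive and all others as negative (assumed metrics)."""
    counts = Counter(
        (a == positive, p == positive) for a, p in zip(actual, predicted)
    )
    tp = counts[(True, True)]    # positive, predicted positive
    fn = counts[(True, False)]   # positive, predicted negative
    fp = counts[(False, True)]   # negative, predicted positive
    tn = counts[(False, False)]  # negative, predicted negative
    accuracy = (tp + tn) / len(actual)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical sample data, for illustration only.
reg = interpolation_measures([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
cls = classification_measures(
    ["cat", "cat", "dog", "dog"], ["cat", "dog", "dog", "dog"], positive="cat"
)
print(reg)
print(cls)
```

Computing the same measures on the same test data for two different systems gives one possible basis for the comparison the abstract describes.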

Acknowledgement

This work was supported by the R&D Program of the Ministry of Trade, Industry, and Energy (MOTIE) and the Korea Evaluation Institute of Industrial Technology (KEIT) (20023805, RS-2022-00155731, RS-2023-00232192).
