Title/Summary/Keyword: AI Evaluation

A Study on Evaluation Methods for Interpreting AI Results in Malware Analysis (악성코드 분석에서의 AI 결과해석에 대한 평가방안 연구)

  • Kim, Jin-gang; Hwang, Chan-woong; Lee, Tae-jin
    • Journal of the Korea Institute of Information Security & Cryptology, v.31 no.6, pp.1193-1204, 2021
  • In information security, AI technology is used to detect unknown malware. Although AI models achieve high accuracy, they inevitably produce false positives, so the introduction of XAI is being considered as a way to interpret AI predictions. However, most XAI techniques only provide simple interpretation results, and studies that evaluate or verify those interpretations are lacking. XAI evaluation is essential for determining with confidence which technique interprets more accurately. In this paper, we interpret AI results in the malware domain as the features that contributed most to the prediction, and present a method for evaluating such interpretations. Interpretations are produced by two XAI techniques applied to a tree-based AI model with an accuracy of about 94%, and are evaluated by analyzing descriptive accuracy and sparsity. The experiments confirmed that the interpretations of the AI results were properly computed. We expect XAI evaluation to gradually increase the adoption and utilization of XAI, and to greatly improve the reliability and transparency of AI.
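To make the two metrics concrete, here is a minimal Python sketch under stated assumptions: descriptive accuracy is taken as model accuracy after zeroing each sample's top-k attributed features (a steep drop as k grows suggests a faithful explanation), and sparsity as the mean fraction of near-zero attributions. It assumes scikit-learn and SHAP with random placeholder data; the paper's actual XAI techniques, feature set, and exact metric definitions are not given in the abstract.

```python
# Illustrative only: placeholder data and one assumed XAI technique (SHAP);
# the paper's model, features, and metric definitions may differ.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

def descriptive_accuracy(model, X, y, attributions, k):
    """Accuracy after zeroing each sample's k most-attributed features.
    The faster this falls as k grows, the more faithful the explanation."""
    X_masked = X.copy()
    top_k = np.argsort(-np.abs(attributions), axis=1)[:, :k]
    for i, cols in enumerate(top_k):
        X_masked[i, cols] = 0.0  # crude masking: zero out the top features
    return float((model.predict(X_masked) == y).mean())

def sparsity(attributions, eps=1e-6):
    """Mean fraction of near-zero attributions; higher values mean the
    explanation concentrates on fewer features and is easier to read."""
    return float((np.abs(attributions) < eps).mean())

# Random placeholder data standing in for a malware feature matrix.
X, y = np.random.rand(500, 30), np.random.randint(0, 2, 500)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

sv = shap.TreeExplainer(model).shap_values(X_te)  # (n_samples, n_features)
print("sparsity:", sparsity(sv))
for k in (1, 5, 10):
    print(f"descriptive accuracy @ k={k}:",
          descriptive_accuracy(model, X_te, y_te, sv, k))
```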

A Study on Analysis Criteria for AI Service Impact Assessment (인공지능 서비스 영향성 평가를 위한 분석 기준 연구)

  • Yoo, Soonduck
    • The Journal of the Institute of Internet, Broadcasting and Communication, v.23 no.1, pp.7-13, 2023
  • This study investigated the criteria for evaluating the impact of artificial intelligence services. The study classified AI evaluation targets into two areas: AI service and AI technology, and identified influence, sustainability, efficiency, effectiveness, and appropriateness as potential evaluation criteria. The time aspect of AI service evaluation was divided into pre-evaluation and post-evaluation, with pre-evaluation focused on reviewing items during development and design. The AI service area was classified into public, private, and mixed forms, and the impact assessment was classified as vertical or horizontal. The application of AI services was divided into normative and regulatory aspects, and the purpose of the evaluation could be impact or process evaluation. The subject and field of the AI service could also be used for classification purposes. The results of this study can be used to support the creation of AI service impact policies and countermeasures. However, further research is needed to develop specific indicators based on the criteria identified in this study to evaluate the impact of AI services.

Test and Evaluation Procedures of Defense AI System linked to the ROK Defense Acquisition System (국방획득체계와 연계한 국방 인공지능(AI) 체계 시험평가 방안)

  • Yong-Bok Lee; Min-Woo Choi; Min-ho Lee
    • Journal of Korean Society of Industrial and Systems Engineering, v.46 no.4, pp.229-237, 2023
  • In this research, a new Test and Evaluation (T&E) procedure for defense AI systems is proposed to fill the gap in established methodologies. The proposed concept incorporates a data-based performance evaluation, allowing independent assessment of AI model efficacy, followed by on-site T&E using the actual AI system. The performance evaluation approach adopts the project promotion framework from the defense acquisition system, outlining 10 steps for R&D projects and 9 steps for procurement projects. The procedure was crafted after examining AI system testing standards and guidelines from both domestic and international civilian sectors, and the validity of each step was confirmed using real-world data. This study's findings aim to offer practical guidance in defense T&E, particularly in developing robust T&E procedures for defense AI systems.

A Study on the Effectiveness of AI-based Learner-led Assessment in Elementary Software Education (초등 소프트웨어 교육에서 AI기반의 학습자 주도 평가의 효과성 고찰)

  • Shin, Heenam; Ahn, Sung Hun
    • Journal of Creative Information Culture, v.7 no.3, pp.177-185, 2021
  • In future education, the paradigm is shifting toward learner-led instruction and new assessment methods, and the need for AI-based learning infrastructure and software education is growing. This study therefore examines the effectiveness of combining AI-based evaluation with learner-led assessment in future education. Drawing on the literature on AI education and evaluation and on the seven-step learner-driven software assessment method, we extracted evaluation elements tailored to the elementary school level, in conjunction with the content elements of the 2015 revised elementary Practical Arts curriculum: software understanding, procedural problem solving, and structural evaluation elements. In future work, we will develop a grading system that applies AI-based learner-led evaluation elements to software education, continuously demonstrate its effectiveness, and help schools prepare independently for future education through AI-based learner-led assessment.

Development of Guideline for Heuristic Based Usability Evaluation on SaMD (SaMD에 대한 휴리스틱 기반 사용적합성 평가 가이드라인 개발)

  • Jong Yeop Kim; Junghyun Kim; Zero Kim; Myung Jin Chung
    • Journal of Biomedical Engineering Research, v.44 no.6, pp.428-442, 2023
  • In this study, our goal is to develop heuristic-based usability evaluation guidelines for AI-based Software as a Medical Device (SaMD). We conducted a gap analysis between medical hardware (H/W) and non-medical software (S/W) based on ten heuristic principles. Through severity assessments, we identified 69 evaluation domains and 112 evaluation criteria aligned with the ten heuristic principles. We then categorized each evaluation domain into five types: user safety, data integrity, regulatory compliance, patient therapeutic effectiveness, and user convenience. We propose usability evaluation guidelines that apply the newly derived heuristic-based SaMD evaluation factors to the risk management process. In the discussion, we also propose potential applications of the findings and directions for future research. We emphasize the importance of applying AI technology judiciously in the medical field and of rigorous usability evaluation, and we offer guidelines for various stakeholders, including medical device manufacturers, healthcare professionals, and regulatory authorities.

Designing the Framework of Evaluation on Learner's Cognitive Skill for Artificial Intelligence Education through Computational Thinking (Computational Thinking 기반 인공지능교육을 통한 학습자의 인지적역량 평가 프레임워크 설계)

  • Shin, Seungki
    • Journal of The Korean Association of Information Education, v.24 no.1, pp.59-69, 2020
  • The purpose of this study is to design a framework for evaluating learners' cognitive skills in artificial intelligence (AI) education through computational thinking. To design a rubric and framework for evaluating changes in a learner's intrinsic thinking, the evaluation process was composed of sequential stages: a) agency, which supports cognitive learning through data collection; b) abstraction, which recognizes patterns in the data and categorizes it by decomposing the characteristics of the collected data; and c) modeling, which constructs algorithms from the data refined through abstraction. The framework was designed to cover not only the cognitive domain of learners' perceptions, learning, behaviors, and outcomes, but also their knowledge, competencies, and attitudes toward the problem-solving process and its results, so that changes in intrinsic cognitive learning in AI education can be evaluated. The results are meaningful in that the framework supports the development of individualized evaluation tools suited to the teaching and learning context, and it could serve as a standard across various areas of AI education in the future.

Design and Implementation of ELAS in AI education (Experiential K-12 AI education Learning Assessment System)

  • Moon, Seok-Jae; Lee, Kibbm
    • International Journal of Advanced Culture Technology, v.10 no.2, pp.62-68, 2022
  • Assessment as learning is important for measuring learner competency, and applicable methods are being studied. The role of assessment is to diagnose a learner's current status and to facilitate learning through appropriate feedback, but existing systems are insufficient to enable process-oriented evaluation in small educational institutes. AI education that focuses only on becoming familiar with AI through experience can end up teaching students merely how to use the tools, or letting them play with the tools, rather than achieving the ultimate goals of AI education. A previous study proposed an experiential way of AI education with the PLAY model, but its assessment stage was insufficient. In this paper, we propose ELAS (Experiential K-12 AI education Learning Assessment System) for small educational institutes. To apply assessment factors in the system, AI-factors were selected by researching the goals of current SW education and AI education. The proposed system consists of four modules: an Assessment-factor agent, a Self-assessment agent, a Question-bank agent, and an Assessment-analysis agent. Self-assessment is a powerful mechanism for improving student learning. ELAS extends the experiential AI education model of the previous study, and teachers design assessments through the system. ELAS enables teachers at small institutes to automate analysis and accumulate data aligned with their learning goals, making it possible to adjust the learning difficulty in curriculum design to better fit their purpose.
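The four-module structure named in the abstract can be pictured with a small sketch. Only the module names come from the abstract; every method, signature, and data shape below is an illustrative assumption rather than the paper's actual design.

```python
# Hypothetical sketch of ELAS's four modules; names from the abstract,
# everything else assumed for illustration.
from dataclasses import dataclass, field

@dataclass
class AssessmentFactorAgent:
    """Holds the AI-factors the teacher selects from SW/AI education goals."""
    factors: list[str] = field(default_factory=list)

@dataclass
class QuestionBankAgent:
    """Maps each assessment factor to a pool of questions."""
    bank: dict[str, list[str]] = field(default_factory=dict)

    def questions_for(self, factor: str) -> list[str]:
        return self.bank.get(factor, [])

@dataclass
class SelfAssessmentAgent:
    """Collects learners' self-assessment scores per factor."""
    responses: dict[str, list[int]] = field(default_factory=dict)

    def record(self, factor: str, score: int) -> None:
        self.responses.setdefault(factor, []).append(score)

@dataclass
class AssessmentAnalysisAgent:
    """Aggregates accumulated responses so the teacher can adjust difficulty."""
    def analyze(self, responses: dict[str, list[int]]) -> dict[str, float]:
        return {f: sum(s) / len(s) for f, s in responses.items() if s}

# Teacher designs the assessment; learners respond; analysis follows.
factors = AssessmentFactorAgent(["abstraction", "data understanding"])
bank = QuestionBankAgent({"abstraction": ["Explain the pattern you found."]})
learner = SelfAssessmentAgent()
learner.record("abstraction", 4)
learner.record("abstraction", 2)
print(bank.questions_for("abstraction"))
print(AssessmentAnalysisAgent().analyze(learner.responses))  # {'abstraction': 3.0}
```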

Proposal Self-Assessment System of AI Experience Way Education

  • Lee, Kibbm; Moon, Seok-Jae; Lee, Jong-Yong
    • International Journal of Advanced Culture Technology, v.9 no.4, pp.274-281, 2021
  • In the field of artificial intelligence education, discussions on the direction of AI education are actively underway, and a foundation for future information education needs to be established, including creative, convergence-oriented teaching-learning and evaluation methods. Although experiential AI coding education has been applied, its evaluation stage remains insufficient. In this paper, we propose an evaluation system that can verify the validity of the previously proposed education model and supplement its existing learning module. The core components of the proposed system are the Assessment-Factor, Self-Diagnosis, Item Bank, and Evaluation Result modules, which are designed to provide system access according to the roles of administrators, instructors, and learners. The system enables individualized learning by connecting online and offline learning.

A Study on Major Characteristic Analysis and Quality Evaluation Attributes of Artificial Intelligence Service (인공지능서비스의 특성분석과 품질평가속성에 대한 연구)

  • Baek, Chang Hwa; Lim, Sung Uk; Choe, Jae Ho
    • Journal of Korean Society for Quality Management, v.47 no.4, pp.837-846, 2019
  • Purpose: The purpose of this study is to define the concept, features, and scope of AI services, which differ fundamentally from existing services, by examining previous studies. It also examines the limitations of existing service quality evaluation methods, studies the characteristics of AI services in combination with various cases of new AI services, and derives and proposes quality evaluation attributes for AI services. Methods: The concept and characteristics of artificial intelligence were derived through analysis of previous studies related to AI. The key characteristics and quality evaluation items were derived through the KJ method, by matching the keywords and characteristics drawn from previous studies and various cases. Results: Based on the review of previous studies on the quality of AI services, this study presents the main characteristics and quality evaluation items of new AI services, which differ fundamentally from those covered by existing service quality evaluations. Conclusion: The quality measurement model for AI services is very useful when planning and developing new AI-based products or services, because it can accurately evaluate the requirements of consumers using services in the new AI era. In addition, consumers can be recommended and provided customized services according to their situation or taste.

FlappyBird Competition System: A Competition-Based Assessment System for AI Course (FlappyBird Competition System: 인공지능 수업의 경쟁 기반 평가 시스템의 구현)

  • Sohn, Eisung; Kim, Jaekyung
    • Journal of Korea Multimedia Society, v.24 no.4, pp.593-600, 2021
  • In this paper, we present the implementation of the FlappyBird Competition System (FCS), a competition-based automated assessment system used in an entry-level artificial intelligence (AI) course at a university. The proposed system provides an evaluation method suited to AI courses while retaining the advantages of automated assessment. Students design a neural network structure, train its weights, and tune hyperparameters using the given reinforcement learning code to improve the performance of the game AI. Students enter the competition with the resulting trained model, and the system automatically calculates each final score based on the ranking. A user evaluation conducted after the semester ended showed that the competition-based automated assessment promotes active participation and motivates students to learn AI. Using FCS, the instructor also significantly reduces the time required for assessment.
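The abstract says the system "automatically calculates the final score based on the ranking" but does not give the formula, so the sketch below shows one plausible scheme: submissions are ranked by game score and awarded points that decrease by a fixed step per rank. The function name, point cap, and step size are all assumptions.

```python
# Hypothetical ranking-to-score mapping; FCS's actual formula is not
# given in the abstract.
def final_scores(game_scores: dict[str, float],
                 max_points: float = 100.0,
                 step: float = 5.0) -> dict[str, float]:
    """Rank submissions by game score (higher is better) and award
    points that decrease by a fixed step per rank, floored at zero."""
    ranked = sorted(game_scores, key=game_scores.get, reverse=True)
    return {name: max(max_points - i * step, 0.0)
            for i, name in enumerate(ranked)}

print(final_scores({"alice": 412.0, "bob": 977.5, "carol": 655.0}))
# {'bob': 100.0, 'carol': 95.0, 'alice': 90.0}
```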