• Title/Summary/Keyword: 모델 평가 지표 (model evaluation metrics)


A Study on Evaluating Summarization Performance using Generative AI Model (생성형 AI 모델을 활용한 요약 성능 평가 연구)

  • Gyuri Choi;Seoyoon Park;Yejee Kang;Hansaem Kim
    • Annual Conference on Human and Language Technology / 2023.10a / pp.228-233 / 2023
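  • Manual evaluation by humans has unavoidable limitations: it consumes time and money, annotators disagree with one another, and the quality of the evaluation results varies. This paper examines whether Korean summary evaluation using ChatGPT, which takes context into account and can handle long inputs and outputs, can replace or assist human evaluation. To this end, summaries generated by ChatGPT were evaluated both quantitatively and qualitatively, using BERTScore as the quantitative metric and consistency, relevance, grammaticality, and fluency as the qualitative criteria. The results indicate that ChatGPT4 has the potential to assist manual human evaluation. Because ChatGPT is a model trained primarily on English, an additional evaluation was run on Korean summaries containing errors to verify its error-detection performance. In that test, the performance of both ChatGPT3.5 and ChatGPT4 in evaluating erroneous summaries was unstable, indicating that it is still difficult for them to assist human evaluators.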
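
The abstract above names BERTScore as its quantitative metric. As a rough illustration of the greedy token matching behind that metric, the following sketch computes precision, recall, and F1 from cosine similarities. The two-dimensional token vectors are toy values invented for this example; the real metric uses contextual embeddings from a pretrained language model and optional idf weighting, neither of which is reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_scores(cand_vecs, ref_vecs):
    """Greedy-matching precision, recall, and F1 in the style of BERTScore.

    Each candidate token is matched to its most similar reference token
    (precision), and each reference token to its most similar candidate
    token (recall).
    """
    precision = sum(max(cosine(c, r) for r in ref_vecs) for c in cand_vecs) / len(cand_vecs)
    recall = sum(max(cosine(r, c) for c in cand_vecs) for r in ref_vecs) / len(ref_vecs)
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy token embeddings (hypothetical values, not produced by any real model).
candidate_tokens = [[1.0, 0.0], [0.0, 1.0]]
reference_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

precision, recall, f1 = greedy_match_scores(candidate_tokens, reference_tokens)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

In this toy case every candidate token has an exact match in the reference, so precision is 1, while the third reference token is only partially covered, which lowers recall.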


Development of an Evaluation Model for the Implementation of IMO Instruments (IMO 협약이행에 대한 평가모델 개발)

  • Choi, Choong-Jung;Jung, Jung-Sik;An, Kwang
    • Journal of the Korean Society of Marine Environment & Safety / v.28 no.4 / pp.542-548 / 2022
  • To reduce marine accidents, each contracting Government needs to implement the instruments enacted and amended by the International Maritime Organization (IMO). The III Code requires each government's administration to have a system for improvement through periodic review and evaluation, and to include performance indicators in its evaluation methods. Thus, each IMO Member State needs to develop its own performance indicators. The purpose of this paper is to develop and present an evaluation model using the Balanced Scorecard (BSC) and Key Performance Indicators (KPIs) to quantify and evaluate the level of implementation of the instruments by the administrations. From the perspective of 'III-BSC', which applies the BSC concept to the III Code requirements, the Critical Success Factors (CSFs) that must be secured to achieve the established vision were derived, candidate KPIs for each evaluation area were developed to measure those factors, and an initial study model composed of four levels was designed. The validity of the KPIs was verified and the study model was finalized through a survey designed with the SMART technique. Furthermore, based on the finalized study model, a BSC-based evaluation model for the implementation of IMO instruments was developed by deriving the weights of the elements at each level through AHP analysis. The developed evaluation model is expected to help improve the administrations' level of implementation of the IMO instruments by serving as a tool for quantitatively assessing that level.
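
The abstract above derives element weights through AHP analysis. A minimal sketch of that kind of calculation follows, assuming the common row geometric-mean approximation and Saaty's consistency ratio; the abstract does not say which AHP variant the authors used. The three-element pairwise comparison matrix is hypothetical and is not taken from the paper.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(pairwise, weights):
    """Saaty's consistency ratio; values below about 0.1 are usually accepted."""
    n = len(pairwise)
    if n <= 2:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    # Saaty's random index for matrix sizes 3 to 5.
    random_index = {3: 0.58, 4: 0.90, 5: 1.12}
    weighted = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(weighted[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / random_index[n]


# Hypothetical judgments: element A is 2x as important as B and 4x as important as C.
pairwise = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]

weights = ahp_weights(pairwise)
ratio = consistency_ratio(pairwise, weights)
print([round(w, 4) for w in weights], round(ratio, 4))
```

Because this example matrix is perfectly consistent, the weights come out in the proportion 4:2:1 and the consistency ratio is zero. Real survey judgments would normally produce a small positive ratio.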

ISO/IEC 9126 Quality Model-based Assessment Criteria for Measuring the Quality of Big Data Analysis Platform (빅데이터 분석 플랫폼 평가를 위한 ISO/IEC 9126 품질 모델 기반 평가준거 개발)

  • Lee, Jong Yun
    • Journal of KIISE / v.42 no.4 / pp.459-467 / 2015
  • A remote-sensing big data analysis platform is a system that downloads data from satellites, transforms it into the L3 data type, analyzes it, and produces analysis results. The objective of this paper is to develop assessment criteria based on the ISO/IEC 9126-1 software quality model in order to evaluate the quality of a remote-sensing big data analysis platform. The detailed research contents are as follows. First, the ISO/IEC 9126 standards and previous software evaluation models are reviewed. Second, evaluation areas, evaluation elements, and evaluation items for measuring the quality of a big data analysis platform are defined. Third, the validity of the assessment criteria is verified by statistical experiments covering content validity, reliability, and construct validity, using SPSS 20.0 and Amos 20.0. Construct validity is examined through confirmatory factor analysis and path analysis. Lastly, this research is significant in that it presents the first evaluation criteria for measuring the quality of a big data analysis platform. The assessment criteria are also expected to serve as a basis for evaluation criteria for platforms developed in the future.

Development of the Carrying Capacity Indicators Management Program based on VERP model in Hallasan National Park (VERP 모델을 이용한 한라산국립공원 수용력 지표관리프로그램 개발)

  • Kwon, Heon-Gyo;Shin, Won-Sop;Han, Sang-Yeol
    • Journal of Korean Society of Forest Science / v.99 no.4 / pp.508-516 / 2010
  • Hallasan National Park has seen a dramatic increase in visitors since entrance fees were abolished in January 2007, which has raised concern about appropriate use levels. The overall objective of this study is to develop a carrying capacity indicators management program using indicators and standards based on the Visitor Experience and Resource Protection (VERP) model. A Delphi survey identified eight potential indicators of resource and experiential conditions, including valley water quality, visitor counts, trail impacts, and crowding. Data were also gathered to help provide an empirical foundation for setting standards for these indicator variables. The carrying capacity indicators management program based on the VERP model estimates the sustainability of the national park and scientifically analyzes changes in resources and visitor behavior. It also manages and uses integrated data systematically, and it supports park operations through rational decision-making.

A Study on the Performance Evaluation Model for Successful Introduction and Operations for IPP Program (IPP 제도의 성공적 도입 및 운영을 위한 성과평가 모델에 관한 연구)

  • Lee, Moonsu;Oh, Chang-Heon;Kim, Namho;Ha, Joonhong
    • The Journal of Korean Institute for Practical Engineering Education / v.4 no.1 / pp.86-92 / 2012
  • For the successful operation of the IPP program, a unique Korean co-op education program designed and implemented by Korea Tech, it is crucial to have both a reasonable performance evaluation system and a systematic feedback and improvement system for the program. In this paper, we provide a logic model for the long-term performance evaluation of Korea Tech's IPP program. Since the critical success factors (CSFs) and key performance indicators (KPIs) are very important for implementing the IPP program, they are also presented and discussed in detail. In addition, we discuss and analyze the results of student and industry surveys on the IPP program's success factors.


A Study on Development of Digital Curation Maturity Models and Indicators: Focusing on KISTI (디지털 큐레이션 성숙도 모델 및 지표 개발에 관한 연구: 한국과학기술정보연구원 디지털큐레이션센터를 중심으로)

  • Seonghun, Kim;Suelki, Do;Sangeun, Han;Jayhoon, Kim;Seokjong, Lim;Jinho, Park
    • Journal of the Korean Society for Information Management / v.39 no.4 / pp.269-306 / 2022
  • This study aimed to develop indicators that can measure the digital transformation performance of science and technology information construction and sharing systems by using Digital Curation Maturity Models. Digital transformation requires considering not only simple service improvement but also organizational and business changes. In this study, we aimed to develop a model for measuring the digital transformation of KISTI, Korea's representative science and technology information service organization. KISTI has already carried out BPR work for digital transformation and borrowed the concept of a maturity model. However, BPR offers no method for measuring the result. Therefore, in this paper, we developed an index to measure digital transformation based on the maturity model. Indicator development was carried out in two stages: model development and evaluation. Cases for model construction were drawn from a comprehensive review of existing KISTI cases and various domestic and foreign cases. Before verification, the model comprised the following numbers of indicators by major category: technology (37), data (45), strategy (18), organization (36), and (social) influence (14). After verification using confirmatory factor analysis, the model comprises technology (20 indicators / 17 dropped), data (36 / 9 dropped), strategy (18 / unchanged), organization (30 / 6 dropped), and (social) influence (13 / 1 dropped).

Development of a Model for Estimating Leaf Area and the Number of Flower Using Leaf Length and Width of Farfugium japonicum Kitam. (털머위(Farfugium japonicum Kitam.)의 엽장과 엽폭을 이용한 엽면적 및 개화 수 추정 모델 개발)

  • Dae Ho Jung;Yong Suk Chung;Hyunseung Hwang
    • Journal of Bio-Environment Control / v.32 no.2 / pp.115-121 / 2023
  • The leopard plant is used for ornamental purposes when its leaves have yellow spots, and it is widely used as a bedding plant grown for its flowers. To predict the growth of crops with ornamental value, it is necessary to set several indicators and to express the relationships between them quantitatively. In this study, we determined a model that estimates the leaf area and the number of flowers of Farfugium japonicum Kitam. from leaf length and width by conducting regression analysis on several regression models. As indicators for estimating the leaf area and the number of flowers, the leaf length and width of F. japonicum were measured and applied to 8 regression models. In the regression analysis of the 8 models, the R² values of the linear models were all higher than 0.84 for leaf area and 0.80 for the number of flowers. In validation, the most reliable models for estimating the leaf area and the number of flowers gave R² values of 0.90 and 0.82, respectively. Models that estimate quality-related indicators from easy-to-measure morphological factors will facilitate the evaluation of ornamental plants.
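
The abstract above fits regression models to leaf length and width and compares them by R². The sketch below shows one such fit, assuming a linear model with the product of length and width as the single predictor; the abstract does not list the eight model forms, so this form is an assumption. The measurements are synthetic values made up for illustration and are not data from the paper.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line on (x, y)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic leaf measurements in cm and cm^2 (hypothetical, for illustration only).
leaf_length = [10.0, 12.0, 15.0, 18.0, 20.0]
leaf_width = [12.0, 15.0, 17.0, 22.0, 25.0]
leaf_area = [99.0, 143.0, 207.0, 316.0, 402.0]

# Assumed predictor: length multiplied by width.
length_times_width = [l * w for l, w in zip(leaf_length, leaf_width)]

slope, intercept = fit_line(length_times_width, leaf_area)
r2 = r_squared(length_times_width, leaf_area, slope, intercept)
print(f"area = {slope:.3f} * (L x W) + {intercept:.3f}, R^2 = {r2:.4f}")
```

The same two helper functions could be reused for other single-predictor forms (for example, length alone or width alone) to compare R² values across candidate models.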

A study on the Interface Evaluation Guidelines for Integrated Information Retrieval System (통합정보검색시스템의 인터페이스 평가지표에 관한 연구)

  • Lee, Too-Young;Yoon, Dae-Jin
    • Journal of the Korean Society for Information Management / v.20 no.3 / pp.177-197 / 2003
  • The purpose of this study is to suggest the subjects and standards of evaluation for integrated IR interfaces. To this end, we reviewed preceding research on major IR interface models and conducted a survey on interface elements that had been verified by experts. These interface elements are divided into two viewpoints. One is the cognitive viewpoint, which covers page design, content design, site design, output form, usability, and the aesthetic facet. The other is the objective viewpoint, which covers page design, content design, site design, output form, and usability. We found that these evaluation elements have credibility.

Performance Evaluation of Truck Haulage Operations in an Underground Mine using GMG's Time Usage Model and Key Performance Indicators (GMG 시간 사용 모델 및 핵심성과지표를 이용한 지하 광산 트럭 운반 작업 성능 평가)

  • Park, Sebeom;Choi, Yosoon
    • Tunnel and Underground Space / v.32 no.4 / pp.254-271 / 2022
  • The performance of truck haulage operations in an underground mine was evaluated using the time usage model and key performance indicators (KPIs) proposed by the Global Mining Guidelines Group (GMG). An underground mine that mainly produces iron and titanium iron was selected as the study area, and truck haulage data were collected using Bluetooth beacons and tablet PCs. The collected data were analyzed to identify the unit operations, activities, events, and required time of the truck haulage operations, and time categories were classified based on the time usage model. The performance of the haulage operations was evaluated using nine indicators in terms of availability, utilization, and effectiveness. In terms of availability, uptime was 33.9%, physical availability was 95.7%, and mechanical availability was 94.9%. In terms of utilization, use of availability was 83.1%, asset utilization was 28.1%, and operating and effective utilization were 79.6% and 77.7%, respectively. In terms of effectiveness, operating efficiency was high at 97.6%, and production effectiveness was 49%.
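
As a quick arithmetic check on the figures reported above, the sketch below multiplies uptime by use of availability and compares the product with the reported asset utilization. It assumes that uptime is available time over calendar time and that use of availability is operating time over available time, so that their product is operating time over calendar time. These ratio definitions are an assumption made for this example and are not quoted from the abstract or from the GMG guideline.

```python
def implied_asset_utilization(uptime, use_of_availability):
    """Operating share of calendar time under the assumed ratio definitions.

    Assumes uptime = available / calendar and
    use_of_availability = operating / available, so the product is
    operating / calendar.
    """
    return uptime * use_of_availability


# Percentages reported in the abstract, expressed as fractions.
reported_uptime = 0.339
reported_use_of_availability = 0.831
reported_asset_utilization = 0.281

implied = implied_asset_utilization(reported_uptime, reported_use_of_availability)
gap = abs(implied - reported_asset_utilization)
print(f"implied {implied:.4f}, reported {reported_asset_utilization:.3f}, gap {gap:.4f}")
```

Under these assumed definitions the product is about 0.2817, which agrees with the reported 28.1% to within the rounding of the one-decimal inputs.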