• Title/Summary/Keyword: Validation Metrics

Search Result 69

On the Evaluation of In-Vehicle Dynamic Characteristics and On-Road Dynamic Stability(Angle of Rotation) of Rearview Mirror (리어뷰 미러의 실차 동특성 및 주행시 동적 안정성(회전각)에 대한 평가)

  • Jung, Seung-Kyun;Lee, Keun-Soo;Kim, Jeung-Han
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2008.11a / pp.385-386 / 2008
  • Dynamic stability of the vehicle rearview mirror is an important factor in the driver's visual perception (image blur) while driving and is regarded as one of the vehicle-level N&V performance attributes for visible component vibration. Several projects within GM identified a set of objective metrics and validation methods that can replace the existing subjective evaluation of mirror stability. This paper presents objective evaluation results for the dynamic stability (angle of rotation) of vehicle rearview mirrors, using both in-lab FRF measurements and on-road testing.

An Empirical Validation of Complexity Metrics for Java Programs (Java 프로그램에 대한 복잡도 척도들의 실험적 검증)

  • Kim, Jae-Woong;Yu, Cheol-Jung;Jang, Ok-Bae
    • Journal of KIISE:Software and Applications / v.27 no.12 / pp.1141-1154 / 2000
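  • This paper proposes the factors needed to measure the complexity of Java programs. To extract these factors, we analyzed Java programs, computed object-oriented design metric values, and performed statistical analysis. As a result, in addition to the class size factor found in previous studies, method invocation frequency, cohesion, number of child classes, inner classes, and depth of the inheritance hierarchy were identified as major factors. We found that the number of child classes, which had been classified as a class size metric, behaves differently from the other size metrics. We also demonstrated that cohesion decreases as program size and coupling increase. Based on the factor analysis, we performed a regression analysis for each factor in order to apply weights that account for the correlation between human cognitive ability and the factors. We derived regression equations that explain the factors with fewer metrics. Cross-validation on two groups showed that the regression equations are highly reliable. The factors proposed in this paper can therefore serve as new metrics for measuring the complexity of Java programs.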

BSC Perspective of an Exploratory study of Developing CSF/KPI Pool in Korean Construction Industry (균형성과표(BSC)에 의한 건설산업의 주요성공요인과 성과지표개발에 관한 연구)

  • Oh, Ic-Jin;Lee, Jung-Hoon;Lee, Choong-C.
    • Journal of Information Technology Services / v.5 no.1 / pp.35-46 / 2006
  • In recent years, academic scholars and practitioners have given increasing attention to strategic performance measurement systems that include both financial and non-financial performance metrics. The Balanced Scorecard (BSC) is known as an integrated performance management framework that helps an enterprise translate strategic objectives into relevant performance measures within the organization. While the current literature and management articles cover BSC design and implementation, there are few reports of detailed validation of rationalized sets of CSF (Critical Success Factors) and KPI (Key Performance Indicators) for the Korean construction industry. This paper first proposes perceived sets of CSF/KPI drawn from the current literature and validates them with executives and senior managers of a major construction company in Korea. The paper then examines whether the perceived sets of CSF/KPI correlate with firm performance. The results contribute to heightening the competitiveness of Korean construction companies in strategic and performance management.

A Comprehensive Review on Regression Test Case Prioritization Techniques for Web Services

  • Hasnain, Muhammad;Ghani, Imran;Pasha, Muhammad Fermi;Lim, Chern Hong;Jeong, Seung Ryul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.5 / pp.1861-1885 / 2020
  • Test Case Prioritization (TCP) involves the rearrangement of test cases on a prioritized basis for various services. This research work focuses on TCP in web services, as it has been a growing challenge for researchers. Web services continuously evolve and hence require reforming and re-execution of test cases to ensure that they work correctly. This study aims to investigate gaps, issues, and existing solutions related to test case prioritization. It examines research publications within popular selected databases. We meticulously screened research publications and selected 65 papers with which to answer the proposed research questions. The results show that criteria-based test case prioritization techniques predominate, being reported in 41 primary studies. Test case prioritization models, frameworks, and related algorithms are also reported in primary studies. In addition, there are eight issues related to TCP techniques. Among these eight issues, optimization and high effectiveness are the most discussed within primary studies. This systematic review has identified that a significant proportion of primary studies do not use statistical methods in measuring or comparing the effectiveness of TCP techniques. However, a large number of primary studies use 'Average Percentage of Faults Detected' (APFD) or extended APFD metrics to compute the performance of techniques for web services.
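
As an illustration of the APFD metric named in this abstract, the following is a minimal Python sketch using the commonly cited definition APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TF_i the position of the first test that reveals fault i. The function name, the test ids, and the fault matrix are hypothetical and are not taken from the reviewed studies; the sketch assumes every fault is detected by at least one test in the ordering.

```python
def apfd(order, faults_detected_by):
    """Average Percentage of Faults Detected for one test ordering.

    order: list of test ids in execution order.
    faults_detected_by: dict mapping fault id -> set of test ids that reveal it.
    Assumes each fault is revealed by at least one test in `order`.
    """
    n = len(order)
    m = len(faults_detected_by)
    # 1-based position of each test in the prioritized order
    position = {test: i + 1 for i, test in enumerate(order)}
    # TF_i: position of the first test that reveals fault i
    first_positions = [
        min(position[t] for t in tests if t in position)
        for tests in faults_detected_by.values()
    ]
    return 1 - sum(first_positions) / (n * m) + 1 / (2 * n)


# Hypothetical example: four tests, two faults
faults = {"f1": {"t3"}, "f2": {"t1", "t3"}}
baseline = apfd(["t1", "t2", "t3", "t4"], faults)     # 0.625
prioritized = apfd(["t3", "t1", "t2", "t4"], faults)  # 0.875
```

Running the fault-revealing test `t3` first raises the score, which is the behaviour a prioritization technique aims for.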

MODIFIED CONVOLUTIONAL NEURAL NETWORK WITH TRANSFER LEARNING FOR SOLAR FLARE PREDICTION

  • Zheng, Yanfang;Li, Xuebao;Wang, Xinshuo;Zhou, Ta
    • Journal of The Korean Astronomical Society / v.52 no.6 / pp.217-225 / 2019
  • We apply a modified Convolutional Neural Network (CNN) model in conjunction with transfer learning to predict whether an active region (AR) would produce a ≥C-class or ≥M-class flare within the next 24 hours. We collect line-of-sight magnetogram samples of ARs from May 2010 to September 2018 provided by the SHARP, a new data product from the HMI onboard the SDO. Based on these AR samples, we adopt the approach of shuffle-and-split cross-validation (CV) to build a database that includes 10 separate data sets. Each of the 10 data sets is segregated by NOAA AR number into a training and a testing data set. After training, validating, and testing our model, we compare the results with previous studies using predictive performance metrics, with a focus on the true skill statistic (TSS). The main results from this study are summarized as follows. First, to the best of our knowledge, this is the first time that a CNN model with transfer learning has been used in solar physics to make binary class predictions for both ≥C-class and ≥M-class flares, without manually engineered features extracted from the observational data. Second, our model achieves relatively high scores of TSS = 0.640±0.075 and TSS = 0.526±0.052 for ≥M-class prediction and ≥C-class prediction, respectively, which are comparable to those of previous models. Third, our model also obtains quite good scores in five other metrics for both ≥C-class and ≥M-class flare prediction. Our results demonstrate that our modified CNN model with transfer learning is an effective method for flare forecasting with reasonable prediction performance.
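
The true skill statistic emphasized in this abstract can be sketched in a few lines of Python. The sketch uses the standard definition, TSS = TP / (TP + FN) - FP / (FP + TN), i.e. the hit rate minus the false alarm rate; the function name and the confusion-matrix counts below are hypothetical and are not results from the paper.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = recall - false alarm rate, ranging from -1 to 1."""
    recall = tp / (tp + fn)            # fraction of flaring ARs correctly predicted
    false_alarm_rate = fp / (fp + tn)  # fraction of quiet ARs wrongly flagged
    return recall - false_alarm_rate


# Hypothetical counts for one test set (illustration only)
tss = true_skill_statistic(tp=40, fn=10, fp=20, tn=80)  # 0.8 - 0.2
```

Because each term is normalized within its own class, TSS does not depend on the ratio of flaring to non-flaring samples, which is one reason it is often preferred for imbalanced forecasting problems.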

An Experimental Study of Generality of Software Defects Prediction Models based on Object Oriented Metrics (객체지향 메트릭 기반인 결함 예측 모형의 범용성에 관한 실험적 연구)

  • Kim, Tae-Yeon;Kim, Yun-Kyu;Chae, Heung-Seok
    • The KIPS Transactions:PartD / v.16D no.3 / pp.407-416 / 2009
  • To support efficient management of software verification and validation activities, much research has been conducted on predicting defects in the early phases of development, and a number of defect prediction models have been proposed. However, the generality of these models has not been studied experimentally on other software systems. In other words, most prediction models were applied only to the same system that had been used to build them. We therefore performed an experiment to explore the generality of major prediction models, applying three defect prediction models to three different systems. As a result, we could not find evidence that their defect prediction capability generalizes. Our analysis attributes this to differences in metric distribution between the systems.

A Study on the Bleeding Detection Using Artificial Intelligence in Surgery Video (수술 동영상에서의 인공지능을 사용한 출혈 검출 연구)

  • Si Yeon Jeong;Young Jae Kim;Kwang Gi Kim
    • Journal of Biomedical Engineering Research / v.44 no.3 / pp.211-217 / 2023
  • Recently, many studies have introduced artificial intelligence systems in the surgical process to reduce the incidence and mortality of complications in patients. Bleeding is a major cause of operative mortality and complications. However, few studies have addressed detecting bleeding in surgical videos. To advance the development of deep learning models for detecting intraoperative hemorrhage, three models were trained and compared: YOLOv5, RetinaNet50, and RetinaNet101. We collected 1,016 bleeding images extracted from five surgical videos. The ground truths were labeled based on agreement between two specialists. To train and evaluate the models, we divided the dataset into training data, validation data, and test data. For training, 812 images (80%) were selected from the dataset. Another 102 images (10%) were used for validation, and the remaining 102 images (10%) were used as the test data. The three main metrics used to evaluate performance are precision, recall, and false positives per image (FPPI). Based on the evaluation metrics, RetinaNet101 achieved the best detection results of the three models (precision of 0.99±0.01, recall of 0.93±0.02, and FPPI of 0.01±0.01). Information on the bleeding detected in surgical videos can be quickly transmitted to the operating room, improving patient outcomes.
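
The three detection metrics listed in this abstract can be illustrated with a short Python sketch. It assumes the usual definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and FPPI = FP divided by the number of evaluated images. The function name and the counts are hypothetical, chosen for easy arithmetic; they are not the paper's results.

```python
def detection_metrics(tp, fp, fn, num_images):
    """Return (precision, recall, false positives per image)."""
    precision = tp / (tp + fp)  # share of predicted boxes that are real bleeding
    recall = tp / (tp + fn)     # share of real bleeding regions that were found
    fppi = fp / num_images      # average number of false alarms per image
    return precision, recall, fppi


# Hypothetical counts over 100 evaluated images (illustration only)
precision, recall, fppi = detection_metrics(tp=90, fp=10, fn=30, num_images=100)
# precision = 0.9, recall = 0.75, fppi = 0.1
```

In a detection setting, what counts as a true positive depends on a box-matching rule (for example an overlap threshold), which this sketch leaves outside its scope.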

Assessment of compressive strength of high-performance concrete using soft computing approaches

  • Chukwuemeka Daniel;Jitendra Khatti;Kamaldeep Singh Grover
    • Computers and Concrete / v.33 no.1 / pp.55-75 / 2024
  • The present study introduces an optimum performance soft computing model for predicting the compressive strength of high-performance concrete (HPC) by comparing, for the first time on a common database, models based on conventional (kernel-based, covariance function-based, and tree-based), advanced machine (least square support vector machine-LSSVM and minimax probability machine regressor-MPMR), and deep (artificial neural network-ANN) learning approaches. A compressive strength database containing results for 1030 concrete samples has been compiled from the literature and preprocessed. For training, testing, and validation of the soft computing models, 803, 101, and 101 data points have been selected arbitrarily from the 1005 preprocessed data points. Thirteen performance metrics, including three new metrics (a20-index, index of agreement, and index of scatter), have been implemented for each model. The performance comparison reveals that the SVM (kernel-based), ET (tree-based), MPMR (advanced), and ANN (deep) models have achieved higher performance in predicting the compressive strength of HPC. From the overall analysis of performance, accuracy, Taylor plot, accuracy metric, regression error characteristics curve, Anderson-Darling, Wilcoxon, uncertainty, and reliability, model CS4, based on the ensemble tree, has been recognized as the optimum performance model, with a correlation coefficient of 0.9352, root mean square error of 5.76 MPa, and mean absolute error of 4.1069 MPa. The present study also reveals that multicollinearity affects the prediction accuracy of Gaussian process regression, decision tree, multilinear regression, and adaptive boosting regressor models, a novel finding in compressive strength prediction of HPC. The cosine sensitivity analysis reveals that the prediction of compressive strength of HPC is highly affected by cement content, fine aggregate, coarse aggregate, and water content.
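
Several of the metrics named in this abstract can be sketched in plain Python. The sketch below computes root mean square error, mean absolute error, the Pearson correlation coefficient, and the a20-index. The a20-index is taken here, as an assumption about the paper's usage, to be the fraction of samples whose measured-to-predicted ratio lies between 0.80 and 1.20. The function name and the strength values are hypothetical and are not from the paper's database.

```python
import math


def regression_metrics(actual, predicted):
    """Return RMSE, MAE, Pearson r, and a20-index for paired values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    # Pearson correlation coefficient between measured and predicted values
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)

    # a20-index (assumed definition): share of samples with ratio in [0.8, 1.2]
    within_20 = sum(1 for a, p in zip(actual, predicted) if 0.8 <= a / p <= 1.2)
    a20 = within_20 / n
    return {"rmse": rmse, "mae": mae, "r": r, "a20": a20}


# Hypothetical compressive strengths in MPa (illustration only)
measured = [30.0, 40.0, 50.0, 60.0]
estimated = [33.0, 38.0, 50.0, 80.0]
scores = regression_metrics(measured, estimated)
```

In this toy data the last sample is overpredicted by 20 MPa, so it dominates the RMSE and is the only one that falls outside the a20 band.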

A Study on Establishment of Evaluation Criteria for Mobile Anti-Virus Performance Test (모바일 Anti-Virus 성능 시험을 위한 평가 기준 수립 연구)

  • Jeongho Lee;Kangsik Shin;Youngrak Ryu;Dong-Jae Jung;Ho-Mook Cho
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.5 / pp.1101-1113 / 2024
  • With the recent increase in hacks targeting smartphones, users are becoming increasingly anxious about the security of their devices. Using a mobile anti-virus is one of the best ways to reduce this anxiety. However, users have few ways to learn about the performance and features of mobile anti-virus products. While some certification organizations conduct annual performance evaluations of mobile anti-virus products and publish them, they do not disclose the specifics of their testing methods or detailed results. In addition, previous quality evaluation studies are not suitable for evaluating modern mobile anti-viruses, because their evaluation criteria either do not fit mobile anti-virus products or have not been validated. Therefore, this paper establishes detailed evaluation metrics suitable for modern mobile anti-viruses and applies them to 10 domestic and international mobile anti-virus products to verify the validity of the established metrics.

Detecting Errors in POS-Tagged Corpus on XGBoost and Cross Validation (XGBoost와 교차검증을 이용한 품사부착말뭉치에서의 오류 탐지)

  • Choi, Min-Seok;Kim, Chang-Hyun;Park, Ho-Min;Cheon, Min-Ah;Yoon, Ho;Namgoong, Young;Kim, Jae-Kyun;Kim, Jae-Hoon
    • KIPS Transactions on Software and Data Engineering / v.9 no.7 / pp.221-228 / 2020
  • A Part-of-Speech (POS) tagged corpus is a collection of electronic text in which each word is annotated with its corresponding POS tag, and it is widely used as training data for natural language processing. Training data is generally assumed to be free of errors, but in reality it includes various types of errors, which degrade the performance of systems trained on it. To alleviate this problem, we propose a novel method for detecting errors in an existing POS-tagged corpus using an XGBoost classifier and cross-validation as evaluation techniques. We first train a POS tagger classifier on the POS-tagged corpus, which contains some errors, and then detect errors in the same corpus using cross-validation. However, the classifier cannot detect errors directly because there is no training data for detecting POS tagging errors. We therefore detect errors by comparing the outputs (POS probabilities) of the classifier while adjusting hyperparameters. The hyperparameters are estimated from a small error-tagged corpus, whose text is sampled from the POS-tagged corpus and whose POS errors are marked up by experts. In this paper, we use recall and precision, which are widely used in information retrieval, as evaluation metrics. Because not all detected errors can be checked, we show that the proposed method is valid by comparing the distributions of the sample (the error-tagged corpus) and the population (the POS-tagged corpus). In the near future, we will apply the proposed method to a dependency tree-tagged corpus and a semantic role-tagged corpus.
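
The general idea of this abstract, using cross-validation so that each token is scored by a classifier that never saw it during training, can be sketched as follows. This is not the authors' implementation: a simple per-word tag-frequency model stands in for the XGBoost classifier, and the fold assignment, the `threshold` parameter, and the toy corpus are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def train(tokens):
    """Stand-in classifier (NOT XGBoost): per-word tag frequency counts."""
    counts = defaultdict(Counter)
    for word, tag in tokens:
        counts[word][tag] += 1
    return counts


def tag_probability(model, word, tag):
    """Estimated probability of `tag` for `word`, or None if the word is unseen."""
    seen = model.get(word)
    if not seen:
        return None
    return seen[tag] / sum(seen.values())


def detect_errors(tokens, k=2, threshold=0.3):
    """Flag token indices whose annotated tag looks improbable to a model
    trained on the other folds (hypothetical threshold rule)."""
    suspects = []
    for fold in range(k):
        training = [tok for i, tok in enumerate(tokens) if i % k != fold]
        model = train(training)
        for i, (word, tag) in enumerate(tokens):
            if i % k != fold:
                continue  # only score held-out tokens
            p = tag_probability(model, word, tag)
            if p is not None and p < threshold:
                suspects.append(i)
    return sorted(suspects)


# Toy corpus with one deliberately wrong annotation at the end
corpus = [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")] * 4 + [("the", "VERB")]
flagged = detect_errors(corpus)  # [12]
```

In the abstract, the corresponding decision parameters are estimated from a small expert-annotated error corpus rather than fixed by hand as they are in this sketch.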