• Title/Summary/Keyword: Validation Metrics

Automatic Detection and Classification of Rib Fractures on Thoracic CT Using Convolutional Neural Network: Accuracy and Feasibility

  • Qing-Qing Zhou;Jiashuo Wang;Wen Tang;Zhang-Chun Hu;Zi-Yi Xia;Xue-Song Li;Rongguo Zhang;Xindao Yin;Bing Zhang;Hong Zhang
    • Korean Journal of Radiology / v.21 no.7 / pp.869-879 / 2020
  • Objective: To evaluate the performance of a convolutional neural network (CNN) model that can automatically detect and classify rib fractures, and output structured reports from computed tomography (CT) images. Materials and Methods: This study included 1079 patients (median age, 55 years; men, 718) from three hospitals, between January 2011 and January 2019, who were divided into a monocentric training set (n = 876; median age, 55 years; men, 582), five multicenter/multiparameter validation sets (n = 173; median age, 59 years; men, 118) with different slice thicknesses and image pixels, and a normal control set (n = 30; median age, 53 years; men, 18). Three classifications (fresh, healing, and old fracture) combined with fracture location (corresponding CT layers) were detected automatically and delivered in a structured report. Precision, recall, and F1-score were selected as metrics to measure the optimum CNN model. Detection/diagnosis time, precision, and sensitivity were employed to compare the diagnostic efficiency of the structured report and that of experienced radiologists. Results: A total of 25054 annotations (fresh fracture, 10089; healing fracture, 10922; old fracture, 4043) were labelled for training (18584) and validation (6470). The detection efficiency was higher for fresh fractures and healing fractures than for old fractures (F1-scores, 0.849, 0.856, 0.770, respectively, p = 0.023 for each), and the robustness of the model was good in the five multicenter/multiparameter validation sets (all mean F1-scores > 0.8 except validation set 5 [512 x 512 pixels; F1-score = 0.757]). The precision of the five radiologists improved from 80.3% to 91.1%, and the sensitivity increased from 62.4% to 86.3% with artificial intelligence-assisted diagnosis. On average, the diagnosis time of the radiologists was reduced by 73.9 seconds. Conclusion: Our CNN model for automatic rib fracture detection could assist radiologists in improving diagnostic efficiency, reducing diagnosis time and radiologists' workload.
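The precision, recall, and F1-score named in this abstract are related as in the minimal sketch below. The function name `detection_scores` and the counts are hypothetical and are not taken from the study; the abstract does not say how detections were matched to annotations, so the counts are simply assumed to be given.

```python
def detection_scores(tp, fp, fn):
    """Return (precision, recall, f1) from true-positive, false-positive
    and false-negative detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only (not from the paper).
precision, recall, f1 = detection_scores(tp=80, fp=20, fn=20)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Recall here is the same quantity the abstract calls sensitivity when it compares radiologists with and without assistance.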

Criticality benchmarking of ENDF/B-VIII.0 and JEFF-3.3 neutron data libraries with RMC code

  • Zheng, Lei;Huang, Shanfang;Wang, Kan
    • Nuclear Engineering and Technology / v.52 no.9 / pp.1917-1925 / 2020
  • New versions of the ENDF/B and JEFF data libraries have been released during the past two years with significant updates in the neutron reaction sublibrary and the thermal neutron scattering sublibrary. To obtain a more comprehensive picture of the criticality quality of these two latest neutron data libraries, and to provide a reference for selecting evaluated nuclear data libraries for the science and engineering applications of the Reactor Monte Carlo code RMC, criticality benchmarking of the two latest neutron data libraries has been performed. RMC was employed as the computational tool, and its capability to process the continuous-representation ENDF/B-VIII.0 thermal neutron scattering laws was developed. The RMC criticality validation suite consisting of 116 benchmarks was established for the benchmarking work. The latest ACE format data libraries of the neutron reaction and the thermal neutron scattering laws for ENDF/B-VIII.0, ENDF/B-VII.1, and JEFF-3.3 were downloaded from the corresponding official sites. The ENDF/B-VII.0 data library was also employed to provide code-to-code validation for RMC. All the calculations for the four different data libraries were performed using a parallel version of RMC, and all the calculated standard deviations are lower than 30 pcm. Comprehensive analyses including the C/E values with uncertainties, the δk/σ values, and the metrics χ² and ⟨|Δ|⟩ were conducted and presented. The calculated keff eigenvalues based on the four data libraries generally agree well with the benchmark evaluations for most cases. Among the 116 criticality benchmarks, the numbers of calculated keff eigenvalues that agree with the benchmark evaluations within the 3σ interval (with a confidence level of 99.6%) are 107, 109, 112, and 113 for ENDF/B-VII.0, ENDF/B-VII.1, ENDF/B-VIII.0 and JEFF-3.3, respectively. The present results indicate that the ENDF/B-VIII.0 neutron data library has a better performance on average.
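The abstract names the metrics C/E, δk/σ, χ² and ⟨|Δ|⟩ without defining them. The sketch below uses commonly seen conventions, all of which are assumptions here: σ combines the calculated and benchmark uncertainties in quadrature, χ² is the mean of the squared δk/σ values, and ⟨|Δ|⟩ is the mean absolute deviation in pcm. The function name and all numbers are hypothetical.

```python
import math


def benchmark_metrics(calc, expt, sigma_calc, sigma_expt):
    """Summarise calculated vs. benchmark keff values.

    Assumed definitions (not confirmed by the abstract):
      C/E        = calc / expt
      delta_k/s  = (calc - expt) / sqrt(sigma_calc^2 + sigma_expt^2)
      chi2       = mean of (delta_k/s)^2
      mean|D|    = mean of |calc - expt|, reported in pcm (1 pcm = 1e-5)
    """
    c_over_e, dk_over_sigma = [], []
    for c, e, sc, se in zip(calc, expt, sigma_calc, sigma_expt):
        sigma = math.sqrt(sc ** 2 + se ** 2)
        c_over_e.append(c / e)
        dk_over_sigma.append((c - e) / sigma)
    n = len(c_over_e)
    return {
        "c_over_e": c_over_e,
        "dk_over_sigma": dk_over_sigma,
        "chi2": sum(d * d for d in dk_over_sigma) / n,
        "mean_abs_delta_pcm": sum(abs(c - e) for c, e in zip(calc, expt)) / n * 1e5,
        "within_3sigma": sum(1 for d in dk_over_sigma if abs(d) <= 3.0),
    }


# Hypothetical values for three benchmarks (not from the paper).
result = benchmark_metrics(
    calc=[1.0010, 0.9985, 1.0042],
    expt=[1.0000, 1.0000, 1.0000],
    sigma_calc=[0.0003, 0.0003, 0.0003],
    sigma_expt=[0.0010, 0.0012, 0.0011],
)
print(result["within_3sigma"], round(result["mean_abs_delta_pcm"], 1))
```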

Assessment and Validation of New Global Grid-based CHIRPS Satellite Rainfall Products Over Korea (전지구 격자형 CHIRPS 위성 강우자료의 한반도 적용성 분석)

  • Jeon, Min-Gi;Nam, Won-Ho;Mun, Young-Sik;Kim, Han-Joong
    • Journal of The Korean Society of Agricultural Engineers / v.62 no.2 / pp.39-52 / 2020
  • A high-quality, long-term, high-resolution precipitation dataset is essential for climate analyses and studies of the global water cycle. Rainfall data from station observations are inadequate over many parts of the world, especially North Korea, due to non-existent observation networks or limited reporting of gauge observations. As a result, satellite-based rainfall estimates have been used as an alternative or supplement to station observations. The Climate Hazards Group Infrared Precipitation (CHIRP) and CHIRP combined with station observations (CHIRPS) are recently produced satellite-based rainfall products with relatively high spatial and temporal resolutions and global coverage. CHIRPS is a global precipitation product available at daily to seasonal time scales, with a spatial resolution of 0.05° and a period of record from 1981 to near real time. In this study, we analyze the applicability of CHIRPS data to the Korean Peninsula as a way of compensating for the lack of precipitation data for North Korea. We compared the daily precipitation estimates from CHIRPS with 81 rain gauges across Korea using several statistical metrics over the long-term period of 1981-2017. In summary, the CHIRPS product for the Korean Peninsula showed acceptable performance when used for hydrological applications based on monthly rainfall amounts. Overall, this study concludes that CHIRPS can be a valuable complement to gauge precipitation data for estimating precipitation and for climate and hydrological applications, for example drought monitoring, in this region.
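The abstract does not list which statistical metrics were used to compare CHIRPS with the rain gauges. As an illustration only, the sketch below computes three metrics that are commonly used for this kind of comparison (Pearson correlation, mean bias, and root-mean-square error); the choice of metrics, the function name, and the monthly totals are all assumptions.

```python
import math


def comparison_metrics(satellite, gauge):
    """Compare paired satellite and gauge rainfall values (same units)."""
    n = len(gauge)
    mean_s = sum(satellite) / n
    mean_g = sum(gauge) / n
    cov = sum((s - mean_s) * (g - mean_g) for s, g in zip(satellite, gauge))
    var_s = sum((s - mean_s) ** 2 for s in satellite)
    var_g = sum((g - mean_g) ** 2 for g in gauge)
    return {
        # Pearson correlation coefficient between the two series.
        "r": cov / math.sqrt(var_s * var_g),
        # Positive bias means the satellite product overestimates on average.
        "bias": mean_s - mean_g,
        "rmse": math.sqrt(sum((s - g) ** 2 for s, g in zip(satellite, gauge)) / n),
    }


# Hypothetical monthly rainfall totals in mm (not from the paper).
gauge = [20.0, 35.0, 50.0, 110.0, 250.0, 300.0]
satellite = [25.0, 30.0, 60.0, 100.0, 240.0, 320.0]
metrics = comparison_metrics(satellite, gauge)
print({name: round(value, 3) for name, value in metrics.items()})
```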

Assessing the Metric to Measuring Land-Use Change Suitability (토지 이용 변화 예측 모형의 정확도 검정을 위한 통계량 연구)

  • Kim, Oh Seok
    • Journal of the Economic Geographical Society of Korea / v.16 no.3 / pp.458-471 / 2013
  • This paper addresses the limitations of a map comparison metric called the Figure of Merit by employing a simple land change model. The metric was originally designed to overcome limitations of other existing statistics, such as Kappa, when assessing the predictive accuracy of land change models. A series of comparisons between null and predicted outcomes at multiple resolutions, together with a multi-resolution Figure of Merit analysis, are compared as validation techniques for spatially segregated calibration and validation datasets. The Figure of Merit at the null resolution in this paper was 57%, although future research is needed to determine whether this was simply a coincidence. A Figure of Merit greater than 50% would seem to represent a "Resolution of Merit" in that the Figure of Merit at that resolution becomes greater than the error. Thus, these two metrics should be used in tandem to assess the predictive accuracy of a land change model.
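For readers unfamiliar with the Figure of Merit, the sketch below follows the definition commonly used in the land change modelling literature: hits divided by the sum of hits, misses, false alarms, and changes predicted as the wrong category. This is assumed to match the paper's usage; the function names, category labels, and the eight-cell maps are hypothetical, and the multi-resolution analysis is not reproduced.

```python
def count_outcomes(initial, observed, predicted):
    """Classify each cell by comparing observed and predicted change."""
    hits = misses = false_alarms = wrong_hits = 0
    for start, obs, pred in zip(initial, observed, predicted):
        observed_change = obs != start
        predicted_change = pred != start
        if observed_change and predicted_change:
            if obs == pred:
                hits += 1        # change predicted correctly
            else:
                wrong_hits += 1  # change predicted, but to the wrong category
        elif observed_change:
            misses += 1          # observed change predicted as persistence
        elif predicted_change:
            false_alarms += 1    # observed persistence predicted as change
    return hits, misses, false_alarms, wrong_hits


def figure_of_merit(hits, misses, false_alarms, wrong_hits=0):
    """Ratio of correctly predicted change to all observed or predicted change."""
    total = hits + misses + false_alarms + wrong_hits
    if total == 0:
        raise ValueError("no observed or predicted change")
    return hits / total


# Hypothetical eight-cell maps (not from the paper).
initial = ["forest"] * 6 + ["urban"] * 2
observed = ["urban", "urban", "urban", "forest", "forest", "forest", "urban", "urban"]
predicted = ["urban", "urban", "forest", "urban", "forest", "forest", "urban", "urban"]
fom = figure_of_merit(*count_outcomes(initial, observed, predicted))
print(f"Figure of Merit = {fom:.0%}")
```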

Classification of mandibular molar furcation involvement in periapical radiographs by deep learning

  • Katerina Vilkomir;Cody Phen;Fiondra Baldwin;Jared Cole;Nic Herndon;Wenjian Zhang
    • Imaging Science in Dentistry / v.54 no.3 / pp.257-263 / 2024
  • Purpose: The purpose of this study was to classify mandibular molar furcation involvement (FI) in periapical radiographs using a deep learning algorithm. Materials and Methods: Full-mouth series taken at East Carolina University School of Dental Medicine from 2011-2023 were screened. Diagnostic-quality mandibular premolar and molar periapical radiographs with healthy or FI mandibular molars were included. The radiographs were cropped into individual molar images, annotated as "healthy" or "FI," and divided into training, validation, and testing datasets. The images were preprocessed with PyTorch transformations. ResNet-18, a convolutional neural network model, was refined using the PyTorch deep learning framework for the specific imaging classification task. CrossEntropyLoss was employed as the training loss function and AdamW as the optimizer. The images were loaded with the PyTorch DataLoader for efficiency. The performance of the ResNet-18 algorithm was evaluated with multiple metrics, including training and validation losses, confusion matrix, accuracy, sensitivity, specificity, the receiver operating characteristic (ROC) curve, and the area under the ROC curve. Results: After adequate training, ResNet-18 classified healthy vs. FI molars in the testing set with an accuracy of 96.47%, indicating its suitability for image classification. Conclusion: The deep learning algorithm developed in this study was shown to be promising for classifying mandibular molar FI. It could serve as a valuable supplemental tool for detecting and managing periodontal diseases.
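The evaluation metrics listed in this abstract can be computed from binary labels and model scores as in the sketch below. It is not the authors' code: the helper names, the 0.5 decision threshold, and the eight example scores are assumptions, and the area under the ROC curve is computed with the rank-based (Mann-Whitney) formulation rather than by tracing the curve.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = FI, 0 = healthy)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true, scores):
    """Probability that a random positive scores higher than a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted FI probabilities (not from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
print(confusion_metrics(y_true, y_pred), roc_auc(y_true, scores))
```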

VALIDATION OF ON-LINE MONITORING TECHNIQUES TO NUCLEAR PLANT DATA

  • Garvey, Jamie;Garvey, Dustin;Seibert, Rebecca;Hines, J. Wesley
    • Nuclear Engineering and Technology / v.39 no.2 / pp.133-142 / 2007
  • The Electric Power Research Institute (EPRI) demonstrated a method for monitoring the performance of instrument channels in Topical Report (TR) 104965, 'On-Line Monitoring of Instrument Channel Performance.' This paper presents the results of several models originally developed by EPRI to monitor three nuclear plant sensor sets: Pressurizer Level, Reactor Protection System (RPS) Loop A, and Reactor Coolant System (RCS) Loop A Steam Generator (SG) Level. The sensor sets investigated include one redundant sensor model and two non-redundant sensor models. Each model employs an Auto-Associative Kernel Regression (AAKR) model architecture to predict correct sensor behavior. The performance of each developed model is evaluated using four metrics: accuracy, auto-sensitivity, cross-sensitivity, and the newly developed Error Uncertainty Limit Monitoring (EULM) detectability. The uncertainty estimate for each model is also calculated through two methods: analytic formulas and Monte Carlo estimation. The uncertainty estimates are verified by calculating confidence interval coverages to ensure that 95% of the measured data fall within the confidence intervals. The model performance evaluation identified the Pressurizer Level model as acceptable for on-line monitoring (OLM) implementation. The other two models, RPS Loop A and RCS Loop A SG Level, highlight two common problems that occur in model development and evaluation, namely faulty data and poor signal selection.
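The general idea behind auto-associative kernel regression can be sketched as follows: a query observation is compared with stored fault-free observations, and the prediction is their kernel-weighted average. This is a generic illustration, not the EPRI models; the Gaussian kernel, the Euclidean distance, the bandwidth value, the absence of signal scaling, and the three-sensor data are all assumptions.

```python
import math


def aakr_predict(memory, query, bandwidth):
    """Predict 'corrected' sensor values as a kernel-weighted average
    of fault-free memory vectors."""
    weights = []
    for row in memory:
        dist_sq = sum((q - m) ** 2 for q, m in zip(query, row))
        # Gaussian kernel: memory vectors close to the query get large weights.
        weights.append(math.exp(-dist_sq / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all memory vectors")
    return [
        sum(w * row[j] for w, row in zip(weights, memory)) / total
        for j in range(len(query))
    ]


# Hypothetical fault-free readings from three redundant level sensors.
memory = [
    [50.0, 50.1, 49.9],
    [52.0, 52.1, 51.9],
    [54.0, 54.1, 53.9],
]
query = [52.0, 52.1, 53.0]  # third sensor assumed to have drifted upward
prediction = aakr_predict(memory, query, bandwidth=1.0)
residuals = [q - p for q, p in zip(query, prediction)]
print([round(r, 3) for r in residuals])
```

In this toy case the residual of the third sensor is much larger than the other two, which is the kind of signature an on-line monitoring scheme would flag.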

A Study on the Application of Systems Engineering for Systems of Systems (복합시스템을 위한 시스템엔지니어링 적용방안에 대한 연구)

  • Kim, Jong Yoel;Ham, Bum Sik;Cho, Tae Hyoung;Lee, Bum Sug
    • Journal of the Korean Society of Systems Engineering / v.6 no.1 / pp.15-23 / 2010
  • This paper describes how systems engineering can be applied to systems of systems (SoS) and addresses the evolution in approach as organizations have evolved from standalone platform providers to architects of SoS. There is an increasing need to perform Systems of Systems Engineering (SoSE) in a global environment. This has stimulated the development of new, innovative SoSE approaches that surpass current military and industry state-of-the-art systems engineering processes. Simply scaling up the former process was ineffective in dealing with globally distributed, large-scale integration efforts for network-centric systems of systems. A new SoSE process has been developed which is a significant breakthrough in the development of large complex systems and net-centric systems of systems. The SoSE process provides a complete, detailed and systematic development approach for military and civil SoS. From this point of view, this paper functionally analyzes the architecture framework, process, and integrated verification & validation of SoSE. Based on this analysis, perspectives on applying SoSE in a way suited to the Korean acquisition environment are presented.

Validation of Availability, Maintainability and Safety for Subway Elevating Equipment (도시철도 승강설비의 가용성, 유지보수성 및 안전성 입증)

  • Lee, Hwan-Deok;Jung, Won
    • Journal of Applied Reliability / v.16 no.4 / pp.272-279 / 2016
  • Purpose: To fulfill the RAMS (Reliability, Availability, Maintainability and Safety) requirements in Korean railway systems, specific target metrics should be measured and shown to meet the requirements when a new system is ordered. This paper presents a procedure to predict and demonstrate the availability, maintainability and safety of subway elevators, escalators and platform screen door (PSD) systems. Methods: The system manufacturer predicts availability and maintainability from lab and field test data. After installation, availability and maintainability are demonstrated using actual operational data. The data were collected from the FRACAS (Failure Report and Corrective Action System). Results: Methods and processes for assessing and analyzing availability, maintainability and safety are presented for elevating services and PSD systems. The data obtained through actual operation of the equipment are analyzed and maintained to predict RAMS from the component- and system-level failure data acquired. Conclusion: This study presents an application of IEC 62278 and operational data that can be used in the design and development stage to achieve the RAMS target values for subway elevators, escalators and PSD systems.
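The abstract does not give the formulas used to demonstrate availability and maintainability from operational data. A minimal sketch, assuming the widely used steady-state relation availability = MTBF / (MTBF + MTTR) and a simple failure log, is shown below; the function name and the figures are hypothetical and do not come from the paper or from IEC 62278.

```python
def availability_metrics(operating_hours, repair_hours):
    """Estimate MTBF, MTTR and availability from operational records.

    operating_hours: total time the equipment was in service (hours)
    repair_hours:    one repair duration (hours) per recorded failure
    """
    n_failures = len(repair_hours)
    if n_failures == 0:
        raise ValueError("no failures recorded: MTBF cannot be estimated")
    mtbf = operating_hours / n_failures      # mean time between failures
    mttr = sum(repair_hours) / n_failures    # mean time to repair
    availability = mtbf / (mtbf + mttr)      # assumed steady-state relation
    return mtbf, mttr, availability


# Hypothetical failure log for one escalator (not from the paper).
mtbf, mttr, availability = availability_metrics(
    operating_hours=8000.0,
    repair_hours=[2.0, 4.0, 1.5, 2.5],
)
print(f"MTBF={mtbf:.0f} h, MTTR={mttr:.1f} h, availability={availability:.4%}")
```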

A Reusability Measurement of the Reused Component by Employing Rough and Fuzzy Sets (러프와 퍼지 집합을 이용한 재사용 컴포넌트의 재사용도 측정)

  • Kim, Hye-Gyeong;Choe, Wan-Gyu;Lee, Seong-Ju
    • The Transactions of the Korea Information Processing Society / v.6 no.9 / pp.2365-2372 / 1999
  • A reusability measurement model should satisfy the following conditions: 1) metrics and components can be inserted and deleted easily, 2) components can be compared and evaluated quantitatively on the basis of validation, 3) no prior assumed knowledge is required, and 4) the significance of each measurement attribute can be computed objectively. In this paper, we therefore propose a new reusability measurement model that satisfies these requirements. Our model selects the appropriate measurement attributes and calculates their relative significance using rough sets. Then, to measure the reusability of a component, it integrates the significance of the attributes with their measured values using a fuzzy integral. Finally, we apply our model to the reusability measurement of function-oriented components and validate it using statistical techniques.
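One standard way to derive attribute significance from rough sets is to measure how much the dependency degree drops when an attribute is removed. The sketch below illustrates that textbook construction only; whether the paper uses exactly this definition is an assumption, the fuzzy integral step is not shown, and the attribute names and the decision table are hypothetical.

```python
from collections import defaultdict


def dependency(table, condition_attrs, decision_attr):
    """Rough-set dependency degree: fraction of objects whose condition
    values determine the decision unambiguously (the positive region)."""
    blocks = defaultdict(list)
    for row in table:
        key = tuple(row[a] for a in condition_attrs)
        blocks[key].append(row[decision_attr])
    positive = sum(len(d) for d in blocks.values() if len(set(d)) == 1)
    return positive / len(table)


def significance(table, condition_attrs, decision_attr):
    """Significance of each attribute: loss of dependency when it is dropped."""
    full = dependency(table, condition_attrs, decision_attr)
    result = {}
    for attr in condition_attrs:
        reduced = [a for a in condition_attrs if a != attr]
        result[attr] = full - dependency(table, reduced, decision_attr)
    return result


# Hypothetical decision table of six components (not from the paper).
components = [
    {"size": "small", "coupling": "low", "reusable": "yes"},
    {"size": "small", "coupling": "high", "reusable": "no"},
    {"size": "large", "coupling": "low", "reusable": "yes"},
    {"size": "large", "coupling": "high", "reusable": "no"},
    {"size": "small", "coupling": "low", "reusable": "yes"},
    {"size": "large", "coupling": "high", "reusable": "no"},
]
print(significance(components, ["size", "coupling"], "reusable"))
```

In this toy table the decision depends only on coupling, so dropping size loses nothing while dropping coupling removes all discriminating power.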

A Performance Analysis Model of PC-based Software Router Supporting IPv6-IPv4 Translation for Residential Gateway

  • Seo, Ssang-Hee;Kong, In-Yeup
    • Journal of Information Processing Systems / v.1 no.1 s.1 / pp.62-69 / 2005
  • This paper presents a queuing analysis model of a PC-based software router supporting IPv6-IPv4 translation for a residential gateway. The proposed models are M/G/1/K or MMPP-2/G/1/K, depending on the arrival process of the software PC router. M/G/1/K models normal traffic and MMPP-2/G/1/K models bursty traffic. In M/G/1/K, the arrival process is assumed to be a Poisson process, whose interarrival times are independent and identically distributed. In MMPP-2/G/1/K, the arrival process is assumed to be a two-state Markov Modulated Poisson Process (MMPP), which switches from one state to the other with a given intensity. The service time follows a general distribution and the service discipline of the server is processor sharing. The total number of packets that can be processed at one time is limited to K. From the proposed model we obtain performance metrics of the PC-based software router for a residential gateway, such as system sojourn time, blocking probability, and throughput. Compared to other models, our model is simpler and its parameters are easier to estimate. Validation results show that the model estimates the performance of the target system.
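To show how the three performance metrics named in the abstract relate to one another, the sketch below uses the textbook M/M/1/K queue, that is, Poisson arrivals, exponential service times, and at most K packets in the system. This is a deliberate simplification and not the paper's M/G/1/K or MMPP-2/G/1/K model; the function name and the rates are hypothetical.

```python
def mm1k_metrics(arrival_rate, service_rate, capacity):
    """Blocking probability, throughput and mean sojourn time of an M/M/1/K queue."""
    rho = arrival_rate / service_rate
    if abs(rho - 1.0) < 1e-12:
        # All K + 1 states are equally likely when load equals capacity.
        probs = [1.0 / (capacity + 1)] * (capacity + 1)
    else:
        norm = (1.0 - rho ** (capacity + 1)) / (1.0 - rho)
        probs = [rho ** n / norm for n in range(capacity + 1)]
    blocking = probs[capacity]                  # an arrival finds the system full
    throughput = arrival_rate * (1.0 - blocking)
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    sojourn = mean_in_system / throughput       # Little's law on accepted packets
    return {"blocking": blocking, "throughput": throughput, "sojourn": sojourn}


# Hypothetical router parameters: 8000 packets/s offered, 10000 packets/s served, K = 10.
metrics = mm1k_metrics(arrival_rate=8000.0, service_rate=10000.0, capacity=10)
print({name: round(value, 6) for name, value in metrics.items()})
```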