• Title/Summary/Keyword: Receiver Operating Characteristic Curve

Nomogram comparison conducted by logistic regression and naïve Bayesian classifier using type 2 diabetes mellitus (T2D) (제 2형 당뇨병을 이용한 로지스틱과 베이지안 노모그램 구축 및 비교)

  • Park, Jae-Cheol;Kim, Min-Ho;Lee, Jea-Young
    • The Korean Journal of Applied Statistics
    • /
    • v.31 no.5
    • /
    • pp.573-585
    • /
    • 2018
  • In this study, we fit a logistic regression model and a naïve Bayesian classifier model with 11 risk factors to predict the probability of type 2 diabetes mellitus incidence. We then show how to construct a nomogram that helps people understand the prediction visually. We use data from the 2013-2015 Korean National Health and Nutrition Examination Survey (KNHANES). We include three interaction terms in the logistic regression model to improve the quality of the analysis and facilitate the application of the left-aligned method to the Bayesian nomogram. Finally, we compare the two nomograms, examine their utility, and verify them using the ROC curve.
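
As a rough illustration of how a logistic regression model can be turned into nomogram axes, the sketch below uses one common convention: the predictor whose effect spans the widest range of the linear predictor receives 100 points, and the others are scaled in proportion. The coefficient values, predictor names, and ranges are hypothetical and are not taken from the study; the abstract does not state which scaling the authors used.

```python
import math

def nomogram_points(coefs, ranges):
    """Maximum points per predictor under the 0-100 scaling convention.

    coefs:  {name: regression coefficient}        (hypothetical values)
    ranges: {name: (min_value, max_value)}        (hypothetical values)
    """
    # Span of each predictor's contribution to the linear predictor.
    spans = {k: abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs}
    widest = max(spans.values())
    # The widest span gets 100 points; the rest are scaled proportionally.
    return {k: 100.0 * spans[k] / widest for k in coefs}

def predicted_probability(intercept, coefs, x):
    """Logistic probability for one subject with predictor values x."""
    linear_predictor = intercept + sum(coefs[k] * x[k] for k in coefs)
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Hypothetical two-predictor model
coefs = {"age": 0.04, "bmi": 0.10}
ranges = {"age": (20, 80), "bmi": (15, 40)}
points = nomogram_points(coefs, ranges)
```

Here `bmi` spans 2.5 units of the linear predictor and `age` spans 2.4, so `bmi` defines the 100-point axis and `age` tops out slightly below it.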

A Logistic Model Including Risk Factors for Lymph Node Metastasis Can Improve the Accuracy of Magnetic Resonance Imaging Diagnosis of Rectal Cancer

  • Ogawa, Shimpei;Itabashi, Michio;Hirosawa, Tomoichiro;Hashimoto, Takuzo;Bamba, Yoshiko;Kameoka, Shingo
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.16 no.2
    • /
    • pp.707-712
    • /
    • 2015
  • Background: To evaluate use of magnetic resonance imaging (MRI) and a logistic model including risk factors for lymph node metastasis for improved diagnosis. Materials and Methods: The subjects were 176 patients with rectal cancer who underwent preoperative MRI. The longest lymph node diameter was measured and a cut-off value for positive lymph node metastasis was established based on a receiver operating characteristic (ROC) curve. A logistic model was constructed based on MRI findings and risk factors for lymph node metastasis extracted from logistic-regression analysis. The diagnostic capabilities of MRI alone and those of the logistic model were compared using the area under the curve (AUC) of the ROC curve. Results: The cut-off value was a diameter of 5.47 mm. Diagnosis using MRI had an accuracy of 65.9%, sensitivity 73.5%, specificity 61.3%, positive predictive value (PPV) 62.9%, and negative predictive value (NPV) 72.2% [AUC: 0.6739 (95%CI: 0.6016-0.7388)]. Age (<59) (p=0.0163), pT (T3+T4) (p=0.0001), and BMI (<23.5) (p=0.0003) were extracted as independent risk factors for lymph node metastasis. Diagnosis using MRI with the logistic model had an accuracy of 75.0%, sensitivity 72.3%, specificity 77.4%, PPV 74.1%, and NPV 75.8% [AUC: 0.7853 (95%CI: 0.7098-0.8454)], showing a significantly improved diagnostic capacity using the logistic model (p=0.0002). Conclusions: A logistic model including risk factors for lymph node metastasis can improve the accuracy of MRI diagnosis of rectal cancer.
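
The accuracy, sensitivity, specificity, PPV, and NPV quoted above all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions; the counts in the example are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # all correct calls
        "sensitivity": tp / (tp + fn),   # true positives among diseased
        "specificity": tn / (tn + fp),   # true negatives among non-diseased
        "ppv": tp / (tp + fp),           # diseased among positive calls
        "npv": tn / (tn + fn),           # non-diseased among negative calls
    }

# Hypothetical counts, for illustration only
metrics = diagnostic_metrics(tp=40, fp=10, tn=30, fn=20)
```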

Model Based on Alkaline Phosphatase and Gamma-Glutamyltransferase for Gallbladder Cancer Prognosis

  • Xu, Xin-Sen;Miao, Run-Chen;Zhang, Ling-Qiang;Wang, Rui-Tao;Qu, Kai;Pang, Qing;Liu, Chang
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.16 no.15
    • /
    • pp.6255-6259
    • /
    • 2015
  • Purpose: To evaluate the prognostic value of alkaline phosphatase (ALP) and gamma-glutamyltransferase (GGT) in gallbladder cancer (GBC). Materials and Methods: Serum ALP and GGT levels and clinicopathological parameters were retrospectively evaluated in 199 GBC patients. Receiver operating characteristic (ROC) curve analysis was performed to determine the cut-off values of ALP and GGT. Then, associations with overall survival were assessed by multivariate analysis. Based on the significant factors, a prognostic score model was established. Results: By ROC curve analysis, ALP $\geq 210$ U/L and GGT $\geq 43$ U/L were considered elevated. Overall survival for patients with elevated ALP and GGT was significantly worse than for patients within the normal range. Multivariate analysis showed that elevated ALP, elevated GGT, and tumor stage were independent prognostic factors. Giving each positive factor a score of 1, we established a preoperative prognostic score model. Outcomes were significantly distinguished by the different score groups. By further ROC curve analysis, the simple score showed great superiority compared with the widely used TNM staging, ALP or GGT alone, and traditional tumor markers such as CEA, AFP, CA125 and CA199. Conclusions: Elevated ALP and GGT levels were risk predictors in GBC patients. Our prognostic model provides information on the varied outcomes of patients from different score groups.
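
The scoring rule described above (one point per positive factor) can be sketched as follows. The ALP and GGT cut-offs come from the abstract; how tumor stage is dichotomized is not stated there, so the `advanced_stage` flag is an assumption made for illustration.

```python
ALP_CUTOFF = 210  # U/L, cut-off reported in the abstract
GGT_CUTOFF = 43   # U/L, cut-off reported in the abstract

def prognostic_score(alp, ggt, advanced_stage):
    """Return a 0-3 score: one point for each positive factor.

    advanced_stage is a hypothetical boolean; the abstract does not say
    which tumor stages count as a positive factor.
    """
    return (
        int(alp >= ALP_CUTOFF)
        + int(ggt >= GGT_CUTOFF)
        + int(bool(advanced_stage))
    )
```

For example, a patient with ALP 210 U/L, GGT 42 U/L, and no advanced stage would score 1 under these assumptions.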

Comparison of machine learning algorithms for Chl-a prediction in the middle of Nakdong River (focusing on water quality and quantity factors) (머신러닝 기법을 활용한 낙동강 중류 지역의 Chl-a 예측 알고리즘 비교 연구(수질인자 및 수량 중심으로))

  • Lee, Sang-Min;Park, Kyeong-Deok;Kim, Il-Kyu
    • Journal of Korean Society of Water and Wastewater
    • /
    • v.34 no.4
    • /
    • pp.277-288
    • /
    • 2020
  • In this study, we applied machine learning algorithms to predict algae in terms of chlorophyll-a (Chl-a). Water quality and quantity data from the middle Nakdong River area were used. First, we analyzed the correlation between Chl-a and the water quality and quantity data. We extracted the ten most important water quality and quantity factors for the two weirs, and the algorithms predicted how these ten factors affected Chl-a occurrence. We implemented decision tree, random forest, elastic net, and gradient boosting algorithms in Python. The root mean square error (RMSE) was used to evaluate the algorithms. Gradient boosting showed an RMSE of 10.55 for the Gangjeonggoryeong (GG) site and 11.43 for the Dalsung (DS) site, which were excellent results for both sites. The predictions of the four algorithms were also evaluated using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The AUC was 0.877 at the GG site and 0.951 at the DS site, so the algorithm's interpretive ability appeared to be excellent.
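
For reference, the RMSE used to rank the algorithms is the square root of the mean squared difference between observed and predicted values. A minimal sketch, with made-up numbers rather than the study's Chl-a data:

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up values for illustration only
error = rmse([0.0, 0.0], [3.0, 4.0])  # sqrt((9 + 16) / 2)
```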

Cross Validation of Attention-Deficit/Hyperactivity Disorder-After School Checklist

  • Lee, Sukhyun;Kim, Bongseog;Yoo, Hanik K.;Huh, Hannah;Roh, Jaewoo
    • Journal of the Korean Academy of Child and Adolescent Psychiatry
    • /
    • v.29 no.3
    • /
    • pp.129-136
    • /
    • 2018
  • Objectives: This study aimed to evaluate the efficacy of the attention-deficit/hyperactivity disorder (ADHD)-After School Checklist (ASK) by comparing the results of the Comprehensive Attention Test (CAT) and Clinical Global Impression-Severity (CGI-S) Scale and then by calculating the area under the receiver operating characteristic (ROC) curve. Methods: We performed correlation analyses on the ASK and CAT results and then the ASK and CGI-S results. We created a ROC curve and evaluated performance on the ASK as a diagnostic tool. We then analyzed the test results of 1348 subjects (male 56.8%), including 1201 subjects in the general population and 147 ADHD subjects, aged 6-15 years, from kindergarten to middle school in Seoul and Gyeonggi province, South Korea. Results: According to the correlation analyses, ASK scores and the Attention Quotient (AQ) of CAT scores showed significant correlations ranging from -0.20 to -0.29 (p<0.05). The t-test between ADHD scores and CGI-S also showed a significant correlation (t=-2.55, p<0.05). The area under the ROC curve was calculated as 0.81, indicating good efficacy of the ASK, and the cut-off score was calculated as 15.5. Conclusion: The ASK can be used as a valid tool not only to evaluate functional impairment of ADHD children and adolescents but also to screen ADHD.
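
The area under the ROC curve reported above can be read as the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half. The sketch below computes it that way by pairwise comparison; the scores are invented and are not ASK data.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly.

    A tie between a case and a control counts as half a correct pair.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Invented checklist scores for illustration only
example_auc = auc([2, 3], [1, 2.5])
```

The pairwise loop is quadratic in sample size, which is fine for a sketch; rank-based formulas give the same value more efficiently.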

Functional Prediction of Hypothetical Proteins from Shigella flexneri and Validation of the Predicted Models by Using ROC Curve Analysis

  • Gazi, Md. Amran;Mahmud, Sultan;Fahim, Shah Mohammad;Kibria, Mohammad Golam;Palit, Parag;Islam, Md. Rezaul;Rashid, Humaira;Das, Subhasish;Mahfuz, Mustafa;Ahmeed, Tahmeed
    • Genomics & Informatics
    • /
    • v.16 no.4
    • /
    • pp.26.1-26.12
    • /
    • 2018
  • Shigella spp. constitute some of the key pathogens responsible for the global burden of diarrhoeal disease. With over 164 million reported cases per annum, shigellosis accounts for 1.1 million deaths each year. The majority of these cases occur among children in developing nations, and the emergence of multidrug-resistant Shigella strains in clinical isolates demands the development of better or new drugs against this pathogen. Extensive analysis of the genome of Shigella flexneri identified 4,362 proteins, among which the functions of 674 proteins, termed hypothetical proteins (HPs), had not been previously elucidated. The amino acid sequences of all 674 HPs were studied, and functions were assigned to a total of 39 HPs with a high level of confidence. Here we have utilized a combination of the latest versions of databases to assign the precise function of HPs for which no experimental information is available. These HPs were found to belong to various classes of proteins such as enzymes, binding proteins, signal transducers, lipoproteins, transporters, virulence proteins, and other proteins. The performance of the various computational tools was evaluated using receiver operating characteristic curve analysis, and a high average accuracy of 93.6% was obtained. Our comprehensive analysis will help to gain a greater understanding for the development of novel potential therapeutic interventions to defeat Shigella infection.

An RNN-based Fault Detection Scheme for Digital Sensor (RNN 기반 디지털 센서의 Rising time과 Falling time 고장 검출 기법)

  • Lee, Gyu-Hyung;Lee, Young-Doo;Koo, In-Soo
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.19 no.1
    • /
    • pp.29-35
    • /
    • 2019
  • With the emergence of the fourth industrial revolution, many companies are increasingly interested in smart factories, and the importance of sensors is being emphasized. If the sensors that collect sensing data fail, the plant cannot be optimized or operated properly, which may incur a financial loss. It is therefore necessary to diagnose the status of sensors to prevent sensor faults. In this paper, we propose a scheme that diagnoses digital-sensor faults by analyzing the rising time and falling time of digital sensors with LSTM (Long Short-Term Memory), a deep learning RNN algorithm. The experimental results of the proposed scheme are compared with those of a rule-based fault diagnosis algorithm in terms of accuracy and the AUC (Area Under the Curve) of the ROC (Receiver Operating Characteristic) curve. The results show that the proposed system has better and more stable performance than the rule-based fault diagnosis algorithm.

Determination of Urinary Cotinine Cut-Off Point for Discriminating Smokers and Non-Smokers among Adolescents: The Third Cycle of the Korean National Environmental Health Survey (2015~2017) (청소년의 흡연자 선별을 위한 소변 중 코티닌 절사점 결정: 제3기 국민환경보건 기초조사(2015~2017))

  • Jung, Sunkyoung;Park, Sangshin
    • Journal of Environmental Health Sciences
    • /
    • v.47 no.4
    • /
    • pp.320-329
    • /
    • 2021
  • Background: Smoking exposure may be objectively assessed through specific biomarkers. The most common biomarker for smoking is cotinine concentration in urine, and setting an optimal cut-off point can accurately classify smoking status. Such a cut-off point for Korean adolescents has never been studied. Objectives: The aim of this study was to determine a cut-off point for urinary cotinine concentration for the discrimination of smoking in adolescents. Methods: Participants were adolescents aged 13~18 years who participated in the third cycle of the Korean National Environmental Health Survey. We used urine samples to measure cotinine concentrations. Smoking status was determined by a self-reported questionnaire. We identified the optimal cotinine cut-off point for discriminating smoking status using receiver operating characteristic curve analysis. Results: Of the 904 participants, 28 (3.1%) were smokers, among whom 20 (71.4%) were male. The median urinary cotinine concentration in smokers was 218 ㎍/L (male: 215 ㎍/L, female: 303 ㎍/L), and that in non-smokers was 1.31 ㎍/L (male: 1.46 ㎍/L, female: 1.18 ㎍/L). We found significant differences in urinary cotinine concentration according to smoking status and sex (p<0.001). Urinary cotinine concentration performed well for identifying smoking adolescents [area under the curve: 0.954 (male: 0.963, female: 0.908)]. The cut-off that optimally distinguished smokers from non-smokers was 39.85 ㎍/L (sensitivity: 89.3%, specificity: 97.4%). Males [39.85 ㎍/L (sensitivity: 90.0%, specificity: 94.9%)] had a different optimal cut-off point than females [26.26 ㎍/L (sensitivity: 87.5%, specificity: 99.6%)]. Conclusions: This study determined a cut-off point for urinary cotinine of 39.85 ㎍/L (male: 39.85 ㎍/L, female: 26.26 ㎍/L) to distinguish smokers from non-smokers in adolescents.
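
The abstract does not say which criterion defined the "optimal" cut-off. One widely used choice is Youden's J (sensitivity + specificity - 1), and the sketch below assumes it. Candidate cut-offs are the observed values themselves, whereas published cut-offs are often midpoints between adjacent values; the concentrations are invented.

```python
def best_cutoff(values, is_smoker):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    A subject is classified as a smoker when value >= cutoff.
    Assumes at least one smoker and one non-smoker are present.
    """
    n_pos = sum(1 for s in is_smoker if s)
    n_neg = len(is_smoker) - n_pos
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, s in zip(values, is_smoker) if s and v >= cutoff)
        tn = sum(1 for v, s in zip(values, is_smoker) if not s and v < cutoff)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1.0
        # Strict comparison keeps the lowest cutoff when J is tied.
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1:]

# Invented urinary cotinine values (㎍/L) and smoking labels
values = [1.0, 2.0, 5.0, 50.0, 200.0, 300.0]
labels = [0, 0, 0, 1, 1, 1]
result = best_cutoff(values, labels)
```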

Landslide Risk Assessment of Cropland and Man-made Infrastructures using Bayesian Predictive Model (베이지안 예측모델을 활용한 농업 및 인공 인프라의 산사태 재해 위험 평가)

  • Al, Mamun;Jang, Dong-Ho
    • Journal of The Geomorphological Association of Korea
    • /
    • v.27 no.3
    • /
    • pp.87-103
    • /
    • 2020
  • The purpose of this study is to evaluate the risk to cropland and man-made infrastructure in a landslide-prone area using a GIS-based method. To achieve this goal, a landslide inventory map was prepared based on aerial photograph analysis as well as field observations. A total of 550 landslides were counted in the entire study area. For model analysis and validation, the extracted landslides were randomly divided into two groups. Landslide causative factors such as slope, aspect, curvature, topographic wetness index, elevation, forest type, forest crown density, geology, land use, soil drainage, and soil texture were used in the analysis. Moreover, to identify the correlation between landslides and causative factors, pixels were divided into several classes and the frequency ratio was extracted. A landslide susceptibility map was constructed using a Bayesian predictive model (BPM) based on all of the events. In the cross-validation process, the landslide susceptibility map and the observation data were plotted as a receiver operating characteristic (ROC) curve, the area under the curve (AUC) was calculated, and an attempt was made to extract a success rate curve. The results showed that the BPM produced 85.8% accuracy. We consider the model acceptable for the landslide susceptibility analysis of the study area. In addition, for risk assessment, a monetary value (local) and a vulnerability scale were added for each social thematic data layer, and the values were then converted into US dollars considering the landslide occurrence time. Moreover, the total number of pixels in the study area and the number of predicted landslide-affected pixels were used to build a probability table. To match the affected number, 5,000 landslide pixels were assumed for the final calculation. Based on the result, the estimated total risk was US$35.4 million for cropland and US$39.3 million for man-made infrastructure.
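
The frequency ratio mentioned above is commonly defined, for each class of a causative factor, as the share of landslide pixels falling in that class divided by the share of all pixels in that class. The sketch below assumes that definition, which the abstract does not spell out; the class labels and pixel counts are invented.

```python
def frequency_ratio(landslide_pixels, class_pixels):
    """Frequency ratio per class of one causative factor.

    landslide_pixels: {class: landslide pixel count in that class}
    class_pixels:     {class: total pixel count in that class}
    A ratio above 1 means landslides are over-represented in the class.
    """
    total_landslide = sum(landslide_pixels.values())
    total_pixels = sum(class_pixels.values())
    ratios = {}
    for cls, n_pixels in class_pixels.items():
        landslide_share = landslide_pixels.get(cls, 0) / total_landslide
        area_share = n_pixels / total_pixels
        ratios[cls] = landslide_share / area_share
    return ratios

# Invented slope classes (degrees) and pixel counts
class_pixels = {"0-15": 600, "15-30": 300, ">30": 100}
landslide_pixels = {"0-15": 10, "15-30": 40, ">30": 50}
fr = frequency_ratio(landslide_pixels, class_pixels)
```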

Mapping Landslide Susceptibility Based on Spatial Prediction Modeling Approach and Quality Assessment (공간예측모형에 기반한 산사태 취약성 지도 작성과 품질 평가)

  • Al, Mamun;Park, Hyun-Su;JANG, Dong-Ho
    • Journal of The Geomorphological Association of Korea
    • /
    • v.26 no.3
    • /
    • pp.53-67
    • /
    • 2019
  • The purpose of this study is to assess the quality of landslide susceptibility mapping in a landslide-prone area (Jinbu-myeon, Gangwon-do, South Korea) using a spatial prediction modeling approach and to compare the results obtained. For this goal, a landslide inventory map was prepared mainly from historical information and aerial photograph analysis (Daum Map, 2008), as well as some field observation. Altogether, 550 landslides were counted in the whole study area. Of these, 182 were debris flows, and each group of landslides was entered in the inventory map separately. The landslide inventory was then randomly split in Excel: 50% of the landslides were used for model analysis and the remaining 50% for validation. A total of 12 contributing factors (slope, aspect, curvature, topographic wetness index (TWI), elevation, forest type, forest timber diameter, forest crown density, geology, land use, soil depth, and soil drainage) were used in the analysis. Moreover, to find the correlation between landslide causative factors and landslide incidents, pixels were divided into several classes and the frequency ratio for each class was extracted. Six landslide susceptibility maps were then constructed using the Bayesian Predictive Discriminant (BPD), Empirical Likelihood Ratio (ELR), and Linear Regression Method (LRM) models based on the different data categories. Finally, in the cross-validation process, each landslide susceptibility map was plotted as a receiver operating characteristic (ROC) curve, the area under the curve (AUC) was calculated, and an attempt was made to extract a success rate curve. The results showed that the Bayesian, likelihood, and linear models had accuracies of 85.52%, 85.23%, and 83.49%, respectively, for the total data. For the debris flow category, the results were slightly better than for the total data, with accuracies of 86.33%, 85.53%, and 84.17%. This means that all three models were reasonable methods for landslide susceptibility analysis. The models proved to produce reliable predictions for regional spatial planning or land-use planning.
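
The random 50/50 division of the inventory into modeling and validation sets, which the authors carried out in Excel, could be reproduced along the following lines. This is a sketch rather than the authors' procedure; the function name, the seed, and the use of integer landslide IDs are assumptions.

```python
import random

def split_inventory(landslide_ids, train_fraction=0.5, seed=0):
    """Randomly split landslide IDs into modeling and validation sets."""
    rng = random.Random(seed)      # fixed seed makes the split repeatable
    shuffled = list(landslide_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# 550 landslides, as counted in the study area; the IDs are hypothetical
modeling, validation = split_inventory(range(550))
```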