• Title/Abstract/Keywords: time-dependent ROC

Search results: 9 items (processing time: 0.028 seconds)

Review for time-dependent ROC analysis under diverse survival models

  • 김양진
    • The Korean Journal of Applied Statistics (응용통계연구), Vol. 35, No. 1, pp. 35-47, 2022
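  • The receiver operating characteristic (ROC) curve has been widely used to measure how well a marker classifies and predicts binary response data, and it has recently come to play a very important role in survival analysis as well. To assess how well a marker predicts event occurrence in survival data that contain various kinds of incomplete observations, such as several types of censoring and unknown causes of failure, the existing statistics have been extended. Survival data can be viewed as the event status at each time point, so an ROC curve and an AUC can be obtained at every time point. This paper introduces the methods used under right censoring and competing risks models together with the related R packages, describes and compares the characteristics of each method, and examines them in a simple simulation study. In addition, a marker analysis of dementia data collected in France is presented.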
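
The idea that each time point defines its own case and control groups can be sketched as follows. This is a minimal illustration of the cumulative/dynamic definition only, assuming every event time is fully observed (no censoring); the function name and the toy data are hypothetical and are not taken from the paper or from any R package it reviews.

```python
import numpy as np

def cumulative_dynamic_auc(time, marker, t):
    """AUC at time t with cases = {T <= t} and controls = {T > t}.

    Assumption: all event times are observed (no censoring).
    Higher marker values are taken to indicate earlier events.
    """
    time = np.asarray(time, dtype=float)
    marker = np.asarray(marker, dtype=float)
    cases = marker[time <= t]
    controls = marker[time > t]
    if cases.size == 0 or controls.size == 0:
        return float("nan")
    # Compare every case marker with every control marker; ties count 1/2.
    diff = cases[:, None] - controls[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

# Hypothetical toy data: six subjects with observed event times.
times = [2, 3, 5, 7, 11, 13]
marker = [0.9, 0.3, 0.4, 0.6, 0.2, 0.1]
for t in (4, 8):
    print(t, cumulative_dynamic_auc(times, marker, t))
```

Because the case and control sets change with t, the same marker can have a different AUC at each evaluation time.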

Estimation of the time-dependent AUC for cure rate model with covariate dependent censoring

  • Yang-Jin Kim
    • Communications for Statistical Applications and Methods, Vol. 31, No. 4, pp. 365-375, 2024
  • Diverse methods for evaluating prediction models for a time to event have been proposed in the context of right-censored data in which all subjects are assumed to be susceptible. A time-dependent AUC (area under the curve) measures the predictive ability of a marker based on a case group and a control group that vary over time. When a substantial portion of subjects are event-free, the population consists of a susceptible group and a cured group. The uncertain cure status of censored subjects makes it difficult to define both the case group and the control group. In this paper, our goal is to propose a time-dependent AUC for a cure rate model when the censoring distribution depends on covariates. A class of inverse probability of censoring weighted (IPCW) AUC estimators is proposed to adjust for the possible sampling bias. We evaluate the finite-sample performance of the suggested methods under diverse simulation schemes, and an application to the melanoma dataset is presented for comparison with other methods.
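
To make the IPCW idea concrete, here is a minimal sketch of a generic IPCW cumulative/dynamic AUC. It is deliberately simpler than the estimators described in the abstract: the censoring distribution is estimated by a marginal Kaplan-Meier curve (no covariate dependence) and there is no cure fraction. The function names and toy data are hypothetical, and tied times are handled naively.

```python
import numpy as np

def censoring_km_left(time, event):
    """Kaplan-Meier estimate of G(t-) = P(C >= t) at each subject's own time.

    Censoring (event == 0) plays the role of the 'event' here.
    Assumption: marginal censoring, i.e. independent of covariates.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    surv = 1.0
    g_left = {}
    for u in np.unique(time):          # np.unique returns sorted times
        g_left[u] = surv               # value just before u
        at_risk = np.sum(time >= u)
        n_censored = np.sum((time == u) & (event == 0))
        surv *= 1.0 - n_censored / at_risk
    return np.array([g_left[u] for u in time])

def ipcw_auc(time, event, marker, t):
    """IPCW cumulative/dynamic AUC at time t.

    Cases: observed events with T <= t, weighted by 1 / G(T-).
    Controls: subjects still under observation after t (equal weights).
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    marker = np.asarray(marker, dtype=float)
    g = censoring_km_left(time, event)
    is_case = (time <= t) & (event == 1)
    is_control = time > t
    if not is_case.any() or not is_control.any():
        return float("nan")
    w = 1.0 / g[is_case]
    diff = marker[is_case][:, None] - marker[is_control][None, :]
    concordant = (diff > 0) + 0.5 * (diff == 0)
    return float((w[:, None] * concordant).sum() / (w.sum() * is_control.sum()))

# Hypothetical toy data: event = 1 is an observed event, 0 is censored.
time = [2, 3, 5, 7, 11, 13]
event = [1, 0, 1, 1, 0, 1]
marker = [0.9, 0.3, 0.15, 0.6, 0.2, 0.1]
print(ipcw_auc(time, event, marker, t=8))
```

Subjects censored before t are neither cases nor controls; the weights on the observed cases compensate for them. When censoring depends on covariates, as in the abstract, the marginal Kaplan-Meier step would need to be replaced by a covariate-dependent censoring model.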

Machine Learning-Based Prediction of COVID-19 Severity and Progression to Critical Illness Using CT Imaging and Clinical Data

  • Subhanik Purkayastha;Yanhe Xiao;Zhicheng Jiao;Rujapa Thepumnoeysuk;Kasey Halsey;Jing Wu;Thi My Linh Tran;Ben Hsieh;Ji Whae Choi;Dongcui Wang;Martin Vallieres;Robin Wang;Scott Collins;Xue Feng;Michael Feldman;Paul J. Zhang;Michael Atalay;Ronnie Sebro;Li Yang;Yong Fan;Wei-hua Liao;Harrison X. Bai
    • Korean Journal of Radiology, Vol. 22, No. 7, pp. 1213-1224, 2021
  • Objective: To develop a machine learning (ML) pipeline based on radiomics to predict Coronavirus Disease 2019 (COVID-19) severity and the future deterioration to critical illness using CT and clinical variables. Materials and Methods: Clinical data were collected from 981 patients from a multi-institutional international cohort with real-time polymerase chain reaction-confirmed COVID-19. Radiomics features were extracted from chest CT of the patients. The data of the cohort were randomly divided into training, validation, and test sets using a 7:1:2 ratio. An ML pipeline consisting of a model to predict severity and a time-to-event model to predict progression to critical illness was trained on radiomics features and clinical variables. The receiver operating characteristic area under the curve (ROC-AUC), concordance index (C-index), and time-dependent ROC-AUC were calculated to determine model performance, which was compared with consensus CT severity scores obtained by visual interpretation by radiologists. Results: Among 981 patients with confirmed COVID-19, 274 patients developed critical illness. Radiomics features and clinical variables resulted in the best performance for the prediction of disease severity, with the highest test ROC-AUC of 0.76, compared with 0.70 for visual CT severity score and clinical variables (p = 0.023). The progression prediction model achieved a test C-index of 0.868 when it was based on the combination of CT radiomics and clinical variables, compared with 0.767 when based on CT radiomics features alone (p < 0.001), 0.847 when based on clinical variables alone (p = 0.110), and 0.860 when based on the combination of visual CT severity scores and clinical variables (p = 0.549). Furthermore, the model based on the combination of CT radiomics and clinical variables achieved time-dependent ROC-AUCs of 0.897, 0.933, and 0.927 for the prediction of progression risks at 3, 5, and 7 days, respectively. Conclusion: CT radiomics features combined with clinical variables were predictive of COVID-19 severity and progression to critical illness with fairly high accuracy.
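
For readers unfamiliar with the concordance index reported above, the following is a minimal sketch of Harrell's C-index for right-censored data. It is an illustration of the general definition, not the implementation used in the study; the function name and toy data are hypothetical.

```python
import numpy as np

def harrell_c_index(time, event, risk):
    """Harrell's C: share of comparable pairs in which the subject who
    fails earlier also has the higher predicted risk (ties count 1/2).

    A pair (i, j) is comparable when subject i has an observed event
    and time[i] < time[j].
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    risk = np.asarray(risk, dtype=float)
    concordant = 0.0
    comparable = 0
    for i in range(len(time)):
        if event[i] != 1:
            continue  # a censored subject cannot be the earlier failure
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")

# Hypothetical toy data: the third subject is censored at day 6.
print(harrell_c_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.5, 0.7, 0.1]))
```

Unlike a time-dependent AUC, which is evaluated at one horizon such as 3, 5, or 7 days, the C-index summarizes discrimination over the whole follow-up in a single number.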

Effects of a Five Times Sit to Stand Test on the Daily Life Independence of Korean Elderly and Cut-Off Analysis

  • Nam, Seung-Min;Kim, Seong-Gil
    • Journal of the Korean Society of Physical Medicine (대한물리의학회지), Vol. 14, No. 4, pp. 29-35, 2019
  • PURPOSE: The aim of this study was to provide a standard value of the Five Times Sit to Stand Test (FTSST) for the daily life independence of the elderly in Korea and to examine the effects of this test on their daily lives. METHODS: This study was conducted on elderly people over 65 years of age living in Gyeongsangbuk-do, Korea. The FTSST was performed from a sitting position on a chair. The subjects were classified into independent and dependent living groups according to their lifestyle, and the influence of the FTSST was then examined through logistic regression analysis. To determine the usefulness and cut-off value of the FTSST, the analysis was performed using the ROC curve. RESULTS: The elderly were more likely to live in a group rather than independently as the FTSST time increased (p<.05) (OR=1.098). The area under the ROC curve was .707, and as the FTSST time increased, a subject was more likely to live in a group rather than independently (p<.05). The cut-off value was chosen from the coordinates of the ROC curve, based on sensitivity and specificity. The sensitivity and specificity were .626 and .753, respectively, at 15.62 seconds. CONCLUSION: The elderly in Korea are more likely to live a group-dependent lifestyle than to live independently; the likelihood of this outcome increases further for every additional second beyond 15.62 seconds. The loss of independence in daily life could be predicted from the status of a subject's lower leg strength using the FTSST.
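
One common way to pick a cut-off from ROC coordinates is to maximize Youden's J = sensitivity + specificity - 1. The abstract does not state which rule the authors used, so the sketch below is an assumption for illustration only; the function name and the toy timings are hypothetical.

```python
import numpy as np

def youden_cutoff(score, label):
    """Return the threshold that maximizes Youden's J.

    label: 1 = positive class (e.g. dependent living), 0 = negative.
    A subject is predicted positive when score >= threshold.
    """
    score = np.asarray(score, dtype=float)
    label = np.asarray(label, dtype=int)
    best = {"j": -np.inf, "cutoff": None, "sensitivity": None, "specificity": None}
    for c in np.unique(score):
        pred = score >= c
        sens = float(np.mean(pred[label == 1]))    # true positive rate
        spec = float(np.mean(~pred[label == 0]))   # true negative rate
        j = sens + spec - 1.0
        if j > best["j"]:
            best = {"j": j, "cutoff": float(c), "sensitivity": sens, "specificity": spec}
    return best

# Hypothetical FTSST times in seconds and living status (1 = dependent).
seconds = [8, 9, 10, 12, 14, 16, 18, 20]
dependent = [0, 0, 1, 0, 0, 1, 1, 1]
print(youden_cutoff(seconds, dependent))
```

For the values reported in the abstract, a sensitivity of .626 and a specificity of .753 correspond to J = .626 + .753 - 1 = .379 at 15.62 seconds.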

Analysis of stage III proximal colon cancer using the Cox proportional hazards model

  • 이태섭;이민정
    • Journal of the Korean Data and Information Science Society, Vol. 28, No. 2, pp. 349-359, 2017
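  • In this paper, we performed a survival analysis by fitting the Cox proportional hazards model to stage III proximal colon cancer data provided by the SEER program of the U.S. National Cancer Institute. We identified the covariates that significantly affect mortality in patients with stage III proximal colon cancer and estimated survival rates for patients with covariate values of interest. Using a test based on Schoenfeld residuals, Schoenfeld residual plots, and $\log[-\log\{\hat{S}(t)\}]$ plots, we confirmed that the covariates used in the analysis satisfy the proportional hazards assumption. To validate the fitted Cox proportional hazards model, we used 10-fold cross-validation to compute calibration plots and the area under the time-dependent ROC curve, which confirmed the validity of the fitted model.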
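
As a brief aside on why the log-minus-log survival plot is informative for checking proportional hazards: with a baseline hazard $h_0(t)$, baseline survival $S_0(t)$, and coefficient vector $\beta$ (standard textbook notation, not taken from the abstract), the model implies

$$
h(t \mid x) = h_0(t)\, e^{\beta^\top x}
\;\Longrightarrow\;
S(t \mid x) = S_0(t)^{\exp(\beta^\top x)}
\;\Longrightarrow\;
\log[-\log\{S(t \mid x)\}] = \beta^\top x + \log[-\log\{S_0(t)\}].
$$

The curves for different covariate levels therefore differ only by a constant vertical shift $\beta^\top x$, so roughly parallel curves in the plot are consistent with the proportional hazards assumption.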

Prognostic Value of 18F-FDG PET/CT Radiomics in Extranodal Nasal-Type NK/T Cell Lymphoma

  • Yu Luo;Zhun Huang;Zihan Gao;Bingbing Wang;Yanwei Zhang;Yan Bai;Qingxia Wu;Meiyun Wang
    • Korean Journal of Radiology, Vol. 25, No. 2, pp. 189-198, 2024
  • Objective: To investigate the prognostic utility of radiomics features extracted from 18F-fluorodeoxyglucose (FDG) PET/CT combined with clinical factors and metabolic parameters in predicting progression-free survival (PFS) and overall survival (OS) in individuals diagnosed with extranodal nasal-type NK/T cell lymphoma (ENKTCL). Materials and Methods: A total of 126 adults with ENKTCL who underwent 18F-FDG PET/CT examination before treatment were retrospectively included and randomly divided into training (n = 88) and validation cohorts (n = 38) at a ratio of 7:3. Least absolute shrinkage and selection operator Cox regression analysis was used to select the best radiomics features and calculate each patient's radiomics scores (RadPFS and RadOS). Kaplan-Meier curves and the log-rank test were used to compare survival between patient groups risk-stratified by the radiomics scores. Various models to predict PFS and OS were constructed, including clinical, metabolic, clinical + metabolic, and clinical + metabolic + radiomics models. The discriminative ability of each model was evaluated using Harrell's C-index. The performance of each model in predicting PFS and OS at 1, 3, and 5 years was evaluated using the time-dependent receiver operating characteristic (ROC) curve. Results: Kaplan-Meier curve analysis demonstrated that the radiomics scores effectively identified high- and low-risk patients (all P < 0.05). Multivariable Cox analysis showed that the Ann Arbor stage, maximum standardized uptake value (SUVmax), and RadPFS were independent risk factors associated with PFS. Further, β2-microglobulin, Eastern Cooperative Oncology Group performance status score, SUVmax, and RadOS were independent risk factors for OS. The clinical + metabolic + radiomics model exhibited the greatest discriminative ability for both PFS (Harrell's C-index: 0.805 in the validation cohort) and OS (Harrell's C-index: 0.833 in the validation cohort). The time-dependent ROC analysis indicated that the clinical + metabolic + radiomics model had the best predictive performance. Conclusion: The PET/CT-based clinical + metabolic + radiomics model can enhance prognostication among patients with ENKTCL and may be a non-invasive and efficient risk stratification tool for clinical practice.
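
The Kaplan-Meier curves mentioned above rest on a simple product-limit calculation, sketched below. This is a generic illustration with hypothetical names and toy data, not the analysis code of the study.

```python
import numpy as np

def kaplan_meier(time, event):
    """Product-limit estimate of the survival function.

    Returns the distinct observed event times and the estimated
    survival probability just after each of them.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    event_times = np.unique(time[event == 1])
    surv = []
    s = 1.0
    for u in event_times:
        at_risk = np.sum(time >= u)                 # still under observation at u
        deaths = np.sum((time == u) & (event == 1))
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)

# Hypothetical toy data: follow-up in years, 1 = event, 0 = censored.
t, s = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
print(t, s)
```

Censored subjects leave the risk set without causing a drop in the curve, which is why the curve only steps down at observed event times.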

A Study on NOx Emission Control Methods in the Cement Firing Process Using Data Mining Techniques

  • 박철홍;김용수
    • Journal of the Korean Society for Quality Management (품질경영학회지), Vol. 46, No. 3, pp. 739-752, 2018
  • Purpose: The purpose of this study was to investigate the relationship between kiln processing parameters and NOx emissions that occur in the sintering and calcination steps of the cement manufacturing process and to derive the main factors responsible for producing emissions outside emission limit criteria, as determined by category models and classification rules, using data mining techniques. The results from this study are expected to be useful as guidelines for NOx emission control standards. Methods: Data were collected from Precalciner Kiln No.3 used in one of the domestic cement plants in Korea. Thirty-four independent variables affecting NOx generation and dependent variables that exceeded or were below the NOx emission limit (>1 and <0, respectively) were examined during kiln processing. These data were used to construct a detection model of NOx emission, in which emissions exceeded or were below the set limits. The model was validated using SPSS MODELER 18.0, artificial neural network, decision tree (C5.0), and logistic regression analysis data mining techniques. Results: The decision tree (C5.0) algorithm best represented NOx emission behavior and was used to identify 10 processing variables that resulted in NOx emissions outside limit criteria. Conclusion: The results of this study indicate that the decision tree (C5.0) can be applied for real-time monitoring and management of NOx emissions during the cement firing process to satisfy NOx emission control standards and to provide for a more eco-friendly cement product.

Performances of Prognostic Models in Stratifying Patients with Advanced Gastric Cancer Receiving First-line Chemotherapy: a Validation Study in a Chinese Cohort

  • Xu, Hui;Zhang, Xiaopeng;Wu, Zhijun;Feng, Ying;Zhang, Cheng;Xie, Minmin;Yang, Yahui;Zhang, Yi;Feng, Chong;Ma, Tai
    • Journal of Gastric Cancer, Vol. 21, No. 3, pp. 268-278, 2021
  • Purpose: While several prognostic models for the stratification of death risk have been developed for patients with advanced gastric cancer receiving first-line chemotherapy, they have seldom been tested in the Chinese population. This study investigated the performance of these models and identified the optimal tools for Chinese patients. Materials and Methods: Patients diagnosed with metastatic or recurrent gastric adenocarcinoma who received first-line chemotherapy were eligible for inclusion in the validation cohort. Their clinical data and survival outcomes were retrieved and documented. Time-dependent receiver operating characteristic (ROC) and calibration curves were used to evaluate the predictive ability of the models. Kaplan-Meier curves were plotted for patients in different risk groups divided by 7 published stratification tools. Log-rank tests with pairwise comparisons were used to compare survival differences. Results: The analysis included a total of 346 patients with metastatic or recurrent disease. The median overall survival time was 11.9 months. The patients were divided into different risk groups according to the prognostic stratification models, which showed variability in distinguishing mortality risk in these patients. The model proposed by Kim et al. showed relatively higher predictive ability compared to the other models, with the highest χ2 value (25.8) in log-rank tests across subgroups, and area under the curve values at 6, 12, and 24 months of 0.65 (95% confidence interval [CI]: 0.59-0.72), 0.60 (0.54-0.65), and 0.63 (0.56-0.69), respectively. Conclusions: Among existing prognostic tools, the model constructed by Kim et al., which incorporated performance status score, neutrophil-to-lymphocyte ratio, alkaline phosphatase, albumin, and tumor differentiation, was more effective in stratifying Chinese patients with gastric cancer receiving first-line chemotherapy.
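
The χ2 values quoted above come from log-rank tests. As a rough illustration of how such a statistic is built, here is a minimal two-group sketch; the study compared several risk subgroups with pairwise comparisons, so this is a simplified assumption-laden version, with a hypothetical function name and toy data.

```python
import numpy as np

def logrank_chi2(time, event, group):
    """Two-group log-rank chi-square statistic (1 degree of freedom).

    group: 1 for the group of interest, 0 for the reference group.
    At each event time, observed events in group 1 are compared with
    the number expected if both groups shared the same hazard.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    group = np.asarray(group, dtype=int)
    observed_minus_expected = 0.0
    variance = 0.0
    for u in np.unique(time[event == 1]):
        at_risk = time >= u
        n = at_risk.sum()                              # all subjects at risk
        n1 = (at_risk & (group == 1)).sum()            # group 1 at risk
        d = ((time == u) & (event == 1)).sum()         # all events at u
        d1 = ((time == u) & (event == 1) & (group == 1)).sum()
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance <= 0:
        return float("nan")
    return float(observed_minus_expected ** 2 / variance)

# Hypothetical toy data: group 1 fails earlier than group 0.
print(logrank_chi2([1, 2, 3, 4], [1, 1, 1, 1], [1, 1, 0, 0]))
```

A larger χ2 indicates a clearer separation between the survival curves of the groups being compared.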

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market

  • 이모세;안현철
    • Journal of Intelligence and Information Systems (지능정보연구), Vol. 24, No. 1, pp. 167-181, 2018
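  • Over the past decade or so, deep learning has attracted a great deal of attention among machine learning algorithms. In particular, the convolutional neural network (CNN), known as an effective algorithm for recognizing and classifying images, has been widely applied to classification and prediction problems in many fields. This study applies a CNN to stock market prediction, one of the most difficult prediction problems in machine learning research. Specifically, the CNN is used as a binary classifier that takes graphs as input and predicts the direction of the stock market (up or down). This amounts to developing a machine learning algorithm that mimics so-called technical analysts, who look at graphs and predict whether the stock index will rise or fall. The study proceeds in four steps. First, the data set is divided into 5-day units. Second, a graph is created for each 5-day unit. Third, the graphs from the previous step are split into training and validation data sets, and the CNN classifier is trained. Fourth, the validation data set is used to compare performance with other classification models. To validate the proposed model, 2,026 observations of KOSPI200 data covering about eight years, from January 2009 to February 2017, were used as experimental data. The experimental data set consisted of 12 representative technical indicators used in the Korean stock market, such as CCI, momentum, and ROC. As a result, when the CNN algorithm was applied to the experimental data set, the proposed CNN model showed prediction accuracy at a statistically significant level compared with logistic regression, a single-layer neural network, and SVM.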
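
Note that ROC in this abstract is the rate-of-change technical indicator, not a receiver operating characteristic curve. The sketch below shows one common definition of that indicator together with the first step described above, splitting a series into 5-day units. The exact indicator parameters and whether the units overlap are not given in the abstract, so the 5-period lag, the non-overlapping windows, the function names, and the prices are all assumptions.

```python
import numpy as np

def rate_of_change(close, n=5):
    """Rate of change in percent: (P_t - P_{t-n}) / P_{t-n} * 100.

    The first n entries are NaN because no earlier price exists.
    """
    close = np.asarray(close, dtype=float)
    roc = np.full(close.shape, np.nan)
    roc[n:] = (close[n:] - close[:-n]) / close[:-n] * 100.0
    return roc

def five_day_windows(values, width=5):
    """Split a series into consecutive non-overlapping windows.

    Trailing observations that do not fill a whole window are dropped.
    """
    values = np.asarray(values, dtype=float)
    n_windows = len(values) // width
    return values[: n_windows * width].reshape(n_windows, width)

# Hypothetical closing prices for 12 trading days.
close = [100, 101, 102, 103, 104, 110, 108, 107, 109, 111, 112, 113]
print(rate_of_change(close))
print(five_day_windows(close).shape)
```

Each row of the windowed array would then be rendered as one graph image before being passed to the classifier.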