• Title/Abstract/Keywords: Survival variable

Search results: 188 items (processing time: 0.023 seconds)

Fitting Cure Rate Model to Breast Cancer Data of Cancer Research Center

  • Baghestani, Ahmad Reza;Zayeri, Farid;Akbari, Mohammad Esmaeil;Shojaee, Leyla;Khadembashi, Naghmeh;Shahmirzalou, Parviz
    • Asian Pacific Journal of Cancer Prevention / Vol. 16, No. 17 / pp.7923-7927 / 2015
  • Background: The Cox PH model is one of the most important statistical models for studying patient survival. However, for patients with long-term survival it may not be the most appropriate choice, and a cure rate model seems more suitable. The purpose of this study was to determine clinical factors associated with the cure rate of patients with breast cancer. Materials and Methods: To find factors affecting the cure rate (response), a non-mixture cure rate model with a negative binomial distribution for the latent variable was used. The variables selected were cancer recurrence, HER2 status, estrogen receptor (ER) and progesterone receptor (PR) status, tumor size, cancer grade, cancer stage, type of surgery, age at diagnosis and number of removed positive lymph nodes. All analyses were performed using PROC MCMC in SAS 9.2. Results: The mean (SD) age of patients was 48.9 (11.1) months. The 1-, 5- and 10-year survival rates of these patients were 95, 79 and 50 percent, respectively. All of the variables listed above had an effect on the cure fraction. The Kaplan-Meier curve indicated that a cure model was appropriate. Conclusions: Unlike the other variables, ER and PR positivity increase the probability of cure. In the present study, the Weibull distribution was used to analyse survival times. Fitting the model with other distributions, such as the log-normal and log-logistic, and with other distributions for the latent variable is recommended.
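As a rough illustration of the model class named in the abstract above, the following sketch evaluates the population survival function of a non-mixture cure rate model with a negative binomial latent count and Weibull event times. The parameterization (`theta` as the mean of the latent count, `alpha` as its dispersion, Weibull `shape` and `scale`) and all function names are assumptions chosen for illustration. They are not taken from the paper, which fitted its model with PROC MCMC in SAS.

```python
import math


def weibull_cdf(t, shape, scale):
    """Weibull CDF F(t) = 1 - exp(-(t / scale) ** shape); zero for t <= 0."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def population_survival(t, theta, alpha, shape, scale):
    """Population survival under an assumed negative binomial latent count.

    S_pop(t) = (1 + alpha * theta * F(t)) ** (-1 / alpha),
    where F is the Weibull CDF of the time to event for each latent cause.
    """
    if theta <= 0 or alpha <= 0:
        raise ValueError("theta and alpha must be positive in this sketch")
    return (1.0 + alpha * theta * weibull_cdf(t, shape, scale)) ** (-1.0 / alpha)


def cure_fraction(theta, alpha):
    """Limit of S_pop(t) as t grows: the proportion that never fails."""
    if theta <= 0 or alpha <= 0:
        raise ValueError("theta and alpha must be positive in this sketch")
    return (1.0 + alpha * theta) ** (-1.0 / alpha)


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    for t in (0, 12, 60, 120):
        print(t, round(population_survival(t, 1.0, 0.5, 1.5, 60.0), 4))
    print("cure fraction:", round(cure_fraction(1.0, 0.5), 4))
```

Under these assumptions the survival curve levels off at the cure fraction instead of falling to zero, which is the plateau a Kaplan-Meier plot would show when a cure model is appropriate.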

Methods to Minimize or Adjust for Healthy Worker Effect in Occupational Epidemiology (건강근로자효과의 최소화 방안과 보정 방법)

  • 이경무;전재범;박동욱;이원진
    • 한국환경보건학회지 (Journal of the Korean Society of Environmental Health) / Vol. 37, No. 5 / pp.342-347 / 2011
  • The healthy worker effect (HWE) refers to the consistent tendency for actively employed individuals to have a more favorable mortality experience than the population at large. Although the HWE has been well known since the 1970s, only a few studies in occupational epidemiology have attempted to fully define and evaluate it. The HWE can be separated into effects on initial hiring into the workforce (healthy worker hire effect) and effects on continuing employment (healthy worker survival effect). In this review, we summarize the methods available in occupational epidemiology for minimizing or adjusting for the healthy worker effect. It is noteworthy that the healthy worker survival effect is complicated, because employment status acts simultaneously as a confounding variable and an intermediate variable, whereas the healthy worker hire effect may be adjusted for by incorporating baseline health status into the statistical model. In addition, two retrospective cohort studies, one of workers in the semiconductor industry and one of Vietnam veterans in Korea, are introduced, and their results are explained in terms of the healthy worker effect.

Bayesian Variable Selection in the Proportional Hazard Model with Application to Microarray Data

  • Lee, Kyeong-Eun;Mallick, Bani K.
    • 한국통계학회:학술대회논문집 (Korean Statistical Society: Conference Proceedings) / Proceedings of the Korean Statistical Society 2005 Spring Conference / pp.17-23 / 2005
  • In this paper we consider the well-known semiparametric proportional hazards models for survival analysis. These models are usually used with few covariates and many observations (subjects). However, in a typical setting of gene expression data from DNA microarrays, we need to consider the case where the number of covariates p exceeds the number of samples n. For a given vector of response values, which are times to event (death or censoring times), and p gene expressions (covariates), we address the issue of how to reduce the dimension by selecting the significant genes. This approach enables us to estimate the survival curve when $n \ll p$. In our approach, rather than fixing the number of selected genes, we assign a prior distribution to this number. The approach creates additional flexibility by allowing the imposition of constraints, such as bounding the dimension via a prior, which in effect works as a penalty. To implement our methodology, we use a Markov chain Monte Carlo (MCMC) method. We demonstrate the methodology on diffuse large B-cell lymphoma (DLBCL) complementary DNA (cDNA) data and breast carcinoma data.

Breeding of Bivoltine Breeds of Bombyx mori L Suitable for Variable Climatic Conditions of the Tropics

  • Moorthy, S. M.;Das, S. K.;Kar, N. B.;Urs, S. Raje
    • International Journal of Industrial Entomology and Biomaterials / Vol. 14, No. 2 / pp.99-105 / 2007
  • The success of rearing the presently available conventional bivoltine breeds is unpredictable in some seasons in tropical regions because of highly fluctuating, adverse climatic conditions. Thus, in order to popularize bivoltine breeds in tropical parts of India, it is essential to have one or more bivoltine breeds that can give a stable cocoon crop under variable environments. With this objective, a breeding programme was undertaken to improve the survival trait in bivoltine silkworm by introducing multivoltine genes into bivoltine lines through backcrossing. The resultant bivoltine lines showed significantly higher survival compared with the receptor (bivoltine) parent and the control bivoltine breed. Esterase isozyme analysis revealed a similar banding pattern in the developed bivoltine and in the donor multivoltine, which suggests introgression of the multivoltine character into the evolved bivoltine.

H-likelihood approach for variable selection in gamma frailty models

  • Ha, Il-Do;Cho, Geon-Ho
    • Journal of the Korean Data and Information Science Society / Vol. 23, No. 1 / pp.199-207 / 2012
  • Recently, variable selection methods using a penalized likelihood with a shrinkage penalty function have been widely studied in various statistical models, including generalized linear models and survival models. In particular, they select important variables and estimate the coefficients of covariates simultaneously. In this paper, we develop a penalized h-likelihood method for variable selection in gamma frailty models. For this we use the smoothly clipped absolute deviation (SCAD) penalty function, which has desirable properties for variable selection. The proposed method is illustrated using a simulation study and a practical data set.
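For readers unfamiliar with the SCAD penalty mentioned in the abstract above, the sketch below evaluates it for a single coefficient, using the standard piecewise definition of Fan and Li. It shows only the penalty function, not the penalized h-likelihood procedure of the paper. The function name `scad_penalty` is hypothetical, and `a = 3.7` is the commonly suggested default rather than a value taken from the abstract.

```python
def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty for one coefficient `beta` with tuning parameter `lam`.

    Piecewise form (requires lam > 0 and a > 2):
      lam * |beta|                                        if |beta| <= lam
      (2*a*lam*|beta| - beta**2 - lam**2) / (2*(a - 1))   if lam < |beta| <= a*lam
      lam**2 * (a + 1) / 2                                if |beta| > a*lam
    """
    if lam <= 0 or a <= 2:
        raise ValueError("SCAD requires lam > 0 and a > 2")
    b = abs(beta)
    if b <= lam:
        # Small coefficients are penalized like the lasso.
        return lam * b
    if b <= a * lam:
        # Quadratic transition that gradually relaxes the penalty.
        return (2.0 * a * lam * b - b * b - lam * lam) / (2.0 * (a - 1.0))
    # Large coefficients receive a constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1.0) / 2.0


if __name__ == "__main__":
    for beta in (0.0, 0.5, 1.0, 2.0, 3.7, 10.0):
        print(beta, scad_penalty(beta, lam=1.0))
```

The constant penalty for large coefficients is what distinguishes SCAD from the lasso: strong effects are left nearly unbiased while small ones are still shrunk to zero.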

Ensemble variable selection using genetic algorithm

  • Seogyoung, Lee;Martin Seunghwan, Yang;Jongkyeong, Kang;Seung Jun, Shin
    • Communications for Statistical Applications and Methods / Vol. 29, No. 6 / pp.629-640 / 2022
  • Variable selection is one of the most crucial tasks in supervised learning, such as regression and classification. Best subset selection is straightforward and optimal but not practically applicable unless the number of predictors is small. In this article, we propose directly solving the best subset selection problem via the genetic algorithm (GA), a popular stochastic optimization algorithm based on the principle of Darwinian evolution. To further improve variable selection performance, we propose running multiple GAs to solve the best subset selection problem and then synthesizing the results, which we call ensemble GA (EGA). The EGA significantly improves variable selection performance. In addition, the proposed method is essentially best subset selection and is hence applicable to a variety of models with different selection criteria. We compare the proposed EGA with existing variable selection methods under various models, including linear regression, Poisson regression, and Cox regression for survival data. Both simulation and real data analysis demonstrate the promising performance of the proposed method.

ROC Curve for Multivariate Random Variables

  • Hong, Chong Sun
    • Communications for Statistical Applications and Methods / Vol. 20, No. 3 / pp.169-174 / 2013
  • The ROC curve is drawn from two conditional cumulative distribution functions (or survival functions) of a univariate random variable. In this work, we consider joint cumulative distribution functions of k random variables and suggest an ROC curve for multivariate random variables. Along the line that passes through the two mean vectors of the dichotomous states, a joint cumulative distribution function can be regarded as a function of a univariate variable. After this function is modified to satisfy the properties of a cumulative distribution function, an ROC curve can be derived; some illustrative examples are also presented.
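The univariate construction that the abstract above starts from can be sketched as follows: each threshold c gives one ROC point whose coordinates are the empirical survival functions of the two groups at c. This is only the familiar univariate case, not the multivariate extension proposed in the paper, and the function names and toy samples are assumptions made for illustration.

```python
def empirical_survival(sample, c):
    """Empirical survival function: fraction of observations strictly above c."""
    return sum(1 for x in sample if x > c) / len(sample)


def roc_points(negatives, positives):
    """ROC points (false positive rate, true positive rate) over all thresholds.

    At threshold c, FPR = S_neg(c) and TPR = S_pos(c), where S is the
    empirical survival function of the corresponding group.
    """
    thresholds = sorted(set(negatives) | set(positives))
    points = [(1.0, 1.0)]  # threshold below every observation
    for c in thresholds:
        points.append((empirical_survival(negatives, c),
                       empirical_survival(positives, c)))
    return points  # the largest threshold yields (0.0, 0.0)


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Toy scores, for illustration only.
    neg = [0.1, 0.4, 0.35, 0.8]
    pos = [0.9, 0.6, 0.7, 0.3]
    print(roc_points(neg, pos))
    print("AUC:", auc(roc_points(neg, pos)))
```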

Machine learning in survival analysis (생존분석에서의 기계학습)

  • 백재욱
    • 산업진흥연구 (Industry Promotion Research) / Vol. 7, No. 1 / pp.1-8 / 2022
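  • This paper examines machine learning methods that can be applied to survival data containing censored observations. First, exploratory data analysis was used to identify the distribution of each feature, the relationships among features, and their importance ranking. Next, the relationship between the features serving as independent variables and the feature serving as the dependent variable (death status) was treated as a classification problem, and machine learning methods such as logistic regression and K nearest neighbor were applied. Although the data set was small, random forest outperformed logistic regression, as is typical in machine learning results. However, methods recently reported to perform well, such as artificial neural networks and gradient boosting, did not perform markedly better, apparently because the given data are not big data. Finally, conventional survival analysis methods such as Kaplan-Meier and Cox's proportional hazards model were applied to examine which independent variables have a decisive influence on the dependent variable (t_i, δ_i), and random forest, a machine learning method, was also applied to survival data containing censored observations to evaluate its performance.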
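To make the censored response (ti, δi) in the abstract above concrete, here is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. The function name `kaplan_meier`, the coding of δ as 1 for an observed event and 0 for a censored observation, and the toy data are assumptions made for illustration; none of them come from the paper.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times  : observed times t_i (event or censoring time)
    events : indicators delta_i (1 = event observed, 0 = censored)
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        n_at_risk -= removed
    return curve


if __name__ == "__main__":
    # Toy data: five subjects, two of them censored.
    t = [1, 2, 3, 4, 5]
    d = [1, 0, 1, 1, 0]
    for time, s in kaplan_meier(t, d):
        print(time, round(s, 4))
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is why treating death status as a plain classification label discards information that this estimator retains.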

Multiple Gamma Knife Radiosurgery for Multiple Metachronous Brain Metastases Associated with Lung Cancer : Survival Time

  • Kim, Hyung-Seok;Koh, Eun-Jeong;Choi, Ha-Young
    • Journal of Korean Neurosurgical Society / Vol. 52, No. 4 / pp.334-338 / 2012
  • Objective : We compared the survival time of patients treated with multiple gamma knife radiosurgery (GKRS) and patients treated with a single GKRS plus whole brain radiation therapy (WBRT), among patients with multiple metachronous brain metastases from lung cancer. Methods : From May 2006 to July 2010, we analyzed 31 patients out of 112 patients who showed multiple metachronous brain metastases. Of these 31 patients, 20 underwent multiple GKRS (group A) and 11 underwent a single GKRS plus WBRT (group B). We compared the survival time between groups A and B. The Kaplan-Meier method and the Cox proportional hazards model were used to analyze the relationship between survival and 1) the number of lesions in each patient, 2) the average volume of lesions in each patient, 3) the number of repeated GKRS, and 4) the interval of development of new lesions. Results : Median survival time was 18 months (range 6-50 months) in group A and 6 months (range 3-18 months) in group B. Only the average volume of an individual lesion (over 10 cc) was negatively related to survival time according to the Kaplan-Meier method. The Cox proportional hazard ratio of each variable was 1.1559 for the number of lesions, 1.0005 for the average volume of lesions, 0.0894 for the number of repeated GKRS, and 0.5970 for the interval of development of new lesions. Conclusion : This study showed extended survival time in group A compared with group B. Our result supports the view that multiple GKRS is of value in extending survival time in patients with multiple metachronous brain metastases, and that the number of lesions and the frequency of development of new lesions are not an obstacle to treating patients with GKRS.

Using SEER Data to Quantify Effects of Low Income Neighborhoods on Cause Specific Survival of Skin Melanoma

  • Cheung, Min Rex
    • Asian Pacific Journal of Cancer Prevention / Vol. 14, No. 5 / pp.3219-3221 / 2013
  • Background: This study used receiver operating characteristic (ROC) curves to screen Surveillance, Epidemiology and End Results (SEER) skin melanoma data to identify and quantify the effects of socioeconomic factors on cause-specific survival. Methods: 'SEER cause-specific death classification' was used as the outcome variable. The area under the ROC curve was used to select the best pretreatment predictors for further multivariate analysis with socioeconomic factors. Race and other socioeconomic factors, including rural-urban residence, county-level % college graduates and county-level family income, were used as predictors. Univariate and multivariate analyses were performed to identify and quantify the independent socioeconomic predictors. Results: This study included 49,999 patients. The mean follow-up time (SD) was 59.4 (17.1) months. SEER staging (ROC area of 0.08) was the most predictive factor. Race, lower county family income, rural residence, and lower county educational attainment were significant in univariate analysis, but rural residence was not significant in multivariate analysis. Living in poor neighborhoods was associated with a 2-4% disadvantage in actuarial cause-specific survival. Conclusions: Racial and socioeconomic factors have a significant impact on the survival of melanoma patients. This generates the hypothesis that ensuring access to cancer care may eliminate these outcome disparities.