Proceedings of the Korean Society for Bioinformatics Conference (한국생물정보학회:학술대회논문집)
- 2005.09a
- Pages 357-360
- 2005
Bayesian Variable Selection in the Proportional Hazard Model with Application to DNA Microarray Data
- Lee, Kyeon-Eun (Department of Statistics, Kyungpook National University) ;
- Mallick, Bani K. (Department of Statistics, Texas A&M University)
- Published : 2005.09.22
Abstract
In this paper we consider the well-known semiparametric proportional hazards (PH) model for survival analysis. Such models are usually applied with few covariates and many observations (subjects). In a typical gene expression setting with DNA microarray data, however, the number of covariates p exceeds the number of samples n. Given a vector of response values, which are times to event (death or censoring times), and p gene expressions (covariates), we address how to reduce the dimension by selecting the significant genes. This approach enables us to estimate the survival curve when n << p. Rather than fixing the number of selected genes, we assign a prior distribution to this number. The approach creates additional flexibility by allowing the imposition of constraints, such as bounding the dimension via a prior, which in effect works as a penalty. To implement our methodology, we use a Markov chain Monte Carlo (MCMC) method. We demonstrate the methodology on diffuse large B-cell lymphoma (DLBCL) complementary DNA (cDNA) data.
Keywords