A Penalized Spline Based Method for Detecting the DNA Copy Number Alteration in an Array-CGH Experiment

Kim, Byung-Soo; Kim, Sang-Cheol

doi:10.5351/KJAS.2009.22.1.115

The Korean Journal of Applied Statistics (응용통계연구)

Volume 22, Issue 1 / Pages 115-127 / 2009 / 1225-066X (pISSN) / 2383-5818 (eISSN)

The Korean Statistical Society

Kim, Byung-Soo (Dept. of Applied Statistics, Yonsei University);
Kim, Sang-Cheol (Dept. of Applied Statistics, Yonsei University)

Published: 2009.02.28

Abstract

The statistical analysis of array-CGH data aims to divide the whole genome into regions of equal copy number, to quantify the copy number in each region, and finally to evaluate whether that copy number differs significantly from two. Several statistical procedures have been proposed for this purpose, including circular binary segmentation and GLAD, a Gaussian-based local regression method that detects breakpoints by estimating a piecewise constant function. In this note we propose a penalized spline regression together with its simultaneous confidence band (SCB) to evaluate the statistical significance of regions of genetic gain or loss. A region over which the SCB stays entirely above 0 can be considered a region of genetic gain, and a region over which it stays entirely below 0 a region of genetic loss. We compare the performance of the SCB procedure with that of the GLAD and hidden Markov model approaches in a simulation study in which the data were generated from an independence model and from AR(1) and AR(2) models, the latter two reflecting the spatial dependence of array-CGH data. We found that the SCB method is more sensitive in detecting low-level copy number alterations.
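The SCB idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a truncated-line spline basis, a fixed smoothing parameter `lam` (in practice this would be estimated, for example by REML or cross-validation), a simulation-based critical value, and a response on a scale where 0 corresponds to two copies. The function names `ar1_noise` and `fit_pspline_scb`, the simulated signal, and all parameter values are hypothetical choices made for illustration. The band itself is computed as if the errors were independent, even though the toy data use AR(1) noise.

```python
import numpy as np

def ar1_noise(n, phi, sd, rng):
    """AR(1) errors e_t = phi * e_{t-1} + w_t to mimic spatial dependence."""
    w = rng.normal(0.0, sd, n)
    out = np.empty(n)
    out[0] = w[0]
    for t in range(1, n):
        out[t] = phi * out[t - 1] + w[t]
    return out

def fit_pspline_scb(x, y, n_knots=20, lam=1.0, alpha=0.05, n_sim=2000, seed=0):
    """Penalized spline fit plus a simulation-based simultaneous band."""
    rng = np.random.default_rng(seed)
    # Knots at interior quantiles of x (an assumed default).
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])
    # Design: intercept, slope, and truncated lines (x - knot)_+.
    C = np.column_stack([np.ones_like(x), x,
                         np.maximum(x[:, None] - knots, 0.0)])
    # Ridge penalty on the spline coefficients only.
    D = np.diag(np.r_[0.0, 0.0, np.ones(n_knots)])
    M = C.T @ C + lam * D
    R = np.linalg.cholesky(M)                      # M = R R^T
    fit = C @ np.linalg.solve(M, C.T @ y)
    # Approximate residual variance using the smoother's effective df.
    df = np.trace(np.linalg.solve(M, C.T @ C))
    sigma2 = np.sum((y - fit) ** 2) / (len(y) - df)
    # Pointwise standard errors: sigma^2 * diag(C M^{-1} C^T).
    B = np.linalg.solve(R, C.T)                    # R^{-1} C^T
    se = np.sqrt(sigma2 * np.sum(B ** 2, axis=0))
    # Draw coefficient errors u ~ N(0, sigma^2 M^{-1}) as sigma * R^{-T} z.
    z = rng.standard_normal((C.shape[1], n_sim))
    u = np.sqrt(sigma2) * np.linalg.solve(R.T, z)
    # Critical value: (1 - alpha) quantile of the maximum standardized deviation.
    max_dev = np.max(np.abs(C @ u) / se[:, None], axis=0)
    crit = np.quantile(max_dev, 1 - alpha)
    return fit, fit - crit * se, fit + crit * se

# Toy profile: a gain of 0.5 in the middle of the region, AR(1) noise.
rng = np.random.default_rng(1)
n = 300
x = np.linspace(0.0, 1.0, n)
truth = np.where((x > 0.4) & (x < 0.6), 0.5, 0.0)
y = truth + ar1_noise(n, phi=0.3, sd=0.15, rng=rng)

fit, lo, hi = fit_pspline_scb(x, y)
gain = lo > 0   # band entirely above 0: candidate region of gain
loss = hi < 0   # band entirely below 0: candidate region of loss
print("positions flagged as gain:", int(gain.sum()))
print("positions flagged as loss:", int(loss.sum()))
```

Because the band is simultaneous, every flagged position is covered by a single overall error rate `alpha` rather than a per-position rate, which is what allows contiguous runs of flagged positions to be read as regions.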

