A Report on the Inter-Gene Correlations in cDNA Microarray Data Sets

Kim, Byung-Soo;Jang, Jee-Sun;Kim, Sang-Cheol;Lim, Jo-Han;

doi:10.5351/KJAS.2009.22.3.617

The Korean Journal of Applied Statistics

Vol. 22, No. 3 / pp. 617-626 / 2009 / 1225-066X (pISSN) / 2383-5818 (eISSN)

The Korean Statistical Society

Kim, Byung-Soo (Department of Applied Statistics, Yonsei University) ;
Jang, Jee-Sun (Korea Economic Research Institute) ;
Kim, Sang-Cheol (Department of Applied Statistics, Yonsei University) ;
Lim, Jo-Han (Department of Statistics, Seoul National University)

Published: 2009.06.30

https://doi.org/10.5351/KJAS.2009.22.3.617

Abstract

A series of recent papers has reported that inter-gene correlations in Affymetrix microarray data sets are strong and long-ranged, and that the convenient assumption of independence or weak dependence among gene expression signals, often adopted without justification, conflicts with actual data. Qiu et al. (2005b) showed that the nonparametric empirical Bayes method, which pools test statistics across genes for statistical inference, yields a large variance in the number of genes declared differentially expressed, and they attributed this instability to strong and long-ranged inter-gene correlations. Klebanov and Yakovlev (2007) argued that inter-gene correlations are a rich source of information rather than a nuisance in statistical analysis and, by transforming the original gene expression sequence, constructed a sequence of approximately independent random variables that they called a $\delta$-sequence. Using two cDNA microarray data sets produced in Korea, this report confirms that inter-gene correlations are also relatively strong and long-ranged in cDNA microarray data, and that a $\delta$-sequence supporting near independence can be derived from cDNA microarray data as well. The report emphasizes that inter-gene correlations should be taken into account in future analyses of cDNA microarray data sets.
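The contrast the abstract draws between strongly correlated gene expression levels and a nearly independent transformed sequence can be illustrated with a small simulation. The sketch below is not the authors' data or procedure: it assumes a toy model in which every gene on an array shares one additive array effect on the log scale, and it assumes the transformation takes differences of log-expression levels within non-overlapping gene pairs, a construction the abstract does not spell out. The function names `mean_abs_offdiag_corr` and `delta_sequence` are hypothetical.

```python
import numpy as np


def mean_abs_offdiag_corr(x):
    """Mean absolute Pearson correlation over all distinct pairs of rows of x."""
    c = np.corrcoef(x)  # rows are variables (genes), columns are arrays
    upper = np.triu_indices_from(c, k=1)
    return float(np.mean(np.abs(c[upper])))


def delta_sequence(log_expr):
    """Differences of log-expression within non-overlapping gene pairs.

    Assumption: genes are paired in row order, (0, 1), (2, 3), ...;
    a trailing unpaired gene is dropped.
    """
    n_pairs = log_expr.shape[0] // 2
    return log_expr[0:2 * n_pairs:2] - log_expr[1:2 * n_pairs:2]


# Toy data (assumption): one shared effect per array plus independent gene noise.
rng = np.random.default_rng(0)
n_genes, n_arrays = 200, 60
array_effect = rng.normal(0.0, 1.0, size=n_arrays)
gene_noise = rng.normal(0.0, 0.5, size=(n_genes, n_arrays))
log_expr = array_effect + gene_noise  # array effect is broadcast to every gene row

r_genes = mean_abs_offdiag_corr(log_expr)
r_delta = mean_abs_offdiag_corr(delta_sequence(log_expr))
print(f"mean |r| between genes:           {r_genes:.3f}")
print(f"mean |r| between pair differences: {r_delta:.3f}")
```

Under this toy model the shared array effect cancels inside each pair difference, so the pair differences are far less correlated with one another than the raw gene profiles are. Real microarray data need not follow such a simple additive structure.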

References

Efron, B. (2003). Robbins, empirical Bayes and microarrays, The Annals of Statistics, 31, 366-378. https://doi.org/10.1214/aos/1051027871
Efron, B. (2004). Large-scale simultaneous hypothesis testing: The choice of a null hypothesis, Journal of the American Statistical Association, 99, 96-104. https://doi.org/10.1198/016214504000000089
Efron, B. (2007). Correlation and large-scale simultaneous significance testing, Journal of the American Statistical Association, 102, 93-103. https://doi.org/10.1198/016214506000001211
Efron, B., Tibshirani, R., Storey, J. D. and Tusher, V. (2001). Empirical Bayes analysis of a microarray experiment, Journal of the American Statistical Association, 96, 1151-1160. https://doi.org/10.1198/016214501753382129
Frantz, S. (2005). An array of problems, Nature Reviews Drug Discovery, 4, 302-303. https://doi.org/10.1038/nrd1746
Kim, B. S., Kim, I., Lee, S., Kim, S., Rha, S. Y. and Chung, H. C. (2005). Statistical methods of translating microarray data into clinically relevant diagnostic information in colorectal cancer, Bioinformatics, 21, 517-528. https://doi.org/10.1093/bioinformatics/bti029
Klebanov, L., Jordan, C. and Yakovlev, A. (2006). A new type of stochastic dependence revealed in gene expression data, Statistical Applications in Genetics and Molecular Biology, 5, Article 7.
Klebanov, L. and Yakovlev, A. (2006). Treating expression levels of different genes as a sample in microarray data analysis: Is it worth a risk?, Statistical Applications in Genetics and Molecular Biology, 5, Article 9.
Klebanov, L. and Yakovlev, A. (2007). Diverse correlation structures in gene expression data and their utility in improving statistical inference, The Annals of Applied Statistics, 1, 538-559. https://doi.org/10.1214/07-AOAS120
Marshall, E. (2004). Getting the noise out of gene arrays, Science, 306, 630-631. https://doi.org/10.1126/science.306.5696.630
Qiu, X., Brooks, A. I., Klebanov, L. and Yakovlev, A. (2005a). The effects of normalization on the correlation structure of microarray data, BMC Bioinformatics, 6, 120. https://doi.org/10.1186/1471-2105-6-120
Qiu, X., Klebanov, L. and Yakovlev, A. (2005b). Correlation between gene expression levels and limitations of the empirical Bayes methodology for finding differentially expressed genes, Statistical Applications in Genetics and Molecular Biology, 4, Article 34.
Qiu, X., Xiao, Y., Gordon, A. and Yakovlev, A. (2006). Assessing stability of gene selection in microarray data analysis, BMC Bioinformatics, 7, 50. https://doi.org/10.1186/1471-2105-7-50
Qiu, X. and Yakovlev, A. (2006). Some comments on instability of false discovery rate estimation, Journal of Bioinformatics and Computational Biology, 4, 1057-1068. https://doi.org/10.1142/S0219720006002338
Stolovitzky, G. (2003). Gene selection in microarray data: The elephant, the blind men and our algorithm, Current Opinion in Structural Biology, 13, 370-376. https://doi.org/10.1016/S0959-440X(03)00078-2
Yang, S., Jeung, H. C., Jeong, H. J., Choi, Y. H., Kim, J. E., Jung, J. J., Rha, S. Y., Yang, W. I. and Chung, H. C. (2007a). Identification of genes with correlated patterns of variations in DNA copy number and gene expression level in gastric cancer, Genomics, 89, 451-459. https://doi.org/10.1016/j.ygeno.2006.12.001
Yang, S., Shin, J., Park, K. H., Jeung, H-C., Rha, S. Y., Noh, S. H., Yang, W. I. and Chung, H. C. (2007b). Molecular basis of the difference between normal and tumor tissues of gastric cancer, Biochimica et Biophysica Acta, 1772, 1033-1040. https://doi.org/10.1016/j.bbadis.2007.05.005

Cited by

Identifying statistically significant gene sets based on differential expression and differential coexpression, vol. 29, no. 3, 2016. https://doi.org/10.5351/KJAS.2016.29.3.437
