Identification of Novel Universal Housekeeping Genes by Statistical Analysis of Microarray Data

  • Published : 2007.03.31


Housekeeping genes are widely used as internal controls in a variety of study types, including real-time RT-PCR, microarrays, Northern analysis and RNase protection assays. However, even commonly used housekeeping genes may vary in stability depending on the cell type or disease being studied. It is therefore necessary to identify additional housekeeping-type genes that show sample-independent stability. Here, we used statistical analysis to examine a large human microarray database, seeking genes that were stably expressed across various tissues, disease states and cell lines. We further selected genes that were expressed at different levels, because reference and target genes should be present in similar copy numbers to achieve reliable quantitative results. Real-time RT-PCR amplification of three newly identified reference genes (CGI-119, CTBP1 and GOLGA1) alongside three well-known housekeeping genes (B2M, GAPD and TUBB) confirmed that the newly identified genes were more stably expressed across individual samples, within similar expression ranges. Collectively, these results suggest that statistical analysis of microarray data can be used to identify new candidate housekeeping genes showing consistent expression across tissues and diseases. The three novel candidates identified here (CGI-119, GOLGA1 and CTBP1) could prove useful for normalization across a variety of RNA-based techniques.
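The abstract does not name the statistic used to judge stability, so the following is only a minimal sketch of one common way to screen for stably expressed genes: ranking by the coefficient of variation (CV) across samples while also reporting the mean, so that candidates can be picked at different expression levels. The choice of CV, the function names, and the toy values (`GENE_A`, `GENE_B`, `GENE_C`) are all assumptions made for illustration and are not taken from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (lower = more stable)."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean expression is zero; CV is undefined")
    return stdev(values) / m


def rank_by_stability(expression):
    """Return (gene, mean, CV) tuples sorted from most to least stable."""
    rows = [
        (gene, mean(values), coefficient_of_variation(values))
        for gene, values in expression.items()
    ]
    return sorted(rows, key=lambda row: row[2])


# Hypothetical expression values across four samples (not real data).
toy_expression = {
    "GENE_A": [100.0, 102.0, 98.0, 101.0],      # stable, mid-level
    "GENE_B": [50.0, 120.0, 20.0, 90.0],        # highly variable
    "GENE_C": [1000.0, 1010.0, 990.0, 1005.0],  # stable, high-level
}

ranking = rank_by_stability(toy_expression)
for gene, avg, cv in ranking:
    print(f"{gene}\tmean={avg:.1f}\tCV={cv:.3f}")
```

Reporting the mean next to the CV reflects the abstract's point that a reference gene should be present at a copy number similar to that of the target gene, so a low CV alone is not enough to choose a control.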



