Ovarian Cancer Prognostic Prediction Model Using RNA Sequencing Data

  • Jeong, Seokho (Department of Statistics, Seoul National University)
  • Mok, Lydia (Interdisciplinary Program in Bioinformatics, Seoul National University)
  • Kim, Se Ik (Department of Obstetrics and Gynecology, Seoul National University College of Medicine)
  • Ahn, TaeJin (Department of Life Science, Handong Global University)
  • Song, Yong-Sang (Department of Obstetrics and Gynecology, Seoul National University College of Medicine)
  • Park, Taesung (Department of Statistics, Seoul National University)
  • Received: 2018.12.10
  • Accepted: 2018.12.16
  • Published: 2018.12.31


Ovarian cancer is one of the leading causes of cancer-related death among gynecological malignancies. Over 70% of ovarian cancer cases are high-grade serous ovarian cancers, which have high death rates because of their resistance to chemotherapy. Despite advances in surgical and pharmaceutical therapies, overall survival rates remain poor, and the highly heterogeneous nature of ovarian cancer makes accurate prediction of prognosis difficult. To support treatment decisions that improve patient prognosis, we present a prognostic prediction model that integrates high-dimensional RNA sequencing data with the patients' clinical data through the following steps: gene filtration, pre-screening, gene marker selection, integrated study of the selected gene markers, and prediction model building. These steps can also be applied to other types of cancer besides ovarian cancer.
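The first two steps named above (gene filtration and pre-screening) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the filtering rule or the screening statistic, so the count thresholds, the use of Harrell's concordance index as a univariate screening score, the function names, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def filter_genes(counts, min_count=10, min_frac=0.2):
    """Gene filtration (hypothetical rule): keep genes with at least
    `min_count` reads in at least `min_frac` of the samples."""
    expressed_frac = (counts >= min_count).mean(axis=0)
    return np.where(expressed_frac >= min_frac)[0]


def concordance_index(time, event, score):
    """Harrell's C-index, treating a higher score as higher risk.
    A pair (i, j) is comparable when sample i had an observed event
    and sample j was followed for longer than sample i."""
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue
        for j in range(n):
            if time[j] > time[i]:
                comparable += 1
                if score[i] > score[j]:
                    concordant += 1.0
                elif score[i] == score[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else 0.5


def prescreen(expr, time, event, gene_idx, top_k=5):
    """Pre-screening (hypothetical rule): rank the filtered genes by
    |C - 0.5| and keep the `top_k` most strongly associated ones."""
    strength = np.array(
        [abs(concordance_index(time, event, expr[:, g]) - 0.5) for g in gene_idx]
    )
    order = np.argsort(-strength)[:top_k]
    return gene_idx[order]


# Synthetic data, for illustration only (not from the study).
rng = np.random.default_rng(0)
n_samples, n_genes = 60, 30
counts = rng.poisson(50, size=(n_samples, n_genes))
counts[:, :5] = 0                                        # unexpressed genes
counts[:, 10] = rng.permutation(np.arange(20, 20 + n_samples))
time = 200.0 - counts[:, 10]                             # gene 10 drives survival
event = rng.random(n_samples) < 0.7                      # about 30% censored

expr = np.log2(counts + 1.0)                             # simple log transform
kept = filter_genes(counts)
selected = prescreen(expr, time, event, kept, top_k=5)
print("genes kept after filtration:", len(kept))
print("pre-screened candidate genes:", selected)
```

In a full pipeline, the pre-screened genes would then feed a penalized survival model for marker selection and be combined with clinical covariates; those later steps are outside the scope of this sketch.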


Supported by: Korea Health Industry Development Institute (KHIDI)

