• Title/Abstract/Keywords: ranking genes

Applying a modified AUC to gene ranking

  • Yu, Wenbao;Chang, Yuan-Chin Ivan;Park, Eunsik
    • Communications for Statistical Applications and Methods / Vol. 25, No. 3 / pp. 307-319 / 2018
  • High-throughput technologies enable the simultaneous evaluation of thousands of genes that could discriminate different subclasses of complex diseases. Ranking genes according to differential expression is an important screening step for follow-up analysis, and many statistical measures have been proposed for this purpose. A good ranked list should be stable (at least for the top-ranked genes), and the top-ranked genes should have high power to differentiate disease status. However, the literature places little emphasis on ranking genes by these two criteria simultaneously. To satisfy both criteria, we propose applying a previously reported metric, the modified area under the receiver operating characteristic curve, to gene ranking. The proposed ranking method is found to be promising in producing a stable ranked list and good prediction performance for the top-ranked genes. The findings are illustrated through studies on both synthesized data and real microarray gene expression data. The proposed method is recommended for ranking genes or other biomarkers in high-dimensional omics studies.
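
As a point of reference for the abstract above, the following is a minimal sketch of gene ranking by the ordinary (unmodified) AUC, computed with the Mann-Whitney estimate. It is not the modified AUC the paper proposes, whose definition is not given in the abstract; the function names and the toy expression values are hypothetical.

```python
def auc(cases, controls):
    """Mann-Whitney estimate of the AUC: P(case > control) + 0.5 * P(tie)."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def rank_genes_by_auc(expression, labels):
    """Rank genes by how far their AUC lies from 0.5, in either direction.

    expression: dict mapping gene name -> list of expression values
    labels: list of 0/1 class labels aligned with the expression values
    """
    scores = {}
    for gene, values in expression.items():
        cases = [v for v, y in zip(values, labels) if y == 1]
        controls = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = abs(auc(cases, controls) - 0.5)
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical): three genes measured on six samples.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "geneA": [1.0, 1.2, 0.9, 2.1, 2.4, 2.2],  # separates the classes perfectly
    "geneB": [1.0, 2.0, 3.0, 1.5, 2.5, 3.5],  # weak separation
    "geneC": [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],  # carries no information
}
ranking = rank_genes_by_auc(expression, labels)
print(ranking)
```

Scoring by the distance from 0.5 treats up-regulated and down-regulated genes symmetrically, which is a design choice of this sketch rather than something stated in the abstract.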

Statistical Method of Ranking Candidate Genes for the Biomarker

  • Kim, Byung-Soo;Kim, In-Young;Lee, Sun-Ho;Rha, Sun-Young
    • Communications for Statistical Applications and Methods / Vol. 14, No. 1 / pp. 169-182 / 2007
  • The receiver operating characteristic (ROC) approach can be employed to rank candidate genes from a microarray experiment, in particular for biomarker development aimed at population screening for a cancer. In a cancer microarray experiment based on n patients, the researcher often wants to compare the tumor tissue with the normal tissue within the same individual using a common reference RNA. Ideally, this experiment produces n pairs of microarray data. However, there are often missing values in either the normal or the tumor tissue data. In practice, we have $n_1$ pairs of complete observations, $n_2$ "normal only" and $n_3$ "tumor only" data for the microarray. We refer to this data set as a mixed data set. We develop a ROC approach on the mixed data set to rank candidate genes for biomarker development for colorectal cancer screening. It turns out that the correlation between the two rankings by ROC and t statistics, based on the top 50 genes of the ROC ranking, is less than 0.6. This result indicates that choosing the right approach to ranking candidate genes for biomarker development is important for the allocation of resources.
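
The comparison of two rankings mentioned in the abstract above can be illustrated with a rank correlation. This is a minimal sketch assuming Spearman's coefficient on tie-free ranks; the abstract does not say which correlation was used, and the rank vectors below are hypothetical.

```python
def spearman_from_ranks(rank_a, rank_b):
    """Spearman correlation for two tie-free rank vectors of equal length."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Hypothetical ranks of the same five genes under two statistics.
roc_rank = [1, 2, 3, 4, 5]
t_rank = [2, 1, 5, 3, 4]
rho = spearman_from_ranks(roc_rank, t_rank)
print(rho)
```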

Identifying Statistically Significant Gene-Sets by Gene Set Enrichment Analysis Using Fisher Criterion

  • 김재영;신미영
    • 전자공학회논문지CI / Vol. 45, No. 4 / pp. 19-26 / 2008
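  • Gene set enrichment analysis (GSEA) is a method for analyzing two-class microarray data: from the many gene sets constructed on the basis of biological characteristics, it extracts the significant gene sets whose expression values differ to a statistically meaningful degree between the two classes. Specifically, using gene annotation databases that hold diverse biological information about genes (Cytogenetic Band, KEGG pathway, Gene Ontology, etc.), it groups the genes on the microarray that share a particular function into gene sets, determines the significant genes within each gene set by reference to the difference in expression values between the two classes, and on that basis detects the statistically significant gene sets. In this paper, instead of the signal-to-noise ratio-based gene ranking method currently in common use in GSEA, we propose applying a gene ranking method based on the Fisher criterion, so as to extract new, biologically meaningful significant gene sets that the existing GSEA method fails to extract. To examine the performance of the proposed method, we applied it to publicly available leukemia microarray data and sought to verify its usefulness by comparing the results with previously known results.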
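
To make the contrast between the two ranking metrics concrete, here is a minimal sketch using the commonly cited forms, signal-to-noise ratio $(\mu_A - \mu_B)/(\sigma_A + \sigma_B)$ and Fisher criterion $(\mu_A - \mu_B)^2/(\sigma_A^2 + \sigma_B^2)$. The exact variants used in the paper are an assumption, and the gene names and values are hypothetical.

```python
from statistics import mean, stdev


def signal_to_noise(a, b):
    """Difference of class means divided by the sum of class standard deviations."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


def fisher_criterion(a, b):
    """Squared difference of class means divided by the sum of class variances."""
    return (mean(a) - mean(b)) ** 2 / (stdev(a) ** 2 + stdev(b) ** 2)


# Hypothetical expression values for two genes in classes A and B.
genes = {
    "geneX": ([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]),    # balanced spread in both classes
    "geneY": ([4.2, 6.2, 8.2], [1.95, 2.0, 2.05]),  # very unbalanced spread
}

by_snr = sorted(genes, key=lambda g: abs(signal_to_noise(*genes[g])), reverse=True)
by_fisher = sorted(genes, key=lambda g: fisher_criterion(*genes[g]), reverse=True)
print(by_snr, by_fisher)
```

With these toy values the two metrics order the genes differently, because the Fisher criterion penalizes a large variance in one class more heavily than the sum of standard deviations does.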

Deciphering Key Genes of Proliferative and Secretory Phase Using Integrated Transcriptomics and Network Analysis

  • Payal Gupta;Shriya Dube;Payal Priyadarshini;Shanvi Singh;Anasuya Pravallika R;Vijay Lakshmi Srivastava;Abhishek Sengupta;Priyanka Narad
    • 한국미생물·생명공학회지 / Vol. 51, No. 3 / pp. 317-324 / 2023
  • Endometrium receptivity is a complex mechanism of intricate pathways that lead to the shift from the proliferative to the secretory phase. Our goal was to identify high-ranking differentially expressed genes (DEGs) and study the pathways associated with this shift. Raw data were retrieved from six GEO datasets, and 705 DEGs were identified through robust ranking aggregation after the integration of five datasets. Twenty key genes were identified and re-validated in an additional dataset. Supporting evidence from experimental references confirms them as major biomarkers of the shift from the proliferative to the secretory phase.

Evaluation of reference genes for RT-qPCR study in abalone Haliotis discus hannai during heavy metal overload stress

  • Lee, Sang Yoon;Nam, Yoon Kwon
    • Fisheries and Aquatic Sciences / Vol. 19, No. 4 / pp. 21.1-21.11 / 2016
  • Background: Evaluating suitable reference genes as normalization controls is a prerequisite for any quantitative reverse transcription-PCR (RT-qPCR)-based expression study. To select stable reference genes in abalone Haliotis discus hannai tissues (gill and hepatopancreas) under heavy metal exposure (Cu, Zn, and Cd), 12 candidate housekeeping genes were assessed for expression stability using a comprehensive ranking that integrates four statistical algorithms (geNorm, NormFinder, BestKeeper, and the ΔCT method). Results: Expression stability in the gill subset was determined as RPL7 > RPL8 > ACTB > RPL3 > PPIB > RPL7A > EF1A > RPL4 > GAPDH > RPL5 > UBE2 > B-TU. On the other hand, the ranking in the hepatopancreas subset was RPL7 > RPL3 > RPL8 > ACTB > RPL4 > EF1A > RPL5 > RPL7A > B-TU > UBE2 > PPIB > GAPDH. The pairwise variation assessed by the geNorm program indicates that two reference genes could be sufficient for accurate normalization in both the gill and hepatopancreas subsets. Overall, in both subsets ribosomal protein genes (particularly RPL7) were recommended as stable references, whereas traditional housekeepers such as β-tubulin (B-TU) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) were ranked as unstable. The reference gene selection was validated with a quantitative assay of MT transcripts. Conclusions: The present analysis showed the importance of validating reference genes with multiple algorithmic approaches to select genes that are truly stable. Our results indicate that the expression stability of a given reference gene is not always consistent across tissue types. The data from this study could be a good guide for the future design of RT-qPCR studies of metal regulation/detoxification and other related physiologies in this abalone species.
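
Of the four algorithms named in the abstract above, the comparative ΔCT method is the simplest to sketch: for every pair of candidate genes, take the standard deviation of their Ct difference across samples, then average those standard deviations per gene (lower means more stable). The code below is an illustrative sketch under that description, not the study's pipeline; the Ct values are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def delta_ct_stability(ct):
    """Mean pairwise SD of delta-Ct for each gene (lower = more stable).

    ct: dict mapping gene name -> list of Ct values, one per sample,
        with all genes measured on the same samples in the same order.
    """
    sds = {gene: [] for gene in ct}
    for g1, g2 in combinations(ct, 2):
        diffs = [a - b for a, b in zip(ct[g1], ct[g2])]
        sd = stdev(diffs)
        sds[g1].append(sd)
        sds[g2].append(sd)
    return {gene: mean(values) for gene, values in sds.items()}


# Hypothetical Ct values for three candidate genes across four samples.
ct_values = {
    "RPL7": [20.0, 21.0, 20.0, 21.0],
    "RPL8": [22.0, 23.0, 22.0, 23.0],   # tracks RPL7 with a constant offset
    "GAPDH": [18.0, 22.0, 17.0, 23.0],  # fluctuates independently
}
stability = delta_ct_stability(ct_values)
least_stable = max(stability, key=stability.get)
print(stability, least_stable)
```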

Validation of housekeeping genes as candidate internal references for quantitative expression studies in healthy and nervous necrosis virus-infected seven-band grouper (Hyporthodus septemfasciatus)

  • Krishnan, Rahul;Qadiri, Syed Shariq Nazir;Kim, Jong-Oh;Kim, Jae-Ok;Oh, Myung-Joo
    • Fisheries and Aquatic Sciences / Vol. 22, No. 12 / pp. 28.1-28.8 / 2019
  • Background: In the present study, we evaluated four commonly used housekeeping genes, viz., actin-β, elongation factor-1α (EF1α), acidic ribosomal protein (ARP), and glyceraldehyde 3-phosphate dehydrogenase (GAPDH), as internal references for quantitative analysis of immune genes in nervous necrosis virus (NNV)-infected seven-band grouper, Hyporthodus septemfasciatus. Methods: Expression profiles of the four genes were estimated in 12 tissues of healthy and infected seven-band grouper. Expression stability of the genes was calculated using the delta Ct method, BestKeeper, NormFinder, and geNorm algorithms. Consensus ranking was performed using RefFinder, and statistical analysis was done using GraphpadPrism 5.0. Results: Tissue-specific variations were observed in the four tested housekeeping genes of healthy and NNV-infected seven-band grouper. Fold change calculations for interferon-1 and Mx expression using the four housekeeping genes as internal references gave varied profiles for each tissue. EF1α and actin-β were the most stably expressed genes in tissues of healthy and NNV-infected seven-band grouper, respectively. Consensus ranking using RefFinder suggested EF1α as the least variable and most stable gene in both healthy and infected animals. Conclusions: These results suggest that EF1α is a better internal reference than the other genes tested in this study during the NNV infection process. This is a pilot study on the validation of reference genes in Hyporthodus septemfasciatus in the context of NNV infection.
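
To show why the choice of internal reference changes the reported fold change, the following sketch uses the widely used $2^{-\Delta\Delta C_t}$ formula, which assumes roughly 100% amplification efficiency. The abstract above does not state which formula the study used, and all Ct values here are hypothetical.

```python
def fold_change(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^(-ddCt) method."""
    d_treated = ct_target_treated - ct_ref_treated   # delta-Ct in treated sample
    d_control = ct_target_control - ct_ref_control   # delta-Ct in control sample
    return 2.0 ** -(d_treated - d_control)


# Hypothetical target gene: Ct 25 in healthy tissue, Ct 22 in infected tissue.
# A stable reference stays at Ct 18 in both conditions.
fc_stable_ref = fold_change(22.0, 18.0, 25.0, 18.0)

# An unstable reference drifts from Ct 18 (healthy) to Ct 20 (infected).
fc_unstable_ref = fold_change(22.0, 20.0, 25.0, 18.0)
print(fc_stable_ref, fc_unstable_ref)
```

In this toy case the same target measurements yield a fourfold difference in the reported fold change, purely because the second reference gene is not stable across conditions.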

Assessment of Suitable Reference Genes for RT-qPCR Normalization with Developmental Samples in Pacific Abalone Haliotis discus hannai

  • Lee, Sang Yoon;Park, Choul-Ji;Nam, Yoon Kwon
    • 한국동물생명공학회지 / Vol. 34, No. 4 / pp. 280-291 / 2019
  • The potential utility of 14 candidate housekeeping genes as normalization references for RT-qPCR analysis of developmental samples (fertilized eggs to late veliger larvae) in Pacific abalone Haliotis discus hannai was evaluated using four different statistical algorithms (geNorm, NormFinder, BestKeeper and the comparative ΔCT method). Different algorithms identified different genes as the best candidates, and the geometric mean-based final ranking from the most to the least stable expression was as follows: RPL5, RPL4, RPS18, RPL8, RPL7, UBE2, RPL7A, GAPDH, RPL36, PPIB, EF1A, ACTB and B-TU. The findings were further validated via relative quantification of metallothionein (MT) transcripts using the stable and unstable reference genes; the measured expression levels of MT depended strongly on the choice of reference genes. Overall, our data suggest that RPL5 and RPS18, either singly or in combination, are appropriate for normalizing gene expression in developmental samples of this abalone species, whereas ACTB, B-TU and EF1A are less stable and not recommended. In addition, our findings suggest that the standard deviations of the geometric ranking, as well as the geometric mean itself, should be taken into account in the final selection of reference gene(s). This study could be a useful basis for generating accurate and reliable RT-qPCR data with developmental samples in this abalone species.
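
The geometric mean-based consensus ranking described above can be sketched as follows. The per-algorithm ranks are hypothetical, and reporting the plain standard deviation of the ranks alongside the geometric mean is one assumed reading of "standard deviations in geometric ranking".

```python
from statistics import geometric_mean, stdev

# Hypothetical ranks (1 = most stable) assigned to each gene by four algorithms,
# in the order geNorm, NormFinder, BestKeeper, comparative delta-CT.
ranks = {
    "RPL5": [1, 2, 1, 1],
    "RPS18": [2, 1, 3, 2],
    "ACTB": [3, 3, 4, 3],
    "B-TU": [4, 4, 2, 4],
}

# Summarize each gene by the geometric mean and the spread of its ranks.
summary = {
    gene: (geometric_mean(r), stdev(r))
    for gene, r in ranks.items()
}

# Final ordering: smaller geometric mean rank means more stable expression.
final_ranking = sorted(summary, key=lambda g: summary[g][0])

for gene in final_ranking:
    gm, sd = summary[gene]
    print(f"{gene}: geometric mean rank = {gm:.3f}, SD of ranks = {sd:.3f}")
```

A gene with a good geometric mean but a large spread (the algorithms disagree about it) may deserve less trust than one ranked consistently, which is the point the abstract makes about considering both quantities.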

Ranking Candidate Genes for the Biomarker Development in a Cancer Diagnostics

  • Kim, In-Young;Lee, Sun-Ho;Rha, Sun-Young;Kim, Byung-Soo
    • 한국생물정보학회:학술대회논문집 / 한국생물정보시스템생물학회 2004 The 3rd Annual Conference for The Korean Society for Bioinformatics Association of Asian Societies for Bioinformatics 2004 Symposium / pp. 272-278 / 2004
  • Recently, Pepe et al. (2003) employed the receiver operating characteristic (ROC) approach to rank candidate genes from a microarray experiment that can be used for biomarker development with the ultimate purpose of population screening for a cancer. In a cancer microarray experiment based on n patients, the researcher often wants to compare the tumor tissue with the normal tissue within the same individual using a common reference RNA. This design is referred to as a reference design or an indirect design. Ideally, this experiment produces n pairs of microarray data, where each pair consists of two sets of microarray data resulting from reference versus normal tissue and reference versus tumor tissue hybridizations. However, for certain individuals either the normal tissue or the tumor tissue is too small for the experimenter to extract enough RNA for the microarray experiment, hence there are missing values in either the normal or the tumor tissue data. In practice, we have $n_1$ pairs of complete observations, $n_2$ 'normal only' and $n_3$ 'tumor only' data for the microarray experiment with n patients, where $n = n_1 + n_2 + n_3$. We refer to this data set as a mixed data set, as it contains a mix of fully observed and partially observed pair data. Such a mixed data set was actually observed in a microarray experiment based on human tissues obtained during surgical operations on cancer patients. Pepe et al. (2003) provide the rationale for using the ROC approach based on two independent samples to rank candidate genes instead of using t or Mann-Whitney statistics. We first modify the ROC approach of ranking genes for a paired data set and then extend it to a mixed data set by taking a weighted average of the two ROC values obtained from the paired data set and the two independent data sets.
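
The idea of combining a paired and an unpaired ROC value can be sketched as below. Everything specific here is an assumption made for illustration, since the abstract gives no formulas: the paired value is taken as the fraction of individuals whose tumor measurement exceeds their normal measurement, the unpaired value as the Mann-Whitney AUC between the "tumor only" and "normal only" samples, and the weight as the share of individuals with complete pairs, $n_1/(n_1 + n_2 + n_3)$. All data values are hypothetical.

```python
def paired_auc(normal, tumor):
    """Fraction of complete pairs in which tumor exceeds normal (ties count 0.5)."""
    score = 0.0
    for n, t in zip(normal, tumor):
        if t > n:
            score += 1.0
        elif t == n:
            score += 0.5
    return score / len(normal)


def unpaired_auc(normal, tumor):
    """Mann-Whitney AUC between two independent samples (ties count 0.5)."""
    score = 0.0
    for t in tumor:
        for n in normal:
            if t > n:
                score += 1.0
            elif t == n:
                score += 0.5
    return score / (len(tumor) * len(normal))


def mixed_auc(pair_normal, pair_tumor, normal_only, tumor_only):
    """Weighted average of the paired and unpaired values (assumed weighting)."""
    n1, n2, n3 = len(pair_normal), len(normal_only), len(tumor_only)
    w = n1 / (n1 + n2 + n3)  # assumed weight: share of complete pairs
    return (w * paired_auc(pair_normal, pair_tumor)
            + (1.0 - w) * unpaired_auc(normal_only, tumor_only))


# Hypothetical expression values for one gene.
pair_normal = [1.0, 2.0, 3.0, 4.0]
pair_tumor = [2.0, 3.0, 2.5, 5.0]
normal_only = [1.0, 2.0]
tumor_only = [3.0, 4.0]
value = mixed_auc(pair_normal, pair_tumor, normal_only, tumor_only)
print(value)
```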

Prediction of hub genes of Alzheimer's disease using a protein interaction network and functional enrichment analysis

  • Wee, Jia Jin;Kumar, Suresh
    • Genomics & Informatics / Vol. 18, No. 4 / pp. 39.1-39.8 / 2020
  • Alzheimer's disease (AD) is a chronic, progressive brain disorder that slowly destroys affected individuals' memory and reasoning faculties, and consequently, their ability to perform the simplest tasks. This study investigated the hub genes of AD. Proteins interact with other proteins and non-protein molecules, and these interactions play an important role in understanding protein function. Computational methods are useful for understanding biological problems, in particular, network analyses of protein-protein interactions. Through a protein network analysis, we identified the following top 10 hub genes associated with AD: PTGER3, C3AR1, NPY, ADCY2, CXCL12, CCR5, MTNR1A, CNR2, GRM2, and CXCL8. Through gene enrichment, it was identified that most gene functions could be classified as integral to the plasma membrane, G-protein coupled receptor activity, and cell communication under gene ontology, as well as involvement in signal transduction pathways. Based on the convergent functional genomics ranking, the prioritized genes were NPY, CXCL12, CCR5, and CNR2.
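
One common way to pick hub genes from a protein interaction network is to rank nodes by degree, that is, by their number of distinct interaction partners. The sketch below assumes that definition; the abstract above does not say which centrality measure was used, and the node names and edges are hypothetical placeholders rather than real interactions.

```python
from collections import Counter


def top_hubs(edges, k):
    """Return the k nodes with the highest degree in an undirected network.

    edges: iterable of (node_a, node_b) pairs; duplicates and self-loops are ignored.
    """
    unique = {tuple(sorted(edge)) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for a, b in unique:
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name so that ties are deterministic.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical interaction list; ("G3", "G1") duplicates ("G1", "G3").
edges = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G3"), ("G3", "G1"), ("G5", "G4"),
]
hubs = top_hubs(edges, 2)
print(hubs)
```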

Evaluation of Candidate Housekeeping Genes for the Normalization of RT-qPCR Analysis using Developing Embryos and Prolarvae in Russian Sturgeon Acipenser gueldenstaedtii

  • 남윤권;이상윤;김은정
    • 한국수산과학회지 / Vol. 51, No. 1 / pp. 95-106 / 2018
  • To evaluate appropriate reference genes for the normalization of quantitative reverse transcription PCR (RT-qPCR) data with embryonic and larval samples from Russian sturgeon Acipenser gueldenstaedtii, the expression stability of eight candidate housekeeping genes, namely beta-actin (ACTB), elongation factor-1A (EF1A), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), histone 2A (H2A), ribosomal protein L5 (RPL5), ribosomal protein L7 (RPL7), succinate dehydrogenase (SDHA), and ubiquitin-conjugating enzyme E2 (UBE2A), was tested using embryonic samples from 12 developmental stages and larval samples from 11 ontogenic stages. Based on the stability rankings from three statistical software packages, geNorm, NormFinder, and BestKeeper, the expression stability of the embryonic subset was ranked as UBE2A > H2A > SDHA > GAPDH > RPL5 > EF1A > ACTB > RPL7. On the other hand, the ranking in the larval subset was determined as UBE2A > GAPDH > SDHA > RPL5 > RPL7 > H2A > EF1A > ACTB. When the two subsets were combined, the overall ranking was UBE2A > SDHA > H2A > RPL5 > GAPDH > EF1A > ACTB > RPL7. Taken together, our data suggest that UBE2A and SDHA are suitable references for developmental and ontogenic samples of this sturgeon species, whereas traditional housekeepers such as ACTB and GAPDH may not be suitable candidates.