Search | Korea Science

Applying a modified AUC to gene ranking

Yu, Wenbao;Chang, Yuan-Chin Ivan;Park, Eunsik
- Communications for Statistical Applications and Methods / v.25 no.3 / pp.307-319 / 2018
High-throughput technologies enable the simultaneous evaluation of thousands of genes that could discriminate between subclasses of complex diseases. Ranking genes according to differential expression is an important screening step for follow-up analysis, and many statistical measures have been proposed for this purpose. A good ranked list should be stable (at least for the top-ranked genes), and the top-ranked genes should have high power to differentiate disease status. However, the literature places little emphasis on ranking genes by these two criteria simultaneously. To satisfy both criteria, we propose applying a previously reported metric, the modified area under the receiver operating characteristic curve, to gene ranking. The proposed ranking method is found to be promising, yielding a stable ranked list and good prediction performance for the top-ranked genes. The findings are illustrated through studies on both synthesized data and real microarray gene expression data. The proposed method is recommended for ranking genes or other biomarkers in high-dimensional omics studies.
https://doi.org/10.29220/CSAM.2018.25.3.307
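
As a rough illustration of AUC-based gene ranking, the sketch below scores each gene by its ordinary empirical AUC and sorts genes by distance from 0.5. This is an assumption-laden stand-in: the abstract does not define the modified AUC, so the plain AUC is used here instead, and the function names and toy data are hypothetical.

```python
import numpy as np

def empirical_auc(cases, controls):
    """Empirical AUC: P(case > control) + 0.5 * P(case == control)."""
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    # Compare every case value with every control value.
    diff = cases[:, None] - controls[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def rank_genes_by_auc(expr, labels):
    """Return gene indices ordered from most to least discriminative.

    expr: (n_samples, n_genes) array; labels: 1 = case, 0 = control.
    NOTE: uses the plain AUC, not the modified AUC named in the abstract.
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    aucs = np.array([
        empirical_auc(expr[labels == 1, g], expr[labels == 0, g])
        for g in range(expr.shape[1])
    ])
    # Distance from 0.5 lets both up- and down-regulated genes rank high.
    scores = np.abs(aucs - 0.5)
    order = np.argsort(-scores, kind="stable")
    return order, aucs

# Toy data, made up for illustration: 4 cases, 4 controls, 3 genes.
expr = np.array([
    [5.0, 1.0, 2.0],
    [6.0, 2.0, 1.0],
    [7.0, 1.5, 3.0],
    [8.0, 2.5, 0.5],
    [1.0, 1.2, 2.5],
    [2.0, 2.2, 1.5],
    [3.0, 1.7, 0.7],
    [4.0, 2.7, 3.5],
])
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])
order, aucs = rank_genes_by_auc(expr, labels)
```

In the toy data the first gene separates cases from controls perfectly, so it is ranked first. Assessing rank stability, the other criterion in the abstract, would require repeating the ranking on resampled data, which this sketch does not do.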

Identifying Statistically Significant Gene-Sets by Gene Set Enrichment Analysis Using Fisher Criterion (Fisher Criterion을 이용한 Gene Set Enrichment Analysis 기반 유의 유전자 집합의 검출 방법 연구)

Kim, Jae-Young;Shin, Mi-Young
- Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.4 / pp.19-26 / 2008
Gene set enrichment analysis (GSEA) is a computational method that identifies gene sets showing statistically significant differences between two groups of microarray expression profiles and uncovers their biological meaning by employing gene annotation databases such as Cytogenetic Band, KEGG pathways, and Gene Ontology. In GSEA, all the genes in a given dataset are first ordered by the signal-to-noise ratio between the groups, and further analyses then proceed from that ordering. Despite its impressive results in several previous studies, gene ranking by the signal-to-noise ratio makes it difficult to consider highly up-regulated genes and highly down-regulated genes at the same time as candidate significant genes, a combination that may reflect certain situations arising in metabolic and signaling pathways. To deal with this problem, in this article we investigate GSEA with the Fisher criterion for gene ranking and evaluate its effects in leukemia-related pathway analyses.
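
The contrast between the two ranking scores can be sketched as follows. The abstract gives no formulas, so the definitions below are the commonly used textbook forms and should be read as assumptions: the signal-to-noise ratio as (mean1 - mean2) / (sd1 + sd2) and the Fisher criterion as (mean1 - mean2)^2 / (var1 + var2). Function names and data are hypothetical.

```python
import numpy as np

def signal_to_noise(x1, x2):
    """Signed score (assumed form): (mean1 - mean2) / (sd1 + sd2)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return (x1.mean() - x2.mean()) / (x1.std(ddof=1) + x2.std(ddof=1))

def fisher_criterion(x1, x2):
    """Unsigned score (assumed form): (mean1 - mean2)^2 / (var1 + var2)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return (x1.mean() - x2.mean()) ** 2 / (x1.var(ddof=1) + x2.var(ddof=1))

# Made-up expression values for one up-regulated and one down-regulated gene.
group_a_up, group_b_up = [4.0, 5.0, 6.0], [1.0, 2.0, 3.0]
group_a_down, group_b_down = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]

snr_up = signal_to_noise(group_a_up, group_b_up)
snr_down = signal_to_noise(group_a_down, group_b_down)
fisher_up = fisher_criterion(group_a_up, group_b_up)
fisher_down = fisher_criterion(group_a_down, group_b_down)
```

Under these definitions the signed score sends the two genes to opposite ends of a sorted list, whereas the squared score gives both the same large value, so they appear together at the top.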

Evaluation of reference genes for RT-qPCR study in abalone Haliotis discus hannai during heavy metal overload stress

Lee, Sang Yoon;Nam, Yoon Kwon
- Fisheries and Aquatic Sciences / v.19 no.4 / pp.21.1-21.11 / 2016
Background: The evaluation of suitable reference genes as normalization controls is a prerequisite for launching a quantitative reverse transcription-PCR (RT-qPCR)-based expression study. To select stable reference genes in abalone Haliotis discus hannai tissues (gill and hepatopancreas) under heavy metal exposure conditions (Cu, Zn, and Cd), 12 candidate housekeeping genes were assessed for expression stability using a comprehensive ranking that integrates four different statistical algorithms (geNorm, NormFinder, BestKeeper, and the ΔCT method). Results: Expression stability in the gill subset was determined as RPL7 > RPL8 > ACTB > RPL3 > PPIB > RPL7A > EF1A > RPL4 > GAPDH > RPL5 > UBE2 > B-TU. On the other hand, the ranking in the hepatopancreas subset was RPL7 > RPL3 > RPL8 > ACTB > RPL4 > EF1A > RPL5 > RPL7A > B-TU > UBE2 > PPIB > GAPDH. The pairwise variation assessed by the geNorm program indicates that two reference genes could be sufficient for accurate normalization in both gill and hepatopancreas subsets. Overall, both subsets pointed to ribosomal protein genes (particularly RPL7) as stable references, whereas traditional housekeepers such as β-tubulin (B-TU) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) genes were ranked as unstable. The reference gene selection was validated with a quantitative assay of MT transcripts. Conclusions: The present analysis showed the importance of validating reference genes with multiple algorithmic approaches to select genes that are truly stable. Our results indicate that the expression stability of a given reference gene is not always consistent across tissue types. The data from this study could be a good guide for the future design of RT-qPCR studies with respect to metal regulation/detoxification and other related physiologies in this abalone species.
https://doi.org/10.1186/s41240-016-0022-z

Comparative Statistic Module (CSM) for Significant Gene Selection

Kim, Young-Jin;Kim, Hyo-Mi;Kim, Sang-Bae;Park, Chan;Kimm, Kuchan;Koh, InSong
- Genomics & Informatics / v.2 no.4 / pp.180-183 / 2004
Comparative Statistic Module (CSM) provides genomics researchers with a more reliable list of significant genes by offering the commonly selected genes and a method of choice, determined by calculating the rank of each statistical test based on the average ranking of common genes across five statistical methods: the t-test, Kruskal-Wallis (Wilcoxon signed rank) test, SAM, two-sample multiple test, and Empirical Bayesian test. This statistical analysis module is implemented in Perl and R.

Prediction of hub genes of Alzheimer's disease using a protein interaction network and functional enrichment analysis

Wee, Jia Jin;Kumar, Suresh
- Genomics & Informatics / v.18 no.4 / pp.39.1-39.8 / 2020
Alzheimer's disease (AD) is a chronic, progressive brain disorder that slowly destroys affected individuals' memory and reasoning faculties, and consequently, their ability to perform the simplest tasks. This study investigated the hub genes of AD. Proteins interact with other proteins and non-protein molecules, and these interactions play an important role in understanding protein function. Computational methods are useful for understanding biological problems, in particular, network analyses of protein-protein interactions. Through a protein network analysis, we identified the following top 10 hub genes associated with AD: PTGER3, C3AR1, NPY, ADCY2, CXCL12, CCR5, MTNR1A, CNR2, GRM2, and CXCL8. Through gene enrichment, it was identified that most gene functions could be classified as integral to the plasma membrane, G-protein coupled receptor activity, and cell communication under gene ontology, as well as involvement in signal transduction pathways. Based on the convergent functional genomics ranking, the prioritized genes were NPY, CXCL12, CCR5, and CNR2.
https://doi.org/10.5808/GI.2020.18.4.e39

Validation of housekeeping genes as candidate internal references for quantitative expression studies in healthy and nervous necrosis virus-infected seven-band grouper (Hyporthodus septemfasciatus)

Krishnan, Rahul;Qadiri, Syed Shariq Nazir;Kim, Jong-Oh;Kim, Jae-Ok;Oh, Myung-Joo
- Fisheries and Aquatic Sciences / v.22 no.12 / pp.28.1-28.8 / 2019
Background: In the present study, we evaluated four commonly used housekeeping genes, viz., actin-β, elongation factor-1α (EF1α), acidic ribosomal protein (ARP), and glyceraldehyde 3-phosphate dehydrogenase (GAPDH), as internal references for quantitative analysis of immune genes in nervous necrosis virus (NNV)-infected seven-band grouper, Hyporthodus septemfasciatus. Methods: Expression profiles of the four genes were estimated in 12 tissues of healthy and infected seven-band grouper. Expression stability of the genes was calculated using the delta Ct method, BestKeeper, NormFinder, and geNorm algorithms. Consensus ranking was performed using RefFinder, and statistical analysis was done using GraphPad Prism 5.0. Results: Tissue-specific variations were observed in the four tested housekeeping genes of healthy and NNV-infected seven-band grouper. Fold change calculations for interferon-1 and Mx expression using the four housekeeping genes as internal references presented varied profiles for each tissue. EF1α and actin-β were the most stably expressed genes in tissues of healthy and NNV-infected seven-band grouper, respectively. Consensus ranking using RefFinder suggested EF1α as the least variable and most stable gene in both healthy and infected animals. Conclusions: These results suggest that EF1α is a better internal reference than the other genes tested in this study during the NNV infection process. This is a pilot study on the validation of reference genes in Hyporthodus septemfasciatus in the context of NNV infection.
https://doi.org/10.1186/s41240-019-0142-3

Screening and Clustering for Time-course Yeast Microarray Gene Expression Data using Gaussian Process Regression (효모 마이크로어레이 유전자 발현데이터에 대한 가우시안 과정 회귀를 이용한 유전자 선별 및 군집화)

Kim, Jaehee;Kim, Taehoun
- The Korean Journal of Applied Statistics / v.26 no.3 / pp.389-399 / 2013
This article introduces Gaussian process regression and demonstrates its application to time-course microarray gene expression data. Genes in yeast cell cycle microarray expression data are screened using a ratio of log marginal likelihoods computed by Gaussian process regression with a squared exponential covariance kernel. A Gaussian process regression is fitted to each gene, and the fits are shown for the nine top-ranking genes. Gaussian model-based clustering is then applied to the screened data, and silhouette values are calculated to assess cluster validity.
https://doi.org/10.5351/KJAS.2013.26.3.389
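
A minimal sketch of screening by Gaussian process marginal likelihood is given below. It uses the standard squared exponential kernel and the standard log marginal likelihood formula, but the screening score itself is an assumption: here it is the difference in log marginal likelihood between a time-structured model and a noise-only model with fixed, hand-picked hyperparameters, whereas the article may optimize hyperparameters or define its ratio differently. All names and data are hypothetical.

```python
import numpy as np

def sq_exp_kernel(t1, t2, signal_var, length_scale):
    """Squared exponential covariance between two sets of time points."""
    d = t1[:, None] - t2[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def log_marginal_likelihood(y, K):
    """log N(y | 0, K), computed via a Cholesky factor of K."""
    n = len(y)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # sum(log(diag(L))) equals 0.5 * log|K|.
    return float(-0.5 * y @ alpha
                 - np.sum(np.log(np.diag(L)))
                 - 0.5 * n * np.log(2.0 * np.pi))

def screening_score(t, y, signal_var=1.0, length_scale=2.0, noise_var=0.1):
    """Assumed score: log-likelihood of GP model minus noise-only model."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(t)
    K_signal = sq_exp_kernel(t, t, signal_var, length_scale) + noise_var * np.eye(n)
    K_noise = (signal_var + noise_var) * np.eye(n)  # same total variance, no time structure
    return log_marginal_likelihood(y, K_signal) - log_marginal_likelihood(y, K_noise)

# Made-up profiles: one smooth periodic gene and one jagged, noise-like gene.
t = np.arange(10.0)
smooth = np.sin(2.0 * np.pi * t / 10.0)
jagged = 0.7 * np.array([1.0, -1.0] * 5)
smooth_score = screening_score(t, smooth)
jagged_score = screening_score(t, jagged)
```

Genes would then be ordered by this score, with smoothly varying profiles scoring higher than profiles that look like independent noise.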

Re-Ranking Retrieval Model Using Similarity Transformation Based on Gene Algorithm (유전자 알고리즘 기반 유사도 변환을 이용한 순위 재조정 검색 모델)

이재훈;이성주
- Proceedings of the Korean Institute of Intelligent Systems Conference / 2005.11a / pp.331-334 / 2005
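With advances in information and communication science, vast amounts of information are being generated in diverse fields. As a result, retrieval models that return indiscriminate responses to user requests have also emerged. This paper studies a model that transforms the similarity between pieces of information and re-ranks the results so that more relevant information is presented at the top ranks, allowing users to obtain information better suited to their needs.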

Assessment of Suitable Reference Genes for RT-qPCR Normalization with Developmental Samples in Pacific Abalone Haliotis discus hannai

Lee, Sang Yoon;Park, Choul-Ji;Nam, Yoon Kwon
- Journal of Animal Reproduction and Biotechnology / v.34 no.4 / pp.280-291 / 2019
The potential utility of 14 candidate housekeeping genes as normalization references for RT-qPCR analysis with developmental samples (fertilized eggs to late veliger larvae) in Pacific abalone Haliotis discus hannai was evaluated using four different statistical algorithms (geNorm, NormFinder, BestKeeper, and the comparative ΔCT method). Different algorithms identified different genes as the best candidates, and the geometric mean-based final ranking from the most to the least stable expression was as follows: RPL5, RPL4, RPS18, RPL8, RPL7, UBE2, RPL7A, GAPDH, RPL36, PPIB, EF1A, ACTB and B-TU. The findings were further validated via relative quantification of metallothionein (MT) transcripts using the stable and unstable reference genes, and MT expression levels were greatly influenced by the choice of reference genes. Overall, our data suggest that RPL5 and RPS18, either singly or in combination, are appropriate for normalizing gene expression in developmental samples of this abalone species, whereas ACTB, B-TU and EF1A are less stable and not recommended. In addition, our findings suggest that standard deviations in geometric ranking, as well as the geometric mean itself, should be taken into account in the final selection of reference gene(s). This study could be a useful basis for generating accurate and reliable RT-qPCR data with developmental samples in this abalone species.
https://doi.org/10.12750/JARB.34.4.280
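
The geometric mean-based consensus ranking can be sketched as follows. The gene names and per-algorithm ranks are invented placeholders, not values from the study, and reporting the sample standard deviation of each gene's ranks is only one assumed reading of "standard deviations in geometric ranking".

```python
import numpy as np

def geometric_mean_ranking(rank_table):
    """Combine per-algorithm ranks (1 = most stable) into a consensus order.

    rank_table: dict mapping gene name -> list of ranks, one per algorithm.
    Returns the genes sorted by geometric mean rank, plus a summary of
    (geometric mean, sample standard deviation of ranks) for each gene.
    """
    summary = {}
    for gene, ranks in rank_table.items():
        r = np.asarray(ranks, dtype=float)
        geo_mean = float(np.exp(np.mean(np.log(r))))
        summary[gene] = (geo_mean, float(np.std(r, ddof=1)))
    ordered = sorted(summary, key=lambda g: summary[g][0])
    return ordered, summary

# Hypothetical ranks from four algorithms (placeholder data).
ranks = {
    "GENE_A": [1, 2, 1, 2],
    "GENE_B": [2, 1, 3, 1],
    "GENE_C": [3, 3, 2, 4],
    "GENE_D": [4, 4, 4, 3],
}
ordered, summary = geometric_mean_ranking(ranks)
```

A gene with a low geometric mean but a large spread of ranks is one the algorithms disagree about, which is why the spread is reported alongside the mean.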

Ranking Candidate Genes for the Biomarker Development in a Cancer Diagnostics

Kim, In-Young;Lee, Sun-Ho;Rha, Sun-Young;Kim, Byung-Soo
- Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.272-278 / 2004
Recently, Pepe et al. (2003) employed the receiver operating characteristic (ROC) approach to rank candidate genes from a microarray experiment for use in biomarker development, with the ultimate purpose of population screening for a cancer. In a cancer microarray experiment based on n patients, the researcher often wants to compare the tumor tissue with the normal tissue within the same individual using a common reference RNA. This design is referred to as a reference design or an indirect design. Ideally, this experiment produces n pairs of microarray data, where each pair consists of two sets of microarray data resulting from reference versus normal tissue and reference versus tumor tissue hybridizations. However, for certain individuals either the normal tissue or the tumor tissue is not large enough for the experimenter to extract enough RNA to conduct the microarray experiment, hence there are missing values in either the normal or the tumor tissue data. In practice, we have $n_1$ pairs of complete observations, $n_2$ 'normal only' and $n_3$ 'tumor only' data for the microarray experiment with n patients, where $n = n_1 + n_2 + n_3$. We refer to this data set as a mixed data set, as it contains a mix of fully observed and partially observed paired data. Such a mixed data set was actually observed in a microarray experiment based on human tissues obtained during surgical operations on cancer patients. Pepe et al. (2003) provide the rationale for using the ROC approach based on two independent samples to rank candidate genes instead of using t or Mann-Whitney statistics. We first adapt the ROC approach of ranking genes to a paired data set and then extend it to a mixed data set by taking a weighted average of two ROC values obtained from the paired data set and the two independent data sets.

