• Title/Summary/Keyword: Informative genes

De novo transcriptome sequencing and gene expression profiling with/without B-chromosome plants of Lilium amabile

  • Park, Doori;Kim, Jong-Hwa;Kim, Nam-Soo
    • Genomics & Informatics / v.17 no.3 / pp.27.1-27.9 / 2019
  • Supernumerary B chromosomes were found in Lilium amabile (2n = 2x = 24), an endemic Korean lily that grows in the wild throughout the Korean Peninsula. The extra B chromosomes do not affect the host-plant morphology; therefore, whole transcriptome analysis was performed in 0B and 1B plants to identify differentially expressed genes. A total of 154,810 transcripts were obtained from over 10 Gbp of data by de novo assembly. By mapping the raw reads to the de novo transcripts, we identified 7,852 differentially expressed genes (|log2FC| > 10), of which 4,059 and 3,794 were up- and down-regulated, respectively, in 1B plants compared to 0B plants. Functional enrichment analysis revealed that various differentially expressed genes were involved in cellular processes including the cell cycle, chromosome breakage and repair, and microtubule formation, all of which may be related to the occurrence and maintenance of B chromosomes. Our data provide insight into transcriptomic changes and evolution of plant B chromosomes and deliver an informative database for future study of B chromosome transcriptomes in the Korean lily.
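
The fold-change filter described in this abstract can be illustrated with a minimal sketch. It assumes normalized expression values per transcript for 0B and 1B plants; the function names, the pseudocount, and the toy values are hypothetical and are not taken from the paper.

```python
import math

def log2_fold_change(expr_1b, expr_0b, pseudocount=1.0):
    """log2 ratio of 1B over 0B expression; the pseudocount (an assumption) avoids log(0)."""
    return math.log2((expr_1b + pseudocount) / (expr_0b + pseudocount))

def split_degs(expression, threshold=10.0):
    """Split transcripts into up- and down-regulated lists using |log2FC| > threshold."""
    up, down = [], []
    for transcript, (expr_0b, expr_1b) in expression.items():
        lfc = log2_fold_change(expr_1b, expr_0b)
        if lfc > threshold:
            up.append(transcript)
        elif lfc < -threshold:
            down.append(transcript)
    return up, down

# Hypothetical normalized expression values: transcript -> (0B, 1B)
toy = {
    "t1": (0.0, 5000.0),   # strongly up in 1B
    "t2": (4000.0, 0.0),   # strongly down in 1B
    "t3": (120.0, 150.0),  # essentially unchanged
}
up, down = split_degs(toy)
```

A real analysis would normally also apply a significance test with multiple-testing correction; the sketch shows only the fold-change threshold.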

A hybrid method to compose an optimal gene set for multi-class classification using mRMR and modified particle swarm optimization (mRMR과 수정된 입자군집화 방법을 이용한 다범주 분류를 위한 최적유전자집단 구성)

  • Lee, Sunho
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.683-696 / 2020
  • The aim of this research is to find an optimal gene set that provides highly accurate multi-class classification with a minimum number of genes. A two-stage procedure is proposed. In the first stage, following the minimum redundancy and maximum relevance (mRMR) framework, the data are filtered using several statistics that rank differentially expressed genes and K-means clustering that reduces redundancy between genes. In the second stage, a modified particle swarm optimization selects a small subset of informative genes. Two well-known multi-class microarray data sets, ALL and SRBCT, are analyzed to demonstrate the effectiveness of this hybrid method.
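
The mRMR idea in this abstract (prefer genes that are relevant to the class but not redundant with genes already chosen) can be sketched with a simple greedy selection. This is a simplified stand-in, not the paper's procedure: it does not use K-means clustering or particle swarm optimization, it assumes relevance scores are supplied by the caller, and it measures redundancy by absolute Pearson correlation. All names and data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def greedy_mrmr(expr, relevance, k):
    """Pick k genes, each maximizing relevance minus mean redundancy with prior picks.

    expr: gene -> list of expression values across samples
    relevance: gene -> precomputed relevance score (assumed given)
    """
    selected = []
    candidates = set(expr)
    while candidates and len(selected) < k:
        def score(gene):
            if not selected:
                return relevance[gene]
            redundancy = sum(abs(pearson(expr[gene], expr[s])) for s in selected)
            return relevance[gene] - redundancy / len(selected)
        best = max(sorted(candidates), key=score)  # sorted() makes ties deterministic
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical data: g2 is perfectly correlated with g1, g3 is uncorrelated with g1
expr = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, -1, -1, 1]}
relevance = {"g1": 0.9, "g2": 0.8, "g3": 0.5}
chosen = greedy_mrmr(expr, relevance, k=2)
```

Here g2 scores well on its own but is skipped in favor of g3 because it duplicates the information already carried by g1.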

Ovarian Cancer Microarray Data Classification System Using Marker Genes Based on Normalization (표준화 기반 표지 유전자를 이용한 난소암 마이크로어레이 데이타 분류 시스템)

  • Park, Su-Young;Jung, Chai-Yeoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.9 / pp.2032-2037 / 2011
  • Marker genes are defined as genes whose expression level characterizes a specific experimental condition. Genes whose expression levels differ significantly between groups are highly informative about the studied phenomenon. In this paper, the system first normalizes the data with the most widely used of the normalization methods proposed so far, then detects marker genes by ranking genes according to statistics, and compares the performance of each normalization method using a multi-layer perceptron neural network. Applying the multi-layer perceptron algorithm to a microarray data set containing eight marker genes selected with the ANOVA method after Lowess normalization gave the highest classification accuracy, 99.32%, and the lowest prediction error estimate.
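
Ranking genes by a one-way ANOVA statistic, as this abstract describes, can be sketched as follows. The sketch assumes the data are already normalized; it omits the Lowess step and the neural network, and the gene names and values are hypothetical.

```python
def anova_f(values, labels):
    """One-way ANOVA F statistic for one gene across labeled samples."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def top_marker_genes(expr, labels, n_markers):
    """Return the n_markers genes with the largest F statistic."""
    ranked = sorted(expr, key=lambda gene: anova_f(expr[gene], labels), reverse=True)
    return ranked[:n_markers]

# Hypothetical normalized expression values for six samples
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
expr = {
    "geneA": [5.1, 4.9, 5.0, 1.0, 1.2, 0.8],  # clear separation between groups
    "geneB": [2.0, 3.0, 2.5, 2.4, 2.6, 2.5],  # no separation
    "geneC": [3.0, 3.4, 3.2, 2.0, 2.2, 2.4],  # moderate separation
}
markers = top_marker_genes(expr, labels, n_markers=2)
```

The selected genes would then be passed to a classifier; the abstract reports using eight such marker genes.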

Transcriptional Profiles of Peripheral Blood Leukocytes Identify Patients with Cholangiocarcinoma and Predict Outcome

  • Subimerb, Chutima;Wongkham, Chaisiri;Khuntikeo, Narong;Leelayuwat, Chanvit;McGrath, Michael S.;Wongkham, Sopit
    • Asian Pacific Journal of Cancer Prevention / v.15 no.10 / pp.4217-4224 / 2014
  • Cholangiocarcinoma (CCA), a slow growing but highly metastatic tumor, is highly prevalent in Northeast Thailand. Specific tests that predict prognosis of CCA remain elusive. The present study was designed to investigate whether peripheral blood leukocyte (PBL) transcriptional profiles might be of use as a prognostic test in CCA patients. Gene expression profiling of PBLs from 9 CCA patients and 8 healthy subjects was conducted using the Affymetrix HG_U133 Plus 2.0 GeneChip. We identified informative PBL gene expression profiles that could reliably distinguish CCA patients from healthy subjects. Of these CCA-specific genes, 117 were up-regulated and 60 were down-regulated. The molecular and cellular functions predicted for these CCA-specific genes according to the Gene Ontology database indicated differential PBL expression of host immune response and tumor progression genes (EREG, TGF-$\beta$1, CXCL2, CXCL3, IL-8, and VEGFA). The expression levels of 9 differentially expressed genes were verified in 36 CCA patients vs 20 healthy subjects. A set of three tumor invasion related genes (PLAU, CTSL and SERPINB2), computed as a "prognostic index", was found to be an independent and statistically significant predictor of CCA patient survival. The present study shows that CCA PBLs may serve as disease-predictive, clinically accessible surrogates for identifying expressed genes reflective of CCA disease severity.

Molecular Systematics of the Tephritoidea (Insecta: Diptera): Phylogenetic Signal in 16S and 28S rDNAs for Inferring Relationships Among Families

  • Han, Ho-Yeon;Ro, Kyung-Eui;Choi, Deuk-Soo;Kim, Sam-Kyu
    • Animal cells and systems / v.6 no.2 / pp.145-151 / 2002
  • Phylogenetic signal present in the mitochondrial 16S ribosomal RNA gene (16S rDNA) and the nuclear large subunit ribosomal RNA gene (28S rDNA) was explored to assess their utility in resolving family level relationships of the superfamily Tephritoidea. These two genes were chosen because they appear to evolve at different rates, and might help resolve both shallow and deeper phylogenetic branches within a highly diversified group. For the 16S rDNA data set, the number of aligned sites was 1,258 bp, but 1,204 bp were used for analysis after excluding sites of ambiguous alignment. Among these 1,204 sites, 662 sites were variable and 450 sites were informative for parsimony analysis. For the 28S rDNA data set, the number of aligned sites was 1,102 bp, but 1,000 bp were used for analysis after excluding sites of ambiguous alignment. Among these 1,000 sites, 235 sites were variable and 95 sites were informative for parsimony analysis. Our analyses suggest that: (1) while 16S rDNA is useful for resolving more recent phylogenetic divergences, 28S rDNA can be used to define much deeper phylogenetic branches; (2) the combined analysis of the 16S and 28S rDNAs enhances the overall resolution without losing phylogenetic signal from either single gene analysis; and (3) additional genes that evolve at intermediate rates between the 16S and 28S rDNAs are needed to further resolve relationships among the tephritoid families.
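
The counts of variable and parsimony-informative sites reported in this abstract follow from a standard definition: a site is variable if it shows more than one character state, and parsimony-informative if at least two states each occur in at least two sequences. The sketch below applies that definition to a toy alignment; the sequences and the set of ignored gap or ambiguity symbols are assumptions for illustration.

```python
from collections import Counter

def classify_sites(alignment, ignore="-?N"):
    """Count variable and parsimony-informative columns in an aligned set of sequences."""
    variable = informative = 0
    for column in zip(*alignment):
        counts = Counter(base for base in column if base not in ignore)
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each present in two or more sequences
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative

# Hypothetical alignment of four sequences, five sites
alignment = [
    "ACGTA",
    "ACGTT",
    "ATGAA",
    "ATACT",
]
variable, informative = classify_sites(alignment)
```

In the toy alignment, site 1 is constant, sites 3 and 4 are variable but uninformative (their minority states are singletons), and sites 2 and 5 are parsimony-informative.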

The Design and Implement of Microarry Data Classification Model for Tumor Classification (종양 분류를 위한 마이크로어레이 데이터 분류 모델 설계와 구현)

  • Park, Su-Young;Jung, Chai-Yeoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.10 / pp.1924-1929 / 2007
  • Nowadays, the large amount of related data obtained from such research can be given new meaning to accomplish the original purpose of the whole research as a human project. Microarray-based tumor classification can contribute to accurate classification by statistically finding gene expression patterns that differ according to tumor type. It is therefore essential to select informative genes closely related to a particular tumor class so that tumors can be classified effectively with current microarray technology. In this paper, we used cDNA microarrays of 3,840 genes obtained from a neuronal differentiation experiment on cortical stem cells of white mice with cancer, constructed an accurate tumor classification model by extracting an informative gene list under each normalization method separately, and then estimated performance by analyzing and comparing the results of each experiment. A multi-layer perceptron classifier applied to genes selected with the Pearson correlation coefficient achieved an accuracy of 95.6%.

Identification of Gene-based Potential Biomarkers for Cephalexin-induced Nephrotoxicity in Mice

  • Park, Han-Jin;Oh, Jung-Hwa;Hwang, Ji-Yoon;Lim, Jung-Sun;Jeong, Sun-Young;Kim, Yong-Bum;Yoon, Seok-Joo
    • Molecular & Cellular Toxicology / v.2 no.3 / pp.193-201 / 2006
  • Cephalexin, one of the most widely prescribed cephalosporins, has been reported to cause acute renal failure as a side effect in humans and experimental animals. Although numerous animal studies have been reported on cephalosporin nephrotoxicity, the molecular and cellular nephrotoxic mechanisms of cephalexin are still unknown. This investigation evaluated the time-dependent gene expression profile of the mouse kidney during cephalexin-induced nephrotoxicity. C57BL/6 female mice were administered either saline or 1,000 mg/kg cephalexin intraperitoneally. Mice were sacrificed at 3, 6, and 24 hr after administration. Blood biochemical and histopathological results indicated cephalexin-induced nephrotoxicity. Microarray experiments were carried out using the Affymetrix GeneChip®. There were 198 informative genes that were significantly expressed >5-fold versus control at 3, 6, and 24 hr (p<0.01), of which 156 and 42 were up- and down-regulated, respectively. Major classes of up-regulated genes at 3 and 6 hr included those involved in the MAPK/Jak-STAT signaling pathway and immune response, such as cytokine-cytokine receptor interaction and complement and coagulation cascades. At 24 hr, up-regulated genes were mainly involved in regeneration/repair and immune response; down-regulated genes were generally associated with transporters and intermediary metabolism. Among the up-regulated genes at 24 hr, several potential biomarkers of nephrotoxicity such as Kim-1, Fga, Timp1, and Slc34a2 were clustered in the same category. In addition, Tnfrsf12a and Lcn2, which were consistently up-regulated (>5-fold), were also included as potential biomarkers. These results may provide clues for elucidating the mechanism of cephalexin-induced nephrotoxicity and evaluating potential biomarkers to assess nephrotoxicity.

Identification of Functional and In silico Positional Differentially Expressed Genes in the Livers of High- and Low-marbled Hanwoo Steers

  • Lee, Seung-Hwan;Park, Eung-Woo;Cho, Yong-Min;Yoon, Duhak;Park, Jun-Hyung;Hong, Seong-Koo;Im, Seok-Ki;Thompson, J.M.;Oh, Sung-Jong
    • Asian-Australasian Journal of Animal Sciences / v.20 no.9 / pp.1334-1341 / 2007
  • This study identified hepatic differentially expressed genes (DEGs) affecting the marbling of muscle. Most dietary nutrients bypass the liver and produce plasma lipoproteins. These plasma lipoproteins transport free fatty acids to the target tissue, adipose tissue and muscle. We examined hepatic genes differentially expressed in a differential-display reverse transcription-polymerase chain reaction (ddRT-PCR) analysis comparing high- and low-marbled Hanwoo steers. Using 60 arbitrary primers, we found 13 candidate genes that were upregulated and five candidate genes that were downregulated in the livers of high-marbled Hanwoo steers compared to low-marbled individuals. A BLAST search for the 18 DEGs revealed that 14 were well characterized, while four were not annotated. We examined four DEGs: ATP synthase F0, complement component CD, insulin-like growth factor binding protein-3 (IGFBP3) and phosphatidylethanolamine binding protein (PEBP). Of these, only two genes (complement component CD and IGFBP3) were differentially expressed at p<0.05 between the livers of high- and low-marbled individuals. The mean mRNA levels of the PEBP and ATP synthase F0 genes did not differ significantly between the livers of high- and low-marbled individuals. Moreover, these DEGs showed very high inter-individual variation in expression. These informative DEGs were assigned to the bovine chromosome in a BLAST search of MS marker subsets and the bovine genome sequence. Genes related to energy metabolism (ATP synthase F0, ketohexokinase, electron-transfer flavoprotein-ubiquinone oxidoreductase and NADH hydrogenase) were assigned to BTA 1, 11, 17, and 22, respectively. Syntaxin, IGFBP3, decorin, the bax inhibitor gene and the PEBP gene were assigned to BTA 3, 4, 5, 5, and 17, respectively. In this study, the in silico physical maps provided information on the specific location of candidate genes associated with economic traits in cattle.

Rank-based Multiclass Gene Selection for Cancer Classification with Naive Bayes Classifiers based on Gene Expression Profiles (나이브 베이스 분류기를 이용한 유전발현 데이타기반 암 분류를 위한 순위기반 다중클래스 유전자 선택)

  • Hong, Jin-Hyuk;Cho, Sung-Bae
    • Journal of KIISE:Computer Systems and Theory / v.35 no.8 / pp.372-377 / 2008
  • Multiclass cancer classification has been actively investigated based on gene expression profiles, where it determines the type of cancer by analyzing the large amount of gene expression data collected by the DNA microarray technology. Since gene expression data include many genes not related to a target cancer, it is required to select informative genes in order to obtain highly accurate classification. Conventional rank-based gene selection methods often use ideal marker genes basically devised for binary classification, so it is difficult to directly apply them to multiclass classification. In this paper, we propose a novel method for multiclass gene selection, which does not use ideal marker genes but directly analyzes the distribution of gene expression. It measures the class-discriminability by discretizing gene expression levels into several regions and analyzing the frequency of training samples for each region, and then classifies samples by using the naive Bayes classifier. We have demonstrated the usefulness of the proposed method for various representative benchmark datasets of multiclass cancer classification.
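
The general idea in this abstract (discretize expression levels into regions, count training samples per class in each region, then classify with naive Bayes) can be sketched as below. This is an illustrative reading only: equal-width bins, Laplace smoothing, and all names and data are assumptions, and the paper's class-discriminability measure for selecting genes is not reproduced.

```python
import math
from collections import Counter, defaultdict

def make_binner(values, n_bins):
    """Return a function mapping a value to an equal-width bin index (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant gene
    def to_bin(value):
        return min(n_bins - 1, max(0, int((value - lo) / width)))
    return to_bin

def train(samples, labels, n_bins=3):
    """Count how often each class falls into each expression region of each gene."""
    n_genes = len(samples[0])
    binners = [make_binner([s[g] for s in samples], n_bins) for g in range(n_genes)]
    class_counts = Counter(labels)
    bin_counts = defaultdict(int)  # (gene index, bin, class) -> frequency
    for sample, label in zip(samples, labels):
        for g in range(n_genes):
            bin_counts[(g, binners[g](sample[g]), label)] += 1
    return binners, class_counts, bin_counts, n_bins

def predict(model, sample):
    """Naive Bayes prediction with Laplace-smoothed region frequencies."""
    binners, class_counts, bin_counts, n_bins = model
    total = sum(class_counts.values())
    def log_posterior(label):
        lp = math.log(class_counts[label] / total)
        for g, value in enumerate(sample):
            count = bin_counts.get((g, binners[g](value), label), 0)
            lp += math.log((count + 1) / (class_counts[label] + n_bins))
        return lp
    return max(sorted(class_counts), key=log_posterior)

# Hypothetical three-class training data with two genes per sample
samples = [[1.0, 9.0], [1.2, 8.8], [5.0, 5.0], [5.2, 4.8], [9.0, 1.0], [8.8, 1.2]]
labels = ["A", "A", "B", "B", "C", "C"]
model = train(samples, labels)
```

Because the frequencies are tabulated per class rather than against a binary ideal marker, the same code handles any number of classes.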

Feature Selection via Embedded Learning Based on Tangent Space Alignment for Microarray Data

  • Ye, Xiucai;Sakurai, Tetsuya
    • Journal of Computing Science and Engineering / v.11 no.4 / pp.121-129 / 2017
  • Feature selection has been widely established as an efficient technique for microarray data analysis. Feature selection aims to search for the most important feature/gene subset of a given dataset according to its relevance to the current target. Unsupervised feature selection is considered to be challenging due to the lack of label information. In this paper, we propose a novel method for unsupervised feature selection, which incorporates embedded learning and $\ell_{2,1}$-norm sparse regression into a framework to select genes in microarray data analysis. Local tangent space alignment is applied during embedded learning to preserve the local data structure. The $\ell_{2,1}$-norm sparse regression acts as a constraint to aid in learning the gene weights correlatively, by which the proposed method optimizes for selecting the informative genes which better capture the interesting natural classes of samples. We provide an effective algorithm to solve the optimization problem in our method. Finally, to validate the efficacy of the proposed method, we evaluate the proposed method on real microarray gene expression datasets. The experimental results demonstrate that the proposed method obtains quite promising performance.
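
The role of the $\ell_{2,1}$-norm in this abstract can be illustrated under the common convention that the norm of a weight matrix $W$ (one row per gene) is the sum of the Euclidean norms of its rows, $\|W\|_{2,1} = \sum_i \sqrt{\sum_j W_{ij}^2}$. Penalizing this quantity pushes whole rows toward zero, so genes can be ranked by row norm. The sketch below only computes the norm and the ranking for a given matrix; it does not implement the paper's optimization or the tangent space alignment, and the matrix is hypothetical.

```python
import math

def row_norms(W):
    """Euclidean norm of each row of W (one row per gene, by assumption)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]

def l21_norm(W):
    """l2,1 norm: the sum of the row-wise Euclidean norms."""
    return sum(row_norms(W))

def rank_genes(W):
    """Gene indices ordered from largest to smallest row norm."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)

# Hypothetical learned weights: gene 1 has been driven entirely to zero
W = [
    [3.0, 4.0],
    [0.0, 0.0],
    [0.6, 0.8],
]
total = l21_norm(W)
ranking = rank_genes(W)
```

Genes whose rows shrink to zero contribute nothing to the regression and fall to the bottom of the ranking, which is what makes the penalty useful for selection.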