• Title/Summary/Keyword: gene information


Consensus Clustering for Time Course Gene Expression Microarray Data

  • Kim, Seo-Young;Bae, Jong-Sung
    • Communications for Statistical Applications and Methods / v.12 no.2 / pp.335-348 / 2005
  • The rapid development of microarray technologies has enabled the simultaneous monitoring of expression levels of thousands of genes. Recently, time course gene expression data have often been measured to study dynamic biological systems and gene regulatory networks. With such data, biologists attempt to group genes based on the temporal pattern of their expression levels. We apply the consensus clustering algorithm to time course gene expression data in order to infer statistically meaningful information from the measurements. We evaluate consensus clustering and existing clustering methods with various validation measures. In this paper, we consider hierarchical clustering and Diana among existing methods, and consensus clustering with hierarchical clustering, with Diana, and with mixed hierarchical and Diana methods, and evaluate their performance on a real microarray data set and two simulated data sets.
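
A minimal sketch of one building block commonly used in consensus clustering, the co-association (consensus) matrix: for each pair of genes, the fraction of clustering runs that place the pair in the same cluster. The abstract does not say how the authors combine runs, so the function name `consensus_matrix` and the toy label lists below are illustrative assumptions, not the paper's method.

```python
from itertools import combinations


def consensus_matrix(partitions):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    partitions in which items i and j share a cluster label."""
    n = len(partitions[0])
    if any(len(p) != n for p in partitions):
        raise ValueError("all partitions must label the same n items")
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        # Every item always co-occurs with itself.
        for i in range(n):
            counts[i][i] += 1
        # Count each unordered pair once and mirror it.
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    runs = len(partitions)
    return [[c / runs for c in row] for row in counts]


# Hypothetical labels for 4 genes from three clustering runs
# (for example hierarchical clustering, Diana, and a resampled run).
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same grouping as the first run, different label names
    [0, 0, 0, 1],
]
matrix = consensus_matrix(runs)
```

Because only co-membership is counted, the matrix does not depend on how each run names its clusters. A final partition is often obtained by clustering this matrix, for instance by treating one minus the consensus value as a distance.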

Normal Mixture Model with General Linear Regressive Restriction: Applied to Microarray Gene Clustering

  • Kim, Seung-Gu
    • Communications for Statistical Applications and Methods / v.14 no.1 / pp.205-213 / 2007
  • In this paper, a normal mixture model whose component means are subject to a general linear restriction based on linear regression is proposed, and a fitting method using the EM algorithm and Lagrange multipliers is provided. The model is applied to gene clustering of microarray expression data, where it performs very well on a real data set. The model also allows an analyst to obtain the clusters of interest by expressing the hypothesis on the component means through the design matrices and the linear restriction matrices.

Screening for Natural Bioactive Compounds Targeting the Intracellular Signal Transduction Pathway: Natural Products Modulating the Expression of the Interleukin-2 gene

  • Hakamatsuka, Takashi
    • Proceedings of the PSK Conference / 2003.10a / pp.60-61 / 2003
  • The Human Genome Project has recently been completed, and the nucleotide sequence of the whole human genome is now available from public or commercial data banks. The next goals are to identify the function of each gene and to elucidate the intracellular signal transduction pathways that regulate gene expression. We have established a PCR-based bioassay to search for biologically active compounds that can modulate the expression of genes encoding important proteins. (omitted)


Reverting Gene Expression Pattern of Cancer into Normal-Like Using Cycle-Consistent Adversarial Network

  • Lee, Chan-hee;Ahn, TaeJin
    • International Journal of Advanced Culture Technology / v.6 no.4 / pp.275-283 / 2018
  • Cancer shows a distinct pattern of gene expression compared to normal tissue, and this difference gives rise to the malignant characteristics of cancer. Many cancer drugs target this difference so that they can selectively kill cancer cells. One recent demand in personalized cancer treatment is to retrieve normal tissue from a patient so that the gene expression difference between cancer and normal tissue can be assessed. However, in most clinical situations it is hard to retrieve normal tissue from a patient, because a biopsy of normal tissue may damage organ function or expose the patient to a risk of infection or side effects. Thus, the challenge is to estimate the gene expression of the normal cells from which a cancer originated without taking an additional biopsy. In this paper, we propose an in-silico prediction of normal cell gene expression from the gene expression data of a tumor sample. We call this challenge reverting the cancer into normal, and we divide it into two parts. The first part is to build a generator that is able to fool a pretrained discriminator. The pretrained discriminator was trained on public data (9,601 cancers, 7,240 normals) and discriminates whether a given gene expression pattern is cancer or normal with an accuracy of 0.997. Deceiving this pretrained discriminator means that our method is capable of generating very normal-like gene expression data. The second part of the challenge is to address whether the generated normal is similar to the true reverse form of the input cancer data. We used cycle-consistent adversarial networks to approach these challenges, since this network is capable of translating one domain to the other while maintaining the original domain's features and at the same time adding the new domain's features. Our evaluation indicated that, when cancer data are put into a cycle-consistent adversarial network, the network could retain most of the information from the input (cancer) while changing the data into normal. We also evaluated whether the generated gene expression of normal tissue would be the biological reverse form of the gene expression of the cancer used as input.
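
For orientation, the cycle-consistency term in the standard cycle-consistent adversarial network formulation can be written as below. The abstract does not state the exact loss the authors used, so this is the commonly cited form rather than a description of their implementation; the symbols $G$ (cancer to normal), $F$ (normal to cancer), $x$ (a cancer expression profile) and $y$ (a normal expression profile) are assumptions made for illustration.

```latex
\mathcal{L}_{\mathrm{cyc}}(G, F)
  = \mathbb{E}_{x \sim p_{\mathrm{cancer}}}\!\left[ \lVert F(G(x)) - x \rVert_{1} \right]
  + \mathbb{E}_{y \sim p_{\mathrm{normal}}}\!\left[ \lVert G(F(y)) - y \rVert_{1} \right]
```

The first term penalizes a generated normal profile $G(x)$ that cannot be mapped back to the original cancer profile, which is what encourages the translation to retain information from the input.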

Cancer-Subtype Classification Based on Gene Expression Data (유전자 발현 데이터를 이용한 암의 유형 분류 기법)

  • Cho Ji-Hoon;Lee Dongkwon;Lee Min-Young;Lee In-Beum
    • Journal of Institute of Control, Robotics and Systems / v.10 no.12 / pp.1172-1180 / 2004
  • Recently, gene expression data, a product of high-throughput technology, have become widely available, and the related studies (so-called bioinformatics) have come to occupy an important position in biological and medical research. The microarray is a revolutionary technology that enables us to monitor several thousand genes simultaneously and thus to gain insight into phenomena in the human body (e.g. the mechanism of cancer progression) at the molecular level. To obtain useful information from such gene expression measurements, it is essential to analyze the data with appropriate techniques. However, the high dimensionality of the data can cause problems such as the curse of dimensionality and singularity in matrix computation, which makes it difficult to apply conventional data analysis methods. Therefore, developing methods that can treat such data effectively has become a challenging issue in computational biology. This research focuses on gene selection and classification for cancer subtype discrimination based on gene expression (microarray) data.

Identifying statistically significant gene sets based on differential expression and differential coexpression (특이발현과 특이공발현을 고려한 유의한 유전자 집단 탐색)

  • Lee, Sunho
    • The Korean Journal of Applied Statistics / v.29 no.3 / pp.437-448 / 2016
  • Gene set analysis utilizing biological information is expected to produce more interpretable results because the occurrence of tumors (or diseases) is believed to be associated with the regulation of related genes. Many methods have been developed to identify statistically significant gene sets across different phenotypes; however, most focus exclusively on either the differential gene expression or the differential correlation structure in the gene set. This research provides a new method that simultaneously considers the differential expression of genes and differential coexpression with multiple genes in the gene set. Application of the new method is illustrated with a real microarray data example, p53; subsequently, a simulation study compares its type I error rate and power with those of GSEA, SAMGS, GSCA and GSNCA.

General properties and phylogenetic utilities of nuclear ribosomal DNA and mitochondrial DNA commonly used in molecular systematics

  • Hwang, Ui-Wook;Kim, Won
    • Parasites, Hosts and Diseases / v.37 no.4 / pp.215-228 / 1999
  • Choosing one or more appropriate molecular markers or gene regions to resolve a particular systematic question among organisms at a certain categorical level is still a very difficult process. The primary goal of this review, therefore, is to provide theoretical information for choosing one or more molecular markers or gene regions by illustrating the general properties and phylogenetic utilities of nuclear ribosomal DNA (rDNA) and mitochondrial DNA (mtDNA), which have been most commonly used in phylogenetic research. Highly conserved molecular markers and/or gene regions are useful for investigating phylogenetic relationships at higher categorical levels (deep branches of evolutionary history). On the other hand, hypervariable molecular markers and/or gene regions are useful for elucidating phylogenetic relationships at lower categorical levels (recently diverged branches). In summary, different selective forces have led to the evolution of various molecular markers or gene regions with varying degrees of sequence conservation. Thus, appropriate molecular markers or gene regions should be chosen with even greater caution to deduce true phylogenetic relationships over a broad taxonomic spectrum.


Analysis for nucleotide sequence of the membrane protein gene of porcine epidemic diarrhea virus Chinju99

  • Baquilod, Greta Salvae V.;Yeo, Sang-Geon
    • Korean Journal of Veterinary Research / v.46 no.4 / pp.355-361 / 2006
  • The membrane (M) protein gene of porcine epidemic diarrhea virus (PEDV) strain Chinju99, which was previously isolated from piglets suffering from severe diarrhea, was characterized to establish its molecular information; the results will be useful in elucidating concepts related to the molecular pathogenesis and antigenic structures of PEDV isolates. The Chinju99 M gene, generated by reverse transcription and polymerase chain reaction (RT-PCR), consisted of 681 bases containing 22.3% adenine, 22.3% cytosine, 23.1% guanine and 32.3% thymine nucleotides, and the GC content was 45.4%. It had some nucleotide mismatches from the M genes of other PEDV strains, such as CV777, Br1/87, KPEDV-9, JMe2, JS2004-2 and LJB-03, with 97-99% nucleotide sequence homology to these strains. It also encoded a protein of 226 amino acids, which had some mismatches from those of CV777, Br1/87, KPEDV-9, JMe2, JS2004-2 and LJB-03, with 97-98% amino acid sequence homology to these strains. Chinju99 was very closely related to the Japanese strain JMe2 in both the nucleotide and amino acid sequences of the M gene. The amino acids predicted from the Chinju99 M gene consisted mostly of hydrophobic residues and contained three potential sites for asparagine (N)-linked glycosylation, two serine (S)-linked phosphorylation sites for protein kinase C, and two S- or threonine (T)-linked phosphorylation sites for casein kinase II.
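
The composition figures in this abstract can be cross-checked with simple arithmetic. The sketch below shows how base composition and GC content are computed and confirms that the reported numbers are mutually consistent. The helper names and the toy sequence are hypothetical, and the amino acid count assumes the 681 bases include a single stop codon, which the abstract does not state explicitly.

```python
def base_composition(seq):
    """Percentage of each base in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    total = len(seq)
    if total == 0:
        raise ValueError("empty sequence")
    return {base: 100.0 * seq.count(base) / total for base in "ACGT"}


def gc_content(seq):
    """GC content as a percentage: (G + C) / length * 100."""
    comp = base_composition(seq)
    return comp["G"] + comp["C"]


# Reported percentages: cytosine 22.3% plus guanine 23.1%.
reported_gc = round(22.3 + 23.1, 1)

# All four reported percentages should sum to 100%.
reported_total = round(22.3 + 22.3 + 23.1 + 32.3, 1)

# 681 bases make 227 codons; assuming one of them is a stop codon,
# 226 amino acids remain.
codons = 681 // 3
amino_acids = codons - 1

toy = "ATGCGT"  # hypothetical sequence, not taken from the paper
toy_gc = gc_content(toy)
```

Under these assumptions the reported GC content (45.4%), the total of the four base percentages (100.0%), and the protein length (226 amino acids) all agree with one another.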

Molecular Characterization of tgd057, a Novel Gene from Toxoplasma gondii

  • Wan, Kiew-Lian;Chang, Ti-Ling;Ajioka, James W.
    • BMB Reports / v.37 no.4 / pp.474-479 / 2004
  • The expressed sequence tag (EST) effort in Toxoplasma gondii has generated a substantial amount of gene information. To exploit this valuable resource, we chose to study tgd057, a novel gene identified by a large number of ESTs that otherwise show no significant match to known sequences in the database. Northern analysis showed that tgd057 is transcribed in the tachyzoite. The complete cDNA sequence of tgd057 is 1169 bp in length. Sequence analysis revealed that tgd057 possibly adopts two polyadenylation sites, utilizes the fourth in-frame ATG for translation initiation, and codes for a secretory protein. The longest open reading frame for the tgd057 gene was cloned and expressed as a recombinant protein (rd57) in Escherichia coli. Western analysis revealed that serum against rd57 recognized a molecule of ~21 kDa in the tachyzoite protein extract. This suggests that the tgd057 gene is expressed in vivo in the parasite.

Development of Gene Based STS Markers in Wheat

  • Lee, Sang-Kyu;Heo, Hwa-Young;Kwon, Young-Up;Lee, Byung-Moo
    • Korean Journal of Crop Science / v.57 no.1 / pp.71-77 / 2012
  • The objective of this study is to develop gene-based sequence tagged site (STS) markers in wheat. A euchromatin-enriched genomic (EEG) DNA library was constructed in wheat using the McrA and McrBC system in DH5α cells, and STS primer sets were designed from gene-based DNA sequences. A total of 2,166 EEG colonies were constructed by methylated DNA exclusion. Among them, 606 colonies with PCR products between 400 and 1200 bp in size were selected for sequencing. To develop the gene-based STS primers, BLAST analysis comparing wheat genetic information with the rice genome sequence was employed. The 227 STS primers mainly matched Triticum aestivum (hexaploid), Triticum turgidum (tetraploid), Aegilops (diploid), and other plants. Polymorphisms were detected in PCR products after digestion with restriction enzymes. Eight STS markers that showed 32 polymorphisms in twelve wheat genotypes were developed using the 227 STS primers. The STS primer analysis will be useful for generating informative molecular markers in wheat. Gene-based STS markers are developed to identify genetic function through cloning of the target gene and to find new alleles of the target trait.