• Title/Abstract/Keywords: Clustering genes

Search results: 140 items, processing time 0.025 s

An Application of the Clustering Threshold Gradient Descent Regularization Method for Selecting Genes in Predicting the Survival Time of Lung Carcinomas

  • Lee, Seung-Yeoun;Kim, Young-Chul
    • Genomics & Informatics
    • /
    • Vol. 5, No. 3
    • /
    • pp.95-101
    • /
    • 2007
  • In this paper, we consider the variable selection methods in the Cox model when a large number of gene expression levels are involved with survival time. Deciding which genes are associated with survival time has been a challenging problem because of the large number of genes and relatively small sample size (n<

CLUSTERING DNA MICROARRAY DATA BY STOCHASTIC ALGORITHM

  • Shon, Ho-Sun;Kim, Sun-Shin;Wang, Ling;Ryu, Keun-Ho
    • Korean Society of Remote Sensing: Conference Proceedings
    • /
    • Korean Society of Remote Sensing, 2007, Proceedings of ISRS 2007
    • /
    • pp.438-441
    • /
    • 2007
  • Recent advances in molecular biology and engineering have made it possible to use DNA microarrays to observe thousands of genes and their variation in tissue samples from living organisms. With DNA microarrays, genes with similar expression patterns can be grouped, and the progression and variation of genes can be tracked. This paper applies cluster analysis, which aims to discover biological subgroups or classes from gene expression information. The goal is to predict new, previously unknown classes; publicly available leukaemia data are used for the experiment, and the MCL (Markov CLustering) algorithm is applied as the analysis method. The MCL algorithm is based on probability and graph flow theory: it simulates random walks on a graph, using Markov matrices to determine the transition probabilities among the nodes. In detail, pairwise distances are first computed with the Euclidean distance and MCL is applied to them; next, the inflation and diagonal factors, which are tuning parameters, are adjusted; finally, a threshold equal to the average of each column is used to distinguish one class from another. Using this column-average threshold improved the accuracy of our method. Our experiments show about 70% accuracy on average with respect to the previously known classes. For comparison, the proposed method was also evaluated against the SOM (Self-Organizing Map) clustering algorithm, a neural-network method, and against hierarchical clustering. The method gave better results than hierarchical clustering. Further study should examine whether the inflation parameter obtained in our experiment gives similar results on other gene expression data. We are also working on a systematic method for improving accuracy by regulating the factors mentioned above.
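The MCL procedure outlined in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the similarity transform `1 / (1 + distance)`, the reading of the "diagonal factor" as a self-loop weight (`self_loop`), the function names, and the toy points are all assumptions made for illustration. Only the overall steps (Euclidean distance, expansion and inflation, column-average threshold) come from the abstract.

```python
import math

def normalize_columns(m):
    """Rescale every column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def mcl_clusters(points, inflation=2.0, self_loop=1.0, iterations=50):
    """Cluster points with a basic Markov clustering loop (illustrative only)."""
    n = len(points)
    # Assumption: turn Euclidean distances into similarities via 1 / (1 + d).
    sim = [[1.0 / (1.0 + math.dist(p, q)) for q in points] for p in points]
    # Assumption: the "diagonal factor" is modelled as a self-loop weight.
    for i in range(n):
        sim[i][i] = self_loop
    m = normalize_columns(sim)
    for _ in range(iterations):
        # Expansion: square the matrix (two random-walk steps).
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power, then renormalize the columns.
        m = normalize_columns([[x ** inflation for x in row] for row in m])
    # Column-average threshold: node j joins row i if m[i][j] exceeds
    # the mean of column j.
    col_mean = [sum(m[i][j] for i in range(n)) / n for j in range(n)]
    clusters = set()
    for i in range(n):
        members = tuple(j for j in range(n) if m[i][j] > col_mean[j])
        if members:
            clusters.add(members)
    return sorted(list(c) for c in clusters)

# Two well-separated toy groups (hypothetical data, not the leukaemia set).
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(mcl_clusters(toy_points))
```

Larger `inflation` values tend to produce more, smaller clusters, which is why the abstract treats inflation as a parameter to tune.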


Clustering of Time-Course Microarray Data Using Pharmacokinetic Parameter

  • 이효정;김별아;박미라
    • The Korean Journal of Applied Statistics (응용통계연구)
    • /
    • Vol. 24, No. 4
    • /
    • pp.623-631
    • /
    • 2011
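  • One of the main goals of time-course microarray data analysis is to find groups of genes based on their expression patterns by taking into account how expression levels change over time, and various clustering algorithms have been proposed for this purpose. In this study, we propose a clustering approach for time-course microarray data based on the pharmacokinetic parameter values used in pharmacokinetic trials that assess bioequivalence between two drug formulations, and we examine its usefulness by applying it to real and simulated data.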
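As a rough illustration of the idea, each gene's time course can be summarized by pharmacokinetic-style parameters and the genes then compared in that feature space. The abstract does not say which parameters the authors use; the three below (AUC by the trapezoidal rule, Cmax, Tmax) are the ones commonly reported in bioequivalence studies and are an assumption here, as are the function names and the toy profile.

```python
import math

def pk_features(times, levels):
    """Summarize one expression time course as (AUC, Cmax, Tmax).

    Assumption: these three parameters stand in for the unspecified
    pharmacokinetic parameters mentioned in the abstract.
    """
    # Area under the curve by the trapezoidal rule.
    auc = 0.0
    for k in range(1, len(times)):
        auc += (times[k] - times[k - 1]) * (levels[k] + levels[k - 1]) / 2.0
    # Peak expression level and the time at which it first occurs.
    cmax = max(levels)
    tmax = times[levels.index(cmax)]
    return (auc, cmax, tmax)

def feature_distance(a, b):
    """Euclidean distance between two feature tuples."""
    return math.dist(a, b)

# Hypothetical gene profile measured at 0, 1, 2 and 4 hours.
gene_a = pk_features([0, 1, 2, 4], [0, 2, 4, 0])
print(gene_a)
```

The resulting feature tuples could be passed to any standard clustering routine; in practice the features would likely need scaling first, since AUC, Cmax, and Tmax are in different units.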

Trends in Genomics & Informatics: a statistical review of publications from 2003 to 2018 focusing on the most-studied genes and document clusters

  • Kim, Ji-Hyeon;Nam, Hee-Jo;Park, Hyun-Seok
    • Genomics & Informatics
    • /
    • Vol. 17, No. 3
    • /
    • pp.25.1-25.6
    • /
    • 2019
  • Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. Herein, we conduct a statistical analysis of the publications of Genomics & Informatics over the 16 years since its inception, with a particular focus on issues relating to article categories, word clouds, and the most-studied genes, drawing on recent reviews of the use of word frequencies in journal articles. Trends in the studies published in Genomics & Informatics are discussed both individually and collectively.

Genetic Diversity of Metallo-β-lactamase Genes of Chryseobacterium indologenes Isolates from Korea

  • Yum, Jong Hwa
    • Biomedical Science Letters (대한의생명과학회지)
    • /
    • Vol. 25, No. 3
    • /
    • pp.275-281
    • /
    • 2019
  • This study was performed to characterize the chromosomal metallo-β-lactamases (MBLs) of Chryseobacterium indologenes isolated in Korea and to propose a clustering method for IND MBLs based on their amino acid similarities. Chromosomal MBL genes were amplified by PCR from 31 clinical isolates of C. indologenes. The PCR products were sequenced by the dideoxy chain termination method. Antimicrobial susceptibilities were determined by the agar dilution method. PCR experiments showed that all 31 C. indologenes isolates contained $bla_{IND}$ genes. DNA sequence analysis revealed that the C. indologenes isolates possessed ten types of $bla_{IND}$ gene, including seven novel variants ($bla_{IND-8}$ to $bla_{IND-14}$). The most common MBL type was IND-2 (n = 18). Minimum inhibitory concentrations of imipenem and meropenem for the isolates harboring novel IND MBLs were ≥16 μg/mL. IND MBLs were grouped into three clusters based on amino acid similarities.

An EST survey of genes expressed in liver of rock bream(Oplegnathus fasciatus) with particular interests on the stress-responsive and immune-related genes

  • Park, Byul-Nim;Park, Ji-Eun;Kim, Ki-Hong;Kim, Dong-Soo;Nam, Yoon-Kwon
    • Korean Aquaculture Society: Conference Proceedings
    • /
    • Korean Aquaculture Society, 2003 Autumn Conference, Abstracts
    • /
    • pp.43-43
    • /
    • 2003
  • EST analysis was performed to identify stress-responsive and immune-related genes from rock bream (Oplegnathus fasciatus). cDNA libraries were constructed from liver, and 624 randomly chosen clones were subjected to automated sequence analysis. Of the 624 clones sequenced, approximately 15% of the ESTs were novel sequences (no match in GenBank) or sequences with high homology to hypothetical/unknown genes. Bioinformatic sequence analysis was carried out, including functional clustering, homology grouping, contig assembly with electronic northern, and organism matches. Several potential stress-responsive biomarker and/or immune-related genes were identified in all the tissues examined. These included lectins, ferritins, CP450, proteinase, proteinase inhibitors, anti-oxidant enzymes, various heat-shock proteins, warm temperature acclimation protein, complements, methyltransferase, zinc finger proteins, lysozymes, macrophage maturation associated protein, and others. This information provides fundamental baseline data for understanding the molecular mechanisms involved in the host defense and immune systems of this species.


CONSTRUCTING GENE REGULATORY NETWORK USING FREQUENT GENE EXPRESSION PATTERN MINING AND CHAIN RULES

  • Park, Hong-Kyu;Lee, Heon-Gyu;Cho, Kyung-Hwan;Ryu, Keun-Ho
    • Korean Society of Remote Sensing: Conference Proceedings
    • /
    • Korean Society of Remote Sensing, 2006, Proceedings of ISRS 2006 PORSEC Volume II
    • /
    • pp.623-626
    • /
    • 2006
  • Groups of genes control the functioning of a cell through complex interactions. These interacting gene groups are called Gene Regulatory Networks (GRNs). Two earlier data mining approaches, clustering and classification, have been used to analyze gene expression data. While these tools are useful for determining the membership of genes by homology, they do not identify the regulatory relationships among genes found in the same class of molecular actions. We therefore need to understand how genes relate to and regulate one another. To detect regulatory relationships among genes from time-series microarray data, we propose a novel approach using frequent pattern mining and chain rules. In this approach, we propose a method for transforming gene expression data into a form suitable for frequent pattern mining, and we detect gene expression patterns by applying the FP-growth algorithm. We then construct a gene regulatory network from the frequent gene patterns using chain rules. Finally, we validate the proposed method by showing that our experimental results are consistent with published results.
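The transformation and pattern-detection steps can be sketched on toy data. This is an illustration under stated assumptions, not the authors' method: the encoding of each time step as a transaction of up ("+") and down ("-") items is a guess at what "transforming gene expression data" might look like, and a brute-force pair count is used in place of FP-growth to keep the example short. The chain-rule step is not shown. All names and data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def to_transactions(expression, threshold=0.0):
    """Turn time series into one transaction per time step.

    Assumption: a gene contributes 'gene+' when its level rises by more
    than `threshold` between consecutive time points, 'gene-' when it falls.
    """
    genes = sorted(expression)
    n_steps = len(next(iter(expression.values()))) - 1
    transactions = []
    for t in range(n_steps):
        items = set()
        for g in genes:
            change = expression[g][t + 1] - expression[g][t]
            if change > threshold:
                items.add(g + "+")
            elif change < -threshold:
                items.add(g + "-")
        transactions.append(items)
    return transactions

def frequent_pairs(expression, min_support=2, threshold=0.0):
    """Count item pairs that co-occur in at least `min_support` transactions.

    Brute-force enumeration stands in for FP-growth here.
    """
    counts = Counter()
    for items in to_transactions(expression, threshold):
        counts.update(combinations(sorted(items), 2))
    return {pair: c for pair, c in counts.items() if c >= min_support}

# Hypothetical expression levels for three genes at four time points.
toy = {"g1": [1, 2, 3, 2], "g2": [1, 2, 3, 1], "g3": [3, 2, 1, 2]}
print(frequent_pairs(toy))
```

On the toy data, `g1` and `g2` rise together while `g3` falls, so pairs such as `("g1+", "g3-")` are the kind of co-occurrence pattern a regulatory link could later be built from.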


Genome-wide Identification, Classification, and Expression Analysis of the Receptor-Like Protein Family in Tomato

  • Kang, Won-Hee;Yeom, Seon-In
    • The Plant Pathology Journal
    • /
    • Vol. 34, No. 5
    • /
    • pp.435-444
    • /
    • 2018
  • Receptor-like proteins (RLPs) are involved in plant development and disease resistance. Only some of the RLPs in tomato (Solanum lycopersicum L.) have been functionally characterized, although 176 genes encoding RLPs have been identified in the tomato genome. To further understand the role of RLPs in tomato, we performed genome-guided classification and transcriptome analysis of these genes. Phylogenetic comparisons revealed that the tomato RLP members could be divided into eight subgroups and that the genes evolved independently compared to similar genes in Arabidopsis. Based on location and physical clustering analyses, we conclude that tomato RLPs likely expanded primarily through tandem duplication events. According to tissue-specific RNA-seq data, 71 RLPs were expressed in at least one of the following tissues: root, leaf, bud, flower, or fruit. Several genes had tissue-specific expression patterns. In addition, tomato RLP expression profiles after infection with different pathogens showed distinct gene regulation according to disease induction and resistance response, as well as infection by bacteria and viruses. Notably, some RLPs were highly and/or uniquely expressed in tomato susceptible to the pathogen, suggesting that these RLPs could be involved in disease response, possibly as host-susceptibility factors. Our study could provide important clues for further investigations into the function of tomato RLPs involved in development and response to pathogens.

Functional Gene Analysis to Identify Potential Markers Induced by Benzene in Two Different Cell Lines, HepG2 and HL-60

  • Kim, Youn-Jung;Song, Mi-Kyung;Sarma, Sailendra Nath;Choi, Han-Saem;Ryu, Jae-Chun
    • Molecular & Cellular Toxicology
    • /
    • Vol. 4, No. 3
    • /
    • pp.183-191
    • /
    • 2008
  • Volatile organic compounds (VOCs) are common constituents of cleaning and degreasing agents, paints, pesticides, personal care products, gasoline, and solvents. VOCs evaporate at room temperature, and most of them exhibit acute and chronic toxicity to humans. Benzene is the most widely used prototypical VOC, yet its toxic mechanisms are still unclear. The multi-step process of a toxic mechanism can be more fully understood by characterizing the gene expression changes that toxicants induce in cells. In this study, DNA microarrays were used to monitor gene expression levels in HepG2 and HL-60 cells exposed to benzene at the IC20 and IC50 doses, respectively. In the clustering analysis of gene expression profiles, the benzene-treated HepG2 and HL-60 cells formed separate clusters, but many genes showed similar expression patterns. We identified 916 up-regulated and 1,144 down-regulated genes in HepG2 cells, and 1,002 up-regulated and 919 down-regulated genes in HL-60 cells. Gene ontology analysis was performed separately on the genes whose expression benzene altered in HepG2 and in HL-60 cells. We found several principal pathways: focal adhesion, gap junction, and signaling pathway in HepG2 cells, and toll-like receptor signaling pathway, MAPK signaling pathway, p53 signaling pathway, and neuroactive ligand-receptor interaction in HL-60 cells. We also found 30 genes commonly expressed in the two cell lines, 16 up-regulated and 14 down-regulated, that belong to the same biological processes, such as inflammatory response, cell cycle arrest, cell migration, transmission of nerve impulse, and cell motility. In conclusion, we suggest that this study is meaningful because these genes can be regarded as strong potential biomarkers of benzene, independent of cell type.

Differentially Expressed Genes in Metastatic Advanced Egyptian Bladder Cancer

  • Zekri, Abdel-Rahman N;Hassan, Zeinab Korany;Bahnassy, Abeer A;Khaled, Hussein M;El-Rouby, Mahmoud N;Haggag, Rasha M;Abu-Taleb, Fouad M
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 16, No. 8
    • /
    • pp.3543-3549
    • /
    • 2015
  • Background: Bladder cancer is one of the most common cancers worldwide. Gene expression profiling using microarray technologies improves the understanding of cancer biology. The aim of this study was to determine the gene expression profile in Egyptian bladder cancer patients. Materials and Methods: Samples from 29 human bladder cancers and adjacent non-neoplastic tissues were analyzed by cDNA microarray, with hierarchical clustering and multidimensional analysis. Results: Five hundred and sixteen genes were differentially expressed, of which the SOS1, HDAC2, PLXNC1, GTSE1, ULK2, IRS2, ABCA12, TOP3A, HES1, and SRP68 genes were involved in 33 different pathways. The most frequently detected genes were SOS1 in 20 different pathways, HDAC2 in 5 different pathways, and IRS2 in 3 different pathways. There were 388 down-regulated genes. PLCB2 was involved in 11 different pathways, MDM2 in 9 pathways, FZD4 in 5 pathways, p15 and FGF12 in 4 pathways, POLE2 in 3 pathways, and MCM4 and POLR2E in 2 pathways. Thirty genes showed significant differences between transitional cell cancer (TCC) and squamous cell cancer (SCC) samples. Unsupervised cluster analysis of DNA microarray data revealed a clear distinction between low and high grade tumors. In addition, 26 genes showed significant differences between low and high tumor stages, including fragile histidine triad, Ras, and sialyltransferase 8 (alpha), and 16 genes showed significant differences between low and high tumor grades, such as methionine adenosyl transferase II, beta. Conclusions: The present study identified some genes that can be used as molecular biomarkers or target genes in Egyptian bladder cancer patients.