• Title/Summary/Keyword: gene information

Search Result 1,645, Processing Time 0.028 seconds

Recent Advances in the Clinical Application of Next-Generation Sequencing

  • Ki, Chang-Seok
    • Pediatric Gastroenterology, Hepatology & Nutrition / v.24 no.1 / pp.1-6 / 2021
  • Next-generation sequencing (NGS) technologies have changed the process of genetic diagnosis from a gene-by-gene approach to syndrome-based diagnostic gene panel sequencing (DPS), diagnostic exome sequencing (DES), and diagnostic genome sequencing (DGS). A priori information on the causative genes that might underlie a genetic condition is a prerequisite for genetic diagnosis before conducting clinical NGS tests. Theoretically, DPS, DES, and DGS do not require any information on specific candidate genes. Therefore, clinical NGS tests sometimes detect disease-related pathogenic variants in genes underlying conditions different from the initial diagnosis. These clinical NGS tests are expensive, but they can be a cost-effective approach for the rapid diagnosis of rare disorders with genetic heterogeneity, such as glycogen storage disease, familial intrahepatic cholestasis, lysosomal storage disease, and primary immunodeficiency. In addition, DES or DGS may find novel genes that were not previously linked to human diseases.

Disease related Gene Identification Using Literature and Google data (Identification of Disease-Related Genes Using Text Mining Techniques and Google Data)

  • Kim, Jeong-U;Kim, Hyeon-Jin;Park, Sang-Hyeon
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1084-1087 / 2013
  • Text mining is one of the tools used in the biosciences. In this paper, we used text mining to build a gene network for finding disease genes related to prostate cancer. In addition, we used Google searches to add new weights to the edges between gene nodes in the network and reconstructed the network. From the resulting network, we extracted disease genes related to prostate cancer based on the weights between nodes. The proposed method successfully built the network and found disease genes, and it demonstrated higher accuracy than building the network without Google data.
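
As a rough illustration of the pipeline this abstract describes, the sketch below builds a sentence-level co-occurrence network of genes and then adds a second, externally derived weight to each edge before ranking genes. The example sentences, the gene list, the `hit_counts` values, and the log-scaled weighting with `alpha` are all hypothetical choices made for this sketch; the abstract does not state how the Google-derived weights were computed.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_edges(sentences, genes):
    """Count how often each pair of genes is mentioned in the same sentence."""
    edges = Counter()
    for sentence in sentences:
        cleaned = sentence.upper().replace(",", " ").replace(".", " ")
        tokens = set(cleaned.split())
        present = sorted(g for g in genes if g in tokens)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


def reweight(edges, hit_counts, alpha=0.5):
    """Add a log-scaled external weight to each edge (assumed formula)."""
    return {
        pair: weight + alpha * math.log1p(hit_counts.get(pair, 0))
        for pair, weight in edges.items()
    }


def rank_genes(weighted_edges):
    """Rank genes by the total weight of the edges they take part in."""
    score = Counter()
    for (a, b), weight in weighted_edges.items():
        score[a] += weight
        score[b] += weight
    return [gene for gene, _ in score.most_common()]


# Toy input: hypothetical sentences and hypothetical search hit counts.
sentences = [
    "AR and PTEN are altered in prostate cancer.",
    "PTEN loss cooperates with TP53.",
    "AR signalling interacts with PTEN, TP53.",
]
genes = {"AR", "PTEN", "TP53"}
hit_counts = {("AR", "PTEN"): 1200, ("PTEN", "TP53"): 300}

edges = cooccurrence_edges(sentences, genes)
ranking = rank_genes(reweight(edges, hit_counts))
print(ranking)
```

Summing edge weights per node is only one possible scoring rule; a real system would also need proper gene-name recognition rather than whitespace tokenization.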

A Eukaryotic Gene Structure Prediction Program Using Duration HMM (Development of a Eukaryotic Gene Prediction Program Using Duration HMM)

  • Tae, Hong-Seok;Park, Gi-Jeong
    • Korean Journal of Microbiology / v.39 no.4 / pp.207-215 / 2003
  • Gene structure prediction, which predicts protein-coding regions in a given nucleotide sequence, is the most important process in annotating genes and greatly affects gene analysis and genome annotation. As eukaryotic genes have more complicated structures in DNA sequences than prokaryotic genes, analysis programs for eukaryotic gene structure prediction use more diverse and more complicated computational models. We have developed EGSP, a eukaryotic gene structure prediction program, using a duration hidden Markov model. The program consists of two major processes: a training process that produces parameter values from training data sets, and a prediction process that predicts protein-coding regions based on those parameter values. The program predicts multiple genes rather than a single gene from a DNA sequence. A few computational models were implemented to detect signal patterns, and their scanning efficiency was tested. Prediction performance was calculated and compared with that of a few commonly used programs, GenScan, GeneID, and Morgan, based on a few criteria. The results show that the program can be used in practice both as a stand-alone program and as a module in a larger system. For gene prediction in eukaryotic microbial genomes, training and prediction analyses were done with Saccharomyces chromosomes, and the results show that the program is practically applicable to real eukaryotic microbial genomes.
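
To make the idea of a duration hidden Markov model concrete, here is a minimal sketch of Viterbi decoding in which each state emits a whole segment whose length is drawn from an explicit duration distribution. This is not EGSP: the two states (`N` for non-coding, `C` for coding), the emission and duration probabilities, and the toy sequence are assumptions chosen only to keep the example small.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    return math.log(p) if p > 0 else NEG_INF


def segment_viterbi(seq, states, start, trans, emit, dur):
    """Most probable segmentation of seq under an explicit-duration HMM.

    Returns a list of (state, start_index, end_index_exclusive) tuples.
    """
    n = len(seq)
    # best[t][s]: best log-probability of seq[:t] with a segment of state s ending at t
    best = [dict.fromkeys(states, NEG_INF) for _ in range(n + 1)]
    back = [dict.fromkeys(states) for _ in range(n + 1)]
    for t in range(1, n + 1):
        for s in states:
            emit_sum = 0.0
            for d in range(1, min(t, max(dur[s])) + 1):
                # Extend the candidate segment one base to the left.
                emit_sum += _log(emit[s][seq[t - d]])
                seg = emit_sum + _log(dur[s].get(d, 0.0))
                if d == t:
                    # The segment covers the whole prefix, so it is the first one.
                    candidates = [(_log(start[s]) + seg, None)]
                else:
                    candidates = [
                        (best[t - d][p] + _log(trans[p].get(s, 0.0)) + seg, p)
                        for p in states
                        if p != s
                    ]
                for score, prev in candidates:
                    if score > best[t][s]:
                        best[t][s] = score
                        back[t][s] = (d, prev)
    state = max(states, key=lambda s: best[n][s])
    if best[n][state] == NEG_INF:
        raise ValueError("no valid segmentation for this sequence")
    t, segments = n, []
    while t > 0:
        d, prev = back[t][state]
        segments.append((state, t - d, t))
        t, state = t - d, prev
    return segments[::-1]


# Toy model (all numbers are assumptions): AT-rich non-coding, GC-rich coding.
states = ["N", "C"]
start = {"N": 0.5, "C": 0.5}
trans = {"N": {"C": 1.0}, "C": {"N": 1.0}}
emit = {
    "N": {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1},
    "C": {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4},
}
dur = {s: {d: 1 / 8 for d in range(1, 9)} for s in states}

segments = segment_viterbi("ATATTAGCGCGGCCTATA", states, start, trans, emit, dur)
print(segments)
```

A real gene finder would use many more states (exons in each reading frame, introns, splice-site signals) and duration distributions estimated from training data, but the recursion has the same shape.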

New Approach to Predict microRNA Gene by using data Compression technique

  • Kim, Dae-Won;Yang, Joshua SungWoo;Kim, Pan-Jun;Chu, In-Sun;Jeong, Ha-Woong;Park, Hong-Seog
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.361-365 / 2005
  • Over the past few years, the complex and subtle roles of microRNA (miRNA) in gene regulation have been increasingly appreciated. Computational approaches have played an important role in identifying miRNAs from plants and animals, as well as in predicting their putative gene targets. We present a new approach that combines a comprehensive analysis of evolutionarily conserved element scores with a data compression technique to detect putative miRNA genes. We used the evolutionarily conserved elements [19] (see the methods and materials for more detail) to calculate scores base by base along the candidate pre-miRNA gene region by detecting common conserved patterns in the target sequence. We applied the data compression technique [20] to detect unknown miRNA genes. This zipping method provides, without loss of generality with respect to the nature of the character strings, a way to measure the similarity between the strings under consideration [20]. Using our new computational method for miRNA gene identification (or miRNA gene prediction), we were able to find 28 putative miRNA genes.
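
The compression-based similarity idea can be sketched with a general-purpose compressor. The example below uses the normalized compression distance (NCD) computed with `zlib`, which is a related, widely used measure and not necessarily the exact zipping procedure of reference [20]. The sequences are randomly generated stand-ins, not real pre-miRNA sequences.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data at maximum compression."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: small when x and y share structure."""
    bx, by = x.encode(), y.encode()
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Synthetic stand-ins for a known sequence, a close variant, and an unrelated one.
rng = random.Random(0)
known = "".join(rng.choices("ACGU", k=300))
variant_bases = list(known)
for i in range(0, 300, 60):
    # Introduce a point substitution every 60 bases.
    variant_bases[i] = "A" if variant_bases[i] != "A" else "C"
variant = "".join(variant_bases)
unrelated = "".join(rng.choices("ACGU", k=300))

print("known vs variant:  ", round(ncd(known, variant), 3))
print("known vs unrelated:", round(ncd(known, unrelated), 3))
```

A candidate region whose distance to known miRNA precursors is small would be flagged for closer inspection; the threshold for "small" would have to be calibrated on real data.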

Genotoxicity and Identification of Differentially Expressed Genes of Formaldehyde in human Jurkat Cells

  • Kim, Youn-Jung;Kim, Mi-Soon;Ryu, Jae-Chun
    • Molecular & Cellular Toxicology / v.1 no.4 / pp.230-236 / 2005
  • Formaldehyde is a common environmental contaminant found in tobacco smoke, paint, garments, diesel and exhaust, and medical and industrial products. Formaldehyde is considered potentially carcinogenic, making it a subject of major environmental concern. However, little is known about the mechanism of immunological sensitization and asthma caused by this compound. We therefore used the Jurkat cell line, a human T lymphocyte line, to assess the induction of DNA damage and to identify differentially expressed genes (DEGs) related to the immune response or toxicity induced by formaldehyde. In this study, we investigated the induction of DNA single-strand breaks by formaldehyde using the single cell gel electrophoresis assay (comet assay). We also compared gene expression between control and formaldehyde-treated cells to identify genes that are specifically or predominantly expressed, employing the annealing control primer (ACP)-based GeneFishing™ method. The cytotoxicity ($IC_{30}$) of formaldehyde was determined to be above 0.65 mM in Jurkat cells after 48 h of treatment. Based on the $IC_{30}$ value from the cytotoxicity test, we performed the comet assay at this concentration. At 0.65 mM, formaldehyde did not induce significant DNA damage in the absence of the S-9 metabolic activation system. One DEG for formaldehyde was identified as zinc finger protein 292 using the GeneFishing™ method. Through further investigation, we will identify more meaningful and useful DEGs for formaldehyde and thereby obtain information on the mechanisms and pathways associated with the immune response or other toxicity caused by formaldehyde exposure.

Searching for Optimal Ensemble of Feature-classifier Pairs in Gene Expression Profile using Genetic Algorithm (Searching for the Optimal Ensemble of Feature-Classifier Pairs on Gene Expression Data Using a Genetic Algorithm)

  • Park, Chan-Ho;Cho, Sung-Bae
    • Journal of KIISE:Software and Applications / v.31 no.4 / pp.525-536 / 2004
  • A gene expression profile is numerical data on the gene expression levels of an organism, measured on a microarray. Generally, each specific tissue shows different expression levels in related genes, so diseases can be classified using gene expression profiles. Because not all genes are related to a disease, it is necessary to select the related genes, which is called feature selection, and to classify properly using the selected genes. This paper proposes a genetic algorithm (GA)-based method for searching for the optimal ensemble of feature-classifier pairs, composed of seven feature selection methods based on correlation, similarity, and information theory, and six representative classifiers. In experiments with leave-one-out cross-validation on two cancer-related gene expression profiles, we found ensembles that perform much better than all individual feature-classifier pairs on the Lymphoma dataset and the Colon dataset.
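
The search described in this abstract can be sketched as a genetic algorithm over bit masks, where each bit switches one feature-classifier pair in or out of a majority-vote ensemble. In the sketch below the 42 pairs (7 feature selectors times 6 classifiers) are represented only by simulated per-sample predictions; the GA settings (tournament selection, one-point crossover, bit-flip mutation, elitism) and all numeric parameters are assumptions, not the paper's configuration.

```python
import random


def majority_vote_accuracy(mask, predictions, labels):
    """Accuracy of a majority vote over the pairs selected by mask (labels are 0/1)."""
    chosen = [p for bit, p in zip(mask, predictions) if bit]
    if not chosen:
        return 0.0
    correct = 0
    for i, label in enumerate(labels):
        votes = sum(p[i] for p in chosen)
        vote = 1 if 2 * votes > len(chosen) else 0  # ties go to class 0
        correct += vote == label
    return correct / len(labels)


def genetic_search(predictions, labels, pop_size=30, generations=40, p_mut=0.02, seed=0):
    """Search for a subset of pairs that maximizes majority-vote accuracy."""
    rng = random.Random(seed)
    n = len(predictions)
    cache = {}

    def fitness(mask):
        if mask not in cache:
            cache[mask] = majority_vote_accuracy(mask, predictions, labels)
        return cache[mask]

    pop = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        new_pop = [best]  # elitism: the best mask always survives
        while len(new_pop) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = tuple(bit ^ (rng.random() < p_mut) for bit in child)  # mutation
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=fitness)
    return best, fitness(best)


# Simulated data: 40 samples, 42 pairs, each pair right 60-80% of the time.
data_rng = random.Random(1)
labels = [data_rng.randint(0, 1) for _ in range(40)]
predictions = []
for _ in range(42):
    acc = data_rng.uniform(0.6, 0.8)
    predictions.append([y if data_rng.random() < acc else 1 - y for y in labels])

best_mask, best_fitness = genetic_search(predictions, labels)
print(sum(best_mask), "pairs selected, vote accuracy", best_fitness)
```

In a real experiment the prediction vectors would come from leave-one-out cross-validation of each pair, and the selected ensemble should be validated on data not used during the search to avoid an optimistic estimate.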

KUGI: A Database and Search System for Korean Unigene and Pathway Information

  • Yang, Jin-Ok;Hahn, Yoon-Soo;Kim, Nam-Soon;Yu, Ung-Sik;Woo, Hyun-Goo;Chu, In-Sun;Kim, Yong-Sung;Yoo, Hyang-Sook;Kim, Sang-Soo
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.407-411 / 2005
  • The KUGI (Korean UniGene Information) database contains annotation information for cDNA sequences obtained from samples of diseases prevalent in Koreans. A total of about 157,000 5'-EST high-throughput sequences collected from cDNA libraries of stomach, liver, and some cancer tissues or established cell lines from Korean patients were clustered into about 35,000 contigs. From each cluster, a representative clone having the longest high-quality sequence or the start codon was selected. We stored the sequences of the representative clones and the clustered contigs in the KUGI database together with information obtained by running BLAST against the RefSeq, human mRNA, and UniGene databases from NCBI. We provide a web-based search engine for the KUGI database with two types of user interface: attribute-based search and sequence similarity search. For attribute-based search we use DBMS technology, while for similarity search we use BLAST, which supports various similarity search options. The search system allows not only multiple queries but also various query types. The results include: 1) information on clones and libraries, 2) accession keys, location on the genome, gene ontology, and pathways to public databases, 3) links to external programs, and 4) sequence information for contigs and the 5'-ends of clones. We believe that the KUGI database and search system may provide very useful information for studies elucidating the causes of diseases that are prevalent in Koreans.

Possibility of the Use of Public Microarray Database for Identifying Significant Genes Associated with Oral Squamous Cell Carcinoma

  • Kim, Ki-Yeol;Cha, In-Ho
    • Genomics & Informatics / v.10 no.1 / pp.23-32 / 2012
  • Many studies have attempted to identify expression changes in oral squamous cell carcinoma. Most of them include too few samples to apply statistical methods for detecting significant gene sets. This study combined two small microarray datasets from a public database and identified significant genes associated with the progression of oral squamous cell carcinoma. The two datasets had different expression scales even though they were generated on the same platform, Affymetrix U133A gene chips. We discretized the gene expression values of the two datasets, adjusting for the differences between them, to obtain more reliable information. From the combination of the two datasets, we detected 51 significant genes that were upregulated in oral squamous cell carcinoma. Most of them had been reported in previous studies as cancer-related genes. From these selected genes, significant genetic pathways associated with expression changes were identified. By combining several datasets from a public database, enough samples can be obtained to detect reliable information. Most of the selected genes were known cancer-related genes, including genes related to oral squamous cell carcinoma. Several unknown genes can be biologically evaluated in further studies.
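
One way to see why discretization helps when pooling datasets with different expression scales is the sketch below, which maps each gene's values to -1, 0, or 1 relative to that dataset's own mean and standard deviation before combining. The abstract does not give the actual discretization rule, so the z-score threshold, the function name, and the toy values are all assumptions.

```python
from statistics import mean, pstdev


def discretize(values, threshold=0.5):
    """Map expression values to -1 (low), 0 (middle), or 1 (high) within one dataset."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0] * len(values)
    levels = []
    for v in values:
        z = (v - mu) / sd
        levels.append(1 if z > threshold else -1 if z < -threshold else 0)
    return levels


# One gene measured in two hypothetical datasets on different scales.
# In each dataset the first two samples are normal and the last two are tumor.
dataset_a = [100.0, 110.0, 300.0, 320.0]  # raw intensity scale
dataset_b = [6.6, 6.8, 8.2, 8.4]          # log-like scale

levels_a = discretize(dataset_a)
levels_b = discretize(dataset_b)
combined = levels_a + levels_b  # pooled samples are now on a common scale
print(levels_a, levels_b)
```

After this step the pooled samples can be tested together, for example by comparing the proportion of "high" calls between tumor and normal groups.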

A comparison study of classification method based on SVM and data depth in microarray data (A Comparative Study of Classification Methods Using Support Vector Machines and Data Depth in Microarray Data)

  • Hwang, Jin-Soo;Kim, Jee-Yun
    • Journal of the Korean Data and Information Science Society / v.20 no.2 / pp.311-319 / 2009
  • A robust L1 data depth was used for clustering and classification in the methods called DDclus and DDclass by Jornsten (2004). SVM-based classification works well in most situations but shows some weakness in the presence of outliers. Proper gene selection is important in classification because there are so many redundant genes. Either selecting appropriate genes or combining gene clustering with a classification method enhances the overall performance of classification. The performance of the depth-based method is evaluated against several SVM-based classification methods.
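
A minimal sketch of a depth-based classifier follows: it computes the L1 (spatial) depth of a new sample with respect to each class, in the commonly used form of one minus the norm of the average unit vector pointing from the sample to the class members, and assigns the class in which the sample lies deepest. This is a simplified maximum-depth rule written for illustration; the DDclass procedure of Jornsten (2004) may differ in its details, and the two-gene toy data are made up.

```python
import math


def l1_depth(y, points):
    """L1 depth of y with respect to points (equal weights)."""
    n = len(points)
    sums = [0.0] * len(y)
    ties = 0
    for x in points:
        diff = [xi - yi for xi, yi in zip(x, y)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm == 0.0:
            ties += 1  # y coincides with a data point
            continue
        for j, d in enumerate(diff):
            sums[j] += d / norm
    mean_norm = math.sqrt(sum(s * s for s in sums)) / n
    return 1.0 - max(0.0, mean_norm - ties / n)


def classify(y, classes):
    """Assign y to the class in which it has the greatest L1 depth."""
    return max(classes, key=lambda label: l1_depth(y, classes[label]))


# Toy expression values for two genes (hypothetical numbers).
classes = {
    "tumor": [(2.0, 2.1), (2.2, 1.9), (1.8, 2.0), (2.1, 2.3)],
    "normal": [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0), (0.1, 0.2)],
}

print(classify((2.0, 2.0), classes))
print(classify((0.0, 0.0), classes))
```

Because each class member contributes only a unit vector, a single extreme outlier cannot pull the depth far, which is the robustness property the abstract contrasts with SVM-based methods.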

Mouse phenogenomics, toolbox for functional annotation of human genome

  • Kim, Il-Yong;Shin, Jae-Hoon;Seong, Je-Kyung
    • BMB Reports / v.43 no.2 / pp.79-90 / 2010
  • Mouse models are crucial for the functional annotation of the human genome. Gene modification techniques in the mouse, including gene targeting and gene trapping, have provided powerful tools in the form of genetically engineered mice (GEM) for understanding the molecular pathogenesis of human diseases. Several international consortia and programs are under way to deliver mutations in every gene in the mouse genome. The information from studying these GEM can be shared through international collaboration. However, their utility has many limitations because not all human genes have been knocked out in the mouse, and the existing GEM have not yet been phenotypically characterized in the standardized ways required for sharing and evaluating GEM data. Recent improvements in mouse genetics have moved the bottleneck in mouse functional genomics from the production of GEM to the systematic phenotype analysis of GEM. Enhanced, reproducible, and comprehensive mouse phenotype analysis has thus emerged as a prerequisite for effectively addressing the phenotyping bottleneck. This review provides current information on systematic mouse phenotype analysis and an issue-oriented perspective.