• Title/Abstract/Keywords: gene annotation

180 search results, processing time 0.022 s

Full-Length Enriched cDNA Library Construction from Tissues Related to Energy Metabolism in Pigs

  • Lee, Kyung-Tai;Byun, Mi-Jeong;Lim, Dajeong;Kang, Kyung-Soo;Kim, Nam-Soon;Oh, Jung-Hwa;Chung, Chung-Soo;Park, Hae-Suk;Shin, Younhee;Kim, Tae-Hun
    • Molecules and Cells
    • /
    • Vol. 28, No. 6
    • /
    • pp.529-536
    • /
    • 2009
  • Genome sequencing of the pig is being accelerated because of its importance as an evolutionary and biomedical model animal as well as a major livestock animal. However, information on expressed porcine genes is insufficient to allow annotation and use of the genomic information. A series of expressed sequence tags of 5' ends of five full-length enriched cDNA libraries (SUSFLECKs) were functionally characterized. SUSFLECKs were constructed from porcine abdominal fat, induced fat cells, loin muscle, liver, and pituitary gland, and were composed of non-normalized and normalized libraries. A total of 55,658 ESTs that were sequenced once from the 5′ ends of clones were produced and assembled into 17,684 unique sequences with 7,736 contigs and 9,948 singletons. In Gene Ontology analysis, two significant biological process leaf nodes were found: gluconeogenesis and translation elongation. In functional domain analysis based on the Pfam database, the beta transducin repeat domain of WD40 protein was the most frequently occurring domain. Twelve genes, including SLC25A6, EEF1G, EEF1A1, COX1, ACTA1, SLA, and ANXA2, were significantly more abundant in fat tissues than in loin muscle, liver, and pituitary gland in the SUSFLECKs. These characteristics of SUSFLECKs determined by EST analysis can provide important insight to discover the functional pathways in gene networks and to expand our understanding of energy metabolism in the pig.

Genome-wide survey and expression analysis of F-box genes in wheat

  • Kim, Dae Yeon;Hong, Min Jeong;Seo, Yong Weon
    • Korean Society of Crop Science: Conference Proceedings
    • /
    • Korean Society of Crop Science, 2017, 9th Asian Crop Science Association Conference
    • /
    • pp.141-141
    • /
    • 2017
  • The ubiquitin-proteasome pathway is the major regulatory mechanism for selective protein degradation in a number of cellular processes and involves three steps: (1) ATP-dependent activation of ubiquitin by the E1 enzyme, (2) transfer of activated ubiquitin to E2, and (3) transfer of ubiquitin to the protein to be degraded by the E3 complex. F-box proteins are subunits of the SCF complex and determine specificity for the target substrate to be degraded. F-box proteins regulate many important biological processes such as embryogenesis, floral development, plant growth and development, biotic and abiotic stress responses, hormonal responses, and senescence. However, little is known about the F-box genes in wheat. The draft genome sequence of wheat (IWGSC Reference Sequence v1.0 assembly) was used for a genome-wide survey of the F-box gene family in wheat. The Hidden Markov Model (HMM) profiles of the F-box (PF00646), F-box-like (PF12937), F-box-like 2 (PF13013), FBA (PF04300), FBA_1 (PF07734), FBA_2 (PF07735), FBA_3 (PF08268), and FBD (PF08387) domains were downloaded from the Pfam database and searched against the IWGSC Reference Sequence v1.0 assembly. RNA-seq paired-end libraries from different stages of wheat (seedling, tillering, booting, 1 day after flowering (DAF 1), DAF 10, DAF 20, and DAF 30) were constructed and sequenced on an Illumina HiSeq2000 for expression analysis of F-box protein genes. Basic analyses including Hisat, HTseq, DEseq, gene ontology analysis, and KEGG mapping were conducted to identify differentially expressed genes (DEGs) and to annotate the DEGs from the various stages. About 950 F-box domain proteins identified by Pfam were mapped to the wheat reference genome sequence by blastX (e-value < 0.05). Among them, more than 140 putative F-box protein genes were selected using cut-offs of fold change > 2, p-value < 0.01, and FDR < 0.01. Expression profiles of the selected F-box protein genes were shown by heatmap analysis, and the 144 putative F-box protein genes were clustered by expression pattern using average linkage and squared Euclidean distance. This work may provide valuable basic information for further investigation of protein degradation by the ubiquitin-proteasome system via F-box proteins during wheat developmental stages.
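The clustering step named in this abstract (average linkage with squared Euclidean distance) can be illustrated with a minimal pure-Python sketch. The gene names `g1` to `g4` and their expression values are hypothetical toy data, and the functions below are an illustration of the general technique, not the authors' pipeline.

```python
from itertools import product


def sq_euclidean(a, b):
    """Squared Euclidean distance between two equal-length expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_linkage(cluster_a, cluster_b):
    """Mean pairwise squared Euclidean distance between two clusters of profiles."""
    pairs = list(product(cluster_a, cluster_b))
    return sum(sq_euclidean(p, q) for p, q in pairs) / len(pairs)


def agglomerate(profiles, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(
                    [profiles[g] for g in clusters[i]],
                    [profiles[g] for g in clusters[j]],
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical expression values across three developmental stages.
profiles = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [1.1, 2.1, 2.9],
    "g3": [5.0, 0.5, 0.2],
    "g4": [4.8, 0.6, 0.1],
}
clusters = agglomerate(profiles, 2)
print(clusters)
```

With the toy values above, genes with similar stage profiles end up in the same cluster. For real data sets a library implementation of hierarchical clustering would normally be preferred over this quadratic loop.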

Annotation and Expression Profile Analysis of cDNAs from the Antarctic Diatom Chaetoceros neogracile

  • Jung, Gyeong-Seo;Lee, Choul-Gyun;Kang, Sung-Ho;Jin, Eon-Seon
    • Journal of Microbiology and Biotechnology
    • /
    • Vol. 17, No. 8
    • /
    • pp.1330-1337
    • /
    • 2007
  • To better understand gene expression in the cold-adapted polar diatom Chaetoceros neogracile, we surveyed its transcriptome by cDNA sequencing and analyzed the expression of selected cDNAs. A non-normalized cDNA library was constructed from C. neogracile, and a total of 2,500 cDNAs were sequenced to generate 1,881 high-quality expressed sequence tags (ESTs) (accession numbers EL620615-EL622495). Based on their clustering, we identified 154 unique clusters comprising 342 ESTs. The remaining 1,540 ESTs did not cluster. The number of unique genes identified in the data set is thus estimated to be 1,694. Using various tools and databases, putative functions were assigned to 939 (55.4%) of these genes. Of the remaining 540 (31.9%) unknown sequences, 215 (12.7%) appeared to be C. neogracile-specific, since they lacked significant sequence similarity to any sequence available in the public databases. The C. neogracile data set contained a relatively high percentage of genes involved in metabolism, genetic information processing, cellular processes, defense or stress resistance, photosynthesis, structure, and signal transduction. From the ESTs, the expression of the following putative C. neogracile genes was investigated: fucoxanthin chlorophyll (chl) a,c-binding protein (FCP), ascorbate peroxidase (ASP), and heat-shock protein 90 (HSP90). The abundance of ASP and HSP90 changed substantially in response to different culture conditions, indicating the possible regulation of these genes in C. neogracile.

Client-Server System Architecture for Inferring Large-Scale Genetic Interaction Networks

  • 김영훈;이필현;이도헌
    • Bioinformatics and Biosystems
    • /
    • Vol. 1, No. 1
    • /
    • pp.38-45
    • /
    • 2006
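  • This paper presents a client-server system architecture for inferring large-scale gene interaction networks based on Bayesian networks. Inferring a genome-wide gene interaction network in the form of a Bayesian network typically takes tens of hours, even on parallel servers. A batch, distributed system architecture is therefore more suitable than a conventional interactive, standalone one. This paper describes the results of implementing a loosely coupled client-server system suited to that situation. Inference of the gene interaction network is divided into two main steps. First, biological annotation and gene expression data are used to divide the full set of genes into modules that are allowed to overlap. Second, independent Bayesian learning is performed on each module to obtain inference results, and the genes shared between modules are used to integrate the per-module results into one.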
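The second step of this abstract (integrating per-module results through the genes that modules share) can be sketched as follows. The abstract does not state the integration rule, so the plain union of edges used here is an assumption, and the module names, gene names, and edges are hypothetical. The per-module Bayesian learning itself is not shown; its output is taken as given.

```python
def shared_genes(modules):
    """Return genes that occur in more than one module (the overlap between modules)."""
    seen, shared = set(), set()
    for genes in modules.values():
        shared |= seen & genes  # genes already seen in an earlier module
        seen |= genes
    return shared


def merge_module_networks(module_edges):
    """Combine independently learned per-module edge lists into one edge set.

    Assumption: a simple union of directed edges; the paper may use a
    different integration rule.
    """
    merged = set()
    for edges in module_edges.values():
        merged |= set(edges)
    return merged


# Hypothetical overlapping modules and the directed edges learned for each.
modules = {"m1": {"A", "B", "C"}, "m2": {"C", "D"}}
module_edges = {
    "m1": [("A", "B"), ("B", "C")],
    "m2": [("C", "D")],
}

overlap = shared_genes(modules)
network = merge_module_networks(module_edges)
print(sorted(overlap), sorted(network))
```

Because each module is learned independently, the per-module jobs could be dispatched as separate batch tasks, which matches the batch, distributed design the abstract argues for.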

Calibrating Thresholds to Improve the Detection Accuracy of Putative Transcription Factor Binding Sites

  • Kim, Young-Jin;Ryu, Gil-Mi;Park, Chan;Kim, Kyu-Won;Oh, Berm-Seok;Kim, Young-Youl;Gu, Man-Bok
    • Genomics & Informatics
    • /
    • Vol. 5, No. 4
    • /
    • pp.143-151
    • /
    • 2007
  • To understand the mechanism of transcriptional regulation, it is essential to detect promoters and regulatory elements. Various methods have been introduced to improve the prediction accuracy of regulatory elements. Since there are few experimentally validated regulatory elements, previous studies have used criteria based solely on the level of scores over background sequences. However, selecting the detection criteria for different prediction methods is not feasible. Here, we studied the calibration of thresholds to improve regulatory element prediction. We predicted regulatory elements using MATCH, which is a powerful tool for transcription factor binding site (TFBS) detection. To increase the prediction accuracy, we used a regulatory potential (RP) score measuring the similarity of patterns in alignments to those in known regulatory regions. Next, we calibrated the thresholds to find relevant scores, increasing the true positives while decreasing possible false positives. By applying various thresholds, we compared predicted regulatory elements with validated regulatory elements from the Open Regulatory Annotation (ORegAnno) database. The regulators predicted at the selected threshold were validated through enrichment analysis of muscle-specific gene sets from the Tissue-Specific Transcripts and Genes (T-STAG) database. We found 14 known muscle-specific regulators with a false discovery rate (FDR) of less than 5% in a single TFBS analysis, as well as known transcription factor combinations in our combinatorial TFBS analysis.
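The threshold calibration described in this abstract can be illustrated with a small sweep that compares predictions against a validated set at each candidate threshold. This is a minimal sketch: the site identifiers, scores, and thresholds are hypothetical, the scores do not come from MATCH or the RP score, and predictions outside the validated set are only possible false positives, since the validated set is incomplete.

```python
def sweep_thresholds(scored_sites, validated, thresholds):
    """For each threshold, count predictions inside and outside the validated set.

    Returns tuples of (threshold, true_positives, unvalidated_predictions, precision).
    """
    rows = []
    for t in thresholds:
        predicted = {site for site, score in scored_sites.items() if score >= t}
        tp = len(predicted & validated)
        fp = len(predicted - validated)  # possible false positives
        precision = tp / len(predicted) if predicted else 0.0
        rows.append((t, tp, fp, precision))
    return rows


# Hypothetical scored sites and a hypothetical validated subset.
scored_sites = {"s1": 0.95, "s2": 0.90, "s3": 0.80, "s4": 0.70, "s5": 0.60}
validated = {"s1", "s2", "s4"}

rows = sweep_thresholds(scored_sites, validated, [0.6, 0.75, 0.9])
for row in rows:
    print(row)
```

Raising the threshold in this toy example removes unvalidated predictions but also drops one validated site, which is the trade-off a calibrated threshold has to balance.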

An Analysis of Ortholog Clusters Detected from Multiple Genomes

  • 김선신;오정수;이범주;김태경;정광수;이충세;김영창;조완섭;류근호
    • Journal of KIISE: Databases
    • /
    • Vol. 35, No. 2
    • /
    • pp.125-131
    • /
    • 2008
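  • Detecting orthologs is very useful for annotating new genomes and for studying genome evolution. In earlier work, we proposed a method for automatically building ortholog clusters from the genomes of multiple species. The method extends InParanoid, which produces results for only two species, to multiple species and yields results of the same quality. Meanwhile, to predict the function of newly sequenced genes more accurately, building ortholog clusters that contain as few paralogs as possible can be an important problem. In this paper, we investigated how to build purer ortholog clusters by using a threshold. We built ortholog clusters from a data set of 20 prokaryotes. Compared with COG (Clusters of Orthologous Groups) and KO (KEGG Orthology), our ortholog clusters show about 90% similarity, and the similarity tends to increase as the threshold increases.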

Protein-protein Interaction Network Analyses for Elucidating the Roles of LOXL2-delta72 in Esophageal Squamous Cell Carcinoma

  • Wu, Bing-Li;Zou, Hai-Ying;Lv, Guo-Qing;Du, Ze-Peng;Wu, Jian-Yi;Zhang, Pi-Xian;Xu, Li-Yan;Li, En-Min
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 15, No. 5
    • /
    • pp.2345-2351
    • /
    • 2014
  • Lysyl oxidase-like 2 (LOXL2), a member of the lysyl oxidase (LOX) family, is a copper-dependent enzyme that catalyzes oxidative deamination of lysine residues on protein substrates. LOXL2 was found to be overexpressed in esophageal squamous cell carcinoma (ESCC) in our previous research. We later identified a LOXL2 splicing variant, LOXL2-delta72, and overexpressed LOXL2-delta72 and its wild-type counterpart in ESCC cells, followed by microarray analyses. First, the differentially expressed genes (DEGs) of LOXL2 and LOXL2-delta72 compared to empty plasmid were used to generate protein-protein interaction (PPI) sub-networks. Comparison of these two sub-networks showed hundreds of different proteins. To reveal the potential specific roles of LOXL2-delta72 compared to its wild type, the DEGs of LOXL2-delta72 vs LOXL2 were also used to construct a PPI sub-network, which was annotated by Gene Ontology. The functional annotation map indicated that this third PPI sub-network involved hundreds of GO terms, such as "cell cycle arrest", "G1/S transition of mitotic cell cycle", "interphase", "cell-matrix adhesion" and "cell-substrate adhesion", as well as significant "immunity" related terms, such as "innate immune response", "regulation of defense response" and "Toll signaling pathway". These results provide important clues for experimental identification of the specific biological roles and molecular mechanisms of LOXL2-delta72. This study also provides a workflow for testing the different roles of a splicing variant with high-throughput data.

Transcriptomic Analysis of Oryza sativa Leaves Reveals Key Changes in Response to Magnaporthe oryzae MSP1

  • Meng, Qingfeng;Gupta, Ravi;Kwon, Soon Jae;Wang, Yiming;Agrawal, Ganesh Kumar;Rakwal, Randeep;Park, Sang-Ryeol;Kim, Sun Tae
    • The Plant Pathology Journal
    • /
    • Vol. 34, No. 4
    • /
    • pp.257-268
    • /
    • 2018
  • Rice blast disease, caused by Magnaporthe oryzae, results in an extensive loss of rice productivity. Previously, we identified a novel M. oryzae secreted protein, termed MSP1, which causes cell death and pathogen-associated molecular pattern (PAMP)-triggered immune (PTI) responses in rice. Here, we report the transcriptome profile of the MSP1-induced response in rice, which led to the identification of 21,619 genes, among which 4,386 showed significant changes (P < 0.05 and fold change > 2 or < 1/2) in response to exogenous MSP1 treatment. Functional annotation of differentially regulated genes showed that the suppressed genes were strongly associated with photosynthesis, secondary metabolism, lipid synthesis, and protein synthesis, while the induced genes were involved in lipid degradation, protein degradation, and signaling. Moreover, expression of genes encoding receptor-like kinases, MAPKs, WRKYs, hormone signaling proteins, and pathogenesis-related (PR) proteins was also induced by MSP1. Mapping these differentially expressed genes onto various pathways revealed critical information about the MSP1-triggered responses, providing new insights into the molecular mechanism and components of MSP1-triggered PTI responses in rice.
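The significance criterion quoted in this abstract (P < 0.05 and fold change > 2 or < 1/2) can be written as a short classification rule. This is a minimal sketch: the gene names and values are hypothetical, and the function name `classify` and its parameters are illustrative rather than taken from the study.

```python
def classify(fold_change, p_value, p_cut=0.05, fc_cut=2.0):
    """Label a gene as induced, suppressed, or unchanged.

    fold_change is the treated/control expression ratio (not log-transformed).
    """
    if p_value >= p_cut:
        return "unchanged"
    if fold_change > fc_cut:
        return "induced"
    if fold_change < 1.0 / fc_cut:
        return "suppressed"
    return "unchanged"


# Hypothetical (fold change, P value) pairs.
genes = {
    "gene_a": (3.2, 0.001),
    "gene_b": (0.4, 0.010),
    "gene_c": (2.5, 0.200),
    "gene_d": (1.3, 0.001),
}
labels = {name: classify(fc, p) for name, (fc, p) in genes.items()}
print(labels)
```

Note that `gene_c` passes the fold-change cut-off but not the P-value cut-off, and `gene_d` is the reverse; both criteria must hold for a gene to count as changed.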

Identification of salt and drought inducible glutathione S-transferase genes of hybrid poplar

  • Kwon, Soon-Ho;Kwon, Hye-Kyoung;Kim, Wook;Noh, Eun Woon;Kwon, Mi;Choi, Young Im
    • Journal of Plant Biotechnology
    • /
    • Vol. 41, No. 1
    • /
    • pp.26-32
    • /
    • 2014
  • Recent genome annotation revealed that Populus trichocarpa contains 81 glutathione S-transferase (GST) genes. GST genes play important and varying roles in plants, including conferring tolerance to various abiotic stresses. Little information is available on the relationship, if any, between drought/salt stresses and GSTs in woody plants. In this study, we screened the PatgGST genes in hybrid poplar (Populus alba × Populus tremula var. glandulosa) that were predicted to confer drought tolerance based on our expression analysis of all members of the poplar GST superfamily following exposure to salt (NaCl) and drought (PEG) stresses, respectively. Exposure to the salt stress resulted in the induction of eight PatgGST genes and down-regulation of one PatgGST gene, and the level of induction/repression was different in leaf and stem tissues. In contrast, 16 PatgGST genes were induced following exposure to the drought (PEG) stress, and two were down-regulated. Taken together, we identified seven PatgGSTs (PatgGSTU15, PatgGSTU18, PatgGSTU22, PatgGSTU27, PatgGSTU46, PatgGSTU51 and PatgGSTU52) as putative drought tolerance genes based on their induction by both salt and drought stresses.

Genome analysis of Yucatan miniature pigs to assess their potential as biomedical model animals

  • Kwon, Dae-Jin;Lee, Yeong-Sup;Shin, Donghyun;Won, Kyeong-Hye;Song, Ki-Duk
    • Asian-Australasian Journal of Animal Sciences
    • /
    • Vol. 32, No. 2
    • /
    • pp.290-296
    • /
    • 2019
  • Objective: Pigs share many physiological, anatomical and genomic similarities with humans, which make them suitable models for biomedical research. Understanding the genetic status of Yucatan miniature pigs (YMPs) and their association with human diseases will help to assess their potential as biomedical model animals. This study was performed to identify non-synonymous single nucleotide polymorphisms (nsSNPs) in selective sweep regions of the genome of YMPs and present the genetic nsSNP distributions that are potentially associated with disease occurrence in humans. Methods: nsSNPs in whole genome resequencing data from 12 YMPs were identified and annotated to predict their possible effects on protein function. Sorting intolerant from tolerant (SIFT) and polymorphism phenotyping v2 analyses were used, and gene ontology (GO) network and Kyoto encyclopedia of genes and genomes (KEGG) pathway analyses were performed. Results: A total of 8,462 genes encompassing 72,067 nsSNPs were identified, and 118 nsSNPs in 46 genes were predicted to be deleterious. GO network analysis classified 13 genes into 5 GO terms (p<0.05) that were associated with kidney development and metabolic processes. Seven genes encompassing nsSNPs were classified into the term associated with Alzheimer's disease by referencing the genetic association database. The KEGG pathway analysis identified only one significantly enriched pathway (p<0.05), hsa04080: Neuroactive ligand-receptor interaction, among the transcripts. Conclusion: Deleterious nsSNPs in YMPs were identified, and the genes containing these variants were treated as putative human disease-related genes. The results revealed that many genes encompassing nsSNPs in YMPs were related to various human genes that are potentially associated with kidney development and metabolic processes as well as human disease occurrence.