• Title/Abstract/Keywords: gene annotation

Search results: 184 items (processing time: 0.021 s)

Genomic Analysis of a Freshwater Actinobacterium, "Candidatus Limnosphaera aquatica" Strain IMCC26207, Isolated from Lake Soyang

  • Kim, Suhyun;Kang, Ilnam;Cho, Jang-Cheon
    • Journal of Microbiology and Biotechnology / Vol. 27, No. 4 / pp. 825-833 / 2017
  • Strain IMCC26207 was isolated from the surface layer of Lake Soyang in Korea by the dilution-to-extinction culturing method, using a liquid medium prepared with filtered and autoclaved lake water. The strain could neither be maintained in a synthetic medium other than natural freshwater medium nor grown on solid agar plates. Phylogenetic analysis of 16S rRNA gene sequences indicated that strain IMCC26207 formed a distinct lineage in the order Acidimicrobiales of the phylum Actinobacteria. The closest relative among the previously identified bacterial taxa was "Candidatus Microthrix parvicella", with a 16S rRNA gene sequence similarity of 91.7%. Here, the draft genome sequence of strain IMCC26207, a freshwater actinobacterium, is reported with a description of the genome properties and an annotation summary. The draft genome consisted of 10 contigs with a total size of 3,316,799 bp and an average G+C content of 57.3%. The IMCC26207 genome was predicted to contain 2,975 protein-coding genes and 51 non-coding RNA genes, including 45 tRNA genes. Approximately 76.8% of the protein-coding genes could be assigned a specific function. Annotation of the IMCC26207 genome showed several traits of adaptation to oligotrophic freshwater environments, such as phosphorus-limited conditions. Comparative genomic analysis revealed that the genome of strain IMCC26207 was distinct from those of "Candidatus Microthrix" strains; therefore, we propose the name "Candidatus Limnosphaera aquatica" for this bacterium.

Comparative genome characterization of Leptospira interrogans from mild and severe leptospirosis patients

  • Anuntakarun, Songtham;Sawaswong, Vorthon;Jitvaropas, Rungrat;Praianantathavorn, Kesmanee;Poomipak, Witthaya;Suputtamongkol, Yupin;Chirathaworn, Chintana;Payungporn, Sunchai
    • Genomics & Informatics / Vol. 19, No. 3 / pp. 31.1-31.9 / 2021
  • Leptospirosis is a zoonotic disease caused by spirochetes of the genus Leptospira. In Thailand, Leptospira interrogans is a major cause of leptospirosis. Leptospirosis patients present with a wide range of clinical manifestations, from asymptomatic or mild infections to severe illness involving organ failure. To better understand the differences between Leptospira isolates causing mild and severe leptospirosis, Illumina sequencing was used to sequence genomic DNA from both isolates. DNA of Leptospira isolated from two patients, one with mild and another with severe symptoms, was included in this study. Adapters were removed from the paired-end reads, which were then trimmed at a Q30 quality score using Trimmomatic. Trimmed reads were assembled into contigs and scaffolds using SPAdes. Cross-contamination of scaffolds was evaluated by ContEst16S. Prokka, a tool for bacterial annotation, was used to annotate sequences from both Leptospira isolates. Predicted amino acid sequences from Prokka were searched against the EggNOG and DAVID gene ontology databases to characterize gene ontology. In addition, sequences from the mild and severe isolates that passed the criterion of e-value < 10e-5 in a BLASTP search against the virulence factor database were compared using a Venn diagram. We found 13 and 12 genes that were unique to the isolates from the mild and severe patients, respectively. The 12 genes in the severe isolate might be virulence factor genes that affect disease severity. However, these genes should be validated in further studies.
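The final comparison step in the abstract above (filter BLASTP hits by e-value, then compare the two isolates as a Venn diagram) can be sketched with plain set operations. This is a minimal illustration, not the authors' pipeline: the function names, the hit tuples, and the `VF_...` identifiers are hypothetical, and the default cutoff of `1e-5` is an assumption about the intended threshold (the abstract writes "10e-5", which read literally would be 1e-4).

```python
# Illustrative sketch only; the hit lists and identifiers below are made up.

def passing_hits(hits, max_evalue=1e-5):
    """Return the set of database entries hit with an e-value below max_evalue.

    `hits` is an iterable of (subject_id, evalue) pairs, e.g. parsed from
    tabular BLASTP output.
    """
    return {subject for subject, evalue in hits if evalue < max_evalue}


def venn_partition(set_a, set_b):
    """Split two sets into the three regions of a two-circle Venn diagram."""
    return {
        "only_a": set_a - set_b,   # unique to the first isolate
        "shared": set_a & set_b,   # found in both isolates
        "only_b": set_b - set_a,   # unique to the second isolate
    }


# Toy data: (virulence-factor entry, e-value) for each isolate.
mild_hits = [("VF_001", 1e-30), ("VF_002", 2e-8), ("VF_003", 0.5)]
severe_hits = [("VF_002", 1e-12), ("VF_004", 3e-20), ("VF_003", 1e-2)]

parts = venn_partition(passing_hits(mild_hits), passing_hits(severe_hits))
print(sorted(parts["only_a"]), sorted(parts["shared"]), sorted(parts["only_b"]))
```

Keeping the threshold as a parameter makes it easy to check how sensitive the "unique gene" counts are to the chosen cutoff.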

Identifying long non-coding RNAs and characterizing their functional roles in swine mammary gland from colostrogenesis to lactogenesis

  • Shi, Lijun;Zhang, Longchao;Wang, Ligang;Liu, Xin;Gao, Hongmei;Hou, Xinhua;Zhao, Fuping;Yan, Hua;Cai, Wentao;Wang, Lixian
    • Animal Bioscience / Vol. 35, No. 6 / pp. 814-825 / 2022
  • Objective: This study was conducted to identify functional long non-coding RNAs (lncRNAs) for swine lactation using RNA-seq data from the mammary gland. Methods: Using RNA-seq data from swine mammary gland, we screened lncRNAs, performed differential expression analysis, and confirmed the functional lncRNAs for swine lactation by validation against genome-wide association study (GWAS) signals, functional annotation, and weighted gene co-expression network analysis (WGCNA). Results: We identified a total of 286 differentially expressed (DE) lncRNAs in the mammary gland at different stages from 14 days prior to (-) parturition to day 1 after (+) parturition, and the expression of most lncRNAs changed strongly from day -2 to day +1. Further, the GWAS signals of the sow milk ability trait were significantly enriched in DE lncRNAs. Functional annotation revealed that these DE lncRNAs were mainly involved in mammary gland and lactation development, milk composition metabolism, and colostrum function. By performing WGCNA, we identified 7 out of 12 lncRNA-mRNA modules that were highly associated with the mammary gland at day -14, day -2, and day +1, involving 35 lncRNAs and 319 mRNAs. Conclusion: This study suggested that 18 lncRNAs and their 20 target genes were promising candidates for swine parturition and colostrum occurrence processes. Our research provides new insights into lncRNA profiles and their regulatory mechanisms from colostrogenesis to lactogenesis in swine.

Clustering Approaches to Identifying Gene Expression Patterns from DNA Microarray Data

  • Do, Jin Hwan;Choi, Dong-Kug
    • Molecules and Cells / Vol. 25, No. 2 / pp. 279-288 / 2008
  • Analysis of microarray data is essential for making sense of large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is that many co-expressed genes are co-regulated, so identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites, and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and dissimilarity metrics used, as well as on user-selectable parameters such as the desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering DNA microarray data, from crisp clustering algorithms such as hierarchical clustering, K-means, and self-organizing maps to complex clustering algorithms like fuzzy clustering.
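The abstract's point that results depend on user-selectable parameters such as the number of clusters and the initial values can be seen in a bare-bones K-means. The sketch below is a generic textbook version written for illustration, not code from the review; it assumes squared Euclidean distance as the dissimilarity measure, and the expression profiles are toy numbers.

```python
# Minimal K-means sketch (illustrative; toy data, squared Euclidean distance).

def kmeans(profiles, centroids, n_iter=100):
    """Cluster expression profiles starting from the given initial centroids.

    The number of clusters is len(centroids). Returns (labels, centroids).
    """
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each profile goes to its nearest centroid.
        new_labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centroids[k])),
            )
            for p in profiles
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:  # an empty cluster keeps its old centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Four genes measured under two conditions (made-up values).
profiles = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centers = kmeans(profiles, centroids=[[0.0, 0.0], [10.0, 10.0]])
print(labels, centers)
```

Passing the initial centroids explicitly, rather than drawing them at random, makes the dependence on starting values visible: rerunning with different `centroids` on less well-separated data can yield a different partition.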

ORF Miner: a Web-based ORF Search Tool

  • Park, Sin-Gi;Kim, Ki-Bong
    • Genomics & Informatics / Vol. 7, No. 4 / pp. 217-219 / 2009
  • The primary clue for locating protein-coding regions is the open reading frame (ORF), and determining ORFs is the first step toward gene prediction, especially for prokaryotes. In this respect, we have developed a web-based ORF search tool called ORF Miner. ORF Miner is a graphical analysis utility that determines all possible open reading frames of a selectable minimum size in an input sequence. The tool identifies all open reading frames using alternative genetic codes as well as the standard one and reports a list of ORFs with the corresponding deduced amino acid sequences. ORF Miner can be employed for sequence annotation and gives a crucial clue to the determination of actual protein-coding regions.
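The kind of search described above (all ORFs above a minimum size, in every reading frame, under a chosen genetic code) can be sketched in a few lines. This is not the ORF Miner implementation, whose internals the abstract does not describe; the function names, the ATG-only start rule, and the default minimum length are assumptions made for illustration. Alternative genetic codes are modelled only through the set of stop codons.

```python
# Illustrative six-frame ORF scan; not the ORF Miner code.

STANDARD_STOPS = frozenset({"TAA", "TAG", "TGA"})
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=30, start="ATG", stops=STANDARD_STOPS):
    """Find ORFs (start codon through stop codon) of at least min_len nucleotides.

    Returns tuples (strand, frame, begin, end, orf_sequence); begin and end are
    0-based, end-exclusive coordinates on the scanned strand.
    """
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if orf_start is None and codon == start:
                    orf_start = i              # first start codon opens an ORF
                elif orf_start is not None and codon in stops:
                    end = i + 3                # include the stop codon
                    if end - orf_start >= min_len:
                        found.append((strand, frame, orf_start, end, s[orf_start:end]))
                    orf_start = None           # look for the next ORF in this frame
    return found


dna = "ATGAAATGACCCTAA"
standard = find_orfs(dna, min_len=9)
# In NCBI translation table 4, TGA encodes tryptophan rather than stop.
table4 = find_orfs(dna, min_len=9, stops={"TAA", "TAG"})
print(standard)
print(table4)
```

With the standard code the scan stops at the in-frame TGA, while under the table 4 stop set the same sequence yields one longer ORF, which is why the choice of genetic code matters for annotation.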

Hypothetical protein predicted to be tumor suppressor: a protein functional analysis

  • Kader, Md. Abdul;Ahammed, Akash;Khan, Md. Sharif;Ashik, Sheikh Abdullah Al;Islam, Md. Shariful;Hossain, Mohammad Uzzal
    • Genomics & Informatics / Vol. 20, No. 1 / pp. 6.1-6.15 / 2022
  • Litorilituus sediminis, a novel Gram-negative, aerobic bacterium in the family Colwelliaceae, has a notable hypothetical protein containing a von Hippel-Lindau domain, which has significant tumor suppressor activity. Therefore, this study was designed to elucidate the structure and function of the biologically important hypothetical protein EMK97_00595 (QBG34344.1) using several bioinformatics tools. Functional annotation revealed that the hypothetical protein is an extracellular secretory soluble signal peptide and contains the von Hippel-Lindau (VHL; VHL beta) domain, which has a significant role in tumor suppression. This domain is conserved throughout evolution, as its homologs are found in various types of organisms such as mammals, insects, and nematodes. The gene product of VHL has a critical regulatory activity in the ubiquitous oxygen-sensing pathway. This domain has a significant role in inhibiting cell proliferation, angiogenesis progression, kidney cancer, breast cancer, and colon cancer. Finally, the current study indicates that the annotated hypothetical protein is linked with tumor suppressor activity, which might be of great interest to future research in higher organisms.

GEDA: New Knowledge Base of Gene Expression in Drug Addiction

  • Suh, Young-Ju;Yang, Moon-Hee;Yoon, Suk-Joon;Park, Jong-Hoon
    • BMB Reports / Vol. 39, No. 4 / pp. 441-447 / 2006
  • Abuse of drugs can elicit compulsive drug-seeking behaviors upon repeated administration and ultimately leads to the phenomenon of addiction. We developed a procedure for standardizing microarray gene expression data from rat brain in drug addiction and stored the data in a single integrated database system, focusing on more effective data processing and interpretation. Another characteristic of the present database is its systematic flexibility for statistical analysis and for linking with other databases. We adopt an intelligent SQL querying system as the foundation of our database in order to set up an interactive module that can automatically read raw gene expression data in the standardized format. We maximize the usability of this database, helping users study significant gene expression and identify the biological functions of genes through integrated, up-to-date gene information such as GO annotation and metabolic pathways. To collect the latest information on a selected gene from the database, we also set up a local BLAST search engine and a non-redundant sequence database updated from the NCBI server on a daily basis. We find that the present database is a useful query interface and data-mining tool, specifically for finding genes related to drug addiction. We apply this system to the identification and characterization of the behavior of methamphetamine-induced genes in rat brain.

Identification of Hub Genes in the Pathogenesis of Ischemic Stroke Based on Bioinformatics Analysis

  • Yang, Xitong;Yan, Shanquan;Wang, Pengyu;Wang, Guangming
    • Journal of Korean Neurosurgical Society / Vol. 65, No. 5 / pp. 697-709 / 2022
  • Objective: The present study aimed to identify the function of ischemic stroke (IS) patients' peripheral blood and its role in IS, explore the pathogenesis, and provide direction for clinical research progress through comprehensive bioinformatics analysis. Methods: Two datasets, GSE58294 and GSE22255, were downloaded from the Gene Expression Omnibus database. GEO2R was used to obtain differentially expressed genes (DEGs). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the DEGs were performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID). The protein-protein interaction (PPI) network of the DEGs was constructed with the Search Tool for the Retrieval of Interacting Genes (STRING) and visualized with Cytoscape software, and the hub genes were then identified by degree analysis. The microRNAs (miRNAs) and miRNA target genes closely related to the onset of stroke were obtained through the miRNA gene regulatory network. Results: In total, 36 DEGs, comprising 27 up-regulated and nine down-regulated DEGs, were identified. GO functional analysis showed that these DEGs were involved in regulation of apoptotic process, cytoplasm, protein binding, and other biological processes. KEGG enrichment analysis showed that these DEGs mediated signaling pathways including human T-cell lymphotropic virus (HTLV)-I infection and microRNAs in cancer. The results of the PPI network and cytoHubba showed that there was a relationship between the DEGs, and five hub genes related to stroke were obtained: SOCS3, KRAS, PTGS2, EGR1, and DUSP1. Combined with the visualization of DEG-miRNAs, hsa-mir-16-5p, hsa-mir-181a-5p, and hsa-mir-124-3p were predicted to be the key miRNAs in stroke, and these three miRNAs were related to hub genes. Conclusion: Thirty-six DEGs, five hub genes, and three miRNAs were obtained from bioinformatics analysis of IS microarray data, which might provide potential targets for the diagnosis and treatment of IS.
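Identifying hub genes "by degree analysis", as the Methods above put it, amounts to counting how many distinct interaction partners each node has in the PPI network and ranking by that count. The sketch below shows the idea on a toy edge list; it is not the cytoHubba implementation, and the function name and the placeholder gene labels (`G1`, `G2`, ...) are hypothetical.

```python
# Degree-based hub ranking on an undirected interaction network (toy example).
from collections import Counter


def hub_genes(edges, top_n=5):
    """Rank nodes by degree in an undirected network given as (a, b) pairs.

    Self-loops and duplicate edges (in either orientation) are ignored.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    degree = Counter()
    seen = set()
    for a, b in edges:
        if a == b:
            continue                      # skip self-interactions
        key = frozenset((a, b))
        if key in seen:
            continue                      # skip repeated edges
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Placeholder interactions; in practice these would come from a PPI export.
edges = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G3"), ("G3", "G2"),           # same edge listed twice
    ("G5", "G5"),                         # self-loop
]
print(hub_genes(edges, top_n=2))
```

Degree is only one of several centrality measures; a ranking by betweenness or closeness on the same network could pick different hubs.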

IMGT Unique Numbering for Standardized Contact Analysis of Immunoglobulin/antigen and T cell receptor/peptide/MHC Complexes

  • Kaas, Quentin;Chiche, Laurent;Lefranc, Marie-Paule
    • Proceedings of the Korean Society for Bioinformatics Conference / Korean Society for Bioinformatics and Systems Biology, BIOINFO 2005 / pp. 209-214 / 2005
  • Immunoglobulins (IG), T cell receptors (TR), and major histocompatibility complex (MHC) are major components of the immune system. Their experimentally determined three-dimensional (3D) structures are numerous, and their retrieval and comparison are problematic. IMGT, the international ImMunoGeneTics information system® (http://imgt.cines.fr), has devised a controlled vocabulary and annotation rules for the sequences and 3D structures of the IG, TR, and MHC. Annotated data from IMGT/3Dstructure-DB, the IMGT 3D structure database, are used in this paper to compare the 3D structures of the domains and receptors, and to characterize IG/antigen, peptide/MHC, and TR/peptide/MHC interfaces. The analysis includes angle measures to assess receptor flexibility, structural superimposition, and contact analysis. Up-to-date data and analysis results are available at the IMGT Web site, http://imgt.cines.fr.
Identification of Genes and MicroRNAs Involved in Ovarian Carcinogenesis

  • Wan, Shu-Mei;Lv, Fang;Guan, Ting
    • Asian Pacific Journal of Cancer Prevention / Vol. 13, No. 8 / pp. 3997-4000 / 2012
  • MicroRNAs (miRNAs) play roles in the clinic as both diagnostic and therapeutic tools. The identification of relevant microRNAs is critically required for ovarian cancer because of the prevalence of late diagnosis and the currently poor treatment options. To identify miRNAs involved in the development or progression of ovarian cancer, we analyzed gene expression profiles downloaded from Gene Expression Omnibus. Comparison of expression patterns between carcinomas and the corresponding normal ovarian tissues enabled us to identify 508 genes that were commonly up-regulated and 1331 genes that were down-regulated in the cancer specimens. Functional annotation of these genes showed that most of the up-regulated genes were related to cell cycling, and most of the down-regulated genes were associated with the immune response. When these differentially expressed genes were mapped to miRTarBase, we obtained a total of 18 key miRNAs that may play important regulatory roles in ovarian cancer. Investigation of these genes and microRNAs should help to disclose the molecular mechanisms of ovarian carcinogenesis and facilitate the development of new approaches to therapeutic intervention.