• Title/Summary/Keyword: pan-genome analysis

Comparative Genome Analysis of Psychrobacillus Strain PB01, Isolated from an Iceberg

  • Choi, Jun Young;Kim, Sun Chang;Lee, Pyung Cheon
    • Journal of Microbiology and Biotechnology / v.30 no.2 / pp.237-243 / 2020
  • A novel psychrotolerant Psychrobacillus strain PB01, isolated from an Antarctic iceberg, was comparatively analyzed with five related strains. The complete genome of strain PB01 consists of a single circular chromosome (4.3 Mb) and a plasmid (19 kb). As potential low-temperature adaptation strategies, the genome of strain PB01 carries four genes encoding cold-shock proteins, two genes encoding DEAD-box RNA helicases, and eight genes encoding transporters for glycine betaine, which can serve as a cryoprotectant. The pan-genome structure of the six Psychrobacillus strains suggests that strain PB01 might have evolved to adapt to extreme environments by changing its genome content to gain higher capacity for DNA repair, translation, and membrane transport. Notably, strain PB01 possesses a complete TCA cycle consisting of eight enzymes as well as three additional Helicobacter pylori-type enzymes: ferredoxin-dependent 2-oxoglutarate synthase, succinyl-CoA/acetoacetyl-CoA transferase, and malate/quinone oxidoreductase. The co-existence of the genes for these TCA cycle enzymes has also been identified in the other five Psychrobacillus strains.

Bioinformatics services for analyzing massive genomic datasets

  • Ko, Gunhwan;Kim, Pan-Gyu;Cho, Youngbum;Jeong, Seongmun;Kim, Jae-Yoon;Kim, Kyoung Hyoun;Lee, Ho-Yeon;Han, Jiyeon;Yu, Namhee;Ham, Seokjin;Jang, Insoon;Kang, Byunghee;Shin, Sunguk;Kim, Lian;Lee, Seung-Won;Nam, Dougu;Kim, Jihyun F.;Kim, Namshin;Kim, Seon-Young;Lee, Sanghyuk;Roh, Tae-Young;Lee, Byungwook
    • Genomics & Informatics / v.18 no.1 / pp.8.1-8.10 / 2020
  • The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution for addressing this computational challenge is cloud computing, where CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ predefined pipelines or create a new pipeline for analyzing their own omics data. We also developed several web-based services for facilitating downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.

Genomic Insight into the Salt Tolerance of Enterococcus faecium, Enterococcus faecalis and Tetragenococcus halophilus

  • Heo, Sojeong;Lee, Jungmin;Lee, Jong-Hoon;Jeong, Do-Won
    • Journal of Microbiology and Biotechnology / v.29 no.10 / pp.1591-1602 / 2019
  • To shed light on the genetic basis of salt tolerance in Enterococcus faecium, Enterococcus faecalis, and Tetragenococcus halophilus, we performed comparative genome analysis of 10 E. faecalis, 11 E. faecium, and three T. halophilus strains. Factors involved in salt tolerance that could be used to distinguish the species were identified. Overall, T. halophilus contained a greater number of potassium transport and osmoprotectant synthesis genes compared with the other two species. In particular, our findings suggested that T. halophilus may be the only one among the three species capable of synthesizing glycine betaine from choline, cardiolipin from glycerol and proline from citrate. These molecules are well-known osmoprotectants; thus, we propose that these genes confer the salt tolerance of T. halophilus.

Sequence Analysis of Mitochondrial Genome of Toxascaris leonina from a South China Tiger

  • Li, Kangxin;Yang, Fang;Abdullahi, A.Y.;Song, Meiran;Shi, Xianli;Wang, Minwei;Fu, Yeqi;Pan, Weida;Shan, Fang;Chen, Wu;Li, Guoqing
    • Parasites, Hosts and Diseases / v.54 no.6 / pp.803-807 / 2016
  • Toxascaris leonina is a common parasitic nematode of wild mammals and has significant impacts on the protection of rare wild animals. To analyze population genetic characteristics of T. leonina from the South China tiger, its mitochondrial (mt) genome was sequenced. Its complete circular mt genome was 14,277 bp in length, including 12 protein-coding genes, 22 tRNA genes, 2 rRNA genes, and 2 non-coding regions. The nucleotide composition was biased toward A and T. The most common start codon and stop codon were TTG and TAG, and 4 genes ended with an incomplete stop codon. There were 13 intergenic regions ranging from 1 to 10 bp in size. Phylogenetically, T. leonina from a South China tiger was close to canine T. leonina. This study reports for the first time a complete mt genome sequence of T. leonina from the South China tiger, and provides a scientific basis for studying the genetic diversity of nematodes between different hosts.
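
The A+T bias mentioned in the abstract above is normally quantified as the share of A and T among all unambiguous bases. The following is a minimal sketch of that calculation, not the authors' pipeline; the function names and the short example sequence are hypothetical, and the actual 14,277 bp genome is not reproduced here.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the fraction of each of A, C, G, T among unambiguous bases."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: counts[base] / total for base in "ACGT"}


def at_content(seq: str) -> float:
    """Return the combined A+T fraction; values above 0.5 indicate A+T bias."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]


# Hypothetical fragment used only to exercise the functions.
example = "ATTTAGCATTTTAAGT"
print(f"A+T fraction: {at_content(example):.2f}")
```

Ambiguous symbols such as N are ignored in the denominator, which is one common convention; other conventions would give slightly different fractions.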

Deep Learning in Genomic and Medical Image Data Analysis: Challenges and Approaches

  • Yu, Ning;Yu, Zeng;Gu, Feng;Li, Tianrui;Tian, Xinmin;Pan, Yi
    • Journal of Information Processing Systems / v.13 no.2 / pp.204-214 / 2017
  • Artificial intelligence, especially deep learning technology, is penetrating the majority of research areas, including the field of bioinformatics. However, deep learning has some limitations, such as the complexity of parameter tuning and architecture design. In this study, we analyze these issues and challenges with regard to its applications in bioinformatics, particularly genomic analysis and medical image analytics, and present corresponding approaches and solutions. Although these solutions are mostly rules of thumb, they can effectively handle the issues connected to training learning machines. We also explore trends in deep learning technology by examining several directions, such as automation, scalability, individuality, mobility, integration, and intelligence warehousing.

New Approach to Predict microRNA Gene by using data Compression technique

  • Kim, Dae-Won;Yang, Joshua SungWoo;Kim, Pan-Jun;Chu, In-Sun;Jeong, Ha-Woong;Park, Hong-Seog
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.361-365 / 2005
  • Over the past few years, the complex and subtle roles of microRNA (miRNA) in gene regulation have been increasingly appreciated. Computational approaches have played an important role in identifying miRNAs from plants and animals, as well as in predicting their putative gene targets. We present a new approach that combines a comprehensive analysis of evolutionarily conserved element scores with a data compression technique to detect putative miRNA genes. We used evolutionarily conserved elements [19] (see Methods and Materials for details) to calculate scores base by base along the candidate pre-miRNA gene region, detecting common conserved patterns in the target sequence. We applied the data compression technique [20] to detect unknown miRNA genes. This zipping method provides, without loss of generality with respect to the nature of the character strings, a way to measure the similarity between the strings under consideration [20]. In our experience, the new computational method for miRNA gene prediction was satisfactory, and we were able to find 28 putative miRNA genes.
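
The abstract above does not spell out how the zipping method scores similarity, so the following is only an illustrative sketch of one widely used compression-based measure, the normalized compression distance (NCD), built on Python's standard `zlib` module. The measure defined in the paper's reference [20] may differ, and the function names and example sequences are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (maximum compression)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: near 0 for similar strings, near 1 for unrelated ones."""
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # shared patterns make the concatenation cheaper
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sequences, for illustration only (not real pre-miRNA data).
candidate = "UGAGGUAGUAGGUUGUAUAGUUUUAGGGUCACACCCACCACUGGGAGAUAACUAUACAAUCUACUGUCUUUC"
variant = candidate[:30] + "A" + candidate[31:]
print(f"NCD(candidate, variant) = {ncd(candidate, variant):.3f}")
```

On very short sequences the compressor's fixed overhead dominates the score, so the values are informative mainly for comparing longer strings.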

Pan-Genomics of Lactobacillus plantarum Revealed Group-Specific Genomic Profiles without Habitat Association

  • Choi, Sukjung;Jin, Gwi-Deuk;Park, Jongbin;You, Inhwan;Kim, Eun Bae
    • Journal of Microbiology and Biotechnology / v.28 no.8 / pp.1352-1359 / 2018
  • Lactobacillus plantarum is a lactic acid bacterium that promotes animal intestinal health as a probiotic and is found in a wide variety of habitats. Here, we investigated the genomic features of different clusters of L. plantarum strains via pan-genomic analysis. We compared the genomes of 108 L. plantarum strains that were available from the NCBI GenBank database. These genomes were 2.9-3.7 Mbp in size and 44-45% in G+C content. A total of 8,847 orthologs were collected, and 1,709 genes were identified to be shared as core genes by all the strains analyzed. On the basis of SNPs from the core genes, 108 strains were clustered into five major groups (G1-G5) that are different from previous reports and are not clearly associated with habitats. Analysis of group-specific enriched or depleted genes revealed that G1 and G2 were rich in genes for carbohydrate utilization (L-arabinose, L-rhamnose, and fructooligosaccharides) and that G3, G4, and G5 possessed more genes for the restriction-modification system and MazEF toxin-antitoxin. These results indicate that there are critical differences in gene content and survival strategies among genetically clustered L. plantarum strains, regardless of habitats.
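
The distinction between the full ortholog collection (the pan-genome) and the genes shared by every strain (the core genome) in the abstract above reduces to set operations once orthologs have been assigned. The following is a minimal sketch under that assumption. Ortholog clustering itself is assumed to have been done upstream, and the function name, strain names, and gene labels are hypothetical toy values rather than data from the study.

```python
from collections import Counter
from typing import Dict, Set


def partition_pangenome(gene_sets: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split ortholog IDs into pan, core, accessory, and strain-specific sets."""
    strains = list(gene_sets.values())
    if not strains:
        raise ValueError("at least one strain is required")
    pan = set().union(*strains)        # every ortholog seen in any strain
    core = set.intersection(*strains)  # orthologs shared by all strains
    # Count how many strains carry each ortholog to find strain-specific ones.
    carriers = Counter(gene for genes in strains for gene in genes)
    unique = {gene for gene, n in carriers.items() if n == 1}
    return {"pan": pan, "core": core, "accessory": pan - core, "unique": unique}


# Hypothetical toy input: strain names and gene labels are illustrative only.
toy = {
    "strain_1": {"dnaA", "recA", "araA", "rhaB"},
    "strain_2": {"dnaA", "recA", "araA"},
    "strain_3": {"dnaA", "recA", "mazF"},
}
parts = partition_pangenome(toy)
print(len(parts["pan"]), "orthologs in total,", len(parts["core"]), "core")
```

Group-specific enrichment, as reported for G1-G5, would additionally need group labels per strain and a statistical test, which this sketch leaves out.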

Unraveling the hypoxia modulating potential of VEGF family genes in pan-cancer

  • So-Hyun Bae;Taewon Hwang;Mi-Ryung Han
    • Genomics & Informatics / v.21 no.4 / pp.44.1-44.10 / 2023
  • Tumor hypoxia, a state of oxygen deprivation, occurs in most cancers and promotes angiogenesis, enhancing the potential for metastasis. The vascular endothelial growth factor (VEGF) family genes play crucial roles in tumorigenesis by promoting angiogenesis. To investigate the malignant processes triggered by hypoxia-induced angiogenesis across pan-cancers, we comprehensively analyzed the relationships between the expression of VEGF family genes and the hypoxic microenvironment based on integrated bioinformatics methods. Our results suggest that the expression of VEGF family genes differs significantly among various cancers, highlighting their heterogeneous effects on human cancers. Across the 33 cancers, VEGFB and VEGFD showed the highest and lowest expression levels, respectively. The survival analysis showed that VEGFA and placental growth factor (PGF) were correlated with poor prognosis in many cancers, including kidney renal cell and liver hepatocellular carcinoma. VEGFC expression was positively correlated with glioma and stomach cancer. VEGFA and PGF showed distinct positive correlations with hypoxia scores in most cancers, indicating a potential correlation with tumor aggressiveness. The expression of miRNAs targeting VEGF family genes, including hsa-miR-130b-5p and hsa-miR-940, was positively correlated with hypoxia. In immune subtype analysis, VEGFC was highly expressed in C3 (inflammatory) and C6 (transforming growth factor β dominant) across various cancers, indicating its potential role as a tumor promoter. VEGFC expression exhibited positive correlations with immune infiltration scores, suggesting low tumor purity. High expression of VEGFA and VEGFC showed favorable responses to various drugs, including BLU-667, which abrogates RET signaling, an oncogenic driver in liver and thyroid cancers. Our findings suggest potential roles of VEGF family genes in malignant processes related to hypoxia-induced angiogenesis.

Gateway RFP-Fusion Vectors for High Throughput Functional Analysis of Genes

  • Park, Jae-Yong;Hwang, Eun Mi;Park, Nammi;Kim, Eunju;Kim, Dong-Gyu;Kang, Dawon;Han, Jaehee;Choi, Wan Sung;Ryu, Pan-Dong;Hong, Seong-Geun
    • Molecules and Cells / v.23 no.3 / pp.357-362 / 2007
  • There is an increasing demand for high throughput (HTP) methods for gene analysis on a genome-wide scale. However, the current repertoire of HTP detection methodologies allows only a limited range of cellular phenotypes to be studied. We have constructed two HTP-optimized expression vectors generated from the red fluorescent reporter protein (RFP) gene. These vectors produce RFP-tagged target proteins in a multiple expression system using gateway cloning technology (GCT). The RFP tag was fused with the cloned genes, thereby allowing us to localize the expressed proteins in mammalian cells. The effectiveness of the vectors was evaluated using an HTP-screening system. Sixty representative human C2 domains were tagged with RFP and overexpressed in HiB5 neuronal progenitor cells, and we studied in detail two C2 domains that promoted the neuronal differentiation of HiB5 cells. Our results show that the two vectors developed in this study are useful for functional gene analysis using an HTP-screening system on a genome-wide scale.

Methylation-sensitive high-resolution melting analysis of the USP44 promoter can detect early-stage hepatocellular carcinoma in blood samples

  • Si-Cho, Kim;Jiwon, Kim;Da-Won, Kim;Yanghee, Choi;Kyunghyun, Park;Eun Ju, Cho;Su Jong, Yu;Jeongsil, Kim-Ha;Young-Joon, Kim
    • BMB Reports / v.55 no.11 / pp.553-558 / 2022
  • Hepatocellular carcinoma (HCC) is a dangerous cancer that often evades early detection because it is asymptomatic and an effective detection method is lacking. For people with chronic liver inflammation who are at high risk of developing HCC, a sensitive detection method for HCC is needed. In a meta-analysis of The Cancer Genome Atlas pan-cancer methylation database, we identified a CpG island in the USP44 promoter that is methylated specifically in HCC. We developed methylation-sensitive high-resolution melting (MS-HRM) analysis to measure the methylation levels of the USP44 promoter in cell-free DNA isolated from patients. Our MS-HRM assay correctly identified 40% of patients with early-stage HCC, whereas the α-fetoprotein test, which is currently used to detect HCC, correctly identified only 25% of early-stage HCC patients. These results demonstrate that USP44 MS-HRM analysis is suitable for HCC surveillance.