• Title/Summary/Keyword: Genome sequence

Five Computer Simulation Studies of Whole-Genome Fragment Assembly: The Case of Assembling Zymomonas mobilis ZM4 Sequences

  • Jung, Cholhee; Choi, Jin-Young; Park, Hyun Seck; Seo, Jeong-Sun
    • Genomics & Informatics / v.2 no.4 / pp.184-190 / 2004
  • An approach for genome analysis based on assembly of fragments of DNA from the whole genome can be applied to obtain the complete nucleotide sequence of the genome of Zymomonas mobilis. However, the problem of fragment assembly raises thorny computational issues. Computer simulation studies of sequence assembly usually reveal abnormal assemblies of artificial sequences containing repetitive or duplicated regions, and suggest methods to correct those abnormalities. In this paper, we describe five simulation studies that were performed prior to the actual genome assembly of Zymomonas mobilis ZM4.
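
To illustrate why repeats complicate fragment assembly, here is a minimal Python sketch of a shotgun simulation. It is not the authors' procedure: the function names, the genome length, the repeat length, and the read length are all assumptions chosen for illustration. It builds a random genome containing two copies of a repeat, samples reads, and flags reads whose sequence occurs at more than one genome position, which an assembler could therefore place incorrectly.

```python
import random

def make_genome(length, repeat, copies, rng):
    """Random genome of `length` bases with `copies` copies of `repeat` inserted."""
    genome = "".join(rng.choice("ACGT") for _ in range(length))
    step = length // (copies + 1)
    # Insert from right to left so earlier insert positions are not shifted.
    for i in range(copies, 0, -1):
        pos = i * step
        genome = genome[:pos] + repeat + genome[pos:]
    return genome

def shotgun_reads(genome, read_len, n_reads, rng):
    """Sample fixed-length reads at uniformly random start positions."""
    starts = (rng.randrange(len(genome) - read_len + 1) for _ in range(n_reads))
    return [(s, genome[s:s + read_len]) for s in starts]

def ambiguous_reads(genome, reads):
    """Return reads whose sequence occurs at more than one genome position."""
    ambiguous = []
    for start, seq in reads:
        first = genome.find(seq)
        if genome.find(seq, first + 1) != -1:
            ambiguous.append((start, seq))
    return ambiguous

# Hypothetical toy parameters: a 2,000-base genome plus two 80-base repeat copies.
rng = random.Random(0)
repeat = "".join(rng.choice("ACGT") for _ in range(80))
genome = make_genome(2000, repeat, copies=2, rng=rng)
reads = shotgun_reads(genome, read_len=40, n_reads=300, rng=rng)
print(len(ambiguous_reads(genome, reads)), "of", len(reads), "reads are ambiguous")
```

Reads that fall entirely inside a repeat copy cannot be assigned to a unique location from their sequence alone, which is the source of the misassemblies that such simulations are designed to expose.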

Draft Genome Sequence of Weissella koreensis Strain HJ, a Probiotic Bacterium Isolated from Kimchi

  • Seung-Min Yang; Eiseul Kim; So-Yun Lee; Soyeong Mun; Hae Choon Chang; Hae-Yeong Kim
    • Microbiology and Biotechnology Letters / v.51 no.1 / pp.128-131 / 2023
  • Here we report the draft genome sequence of Weissella koreensis strain HJ and a genomic analysis of its key features. The genome consists of 1,427,571 bp with a GC content of 35.5%, and comprises 1,376 coding genes. In silico analysis revealed the absence of pathogenic factors within the genome. The genome harbors several genes that play an important role in survival in the gastrointestinal tract. In addition, a type III polyketide synthase cluster was identified. Pangenome analysis identified 68 unique genes in W. koreensis strain HJ. The genome information of this strain provides the basis for understanding its probiotic properties.
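
The GC content quoted above is the share of G and C among the sequenced bases. A minimal Python sketch of that calculation follows; the function name is hypothetical, and the choice to ignore ambiguous symbols such as N is an assumption, since the abstract does not say how they were handled.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    acgt = gc + sum(seq.count(base) for base in "AT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / acgt

# Toy input, not real W. koreensis sequence.
print(gc_content("GGCCAATT"))  # 50.0
```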

The complete genome sequence of a white spot syndrome virus isolated from Litopenaeus vannamei (Whole-genome analysis of WSSV isolated from whiteleg shrimp, Litopenaeus vannamei)

  • Lee, A-reum; Kong, Kyoung-Hui; Kim, Hwi-Jin; Oh, Myung-Joo; Kim, Do-Hyung; Kim, Jong-Oh; Kim, Wi-Sik
    • Journal of fish pathology / v.35 no.1 / pp.129-133 / 2022
  • The full genome sequence of a Korean white spot syndrome virus (WSSV, isolate: WSSV-GoC18) is presented here. We obtained a total of 12,320,554 reads, which were assembled into 1 contig of 291,172 bases containing 170 genes and 170 coding DNA sequences. Phylogenetic analysis revealed that WSSV-GoC18 was closely related to a Chinese isolate (WSSV-PC) and distinctly different from a previously reported Korean isolate (WSSV K-LV1). The complete genome sequences of WSSV isolates will be of great help in molecular epidemiological studies, contributing to molecular diagnosis and disease prevention in shrimp aquaculture.

Chloroplast Genome Evolution in Early Diverged Leptosporangiate Ferns

  • Kim, Hyoung Tae; Chung, Myong Gi; Kim, Ki-Joong
    • Molecules and Cells / v.37 no.5 / pp.372-382 / 2014
  • In this study, the chloroplast (cp) genome sequences from three early diverged leptosporangiate ferns were completed and analyzed in order to understand the evolution of the genome of the fern lineages. The complete cp genome sequence of Osmunda cinnamomea (Osmundales) was 142,812 base pairs (bp). The cp genome structure was similar to that of eusporangiate ferns. The gene/intron losses that frequently occurred in the cp genome of leptosporangiate ferns were not found in the cp genome of O. cinnamomea. In addition, putative RNA editing sites in the cp genome were rare in O. cinnamomea, even though the sites were frequently predicted to be present in leptosporangiate ferns. The complete cp genome sequence of Diplopterygium glaucum (Gleicheniales) was 151,007 bp and has a 9.7 kb inversion between the trnL-CAA and trnV-GCA genes when compared to O. cinnamomea. Several repeated sequences were detected around the inversion break points. The complete cp genome sequence of Lygodium japonicum (Schizaeales) was 157,142 bp, and a deletion of the rpoC1 intron was detected. This intron loss was shared by all of the studied species of the genus Lygodium. The GC contents and the effective numbers of codons (ENCs) in ferns varied significantly when compared to seed plants. The ENC values of the early diverged leptosporangiate ferns showed intermediate levels between eusporangiate and core leptosporangiate ferns. However, our phylogenetic tree based on all of the cp gene sequences clearly indicated that the cp genome similarities between O. cinnamomea (Osmundales) and eusporangiate ferns are symplesiomorphies, rather than synapomorphies. Therefore, our data are in agreement with the view that Osmundales is a distinct early diverged lineage in the leptosporangiate ferns.

Bridging a Gap between DNA sequences and expression patterns of genes

  • Morishita, Shinichi
    • Proceedings of the Korean Society for Bioinformatics Conference / 2000.11a / pp.69-70 / 2000
  • The completion of human genome sequencing would motivate us to map millions of human cDNAs onto the unique ruler, the "genome sequence", in order to identify the exact address of each cDNA together with its exons, its promoter region, and its alternative splicing patterns. The expression patterns of some cDNAs could then be associated with these precise gene addresses, which would further accelerate studies that mine correlations between promoter motifs and gene expression in tissues. Towards this goal, we have developed time-and-space efficient software named SQUALL that can map one cDNA sequence a few thousand bases long onto a genome sequence thirty million bases long in a couple of minutes on average. Using SQUALL, we have mapped twenty thousand of our Bodymap (http://bodymap.ims.u-tokyo.ac.jp) cDNAs onto the genome sequences of chromosomes 21 and 22. In this talk, I will report the status of this ongoing project.
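
The abstract does not describe how SQUALL works internally, so the following Python sketch is not its algorithm. It only illustrates one generic way to locate a cDNA on a genome, assuming exact k-mer seeds: index every k-mer of the genome, look up each k-mer of the cDNA, and count hits per diagonal (genome position minus cDNA position). Each exon then shows up as a heavily hit diagonal, and consecutive exons differ by the intron length. All names and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict

def build_kmer_index(genome, k):
    """Map each k-mer to the list of genome positions where it starts."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return index

def seed_diagonals(cdna, index, k):
    """Count exact k-mer hits per diagonal (genome position minus cDNA position)."""
    diagonals = Counter()
    for j in range(len(cdna) - k + 1):
        for i in index.get(cdna[j:j + k], ()):
            diagonals[i - j] += 1
    return diagonals

# Toy data: a random 500-base "genome" and a cDNA made of two exons
# (genome[100:150] and genome[300:350]) with the intron between them removed.
rng = random.Random(1)
genome = "".join(rng.choice("ACGT") for _ in range(500))
cdna = genome[100:150] + genome[300:350]
index = build_kmer_index(genome, k=12)
for diagonal, hits in seed_diagonals(cdna, index, k=12).most_common(2):
    print(diagonal, hits)
```

A real mapper would also have to tolerate sequencing errors and refine the exon boundaries at splice sites, which exact seeds alone do not do.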

A Simple Java Sequence Alignment Editing Tool for Resolving Complex Repeat Regions

  • Ham, Seong-Il; Lee, Kyung-Eun; Park, Hyun-Seok
    • Genomics & Informatics / v.7 no.1 / pp.46-48 / 2009
  • Finishing is the most time-consuming step in sequencing, and many genome projects are left unfinished due to complex repeat regions. Here, we have developed BACContigEditor, a prototype shotgun sequence finishing tool. It is essentially an editor that visualizes assemblies of shotgun sequence fragment reads as gapped multiple alignments. The program offers some flexibility that is needed to rapidly resolve complex regions within a working session. The sole purpose of the release is to promote collaborative creation of extensible software for fragment assembly editors, foster collaborative development, and reduce barriers to initial tool development effort. We describe our software architecture and identify current challenges. The program is available under an Open Source license.

A Primer for Disease Gene Prioritization Using Next-Generation Sequencing Data

  • Wang, Shuoguo; Xing, Jinchuan
    • Genomics & Informatics / v.11 no.4 / pp.191-199 / 2013
  • High-throughput next-generation sequencing (NGS) technology produces a tremendous amount of raw sequence data. The challenges for researchers are to process the raw data, to map the sequences to the genome, to discover variants that differ from the reference genome, and to prioritize or rank the variants for the question of interest. The recent development of many computational algorithms and programs has vastly improved the ability to translate sequence data into valuable information for disease gene identification. However, NGS data analysis is complex and can be overwhelming for researchers who are not familiar with the process. Here, we outline the analysis pipeline and describe some of the most commonly used principles and tools for analyzing NGS data for disease gene identification.
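
The final step named above, prioritizing variants, can be sketched in Python as a filter followed by a sort. This is an illustration under stated assumptions, not the pipeline from the paper: the `Variant` fields, the severity ranks, and the 1% allele-frequency cutoff are all hypothetical choices.

```python
from dataclasses import dataclass

# Hypothetical severity ranks; a higher number means a more damaging predicted effect.
SEVERITY = {"stop_gained": 3, "missense": 2, "synonymous": 1}

@dataclass
class Variant:
    gene: str
    consequence: str
    population_af: float  # allele frequency in a reference population

def prioritize(variants, max_af=0.01):
    """Keep rare variants, then rank by severity (descending) and frequency (ascending)."""
    rare = [v for v in variants if v.population_af <= max_af]
    return sorted(rare, key=lambda v: (-SEVERITY.get(v.consequence, 0), v.population_af))

# Made-up variants in placeholder genes G1..G4.
candidates = [
    Variant("G1", "missense", 0.001),
    Variant("G2", "stop_gained", 0.005),
    Variant("G3", "stop_gained", 0.2),   # too common, filtered out
    Variant("G4", "synonymous", 0.0001),
]
print([v.gene for v in prioritize(candidates)])  # ['G2', 'G1', 'G4']
```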

The complete chloroplast genome sequence of Korean Neolitsea sericea (Lauraceae)

  • PARK, Yoo-Jung; CHEON, Kyeong-Sik
    • Korean Journal of Plant Taxonomy / v.51 no.3 / pp.332-336 / 2021
  • The complete chloroplast (cp) genome sequence of Neolitsea sericea was determined by Illumina sequencing. The complete cp genome was 152,446 bp in length, containing a large single-copy region of 93,796 bp and a small single-copy region of 18,506 bp, which were separated by a pair of 20,072 bp inverted repeats. A total of 112 unique genes were annotated, including 78 protein-coding genes (PCGs), 30 transfer RNAs, and four ribosomal RNAs. Among the PCGs, 18 genes contained one or two introns. A very low level of sequence variation between two cp genomes of N. sericea was found with seven insertions or deletions and only one single nucleotide polymorphism. An analysis using the maximum likelihood method showed that N. sericea was closely related to Actinodaphne trichocarpa.
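
The reported lengths can be checked against the usual quadripartite layout, in which the total is the large single-copy region plus the small single-copy region plus two inverted-repeat copies. The short Python check below uses only the numbers from the abstract; the function name is a hypothetical helper.

```python
def cp_genome_length(lsc, ssc, ir):
    """Total length of a quadripartite chloroplast genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir

total = cp_genome_length(lsc=93_796, ssc=18_506, ir=20_072)
print(total)         # 152446, matching the reported genome length
print(78 + 30 + 4)   # 112, matching the reported number of unique genes
```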

Draft genome sequence of oligosaccharide producing Leuconostoc lactis CCK940 isolated from kimchi in Korea (Genome sequence of the oligosaccharide-producing strain Leuconostoc lactis CCK940)

  • Lee, Sulhee; Park, Young-Seo
    • Korean Journal of Microbiology / v.54 no.4 / pp.445-447 / 2018
  • Leuconostoc lactis CCK940, which was isolated from kimchi obtained from a Korean traditional market, produced an oligosaccharide with a degree of polymerization of more than 4. In this study, we report the draft genome sequence of L. lactis CCK940, determined using the PacBio 20 kb platform. The genome assembly yielded 2 contigs. The genome was 1,741,511 base pairs in size with a G + C content of 43.33%, containing 1,698 coding sequences, 12 rRNA genes, and 68 tRNA genes. L. lactis CCK940 contained genes encoding glycosyltransferase, sucrose phosphorylase, maltose phosphorylase, and β-galactosidase, which could synthesize oligosaccharides.

Development of Workbench for Analysis and Visualization of Whole Genome Sequence (Development of a workbench for whole genome sequence analysis and visualization)

  • Choe, Jeong-Hyeon; Jin, Hui-Jeong; Kim, Cheol-Min; Jang, Cheol-Hun; Jo, Hwan-Gyu
    • The KIPS Transactions:PartA / v.9A no.3 / pp.387-398 / 2002
  • As whole genome sequences of many organisms have been revealed by small-scale genome projects, intensive research on individual genes and their functions has been performed. However, in-memory algorithms are inefficient for analyzing whole genome sequences, since the size of an individual genome ranges from several million base pairs to hundreds of billions of base pairs. To manipulate such huge sequence data effectively, an indexed data structure for external memory is necessary. In this paper, we introduce a workbench system for the analysis and visualization of whole genome sequences using the string B-tree, which is suitable for analyzing huge data. The system consists of two parts: an analysis query part and a visualization part. The query system supports various operations such as sequence search, k-occurrence, and k-mer analysis. The visualization system helps biologists easily understand the overall structure and specific features of a genome through many kinds of visualization, such as the whole genome sequence, annotation, CGR (Chaos Game Representation), k-mer, and RWP (Random Walk Plot). Using our workbench, one can find relations among organisms, predict the genes in a genome, and study the function of junk DNA.
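
Two of the analyses named above, k-mer counting and the Chaos Game Representation, are simple to sketch in Python. This is an in-memory illustration only, so it does not reproduce the string B-tree index that the workbench relies on for large genomes. The corner assignment (A, C, G, T at the four corners of the unit square) is an assumed, commonly used convention, and the function names are hypothetical.

```python
from collections import Counter

# Assumed corner convention for the unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def count_kmers(seq, k):
    """Count every overlapping substring of length k."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def cgr_points(seq):
    """Chaos Game Representation: move halfway toward the corner of each base."""
    x, y = 0.5, 0.5  # start at the center of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points

print(count_kmers("ACGACG", 3))
print(cgr_points("AG"))  # [(0.25, 0.25), (0.625, 0.625)]
```

Plotting the points for a long sequence gives the CGR image: each point's position encodes the recent history of bases, so over- or under-represented k-mers appear as dense or sparse regions.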