• Title/Abstract/Keywords: Genomic Sequence


Bioinformatics for the Korean Functional Genomics Project

  • Kim, Sang-Soo
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, 2000 International Symposium on Bioinformatics / pp.45-52 / 2000
  • Genomic approaches produce massive amounts of data within a short time period. New high-throughput automatic sequencers can generate over a million nucleotides of sequence information overnight. A typical DNA chip experiment produces tens of thousands of expression measurements, not to mention tens of megabytes of image files. These data must be handled automatically by computer and stored in electronic databases. Thus there is a need for a systematic approach to data collection, processing, and analysis. DNA sequence information is translated into amino acid sequence and analyzed for key motifs related to its biological and/or biochemical function. Functional genomics will play a significant role in identifying novel drug targets and diagnostic markers for serious diseases. As an enabling technology for functional genomics, bioinformatics is in great need worldwide. In Korea, a new functional genomics project has recently been launched; it focuses on identifying genes associated with cancers prevalent in Korea, namely gastric and hepatic cancers. This involves gene discovery by high-throughput sequencing of cancer cDNA libraries, gene expression profiling by DNA microarray and proteomics, and SNP profiling in the Korean patient population. Our bioinformatics team will support all these activities by collecting, processing, and analyzing these data.
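
The translate-then-scan step mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the project's pipeline: the function names `translate` and `find_motif` are hypothetical, the codon table is the standard genetic code, and the motif used is the PROSITE N-glycosylation pattern N-{P}-[ST]-{P}, chosen only as an example.

```python
import re
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (NCBI table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino = CODON_TABLE[dna[i:i + 3]]
        if amino == "*":  # stop codon ends the polypeptide
            break
        protein.append(amino)
    return "".join(protein)


def find_motif(protein: str, pattern: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    return [m.start() for m in re.finditer(f"(?={pattern})", protein)]


# Example motif (assumption for illustration): N-{P}-[ST]-{P}
N_GLYCOSYLATION = "N[^P][ST][^P]"

protein = translate("ATGAACGGTACCGCTTAA")
hits = find_motif(protein, N_GLYCOSYLATION)
```

The lookahead in `find_motif` is there so that overlapping occurrences of a motif are all reported rather than only the first of each overlapping group.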


Genome Sequence and Comparative Genome Analysis of Pseudomonas syringae pv. syringae Type Strain ATCC 19310

  • Park, Yong-Soon;Jeong, Haeyoung;Sim, Young Mi;Yi, Hwe-Su;Ryu, Choong-Min
    • Journal of Microbiology and Biotechnology / Vol. 24, No. 4 / pp.563-567 / 2014
  • Pseudomonas syringae pv. syringae (Psy) is a major bacterial pathogen of many economically important plant species. Despite the severity of its impact, the genome sequence of the type strain has not been reported. Here, we present the draft genome sequence of Psy ATCC 19310. Comparative genomic analysis revealed that Psy ATCC 19310 is closely related to Psy B728a. However, only a few type III effectors, which are key virulence factors, are shared by the two strains, indicating the possibility of host-pathogen specificity and genome dynamics even below the pathovar level.

DNA Sequence Alignment Using a Graph-based Distributed System

  • 이준수;안재균;여윤구;노홍찬;박상현
    • Korea Information Processing Society: Conference Proceedings / Korea Information Processing Society 2013 Spring Conference / pp.894-897 / 2013
  • Sequence alignment is one of the most widely used tools in genomics. With recent advances in next-generation sequencing (NGS) technology, data production has increased greatly, and with it the need for high-throughput sequence alignment algorithms. The alignment algorithm proposed in this paper transforms sequence data into a graph and then performs alignment using Trinity, Microsoft's graph-based in-memory distributed system. The algorithm successfully aligned simulated sequence data on the Trinity system and ran faster as the number of slaves increased, demonstrating its scalability.

Whole-genome sequence analysis through online web interfaces: a review

  • Gunasekara, A.W.A.C.W.R.;Rajapaksha, L.G.T.G.;Tung, T.L.
    • Genomics & Informatics / Vol. 20, No. 1 / pp.3.1-3.10 / 2022
  • The recent development of whole-genome sequencing technologies has paved the way for understanding the genomes of microorganisms. Every whole-genome sequencing (WGS) project requires considerable cost and a massive effort to address the questions at hand. The final step of WGS is data analysis. The analysis of whole-genome sequences depends on highly sophisticated bioinformatics tools that research personnel have to buy. However, many laboratories and research institutions do not have the bioinformatics capabilities to analyze genomic data and are therefore unable to take maximum advantage of whole-genome sequencing. In this respect, this study provides a guide for research personnel to a set of bioinformatics tools available online that can be used to analyze whole-genome sequence data of bacterial genomes. The web interfaces described here have many advantages and, in most cases, eliminate the need for costly analysis tools and intensive computing resources.

Complete genome and two plasmids sequences of Lactiplantibacillus plantarum L55 for probiotic potentials

  • Bogun Kim;Kiyeop Kim;Xiaoyue Xu;Hyunju Lee;Duleepa Pathiraja;Dong-June Park;In-Geol Choi;Sejong Oh
    • Journal of Animal Science and Technology / Vol. 65, No. 6 / pp.1341-1344 / 2023
  • In this study, we report the complete genome sequence of Lactiplantibacillus plantarum L55, a probiotic strain of lactic acid bacteria isolated from kimchi. The genome consists of one circular chromosome (2,077,416 base pair [bp]) with a guanine cytosine (GC) content of 44.5%, and two circular plasmid sequences (54,267 and 19,592 bp, respectively). We also conducted a comprehensive analysis of the genome, which identified the presence of functional genes, genomic islands, and antibiotic-resistance genes. The genome sequence data presented in this study provide insights into the genetic basis of L. plantarum L55, which could be beneficial for the future development of probiotic applications.
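
As a rough illustration of the summary statistics quoted in the abstract above, the following Python sketch shows how GC content and total assembly size are typically computed. The function name `gc_content` and the dictionary `REPLICON_SIZES_BP` are hypothetical; the replicon sizes are taken from the abstract, while the toy sequence is invented for the example and is not from the L55 genome.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Replicon sizes (bp) as reported in the abstract; the labels are assumed.
REPLICON_SIZES_BP = {
    "chromosome": 2_077_416,
    "plasmid_1": 54_267,
    "plasmid_2": 19_592,
}

total_bp = sum(REPLICON_SIZES_BP.values())

# Toy sequence (invented) to show the call; real input would be the
# chromosome sequence read from a FASTA file.
toy_gc = gc_content("ATGCGCTA")
```

For a real genome the same function would be applied to each replicon separately, since plasmids often differ in GC content from the chromosome.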

Analysis of nucleotide sequence of a novel plasmid, pILR091, from Lactobacillus reuteri L09 isolated from pig

  • Lee, Deog-Yong;Kang, Sang-Gyun;Rayamajhi, Nabin;Kang, Milan;Yoo, Han Sang
    • Korean Journal of Veterinary Research / Vol. 48, No. 4 / pp.441-449 / 2008
  • The genus Lactobacillus is the largest of the genera included in the lactic acid bacteria and is associated with the mucosal membranes of humans and animals. Only a few Lactobacillus plasmid-encoded functions have been discovered and used. In this study, a novel plasmid (pILR091) was isolated from a wild L. reuteri strain obtained from a pig, and the characteristics of its replicons, its genetic organization, and its relationship with other plasmids were described. After digestion of pILR091 with SalI, the plasmid DNA was cloned into the pQE-30Xa vector and sequenced. The complete sequence was confirmed by sequencing PCR products and analyzed against the GenBank database. The copy number and stability of the plasmid in the isolate were determined by quantitative PCR. The complete plasmid sequence contained 7,185 nucleotides with 39% G-C content and a single cut site for each of two enzymes, SalI and HindIII. An ori sequence similar to that of the pC194 rolling-circle replication family (TTTATATTGAT) was located 63 bp upstream of the replication protein sequence, ORF1. A total of five ORFs were identified, and the coding sequence represented 4,966 nucleotides (70.4%). ORF1 of pILR091 had low similarity to the sequence of pTE44. The other ORFs also showed low homology and E-values. The average G-C content of pILR091 was 39%, similar to that of the genomic DNA. The copy number of pILR091 was determined to be approximately 24 to 25 molecules per genomic DNA. These results suggest that pILR091 might be a good candidate for constructing a new vector for cloning and expression of foreign genes in lactobacilli.

Molecular Characterization of a Chinese cabbage cDNA, C-DH, Predominantly Induced by Water-Deficit Stress and Plant Hormone, ABA

  • 정나은;이균오;홍창휘;정배교;박정동;이상열
    • Korean Journal of Plant Pathology / Vol. 14, No. 3 / pp.240-246 / 1998
  • A cDNA encoding a desiccation-related protein (C-DH) was isolated from a flower bud cDNA library of Chinese cabbage, and its nucleotide sequence was characterized. It contains 679 bp with a 501 bp open reading frame. The amino acid sequence of the putative protein showed the highest homology (79% identity) to a dehydrin protein of Gossypium hirsutum. C-DH also shares 48-52% amino acid sequence identity with other typical dehydrin proteins in plant cells. When the amino acid sequences of these proteins were aligned, several peptide motifs were well conserved; their function remains to be determined. Notably, C-DH contains 15 additional amino acids at its N-terminus. Genomic Southern blot analysis using the coding region of C-DH showed that C-DH is a single-copy gene in the Chinese cabbage genome. The C-DH mRNA, whose transcript size is 0.7 kb, was expressed in a tissue-specific manner: it was highly expressed in seeds and flower buds, and low expression was detected in root, stem, and leaf tissues of Chinese cabbage. The transcript level of C-DH was significantly induced by treatment with the plant hormone abscisic acid and by water-deficit conditions.


An Efficient DNA Sequence Compression using Small Sequence Pattern Matching

  • Murugan, A.;Punitha, K.
    • International Journal of Computer Science & Network Security / Vol. 21, No. 8 / pp.281-287 / 2021
  • Bioinformatics blends biology and information technology, applying statistical methods to problems in nutrition, medical research, and the study of the living environment. The steady growth of DNA sequencing technologies has produced voluminous genomic data, especially DNA sequences, and with it a demand for more storage and bandwidth. Bioinformatics now faces the major hurdle of managing, interpreting, and accurately preserving this information, and compression is a promising way to address it. To keep storage efficient, a compression methodology is proposed, together with an algorithm for exact matching of small patterns. The DNA sequence representation is used to find matches of 2 to 6 bases against the remaining input sequence. In the first level, the DNA sequence is transformed into ASCII symbols; in the second level, it is compressed using the LZ77 compression method, after which grid variables of size 3 are formed to hold 100 characters. In the third level of compression, the compressed output is held in the grid variables. The proposed algorithm, S_Pattern DNA, gives an average compression ratio of 93%, better than existing compression algorithms, on datasets from the UCI repository.
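
The abstract above gives too little detail to reproduce S_Pattern DNA itself, so the Python sketch below shows only the general idea of a two-level scheme: re-encode the bases compactly, then apply a dictionary compressor. It is a stand-in, not the authors' algorithm. The 2-bit packing and the helper names (`pack`, `unpack`, `compress`, `decompress`) are assumptions, and `zlib` (DEFLATE, which combines LZ77 with Huffman coding) is substituted for a plain LZ77 stage.

```python
import zlib

# 2-bit code per base (assumed encoding; handles only A, C, G, T).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def pack(seq: str) -> bytes:
    """Level 1: pack four bases into each byte, zero-padding the last byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, n: int) -> str:
    """Invert pack(); n is the original number of bases."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASE[(byte >> shift) & 3])
    return "".join(bases[:n])


def compress(seq: str) -> tuple[int, bytes]:
    """Level 2: run DEFLATE over the packed bytes; keep the base count."""
    return len(seq), zlib.compress(pack(seq), 9)


def decompress(n: int, blob: bytes) -> str:
    return unpack(zlib.decompress(blob), n)


seq = "ACGTACGTAC" * 50 + "GGA"
n, blob = compress(seq)
restored = decompress(n, blob)
```

The base count is stored alongside the compressed bytes because the padding in the final byte is otherwise indistinguishable from trailing `A` bases.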

Identification of a Carduus spp. Showing Anti-Mycobacterial Activity by DNA Sequence Analysis of Its ITS1, 5.8S rRNA and ITS2

  • 배영민
    • Journal of Life Science / Vol. 20, No. 4 / pp.578-583 / 2010
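  • Extracts of Nuro and Daegye, both reported to inhibit the growth of bacteria and fungi, were tested for their ability to inhibit the growth of Mycobacterium smegmatis and Mycobacterium fortuitum. No growth inhibition was found with the Nuro extract, but clear growth inhibition was observed with the Daegye extract. To carry out a taxonomic or evolutionary analysis of the Daegye (thistle) used in this study, genomic DNA was extracted and the region containing ITS1, the 5.8S rRNA gene, and ITS2 was amplified by PCR. Sequencing of the PCR product yielded a 733-bp sequence, which was deposited in GenBank (accession number GU188570). BLAST analysis with this sequence showed that no organism with an identical sequence has yet been reported in GenBank. The closest plants were Carduus crispus, a naturalized plant distributed throughout Korea, and Carduus defloratus, which has not so far been reported to grow wild in Korea; each differed from the query by three bases.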

Cloning and Characterization of Squalene Synthase (SQS) Gene from Ganoderma lucidum

  • Zhao, Ming-Wen;Liang, Wan-Qi;Zhang, Da-Bing;Wang, Nan;Wang, Chen-Guang;Pan, Ying-Jie
    • Journal of Microbiology and Biotechnology / Vol. 17, No. 7 / pp.1106-1112 / 2007
  • This report provides the complete nucleotide sequences of the full-length cDNA encoding squalene synthase (SQS) and its genomic DNA sequence from a triterpene-producing fungus, Ganoderma lucidum. The cDNA of the squalene synthase (SQS) (GenBank Accession Number: DQ494674) was found to contain an open reading frame (ORF) of 1,404 bp encoding a 468-amino-acid polypeptide, whereas the SQS genomic DNA sequence (GenBank Accession Number: DQ494675) consisted of 1,984 bp and contained four exons and three introns. Only one gene copy was present in the G. lucidum genome. The deduced amino acid sequence of Ganoderma lucidum squalene synthase (GI-SQS) exhibited a high homology with other fungal squalene synthase genes and contained six conserved domains. A phylogenetic analysis revealed that G. lucidum SQS belonged to the fungi SQS group, and was more closely related to the SQS of U. maydis than to those of other fungi. A gene expression analysis showed that the expression level was relatively low in mycelia incubated for 12 days, increased after 14 to 20 days of incubation, and reached a relatively high level in the mushroom primordia. Functional complementation of GI-SQS in a SQS-deficient strain of Saccharomyces cerevisiae confirmed that the cloned cDNA encoded a squalene synthase.