• Title/Summary/Keyword: multiple sequence alignment

Cloning and Sequence Analysis of a Levansucrase Gene from Rahnella aquatilis ATCC15552

  • Kim, Hyun-Jin;Yang, Ji-Young;Lee, Hyeon-Gye;Cha, Jae-Ho
    • Journal of Microbiology and Biotechnology / v.11 no.4 / pp.693-699 / 2001
  • An intracellular levansucrase gene, lscR, from Rahnella aquatilis ATCC 15552 was cloned and its nucleotide sequence was determined. Nucleotide sequence analysis of this gene revealed a 1,238 bp open reading frame coding for a protein of 415 amino acids. The levansucrase was expressed in Escherichia coli BL21 (DE3) using a T7 promoter, and the enzyme activity was detected in the cytoplasmic fraction. The optimum pH and temperature of this enzyme for levan formation were pH 6 and $30^{\circ}C$, respectively. The deduced amino acid sequence of the lscR gene showed high sequence similarity (59-89%) with Gram-negative levansucrases, while the level of similarity with Gram-positive enzymes was less than 42%. Multiple alignments of levansucrase sequences reported from Gram-negative and Gram-positive bacteria revealed seven conserved regions. A comparison of the catalytic properties and deduced amino acid sequence of lscR with those of other bacterial levansucrases strongly suggests that Gram-negative and Gram-positive levansucrases differ in overall structure but share a similar structure at the active site.

A Web-Based High Performance Multiple Sequence Alignment System Design and Implementation (웹 기반 고성능 다중서열정렬시스템 설계 및 구현)

  • Kim, Tae-Kyung;Kim, Hun-Gi;Choi, Chi-Hwan;Jung, Seung-Hyun;Hou, Bo-Kyeng;Cho, Wan-Sup
    • Proceedings of the Korean Society of Computer Information Conference / 2010.07a / pp.79-82 / 2010
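  • Multiple sequence alignment algorithms are the most widely used tools for sequence-based phylogenetic analysis in bioinformatics. The most representative public program is ClustalW, which users can install and run on a local system. In practice, however, users who have installed ClustalW face various difficulties in preparing, manipulating, and processing sequence data and in integrating with other systems. This paper therefore proposes a web-based high-performance multiple sequence alignment system that makes multiple sequence alignment convenient and fast. The proposed system (1) maximizes computational performance by using the resources of many PCs efficiently through an Inter-Query routing algorithm, and (2) provides a user-friendly web interface that supports personalized data management, real-time monitoring, and data editing, so that users can easily collect, manage, and process sequence data.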
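
The abstract does not describe how its Inter-Query routing algorithm works, so the following is only an illustrative sketch of one common way to spread independent alignment queries over several PCs: greedy assignment of each query to the currently least-loaded worker. The function name `route_queries` and the per-query cost estimates are hypothetical and are not taken from the paper.

```python
import heapq


def route_queries(query_costs, n_workers):
    """Greedy least-loaded dispatch (an assumption, not the paper's algorithm).

    query_costs: estimated cost of each query, in arrival order.
    Returns a dict mapping worker index -> list of query indices.
    """
    # Heap of (current_load, worker_id); the smallest load is popped first.
    heap = [(0, worker) for worker in range(n_workers)]
    heapq.heapify(heap)
    assignment = {worker: [] for worker in range(n_workers)}
    for query_id, cost in enumerate(query_costs):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(query_id)
        heapq.heappush(heap, (load + cost, worker))
    return assignment


if __name__ == "__main__":
    print(route_queries([5, 1, 1, 1], n_workers=2))
```

With one expensive query and three cheap ones, the cheap queries all land on the second worker, which is the behavior one would want from whole-query (inter-query) parallelism.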

Bioinformatics based Identification and Characterization of Epoxide Hydrolase of Gordonia westfalica for the Production of Chiral Epoxides (Bioinformatics를 활용한 토양미생물인 Gordonia westfalica Epoxide Hydrolase 생촉매 개발 및 Chiral Epoxides 제조 특성 분석)

  • Lee Soo Jung;Lee Eun Jung;Kim Hee Sook;Lee Eun Yeol
    • KSBB Journal / v.20 no.4 / pp.311-316 / 2005
  • Epoxide hydrolases (EHs) are versatile biocatalysts for the preparation of chiral epoxides by enantioselective hydrolysis of racemic epoxides. Various microorganisms were identified as possessing EH activity by multiple sequence alignment and analysis of conserved domain sequences from genomic and megaplasmid sequence data. We successfully isolated Gordonia westfalica, which possesses EH activity, from among various microbial strains obtained from type culture collections. G. westfalica exhibited enantioselective hydrolysis activity with a preference for (R)-styrene oxide. Chiral (S)-styrene oxide with high optical purity $(>99\%\;ee)$ and a yield of $36.5\%$ was obtained from its racemate using whole cells of G. westfalica.
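
For readers unfamiliar with the optical purity figure, enantiomeric excess (ee) is conventionally the absolute difference between the amounts of the two enantiomers divided by their sum, expressed as a percentage. The sketch below uses hypothetical amounts, not data from the paper.

```python
def enantiomeric_excess(major, minor):
    """Return ee in percent from the amounts of the two enantiomers."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * abs(major - minor) / total


if __name__ == "__main__":
    # Hypothetical mixture: 99.5 parts (S) and 0.5 parts (R).
    print(enantiomeric_excess(99.5, 0.5))
```

Because a kinetic resolution consumes one enantiomer of the racemate, the theoretical maximum yield of the remaining enantiomer is 50%.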

Global Sequence Homology Detection Using Word Conservation Probability

  • Yang, Jae-Seong;Kim, Dae-Kyum;Kim, Jin-Ho;Kim, Sang-Uk
    • Interdisciplinary Bio Central / v.3 no.4 / pp.14.1-14.9 / 2011
  • Protein homology detection is an important issue in comparative genomics. Because of the exponential growth of sequence databases, fast and efficient homology detection tools are urgently needed. Currently, sequence comparison methods based on local alignment, such as BLAST, are generally used for homology detection because they give a reasonable measure of sequence similarity. However, these methods have drawbacks in measuring overall sequence similarity, especially for eukaryotic genomes, whose sequences often contain many insertions and duplications. These methods also provide no explicit model of speciation, so it is difficult to translate their similarity measures into homology calls. Here, we present a novel method based on the Word Conservation Score (WCS) to address the current limitations of homology detection. Instead of counting each amino acid, we adopted the concept of a 'Word' to compare sequences. WCS measures overall sequence similarity by comparing word contents, which is much faster than BLAST comparisons. Furthermore, the evolutionary distance between homologous sequences can be measured by WCS. Therefore, we expect sequence comparison with WCS to be useful for multiple-species comparisons of large genomes. In performance comparisons on protein structural classifications, our method showed a considerable improvement over BLAST. Our method also found larger micro-syntenic blocks, which consist of orthologs with conserved gene order. By testing on various datasets, we showed that WCS gives a faster and better overall similarity measure than BLAST.
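
The abstract does not give the WCS formula, so the sketch below is not WCS. It only illustrates the general idea of comparing sequences by their word (k-mer) content without alignment, using a plain Jaccard similarity over sets of overlapping words. The function names and the default word length `k=3` are assumptions made for this example.

```python
def words(seq, k=3):
    """Return the set of overlapping length-k words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def word_similarity(a, b, k=3):
    """Jaccard similarity of the word sets of a and b (not the paper's WCS)."""
    wa, wb = words(a, k), words(b, k)
    if not wa and not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


if __name__ == "__main__":
    print(word_similarity("MKVLAAGIV", "MKVLSAGIV"))
```

Set operations like these run in time roughly linear in sequence length, which is the kind of speed advantage over alignment that word-based methods aim for.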

Cloning and characterization of a cDNA encoding a paired box protein, PAX7, from black sea bream, Acanthopagrus schlegelii

  • Choi, Jae Hoon;Han, Dan Hee;Gong, Seung Pyo
    • Journal of Animal Reproduction and Biotechnology / v.36 no.4 / pp.314-322 / 2021
  • The paired box protein PAX7 is a key molecule for the specification and maintenance of muscle satellite cells and for skeletal muscle regeneration. In this study, we identified and characterized the cDNA and amino acid sequences of PAX7 from black sea bream (Acanthopagrus schlegelii) via molecular cloning and sequence analysis. The A. schlegelii PAX7 cDNA comprised 1,524 bp encoding 507 amino acids. Multiple sequence alignment of the translated amino acids showed that it contained three domains, a paired DNA-binding domain, a homeobox domain, and an OAR domain, all of which were well conserved across the animal species investigated. Pairwise sequence alignment indicated that A. schlegelii PAX7 had the same amino acid sequence as that of yellowfin seabream (A. latus) and 99.8% identity and similarity with that of gilt-head bream (Sparus aurata). Molecular phylogenetic analysis confirmed that A. schlegelii PAX7 formed a monophyletic group with those of teleosts and was most closely related to those of fish belonging to the Sparidae family, including A. latus and S. aurata. In the investigation of its tissue-specific mRNA expression, expression was detected specifically in skeletal muscle tissue, and weak expression was also observed in gonad tissue. Cultured cells derived from skeletal muscle tissue expressed PAX7 mRNA at early passages, but expression was no longer observed after several rounds of subculture.
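
Two of the figures above can be sanity-checked with simple arithmetic. The sketch below assumes that the reported length counts a terminal stop codon, and that percent identity is computed over aligned positions where neither sequence has a gap; tools differ in the denominator they use, so this is one convention rather than the authors' method. Function names are hypothetical.

```python
def orf_length_bp(n_amino_acids, include_stop=True):
    """Length in bp of a coding sequence for n_amino_acids residues."""
    return 3 * (n_amino_acids + (1 if include_stop else 0))


def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    print(orf_length_bp(507))                 # 507 codons plus a stop codon
    print(percent_identity("ACGT", "ACGA"))   # toy alignment
```

Under these assumptions, 507 amino acids plus a stop codon correspond to 1,524 bp, which matches the abstract. As a purely arithmetic illustration, a single mismatch across 507 aligned residues would round to 99.8% identity.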

Implementation of Parallel Local Alignment Method for DNA Sequence using Apache Spark (Apache Spark을 이용한 병렬 DNA 시퀀스 지역 정렬 기법 구현)

  • Kim, Bosung;Kim, Jinsu;Choi, Dojin;Kim, Sangsoo;Song, Seokil
    • The Journal of the Korea Contents Association / v.16 no.10 / pp.608-616 / 2016
  • The Smith-Waterman (SW) algorithm is a local alignment algorithm and one of the important operations in DNA sequence analysis. The SW algorithm finds the optimal local alignment with respect to the scoring system being used, but it requires a long execution time. To address this problem, several methods for performing SW in a distributed and parallel manner have been proposed. ADAM, a distributed and parallel processing framework for DNA sequences, includes a parallel SW. However, the parallel SW of ADAM does not take into account that SW is a dynamic programming method, which limits its performance. In this paper, we propose a method to enhance the parallel SW of ADAM. The proposed parallel SW (PSW) is performed in two phases. In the first phase, the PSW splits a DNA sequence into a number of partitions and assigns them to multiple nodes. The original Smith-Waterman algorithm is then performed in parallel at each node. In the second phase, the PSW estimates the portions of the sequence that should be recalculated, and the recalculation is performed on those portions in parallel at each node. In the experiments, we compare the proposed PSW with the parallel SW of ADAM to show the superiority of the PSW.
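
For reference, the sequential Smith-Waterman recurrence that each node would run on its partition can be sketched as follows. This is a minimal single-machine version with a linear gap penalty; the scoring values are arbitrary assumptions, and it does not reproduce the paper's partitioning or recalculation phases, nor ADAM's implementation.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,       # gap in b
                          H[i][j - 1] + gap)       # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTACGTAA", "GGACGTCC"))
```

Each cell depends on its upper, left, and upper-left neighbors, so naively splitting a sequence into independent partitions can miss alignments that cross partition boundaries. That dependency is the reason a recalculation phase such as the one described above is needed.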

A Simple Java Sequence Alignment Editing Tool for Resolving Complex Repeat Regions

  • Ham, Seong-Il;Lee, Kyung-Eun;Park, Hyun-Seok
    • Genomics & Informatics / v.7 no.1 / pp.46-48 / 2009
  • Finishing is the most time-consuming step in sequencing, and many genome projects are left unfinished due to complex repeat regions. Here, we have developed BACContigEditor, a prototype shotgun sequence finishing tool. It is essentially an editor that visualizes assemblies of shotgun sequence fragment reads as gapped multiple alignments. The program offers some flexibility that is needed to rapidly resolve complex regions within a working session. The sole purpose of the release is to promote collaborative creation of extensible software for fragment assembly editors, foster collaborative development, and reduce barriers to initial tool development effort. We describe our software architecture and identify current challenges. The program is available under an Open Source license.

A study of system development for multiple sequence alignment (복수 서열 정렬을 위한 시스템 개발에 관한 연구)

  • Kim, Dong-Hoi;Kim, Jin
    • Proceedings of the Korea Information Processing Society Conference / 2003.05b / pp.1027-1030 / 2003
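  • Genome sequencing is increasing explosively. The ultimate goal of the Human Genome Project is to identify the 3 billion nucleotides and 100,000 genes in the human chromosomes and to use this information for new discoveries and applications in biomedicine. The project began in the late 1980s, and sequence determination has now been completed. This paper discusses the multiple nucleotide sequence alignment problem, one of the most important problems to arise from the Human Genome Project, and the implementation of a multiple nucleotide sequence alignment system.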
