• Title/Summary/Keyword: DNA sequence alignment


Cloning, Expression, and Characterization of DNA Polymerase from Hyperthermophilic Bacterium Aquifex pyrophilus

  • Choi, Jeong-Jin;Kwon, Suk-Tae
    • Journal of Microbiology and Biotechnology / v.14 no.5 / pp.1022-1030 / 2004
  • The gene encoding Aquifex pyrophilus (Apy) DNA polymerase was cloned and sequenced. The Apy DNA polymerase gene consists of 1,725 bp coding for a protein of 574 amino acid residues. The deduced amino acid sequence of Apy DNA polymerase showed high sequence homology to Escherichia coli DNA polymerase I-like DNA polymerases. Amino acid sequence alignment indicated that Apy DNA polymerase, like the Klenow fragment, has only two domains, the $3'{\rightarrow}5'$ exonuclease domain and the $5'{\rightarrow}3'$ polymerase domain, containing the characteristic motifs. The Apy DNA polymerase gene was expressed in E. coli under the control of the T7lac promoter on the expression vector pET-22b(+). The expressed enzyme was purified by heat treatment followed by Cibacron blue 3GA and UNO™ Q column chromatography. The optimum pH of the purified enzyme was 7.5, and the optimal concentrations of KCl and $Mg^{2+}$ were 20 mM and 3 mM, respectively. Apy DNA polymerase had a double strand-dependent $3'{\rightarrow}5'$ proofreading exonuclease activity but lacked any detectable $5'{\rightarrow}3'$ exonuclease activity, which is consistent with its amino acid sequence. The thermostability of Apy DNA polymerase, which is somewhat lower than the growth temperature of A. pyrophilus would suggest, was analyzed by comparing amino acid composition and pressure effects.

Cloning and characterization of a cDNA encoding a paired box protein, PAX7, from black sea bream, Acanthopagrus schlegelii

  • Choi, Jae Hoon;Han, Dan Hee;Gong, Seung Pyo
    • Journal of Animal Reproduction and Biotechnology / v.36 no.4 / pp.314-322 / 2021
  • The paired box protein PAX7 is a key molecule for the specification and maintenance of muscle satellite cells and for skeletal muscle regeneration. In this study, we identified and characterized the cDNA and amino acid sequences of PAX7 from black sea bream (Acanthopagrus schlegelii) via molecular cloning and sequence analysis. The A. schlegelii PAX7 cDNA comprised 1,524 bp encoding 507 amino acids. Multiple sequence alignment of the translated amino acids showed that it contains three domains, a paired DNA-binding domain, a homeobox domain, and an OAR domain, all of which were well conserved across the animal species investigated. Pairwise sequence alignment indicated that A. schlegelii PAX7 had an amino acid sequence identical to that of yellowfin seabream (A. latus) and shared 99.8% identity and similarity with that of gilt-head bream (Sparus aurata). Molecular phylogenetic analysis confirmed that A. schlegelii PAX7 formed a monophyletic group with those of teleosts and was most closely related to those of fish belonging to the family Sparidae, including A. latus and S. aurata. In the analysis of tissue-specific mRNA expression, expression was detected specifically in skeletal muscle tissue, with weak expression also seen in gonad tissue. Cultured cells derived from skeletal muscle tissue expressed PAX7 mRNA at early passage, but the expression was no longer observed after several rounds of subculture.

Optimized and Portable FPGA-Based Systolic Cell Architecture for Smith-Waterman-Based DNA Sequence Alignment

  • Shah, Hurmat Ali;Hasan, Laiq;Koo, Insoo
    • Journal of information and communication convergence engineering / v.14 no.1 / pp.26-34 / 2016
  • The alignment of DNA sequences is one of the important processes in the field of bioinformatics. The Smith-Waterman algorithm (SWA) finds optimal alignments but is computationally expensive. Among implementation platforms for SWA, field-programmable gate arrays (FPGAs) perform best on criteria such as cost, speed-up, and ease of reconfiguration. The performance of FPGA-based SWA depends on an efficient design of the cell, the basic implementation unit. In this paper, we present an optimized systolic cell design that avoids the oversimplification, very large-scale integration (VLSI)-level design, and direct mapping of iterative equations found in previous cell designs. The proposed design makes efficient use of hardware resources and is portable because it is not based on gate-level details. Our cell design implementing a linear gap penalty resulted in a performance improvement of 32× over a general-purpose processor (GPP) platform and surpassed the hardware utilization of another implementation by a factor of 4.23.
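
For readers unfamiliar with the recurrence that each systolic cell evaluates, here is a minimal software sketch of Smith-Waterman local alignment with a linear gap penalty. It is not the paper's FPGA design. The function name `smith_waterman` and the scoring values (match 2, mismatch -1, gap -2) are illustrative assumptions, not taken from the abstract.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Scoring values are illustrative assumptions; the gap penalty is linear,
    so every gap position costs the same `gap` score.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                      # start a new local alignment
                H[i - 1][j - 1] + sub,  # match or mismatch
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal can be computed at the same time. Systolic-array implementations exploit this property.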

cDNA Sequence and mRNA Expression of a Novel Peroxiredoxin from the Firefly, Pyrocoelia rufa

  • Jin, Byung-Rae;Lee, Kwang-Sik;Kim, Seong-Ryul;Sohn, Hung-Dae
    • International Journal of Industrial Entomology and Biomaterials / v.4 no.2 / pp.101-107 / 2002
  • We describe here the cDNA sequence and mRNA expression of a novel member of the antioxidant protein family, peroxiredoxin, from the firefly, Pyrocoelia rufa. The 555 bp cDNA sequence codes for a 185 amino acid protein with a calculated molecular mass of approximately 21 kDa. The deduced protein of the P. rufa peroxiredoxin gene contains two conserved cysteine residues. Alignment of the deduced protein of the P. rufa peroxiredoxin gene showed 71.1% protein sequence identity to the known peroxiredoxin of the insect Drosophila melanogaster. Northern blot analysis revealed that the P. rufa peroxiredoxin is specifically expressed in the fat body of P. rufa larvae.

cDNA Sequence and mRNA Expression of a Novel Serine Protease from the Firefly, Pyrocoelia rufa

  • Lee, Kwang-Sik;Kim, Seong-Ryul;Sohn, Hung-Dae;Jin, Byung-Rae
    • International Journal of Industrial Entomology and Biomaterials / v.5 no.1 / pp.103-108 / 2002
  • We describe here the cDNA sequence and mRNA expression of a novel serine protease from the firefly, Pyrocoelia rufa. The 771 bp cDNA encodes 257 amino acid residues. The deduced protein of the P. rufa serine protease gene contains the catalytic triad and six conserved cysteine residues. Alignment of the deduced protein of the P. rufa serine protease gene showed 47.4% protein sequence identity to the known midgut trypsin-like enzyme of the coleopteran insect Rhyzopertha dominica. Northern blot analysis revealed that the P. rufa serine protease is specifically expressed in the midgut of P. rufa larvae.

Physiological and Phylogenetic Analysis of Burkholderia sp. HY1 Capable of Aniline Degradation

  • Kahng, Hyung-Yeel;Kukor, Jerome J.;Oh, Kye-Heon
    • Journal of Microbiology and Biotechnology / v.10 no.5 / pp.643-650 / 2000
  • A new aniline-utilizing microorganism, strain HY1, obtained from an orchard soil, was characterized by using the BIOLOG system, an analysis of the total cellular fatty acids, and a 16S rDNA sequence. Strain HY1 was identified as a Burkholderia species and was designated Burkholderia sp. HY1. GC and HPLC analyses revealed that Burkholderia sp. HY1 was able to degrade aniline to produce catechol, which was subsequently converted to cis,cis-muconic acid through an ortho-ring fission pathway under aerobic conditions. Strain HY1 exhibited a drastic reduction in the rate of aniline degradation when glucose was added to the aniline media. However, the addition of peptone or nitrate to the aniline media dramatically accelerated the rate of aniline degradation. A fatty acid analysis showed that strain HY1 was able to produce lipids 16:0 2OH, and 11 methyl 18:1 $\omega 7c$ approximately 3.7-, 2.2-, and 6-fold more, respectively, when grown on aniline media than when grown on TSA. An analysis of the 16S rDNA sequence revealed that strain HY1 was very closely related to Burkholderia graminis, with 95% similarity based on the alignment of a 1,435 bp fragment. A phylogenetic analysis of the 16S rDNA sequence based on a 1,420 bp multi-alignment showed that strain HY1 was placed among three major clonal types of $\beta$-Proteobacteria, including Burkholderia graminis, Burkholderia phenazinium, and Burkholderia glathei. The sequence GAT(C or G)$\underline{G}$, which is highly conserved in several locations in the 16S rDNA gene among the major clonal type strains of $\beta$-Proteobacteria, was frequently replaced with GAT(C or G)$\underline{A}$ in the 16S rDNA sequence from strain HY1.

Algorithm of Clustering-based Multiple Sequence Alignment (클러스터링 기반 다중 서열 정렬 알고리즘)

  • Lee, Byung-Il;Lee, Jong-Yun;Jung, Soon-Key
    • Proceedings of the Korea Information Processing Society Conference / 2005.05a / pp.27-30 / 2005
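  • Multiple sequence alignment (MSA), which aligns three or more DNA or protein sequences, is an essential tool for studying the evolutionary relationships among sequences and the structure and function of proteins. The most useful method for obtaining an optimal multiple sequence alignment is dynamic programming. However, because its running time grows exponentially ($O(n^k)$) as the number of sequences to be aligned increases, dynamic programming is not efficient for multiple sequence alignment. This paper therefore proposes a new clustering-based multiple sequence alignment (CMSA) algorithm to address the optimal MSA problem. The proposed CMSA algorithm is expected to improve the quality of multiple sequence alignments and to reduce the processing time ($O(n^3L^2)$).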
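
To make the exponential growth concrete, the sketch below counts the cells in the dynamic programming table for an exact alignment of several equal-length sequences. It assumes that, in $O(n^k)$, $n$ is the sequence length and $k$ the number of sequences; that reading of the abstract's notation is an assumption. The function name `dp_cells` is hypothetical.

```python
def dp_cells(seq_length, num_sequences):
    """Number of cells in the full dynamic programming table.

    Assumes all sequences have the same length. Each sequence adds one
    table dimension with (seq_length + 1) entries, including the empty prefix.
    """
    return (seq_length + 1) ** num_sequences


if __name__ == "__main__":
    # Table size for sequences of length 100 as more sequences are added.
    for k in (2, 3, 4, 5):
        print(k, dp_cells(100, k))
```

Under these assumptions, each added sequence multiplies the table size by roughly the sequence length. This is why exact dynamic programming is rarely practical beyond a handful of sequences.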

Phylogenetic Analysis of Dendropanax morbifera Using Nuclear Ribosomal DNA Internal Transcribed Spacer (ITS) Region Sequences (Internal transcribed spacer (ITS) region의 염기서열 분석에 의한 보길도산 황칠나무의 분자 계통학적 연구)

  • Shin, Yong Kook
    • Journal of Life Science / v.26 no.11 / pp.1341-1344 / 2016
  • Dendropanax morbifera is a tree species endemic to Korea, where it is restricted to the southern parts of the country. The internal transcribed spacer (ITS) region of nuclear ribosomal DNA (nrDNA) was determined for Dendropanax morbifera grown at Bogil-do, Korea. We investigated the sequence-based phylogenetic relationships of related plants and clarified its taxonomic position. The determined sequences consisted of 689 residues. ITS1 was 222 bp long, ITS2 was 233 bp long, and the 5.8S rDNA was 160 bp long. The ITS region sequences of the Dendropanax species included in this study were obtained from GenBank. Oreopanax polycephalus was used as the outgroup. A pairwise alignment was calculated using the Clustal X program. A phylogenetic tree was constructed by the neighbor-joining method using the TreeView program. In the sequence-based phylogenetic analysis using a total of 615 base pairs of ITS1, 5.8S rDNA, and ITS2, sequence similarities among species including the D. morbifera Bogil-do isolate ranged from 92.6 to 99.7%. The D. morbifera Bogil-do isolate exhibited the highest degree of relatedness to D. chevalieri, sharing 99.7% ITS region similarity. It was also closely related to D. trifidus, sharing 99.4% ITS region similarity.
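
Similarity percentages like those above are typically derived from pairs of aligned sequences. The sketch below shows one common way to compute percent identity. The function name `percent_identity` is hypothetical, and the convention of skipping gapped columns is an assumption. Clustal X and other tools may define identity differently, so this will not necessarily reproduce the reported values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical positions between two aligned sequences.

    Assumption: columns where either sequence has a gap are skipped, so the
    denominator is the number of gap-free columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gapped column: not counted
        compared += 1
        if x.upper() == y.upper():
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```

Applying this function to every pair of aligned sequences gives a similarity matrix. A distance-based tree method such as neighbor joining starts from the corresponding distances.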

Gene Sequences Clustering for the Prediction of Functional Domain (기능 도메인 예측을 위한 유전자 서열 클러스터링)

  • Han Sang-Il;Lee Sung-Gun;Hou Bo-Kyeng;Byun Yoon-Sup;Hwang Kyu-Suk
    • Journal of Institute of Control, Robotics and Systems / v.12 no.10 / pp.1044-1049 / 2006
  • Multiple sequence alignment is a method for comparing two or more DNA or protein sequences. Most multiple sequence alignment tools rely on pairwise alignment and the Smith-Waterman algorithm to generate an alignment hierarchy. Consequently, in existing multiple alignment methods the runtime increases exponentially as the number of sequences increases. To remedy this problem, we adopted a parallel-processing suffix tree algorithm that can search for common subsequences in a single pass without pairwise alignment. Among the common subsequences found, cross-matching subsequences that trigger inexact matching may also be produced, so a cross-matching masking process is proposed in this paper. To identify the function of the clusters generated by suffix tree clustering, BLAST and CDD (Conserved Domain Database) searches were combined with the clustering tool. Our clustering and annotation tool consists of constructing the suffix tree, overlapping common subsequences, clustering gene sequences, and annotating gene clusters by BLAST and CDD search. The system was successfully evaluated with 36 gene sequences from the pentose phosphate pathway, producing 10 clusters, finding representative common subsequences, and finally identifying functional domains by searching the CDD.
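
The idea of finding shared subsequences across many sequences without pairwise alignment can be illustrated with a much simpler stand-in for the suffix tree: a dictionary index of fixed-length substrings (k-mers). This sketch is not the authors' algorithm. It finds only exact matches of one chosen length, and the name `shared_kmers` and its parameters are hypothetical.

```python
from collections import defaultdict


def shared_kmers(sequences, k, min_support=2):
    """Map each k-mer to the set of sequence indices that contain it.

    Only k-mers present in at least `min_support` different sequences are
    returned. A k-mer index is a deliberately simplified substitute for a
    generalized suffix tree, which would find shared substrings of any length.
    """
    index = defaultdict(set)
    for seq_id, seq in enumerate(sequences):
        # Slide a window of width k over the sequence.
        for start in range(len(seq) - k + 1):
            index[seq[start:start + k]].add(seq_id)
    return {kmer: ids for kmer, ids in index.items() if len(ids) >= min_support}
```

One pass over all sequences builds the index, so the cost grows with the total input length rather than with the number of sequence pairs. Sequences that share many k-mers are natural candidates for the same cluster.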

DNA Sequence Alignment Using a Graph-based Distributed System (그래프 기반 분산 시스템을 이용한 염기 서열 정렬)

  • Lee, Jun-Su;Ahn, Jae-Gyoon;Yeu, Yun-Ku;Roh, Hong-Chan;Park, Sang-Hyun
    • Proceedings of the Korea Information Processing Society Conference / 2013.05a / pp.894-897 / 2013
  • Sequence alignment is one of the most widely used tools in genomics. With recent advances in next-generation sequencing (NGS) technology, data output has grown greatly, and with it the need for high-throughput sequence alignment algorithms. The DNA sequence alignment algorithm proposed in this paper transforms sequence data into graph form and then performs the alignment using Trinity, Microsoft's graph-based in-memory distributed system. The algorithm successfully aligned simulated sequence data on the Trinity system and ran faster as the number of slaves increased, demonstrating its scalability.