• Title/Summary/Keyword: DNA sequences

Fast Matching Method for DNA Sequences (DNA 서열을 위한 빠른 매칭 기법)

  • Kim, Jin-Wook;Kim, Eun-Sang;Ahn, Yoong-Ki;Park, Kun-Soo
    • Journal of KIISE:Computer Systems and Theory / v.36 no.4 / pp.231-238 / 2009
  • DNA sequences are the fundamental information for each species and a comparison between DNA sequences of different species is an important task. Since DNA sequences are very long and there exist many species, not only fast matching but also efficient storage is an important factor for DNA sequences. Thus, a fast string matching method suitable for encoded DNA sequences is needed. In this paper, we present a fast string matching method for encoded DNA sequences which does not decode DNA sequences while matching. We use four-characters-to-one-byte encoding and combine a suffix approach and a multi-pattern matching approach. Experimental results show that our method is about 5 times faster than AGREP and the fastest among known algorithms.
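
The four-characters-to-one-byte encoding mentioned in the abstract can be illustrated with a minimal sketch. The 2-bit code assignment (A=0, C=1, G=2, T=3) and the function names `pack` and `find_aligned` are assumptions made for illustration; the abstract does not give them. The search below only handles byte-aligned occurrences and does not reproduce the paper's suffix and multi-pattern approach; it merely shows that packed data can be searched without decoding.

```python
# Hypothetical 2-bit code; the abstract does not state the actual assignment.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq):
    """Pack a DNA string into bytes, four bases per byte (first base in the high bits)."""
    if len(seq) % 4:
        raise ValueError("this sketch requires a length that is a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for ch in seq[i:i + 4]:
            byte = (byte << 2) | CODE[ch]
        out.append(byte)
    return bytes(out)


def find_aligned(text_packed, pattern):
    """Return base offsets of byte-aligned pattern occurrences, searching the packed bytes directly."""
    needle = pack(pattern)
    hits = []
    start = text_packed.find(needle)
    while start != -1:
        hits.append(start * 4)  # one byte holds four bases
        start = text_packed.find(needle, start + 1)
    return hits
```

An occurrence that starts at a base offset not divisible by 4 is missed by this sketch; handling the other three alignments is where a multi-pattern approach would come in.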

Effect of Escherichia coli plasmid DNA sequences on plasmid replication in yeast (효모에서 plasmid의 복제에 대장균 plasmid DNA가 미치는 영향에 관한 연구)

  • 김태국;최철용;노현모
    • Korean Journal of Microbiology / v.27 no.1 / pp.16-20 / 1989
  • The effect of E. coli plasmid DNA sequences contained in chimeric vectors on plasmid replication was investigated. We constructed YRp7- or 2 $\mu$m circle-based plasmids with and without E. coli plasmid DNA sequences. By examining their maintenance in yeast, we showed that plasmids without E. coli plasmid DNA sequences were more stable, had a higher copy number, and expressed a higher level of hepatitis B viral surface antigen as a foreign gene. This result suggests that E. coli plasmid DNA sequences within a chimeric plasmid somehow inhibit plasmid replication in yeast.

Compiling Multicopy Single-Stranded DNA Sequences from Bacterial Genome Sequences

  • Yoo, Wonseok;Lim, Dongbin;Kim, Sangsoo
    • Genomics & Informatics / v.14 no.1 / pp.29-33 / 2016
  • A retron is a bacterial retroelement that encodes an RNA gene and a reverse transcriptase (RT). The former, once transcribed, works as a template primer for reverse transcription by the latter. The resulting DNA is covalently linked to the upstream part of the RNA; this chimera is called multicopy single-stranded DNA (msDNA), which is extrachromosomal DNA found in many bacterial species. Based on the conserved features in the eight known msDNA sequences, we developed a detection method and applied it to scan National Center for Biotechnology Information (NCBI) RefSeq bacterial genome sequences. Among 16,844 bacterial sequences possessing a retron-type RT domain, we identified 48 unique types of msDNA. Currently, the biological role of msDNA is not well understood. Our work will be a useful tool in studying the distribution, evolution, and physiological role of msDNA.

A Revision of the Phylogeny of Helicotylenchus Steiner, 1945 (Tylenchida: Hoplolaimidae) as Inferred from Ribosomal and Mitochondrial DNA

  • Abraham Okki, Mwamula;Oh-Gyeong Kwon;Chanki Kwon;Yi Seul Kim;Young Ho Kim;Dong Woon Lee
    • The Plant Pathology Journal / v.40 no.2 / pp.171-191 / 2024
  • Identification of Helicotylenchus species is very challenging due to phenotypic plasticity and existence of cryptic species complexes. Recently, the use of rDNA barcodes has proven to be useful for identification of Helicotylenchus. Molecular markers are a quick diagnostic tool and are crucial for discriminating related species and resolving cryptic species complexes within this speciose genus. However, DNA barcoding is not an error-free approach. The public databases appear to be marred by incorrect sequences, arising from sequencing errors, mislabeling, and misidentifications. Herein, we provide a comprehensive analysis of the newly obtained, and published DNA sequences of Helicotylenchus, revealing the potential faults in the available DNA barcodes. A total of 97 sequences (25 nearly full-length 18S-rRNA, 12 partial 28S-rRNA, 16 partial internal transcribed spacer [ITS]-rRNA, and 44 partial cytochrome c oxidase subunit I [COI] gene sequences) were newly obtained in the present study. Phylogenetic relationships between species are given as inferred from the analyses of 103 sequences of 18S-rRNA, 469 sequences of 28S-rRNA, 183 sequences of ITS-rRNA, and 63 sequences of COI. Remarks on suggested corrections of published accessions in GenBank database are given. Additionally, COI gene sequences of H. dihystera, H. asiaticus and the contentious H. microlobus are provided herein for the first time. Similar to rDNA gene analyses, the COI sequences support the genetic distinctness and validity of H. microlobus. DNA barcodes from type material are needed for resolving the taxonomic status of the unresolved taxonomic groups within the genus.

2-D Graphical Representation for Characteristic Sequences of DNA and its Application

  • Li, Chun;Hu, Ji
    • BMB Reports / v.39 no.3 / pp.292-296 / 2006
  • DNA sequencing has produced an abundance of sequence data for various species. Hence, the characterization and comparison of sequences have become more important, yet they remain difficult tasks. In this paper, we first give a 2-D ladder-like graphical representation for the characteristic sequences of a DNA sequence, and then construct a 3-component vector whose components are the normalized ALE-indices extracted from the three resulting 2-D graphs via D/D matrices, to characterize the DNA sequence. An examination of the similarities and dissimilarities among the $\beta$-globin gene sequences of different species illustrates the utility of the approach.
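
As a rough illustration of what "characteristic sequences" may look like, the sketch below derives three binary sequences from one DNA string. It assumes the three commonly used base classifications (purine/pyrimidine, amino/keto, weak/strong hydrogen bonding); the abstract does not spell these out, so the class table and the name `characteristic_sequences` are assumptions. The ladder-like graphs and the ALE-index computation are not reproduced here.

```python
# Assumed base classifications; each entry lists the bases mapped to 1.
CLASSES = {
    "RY": set("AG"),  # purine (A, G) = 1, pyrimidine (C, T) = 0
    "MK": set("AC"),  # amino (A, C) = 1, keto (G, T) = 0
    "WS": set("AT"),  # weak H-bond (A, T) = 1, strong H-bond (G, C) = 0
}


def characteristic_sequences(seq):
    """Return one 0/1 list per classification for the given DNA string."""
    return {
        name: [1 if ch in ones else 0 for ch in seq]
        for name, ones in CLASSES.items()
    }
```

Each of the three binary sequences could then be drawn as its own 2-D graph, which matches the abstract's mention of three graphs and a 3-component vector.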

DNA Sequence Visualization with k-convex Hull (k-convex hull을 이용한 DNA 염기 배열의 가시화)

  • Kim, Min Ah;Lee, Eun Jeong;Cho, Hwan Gyu
    • Journal of the Korea Computer Graphics Society / v.2 no.2 / pp.61-68 / 1996
  • In this paper we propose a new visualization technique to characterize qualitative information in a large DNA sequence. Although a long DNA sequence carries a huge amount of information, it is not easy to obtain genetic information from it. We transform DNA sequences into polygons to compute their homology in the image domain rather than the text domain. Our program visualizes DNA sequences as colored random walk plots and simplifies them with k-convex hulls. A random walk plot represents a DNA sequence as a curve in a plane. A k-convex hull simplifies a random walk plot by removing some of its insignificant information. This technique gives a biologist insight for detecting and classifying DNA sequences with ease. Experiments with real genome data show that our approach gives good visual forms of long DNA sequences for homology analysis.
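
A random walk plot of the kind described above can be sketched as follows. The mapping from bases to unit steps (A up, T down, G right, C left) is a hypothetical choice for illustration; the abstract does not state which mapping the authors use, and the k-convex hull simplification is not shown.

```python
# Hypothetical base-to-step mapping; the abstract does not specify one.
STEP = {"A": (0, 1), "T": (0, -1), "G": (1, 0), "C": (-1, 0)}


def random_walk(seq):
    """Return the list of (x, y) points visited when walking the sequence from the origin."""
    x = y = 0
    path = [(0, 0)]
    for ch in seq:
        dx, dy = STEP[ch]
        x += dx
        y += dy
        path.append((x, y))
    return path
```

Plotting the returned points in order gives the curve in the plane; regions rich in one base show up as long drifts in one direction.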

Mitochondrial DNA Sequence Variability of Spirometra Species in Asian Countries

  • Jeon, Hyeong-Kyu;Eom, Keeseon S.
    • Parasites, Hosts and Diseases / v.57 no.5 / pp.481-487 / 2019
  • Mitochondrial DNA sequence variability of Spirometra erinaceieuropaei in GenBank was observed by reinvestigation of mitochondrial cox1 and cytb sequences. The DNA sequences analyzed in this study comprised complete DNA sequences of the cox1 (n=239) and cytb (n=213) genes. The 10 complete mitochondrial DNA sequences of Spirometra species were compared with those of Korea, China and Japan. The sequences were analyzed for nucleotide composition, conserved sites, variable sites, singleton sites and parsimony-informative sites. Phylogenetic analyses were done using neighbor joining, maximum parsimony, Bayesian inference and maximum likelihood on cox1 and cytb sequences of Spirometra species. The polymorphic sites identified 148 (cox1) and 83 (cytb) haplotypes within 239 and 213 isolates from 3 Asian countries. Phylogenetic tree topologies showed high confidence values for the 2 major branches of the 2 Spirometra species, S. erinaceieuropaei and S. decipiens, and for S. decipiens sub-clades including all sequences registered as S. erinaceieuropaei in the cox1 and cytb genes. These results indicate that mitochondrial haplotypes of S. erinaceieuropaei and S. decipiens are found in the 3 Asian countries.
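
The site categories listed in the abstract can be counted from an alignment with a short sketch. It assumes a gap-free alignment of equal-length sequences and the usual definitions: a conserved site has one nucleotide type, a variable site has at least two, a parsimony-informative site has at least two types that each occur at least twice, and a singleton site is a variable site that is not parsimony-informative. The function name `classify_sites` is hypothetical, and this is not the software the authors used.

```python
from collections import Counter


def classify_sites(alignment):
    """Count conserved, variable, singleton and parsimony-informative columns."""
    counts = {
        "conserved": 0,
        "variable": 0,
        "singleton": 0,
        "parsimony_informative": 0,
    }
    for column in zip(*alignment):  # iterate over alignment columns
        freq = Counter(column)
        if len(freq) == 1:
            counts["conserved"] += 1
            continue
        counts["variable"] += 1
        # Number of nucleotide types that occur at least twice in this column.
        repeated = sum(1 for n in freq.values() if n >= 2)
        if repeated >= 2:
            counts["parsimony_informative"] += 1
        else:
            counts["singleton"] += 1
    return counts
```

Under these definitions every variable site is either a singleton or parsimony-informative, so the last two counts always sum to the variable count.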

Conserved Regions in Mitochondrial Genome Sequences of Small Mammals in Korea

  • Kim, Hye Ri;Park, Yung Chul
    • Journal of Forest and Environmental Science / v.28 no.4 / pp.278-281 / 2012
  • Comparative sequence analyses were conducted on complete mtDNA sequences from four small mammal species in Korea and revealed the presence of 30 well-conserved sequences in various regions of the complete mtDNA sequences. The conserved sequences were found in 9 regions in protein-coding genes, 10 regions in tRNA genes, 10 regions in rRNA genes, one region in the replication origin and 2 regions in the D-loop. They could be used to design primers for amplifying complete mtDNA sequences of small mammals.

An assessment of the taxonomic reliability of DNA barcode sequences in publicly available databases

  • Jin, Soyeong;Kim, Kwang Young;Kim, Min-Seok;Park, Chungoo
    • ALGAE / v.35 no.3 / pp.293-301 / 2020
  • DNA barcoding has a wide range of applications, such as taxonomic studies that help elucidate cryptic species and phylogenetic relationships, and the analysis of environmental samples for biodiversity monitoring and conservation assessments of species. After the DNA barcode sequences are obtained, sequence similarity-based homology analysis is commonly used; that is, the obtained barcode sequences are compared to DNA barcode reference databases. This bioinformatic analysis necessarily implies that the overall quantity and quality of the reference databases must be stringently monitored so that they do not adversely affect the accuracy of species identification. With the development of next-generation sequencing techniques, a noticeably large number of DNA barcode sequences have been produced and stored in online databases, but their validity, accuracy, and reliability have not been extensively investigated. In this study, we investigated the amount and types of erroneous barcode sequences deposited in publicly accessible databases. Over 4.1 million sequences were investigated in three large-scale DNA barcode databases (NCBI GenBank, Barcode of Life Data System [BOLD], and Protist Ribosomal Reference database [PR2]) for four major DNA barcodes (cytochrome c oxidase subunit 1 [COI], internal transcribed spacer [ITS], ribulose bisphosphate carboxylase large chain [rbcL], and 18S ribosomal RNA [18S rRNA]); approximately 2% of the sequences were found to be erroneous, and their taxonomic distribution was uneven. Consequently, our findings provide compelling evidence of data quality problems, along with insufficient and unreliable annotation of taxonomic data, in DNA barcode databases. Therefore, we suggest that if ambiguous taxa are encountered during barcoding analysis, further validation with other DNA barcode loci or morphological characters should be mandated.

Grouping DNA sequences with similarity measure and application

  • Lee, Sanghyuk
    • Journal of the Korea Convergence Society / v.4 no.3 / pp.35-41 / 2013
  • The problem of grouping DNA sequences by similarity is studied. The similarity measure and the distance measure show complementary characteristics: a distance measure can be obtained by complementing a similarity measure, and vice versa. A similarity measure is derived and proved. The proposed similarity measure is applied to the problem of grouping 25 cockroach DNA sequences. Based on the computed DNA similarity, the 25 cockroaches are clustered into four groups, and the results are compared with those of the previous neighbor-joining method.
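
The complementary relation between similarity and distance can be sketched with a simple stand-in measure. The fraction of matching positions used here is an assumption chosen for illustration, not the similarity measure derived in the paper, and the greedy `group` function with its `threshold` parameter is likewise hypothetical.

```python
def similarity(a, b):
    """Fraction of positions at which two equal-length sequences agree (stand-in measure)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def distance(a, b):
    """Distance obtained by complementing the similarity: d = 1 - s."""
    return 1.0 - similarity(a, b)


def group(seqs, threshold):
    """Greedy grouping: join the first group whose first member is similar enough."""
    groups = []
    for s in seqs:
        for g in groups:
            if similarity(g[0], s) >= threshold:
                g.append(s)
                break
        else:
            groups.append([s])  # no similar group found, start a new one
    return groups
```

With a measure normalized to the range 0 to 1, either quantity determines the other, so grouping by high similarity and grouping by low distance give the same result.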