Compiling Multicopy Single-Stranded DNA Sequences from Bacterial Genome Sequences

  • Yoo, Wonseok (Department of Bioinformatics and Life Science, Soongsil University)
  • Lim, Dongbin (Department of Bioinformatics and Life Science, Soongsil University)
  • Kim, Sangsoo (Department of Bioinformatics and Life Science, Soongsil University)
  • Received : 2015.12.03
  • Accepted : 2016.02.23
  • Published : 2016.03.31


A retron is a bacterial retroelement that encodes an RNA gene and a reverse transcriptase (RT). Once transcribed, the RNA serves as both template and primer for reverse transcription by the RT. The resulting DNA is covalently linked to the upstream part of the RNA; this chimera, called multicopy single-stranded DNA (msDNA), is an extrachromosomal DNA found in many bacterial species. Based on the conserved features of the eight known msDNA sequences, we developed a detection method and applied it to scan National Center for Biotechnology Information (NCBI) RefSeq bacterial genome sequences. Among 16,844 bacterial sequences possessing a retron-type RT domain, we identified 48 unique types of msDNA. The biological role of msDNA is currently not well understood. Our work will be a useful tool for studying the distribution, evolution, and physiological role of msDNA.


Supported by : National Research Foundation of Korea


