O(N log N) ALGORITHM FOR FINDING PRIMARY TANDEM REPEATS IN A DNA GENOMIC SEQUENCE

  • Ma, Sang-Back (School of Electrical Engineering and Computer Science, Hanyang University)
  • Jun, Hyeong-Hwa (School of Electrical Engineering and Computer Science, Hanyang University)
  • Published: 2005.06.25

Abstract

The genomes of organisms are being published at an enormous rate. Genomes contain many intronic regions, and repeats constitute a substantial part of them. Repeats play a crucial role in DNA fingerprinting and in detecting certain genomic diseases, such as Huntington's disease, which is characterized by a high number of CAG repeats. They also provide important clues about evolutionary history. Repeats are of two types: tandem repeats and interspersed repeats. In this paper we address the problem of detecting primary tandem repeats, which are tandem repeats that are not contained in any other tandem repeat. We show that our algorithm takes O(n log n) time, where n is the length of the genome.
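
To make the definition of a primary tandem repeat concrete, the following is a minimal brute-force sketch in Python. It is not the O(n log n) algorithm of the paper, which the abstract does not describe; it runs in roughly quadratic time and is meant only to illustrate the definition. The function names, the `min_copies` parameter, and the reading of "contained" as interval containment of the repeated region are assumptions made for this illustration.

```python
def tandem_repeats(seq, min_copies=2):
    """Brute-force listing of periodic regions (illustrative only).

    Returns sorted (start, end, period) tuples, end exclusive, for each
    maximal region seq[start:end] that repeats with the given period and
    spans at least `min_copies` full copies of the unit.
    """
    n = len(seq)
    best = {}  # (start, end) -> smallest period found for that region
    for p in range(1, n // min_copies + 1):
        i = 0
        while i + p < n:
            if seq[i] != seq[i + p]:
                i += 1
                continue
            # Extend the run of positions k with seq[k] == seq[k + p].
            j = i
            while j + p < n and seq[j] == seq[j + p]:
                j += 1
            # seq[i:j + p] has period p; keep it if it is long enough.
            if (j - i + p) >= min_copies * p:
                best.setdefault((i, j + p), p)
            i = j + 1
    return [(s, e, p) for (s, e), p in sorted(best.items())]


def primary_tandem_repeats(seq, min_copies=2):
    """Keep only repeats whose region lies inside no other repeat's region.

    Assumption: "contained" means interval containment of the regions.
    """
    reps = tandem_repeats(seq, min_copies)
    return [
        r for r in reps
        if not any(
            o[0] <= r[0] and r[1] <= o[1] and (o[0], o[1]) != (r[0], r[1])
            for o in reps
        )
    ]


if __name__ == "__main__":
    # "ACACT" repeated twice: the two inner "ACAC" repeats are contained
    # in the period-5 repeat, so only the latter is reported as primary.
    print(tandem_repeats("ACACTACACT"))
    print(primary_tandem_repeats("ACACTACACT"))
```

Under these assumptions, the sequence ACACTACACT contains three tandem repeats (two copies of ACAC with unit AC, and the whole sequence with unit ACACT), but only the whole-sequence repeat is primary, because the other two lie inside it.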

Keywords