A Gene Clustering Method with Hierarchical Visualization of Alignment Pairs

  • Published : 2009.06.30

Abstract

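Research in bioinformatics has recently been shifting from studying genes one at a time toward studying the relationships among genes. One such line of work is the study of gene teams. A gene team is a set of genes that is conserved across several chromosomes, and it can be viewed as a set of genes conserved within a bounded region. Over the course of evolution, the positions and the kinds of genes within a gene team change. Much research has been devoted to finding such gene teams. This paper introduces a method that adapts hierarchical clustering, which is widely used in bioinformatics, to find meaningful regions within a whole genome pair and to find gene teams inside those regions. With this method, the map of related genes or similar regions between two structurally complex genomes can be simplified and displayed step by step.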

One of the main issues in comparative genomics is the study of chromosomal gene order in one or more related species. For this purpose, whole genome alignment is usually applied to find horizontal gene transfer, gene duplication, and gene loss between two related genomes. Visualization tools built on whole genome alignment are also known to be very useful for understanding genome organization and the evolutionary process. Many algorithms and visualization tools have already been proposed for finding "gene clusters" on genome alignments. However, because whole genomes are so large, previous visualization tools are not convenient for discovering the relationship between two genomes. In this paper, we propose AlignScope, a novel visualization system for whole genome alignment that is especially useful for finding gene clusters between two aligned genomes. AlignScope not only provides a simplified structure of the genome alignment at any level of simplification, but also helps us find gene clusters. In our experiments, we show the performance of AlignScope on several microbial genomes, including B. subtilis, B. halodurans, E. coli K12, and M. tuberculosis H37Rv, which have more than 5000 alignment pairs (matched DNA subsequences).
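
The idea of simplifying an alignment map level by level can be illustrated with a minimal sketch. This is not AlignScope's actual algorithm, which the abstract does not specify. It assumes that each alignment pair is reduced to a hypothetical (position on genome A, position on genome B) tuple, and that pairs are merged by plain single-linkage clustering whenever they lie within `max_gap` of each other on both genomes. The function names `cluster_pairs` and `hierarchy` and the gap thresholds are illustrative assumptions.

```python
from itertools import combinations


def cluster_pairs(pairs, max_gap):
    """Single-linkage grouping of alignment pairs.

    pairs   : list of (pos_a, pos_b) tuples (assumed representation)
    max_gap : two pairs are linked when they differ by at most
              max_gap on BOTH genomes (assumed linking rule)
    """
    parent = list(range(len(pairs)))  # union-find forest

    def find(i):
        # Follow parent links to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Quadratic scan over all pairs; fine for a sketch, not for scale.
    for i, j in combinations(range(len(pairs)), 2):
        (xa, ya), (xb, yb) = pairs[i], pairs[j]
        if abs(xa - xb) <= max_gap and abs(ya - yb) <= max_gap:
            parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(pairs):
        groups.setdefault(find(i), []).append(p)
    return sorted(sorted(g) for g in groups.values())


def hierarchy(pairs, gaps):
    """Cluster at several thresholds; larger gaps give coarser maps."""
    return {gap: cluster_pairs(pairs, gap) for gap in sorted(gaps)}


if __name__ == "__main__":
    # Made-up positions, for illustration only.
    demo = [(10, 12), (14, 15), (100, 300), (104, 296), (500, 40)]
    for gap, clusters in hierarchy(demo, [5, 500]).items():
        print(gap, clusters)
```

Each threshold corresponds to one level of simplification: a small gap keeps tight groups of neighboring pairs, which are candidates for gene clusters, while a large gap collapses the map into a few coarse regions.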
