3D Shape Descriptor with Interatomic Distance for Screening the Molecular Database

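  • Lee, Jae-Ho (Digital Product Laboratory, Dongguk University) ;
  • Park, Joon-Young (Department of Industrial and Systems Engineering, Dongguk University)
  • Published: 2009.12.31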

Abstract

In computational molecular analysis, 3D structural comparison plays a very important role in protein searching. As protein databases have grown rapidly in size, exhaustive search methods can no longer provide satisfactory performance. These methods represent a protein structure as a set of spheres converted from its set of atoms, and calculating the similarity between two sphere sets is very expensive. The filter-and-refine paradigm offers an efficient alternative for database search without compromising the accuracy of the answers. Recently, Ballester and Richards proposed a very fast algorithm based on inter-atomic distances. Because it uses the moments of the distribution of inter-atomic distances, which are rotationally invariant, it eliminates the structure alignment and orientation-fixing steps and searches faster than previous methods. In this paper, we propose a new 3D shape descriptor. It has the properties of a general shape distribution together with a property that is useful for screening molecular databases. We present experimental results that demonstrate the validity of our method.
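
The idea of a rotation-invariant, moment-based descriptor used as a filter can be illustrated with a minimal Python sketch. This is not the descriptor proposed in the paper, and it is not the full method of Ballester and Richards: as a simplifying assumption it uses a single reference point (the centroid) and three moments of the atom-to-centroid distances. The function names `distance_moments`, `similarity`, `rotate_z`, and `screen`, the inverse Manhattan-distance score, and the threshold value are all hypothetical choices made for illustration.

```python
import math

def centroid(atoms):
    """Arithmetic mean of the atom coordinates."""
    n = len(atoms)
    return tuple(sum(a[i] for a in atoms) / n for i in range(3))

def distance_moments(atoms):
    """Three moments of the atom-to-centroid distance distribution.

    Returns (mean, standard deviation, signed cube root of the third
    central moment). Distances to the centroid do not change under
    rotation or translation, so neither do these moments.
    """
    c = centroid(atoms)
    d = [math.dist(a, c) for a in atoms]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / n
    third = sum((x - mean) ** 3 for x in d) / n
    return (mean, math.sqrt(var), math.copysign(abs(third) ** (1.0 / 3.0), third))

def similarity(u, v):
    """Score in (0, 1]; 1 means the two descriptors are identical."""
    mean_abs_diff = sum(abs(a - b) for a, b in zip(u, v)) / len(u)
    return 1.0 / (1.0 + mean_abs_diff)

def rotate_z(atoms, angle):
    """Rotate every atom about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in atoms]

def screen(query, database, threshold):
    """Filter step: keep names whose descriptor is close to the query's."""
    q = distance_moments(query)
    return [name for name, atoms in database.items()
            if similarity(q, distance_moments(atoms)) >= threshold]

# Toy coordinates (hypothetical, not taken from any real structure).
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
database = {
    "rotated_copy": rotate_z(atoms, 0.7),
    "stretched": [(3 * x, 3 * y, 3 * z) for x, y, z in atoms],
}
hits = screen(atoms, database, threshold=0.99)
print(hits)
```

In a filter-and-refine pipeline, only the structures returned by such a cheap filter would be passed on to a more expensive comparison, so no alignment is needed during the first pass.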

Keywords

References

  1. Akbar, S., Küng, J. and Wagner, R., "Exploiting Geometrical Properties on Protein Similarity Search", In Proceedings of the 17th International Conference on Database and Expert Systems Applications (DEXA '06), pp. 228-234, 2006 https://doi.org/10.1109/DEXA.2006.56
  2. Ankerst, M., Kastenmüller, G., Kriegel, H.-P. and Seidl, T., "Nearest Neighbor Classification in 3D Protein Databases", In Proceedings of the 7th International Conference on Intelligent Systems for Molecular Biology, pp. 34-43, 1999
  3. Ankerst, M., Kastenmüller, G., Kriegel, H.-P. and Seidl, T., "3D Shape Histograms for Similarity Search and Classification in Spatial Databases", Lecture Notes in Computer Science, Vol. 1651, pp. 207-226, 1999 https://doi.org/10.1007/3-540-48482-5_14
  4. Aung, Z., Fu, W. and Tan, K.-L., "An Efficient Index-based Protein Structure Database Searching Method", In Proceedings of the 8th International Conference on Database Systems for Advanced Applications (DASFAA '03), pp. 311-318, 2003 https://doi.org/10.1109/DASFAA.2003.1192396
  5. Ballester, P. J. and Richards, W. G., "Ultrafast Shape Recognition to Search Compound Databases for Similar Molecular Shapes", Journal of Computational Chemistry, Vol. 28, pp. 1711-1723, 2007 https://doi.org/10.1002/jcc.20681
  6. Bemis, G. W. and Kuntz, I. D., "A Fast and Efficient Method for 2D and 3D Molecular Shape Description", Journal of Computer-Aided Molecular Design, Vol. 6, pp. 607-628, 1992 https://doi.org/10.1007/BF00126218
  7. Berman, H. M., et al., "The Protein Data Bank", Nucleic Acids Res., Vol. 28, pp. 235-242, 2000 https://doi.org/10.1093/nar/28.1.235
  8. Brenner, S. E., et al., "The ASTRAL Compendium for Sequence and Structure Analysis", Nucleic Acids Res., Vol. 28, pp. 254-256, 2000 https://doi.org/10.1093/nar/28.1.254
  9. Böhm, H.-J., Flohr, A. and Stahl, M., "Scaffold Hopping", Drug Discovery Today: Technologies, Vol. 1, pp. 217-224, 2004 https://doi.org/10.1016/j.ddtec.2004.10.009
  10. Good, A. C. and Richards, W. G., "Explicit Calculation of 3D Molecular Similarity", Perspectives in Drug Discovery and Design, Vol. 9, pp. 321-338, 1998 https://doi.org/10.1023/A:1027280526177
  11. Hall, P., "A Distribution is Completely Determined by Its Translated Moments", Probability Theory and Related Fields, Vol. 62, pp. 355-359, 1983 https://doi.org/10.1007/BF00535259
  12. Jenkins, J. L., Glick, M. and Davies, J. W., "A 3D Similarity Method for Scaffold Hopping from Known Drugs or Natural Ligands to New Chemotypes", Journal of Medicinal Chemistry, Vol. 47, pp. 6144-6159, 2004 https://doi.org/10.1021/jm049654z
  13. Krasnogor, N. and Pelta, D. A., "Measuring the Similarity of Protein Structures by Means of the Universal Similarity Metric", Bioinformatics, Vol. 20, pp. 1015-1021, 2004
  14. Mattner, L., "Completeness of Location Families, Translated Moments, and Uniqueness of Charges", Probability Theory and Related Fields, Vol. 62, pp. 137-149, 1985
  15. Nilakantan, R., Bauman, N. and Venkataraghavan, R., "New Method for Rapid Characterisation of Molecular Shapes: Applications in Drug Design", Journal of Chemical Information and Computer Sciences, Vol. 33, pp. 79-85, 1993 https://doi.org/10.1021/ci00011a012
  16. Rodgers, J. L. and Nicewander, W. A., "Thirteen Ways to Look at the Correlation Coefficient", The American Statistician, Vol. 42, No. 1, pp. 59-66, 1988 https://doi.org/10.2307/2685263
  17. Yeh, J.-S., et al., "A Web-based Three Dimensional Protein Retrieval System by Matching Visual Similarity", Bioinformatics Applications Note, Vol. 21, pp. 3056-3057, 2005 https://doi.org/10.1093/bioinformatics/bti458
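  18. Kim, D., Cho, Y. and Kim, D.-S., "Two Algorithms for Computing the Voronoi Diagram of 3D Spheres and Applications to Protein Structure Analysis" (in Korean), Transactions of the Society of CAD/CAM Engineers (한국CAD/CAM학회 논문집), Vol. 11, No. 2, pp. 97-106, 2006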