Detection of M:N corresponding class group pairs between two spatial datasets with agglomerative hierarchical clustering

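Korean title (translated): Searching for M:N corresponding class cluster pairs between heterogeneous spatial datasets using agglomerative hierarchical clustering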

  • Received : 2012.02.14
  • Accepted : 2012.04.14
  • Published : 2012.04.30

Abstract

In this paper, we propose a method for analyzing M:N corresponding relations in semantic matching, focusing on feature class matching. The similarity between each pair of classes is measured from the spatial objects that coexist in both classes, and corresponding classes are obtained by clustering on these pairwise similarities. We applied a graph embedding method that constructs a global configuration of the classes in a low-dimensional Euclidean space while preserving the pairwise similarities, so that the distance between two embedded classes reflects the overall degree of similarity along the edge paths connecting them in the graph, with more similar classes placed closer together. The clustering problem can therefore be solved by applying a general clustering algorithm to the embedded coordinates. We applied the proposed method to the polygon object layers of a topographic map and the land parcel categories of a cadastral map of the Suwon area, and validated the results by analyzing the F-measures of the detected class pairs. The method also detected some class pairs that would not be detected by analyzing nominal class names alone.
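The pipeline described above (pairwise class similarities, graph embedding, then agglomerative hierarchical clustering) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the embedding, so a normalized Laplacian eigenmap on the bipartite class graph is assumed here, and the function names `embed_classes` and `cluster_classes`, the similarity matrix `W`, and the average-linkage choice are all hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def embed_classes(W, dim=1):
    """Embed the classes of two datasets with a Laplacian eigenmap (assumed technique).

    W[i, j] is the similarity between class i of dataset A and class j of
    dataset B. Rows of the result are ordered A classes first, then B classes.
    """
    W = np.asarray(W, dtype=float)
    m, n = W.shape
    # Adjacency matrix of the bipartite graph linking A classes to B classes.
    A = np.zeros((m + n, m + n))
    A[:m, m:] = W
    A[m:, :m] = W.T
    deg = A.sum(axis=1)
    if np.any(deg <= 0):
        raise ValueError("every class needs at least one positive similarity")
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}.
    L_sym = np.eye(m + n) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues come back in ascending order
    # Skip the first (trivial) eigenvector; this assumes a connected graph.
    return d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]


def cluster_classes(W, n_clusters, dim=1):
    """Average-linkage agglomerative clustering on the embedded coordinates."""
    coords = embed_classes(W, dim)
    Z = linkage(coords, method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")


# Toy similarities (made up): A classes 0 and 1 overlap mostly with B classes
# 0 and 1, while A class 2 overlaps mostly with B class 2.
W = np.array([[0.90, 0.80, 0.01],
              [0.70, 0.90, 0.02],
              [0.01, 0.02, 0.90]])
labels = cluster_classes(W, n_clusters=2)
print("dataset A labels:", labels[:3], "dataset B labels:", labels[3:])
```

Classes from both datasets that receive the same label form one M:N corresponding group pair. Cutting the same linkage tree at different levels would give coarser or finer groupings, which is one way to read the hierarchical aspect of the method.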

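This study proposes a method for analyzing the M:N corresponding relations that arise in semantic matching, focusing on the detection of corresponding class group pairs between two spatial datasets. Class similarity is measured from the objects that the classes share, and M:N corresponding relations are detected by clustering classes with high similarity. The similarities between classes are represented as a graph model, and a graph embedding technique computes the embedded coordinates of the individual classes so that the distance between classes in the embedded space is inversely proportional to their local similarity obtained from class overlay analysis; clustering these coordinates then yields hierarchical corresponding group pairs. To evaluate the proposed method, it was applied to the digital topographic map and the continuous cadastral map of Suwon, Gyeonggi-do, detecting corresponding group pairs between the polygon object layers of the topographic map and the land categories of the cadastral parcels. The F-measures of the detected corresponding class pairs ranged from about 0.80 to 0.35, and a variety of corresponding relations that differ from the class names were obtained.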
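As an illustration of how an F-measure for one detected class group pair might be computed, the sketch below combines a precision and a recall derived from shared objects. The abstract does not define these quantities, so the definitions here are assumptions: precision is taken as the share of group A's objects (or area) that also falls in group B, and recall as the share of group B's that also falls in group A. The function name `f_measure` and its parameters are hypothetical.

```python
def f_measure(shared, total_a, total_b):
    """F-measure of a class group pair from object counts or areas.

    shared  : amount of objects (or area) common to both groups
    total_a : total amount in the group from dataset A
    total_b : total amount in the group from dataset B
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("group totals must be positive")
    precision = shared / total_a  # assumed definition
    recall = shared / total_b     # assumed definition
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Made-up numbers: 80 shared units out of 100 in each group.
score = f_measure(80, 100, 100)
print(round(score, 2))
```

Under these assumed definitions, a pair whose groups cover each other almost completely scores near 1, and a pair with little mutual coverage scores near 0.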
