Ontology Matching Method Based on Word Embedding and Structural Similarity

  • Hongzhou Duan (School of Computer Science and Engineering, Kyungpook National University)
  • Yuxiang Sun (Software Technology Research Center, Kyungpook National University)
  • Yongju Lee (School of Computer Science and Engineering, Kyungpook National University)
  • Received : 2023.07.10
  • Accepted : 2023.07.20
  • Published : 2023.09.30

Abstract

Within a given domain, experts may understand the domain knowledge differently or construct ontologies for different purposes, which leads to multiple distinct ontologies for the same domain. This phenomenon is called ontology heterogeneity. It creates difficulties for research fields that require cross-ontology operations, such as knowledge fusion and knowledge reasoning. In this paper, we propose a novel ontology matching model that combines word embedding with a concatenated continuous bag-of-words model. Our goal is to improve the word vectors and to distinguish semantic similarity from descriptive association. Moreover, we make full use of the textual and structural information available in the ontology and in external resources. We represent the ontology as a graph and use the SimRank algorithm to calculate structural similarity. Our approach employs a similarity queue to produce one-to-many matching results, which provide a wider range of insights for subsequent mining and analysis. Together, these steps enhance and refine the methodology used in ontology matching.
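To make the structural step more concrete, the following is a minimal sketch of iterative SimRank on a small directed graph, followed by a top-k selection that stands in for the similarity queue. It is an illustration under stated assumptions, not the authors' implementation: the toy graph, the decay factor of 0.8, the iteration count, and the helper names `simrank` and `top_k_matches` are all hypothetical, and the sketch scores node pairs within a single graph rather than across two ontologies.

```python
import heapq
from itertools import product


def simrank(in_neighbors, decay=0.8, iterations=10):
    """Iterative SimRank on a directed graph.

    in_neighbors maps every node to the list of nodes that point to it.
    decay and iterations are illustrative defaults, not values from the paper.
    """
    nodes = list(in_neighbors)
    # Start from the identity: a node is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            in_a, in_b = in_neighbors[a], in_neighbors[b]
            if not in_a or not in_b:
                # A node without in-neighbors has no structural evidence.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, then decay.
            total = sum(sim[(i, j)] for i in in_a for j in in_b)
            new_sim[(a, b)] = decay * total / (len(in_a) * len(in_b))
        sim = new_sim
    return sim


def top_k_matches(sim, source, candidates, k=2):
    """Return the k highest-scoring candidates for one source node.

    This is a stand-in for a similarity queue that yields one-to-many matches.
    """
    scored = [(sim[(source, c)], c) for c in candidates if c != source]
    return heapq.nlargest(k, scored)


# Hypothetical toy hierarchy: root -> a, root -> b, a -> c, b -> d.
toy_graph = {"root": [], "a": ["root"], "b": ["root"], "c": ["a"], "d": ["b"]}
scores = simrank(toy_graph)
matches = top_k_matches(scores, "c", ["a", "b", "d"], k=2)
print(matches)
```

In this toy graph, `c` and `d` score highest against each other because their parents `a` and `b` share the same parent. Keeping the k best candidates per node, rather than only the single best, is what allows one-to-many results.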

Keywords

Acknowledgement

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2016R1D1A1B02008553). This study was also supported by the BK21 FOUR project (AI-driven Convergence Software Education Research Program) funded by the Ministry of Education, School of Computer Science and Engineering, Kyungpook National University, Korea (4199990214394).
