Extraction of Relationships between Scientific Terms based on Composite Kernels

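  • Sung-Pil Choi (Information Technology Research Lab, Korea Institute of Science and Technology Information)
  • Yun-Soo Choi (Information Technology Research Lab, Korea Institute of Science and Technology Information)
  • Chang-Hoo Jeong (Information Technology Research Lab, Korea Institute of Science and Technology Information)
  • Sung-Hyon Myaeng (Department of Computer Science, Korea Advanced Institute of Science and Technology)
  • Published: 2009.12.15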

Abstract

In this paper, we extract binary relations between scientific terms using a composite kernel that combines a convolution parse tree kernel with a WordNet verb synset vector kernel, which captures the semantic relationship between two entities in a sentence. To evaluate the system, we used three domain-specific test collections. The experimental results show that our system outperforms existing approaches on all three collections. In particular, the 8% increase in F1 on KREC 2008 shows that the core context around the entities plays an important role in improving the overall performance of relation extraction.

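In this paper, we attempt to extract relations between technical terms in science and technology using a convolution parse tree kernel together with a WordNet synset vector kernel. The latter is built by conceptualizing the verbs and verb-equivalent phrases that best explain the relation between two entities appearing in a sentence. To evaluate the model, we used three test collections, and on each collection it outperformed existing approaches. In particular, in the experiment on the KREC 2008 collection, the composite kernel that applies the existing convolution parse tree kernel together with the verb synset vector shows a relatively large improvement (8% F1). This demonstrates that, in addition to the entity feature information widely used in relation extraction to raise performance, the context surrounding the entities (verbs and verb-equivalent phrases) is also very useful information.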
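The idea of a composite kernel can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two kernels are combined, so the weighted sum with `alpha`, the normalization, the decay value, the nested-tuple tree encoding, and the synset identifiers used as vector keys are all assumptions made for illustration. The tree part follows the subtree-counting recursion of Collins and Duffy's convolution kernel.

```python
import math

def internal_nodes(tree):
    """Yield every internal node of a parse tree encoded as nested tuples."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from internal_nodes(child)

def production(node):
    """Label of a node plus its children's labels (words are tagged as such)."""
    return (node[0],) + tuple(
        c[0] if isinstance(c, tuple) else ("word", c) for c in node[1:]
    )

def common_subtrees(n1, n2, decay):
    """Decay-weighted count of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple) and isinstance(c2, tuple):
            score *= 1.0 + common_subtrees(c1, c2, decay)
    return score

def tree_kernel(t1, t2, decay=0.4):
    """Convolution parse tree kernel: sum over all pairs of internal nodes."""
    return sum(
        common_subtrees(n1, n2, decay)
        for n1 in internal_nodes(t1)
        for n2 in internal_nodes(t2)
    )

def synset_kernel(v1, v2):
    """Dot product of two sparse synset vectors (dict: synset id -> weight)."""
    return sum(w * v2.get(s, 0.0) for s, w in v1.items())

def normalized(kernel, x, y):
    """Scale a kernel so that every instance has self-similarity 1."""
    denom = math.sqrt(kernel(x, x) * kernel(y, y))
    return kernel(x, y) / denom if denom else 0.0

def composite_kernel(x, y, alpha=0.5, decay=0.4):
    """Assumed combination: weighted sum of the two normalized kernels."""
    (tree_x, vec_x), (tree_y, vec_y) = x, y
    k_tree = normalized(lambda a, b: tree_kernel(a, b, decay), tree_x, tree_y)
    k_vec = normalized(synset_kernel, vec_x, vec_y)
    return alpha * k_tree + (1.0 - alpha) * k_vec

# Hypothetical instances: (parse tree, verb synset vector). The synset ids
# are illustrative keys only, not looked up in WordNet.
a = (("S", ("NP", ("NN", "protein")),
           ("VP", ("VB", "activates"), ("NP", ("NN", "gene")))),
     {"activate.v.01": 1.0, "trigger.v.01": 0.5})
b = (("S", ("NP", ("NN", "protein")),
           ("VP", ("VB", "induces"), ("NP", ("NN", "gene")))),
     {"induce.v.02": 1.0, "trigger.v.01": 0.5})

print(composite_kernel(a, b))
```

A weighted sum of two valid kernels is itself a valid kernel, so a function like `composite_kernel` could in principle be supplied to an SVM as a precomputed Gram matrix. The value of `alpha` would need to be tuned on held-out data.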
