• Title/Summary/Keyword: entity link model

Context-aware entity link framework using wikidata (wikidata를 이용하는 상황 인지 엔티티 링크 프레임워크)

  • Jang, SeoYoon; Park, Jong-Hyun
    • Proceedings of the Korean Society of Computer Information Conference / 2020.07a / pp.587-589 / 2020
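  • Taking users' interests into account can improve the quality of context-aware services. Existing services that consider user interests have relied on knowledge bases (KBs), but a newer approach, entity linking using Wikidata, is now also being actively studied. Entity linking based on Wikidata is more lightweight than existing KB-based methods and makes it easier to change and supplement the data. This paper therefore proposes a framework that can provide context-aware services using an entity link model built on Wikidata.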

Entity Matching Method Using Semantic Similarity and Graph Convolutional Network Techniques (의미적 유사성과 그래프 컨볼루션 네트워크 기법을 활용한 엔티티 매칭 방법)

  • Duan, Hongzhou; Lee, Yongju
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.5 / pp.801-808 / 2022
  • Research on embedding knowledge from large-scale Linked Data and applying neural network models to entity matching is relatively scarce. The most fundamental problem is that different labels lead to lexical heterogeneity. In this paper, we propose an extended GCN (Graph Convolutional Network) model that incorporates a re-align structure to address this lexical heterogeneity problem. The proposed model improved performance by 53% and 40% over the existing embedding-based MTransE and BootEA models, respectively, and by 5.1% over the GCN-based RDGCN model.

Probabilistic based Web Contents Mining (확률 기반 웹 콘텐츠 마이닝)

  • Yun, Bo-Hyun; Cho, Kwang-Moon
    • Proceedings of the Korea Contents Association Conference / 2006.11a / pp.16-20 / 2006
  • In Web contents mining, it is important to recognize unlabeled entities and to integrate sub-linked information with the extracted results. This paper presents a probability-based method that recognizes unlabeled entities using a Bayesian model. We also propose a method that uses the information on sub-linked web pages and integrates the extracted results. The experimental results show that the probability-based entity recognition and information integration achieve the highest precision.

Study on the Improvement of Extraction Performance for Domain Knowledge based Wrapper Generation (도메인 지식 기반 랩퍼 생성의 추출 성능 향상에 관한 연구)

  • Jeong Chang-Hoo; Choi Yun-Soo; Seo Jeong-Hyeon; Yoon Hwa-Mook
    • Journal of Internet Computing and Services / v.7 no.4 / pp.67-77 / 2006
  • Wrappers play an important role in extracting specified information from various sources. The wrapper rules by which information is extracted are often created from domain-specific knowledge. Domain-specific knowledge helps in recognizing the meaning of the text that represents various entities and values, and in detecting their formats. However, such domain knowledge becomes powerless when value-representing data are not labeled with appropriate textual descriptions, or when there is nothing but a hyperlink where certain text labels or values are expected. To alleviate these problems, we propose a probabilistic method for recognizing the entity type, i.e., generating wrapper rules, when no label is associated with the value-representing text. In addition, we have devised a method for using the information reachable by following hyperlinks when textual data are not immediately available on the target web page. Our experimental work shows that the proposed methods help increase the precision of the resulting wrapper, particularly when extracting the title information, the most important entity on a web page. The proposed methods can be useful in building a more efficient and accurate information extraction system for various sources of information without user intervention.
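
The abstract above does not say which probabilistic model, features, or training data the authors use. As a rough illustration of the general idea (guessing the entity type of a value that carries no label), the sketch below assumes a naive Bayes classifier with Laplace smoothing over a few coarse text-shape features. The feature set, the `NaiveBayesTypeGuesser` class, and the toy examples are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict


def features(text):
    """Map a raw value string to coarse shape features (illustrative choice)."""
    return [
        "has_digit" if re.search(r"\d", text) else "no_digit",
        "has_currency" if re.search(r"[$€₩]", text) else "no_currency",
        # A digit, then '-', '/' or '.', then a digit (dates, decimal numbers).
        "has_num_sep" if re.search(r"\d[-/.]\d", text) else "no_num_sep",
        "long" if len(text.split()) >= 4 else "short",
    ]


class NaiveBayesTypeGuesser:
    """Hypothetical naive Bayes model: P(type | text) ∝ P(type) * Π P(feature | type)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant
        self.type_counts = Counter()             # number of examples per type
        self.feat_counts = defaultdict(Counter)  # feature counts per type
        self.vocab = set()                       # all features seen in training

    def fit(self, examples):
        for text, entity_type in examples:
            self.type_counts[entity_type] += 1
            for f in features(text):
                self.feat_counts[entity_type][f] += 1
                self.vocab.add(f)

    def predict(self, text):
        total = sum(self.type_counts.values())
        best_type, best_score = None, -math.inf
        for t, n in self.type_counts.items():
            score = math.log(n / total)  # log prior
            denom = sum(self.feat_counts[t].values()) + self.alpha * len(self.vocab)
            for f in features(text):
                score += math.log((self.feat_counts[t][f] + self.alpha) / denom)
            if score > best_score:
                best_type, best_score = t, score
        return best_type


# Toy, made-up training values; a real wrapper would learn from labeled pages.
TRAIN = [
    ("$19.99", "price"), ("₩25,000", "price"), ("$5", "price"),
    ("2006-11-03", "date"), ("2020/07/15", "date"), ("12.05.2006", "date"),
    ("Probabilistic Web Contents Mining Methods", "title"),
    ("A Study on Wrapper Generation", "title"),
    ("Entity Matching with Graph Networks", "title"),
]

guesser = NaiveBayesTypeGuesser()
guesser.fit(TRAIN)
print(guesser.predict("$42.50"))
print(guesser.predict("2022-10-31"))
print(guesser.predict("Wrapper Rules for Information Extraction"))
```

In this sketch, the predicted type stands in for the missing label, so a wrapper rule could be attached to the value even though the page supplies no textual description for it.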