• Title/Summary/Keyword: Hypernym Extraction


A Study of the Automatic Extraction of Hypernyms and Hyponyms from the Corpus (코퍼스를 이용한 상하위어 추출 연구)

  • Pang, Chan-Seong; Lee, Hae-Yun
    • Korean Journal of Cognitive Science / v.19 no.2 / pp.143-161 / 2008
  • The goal of this paper is to extract hyponymy relations between words from a corpus. Adopting the basic algorithm of Hearst (1992), we propose a pattern-based method for extracting semantic relations from the corpus. To this end, we compiled a list of hypernym-hyponym pairs from the Sejong Electronic Dictionary and supplemented it with the superordinate-subordinate terms of CoreNet. We then extracted all the sentences in the corpus that contain a hypernym-hyponym pair from the list. From these extracted sentences, we collected those containing meaningful constructions that occur systematically in the corpus. As a result, we obtained 21 generalized patterns. Using a Perl program, we collected the sentences matching each of the 21 patterns; 57% of these sentences turned out to express a hyponymy relation. The proposed method is simpler and more advanced than that of Cederberg and Widdows (2003), in that the use of a wordnet or an electronic dictionary is generally considered efficient for information retrieval. The patterns extracted by this method are helpful when looking for appropriate documents during information retrieval, and they can be used to expand concept networks such as ontologies or thesauri. However, the word order of Korean is relatively free, so it is difficult to capture the various expressions of a fixed pattern. In the future, we should investigate semantic relations beyond hyponymy, so that we can extract a wider variety of patterns from the corpus.

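The pattern-based extraction described in this abstract can be illustrated with a minimal sketch. It assumes a single English Hearst-style pattern ("X such as A, B and C") and a hypothetical helper named `extract_pairs`. The paper's own 21 Korean patterns and its Perl program are not given in the abstract, so this is not the authors' code.

```python
import re

# Separators allowed between listed hyponyms. The longer alternatives come
# first so that ", and " is not split into ", " followed by the word "and".
SEP = r"(?:, and |, or |, | and | or )"

# Illustrative English pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# This is an assumption for the sketch, not one of the paper's 21 patterns.
SUCH_AS = re.compile(rf"(\w+),? such as (\w+(?:{SEP}\w+)*)")


def extract_pairs(sentence):
    """Return (hyponym, hypernym) pairs matched by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group(1)
        for hyponym in re.split(SEP, match.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs


print(extract_pairs("He plays instruments such as guitars, pianos and drums."))
# [('guitars', 'instruments'), ('pianos', 'instruments'), ('drums', 'instruments')]
```

A surface pattern like this matches single-word noun phrases only and will also fire on sentences that do not express hyponymy, which is consistent with the abstract's report that only part of the matched sentences carry the relation.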

Incremental Enrichment of Ontologies through Feature-based Pattern Variations (자질별 관계 패턴의 다변화를 통한 온톨로지 확장)

  • Lee, Sheen-Mok; Chang, Du-Seong; Shin, Ji-Ae
    • The KIPS Transactions: Part B / v.15B no.4 / pp.365-374 / 2008
  • In this paper, we propose a model that enriches an ontology by incrementally extending its relations through variations of patterns. To generalize the initial patterns, combinations of features are considered as candidate patterns. The candidate patterns are used to extract relations from Wikipedia, which are then sorted by reliability based on corpus frequency. The selected patterns are used to extract relations, and the extracted relations are in turn used to extend the patterns of the relation. By varying patterns during the incremental enrichment process, the range of pattern selection is broadened and refined, which can increase the coverage and accuracy of the extracted relations. In experiments with single-feature pattern models, we observe that the lexical, headword, and hypernym features provide reliable information, while POS and syntactic features provide general information that is useful for enriching relations. Based on observations about which feature types suit each syntactic unit type, we propose, as ongoing work, a pattern model based on the composition of features.
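The alternation between patterns and relations described in this abstract can be sketched as a small bootstrapping loop. The sketch below is a simplification under stated assumptions: a "pattern" is just the literal text between a hyponym and its hypernym (the paper instead combines lexical, headword, hypernym, POS, and syntactic features), and a plain frequency threshold `min_support` stands in for the paper's reliability ranking. All function names and the toy corpus are hypothetical.

```python
import re
from collections import Counter


def extract_relations(infixes, sentences):
    """Find (hyponym, hypernym) pairs joined by any known infix string."""
    relations = set()
    for infix in infixes:
        regex = re.compile(r"\b(\w+)" + re.escape(infix) + r"(\w+)\b")
        for sentence in sentences:
            relations.update(regex.findall(sentence))
    return relations


def induce_infixes(relations, sentences, min_support=2):
    """Collect the text between known pairs; keep infixes seen often enough."""
    counts = Counter()
    for hypo, hyper in relations:
        regex = re.compile(
            r"\b" + re.escape(hypo) + r"(\W.*?\W)" + re.escape(hyper) + r"\b"
        )
        for sentence in sentences:
            counts.update(regex.findall(sentence))
    return {infix for infix, n in counts.items() if n >= min_support}


def bootstrap(seed_infixes, sentences, rounds=2, min_support=2):
    """Alternate between extracting relations and inducing new patterns."""
    infixes, relations = set(seed_infixes), set()
    for _ in range(rounds):
        relations |= extract_relations(infixes, sentences)
        infixes |= induce_infixes(relations, sentences, min_support)
    return infixes, relations


corpus = [
    "the sparrow is a kind of bird",
    "the salmon is a kind of fish",
    "the sparrow is a type of bird",
    "the salmon is a type of fish",
    "the oak is a type of tree",
]
patterns, pairs = bootstrap({" is a kind of "}, corpus)
```

On this toy corpus the seed pattern yields two pairs in the first round, those pairs surface the new infix " is a type of ", and the second round uses it to find the additional pair (oak, tree).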

Learning Rules for Identifying Hypernyms in Machine Readable Dictionaries (기계가독형사전에서 상위어 판별을 위한 규칙 학습)

  • Choi Seon-Hwa; Park Hyuk-Ro
    • The KIPS Transactions: Part B / v.13B no.2 s.105 / pp.171-178 / 2006
  • Most approaches to extracting the hypernyms of a noun from its definitions in a machine-readable dictionary (MRD) rely on lexical patterns compiled by human experts. Not only is compiling these lexical patterns costly, but it is also very difficult for human experts to compile a set of lexical patterns with broad coverage, because natural languages offer many different expressions for the same concept. To alleviate these problems, this paper proposes a new method for extracting the hypernyms of a noun from its definitions in an MRD. In the proposed approach, we use only syntactic (part-of-speech) patterns instead of lexical patterns to identify hypernyms, which reduces the number of patterns while keeping their coverage broad. Our experiment shows that the classification accuracy of the proposed method is 92.37%, which is significantly better than that of previous approaches.
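The idea of identifying a hypernym from part-of-speech patterns alone can be sketched as follows. This is only an illustration under assumptions: the two rules are hand-written here (the paper learns its rules, and they are not listed in the abstract), the Penn Treebank-style tags are assigned by hand rather than by a tagger, and `find_hypernym` is a hypothetical name.

```python
# Each rule is (tag sequence expected at the start of a definition,
# index of the token that fills the hypernym slot).
# Hand-written for illustration; the paper learns such rules from data.
RULES = [
    (("DT", "JJ", "NN"), 2),  # e.g. "a small bird ..."
    (("DT", "NN"), 1),        # e.g. "a fish ..."
]


def find_hypernym(tagged_definition, rules=RULES):
    """Return the word in the hypernym slot of the first matching rule, or None.

    tagged_definition is a list of (word, pos_tag) tuples.
    """
    tags = tuple(tag for _, tag in tagged_definition)
    for pattern, slot in rules:
        if tags[:len(pattern)] == pattern:
            return tagged_definition[slot][0]
    return None


# Hand-tagged definition of "sparrow": "a small bird that eats seeds"
sparrow = [("a", "DT"), ("small", "JJ"), ("bird", "NN"),
           ("that", "WDT"), ("eats", "VBZ"), ("seeds", "NNS")]
print(find_hypernym(sparrow))  # bird
```

Because the rules refer only to tags, the same two rules cover any definition with that shape regardless of its wording, which is the coverage argument the abstract makes for syntactic over lexical patterns.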