• Title/Summary/Keyword: Lexical semantic network

The Method of the Evaluation of Verbal Lexical-Semantic Network Using the Automatic Word Clustering System (단어클러스터링 시스템을 이용한 어휘의미망의 활용평가 방안)

  • Kim, Hae-Gyung; Song, Mi-Young
    • Korean Journal of Oriental Medicine / v.12 no.3 s.18 / pp.1-15 / 2006
  • In recent years, there has been much interest in lexical semantic networks. However, it remains difficult to evaluate their effectiveness and correctness and to devise methods for applying them to various problem domains. To offer basic ideas on how to evaluate and utilize lexical semantic networks, we developed two automatic word clustering systems, called system A and system B. Both systems were trained on 68,455,856 words. We compared the clustering results of system A with those of system B, which is extended with the lexical-semantic network: its feature vectors are reconstructed using elements of the lexical-semantic network of 3,656 '-ha' verbs. The target data is the 'multilingual Word Net-CoreNet'. System B achieved an accuracy of 46.6%, higher than system A's 45.3%.

The Extraction of Head words in Definition for Construction of a Semi-automatic Lexical-semantic Network of Verbs (동사 어휘의미망의 반자동 구축을 위한 사전정의문의 중심어 추출)

  • Kim, Hae-Gyung; Yoon, Ae-Sun
    • Language and Information / v.10 no.1 / pp.47-69 / 2006
  • Recently, there has been a surge of interest in the construction and utilization of a Korean thesaurus. This paper presents a semi-automatic method for generating a lexical-semantic network of Korean '-ha' verbs through an analysis of the lexical definitions of these verbs. First, using several tools that filter and coordinate lexical data, pairs consisting of a word and its definition were prepared for the subsequent step. While inspecting the various definitions of each verb, we extracted and coordinated the head words from the sentences that constitute each definition. These head words are taken to be the main conceptual words that represent the sense of the verb. Using these head words and related information, this paper shows that a thesaurus can be created semi-automatically without difficulty.

The Method of Using the Automatic Word Clustering System for the Evaluation of Verbal Lexical-Semantic Network (동사 어휘의미망 평가를 위한 단어클러스터링 시스템의 활용 방안)

  • Kim, Hae-Gyung; Yoon, Ae-Sun
    • Journal of the Korean Society for Library and Information Science / v.40 no.3 / pp.175-190 / 2006
  • In recent years, there has been much interest in lexical semantic networks. However, it remains difficult to evaluate their effectiveness and correctness and to devise methods for applying them to various problem domains. To offer basic ideas on how to evaluate and utilize lexical semantic networks, we developed two automatic word clustering systems, called system A and system B. Both systems were trained on 68,455,856 words. We compared the clustering results of system A with those of system B, which is extended with the lexical-semantic network: its feature vectors are reconstructed using elements of the lexical-semantic network of 3,656 '-ha' verbs. The target data is the 'multilingual Word Net-CoreNet'. System B achieved an accuracy of 46.6%, higher than system A's 45.3%.
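
The abstract does not say how the feature vectors are reconstructed, so the following is only a minimal sketch of the general idea: adding lexical-network concepts as extra features so that verbs sharing a concept move closer together. The toy co-occurrence counts, the `network` mapping, the `net:` feature prefix, and the `weight` parameter are all assumptions made for illustration, not details from the paper.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse feature vectors (dict-like)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def extend_with_network(features, verb, network, weight=1.0):
    """Return a copy of `features` with one extra feature per network concept."""
    extended = Counter(features)
    for concept in network.get(verb, ()):
        extended["net:" + concept] += weight
    return extended


# Hypothetical corpus co-occurrence counts (the "system A" style features).
cooc = {
    "study": Counter({"book": 3, "school": 2}),
    "learn": Counter({"skill": 3, "teacher": 2}),
}
# Hypothetical lexical-network entries: verb -> concepts it is linked to.
network = {"study": ["acquire_knowledge"], "learn": ["acquire_knowledge"]}

sim_a = cosine(cooc["study"], cooc["learn"])
extended = {v: extend_with_network(cooc[v], v, network) for v in cooc}
sim_b = cosine(extended["study"], extended["learn"])
print(sim_a, sim_b)
```

In this toy case the two verbs share no corpus features, so only the extended vectors give them a nonzero similarity; a clustering step run on the extended vectors would therefore be more likely to group them.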

Automatic Construction of Syntactic Relation in Lexical Network(U-WIN) (어휘망(U-WIN)의 구문관계 자동구축)

  • Im, Ji-Hui; Choe, Ho-Seop; Ock, Cheol-Young
    • Journal of KIISE:Software and Applications / v.35 no.10 / pp.627-635 / 2008
  • This study explores an extended form of lexical network by presenting U-WIN, which applies lexical relations that include not only semantic relations but also conceptual, morphological, and syntactic relations, unlike existing lexical networks, which have centered on linking structures with semantic relations. It introduces a new methodology for constructing syntactic relations automatically. First, we extract nouns that are likely to be related to a verb, based on the verb's sentence type. However, because an extracted noun may have many meanings, its meaning must be determined. We propose determining the noun's meaning using example matching rules, syntactic patterns, semantic similarity, and frequency information. In addition, the syntactic patterns are expanded using nouns that occur with high frequency in corpora.

Cross-Enrichment of the Heterogenous Ontologies Through Mapping Their Conceptual Structures: the Case of Sejong Semantic Classes and KorLexNoun 1.5 (이종 개념체계의 상호보완방안 연구 - 세종의미부류와 KorLexNoun 1.5 의 사상을 중심으로)

  • Bae, Sun-Mee; Yoon, Ae-Sun
    • Language and Information / v.14 no.1 / pp.165-196 / 2010
  • The primary goal of this paper is to propose methods for enriching two heterogeneous ontologies: Sejong Semantic Classes (SJSC) and KorLexNoun 1.5 (KLN). To this end, the study presents the pros and cons of the two ontologies and analyzes the error patterns found during fine-grained manual mapping between them. The error patterns fall into four types: (1) structural defects in node branching, (2) errors in assigning semantic classes, (3) deficiencies in the linguistic information provided, and (4) a lack of lexical units representing specific concepts. For each error pattern, we propose a solution: correcting the node-branching defects and the semantic class assignments, supplementing the missing linguistic information, and increasing the number of lexical units allotted to their corresponding concepts. With these results, both ontologies can be enriched by correcting their defects and errors, which will make them more practical for syntactic and semantic analysis.

Korean Compound Noun Decomposition and Semantic Tagging System using User-Word Intelligent Network (U-WIN을 이용한 한국어 복합명사 분해 및 의미태깅 시스템)

  • Lee, Yong-Hoon; Ock, Cheol-Young; Lee, Eung-Bong
    • The KIPS Transactions:PartB / v.19B no.1 / pp.63-76 / 2012
  • We propose a Korean compound noun semantic tagging system that uses statistical compound noun decomposition and semantic relation information extracted from a lexical semantic network (U-WIN) and dictionary definitions. The system consists of three phases: compound noun decomposition, semantic constraint, and semantic tagging. In the decomposition phase, the best candidates are selected using noun location frequencies extracted from the Sejong corpus; nouns are then re-decomposed for the semantic constraint phase, and foreign nouns are restored. The semantic constraint phase finds possible semantic combinations using origin information from the dictionary and a Naive Bayes classifier, in order to reduce computation time and increase the accuracy of semantic tagging. The semantic tagging phase calculates the semantic similarity between the decomposed nouns and decides the semantic tags. For the experiments, we constructed a data set of 40,717 semantically tagged compound nouns of more than 3 characters from the Standard Korean Language Dictionary. In the experiments, the accuracy of compound noun decomposition was 99.26% and the accuracy of semantic tagging was 95.38%.
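
As a rough illustration of the decomposition phase only, the sketch below enumerates dictionary-based splits of a compound and ranks them by how often each noun occurs in its position (front, middle, back). The `POSITION_FREQ` counts, the three-slot scheme, and the product scoring are assumptions made for this example; the abstract does not give the paper's actual scoring function, and the re-decomposition and foreign-noun restoration steps are not modeled.

```python
# Hypothetical counts of how often each noun appears at the front, middle,
# or back of a compound noun (stand-in for corpus-derived location frequencies).
POSITION_FREQ = {
    "정보": {"front": 50, "middle": 10, "back": 20},
    "검색": {"front": 20, "middle": 30, "back": 25},
    "시스템": {"front": 5, "middle": 5, "back": 60},
    "정보검색": {"front": 1, "middle": 0, "back": 9},
}


def position_prob(noun, slot):
    """Relative frequency of `noun` occurring in `slot` within compounds."""
    counts = POSITION_FREQ[noun]
    total = sum(counts.values())
    return counts[slot] / total if total else 0.0


def segmentations(word):
    """Yield every split of `word` into nouns listed in POSITION_FREQ."""
    if not word:
        yield []
        return
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in POSITION_FREQ:
            for rest in segmentations(word[i:]):
                yield [head] + rest


def score(parts):
    """Product of the positional probabilities of the parts."""
    result = 1.0
    last = len(parts) - 1
    for idx, noun in enumerate(parts):
        slot = "front" if idx == 0 else "back" if idx == last else "middle"
        result *= position_prob(noun, slot)
    return result


def decompose(word):
    """Return the best-scoring split into two or more nouns, else [word]."""
    candidates = [p for p in segmentations(word) if len(p) >= 2]
    return max(candidates, key=score) if candidates else [word]


print(decompose("정보검색시스템"))
```

With these toy counts, the three-way split outscores the two-way split because "정보검색" rarely occurs at the front of a compound. A word with no known parts is returned unchanged.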

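An Analysis of the Internet Buzzword Pattern "X+人" (X+ren): Starting from "打工人", "尾款人", "工具人", and Others (网络流行语"X+人"探析 - 从"打工人", "尾款人", "工具人"等谈起)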

  • Yu, Cheol
    • 중국학논총 / no.71 / pp.41-59 / 2021
  • With social, economic, and technological progress, network media technology has developed rapidly, China has entered the network information age, and Internet buzzwords have emerged that reflect the interaction between language and society. Buzzwords of the form "X+ren" indirectly show the social psychology and value orientation of modern people through their distinctive structural characteristics, semantic connotations, and cultural background. On this basis, we conducted a multi-angle investigation of "X+ren" buzzwords. This paper first analyzes the structural types and syntactic functions of the "X+ren" lexical pattern, then analyzes its semantics, and finally investigates the causes and influence of its popularity. We conclude that the "X+ren" pattern will continue to grow and to attract the attention of the academic community.

Disambiguation of Homograph Suffixes using Lexical Semantic Network(U-WIN) (어휘의미망(U-WIN)을 이용한 동형이의어 접미사의 의미 중의성 해소)

  • Bae, Young-Jun; Ock, Cheol-Young
    • KIPS Transactions on Software and Data Engineering / v.1 no.1 / pp.31-42 / 2012
  • To process the suffix-derived nouns of Korean, most Korean processing systems register them in a dictionary. However, this approach is limited because suffixes are highly productive. It is therefore necessary to analyze unregistered suffix-derived nouns semantically. In this paper, we propose a method for disambiguating homograph suffixes using the Korean lexical semantic network (U-WIN), for the purpose of semantic analysis of suffix-derived nouns. The experiments used 33,104 suffix-derived nouns containing homograph suffixes from the morphologically and semantically tagged Sejong Corpus. We first semantically tagged the homograph suffixes, extracted the roots of the suffix-derived nouns, and mapped the roots to nodes in U-WIN. We then assigned distance weights to the U-WIN nodes that can combine with each homograph suffix and used these weights to disambiguate the suffixes. Experiments on the 35 homograph suffixes that occur in the Sejong Corpus, out of the 49 homograph suffixes in a Korean dictionary, yielded an accuracy of 91.01%.
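
The distance-weight idea can be sketched roughly as follows, assuming the network is a simple hypernym tree and that each suffix sense is associated with a few nodes it typically combines with. The `PARENT` tree, the `SENSE_ANCHORS` table, the English node labels, and the `1 / (1 + distance)` weighting are all hypothetical choices for this example; the abstract does not specify how U-WIN distances are turned into weights.

```python
# Hypothetical hypernym tree: child node -> parent node.
PARENT = {
    "novel": "literature",
    "literature": "art",
    "music": "art",
    "art": "activity",
    "wholesale": "trade",
    "trade": "commerce",
    "commerce": "activity",
}

# Hypothetical nodes that each sense of one homograph suffix combines with.
SENSE_ANCHORS = {
    "-ga (person)": ["art"],
    "-ga (price)": ["commerce"],
}


def ancestors(node):
    """Path from `node` up to the root, starting with `node` itself."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def distance(a, b):
    """Number of edges between `a` and `b` via their lowest common ancestor."""
    depth_from_b = {n: i for i, n in enumerate(ancestors(b))}
    for i, n in enumerate(ancestors(a)):
        if n in depth_from_b:
            return i + depth_from_b[n]
    return None  # the nodes are in disconnected trees


def distance_weight(a, b):
    """Closer nodes get larger weights; unrelated nodes get zero."""
    d = distance(a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)


def disambiguate(root_node, sense_anchors):
    """Pick the suffix sense whose anchor nodes lie closest to the root's node."""
    scores = {
        sense: max(distance_weight(root_node, anchor) for anchor in anchors)
        for sense, anchors in sense_anchors.items()
    }
    return max(scores, key=scores.get), scores


print(disambiguate("novel", SENSE_ANCHORS)[0])
print(disambiguate("wholesale", SENSE_ANCHORS)[0])
```

Here a root mapped to the `novel` node sits two edges from `art` and four from `commerce`, so the person sense is chosen, while a root mapped to `wholesale` gets the price sense.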

Improvement of Korean Homograph Disambiguation using Korean Lexical Semantic Network (UWordMap) (한국어 어휘의미망(UWordMap)을 이용한 동형이의어 분별 개선)

  • Shin, Joon-Choul; Ock, Cheol-Young
    • Journal of KIISE / v.43 no.1 / pp.71-79 / 2016
  • Homograph disambiguation is an important task in Korean semantic processing and has been studied for a long time. Recently, machine learning approaches have shown good results in accuracy and speed, while knowledge-based approaches are being studied for untrained words. This paper proposes a hybrid method, based on the machine learning approach, that uses a lexical semantic network. The hybrid approach creates an additional corpus from subcategorization information and trains on it. The homograph tagging phase uses the hypernym of the homograph together with the additional corpus. Experiments with the Sejong Corpus and UWordMap show that the hybrid method is effective, increasing accuracy from 96.51% to 96.52%.

Entity Matching Method Using Semantic Similarity and Graph Convolutional Network Techniques (의미적 유사성과 그래프 컨볼루션 네트워크 기법을 활용한 엔티티 매칭 방법)

  • Duan, Hongzhou; Lee, Yongju
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.5 / pp.801-808 / 2022
  • Research on how to embed knowledge in large-scale Linked Data and apply neural network models to entity matching is relatively scarce. The most fundamental problem is that different labels lead to lexical heterogeneity. In this paper, we propose an extended GCN (Graph Convolutional Network) model that incorporates a re-align structure to address this lexical heterogeneity problem. The proposed model improved performance by 53% and 40% compared to the existing embedding-based MTransE and BootEA models, respectively, and by 5.1% compared to the GCN-based RDGCN model.