• Title/Summary/Keyword: Semantic Relation Extraction

Developing a Test-Bed Toolkit for Scientific Document Analysis (기술 문헌 분석 테스트베드 툴킷 개발)

  • Choi, Sung-Pil; Song, Sa-Kwang; Jung, Han-Min
    • The Journal of the Korea Contents Association / v.12 no.8 / pp.13-19 / 2012
  • This paper introduces a test-bed toolkit for evaluating and enhancing text analysis engines that extract technological knowledge from articles, patents, reports, and similar documents. The toolkit consists of two test-beds, one for technical entity recognition engines and one for relation extraction engines, which identify technical entities and predict the semantic relation types between them. Using the toolkit, users and developers can efficiently monitor the execution of technical text analysis engines and analyze their errors.

Relation Extraction based on Composite Kernel combining Pattern Similarity of Predicate-Argument Structure (술어-논항 구조의 패턴 유사도를 결합한 혼합 커널 기반 관계 추출)

  • Jeong, Chang-Hoo; Choi, Sung-Pil; Choi, Yun-Soo; Song, Sa-Kwang; Chun, Hong-Woo
    • Journal of Internet Computing and Services / v.12 no.5 / pp.73-85 / 2011
  • A great deal of valuable textual information can be used to extract relations between named entities from the literature. This paper proposes a composite kernel approach that calculates similarity from two sources of information: (1) phrase structure, via the convolution parse tree kernel, which has shown encouraging results, and (2) predicate-argument structure patterns. In other words, the approach handles syntactic and semantic structure together, each complementing the other. The approach was evaluated on several types of test collections and performed better than the previous approach that uses only syntactic structure information. It also outperformed the state-of-the-art approach.
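
The idea of combining a syntactic kernel with a predicate-argument similarity can be sketched as a weighted sum of two normalized kernels. This is only an illustration under stated assumptions: the abstract does not say how the two similarities are combined, the weight `alpha` is hypothetical, and a simple bag-of-features dot product stands in for both the convolution parse tree kernel and the pattern similarity.

```python
import math
from collections import Counter

def count_kernel(a, b):
    """Dot product of two bags of features (a simple stand-in kernel)."""
    ca, cb = Counter(a), Counter(b)
    return sum(ca[f] * cb[f] for f in ca if f in cb)

def normalized(kernel, a, b):
    """Cosine-style normalization so that each kernel value lies in [0, 1]."""
    denom = math.sqrt(kernel(a, a) * kernel(b, b))
    return kernel(a, b) / denom if denom else 0.0

def composite_kernel(x, y, alpha=0.5):
    """Weighted sum of a syntactic and a predicate-argument similarity.

    alpha is a hypothetical mixing weight, not a value from the paper.
    """
    k_syn = normalized(count_kernel, x["productions"], y["productions"])
    k_pas = normalized(count_kernel, x["pas_patterns"], y["pas_patterns"])
    return alpha * k_syn + (1 - alpha) * k_pas

# Toy instances: grammar productions and predicate-argument patterns (invented).
s1 = {"productions": ["S->NP VP", "VP->V NP", "NP->N"],
      "pas_patterns": ["bind(ARG1=PROT, ARG2=PROT)"]}
s2 = {"productions": ["S->NP VP", "VP->V NP PP", "NP->N"],
      "pas_patterns": ["bind(ARG1=PROT, ARG2=PROT)"]}

similarity = composite_kernel(s1, s2)
```

A weighted sum with non-negative weights keeps the result a valid kernel whenever both components are valid kernels, which is one reason this form of combination is common.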

A Study on the Deduction of Social Issues Applying Word Embedding: With an Emphasis on News Articles related to the Disabled (단어 임베딩(Word Embedding) 기법을 적용한 키워드 중심의 사회적 이슈 도출 연구: 장애인 관련 뉴스 기사를 중심으로)

  • Choi, Garam; Choi, Sung-Pil
    • Journal of the Korean Society for Information Management / v.35 no.1 / pp.231-250 / 2018
  • In this paper, we propose a new methodology for extracting and formalizing subjective topics at a specific time using a set of keywords extracted automatically from online news articles. We first extracted keywords with TF-IDF, which we selected through a series of comparative experiments on statistical weighting schemes that measure the importance of individual words in a large text collection. To calculate semantic relations between the extracted keywords effectively, we built word embedding vectors from about 1,000,000 separately collected news articles. Each extracted keyword was represented as a numerical vector, and the vectors were clustered with the K-means algorithm. An in-depth qualitative analysis of the resulting keyword clusters showed that most of them were appropriate topics, semantically coherent enough to be labeled easily.
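
The pipeline described above (TF-IDF keyword extraction followed by K-means clustering of keyword vectors) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the TF-IDF variant, the toy documents, and the hand-made two-dimensional vectors that stand in for trained word embeddings are all assumptions.

```python
import math
from collections import Counter

def tfidf_keywords(docs, top_n=2):
    """Return the top_n highest-scoring TF-IDF terms of each tokenized document."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    result = []
    for doc in docs:
        tf = Counter(doc)
        scores = {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        # Sort by descending score; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result.append(ranked[:top_n])
    return result

def kmeans(points, k, iters=10):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels

# Toy tokenized "articles" (invented for illustration).
docs = [["welfare", "policy", "welfare", "news"],
        ["employment", "policy", "news"],
        ["transport", "access", "news"]]
keywords = tfidf_keywords(docs)

# Hypothetical 2-D vectors standing in for word embeddings of four keywords.
names = ["welfare", "transport", "employment", "access"]
vectors = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]
clusters = dict(zip(names, kmeans(vectors, k=2)))
```

In this toy run, "news" scores zero because it occurs in every document, which is the behavior that makes TF-IDF useful for picking out distinctive keywords.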

Auto-Analysis of Traffic Flow through Semantic Modeling of Moving Objects (움직임 객체의 의미적 모델링을 통한 차량 흐름 자동 분석)

  • Choi, Chang; Cho, Mi-Young; Choi, Jun-Ho; Choi, Dong-Jin; Kim, Pan-Koo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.8 no.6 / pp.36-45 / 2009
  • Recently, there has been growing interest in automatic traffic flow analysis and accident detection using low-level information from road video. This paper studies automatic traffic flow analysis and its algorithm, and the application of traffic accident detection in traffic management systems. To this end, we built spatio-temporal relation models from topological and directional relations, and then matched the models to the directional motion verbs in Levin's class of verbs of inherently directed motion. Synonyms and antonyms were then added using WordNet. To measure the similarity between the proposed models and the trajectory of a moving object in the video, objects are extracted and their trajectories are compared with the models. Because each model has different features, the generated rules are applied to similarity measurement with TSR (Tangent Space Representation). The results can be extended to automatic vehicle accident detection using CCTV.

Detection of Protein Subcellular Localization based on Syntactic Dependency Paths (구문 의존 경로에 기반한 단백질의 세포 내 위치 인식)

  • Kim, Mi-Young
    • The KIPS Transactions:PartB / v.15B no.4 / pp.375-382 / 2008
  • A protein's subcellular localization is considered an essential part of the description of its associated biomolecular phenomena. As the volume of biomolecular reports has increased, there has been a great deal of research on text mining to detect protein subcellular localization information in documents. It has been argued that linguistic information, especially syntactic information, is useful for identifying the subcellular localizations of proteins of interest. However, previous systems for detecting protein subcellular localization information used only shallow syntactic parsers, and showed poor performance. Thus, there remains a need to use a full syntactic parser and to apply deep linguistic knowledge to the analysis of text for protein subcellular localization information. In addition, we have attempted to use semantic information from the WordNet thesaurus. To improve performance in detecting protein subcellular localization information, this paper proposes a three-step method based on a full syntactic dependency parser and WordNet thesaurus. In the first step, we constructed syntactic dependency paths from each protein to its location candidate, and then converted the syntactic dependency paths into dependency trees. In the second step, we retrieved root information of the syntactic dependency trees. In the final step, we extracted syn-semantic patterns of protein subtrees and location subtrees. From the root and subtree nodes, we extracted syntactic category and syntactic direction as syntactic information, and synset offset of the WordNet thesaurus as semantic information. According to the root information and syn-semantic patterns of subtrees from the training data, we extracted (protein, localization) pairs from the test sentences. Even with no biomolecular knowledge, our method showed reasonable performance in experimental results using Medline abstract data. Our proposed method gave an F-measure of 74.53% for training data and 58.90% for test data, significantly outperforming previous methods, by 12-25%.
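
The first two steps (building the dependency path from a protein to a location candidate and reading off its root) can be illustrated on a hand-built dependency tree. This is a sketch under assumptions, not the paper's system: the sentence, the head map, and the function names are invented, no parser is called, and tokens are assumed to be unique within the sentence.

```python
def path_to_root(token, heads):
    """Follow head links from a token up to the root of the sentence."""
    path = [token]
    while heads[path[-1]] is not None:
        path.append(heads[path[-1]])
    return path

def dependency_path(src, dst, heads):
    """Return (path, root): the tokens linking src to dst, and the node where
    the upward path from src meets the upward path from dst."""
    up = path_to_root(src, heads)
    down = path_to_root(dst, heads)
    ancestors = set(down)
    # First node above src (or src itself) that also dominates dst.
    root = next(node for node in up if node in ancestors)
    path = up[:up.index(root) + 1] + down[:down.index(root)][::-1]
    return path, root

# Hypothetical parse of "PKC localizes to the membrane": token -> head token.
heads = {
    "localizes": None,       # sentence root
    "PKC": "localizes",
    "to": "localizes",
    "membrane": "to",
    "the": "membrane",
}

path, root = dependency_path("PKC", "membrane", heads)
```

In this toy parse the path runs from the protein up to the verb and back down to the location candidate, so the verb is the root of the resulting subtree; the pattern extraction of the third step would then work on the protein side and the location side of that root.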

Pattern Construction for Semantic Relation Extraction using Verb Information (동사 정보를 활용한 의미 관계 추출을 위한 패턴 구축)

  • Kim, Se-Jong; Lee, Yong-Hun; Lee, Jong-Hyeok
    • Annual Conference on Human and Language Technology / 2008.10a / pp.118-123 / 2008
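  • An ontology represents the objects and concepts that exist in the real world, and the relations among terms, in a form that computers can understand. In ontology construction, a large corpus is typically used by treating the strings that appear between terms in the corpus as patterns, and assigning each term pair that occurs with a particular pattern to the semantic relation that the pattern represents. However, existing methods extract patterns mainly from the strings between two terms, so they cannot exploit the richer information contained in the rest of the sentence. To address this limitation, this paper proposes a method that improves the precision of patterns by considering the strings between term pairs together with the surrounding verb information. It also uses verb synonyms to build generalized patterns that cover a wider variety of terms. The method achieved a precision of 64% for the is-a relation, 83% for part-of, 73% for made-of, and 72% for use, improving on the existing method in every case.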
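
The effect of adding verb information to a string pattern can be shown with a toy lookup. Everything here is hypothetical: the synonym table, the two patterns, and the English examples are invented for illustration, whereas the paper works on a large corpus and builds its patterns from data.

```python
# Hypothetical synonym table: each verb maps to one representative verb,
# so that a single pattern covers all of its synonyms.
VERB_CLASS = {
    "make": "make", "carve": "make", "craft": "make",
    "attach": "attach", "mount": "attach", "fix": "attach",
}

# Hypothetical patterns: (string between the terms, representative verb) -> relation.
# The string "of" alone is ambiguous; the nearby verb picks the relation.
PATTERNS = {
    ("of", "make"): "made-of",
    ("of", "attach"): "part-of",
}

def extract_relation(term1, between, term2, verb):
    """Return (relation, term1, term2) if a pattern matches, else None."""
    key = (between, VERB_CLASS.get(verb, verb))
    relation = PATTERNS.get(key)
    return (relation, term1, term2) if relation else None

# "A statue of marble was carved."  /  "The handle of the door was mounted."
r1 = extract_relation("statue", "of", "marble", "carve")
r2 = extract_relation("handle", "of", "door", "mount")
r3 = extract_relation("handle", "of", "door", "paint")  # no matching pattern
```

A pattern keyed on the between-string alone would have to assign both example pairs to the same relation, which is the limitation of string-only patterns that the abstract describes.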

An Algorithm for Translation from RDB Schema Model to XML Schema Model Considering Implicit Referential Integrity (묵시적 참조 무결성을 고려한 관계형 스키마 모델의 XML 스키마 모델 변환 알고리즘)

  • Kim, Jin-Hyung; Jeong, Dong-Won; Baik, Doo-Kwon
    • Journal of KIISE:Databases / v.33 no.5 / pp.526-537 / 2006
  • The most representative approach to storing XML data efficiently is to store it in relational databases. Its merit is that it readily accommodates the reality that most data are still stored in relational databases. The approach requires converting XML data into relational data, or relational data into XML data. The most important issue in this translation is to reflect the structural and semantic relations of the RDB exactly in the XML schema model. Many studies have addressed the issue, but existing methods either do not cover structural semantics or support only explicit referential integrity relations. In this paper, we propose an algorithm that extracts implicit referential integrity relations automatically. We design and implement the algorithm and carry out comparative evaluations using translated XML documents. The algorithm improves the extraction and conversion of semantic information and secures sufficient referential integrity for the target databases. With it, we can guarantee that both the explicit and the implicit referential integrity relations of the initial relational schema model are preserved, which yields a more exact XML schema model.
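
The abstract does not say how implicit referential integrity is detected. One common heuristic, shown here purely as an assumed illustration and not as the paper's algorithm, is a value-inclusion check: a column is a candidate reference if all of its values appear in another table's key column and no foreign key has been declared for it. The table layout and all names below are hypothetical.

```python
def implicit_references(tables, primary_keys, declared_fks=frozenset()):
    """List (table, column, ref_table, ref_column) candidates whose values
    are all contained in the referenced key but that lack a declared FK."""
    found = []
    for ref_table, pk in primary_keys.items():
        key_values = {row[pk] for row in tables[ref_table]}
        for table, rows in tables.items():
            if table == ref_table or not rows:
                continue
            for column in rows[0]:
                candidate = (table, column, ref_table, pk)
                if candidate in declared_fks:
                    continue  # explicit reference, already known
                values = {row[column] for row in rows}
                if values <= key_values:  # inclusion dependency holds
                    found.append(candidate)
    return found

# Toy relational data: each table is a list of row dictionaries.
tables = {
    "dept": [{"dept_id": 1, "name": "R&D"}, {"dept_id": 2, "name": "Sales"}],
    "emp": [{"emp_id": 10, "dept_no": 1},
            {"emp_id": 11, "dept_no": 2},
            {"emp_id": 12, "dept_no": 1}],
}
primary_keys = {"dept": "dept_id", "emp": "emp_id"}

candidates = implicit_references(tables, primary_keys)
```

Value inclusion can hold by coincidence on small tables, so candidates found this way would normally need confirmation (for example by column names or types) before being carried into the generated XML schema model.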