• Title/Summary/Keyword: semantic relevance-based extraction

Search Result 50, Processing Time 0.02 seconds

Identification of Conserved Protein Domain Combination based on Association Rule (연관성 규칙에 기반한 보존된 단백질 도메인 조합의 식별)

  • Jung, Suk-Hoon;Jang, Woo-Hyuk;Han, Dong-Soo
    • Journal of KIISE:Computing Practices and Letters / v.15 no.5 / pp.375-379 / 2009
  • A protein domain is a conserved unit of compact three-dimensional structure and of evolution, and it carries a specific function. Domains may appear in recurring patterns within proteins because they have been conserved through evolution to form protein functions. In this paper, we propose a formal method for analyzing the conservation of domain combinations based on association rules. The proposed method measures the mutual dependency of the domains in a combination in addition to their co-occurrence frequency, which is the conventional measure. Using the method, we extracted conserved domain combinations from S. cerevisiae proteins and analyzed their functions with Gene Ontology. The results indicate that domains in S. cerevisiae proteins form patterns whose members are highly affiliated with one another, and that the extracted patterns tend to be associated with molecular function. The results also show that the proposed method is superior to conventional ones for identifying domain combinations conserved for functional cooperation.
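
As a rough illustration of the idea in this abstract, the sketch below scores a domain combination by its co-occurrence frequency (support) and by a mutual-dependency measure. The abstract does not state the paper's formula, so all-confidence (support of the combination divided by the largest single-domain support) is used here purely as a stand-in assumption. The function names, thresholds, and the protein-to-domain data are hypothetical.

```python
from itertools import combinations


def support(proteins, domains):
    """Fraction of proteins that contain every domain in `domains`."""
    domains = set(domains)
    return sum(domains <= p for p in proteins) / len(proteins)


def all_confidence(proteins, domains):
    """Support of the combination divided by the largest single-domain support.

    Assumption: a stand-in for the paper's mutual-dependency measure.
    """
    max_single = max(support(proteins, [d]) for d in domains)
    return support(proteins, domains) / max_single if max_single else 0.0


def conserved_pairs(proteins, min_support, min_dependency):
    """Return domain pairs that pass both the frequency and dependency thresholds."""
    all_domains = sorted(set().union(*proteins))
    result = []
    for pair in combinations(all_domains, 2):
        s = support(proteins, pair)
        a = all_confidence(proteins, pair)
        if s >= min_support and a >= min_dependency:
            result.append((pair, s, a))
    return result


# Made-up example: each protein is the set of domains it contains.
proteins = [
    {"SH3", "SH2", "Kinase"},
    {"SH3", "SH2"},
    {"Kinase"},
    {"SH3", "SH2", "PH"},
]
pairs = conserved_pairs(proteins, min_support=0.5, min_dependency=0.9)
```

In this toy data only the SH2/SH3 pair passes both thresholds, since the two domains always occur together. A pair that is frequent only because one of its members is ubiquitous would pass a support threshold but score low on the dependency measure.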

Document Summarization Considering Entailment Relation between Sentences (문장 수반 관계를 고려한 문서 요약)

  • Kwon, Youngdae;Kim, Noo-ri;Lee, Jee-Hyong
    • Journal of KIISE / v.44 no.2 / pp.179-185 / 2017
  • Document summarization aims to generate a summary that is consistent and contains the highly related sentences in a document. In this study, we implemented a document summarization method that extracts highly related sentences from a whole document by considering both the similarities and the entailment relations between sentences. To this end, we propose a new algorithm, TextRank-NLI, which combines a Recurrent Neural Network based Natural Language Inference model with a graph-based ranking algorithm used in single-document extractive summarization. To evaluate the new algorithm, we conducted experiments on the same datasets used for the TextRank algorithm. The results indicate that TextRank-NLI improved performance by 2.3% compared with TextRank.
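
A minimal sketch of how similarity and entailment scores might feed a graph-based ranker is given below. The abstract does not say how TextRank-NLI mixes the two signals, so the linear blend with weight `alpha` is an assumption. The entailment matrix would normally come from an NLI model and is hard-coded here, and all names and numbers are hypothetical.

```python
def combine(similarity, entailment, alpha=0.5):
    """Blend similarity and entailment into one edge-weight matrix (assumed rule)."""
    n = len(similarity)
    return [
        [
            0.0 if i == j else alpha * similarity[i][j] + (1 - alpha) * entailment[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]


def textrank(weights, damping=0.85, iterations=50):
    """PageRank-style power iteration; weights[i][j] is the edge from sentence i to j."""
    n = len(weights)
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            incoming = sum(
                scores[i] * weights[i][j] / out_sum[i]
                for i in range(n)
                if out_sum[i] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


# Made-up scores for three sentences.
similarity = [[0.0, 0.8, 0.1], [0.8, 0.0, 0.2], [0.1, 0.2, 0.0]]
entailment = [[0.0, 0.9, 0.0], [0.1, 0.0, 0.0], [0.3, 0.6, 0.0]]  # stand-in for NLI output

scores = textrank(combine(similarity, entailment))
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
```

An extractive summary would then take the top-ranked sentences in their original document order.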

Keyword-based networked knowledge map expressing content relevance between knowledge (지식 간 내용적 연관성을 표현하는 키워드 기반 네트워크형 지식지도 개발)

  • Yoo, Keedong
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.119-134 / 2018
  • A knowledge map, as the taxonomy used in a knowledge repository, should be structured to support and supplement the knowledge activities of users who sequentially inquire about and select knowledge for problem solving. The conventional knowledge map with a hierarchical structure has the advantage of systematically sorting out the types and status of the knowledge to be managed; however, it does not reflect the knowledge user's process of cognition and utilization, and it cannot support the user's activity of querying and extracting knowledge. This study suggests a methodology for constructing a networked knowledge map that supports and reinforces referential navigation, that is, searching for and selecting pieces of knowledge that are related and chained to one another in terms of content. Regarding a keyword as the semantic information linking pieces of knowledge, the networked knowledge map can be constructed automatically by aggregating each set of knowledge links. Because a keyword represents the content of a document, documents with common keywords are similar in content, and the keyword-based document network therefore serves as a map expressing the interactions between related knowledge. To examine the feasibility of the proposed methodology, 50 research papers were randomly selected, and an example networked knowledge map expressing their content relevance was implemented using common keywords.
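
The keyword-linking step described in this abstract can be sketched as follows. This is a simplified illustration, not the paper's procedure: it links any two documents that share at least one keyword, and the function name and sample documents are hypothetical.

```python
from itertools import combinations


def build_knowledge_map(doc_keywords):
    """Link every pair of documents that shares at least one keyword.

    Returns {(doc_a, doc_b): shared_keywords} with doc_a < doc_b.
    """
    links = {}
    for a, b in combinations(sorted(doc_keywords), 2):
        shared = set(doc_keywords[a]) & set(doc_keywords[b])
        if shared:
            links[(a, b)] = shared
    return links


# Made-up documents and keywords.
docs = {
    "paper1": {"ontology", "knowledge map"},
    "paper2": {"knowledge map", "taxonomy"},
    "paper3": {"text mining"},
}
links = build_knowledge_map(docs)
```

Each key in `links` is an edge of the networked map, and the shared keywords label why the two documents are related.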

A WordNet-based Open Market Category Search System for Efficient Goods Registration (효율적인 상품등록을 위한 워드넷 기반의 오픈마켓 카테고리 검색 시스템)

  • Hong, Myung-Duk;Kim, Jang-Woo;Jo, Geun-Sik
    • Journal of the Korea Society of Computer and Information / v.17 no.9 / pp.17-27 / 2012
  • The open market is one of the key factors in accelerating profit. Retailers usually sell items in several open markets. One of the challenges for retailers is assigning item categories under different classification systems. In this research, we propose an item category recommendation method to support appropriate product category registration. Our recommendations are based on the semantic relation between an existing open market's categorization and that of any other open market. To analyze correlations between categories, we use morpheme analysis, the Korean Wiki Dictionary, WordNet, and the Google Translation API. The proposed method measures semantic similarity and recommends the category most similar to a guide word. The experimental results show that our system improves the accuracy of category search and that retailers can easily select appropriate categories with the proposed method.

Component-based Reuse using Semantic Network (의미망을 이용한 컴포넌트 기반 재사용)

  • Han Jung-Soo;Kim Gui-Jug
    • Proceedings of the Korea Information Processing Society Conference / 2004.11a / pp.357-360 / 2004
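  • This paper proposes a source-code-based component retrieval method for efficient software reuse. The proposed method consists of two steps: first, the class-based components stored in a library are parsed to construct a semantic network; next, retrieval is performed using the source code submitted by the user as a query. Identifiers extracted from the source code activate the components' semantic network, and related components are retrieved. The proposed retrieval method can increase reusability by drawing the programmer's attention to the components in the library, and, by providing programming patterns, it can help the programmer use them as a guideline for writing programs.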
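
The retrieval step, in which identifiers from the query activate a semantic network, resembles spreading activation. The sketch below shows that general technique under stated assumptions: the abstract gives no details of the network, so the link weights, the `decay` factor, the number of steps, and the sample identifiers and component names are all hypothetical.

```python
from collections import defaultdict


def spread_activation(network, seeds, decay=0.5, steps=2):
    """Propagate activation from seed nodes along weighted links.

    network: {node: {neighbor: weight}}
    seeds:   identifiers extracted from the query source code
    """
    activation = defaultdict(float)
    frontier = {s: 1.0 for s in seeds if s in network}
    for node, value in frontier.items():
        activation[node] += value
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            for neighbor, weight in network.get(node, {}).items():
                next_frontier[neighbor] += value * weight * decay
        for node, value in next_frontier.items():
            activation[node] += value
        frontier = next_frontier
    return dict(activation)


# Made-up network: identifiers link to components, and components link to each other.
network = {
    "read": {"FileReader": 1.0},
    "buffer": {"BufferedReader": 1.0, "FileReader": 0.5},
    "FileReader": {"BufferedReader": 0.8},
    "BufferedReader": {},
}
activation = spread_activation(network, ["read", "buffer"])
components = {"FileReader", "BufferedReader"}
ranked = sorted(components, key=lambda c: -activation.get(c, 0.0))
```

Components are then presented in order of activation, so a component reached through several query identifiers ranks above one reached through a single weak link.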

Design and Implementation of Analysis System for Answer Dataset with Data Mining (데이터 마이닝을 이용한 시험 응답데이터 분석시스템 설계 및 구현)

  • Kwak, Eun-Young;Kim, Hyeoncheol
    • The Journal of Korean Association of Computer Education / v.11 no.1 / pp.65-74 / 2008
  • In this paper, we introduce a system that analyzes answer datasets using a data mining method. We analyze students' answer data collected from a test that includes multiple-choice question items and find associations between the items. Analyzing evaluation results with our system will not only provide correct information on students' achievement levels but also provide a basis for correcting weaknesses in the evaluation procedures, question items, or teaching and learning procedures. Furthermore, it will enable us to improve the quality of question items for future use so that we can secure high-quality item sets.

Document Summarization Based on Sentence Clustering Using Graph Division (그래프 분할을 이용한 문장 클러스터링 기반 문서요약)

  • Lee Il-Joo;Kim Min-Koo
    • The KIPS Transactions:PartB / v.13B no.2 s.105 / pp.149-154 / 2006
  • The main purpose of document summarization is to reduce the complexity of documents that consist of sub-themes and to create a summary that covers those sub-themes. This paper proposes a summarization system that extracts salient sentences according to sub-themes by using graph division. A document can be represented as a graph of representative terms chosen through term relatedness analysis based on co-occurrence information. This graph is then subdivided, using its connectivity information, to represent the sub-themes. Each divided graph is a kind of sentence cluster whose members are closely related. When salient sentences are extracted from the divided graphs, a summary consisting of the core elements of sentences from the sub-themes can be produced. As a result, the summarization quality will be improved.
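
The graph-division idea can be sketched as below, under simplifying assumptions. Terms that co-occur in a sentence are connected, the graph is divided into connected components (one per sub-theme), and one sentence is picked per component. The paper's actual term selection, division criterion, and salience rule are not given in the abstract, so the choices here (plain connected components, and picking the sentence that covers the most terms of a component) are assumptions, and the sample data is made up.

```python
from collections import defaultdict


def term_graph(sentence_terms):
    """Connect representative terms that co-occur in the same sentence."""
    graph = defaultdict(set)
    for terms in sentence_terms:
        for t in terms:
            graph[t] |= terms - {t}
    return graph


def components(graph):
    """Divide the graph into connected components, one per sub-theme."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        result.append(component)
    return result


def summarize(sentence_terms):
    """Pick, for each sub-theme, the sentence covering the most of its terms."""
    picks = []
    for component in components(term_graph(sentence_terms)):
        best = max(
            range(len(sentence_terms)),
            key=lambda i: (len(sentence_terms[i] & component), -i),
        )
        picks.append(best)
    return sorted(picks)


# Each sentence is reduced to its set of representative terms (made-up data).
sentence_terms = [
    {"graph", "summary"},
    {"graph", "division"},
    {"kernel"},
    {"kernel", "svm"},
]
selected = summarize(sentence_terms)
```

Here the terms fall into two components, so the sketch returns one sentence index for each sub-theme.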

The Scheme for Path-based Query Processing on the Semantic Data (시맨틱 웹 데이터의 경로 기반 질의 처리 기법)

  • Kim, Youn-Hee;Kim, Jee-Hyun
    • Journal of the Korea Society of Computer and Information / v.14 no.10 / pp.31-41 / 2009
  • In the Semantic Web, intelligent information retrieval and automated web services become possible by defining the concepts of information resources and representing the semantic relations between resources with metadata and ontologies. Managing semantic data such as ontologies and metadata efficiently is therefore very important for implementing the essential functions of the Semantic Web. We propose an index structure that supports more accurate search results and efficient query processing by considering the semantic and structural features of the semantic data. In particular, we use a graph data model to express these semantic and structural features and process various types of queries using path expressions based on the graph model. The proposed index differs from earlier studies in that it aims to cover the concept of the Semantic Web in its entirety by querying both the structural path information extracted first and the path information extracted second through semantic inference with the ontology. In the experiments, we show that our approach is more accurate and efficient than previous approaches and is applicable to various applications in the Semantic Web.

Statistical Word Sense Disambiguation based on using Variant Window Size (가변길이 윈도우를 이용한 통계 기반 동형이의어의 중의성 해소)

  • Park, Gi-Tae;Lee, Tae-Hoon;Hwang, So-Hyun;Lee, Hyun Ah
    • Annual Conference on Human and Language Technology / 2012.10a / pp.40-44 / 2012
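  • Lexical semantic ambiguity is one of the characteristics of natural language and a factor that lowers the accuracy of natural language processing, and research that uses linguistic rules and various machine learning models to resolve it is ongoing. For disambiguating homographs, the surrounding context is the most important feature, and the size of the context window used to extract feature information is closely related to disambiguation performance and must therefore be chosen carefully. This paper proposes a variable-length window method that uses a variable-sized context in the sense disambiguation process. Trained on the morphologically and semantically analyzed corpus of the Sejong Corpus and tested on 32,735 sentences covering 12 words, the method achieved an average accuracy of 92.2% for predicates, an improvement over using a fixed window.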

Relation Extraction based on Composite Kernel using Pattern Similarity of Predicate-Argument Structure (술어-논항 구조의 패턴 유사도를 활용한 혼합 커널 기반 관계 추출)

  • Jeong, Chang-Hoo;Chun, Hong-Woo;Choi, Yun-Soo;Song, Sa-Kwang;Choi, Sung-Pil
    • Proceedings of the Korean Information Science Society Conference / 2011.06c / pp.276-279 / 2011
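  • Various kinds of document analysis results can be used when automatically extracting relations between entities in a document. This paper proposes a composite kernel that uses both the phrase-structure similarity information of a previously developed tree kernel, which has shown relatively high performance, and the similarity information of predicate-argument structure patterns, which express meaningful associations between two entities. We built a mutually complementary composite kernel by combining the existing tree kernel technique, which uses syntactic structure, with a predicate-argument structure pattern similarity kernel, which uses the semantic structure between predicates and arguments, and we measured the performance of the resulting kernel experimentally. The experiments confirmed that the composite kernel incorporating predicate-argument structure pattern information performs better than the tree kernel alone, which uses only phrase-structure information. This demonstrates that, in addition to the phrase-structure information of relation instances, predicate-argument structure patterns expressing meaningful associations between entities are also very useful information for relation extraction.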
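
One common way to build a composite kernel is a weighted sum of two kernels, which is itself a valid kernel when both inputs are valid and the weights are non-negative. The sketch below illustrates that general construction only. The abstract does not state the paper's combination rule or weights, and the set-overlap kernels here are simple stand-ins for the real tree kernel and pattern similarity kernel. All names and sample data are hypothetical.

```python
def overlap_kernel(feature):
    """Build a kernel that counts shared items in the named feature set.

    Counting shared set members equals a dot product of indicator vectors,
    so it is a valid kernel. It is only a stand-in for the real kernels.
    """
    def kernel(x, y):
        return float(len(x[feature] & y[feature]))
    return kernel


def composite_kernel(k_syntax, k_semantic, alpha=0.5):
    """Return K(x, y) = alpha * k_syntax(x, y) + (1 - alpha) * k_semantic(x, y)."""
    def kernel(x, y):
        return alpha * k_syntax(x, y) + (1 - alpha) * k_semantic(x, y)
    return kernel


# Two made-up relation instances with syntactic fragments and
# predicate-argument patterns already extracted.
a = {"fragments": {"NP-VP", "VP-NP"}, "patterns": {"activate(A,B)"}}
b = {"fragments": {"NP-VP"}, "patterns": {"activate(A,B)", "bind(A,B)"}}

kernel = composite_kernel(overlap_kernel("fragments"), overlap_kernel("patterns"))
k_ab = kernel(a, b)
k_aa = kernel(a, a)
```

In practice, the mixing weight `alpha` would be tuned on held-out data, and the resulting kernel could be passed to any kernel-based classifier.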