• Title/Summary/Keyword: semantics-based information extraction

A study on unstructured text mining algorithm through R programming based on data dictionary (Data Dictionary 기반의 R Programming을 통한 비정형 Text Mining Algorithm 연구)

  • Lee, Jong Hwa;Lee, Hyun-Kyu
    • Journal of Korea Society of Industrial Information Systems / v.20 no.2 / pp.113-124 / 2015
  • Unlike structured data, which are gathered and saved in a predefined structure, unstructured text data, mostly written in natural language, have found wider application recently with the emergence of Web 2.0. Text mining is one of the most important big data analysis techniques for extracting meaningful information from text, because the amount of text data has increased and because human emotion is expressed directly in it. In this study, we used R, an open-source software environment for statistical analysis, and studied algorithm implementations for conducting analyses such as frequency analysis, cluster analysis, word clouds, and social network analysis. In particular, to keep the analysis within our research scope, we used a keyword extraction method based on a data dictionary. By applying the method to real cases, we found that R is very useful as statistical analysis software that runs on a variety of operating systems and interfaces with other languages.
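
The dictionary-based keyword extraction and frequency analysis mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration in Python rather than R, and it is not the authors' algorithm: the function name, the tokenizer, the sample documents, and the dictionary terms are all assumptions made for the example.

```python
import re
from collections import Counter


def dictionary_keyword_counts(documents, data_dictionary):
    """Count only the tokens that appear in the data dictionary."""
    vocabulary = {term.lower() for term in data_dictionary}
    counts = Counter()
    for text in documents:
        # Crude tokenizer (an assumption): lowercase runs of letters and digits.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(token for token in tokens if token in vocabulary)
    return counts


# Hypothetical inputs, for illustration only.
docs = [
    "Battery life is great, screen is great.",
    "The screen cracked; battery is fine.",
]
terms = ["battery", "screen", "camera"]
keyword_counts = dictionary_keyword_counts(docs, terms)
```

Restricting the count to dictionary terms keeps frequent but off-topic words such as "great" out of the result, which is the point of scoping the analysis with a data dictionary.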

Storing Scheme based on Graph Data Model for Managing RDF/S Data (RDF/S 데이터의 관리를 위한 그래프 데이터 모델 기반 저장 기법)

  • Kim, Youn-Hee;Choi, Jae-Yeon;Lim, Hae-Chull
    • Journal of Digital Contents Society / v.9 no.2 / pp.285-293 / 2008
  • In the Semantic Web, metadata and ontologies for representing the semantics and conceptual relationships of information resources are essential. RDF and RDF Schema are W3C standard models for describing metadata and ontologies. Therefore, many studies on storing and retrieving RDF and RDF Schema documents are required. In this paper, we analyze the available query patterns over both RDF and RDF Schema and classify queries into three patterns. Because RDF and RDF Schema can be represented as graph models, we propose strategies for storing and retrieving them using those graph models. With these strategies, entities reachable from a certain class or property in RDF and RDF Schema can be retrieved without the performance loss caused by multiple table joins.
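
To illustrate why a graph view helps with reachability queries, the following sketch traverses RDF triples breadth-first instead of joining a triple table with itself once per hop. The abstract does not describe the paper's actual storage scheme, so this shows only the general idea: the function name, the `ex:` identifiers, and the sample triples are hypothetical.

```python
from collections import defaultdict, deque


def reachable_entities(triples, start):
    """Return every node reachable from `start` by following subject -> object edges."""
    adjacency = defaultdict(list)
    for subject, _predicate, obj in triples:
        adjacency[subject].append(obj)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(start)  # report only the nodes reached, not the start itself
    return seen


# Hypothetical RDF/RDF Schema triples, for illustration only.
triples = [
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
    ("ex:Person", "rdfs:subClassOf", "ex:Agent"),
]
reached = reachable_entities(triples, "ex:alice")
```

In a single triple table, the same query would need one self-join per hop, and the number of hops is not known in advance.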

Korean Coreference Resolution using the Multi-pass Sieve (Multi-pass Sieve를 이용한 한국어 상호참조해결)

  • Park, Cheon-Eum;Choi, Kyoung-Ho;Lee, Changki
    • Journal of KIISE / v.41 no.11 / pp.992-1005 / 2014
  • Coreference resolution finds all expressions that refer to the same entity in a document. It is important for information extraction, document classification, document summarization, and question answering systems. In this paper, we adapt Stanford's multi-pass sieve system, one of the best rule-based coreference resolution models, to Korean. All noun phrases are considered mentions. Also, unlike Stanford's multi-pass sieve system, the dependency parse tree is used for mention extraction and a Korean acronym list is built dynamically. In addition, we propose a method that calculates weights by applying the transitive properties of centers from centering theory when resolving Korean pronouns. The experiments show that our system obtains MUC 59.0%, $B^3$ 59.5%, CEAFe 63.5%, and CoNLL (mean) 60.7%.
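
The reported CoNLL (mean) figure is consistent with the usual convention of averaging the MUC, B-cubed, and CEAFe scores. The short check below assumes that the abstract follows this convention. The function name is illustrative.

```python
def conll_mean(muc, b_cubed, ceaf_e):
    """Unweighted mean of the three coreference scores (CoNLL convention)."""
    return (muc + b_cubed + ceaf_e) / 3


# Scores taken from the abstract above, in percent.
score = conll_mean(59.0, 59.5, 63.5)
print(round(score, 1))  # prints 60.7
```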

Similarity Evaluation of Popular Music based on Emotion and Structure of Lyrics (가사의 감정 분석과 구조 분석을 이용한 노래 간 유사도 측정)

  • Lee, Jaehwan;Lim, Hyewon;Kim, Hyoung-Joo
    • KIISE Transactions on Computing Practices / v.22 no.10 / pp.479-487 / 2016
  • People can listen to almost every type of music through music streaming services without owning it. Ironically, this makes it difficult to choose what to listen to. A music recommendation system helps people make a choice. However, existing recommendation systems have high computational complexity and do not consider context information. Emotion is one of the most important kinds of context information for music. Lyrics can be easily processed with various language processing techniques and can even be used to extract the emotion of the music itself. We suggest a music-level similarity evaluation method using emotion and structure. Our results show that it is important to consider semantic information when evaluating the similarity of music.

Prediction of Prosodic Break Using Syntactic Relations and Prosodic Features (구문 관계와 운율 특성을 이용한 한국어 운율구 경계 예측)

  • Jung, Young-Im;Cho, Sun-Ho;Yoon, Ae-Sun;Kwon, Hyuk-Chul
    • Korean Journal of Cognitive Science / v.19 no.1 / pp.89-105 / 2008
  • In this paper, we suggest a rule-based system for predicting natural prosodic phrase breaks from Korean texts. To implement the rule-based system, (1) sentence constituents are sub-categorized according to their syntactic functions, (2) syntactic phrases are recognized using the dependency relations among the sub-categorized constituents, and (3) rules for predicting prosodic phrase breaks are created. In addition, (4) the length of syntactic phrases and sentences, the position of syntactic phrases in a sentence, and sense information of contextual words are considered in order to determine the variable prosodic phrase breaks. Based on these rules and features, we obtained an accuracy of over 90% in predicting the positions of major breaks and no breaks, which correlate highly with the syntactic structure of the sentence. As for the overall accuracy in predicting all prosodic phrase breaks, the suggested system shows a Break_Correct of 87.18% and a Juncture Correct of 89.27%, which are higher than those of other models.

Linking Korean Predicates to Knowledge Base Properties (한국어 서술어와 지식베이스 프로퍼티 연결)

  • Won, Yousung;Woo, Jongseong;Kim, Jiseong;Hahm, YoungGyun;Choi, Key-Sun
    • Journal of KIISE / v.42 no.12 / pp.1568-1574 / 2015
  • Relation extraction plays a role in the process of transforming a sentence into a knowledge base form. In this paper, we focus on the predicates in a sentence and aim to identify the relevant knowledge base properties required to elucidate the relationships between entities, which enables a computer to understand the meaning of a sentence more clearly. Distant supervision is a well-known approach to relation extraction, and it performs lexicalization tasks for knowledge base properties by automatically generating a large amount of labeled data. In other words, the predicate in a sentence is linked or mapped to the possible properties defined by ontologies in the knowledge base. This lexical and ontological linking of information provides a way of generating structured information and a basis for enriching the knowledge base.

An efficient Decision-Making using the extended Fuzzy AHP Method(EFAM) (확장된 Fuzzy AHP를 이용한 효율적인 의사결정)

  • Ryu, Kyung-Hyun;Pi, Su-Young
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.6 / pp.828-833 / 2009
  • The WWW, a massive set of documents available on the Web, is a thesaurus of various information for users. However, search engines spend a lot of time retrieving necessary information and filtering out unnecessary information for the user. In this paper, we propose the EFAM (Extended Fuzzy AHP Method) model to manage Web resources efficiently and to make definite decisions for problems in a specific domain. The EFAM model performs emotion analysis based on domain corpus information and is composed of systematic common concept grids built from the knowledge of multiple experts. The proposed EFAM model can therefore extract documents by considering emotion criteria in the semantic context of concepts extracted from the corpus of a specific domain. An experiment confirms that our model provides more efficient decision-making than conventional methods such as AHP and Fuzzy AHP, which describe decision-making elements as a hierarchical structure based on the alternatives, evaluation criteria, subjective attribute weights, and fuzzy relations between concepts and objects.

Medicine Ontology Building based on Semantic Relation and Its Application (의미관계 정보를 이용한 약품 온톨로지의 구축과 활용)

  • Lim Soo-Yeon;Park Seong-Bae;Lee Sang-Jo
    • Journal of KIISE:Software and Applications / v.32 no.5 / pp.428-437 / 2005
  • An ontology consists of a set of concepts and their definitions that represent the characteristics of a given domain and the relationships between its elements. To reduce the time and cost of building an ontology, this paper proposes a semiautomatic method for building a domain ontology using the results of text analysis. To do this, we propose a terminology processing method and use the extracted concepts and the semantic relations between them to build the ontology. The pharmacy field is selected as the experimental domain, and the built ontology is applied to document retrieval. To show the usefulness of the hierarchical relations in the ontology for document retrieval, we compared a typical keyword-based retrieval method with an ontology-based retrieval method, which uses related information in the ontology for related feedback. As a result, the latter improves precision and recall by $4.97\%$ and $0.78\%$, respectively.

Construction of Global Finite State Machine from Message Sequence Charts for Testing Task Interactions (태스크 상호작용 테스팅을 위한 MSC 명세로부터의 전체 유한 상태 기계 생성)

  • Lee, Nam-Hee;Kim, Tai-Hyo;Cha, Sung-Deok;Shin, Seog-Jong;Hong, H-In-Pyo;Park, Ki-Wung
    • Journal of KIISE:Software and Applications / v.28 no.9 / pp.634-648 / 2001
  • Message Sequence Charts (MSC) have been used to describe the interactions of numerous concurrent tasks in telecommunication software. After the MSC specification is verified in the requirements analysis phase, it can be used not only to synthesize state-based design models but also to generate test sequences. Until now, this verification has been accomplished by generating a global state transition graph using location information only. In this paper, we extend the condition statement of MSC to describe the activation conditions of scenarios and the changes of state variables, and we propose an approach for constructing a global finite state machine (GFSM) using this information. The GFSM includes only the feasible states and transitions of the system. Test sequences can then be generated using existing FSM-based test sequence generation technology.

Development of a Facet Classification System for Presidential Gift Search in Presidential Archives (대통령기록관 대통령선물 검색을 위한 패싯 분류체계 개발)

  • Yoon, Gyubin;Kim, Daeun;Jang, Hyo-Jeong
    • The Korean Journal of Archival Studies / no.76 / pp.119-157 / 2023
  • This study proposes a faceted search function to supplement the metadata of existing presidential gifts. To this end, based on 3,574 presidential gifts provided online by the Presidential Archives, the study identified the characteristics of records extracted from the gift name, gift giver, gift country, gift date, receipt process, specifications, and characteristics of the presidential gifts. Based on this, the study designed a facet-based classification of presidential gifts with 5 basic facets and 51 sub-facets, defined each facet element, and assigned an arrangement order and symbols. This classification system can be expected to serve as a basis for building faceted navigation when applied to a search system. The study confirmed that a new classification system for presidential gifts needs to be developed and proposed facet classification as an alternative classification system for this purpose.