• Title/Summary/Keyword: Knowledge Base Expansion (지식 베이스 확장)


MRQUTER : A Parallel Qualitative Temporal Reasoner Using MapReduce Framework (MRQUTER: MapReduce 프레임워크를 이용한 병렬 정성 시간 추론기)

  • Kim, Jonghoon; Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.5 no.5 / pp.231-242 / 2016
  • To keep up with rapid changes in Web information, current Web technologies need to be extended to represent the valid time and location of each fact and to reason about their relationships. Until recently, most research on qualitative temporal reasoning was conducted at laboratory scale with small knowledge bases. In this paper, we present the design and implementation of MRQUTER, a parallel qualitative temporal reasoner that can reason over Web-scale knowledge bases. The reasoner is built on a Hadoop cluster using the MapReduce parallel programming framework. It decomposes the qualitative temporal reasoning process into several MapReduce jobs, such as the encoding and decoding job, the inverse and equal reasoning job, the transitive reasoning job, and the refining job, and applies optimization techniques to each component job, each implemented as a pair of Map and Reduce functions. In experiments on large benchmark temporal knowledge bases, MRQUTER showed high reasoning performance and scalability.
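
The following is a minimal sketch, in plain Python, of how one round of a transitive reasoning job like the one described above could be split into Map and Reduce functions. The key layout, the tiny Allen-relation composition table, and the interval names are illustrative assumptions, not MRQUTER's actual Hadoop implementation or its optimizations.

```python
# Minimal sketch of one MapReduce round of transitive qualitative temporal
# reasoning; the composition table covers only a few Allen relations.
from collections import defaultdict

# Partial Allen-interval composition table: (r1, r2) -> composed relation (assumed).
COMPOSE = {
    ("before", "before"): "before",
    ("meets", "before"): "before",
    ("before", "meets"): "before",
    ("during", "during"): "during",
}

def map_phase(triples):
    """Emit each temporal triple keyed by both endpoints so that chains
    x r1 y and y r2 z meet at the shared interval y in the same reducer."""
    for (x, r, y) in triples:
        yield y, ("left", x, r)    # triple keyed by its object interval
        yield x, ("right", r, y)   # triple keyed by its subject interval

def reduce_phase(grouped):
    """Join chains around the shared interval and emit composed triples."""
    for _, values in grouped.items():
        lefts = [(x, r) for tag, x, r in values if tag == "left"]
        rights = [(r, z) for tag, r, z in values if tag == "right"]
        for x, r1 in lefts:
            for r2, z in rights:
                composed = COMPOSE.get((r1, r2))
                if composed and x != z:
                    yield (x, composed, z)

if __name__ == "__main__":
    triples = [("i1", "before", "i2"), ("i2", "before", "i3")]
    groups = defaultdict(list)
    for key, value in map_phase(triples):
        groups[key].append(value)
    print(list(reduce_phase(groups)))   # [('i1', 'before', 'i3')]
```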

MRSPAKE : A Web-Scale Spatial Knowledge Extractor Using Hadoop MapReduce (MRSPAKE : Hadoop MapReduce를 이용한 웹 규모의 공간 지식 추출기)

  • Lee, Seok-Jun; Kim, In-Cheol
    • KIPS Transactions on Software and Data Engineering / v.5 no.11 / pp.569-584 / 2016
  • In this paper, we present a spatial knowledge extractor implemented in the Hadoop MapReduce parallel, distributed computing environment. From a large spatial dataset, the extractor automatically derives a qualitative spatial knowledge base consisting of topological and directional relations between pairs of spatial objects. By using an R-tree index and range queries over a distributed spatial data file on HDFS, the MapReduce-enabled spatial knowledge extractor, MRSPAKE, can produce a web-scale spatial knowledge base in a highly efficient way. In experiments with the well-known open spatial dataset OpenStreetMap (OSM), MRSPAKE showed high performance and scalability.
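
Below is a small illustrative sketch of the kind of pairwise relation extraction MRSPAKE performs, using plain bounding boxes instead of an R-tree index over HDFS. The relation names, objects, and coordinates are assumptions made only for the example.

```python
# Toy pairwise extraction of topological and directional relations between
# spatial objects, here represented by bounding boxes (minx, miny, maxx, maxy).
from itertools import combinations

def topological(a, b):
    """Very coarse topological relation between two boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    if ax1 <= bx1 and ay1 <= by1 and ax2 >= bx2 and ay2 >= by2:
        return "contains"
    if ax2 < bx1 or bx2 < ax1 or ay2 < by1 or by2 < ay1:
        return "disjoint"
    return "overlaps"

def directional(a, b):
    """Cardinal direction from a to b based on box centroids."""
    acx, acy = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bcx, bcy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    ns = "north" if bcy > acy else "south"
    ew = "east" if bcx > acx else "west"
    return f"{ns}-{ew}"

objects = {"park": (0, 0, 10, 10), "lake": (2, 2, 5, 5), "station": (20, 1, 22, 3)}
for (na, a), (nb, b) in combinations(objects.items(), 2):
    print(na, nb, topological(a, b), directional(a, b))
```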

Building an Ontology for Structured Diagnosis Data Entry of Educating Underachieving Students (구조화된 학습부진아 진단 자료 입력을 위한 온톨로지 개발)

  • Ha, Tae-Hyeon; Baek, Hyeon-Gi
    • 한국디지털정책학회:학술대회논문집 / 2005.06a / pp.545-555 / 2005
  • This study represents diagnostic knowledge about underachieving students as an ontology, which resolves inconsistencies in learning terminology between teachers and students and enables inference based on information about the underachieving student during the diagnostic process. In addition, unlike typical diagnosis systems for underachievers that present only a specific diagnosis, we propose a way to build an ontology that uses this knowledge base to help users acquire accurate concept terms (correct-answer terms) and to progressively bring out and expand the conceptual knowledge latent in the user's cognitive structure.

Ontology-aware Deductive Inference System (온톨로지 연계 연역 추론 시스템의 설계 및 개발)

  • 장민수; 손주찬
    • Proceedings of the Korean Information Science Society Conference / 2003.10a / pp.133-135 / 2003
  • The Semantic Web includes as key elements a means of representing knowledge structurally and technology for processing that knowledge based on logic. Deductive inference based on symbolic logic is widely applied as the leading technique for the latter, but it still remains at an early stage. This paper describes the design and implementation of a deductive inference system that can perform effective reasoning in a Semantic Web environment. The proposed system provides extended expressiveness covering a substantial portion of standard Description Logic together with logic programming based on Horn logic, and performs inference using a production system based on the RETE algorithm. In addition, by turning the unit knowledge items that make up the rule base into Web resources, it extends the knowledge representation power of the Semantic Web, represented by ontologies. Using the proposed inference system, the rule and logic layers [1] can be implemented effectively on top of Web ontologies.
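
As a rough illustration of the rule/logic layer described above, the sketch below runs a naive forward-chaining loop over Horn-style rules against a small set of triples. It is not the paper's RETE-based production system, and the predicate and class names are made up.

```python
# Naive forward chaining over Horn-style rules; illustrative only.
facts = {("rdf:type", "socrates", "Human")}
rules = [
    # body (list of patterns) -> head; "?x" marks a variable
    ([("rdf:type", "?x", "Human")], ("rdf:type", "?x", "Mortal")),
]

def match(pattern, fact, binding):
    """Unify one triple pattern against one fact under an existing binding."""
    new = dict(binding)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def forward_chain(facts, rules):
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # naive join of all body patterns against the fact set
            bindings = [{}]
            for pattern in body:
                bindings = [b2 for b in bindings for f in facts
                            if (b2 := match(pattern, f, b)) is not None]
            for b in bindings:
                new_fact = tuple(b.get(t, t) for t in head)
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

print(forward_chain(facts, rules))
```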

Extended Knowledge Graph using Relation Modeling between Heterogeneous Data for Personalized Recommender Systems (이종 데이터 간 관계 모델링을 통한 개인화 추천 시스템의 지식 그래프 확장 기법)

  • SeungJoo Lee; Seokho Ahn; Euijong Lee; Young-Duk Seo
    • Smart Media Journal / v.12 no.4 / pp.27-40 / 2023
  • Many researchers have investigated ways to enhance recommender systems by integrating heterogeneous data to address the data sparsity problem. However, only a few studies have successfully integrated heterogeneous data using knowledge graphs, and most of the knowledge graphs built in these studies incorporate only explicit relationships between entities and lack additional information. We therefore propose a method for expanding knowledge graphs by using deep learning to model latent relationships between heterogeneous data drawn from multiple knowledge bases. The extended knowledge graph improves the quality of entity features and ultimately increases the accuracy of the predicted user preferences. Experiments on real music data show that the expanded knowledge graph yields higher recommendation accuracy than the original knowledge graph.
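
The sketch below illustrates the general idea of expanding a knowledge graph with latent edges: entities from heterogeneous sources are embedded, and pairs whose similarity exceeds a threshold receive a new edge. The toy embeddings, the "latentlyRelatedTo" relation name, the entity names, and the threshold are assumptions; the paper learns such relations with a deep learning model rather than fixed vectors.

```python
# Toy knowledge-graph expansion: add latent edges between sufficiently
# similar entity embeddings that are not already linked explicitly.
import numpy as np

embeddings = {                       # pretend these came from a trained encoder
    "song:track_001":  np.array([0.90, 0.10, 0.30]),
    "artist:artist_042": np.array([0.80, 0.20, 0.35]),
    "genre:pop":       np.array([0.85, 0.15, 0.30]),
}
explicit_edges = [("song:track_001", "performedBy", "artist:artist_042")]

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def expand(embeddings, edges, threshold=0.9):
    existing = {(h, t) for h, _, t in edges}
    new_edges = []
    names = list(embeddings)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if (a, b) in existing or (b, a) in existing:
                continue                      # keep explicit edges as they are
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                new_edges.append((a, "latentlyRelatedTo", b))
    return edges + new_edges

for h, r, t in expand(embeddings, explicit_edges):
    print(h, r, t)
```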

The Internet based Expert System for Gastroenteritis (소화기 질환을 위한 인터넷 기반 전문가 시스템)

  • 하상호; 유일대; 천인국; 박상흠; 김선주
    • Journal of Korea Multimedia Society / v.6 no.6 / pp.1079-1087 / 2003
  • Many expert systems have been developed for medical diagnosis and prescription, but relatively little effort has gone into developing medical expert systems with domestic technologies. In this paper, a medical expert system for gastroenteritis that can be used easily over the Internet is developed. The system is implemented with Java technologies and Jess, a Java expert system shell, so it can be used on the Internet independently of any specific platform. In addition, the facts and rules of the system's knowledge base are represented separately, making the system easy to update and expand.

Research and Development of Document Recognition System for Utilizing Image Data (이미지데이터 활용을 위한 문서인식시스템 연구 및 개발)

  • Kwag, Hee-Kue
    • The KIPS Transactions: Part B / v.17B no.2 / pp.125-138 / 2010
  • The purpose of this research is to enhance the document recognition system that is essential for building a full-text retrieval system over the document image data stored in the digital library of a public institution. To achieve this purpose, the main tasks of this research are: 1) analyzing the document image data and developing image preprocessing and document structure analysis technologies for it, and 2) building a specialized knowledge base consisting of document layouts and properties, character models, and a word dictionary. In addition, with a management tool for this knowledge base, the document recognition system can handle various types of document image data. We have developed a prototype document recognition system that combines the specialized knowledge base with the document structure analysis library, adapted to the document image data held by the National Archives of Korea. Based on these results, we plan to build a test bed and evaluate the performance of the document recognition system to maximize the utility of the full-text retrieval system.
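
A small, hypothetical sketch of how the specialized knowledge base listed above (document layouts and properties, character models, word dictionary) might be organized and queried is shown below; all field names, entries, and the model path are assumptions for illustration only.

```python
# Illustrative container for a document-recognition knowledge base.
from dataclasses import dataclass, field

@dataclass
class DocumentKnowledge:
    layouts: dict = field(default_factory=dict)           # doc type -> ordered regions
    properties: dict = field(default_factory=dict)        # doc type -> metadata fields
    character_models: dict = field(default_factory=dict)  # script -> model path (assumed)
    word_dictionary: set = field(default_factory=set)

kb = DocumentKnowledge(
    layouts={"official_letter": ["title", "body", "sender", "date"]},
    properties={"official_letter": {"language": "ko", "columns": 1}},
    character_models={"hangul": "models/hangul.bin"},
    word_dictionary={"공문서", "수신", "발신"},
)

def regions_for(kb, doc_type):
    """Look up the expected regions for a recognized document type."""
    return kb.layouts.get(doc_type, ["body"])

print(regions_for(kb, "official_letter"))
```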

Design Approach of Fault Diagnosis System for Network User (네트워크 사용자를 위한 장애진단시스템 설계)

  • 김홍주; 이태경
    • Proceedings of the Korea Multimedia Society Conference / 1998.04a / pp.400-405 / 1998
  • In current network and computer system environments, users suffer growing inconvenience when communication fails because of faults or failures. As networks have expanded, tools for managing them have been developed. To address this problem, this paper adopts expert system techniques for analyzing the causes of faults that occur in network and computer system environments and for diagnosing and handling those faults. To search for the causes of faults, an inference engine and a knowledge base were constructed, and knowledge about fault factors was represented using a technique that treats each fault as an object.
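
The toy sketch below illustrates the "fault as an object" representation mentioned in the abstract: each fault carries its own symptoms and remedy, and diagnosis is a lookup over observed symptoms. The fault names, symptoms, and remedies are invented for the example.

```python
# Each fault is an object holding the symptoms that imply it and a remedy.
from dataclasses import dataclass, field

@dataclass
class Fault:
    name: str
    symptoms: set = field(default_factory=set)   # observations implying this fault
    remedy: str = ""

KNOWLEDGE_BASE = [
    Fault("dns_failure", {"ping_ip_ok", "ping_hostname_fails"},
          "Check the DNS server address configured on the client."),
    Fault("cable_or_nic_down", {"link_light_off"},
          "Reseat the network cable or replace the NIC."),
]

def diagnose(observations):
    """Return faults whose symptom sets are fully contained in the observations."""
    return [f for f in KNOWLEDGE_BASE if f.symptoms <= observations]

for fault in diagnose({"ping_ip_ok", "ping_hostname_fails"}):
    print(fault.name, "->", fault.remedy)
```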

A Hybrid Knowledge Representation Method for Pedagogical Content Knowledge (교수내용지식을 위한 하이브리드 지식 표현 기법)

  • Kim, Yong-Beom; Oh, Pill-Wo; Kim, Yung-Sik
    • Korean Journal of Cognitive Science / v.16 no.4 / pp.369-386 / 2005
  • Although Intelligent Tutoring Systems (ITS) offer individualized learning environments that overcome the limited functionality of existing CAI and take many learner variables into account, they are rarely used in schools because of the inefficiency of the investment required and the absence of techniques for representing pedagogical content knowledge. To solve this problem, we need to study methods for representing knowledge for ITS and for reusing the knowledge base. Regarding pedagogical content knowledge, knowledge in education differs from knowledge in the general sense. This paper primarily addresses a multi-complex structure of knowledge and the explanation of the learning context using that structure. The multi-complex structure is organized into nodes and clusters and is used as a knowledge base; in addition, it grows into an adaptive knowledge base through self-learning. We therefore propose the Extended Neural Logic Network (X-Neuronet), which is based on the Neural Logic Network, addresses logical inference and the topological inflexibility of the cognitive structure, and incorporates pedagogical content knowledge and object-oriented concepts, and we verify its validity. X-Neuronet defines knowledge as a directed combination with inertia and weights, and provides basic concepts for representation, logical operators for operation and processing, node values and connection weights, a propagation rule, and a learning algorithm.
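
As a hedged illustration of propagation over a weighted node-and-cluster structure like the one sketched above, the toy code below fires a node when the weighted sum of its sources reaches a threshold. X-Neuronet's actual node values, propagation rule, and learning algorithm are defined in the paper and are not reproduced here; the node names and weights are made up.

```python
# Toy weighted propagation over a small knowledge structure.
def propagate(node_values, edges, threshold=0.5):
    """edges: (source, target, weight); a target fires when the weighted
    sum of its sources reaches the threshold."""
    incoming = {}
    for src, dst, w in edges:
        incoming.setdefault(dst, []).append((src, w))
    updated = dict(node_values)
    for dst, sources in incoming.items():
        total = sum(node_values.get(src, 0.0) * w for src, w in sources)
        updated[dst] = 1.0 if total >= threshold else 0.0
    return updated

# "knows_fraction_concept" and the other node names are hypothetical.
values = {"knows_fraction_concept": 1.0, "knows_common_denominator": 0.0}
edges = [("knows_fraction_concept", "solves_fraction_addition", 0.4),
         ("knows_common_denominator", "solves_fraction_addition", 0.6)]
print(propagate(values, edges))
```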

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung; Kim, Mintae; Kim, Wooju; Shin, Dongwook; Lee, Yong Hun
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.111-136 / 2018
  • In this paper, we propose a methodology for extracting answer information for queries from various types of unstructured documents collected from multiple sources on the web, in order to expand a knowledge base. The proposed methodology consists of the following steps: 1) collect relevant documents from Wikipedia, Naver Encyclopedia, and Naver News for queries separated into "subject-predicate" form and classify the suitable documents; 2) determine whether each sentence is suitable for information extraction and derive a confidence score; and 3) based on the predicate feature, extract the information from suitable sentences and derive the overall confidence of the extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker, and the proposed approach showed higher performance indices than the baseline model. The contribution of this study is the development of a sequence tagging model based on a bi-directional LSTM-CRF that uses the predicate feature of the query, yielding a robust model that maintains high recall across the various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must take into account the heterogeneous characteristics of source-specific document types; the proposed methodology extracted information from various document types more effectively than the baseline model, whereas previous research has the limitation that performance degrades on document types different from the training data. In addition, by predicting the suitability of documents and sentences for information extraction before the extraction step, this study prevents unnecessary extraction attempts on documents that do not contain the answer, providing a way to maintain precision even in a real web environment. Information extraction for knowledge base expansion targets unstructured documents on the real web, so there is no guarantee that a document contains the correct answer; when question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from documents without one. The policy of predicting the suitability of documents and sentences for extraction is therefore meaningful in that it helps maintain extraction performance in a real web environment. The limitations of this study and directions for future research are as follows. First, data preprocessing: in this study, the unit of knowledge extraction is determined through morphological analysis based on the open-source KoNLPy Python package, and extraction results can suffer when the morphological analysis is not performed properly; a more advanced morphological analyzer is needed to improve extraction performance. Second, entity ambiguity: the information extraction system of this study cannot distinguish between different entities that share the same name. If several people with the same name appear in the news, the system may not extract information about the intended entity; future research needs measures for disambiguating people with the same name. Third, the evaluation query data: we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker and built an evaluation dataset from 800 documents (400 questions * 7 articles per question: 1 Wikipedia, 3 Naver Encyclopedia, 3 Naver News), judging whether each document contains a correct answer. To ensure the external validity of the study, it would be desirable to evaluate the system on more queries, although this is a costly manual activity. It is also necessary to develop a Korean benchmark dataset for information extraction over queries against multi-source web documents, so that results can be evaluated more objectively.
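
A rough sketch of the sequence-tagging core described above is given below: a BiLSTM produces per-token tag scores and an answer span is read off from BIO tags. The CRF transition layer and the predicate feature the paper adds are omitted, and the vocabulary size, dimensions, and tag set are assumptions.

```python
# BiLSTM tagger sketch (untrained, so the printed tags are random);
# a CRF layer would normally replace the greedy argmax decode.
import torch
import torch.nn as nn

TAGS = ["O", "B-ANS", "I-ANS"]

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, len(TAGS))

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))
        return self.out(h)           # (batch, seq_len, num_tags) emission scores

model = BiLSTMTagger(vocab_size=1000)
tokens = torch.randint(0, 1000, (1, 12))     # one fake 12-token sentence
emissions = model(tokens)
pred = emissions.argmax(-1)[0]               # greedy decode stands in for the CRF
print([TAGS[i] for i in pred.tolist()])
```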