• Title/Summary/Keyword: 온톨로지 평가 (Ontology Evaluation)

Search Results: 159

Comparison Shopping Systems using Image Retrieval based on Semantic Web (시맨틱 웹 기반의 이미지 검색을 이용한 비교 쇼핑 시스템)

  • Lee, Kee-Sung;Yu, Young-Hoon;Jo, Gun-Sik;Kim, Heung-Nam
    • Journal of Intelligence and Information Systems / v.11 no.2 / pp.1-15 / 2005
  • The explosive growth of the Internet has led to various online shopping malls and active e-commerce. As the Internet continues to grow, however, users face a huge variety and volume of items and often waste a great deal of time finding the items relevant to their interests. To overcome this problem, comparison shopping systems, which help users compare item information across shopping malls, have emerged as a solution. However, when users do not know much about what they want to find, the keyword-based search of existing comparison shopping systems forces them to waste time searching for information, and performance degrades. To solve this problem, in this research we propose a comparison shopping system that uses image retrieval based on the Semantic Web. The proposed system assists users who do not know the details of the items they want to find and lets them quickly compare information among items. The system uses Semantic Web technology: we attach ontology-based semantic annotations to the item images of each shopping mall and then use those images, instead of complex keywords, to search for items. To evaluate the performance of the proposed system, we compare our experimental results with those of a keyword-based comparison shopping system and a simple Semantic Web-based comparison shopping system. Our results show that the proposed system outperforms the other systems.
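To make the annotation idea concrete, here is a minimal sketch of how an ontology-based semantic annotation for a product image might look in RDF, using Python's rdflib. The `shop` namespace, class, and property names are hypothetical illustrations, not the schema from the paper.

```python
# Minimal sketch: ontology-based semantic annotation of a product image (rdflib).
# The shop ontology IRI, classes, and properties below are hypothetical.
from rdflib import Graph, Namespace, Literal, URIRef
from rdflib.namespace import RDF

SHOP = Namespace("http://example.org/shop-ontology#")  # hypothetical ontology

g = Graph()
g.bind("shop", SHOP)

item = URIRef("http://example.org/mallA/item/1234")
g.add((item, RDF.type, SHOP.DigitalCamera))
g.add((item, SHOP.hasImage, URIRef("http://example.org/mallA/img/1234.jpg")))
g.add((item, SHOP.brand, Literal("Acme")))
g.add((item, SHOP.price, Literal(299000)))

# Annotated images from different malls can then be compared by querying the
# shared ontology classes instead of matching free-text keywords.
q = """
PREFIX shop: <http://example.org/shop-ontology#>
SELECT ?item ?price WHERE {
  ?item a shop:DigitalCamera ;
        shop:price ?price .
} ORDER BY ?price
"""
for row in g.query(q):
    print(row.item, row.price)
```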


SWAT: A Study on the Efficient Integration of SWRL and ATMS based on a Distributed In-Memory System (SWAT: 분산 인-메모리 시스템 기반 SWRL과 ATMS의 효율적 결합 연구)

  • Jeon, Myung-Joong;Lee, Wan-Gon;Jagvaral, Batselem;Park, Hyun-Kyu;Park, Young-Tack
    • Journal of KIISE / v.45 no.2 / pp.113-125 / 2018
  • Recently, with the advent of the Big Data era, we have gained the capability of acquiring vast amounts of knowledge from various fields. The collected knowledge is expressed as well-formed formulas; in particular, OWL, the standard ontology language, is a typical form. Symbolic reasoning over large amounts of ontology data is being actively studied to extract intrinsic information. However, most studies of such reasoning support only restricted rule expressions based on Description Logic and have limited applicability to the real world. Moreover, knowledge management for inaccurate information is required, since knowledge inferred from wrong information will generate further incorrect information through the dependencies between the inference rules. Therefore, this paper proposes SWAT, a knowledge management system that combines SWRL (Semantic Web Rule Language) reasoning with an ATMS (Assumption-based Truth Maintenance System). The system couples SWRL reasoning and the ATMS on a distributed in-memory framework to manage large ontology data. On top of this, an ATMS monitoring system allows users to easily detect and correct wrong knowledge. We evaluated the suggested method, which manages knowledge by retracting wrong SWRL inference results over large data, using the LUBM (Lehigh University Benchmark) dataset.
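The abstract's core mechanism, retracting SWRL-inferred facts when a supporting assumption turns out to be wrong, can be sketched in a few lines. The following toy ATMS-style bookkeeping in Python illustrates the general truth-maintenance idea only, not the paper's distributed implementation; all names are made up.

```python
# Toy ATMS-style bookkeeping: every inferred fact records the assumptions it
# depends on, so retracting a wrong assumption withdraws everything derived
# from it. Illustrative sketch only; not the paper's distributed system.
class ATMS:
    def __init__(self):
        self.justifications = {}  # fact -> list of supporting assumption sets

    def justify(self, fact, assumptions):
        self.justifications.setdefault(fact, []).append(frozenset(assumptions))

    def retract(self, assumption):
        # Drop every environment that contains the retracted assumption.
        for fact in self.justifications:
            self.justifications[fact] = [
                env for env in self.justifications[fact] if assumption not in env
            ]

    def believed(self, fact):
        # A fact is believed while at least one supporting environment remains.
        return bool(self.justifications.get(fact))

atms = ATMS()
# SWRL-style rule: Person(?p) ^ worksFor(?p, ?u) -> Employee(?p)
atms.justify(("Employee", "alice"),
             [("Person", "alice"), ("worksFor", "alice", "u1")])
atms.retract(("worksFor", "alice", "u1"))    # the triple turns out to be wrong
print(atms.believed(("Employee", "alice")))  # False: derived fact loses support
```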

Real-time and Parallel Semantic Translation Technique for Large-Scale Streaming Sensor Data in an IoT Environment (사물인터넷 환경에서 대용량 스트리밍 센서데이터의 실시간·병렬 시맨틱 변환 기법)

  • Kwon, SoonHyun;Park, Dongwan;Bang, Hyochan;Park, Youngtack
    • Journal of KIISE / v.42 no.1 / pp.54-67 / 2015
  • Nowadays, studies on the fusion of Semantic Web technologies are being carried out to promote the interoperability and value of sensor data in an IoT environment. To accomplish this, the semantic translation of sensor data is essential for convergence with service domain knowledge. The existing semantic translation technique, however, translates static metadata into semantic data (RDF) and cannot properly handle the real-time, large-scale characteristics of an IoT environment. Therefore, in this paper, we propose a technique for translating the large-scale streaming sensor data generated in an IoT environment into semantic data using real-time, parallel processing. In this technique, we define rules for semantic translation and store them in a semantic repository. Sensor data are translated in real time with parallel processing, using these pre-defined rules and an ontology-based semantic model. To improve performance, we use Apache Storm, a real-time big-data analysis framework, for the parallel processing. The proposed technique was subjected to performance testing with AWS (Automatic Weather Station) observation data from the Korea Meteorological Administration, which constitute large-scale streaming sensor data, for demonstration purposes.
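As a rough illustration of the translation step, the sketch below maps one streaming sensor reading to RDF using a fixed rule and the W3C SOSA/SSN observation vocabulary. The paper executes this kind of logic inside parallel Apache Storm bolts; this single-process Python version, its IRI scheme, and the shape of the input readings are simplifying assumptions.

```python
# Simplified, single-process sketch of rule-based semantic translation.
# In the paper this runs inside parallel Apache Storm bolts; the IRI base
# and the input record layout are assumptions for illustration.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, XSD

SOSA = Namespace("http://www.w3.org/ns/sosa/")  # W3C SOSA/SSN vocabulary
EX = Namespace("http://example.org/aws/")       # hypothetical IRI base

def translate(reading):
    """Translate one raw sensor reading (a dict) into RDF triples."""
    g = Graph()
    g.bind("sosa", SOSA)
    obs = EX[f"obs/{reading['station']}/{reading['ts']}"]
    g.add((obs, RDF.type, SOSA.Observation))
    g.add((obs, SOSA.madeBySensor, EX[f"sensor/{reading['station']}"]))
    g.add((obs, SOSA.hasSimpleResult,
           Literal(reading["value"], datatype=XSD.float)))
    g.add((obs, SOSA.resultTime, Literal(reading["ts"], datatype=XSD.dateTime)))
    return g

# Each element of the stream would be handled by a separate bolt task in Storm.
stream = [{"station": "S001", "ts": "2015-01-01T00:00:00", "value": -3.2}]
for reading in stream:
    print(translate(reading).serialize(format="turtle"))
```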

Mass Spectrometry-based Comparative Analysis of Membrane Protein: High-speed Centrifuge Method Versus Reagent-based Method (질량분석기를 활용한 막 단백질 비교분석: High-speed Centrifuge법과 Reagent-based법)

  • Lee, Jiyeong;Seok, Ae Eun;Park, Arum;Mun, Sora;Kang, Hee-Gyoo
    • Korean Journal of Clinical Laboratory Science / v.51 no.1 / pp.78-85 / 2019
  • Membrane proteins are involved in many common diseases, including heart disease and cancer. In various disease states, such as cancer, abnormal signaling pathways related to membrane proteins cause cells to divide out of control, and the expression of membrane proteins can be altered. Membrane proteins reside in the hydrophobic environment of a lipid bilayer, which makes their analysis notoriously difficult. Therefore, this study evaluated the efficacy of two different methods for optimal membrane protein extraction: a high-speed centrifuge method and a reagent-based method, each with and without filter-aided sample preparation (FASP), were compared. As a result, the high-speed centrifuge method is quite effective for analyzing mitochondrial inner membranes, while the reagent-based method is useful for endoplasmic reticulum membrane analysis. In addition, the functions of the membrane proteins extracted by the two methods were analyzed using GeneGo software. GO processes showed that endoplasmic reticulum-related responses had higher significance in the reagent-based method. An analysis of the process networks visualized one cluster for the high-speed centrifuge method and four clusters for the reagent-based method. In conclusion, the two methods are useful for analyzing different subcellular membrane proteins and are expected to help researchers select a membrane protein extraction method suited to the target subcellular membrane proteins under study.

Improving Archival Descriptive Standard Based on the Analysis of the Reviews by Archival Communities on RiC-CM Draft (RiC에 대한 기록공동체의 리뷰를 통해 본 기록물 기술표준 개선을 위한 제안)

  • Park, Ziyoung
    • The Korean Journal of Archival Studies / no.54 / pp.81-109 / 2017
  • This study analyzes the reviews provided by international archival professionals of the RiC-CM draft published by ICA EGAD and suggests some implications for the Korean archival management environment. Several professional reviews were accessible through the internet. Italian archival professionals held workshops at various levels to analyze and discuss the draft. Duranti, the project director of InterPARES, also gave opinions about the draft, including the perspective of digital preservation. In the review by Artefactual, the draft was discussed in terms of system implementation. Reed, the director of Recordkeeping Innovation, also gave feedback based on records management experience in Australia. Several implications can be drawn from these professional opinions. First, we should build a test bed for applying RiC to archival description in the Korean environment. Second, a minimum set of data elements that can secure authenticity and integrity is also needed. Third and lastly, rich authority data for the agents and functions related to archival records and record groups are essential to take full advantage of the standard.

A Study of the Governance Discussion on Community Archives in North America (북미지역 공동체 아카이브의 '거버넌스' 논의와 비판적 독해)

  • Lee, Kyong-Rae
    • The Korean Journal of Archival Studies / no.38 / pp.225-264 / 2013
  • The purpose of this study is to analyze the active discussion in North America about community archives governance, which has mainly focused on the 'participatory archives' model, and to draw from it implications for the present stage of domestic community archives development. Traditionally in the United States and Canada, local community archives have been built mostly by mainstream cultural institutions such as public archives, public libraries, museums, and historical societies as part of a comprehensive documentation of society at large. At the same time, they have been processed and managed in accordance with each institution's collection development policy. As a result, most community archives in North America follow a top-down model (in contrast with the bottom-up model of 'independent' community archives that grew out of the grassroots movement in the UK). Recently, North American community archives with these characteristics have tried to overcome their limitations, which render communities 'the others' of their own archives, through governance, that is, community-institution partnership. The participatory archives model, which assumes active community participation in all archival processes, is being suggested by archival communities as an effective governance model for top-down community archives. This discussion of community archives governance suggests a progressive direction for domestic community archives, which have been built mostly by various mainstream cultural institutions and still remain at the 'about the community' stage. In particular, the community outreach strategies that the participatory archives model concretely suggests are useful as a conceptual framework for building community archives based on community-institution partnership in practice.

Change Acceptable In-Depth Searching in LOD Cloud for Efficient Knowledge Expansion (효과적인 지식확장을 위한 LOD 클라우드에서의 변화수용적 심층검색)

  • Kim, Kwangmin;Sohn, Yonglak
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.171-193 / 2018
  • The LOD (Linked Open Data) cloud is a practical implementation of the Semantic Web. We suggest a new method that provides identity links conveniently in the LOD cloud and allows changes in an LOD to be reflected in search results without omissions. An LOD publishes detailed descriptions of entities in RDF triple form. An RDF triple is composed of a subject, a predicate, and an object and presents a detailed description of an entity. Links in the LOD cloud, named identity links, are realized by asserting that entities of different RDF triples are identical. Currently, an identity link is provided by explicitly creating a link triple that associates its subject and object with the source and target entities; link triples are then appended to the LOD. With identity links, knowledge obtained from one LOD can be expanded with knowledge from other LODs. The goal of the LOD cloud is to provide users with the opportunity for knowledge expansion. Appending link triples to an LOD, however, requires discovering identity links between entities one by one, which is prohibitively difficult at the enormous scale of LOD. Newly added entities cannot be reflected in search results until identity links heading for them are serialized and published to the LOD cloud. Instead of creating enormous numbers of identity links, we propose that each LOD prepare its own link policy. The link policy specifies a set of target LODs to link to and the constraints necessary to discover identity links to entities in those target LODs. During a search, it becomes possible to access newly added entities and reflect them in search results without omissions by referencing the link policies. A link policy specifies a set of predicate pairs for discovering identity between associated entities in the source and target LODs. For the link policy specification, we have suggested a set of vocabularies that conform to RDFS and OWL. Identity between entities is evaluated according to the similarity of the source and target entities' objects that are associated with the predicate pairs in the link policy. We implemented a system, the Change Acceptable In-Depth Searching System (CAIDS). With CAIDS, a user's search request starts from the depth_0 LOD, i.e., surface searching. Referencing the link policies of LODs, CAIDS proceeds to in-depth searching of the LODs at the next depths. To supplement identity links derived from the link policies, CAIDS uses explicit link triples as well. Following the identity links, CAIDS's in-depth searching progresses. The content of an entity obtained from the depth_0 LOD is expanded with the contents of entities in other LODs that have been discovered to be identical to the depth_0 LOD entity. Expanding the content of a depth_0 LOD entity without the user being aware of those other LODs is the implementation of knowledge expansion, the goal of the LOD cloud. The more identity links in the LOD cloud, the wider the content expansion. We have suggested a new way to create identity links abundantly and supply them to the LOD cloud. Experiments on CAIDS were performed against the DBpedia LODs of Korea, France, Italy, Spain, and Portugal. They show that CAIDS provides appropriate expansion and inclusion ratios as long as the degree of similarity between source and target objects is 0.8-0.9. The expansion ratio, for each depth, is the ratio of the entities discovered at that depth to the entities of the depth_0 LOD. For each depth, the inclusion ratio is the ratio of the entities discovered only with explicit links to the entities discovered only with link policies. With similarity degrees under 0.8, expansion becomes excessive and contents become distorted. A similarity degree of 0.8-0.9 also yields an appropriate number of RDF triples in the search results. The experiments also evaluated the confidence degree of the contents expanded by in-depth searching. The confidence degree of content is directly coupled with the identity ratio of an entity, which is its degree of identity to the entity of the depth_0 LOD. The identity ratio of an entity is obtained by multiplying the source LOD's confidence by the source entity's identity ratio. By tracing the identity links in advance, an LOD's confidence is evaluated according to the number of identity links incoming to the entities in that LOD. In evaluating the identity ratio, the concept of identity agreement, meaning that multiple identity links head to a common entity, is also considered. With the identity-agreement concept, the experimental results show that the identity ratio decreases as depth deepens but rebounds as the depth deepens further. For each entity, as the number of identity links increases, the identity ratio rebounds earlier and finally reaches 1. We found that more than eight identity links per entity would lead users to trust the expanded contents. The link-policy-based in-depth searching method we propose is expected to contribute abundant identity links to the LOD cloud.
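To make the confidence arithmetic tangible, here is a minimal sketch of the identity-ratio propagation described above, assuming the ratio carried over one link is simply the product of the source LOD's confidence and the source entity's ratio. The noisy-or rule used for identity agreement (several links heading to a common entity) is an illustrative choice, not taken from the paper.

```python
# Sketch of identity-ratio propagation along identity links, under the
# assumption stated in the abstract: ratio = source LOD confidence x source
# entity's ratio. The noisy-or agreement rule below is an illustrative choice.
def propagate(source_entity_ratio, source_lod_confidence):
    """Identity ratio carried over one identity link."""
    return source_entity_ratio * source_lod_confidence

def agree(incoming_ratios):
    """Combine several identity links heading to a common entity (noisy-or)."""
    combined = 1.0
    for r in incoming_ratios:
        combined *= (1.0 - r)
    return 1.0 - combined

# The depth_0 entity has ratio 1.0; each hop multiplies in an LOD confidence,
# so the ratio decays with depth ...
hop1 = propagate(1.0, 0.9)    # 0.9 after one link
hop2 = propagate(hop1, 0.9)   # 0.81 after two links
# ... but many agreeing links push it back toward 1, matching the rebound
# the experiments report for entities with 8+ incoming identity links.
print(agree([hop2] * 8))      # ~0.999998
```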

Term Mapping Methodology between Everyday Words and Legal Terms for Law Information Search System (법령정보 검색을 위한 생활용어와 법률용어 간의 대응관계 탐색 방법론)

  • Kim, Ji Hyun;Lee, Jong-Seo;Lee, Myungjin;Kim, Wooju;Hong, June Seok
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.137-152 / 2012
  • In the era of Web 2.0, as many users create vast amounts of web content themselves (user-created content), the World Wide Web is overflowing with information, and finding meaningful information among countless resources has become the key challenge. Information retrieval is now essential across every field, and several types of search services have been developed and are widely used to retrieve the information users really want. In particular, legal information search is an indispensable service that lets people conveniently find the law relevant to their present situation. The Office of Legislation in Korea has provided the Korean Law Information portal service since 2009 for searching law information such as legislation, administrative rules, and judicial precedents, so people can conveniently find information related to the law. However, this service is limited because current search-engine technology basically returns documents depending on whether the query terms appear in them. Despite the efforts of the Office of Legislation, it is therefore very difficult for general users who are unfamiliar with legal terms to retrieve law information through simple keyword matching, because there is a huge divergence between everyday words and legal terms, which largely derive from Chinese characters. People generally try to access law information using everyday words, so they have difficulty getting exactly the results they want. In this paper, we propose a term mapping methodology between everyday words and legal terms for general users who lack sufficient background in legal terminology, and we develop a search service that can return law information from everyday-word queries. In other words, our research goal is a law information search system with which general users can retrieve law information using everyday words, without knowledge of legal terminology. First, this paper takes advantage of internet blog tags, building on the concept of collective intelligence, to find the mapping relationships between everyday words and legal terms. To achieve this, we collect tags related to an everyday word from blog posts; when people write a post, they typically add non-hierarchical keywords or terms, called tags, to describe, classify, and manage it. Second, the collected tags are clustered using the K-means cluster analysis method. Then, we find a mapping relationship between an everyday word and a legal term, using our estimation measure to select the legal term that best matches the everyday word. Selected legal terms are given a definite relationship, and the relations between everyday words and legal terms are described using SKOS, an ontology for describing thesauri, classification schemes, taxonomies, and subject headings. Thus, based on the proposed mapping and searching methodologies, when users try to retrieve law information using an everyday word, our legal information search system finds the legal term mapped to the user query and retrieves law information using the matched term. As a result, users can get exact results even if they have no knowledge of legal terms, and we expect that general users without professional legal backgrounds can conveniently and efficiently retrieve legal information using everyday words.
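A minimal sketch of this pipeline: cluster the tags collected for everyday words with K-means over TF-IDF vectors, then record a chosen everyday-word-to-legal-term mapping in SKOS with rdflib. The toy tags, the concept IRIs, and the use of skos:closeMatch are illustrative assumptions, not the paper's actual data or estimation measure.

```python
# Sketch: cluster collected blog tags (K-means over TF-IDF), then record an
# everyday-word -> legal-term mapping in SKOS. Toy data; illustrative IRIs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import SKOS

# Toy tags collected from blog posts mentioning two everyday words.
tags = ["divorce lawyer", "divorce settlement", "child custody",
        "speeding fine", "speeding camera", "license points"]
X = TfidfVectorizer().fit_transform(tags)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(dict(zip(tags, labels)))  # tag clusters, roughly one per everyday word

# Record the legal term selected for a cluster in SKOS (hypothetical terms).
EX = Namespace("http://example.org/legal-term/")
g = Graph()
g.bind("skos", SKOS)
g.add((EX["divorce"], SKOS.prefLabel, Literal("divorce", lang="en")))
g.add((EX["divorce"], SKOS.closeMatch, EX["judicial_divorce"]))
print(g.serialize(format="turtle"))
```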

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.111-136 / 2018
  • In this paper, we propose a methodology to extract answer information for queries from the various types of unstructured documents collected from multiple sources on the web, in order to expand a knowledge base. The proposed methodology proceeds in the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news sources for "subject-predicate" separated queries, and classify the suitable documents. 2) Determine whether each sentence is suitable for extracting information and derive a confidence score. 3) Based on the predicate feature, extract the information from suitable sentences and derive the overall confidence of the extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker; compared with the baseline, the proposed system shows a higher performance index. The contribution of this study is a sequence tagging model based on a bi-directional LSTM-CRF that uses the predicate feature of the query; with this, we developed a robust model that maintains high recall even across the various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must account for the heterogeneous characteristics of source-specific document types, and the proposed methodology proved to extract information effectively from various document types compared to the baseline model, whereas previous research performs poorly when extracting information from document types that differ from the training data. In addition, this study can prevent unnecessary extraction attempts on documents that do not include the answer, through the steps that predict the extraction suitability of documents and sentences before the extraction step. It is meaningful that we provide a method whose precision can be maintained even in a real web environment. Information extraction for knowledge base expansion targets unstructured documents on the real web, so there is no guarantee that a document includes the correct answer; when question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from documents that contain no correct answer. The policy of predicting the extraction suitability of documents and sentences is meaningful in that it helps maintain extraction performance in a real web environment. The limitations of this study and future research directions are as follows. First, there is a problem related to data preprocessing: in this study, the units of knowledge extraction are derived through morphological analysis based on the open-source KoNLPy Python package, and extraction results can suffer when the morphological analysis is inaccurate. To enhance extraction performance, an advanced morphological analyzer needs to be developed. Second, there is the problem of entity ambiguity: the information extraction system in this study cannot distinguish between different entities that share the same name. If several people with the same name appear in the news, the system may not extract information about the intended subject of the query. Future research needs measures to disambiguate entities with the same name. Third, there is the issue of evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system, and we developed an evaluation data set of 800 documents (400 questions × 7 articles per question: 1 Wikipedia, 3 Naver encyclopedia, 3 Naver news), judging whether each includes a correct answer. To ensure the external validity of the study, it is desirable to use more queries to assess system performance; this is a costly activity that must be done manually, and future research should evaluate the system on more queries. It is also necessary to develop a Korean benchmark data set for information extraction from multi-source web documents, to build an environment in which results can be evaluated more objectively.
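As a sketch of the tagging model the contribution describes, the PyTorch skeleton below embeds the query's predicate and concatenates it to every token before a bi-directional LSTM; emission scores per tag come out of a linear layer. The CRF decoding layer is omitted for brevity, and all dimensions and names are illustrative assumptions, not the paper's configuration.

```python
# Skeletal BiLSTM tagger conditioned on the query predicate (PyTorch).
# A CRF layer would normally decode the emission scores jointly; it is
# omitted here for brevity. All sizes and names are illustrative.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, n_predicates, n_tags,
                 word_dim=100, pred_dim=20, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        # The query's predicate is embedded and concatenated to every token;
        # this is how the predicate feature conditions the tagger.
        self.pred_emb = nn.Embedding(n_predicates, pred_dim)
        self.lstm = nn.LSTM(word_dim + pred_dim, hidden,
                            bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_tags)  # emission scores per tag

    def forward(self, tokens, predicate):
        w = self.word_emb(tokens)                          # (B, T, word_dim)
        p = self.pred_emb(predicate)                       # (B, pred_dim)
        p = p.unsqueeze(1).expand(-1, tokens.size(1), -1)  # broadcast over T
        h, _ = self.lstm(torch.cat([w, p], dim=-1))
        return self.out(h)  # a CRF layer would decode these scores jointly

model = BiLSTMTagger(vocab_size=5000, n_predicates=40, n_tags=5)
scores = model(torch.randint(0, 5000, (2, 12)), torch.tensor([3, 7]))
print(scores.shape)  # torch.Size([2, 12, 5])
```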