• Title/Summary/Keyword: Semantic Hierarchy


Learning Rules for Identifying Hypernyms in Machine Readable Dictionaries (기계가독형사전에서 상위어 판별을 위한 규칙 학습)

  • Choi Seon-Hwa;Park Hyuk-Ro
    • The KIPS Transactions:PartB
    • /
    • v.13B no.2 s.105
    • /
    • pp.171-178
    • /
    • 2006
  • Most approaches for extracting the hypernyms of a noun from its definitions in a machine-readable dictionary (MRD) rely on lexical patterns compiled by human experts. Not only do these approaches incur a high cost for compiling lexical patterns, but it is also very difficult for human experts to compile a set of lexical patterns with broad coverage, because natural languages contain various expressions that represent the same concept. To alleviate these problems, this paper proposes a new method for extracting the hypernyms of a noun from its definitions in an MRD. In the proposed approach, we use only syntactic (part-of-speech) patterns instead of lexical patterns to identify hypernyms, which reduces the number of patterns while keeping their coverage broad. Our experiments show that the classification accuracy of the proposed method is 92.37%, significantly better than that of previous approaches.
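
A minimal sketch of the idea, not the authors' implementation: a POS-tag pattern (rather than a lexical pattern) selects the head noun of a POS-tagged definition as the hypernym candidate. The tag names and sample patterns below are illustrative assumptions.

```python
# Illustrative sketch: pick a hypernym candidate from a POS-tagged
# dictionary definition using syntactic (POS) patterns instead of
# lexical patterns. Tags and patterns here are assumptions.

# A definition as (token, POS) pairs, e.g. for "사과" ("apple"):
# "사과나무의 열매" ("the fruit of the apple tree").
definition = [("사과나무", "NNG"), ("의", "JKG"), ("열매", "NNG")]

# POS-tag sequences whose final noun is taken as the hypernym
# candidate (in the paper such rules are learned, not hand-written).
patterns = [
    ("NNG", "JKG", "NNG"),   # noun + genitive marker + head noun
    ("NNG",),                # bare head noun
]

def extract_hypernym(tagged):
    tags = tuple(pos for _, pos in tagged)
    for pat in sorted(patterns, key=len, reverse=True):
        if tags[-len(pat):] == pat:
            return tagged[-1][0]  # last noun of the matched pattern
    return None

print(extract_hypernym(definition))  # -> "열매" ("fruit")
```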

An Efficient RDF Query Validation for Access Authorization in Subsumption Inference (포함관계 추론에서 접근 권한에 대한 효율적 RDF 질의 유효성 검증)

  • Kim, Jae-Hoon;Park, Seog
    • Journal of KIISE:Databases
    • /
    • v.36 no.6
    • /
    • pp.422-433
    • /
    • 2009
  • As an effort to secure the Semantic Web, in this paper we introduce an RDF access authorization model based on an ontology hierarchy and RDF triple patterns. In addition, we apply the authorization model to RDF query validation against granted access authorizations. A submitted SPARQL or RQL query, which consists of RDF triple patterns, can be denied or granted according to the corresponding access authorizations, each of which also carries an RDF triple pattern. To perform the query validation process efficiently, we first analyze the primary authorization conflict conditions under RDF subsumption inference, and then introduce an efficient query validation algorithm that uses these conflict conditions and a Dewey graph labeling technique. Through experiments, we also show that the proposed validation algorithm provides reasonable validation times and scales as data and authorizations increase.
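
A minimal sketch of why Dewey-style labels speed up the conflict check (illustrative assumptions, not the paper's algorithm): when each class in the ontology hierarchy carries a Dewey label, subsumption reduces to a prefix test, so a grant on a class and a deny on one of its subclasses can be flagged as a potential conflict without traversing the graph.

```python
# Dewey labels for a small class hierarchy: Report ⊑ Document ⊑ Resource.
labels = {
    "Resource": (1,),
    "Document": (1, 1),
    "Report":   (1, 1, 1),
}

def subsumes(ancestor, descendant):
    a, d = labels[ancestor], labels[descendant]
    return d[:len(a)] == a  # ancestor's label is a prefix of descendant's

# A grant on Document reaches Report under subsumption inference, so a
# deny on Report must be checked against it for conflicts.
print(subsumes("Document", "Report"))   # True
print(subsumes("Report", "Document"))   # False
```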

A Study on Development of Digital Compilation Management System for Local Culture Contents: Focusing on the Case of The Encyclopedia of Korean Local Culture (향토문화 콘텐츠를 위한 디지털 편찬 관리시스템 개발에 관한 연구: "한국향토문화전자대전"의 사례를 중심으로)

  • Kim, Su-Young
    • Journal of the Korean Society for Information Management
    • /
    • v.26 no.3
    • /
    • pp.213-237
    • /
    • 2009
  • Local culture is a cultural heritage that has come down from generation to generation in the natural environment of a region; it includes history, tradition, natural features, art, and historic relics. The Academy of Korean Studies has compiled "The Encyclopedia of Korean Local Culture" using such local culture contents. Local culture contents have the features of documentary records, such as requiring source authentication and hierarchical structure management. Thus, handling local culture contents calls for a "circular knowledge information management system", in which basic, fragmentary, and high-level information circulates to create new knowledge information within the system. A user of this circular knowledge information management system can not only collect data directly in it but also fetch data from other databases, and processing the collected data helps create new knowledge information. However, it is very difficult to sustain the original meaning-bearing hierarchy of the various kinds of local culture contents when building a new database, and this kind of work requires repeated correction over a long period of time. Therefore, a system in which compilation, correction, and service can be carried out simultaneously is needed. In this study, focusing on the case of "The Encyclopedia of Korean Local Culture", I propose an XML-based digital compilation management system that can express hierarchy information and sustain the semantic features of local culture contents containing many ancient documents, and introduce the extended functions developed to manage contents in the system.
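
A minimal sketch of the underlying representation point (element names are assumptions, not the encyclopedia's actual schema): XML keeps the parent-child structure of local-culture content intact, which relational flattening tends to lose.

```python
# Illustrative sketch: hierarchical local-culture content kept as XML
# so the original hierarchy survives storage and later correction.
import xml.etree.ElementTree as ET

region = ET.Element("region", name="Andong")
heritage = ET.SubElement(region, "heritage", type="historic-relic")
ET.SubElement(heritage, "title").text = "Dosan Seowon"
ET.SubElement(heritage, "period").text = "Joseon"

# The serialized form preserves the parent-child relationships.
print(ET.tostring(region, encoding="unicode"))
```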

Exploring the Cognitive Factors that Affect Pedestrian-Vehicle Crashes in Seoul, Korea : Application of Deep Learning Semantic Segmentation (서울시 보행자 교통사고에 영향을 미치는 인지적 요인 분석 : 딥러닝 기반의 의미론적 분할기법을 적용하여)

  • Ko, Dong-Won;Park, Seung-Hoon;Lee, Chang-Woo
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.5
    • /
    • pp.288-304
    • /
    • 2022
  • Walking is an eco-friendly and sustainable means of transportation that promotes health and endurance. Despite the positive health benefits of walking, pedestrian safety is a serious problem in Korea, so further studies are needed to reduce pedestrian-vehicle crashes. In this study, the cognitive characteristics affecting pedestrian-vehicle crashes were examined by applying deep-learning-based semantic segmentation. The main results are as follows. First, the risk of pedestrian-vehicle crashes increased as the ratio of buildings among the cognitive factors increased and as the ratios of vegetation and sky decreased. Second, speed humps were shown to reduce the risk of pedestrian-related collisions. Third, the risk of pedestrian-vehicle crashes increased in areas with many lower-hierarchy neighborhood roads. Fourth, traffic lights, crosswalks, and traffic signs did not have a practical effect on reducing pedestrian-vehicle crashes. This study considered not only the physical neighborhood environment but also cognitive factors that make up the visual elements of the streetscape, and these cognitive characteristics were indeed shown to affect the occurrence of pedestrian-related collisions. It is therefore expected that this study will serve as fundamental research for creating a pedestrian-friendly urban environment that takes cognitive characteristics into account.
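
A minimal sketch of the measurement step (toy data; the real maps would come from a segmentation model run on street-view imagery): each "cognitive factor" is the share of image pixels the model assigns to a class such as building, vegetation, or sky.

```python
# Illustrative sketch: per-class pixel ratios from a segmentation map.
import numpy as np

CLASSES = {0: "building", 1: "vegetation", 2: "sky", 3: "road"}

# Toy 4x4 segmentation output standing in for a model's prediction.
seg = np.array([[0, 0, 2, 2],
                [0, 1, 2, 2],
                [3, 1, 1, 2],
                [3, 3, 1, 2]])

ratios = {name: float((seg == cid).sum()) / seg.size
          for cid, name in CLASSES.items()}
print(ratios)  # {'building': 0.1875, 'vegetation': 0.25, 'sky': 0.375, 'road': 0.1875}
```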

A Study of Ontology-based Cataloguing System Using OWL (OWL을 이용한 온톨로지 기반의 목록시스템 설계 연구)

  • 이현실;한성국
    • Journal of the Korean Society for Information Management
    • /
    • v.21 no.2
    • /
    • pp.249-267
    • /
    • 2004
  • Although MARC can define detailed cataloguing data, it has complex structures and frameworks for representing bibliographic information. Because of these idiosyncratic features of MARC, XML DTD or RDF/S, which support only a simple hierarchy of conceptual vocabularies, cannot capture the MARC formalism effectively. This study implements a bibliographic ontology by abstracting the conceptual relationships between the bibliographic vocabularies of MARC. The bibliographic ontology is formalized in OWL, which can represent the logical relations between conceptual elements and specify cardinality and property-value restrictions. The bibliographic ontology in this study will provide metadata for cataloguing data and resolve compatibility problems between cataloguing systems. It can also contribute to the development of next-generation bibliographic information systems using Semantic Web services.
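
A minimal sketch of the kind of constraint OWL adds over RDF/S (class and property names are assumptions, not the study's ontology; assumes the rdflib package): a cardinality restriction stating that a Book has exactly one title, something RDF Schema alone cannot express.

```python
# Illustrative sketch: an OWL cardinality restriction built with rdflib.
from rdflib import BNode, Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS, XSD

EX = Namespace("http://example.org/biblio#")
g = Graph()

g.add((EX.Book, RDF.type, OWL.Class))
g.add((EX.title, RDF.type, OWL.DatatypeProperty))

# Anonymous restriction: exactly one ex:title per ex:Book.
r = BNode()
g.add((r, RDF.type, OWL.Restriction))
g.add((r, OWL.onProperty, EX.title))
g.add((r, OWL.cardinality, Literal(1, datatype=XSD.nonNegativeInteger)))
g.add((EX.Book, RDFS.subClassOf, r))

print(g.serialize(format="turtle"))
```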

A Study of Effective Creating Methods of Philosophy Digital Knowledge Resources (철학 디지털 지식 자원의 효과적인 구축 방향에 대한 연구)

  • Choi Byung-Il;Chung Hyun-Sook
    • The Journal of the Korea Contents Association
    • /
    • v.5 no.2
    • /
    • pp.39-51
    • /
    • 2005
  • The study of philosophy is a process of archiving, reorganizing, and analyzing earlier works to discover new facts. Digital philosophy resources are necessary for philosophical research because they provide electronic texts, philosophical information, forums, etc. In this paper, we present the results of a survey of philosophy digital resources on domestic and overseas web sites. We describe the problems of existing resources and our solutions to them. We also provide a guideline for creating a philosophy ontology based on topic maps, a data model for ontologies. Our philosophy ontology defines hierarchical and associative relationships between items of philosophical knowledge and supports retrieval and exploration of knowledge using semantic information.
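
A minimal sketch of the two relationship kinds a topic-map-based ontology distinguishes (topic and relation names are hypothetical): hierarchical links for browsing down the knowledge tree, and associative links for semantic cross-references.

```python
# Illustrative sketch: hierarchy vs. association in a topic-map style.
from collections import defaultdict

hierarchy = defaultdict(list)      # parent topic -> child topics
associations = defaultdict(list)   # topic -> (relation, topic)

hierarchy["Philosophy"].append("Epistemology")
hierarchy["Epistemology"].append("Empiricism")
associations["Hume"].append(("influenced", "Kant"))
associations["Empiricism"].append(("represented_by", "Hume"))

def descendants(topic):
    """Explore the hierarchy below a topic (semantic browsing)."""
    for child in hierarchy[topic]:
        yield child
        yield from descendants(child)

print(list(descendants("Philosophy")))  # ['Epistemology', 'Empiricism']
print(associations["Hume"])             # [('influenced', 'Kant')]
```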


Web Document Classification Using User Intention Information (사용자 의도 정보를 사용한 웹문서 분류)

  • Jang, Yeong-Cheol
    • Proceedings of the Korea Society for Industrial Systems Conference
    • /
    • 2008.10b
    • /
    • pp.292-297
    • /
    • 2008
  • To categorize web documents containing complex semantics accurately and to automate this process, standardized, intelligent, and automated document representation and classification techniques that can accommodate the human knowledge system are required. To this end, we introduce a categorization and mediation process that applies user intention information to keyword frequencies, the relatedness of keywords within a document, the use of a thesaurus, and probabilistic techniques. We designed a base framework so that intention information can be used both in knowledge-based document classification, which relies on a thesaurus, and in similarity-based document classification, which performs unsupervised learning without a priori knowledge; the differences between the two methods are then reconciled in a hybrid mediation process. The HDCI (Hybrid Document Classification with Intention) model designed in this study consists of the web document classification process above and a user intention analysis process that controls and assists it. In the intention analysis process, the user intention provided along with keywords is used with domain knowledge to construct an intention hierarchy tree; acting as a constraint or guide during document classification, this tree is used to extract a user intention profile or keywords representative of document characteristics. HDCI performs a control and guidance role in the bottom-up, probabilistic approach based on inter-document similarity, and improves the accuracy of establishing relationships between keywords, whose diversity is limited in the knowledge-base (thesaurus) approach.
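
A minimal sketch of how an intention hierarchy tree can constrain classification (the tree and category names are hypothetical, not the HDCI implementation): the user's intention selects a subtree, and only categories within that subtree remain candidates for the classifier.

```python
# Illustrative sketch: an intention hierarchy tree as a constraint.
intention_tree = {
    "learn": {"tutorial": {}, "reference": {}},
    "buy":   {"compare_prices": {}, "read_reviews": {}},
}

def allowed_categories(tree, intention):
    """Return the subtree consistent with an intention, restricting
    the classifier's candidate categories."""
    if intention in tree:
        return tree[intention]
    for subtree in tree.values():
        found = allowed_categories(subtree, intention)
        if found is not None:
            return found
    return None

# A query tagged with the "learn" intention limits classification to
# the 'tutorial' and 'reference' branches.
print(allowed_categories(intention_tree, "learn"))
```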


Cooperative Query Answering Based on Abstraction Database (추상화 정보 데이터베이스 기반 협력적 질의 응답)

  • 허순영;이정환
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.24 no.1
    • /
    • pp.99-117
    • /
    • 1999
  • Since a query language is used as a handy tool to obtain information from a database, a more intelligent query answering system is needed to provide a user-friendly and fault-tolerant human-machine interface. Frequently, database users prefer a less rigid querying structure, one that allows for vagueness in composing queries, and want the system to understand the intent behind a query. When no matching data is available, users would rather receive approximate answers than a null response. This paper presents a knowledge abstraction database that facilitates the development of such a fault-tolerant and intelligent database system. The proposed knowledge abstraction database adopts a multilevel knowledge representation scheme called the knowledge abstraction hierarchy (KAH), extracts semantic data relationships from the underlying database, and provides query transformation mechanisms using query generalization and specialization steps. In cooperation with the underlying database, the knowledge abstraction database accepts vague queries and allows users to pose approximate queries as well as conceptually abstract queries. Specifically, four types of vague queries are discussed: approximate selection, approximate join, conceptual selection, and conceptual join. A prototype system has been implemented at KAIST and is being tested with a personnel database system to demonstrate the usefulness and practicality of the knowledge abstraction database in ordinary database applications.
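
A minimal sketch of query generalization and specialization in a knowledge abstraction hierarchy (values are hypothetical): when an exact match fails, the query value is generalized to its abstract parent and specialized back to sibling values, producing an approximate answer instead of a null response.

```python
# Illustrative sketch: approximate selection via a KAH.
kah = {  # concrete value -> abstract parent concept
    "Seoul": "Korea", "Busan": "Korea", "Tokyo": "Japan",
}

employees = [("Kim", "Busan"), ("Sato", "Tokyo")]

def approximate_select(city):
    exact = [e for e in employees if e[1] == city]
    if exact:
        return exact
    # Generalize to the parent concept, then specialize to siblings.
    parent = kah.get(city)
    siblings = {v for v, p in kah.items() if p == parent}
    return [e for e in employees if e[1] in siblings]

# No employee works in Seoul, so the query relaxes to Korean cities.
print(approximate_select("Seoul"))  # [('Kim', 'Busan')]
```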


Improving the CONTES method for normalizing biomedical text entities with concepts from an ontology with (almost) no training data at BLAH5

  • Ferre, Arnaud;Ba, Mouhamadou;Bossy, Robert
    • Genomics & Informatics
    • /
    • v.17 no.2
    • /
    • pp.20.1-20.5
    • /
    • 2019
  • Entity normalization, or entity linking in the general domain, is an information extraction task that aims to annotate/bind multiple words/expressions in raw text with semantic references, such as concepts of an ontology. An ontology consists minimally of a formally organized vocabulary or hierarchy of terms, which captures knowledge of a domain. Presently, machine-learning methods, often coupled with distributional representations, achieve good performance, but they require large training datasets, which are not always available, especially for tasks in specialized domains. CONTES (CONcept-TErm System) is a supervised method that addresses entity normalization with ontology concepts using small training datasets. CONTES has some limitations: it does not scale well to very large ontologies, it tends to overgeneralize predictions, and it lacks valid representations for out-of-vocabulary words. Here, we assess different methods for reducing the dimensionality of the ontology representation. We also propose calibrating parameters to make the predictions more accurate, and a specific method to address the problem of out-of-vocabulary words.
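
A minimal sketch of one way the ontology representation's dimensionality might be reduced (an assumption for illustration, not the CONTES code; assumes scikit-learn): concept vectors whose size grows with the ontology are projected onto a small number of components with truncated SVD.

```python
# Illustrative sketch: shrinking high-dimensional concept vectors.
import numpy as np
from sklearn.decomposition import TruncatedSVD

n_concepts = 1000
# Toy sparse concept vectors; in CONTES-like settings a concept's
# vector encodes its ancestors in the ontology hierarchy.
rng = np.random.default_rng(0)
concept_vectors = (rng.random((n_concepts, n_concepts)) < 0.01).astype(float)

svd = TruncatedSVD(n_components=50, random_state=0)
reduced = svd.fit_transform(concept_vectors)
print(reduced.shape)  # (1000, 50): far cheaper to map embeddings onto
```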

Hierarchical Overlapping Clustering to Detect Complex Concepts (중복을 허용한 계층적 클러스터링에 의한 복합 개념 탐지 방법)

  • Hong, Su-Jeong;Choi, Joong-Min
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.1
    • /
    • pp.111-125
    • /
    • 2011
  • Clustering is a process of grouping similar or related documents into a cluster and assigning a meaningful concept to the cluster. By narrowing the search to the collection of documents belonging to related clusters, clustering facilitates fast and accurate retrieval of relevant documents. Effective clustering requires techniques for identifying similar documents and grouping them into a cluster, and for discovering the concept most relevant to the cluster. One problem that often appears in this context is the detection of a complex concept that overlaps with several simple concepts at the same hierarchical level. Previous clustering methods were unable to identify and represent a complex concept that belongs to several different clusters at the same level in the concept hierarchy, and could not validate the semantic hierarchical relationship between a complex concept and each of the simple concepts. To solve these problems, this paper proposes a new clustering method that identifies and represents complex concepts efficiently. We developed the Hierarchical Overlapping Clustering (HOC) algorithm, which modifies the traditional agglomerative hierarchical clustering algorithm to allow overlapping clusters at the same level in the concept hierarchy. The HOC algorithm represents the clustering result not as a tree but as a lattice in order to detect complex concepts. We developed a system that employs the HOC algorithm to carry out complex concept detection. The system operates in three phases: 1) preprocessing of documents, 2) clustering using the HOC algorithm, and 3) validation of the semantic hierarchical relationships among the concepts in the lattice obtained as a result of clustering. The preprocessing phase represents the documents as x-y coordinate values in a 2-dimensional space based on the weights of the terms appearing in them. First, documents are refined by stopword removal and stemming to extract index terms. Each index term is then assigned a TF-IDF weight, and the x-y coordinate value of each document is determined by combining the TF-IDF values of its terms. The clustering phase uses the HOC algorithm, in which the similarity between documents is calculated by the Euclidean distance. Initially, a cluster is generated for each document by grouping the documents closest to it. Then the distance between any two clusters is measured, and the closest clusters are grouped into a new cluster. This process is repeated until the root cluster is generated. In the validation phase, feature selection is applied to check whether the cluster concepts built by the HOC algorithm have meaningful hierarchical relationships. Feature selection extracts key features from a document by identifying and weighting its important and representative terms. To select key features correctly, a method is needed to determine how much each term contributes to the class of the document. Among several methods for this, this paper adopts the $\chi^2$ statistic, which measures the degree of dependency of a term t on a class c and represents the relationship between t and c as a numerical value. To demonstrate the effectiveness of the HOC algorithm, a series of performance evaluations was carried out using the well-known Reuters-21578 news collection. The results showed that the HOC algorithm contributes greatly to detecting and producing complex concepts by generating the concept hierarchy in a lattice structure.
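
A minimal sketch of the overlap mechanism that distinguishes HOC from plain agglomerative clustering (simplified toy code, not the authors' system): in one bottom-up step, every pair of clusters whose distance falls within a tolerance of the minimum is merged, so a cluster can acquire several parents and the hierarchy becomes a lattice.

```python
# Illustrative sketch: one overlapping merge step over 2-D points
# (standing in for the TF-IDF-derived document coordinates).
import math

docs = {"d1": (0.0, 0.0), "d2": (0.1, 0.0),
        "d3": (1.0, 1.0), "d5": (0.55, 0.5)}

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

clusters = {name: frozenset([name]) for name in docs}
TOL = 0.65  # tolerance that permits overlapping merges

names = list(clusters)
pairs = [(dist(docs[a], docs[b]), a, b)
         for i, a in enumerate(names) for b in names[i + 1:]]
dmin = min(d for d, _, _ in pairs)

# Every pair within TOL of the minimum merges; a cluster appearing in
# several such pairs ends up under more than one parent.
parents = {}
for d, a, b in pairs:
    if d <= dmin + TOL:
        parent = clusters[a] | clusters[b]
        parents.setdefault(a, []).append(parent)
        parents.setdefault(b, []).append(parent)

for child, ps in sorted(parents.items()):
    print(child, "->", [sorted(p) for p in ps])
# d5 appears under several parents: the lattice overlap that lets a
# complex concept sit below multiple simple concepts at one level.
```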