• Title/Summary/Keyword: Korean Thesaurus


Query Expansion Using Augmented Terms in an Extended Boolean Model

  • Nguyen, Tuan-Quang;Heo, Jun-Seok;Lee, Jung-Hoon;Kim, Yi-Reun;Whang, Kyu-Young
    • Journal of Computing Science and Engineering, v.2 no.1, pp.26-43, 2008
  • We propose a new query expansion method for the extended Boolean model that improves precision without degrading recall. To improve precision, the method promotes the ranks of documents that contain more query terms, since users typically prefer such documents. The method consists of three steps: (1) expanding the query by adding new terms related to each query term, (2) further expanding the query by adding augmented terms, which are conjunctions of the terms, and (3) assigning a weight to each term so that augmented terms have higher weights than the other terms. We conduct extensive experiments to show the effectiveness of the proposed method. The experimental results show that the proposed method improves precision by up to 102% on the TREC-6 data compared with the existing thesaurus-based query expansion method proposed by Kwon et al.
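
The three steps in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy thesaurus, the weight values, the restriction of augmented terms to pairwise conjunctions, and the function names `expand_query` and `score` are all assumptions made for illustration. Scoring uses a plain weighted sum rather than the extended Boolean model's own similarity function.

```python
from itertools import combinations

# Hypothetical toy thesaurus; the thesaurus used in the paper is not shown here.
THESAURUS = {
    "car": ["automobile"],
    "engine": ["motor"],
}


def expand_query(terms, thesaurus, base_weight=1.0, related_weight=0.5,
                 augmented_weight=2.0):
    """Return a list of (term_tuple, weight) pairs for the expanded query.

    Step 1: add thesaurus terms related to each query term.
    Step 2: add augmented terms, here pairwise conjunctions of query terms.
    Step 3: give augmented terms a higher weight than single terms.
    All weight values are illustrative assumptions.
    """
    expanded = [((t,), base_weight) for t in terms]
    for t in terms:
        for related in thesaurus.get(t, []):
            expanded.append(((related,), related_weight))
    for pair in combinations(terms, 2):
        expanded.append((pair, augmented_weight))
    return expanded


def score(document_tokens, expanded_query):
    """Sum the weights of expanded terms whose words all occur in the document.

    This weighted sum is a simplification, not the extended Boolean model.
    """
    words = set(document_tokens)
    return sum(w for term, w in expanded_query if all(x in words for x in term))


query = expand_query(["car", "engine"], THESAURUS)
doc_both = "the car has a new engine".split()
doc_related = "the automobile was parked".split()
print(score(doc_both, query), score(doc_related, query))
```

Because the conjunction of "car" and "engine" carries the highest weight, a document containing both query terms outranks one that matches only a related term, which is the ranking behavior the abstract describes.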

Cross-Lingual Text Retrieval Based on a Knowledge Base (지식베이스에 기반한 다언어 문서 검색)

  • Choi, Myeong-Bok;Jo, Jun
    • The Journal of the Institute of Internet, Broadcasting and Communication, v.10 no.1, pp.21-32, 2010
  • The way a user formulates a query strongly affects the effectiveness of information retrieval when documents are retrieved from a general domain such as the web. This paper proposes an intelligent information retrieval method based on a cross-lingual knowledge base for effective cross-lingual text retrieval from the web. Knowledge inferred from the cross-lingual knowledge base supports the user's word association, making it easier to compose accurate queries for cross-lingual text retrieval. We develop a query reformulation algorithm and evaluate it on Korean and English web documents. Experimental results show that the algorithm based on the proposed knowledge base is much more effective for cross-lingual text retrieval than retrieval without a knowledge base.

Web Version Management System for Software Cooperation Development Environment (소프트웨어 공동 개발 환경을 위한 웹 버전 관리 시스템)

  • Kim Soo-Yong;Choi Dong-Oun
    • Journal of Internet Computing and Services, v.4 no.2, pp.21-30, 2003
  • This paper describes how to reference and reuse software objects through a web browser so that all members of a software development team can collaborate throughout the development process. In addition, we describe how to convert the *.mdl design information generated by a UML editor into corresponding XML data, which is then stored in a relational database system. Furthermore, we provide a facet retrieval system that makes use of an object-oriented thesaurus and supports an integrated environment in which all project team members can share source code and executable files as well as object files produced in the web-based collaborative development environment. Finally, we have designed and implemented a web version management system that helps software developers manage and search the relationships among design information on the web.


YDK : A Thesaurus Developing System for Korean Language (한국어 통합정보사전 시스템)

  • Hwang, Do-Sam;Choi, Key-Sun
    • The Transactions of the Korea Information Processing Society, v.7 no.9, pp.2885-2893, 2000
  • Dictionaries are indispensable for natural language processing (NLP) systems. Sophisticated algorithms in NLP systems can be fully exploited only with matching dictionaries that are built systematically on the basis of computational linguistics. Only a few dictionaries have been developed for natural language processing, and the available ones are far from complete enough for practical use. It is therefore necessary to develop an integrated information dictionary that includes the lexical information needed to process and understand natural language, such as morphological, syntactic, and semantic information. In this paper, we propose a method for building an integrated dictionary and introduce a dictionary development system.


Document Thematic words Extraction using Principal Component Analysis (주성분 분석을 이용한 문서 주제어 추출)

  • Lee, Chang-Beom;Kim, Min-Soo;Lee, Ki-Ho;Lee, Guee-Sang;Park, Hyuk-Ro
    • Journal of KIISE:Software and Applications, v.29 no.10, pp.747-754, 2002
  • In this paper, we propose a method for extracting thematic words from a document using principal component analysis (PCA), a multivariate statistical method. The proposed PCA model captures the flow of words in the document using eigenvalues and eigenvectors, and extracts thematic words. The model is evaluated by applying it to document summarization. Experimental results on newspaper articles show that the proposed model is superior to models that use either word frequency or an information retrieval thesaurus. We expect that the proposed model can be applied to information retrieval, information extraction, and document summarization.
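
The abstract does not spell out how eigenvalues and eigenvectors are turned into thematic words, so the following is only a rough sketch under stated assumptions: each sentence is a row of term counts, the leading eigenvector of the term covariance matrix is approximated by power iteration, and the words with the largest absolute loadings on that component are taken as thematic. The function names and the selection rule are hypothetical, not the authors' procedure.

```python
import math


def term_matrix(sentences):
    """Build a sentence-by-term count matrix and the sorted vocabulary."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    rows = []
    for s in sentences:
        row = [0.0] * len(vocab)
        for w in s:
            row[index[w]] += 1.0
        rows.append(row)
    return rows, vocab


def covariance(rows):
    """Sample covariance of the term columns (needs at least two sentences)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Approximate the leading eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all-zero covariance: no direction to extract
            break
        v = [x / norm for x in w]
    return v


def thematic_words(sentences, k=2):
    """Return the k words with the largest absolute loading (assumed rule)."""
    rows, vocab = term_matrix(sentences)
    loadings = first_component(covariance(rows))
    ranked = sorted(zip(vocab, loadings), key=lambda p: abs(p[1]), reverse=True)
    return [word for word, _ in ranked[:k]]


sentences = [
    ["thesaurus", "search", "thesaurus", "search"],
    ["thesaurus", "search"],
    ["weather"],
    ["thesaurus", "search", "thesaurus", "search", "news"],
]
print(thematic_words(sentences, k=2))
```

In this toy input, "thesaurus" and "search" vary together across sentences, so they dominate the first component. A real system would work on preprocessed text (stop-word removal, morphological analysis for Korean) and would likely use a numerical library for the eigendecomposition.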

A Study on the Development of Ontology based on the Jewelry Brand Information (귀금속.보석 상품정보 온톨로지 구축에 관한 연구)

  • Lee, Ki-Young
    • Journal of the Korea Society of Computer and Information, v.13 no.7, pp.247-256, 2008
  • This research develops a product retrieval system that simplifies communication by applying intelligent agent technology based on an automatically created domain ontology, addressing the problems of e-commerce systems that search web documents with simple keywords. For ontology development, representative terms are extracted from the classification information of the international product classification code (UNSPSC) and from jewelry websites, and an analogy relationship thesaurus is applied to them to establish a standardized ontology. Intelligent agent technology is applied at the retrieval stage to help users collect information efficiently, through the design and development of an e-commerce system supported by the semantic web. Moreover, a user profile is designed to personalize the search environment, and a personalized retrieval agent and a retrieval environment with an inference function are provided to enable fast information collection and accurate search.


Applying Traditional Korean Medical Terms to SUI in the Unified Medical Language System(UMLS) Metathesaurus

  • Hong, Seong-Cheon;Jeong, Heon-Young;Jeon, Byong-Uk
    • Journal of the Korean Institute of Oriental Medical Informatics, v.16 no.1, pp.1-8, 2010
  • Objective: Controlled vocabularies such as thesauri and classifications enable effective reuse and sharing of information by defining distinct concepts and linking terms to each other. The UMLS (Unified Medical Language System) is one of the most widely used medical terminology systems. Various methods are needed to share and reuse information on traditional Korean medicine. We study a method for adopting the SUI of the UMLS, the de facto standard among medical terminology systems, for traditional Korean medical terminology. Method: We describe the major problems and the application process encountered when we tried to add traditional Korean medical terms concerning meridians to the UMLS Metathesaurus. A comparison of Western and traditional Korean medical terms for application to the UMLS Metathesaurus reveals not only many consistencies but also differences. Result: We identified the differences and consistencies between Western medical terms and traditional Korean medical terms, and then reviewed methods for applying the CUI, LUI, and SUI to traditional Korean medical terms. Traditional Korean medical terms are not distinguished by singular or plural strings. In addition, their strings vary according to the initial law, the rule governing the initial sound of a syllable. Terms are written in Korean, traditional Chinese, modern Chinese, and other scripts. The SUI takes a distinct value according to meaning, language, and the initial law. Conclusion: There are many differences between Western and traditional Korean medical terms in applying the UMLS. To better integrate traditional Korean medicine into the UMLS, further research is needed on the standardization and classification of traditional Korean medical terms, medical information systems, and related areas. We hope this study helps the future implementation of the UMLS, EHR, and knowledge-based systems in Oriental medicine.


A Korean Emotion Features Extraction Method and Their Availability Evaluation for Sentiment Classification (감정 분류를 위한 한국어 감정 자질 추출 기법과 감정 자질의 유용성 평가)

  • Hwang, Jae-Won;Ko, Young-Joong
    • Korean Journal of Cognitive Science, v.19 no.4, pp.499-517, 2008
  • In this paper, we propose an effective method for extracting Korean emotion features and evaluate their usefulness for sentiment classification. Korean emotion features are expanded from several representative emotion words and play an important role in building an effective sentiment classification system. First, synonym information from an English thesaurus is used to extract effective emotion features, and the extracted English emotion features are then translated into Korean. To evaluate the extracted Korean emotion features, we represent each document using the extracted features and classify it using an SVM (Support Vector Machine). In the experiments, the sentiment classification system using the extracted Korean emotion features achieved better performance (an improvement of 14.1%) than a system using content-word-based features, which are generally used in common text classification systems.


A study of Korean Medicine Terminology that Meaning Breast Diseases During Breastfeeding (수유부의 유선질환을 의미하는 한의학 용어 연구)

  • Lee, Seon-Young;Oh, Jun-Ho;Cha, Wung-Seok;Kim, Nam-Il
    • Korean Journal of Oriental Medicine, v.16 no.2, pp.75-81, 2010
  • Objective: This study aims to clearly define the concepts of Korean medicine terminology related to breast diseases that occur during breastfeeding. It attempts to suggest a guideline so that identical terms can be used to describe the medical conditions of breastfeeding women from the perspective of Oriental medicine. Method: This paper is based on records in medical books. It organizes the relations between the terms through an analysis of similarities and differences in the concepts of the terms they contain. The medical book chiefly used was "Uibangyuchwi(醫方類聚)". A thesaurus was used to organize the terms. Result & Conclusion: Korean medicine terminology related to breast diseases that occur during breastfeeding is defined in terms of causes, affected areas, or pathological conditions. The clinically typical terms in Korean medicine are 'Tuyu(妬乳)' and 'Yuong(乳癰)'. The two are distinguished by whether systemic symptoms are present. If there are no systemic symptoms, the condition is 'Chwiyu(吹乳)' or 'Tuyu', and these two are distinguished by whether 'Chang(瘡; sores)' are present. Organizing the concepts of Korean medicine terminology is significant because they are directly related to treatment in the field.

Construction of Immunology Thesaurus and Ontology (면역학 시소러스 및 온톨로지 구축)

  • Im, Ji-Hui;Choe, Ho-Seop;Bae, Young-Jun;Ock, Cheol-Young;Choi, Sung-Pil;Sung, Won-Kyung;Park, Dong-In
    • Annual Conference on Human and Language Technology, 2005.10a, pp.21-27, 2005
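  • In this paper, we selected a specific field related to the next-generation growth engine industries promoted by the government ("immune function control" within the "bio new drugs/organs" field) and constructed an immunology thesaurus (3,462 terms) and an ontology (4,703 concept nodes) centered on core terms and related terms, referring to a previously built immunology terminology dictionary as well as a medical terminology dictionary and the Standard Korean Language Dictionary. The result follows a unified standard scheme that spans from the terminology dictionary to the ontology, and building this domain ontology provided an opportunity to set the direction of future ontology development. In addition, the immunology thesaurus was built with enough data to improve retrieval performance, and the immunology ontology represents an ontology from a language-processing perspective. The work is meaningful in terms of efficiency in information retrieval, convertibility into a web ontology using a specific web ontology language, and its scale as a large domain ontology.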
