• Title/Summary/Keyword: Korean Thesaurus

Study on Acceleration of Building a Thesaurus by Means of Pre-applying of $\alpha$-cut ($\alpha$-cut 선적용에 의한 시소러스 구축의 가속화에 관한 연구)

  • 김창민;김용기
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1997.10a / pp.233-236 / 1997
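  • Fuzzy information retrieval, which applies the concept of fuzzy relations, can retrieve information based on the semantics of documents and terms, unlike conventional retrieval based on morphology. It divides the data into a set of documents and a set of terms, represents the relationship between them as a document $\times$ term relation matrix, builds a thesaurus using the fuzzy relational product operation, and returns the documents that fit the query given by the user. However, the fuzzy relational product has very high time complexity, and fuzzy values must be represented as floating-point numbers, so the approach cannot be applied to large-scale document systems and is impractical. Floating-point operations are slow and require a lot of storage, so if they can be replaced with bit operations, gains in both processing speed and storage space can be expected. This study proposes a method for improving performance by adjusting the point at which the $\alpha$-cut is applied when building the thesaurus for fuzzy information retrieval.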
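
The idea of applying the $\alpha$-cut before the relational product can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the function names `alpha_cut_columns` and `crisp_inclusion`, the sample matrix, and the choice of a crisp set-inclusion test as the relational product are all hypothetical. It shows how thresholding first turns each term column into a bitmask, so the product needs only bit operations.

```python
def alpha_cut_columns(matrix, alpha):
    """Apply the alpha-cut to a document x term matrix of fuzzy values.

    Returns one int bitmask per term: bit d is set when matrix[d][t] >= alpha.
    """
    n_terms = len(matrix[0])
    masks = [0] * n_terms
    for d, row in enumerate(matrix):
        for t, value in enumerate(row):
            if value >= alpha:
                masks[t] |= 1 << d
    return masks


def crisp_inclusion(masks):
    """Term x term relation (assumed form of the product after the cut).

    result[i][j] is True when term i occurs in at least one document and
    every document containing term i also contains term j.
    """
    return [[m_i != 0 and (m_i & ~m_j) == 0 for m_j in masks] for m_i in masks]


# Hypothetical 3 documents x 3 terms membership matrix.
R = [
    [0.9, 0.8, 0.1],
    [0.7, 0.2, 0.0],
    [0.1, 0.0, 0.6],
]
masks = alpha_cut_columns(R, alpha=0.5)
relation = crisp_inclusion(masks)
print(masks)
print(relation[1][0], relation[0][1])
```

After the cut, each term needs one bit per document instead of one floating-point value, and the inclusion test for a pair of terms is a single AND/NOT on two integers.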

A Study on Korean Spoken Language Understanding Model (한국어 구어 음성 언어 이해 모델에 관한 연구)

  • 노용완;홍광석
    • Proceedings of the IEEK Conference / 2003.07e / pp.2435-2438 / 2003
  • In this paper, we propose a Korean speech understanding model that uses a dictionary and a thesaurus. The proposed model searches the dictionary for a word that matches a word in the input text. If the word is not in the dictionary, the model searches for its higher-level word in a higher-level word dictionary based on the thesaurus. We compare the probability given by the sentence understanding model with a threshold probability to obtain the speech understanding rate. We evaluated the performance of the sentence speech understanding system by applying it to the twenty questions game. In the experiment, we obtained a sentence speech understanding accuracy of 79.8%, with a higher-level word probability of 0.9 and a threshold probability of 0.38.

A Study on the Performance of the Concept-Based Information Retrieval Model Using a Relation of Thesaurus (개념기반 검색을 위한 시소러스 관계의 효과적 활용방안에 관한 연구)

  • 노영희
    • Journal of the Korean Society for Information Management / v.17 no.4 / pp.47-65 / 2000
  • This study aims to enhance the performance of concept-based information retrieval through the use of the traditional thesaurus, which clearly defines relations among terms. To achieve this, the study purports to construct relation-value-based, relation-based, and integrated knowledge bases through the use of the thesaurus. To compare and analyze retrieval performance among the knowledge bases, two methods were applied. The sequential bnb algorithm is applied to the relation-value-based and integrated knowledge bases, while the heuristic bnb algorithm is applied to the relation-based knowledge base.

A Real-Time Concept-Based Text Categorization System using the Thesaurus Tool (시소러스 도구를 이용한 실시간 개념 기반 문서 분류 시스템)

  • 강원석;강현규
    • Journal of KIISE:Software and Applications / v.26 no.1 / pp.167-167 / 1999
  • The majority of text categorization systems use term-based classification. However, because there are too many terms, this method is not effective for classifying documents in a real-time environment. This paper presents a real-time concept-based text categorization system, which classifies texts using a thesaurus. The system consists of a Korean morphological analyzer, a thesaurus tool, and a probability-vector similarity measurer. The thesaurus tool acquires the meanings of input terms and represents the text with a concept-vector rather than a term-vector. Because the concept-vector consists of semantic units and is small, it enables the system to analyze the text in real time. Because it represents the meanings of the text, the vector also supports concept-based classification. The probability-vector similarity measurer decides the subject of the text by calculating the vector similarity between the input text and each subject. The experimental results show that the proposed system can effectively analyze texts in real time and perform concept-based classification. The experiment also indicates that the thesaurus tool must be expanded to improve the system.
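
The pipeline described in this abstract (terms mapped to concepts, then a vector comparison against each subject) might look roughly like the sketch below. The abstract does not give the similarity measure or the thesaurus format, so the cosine measure, the dictionary-based `thesaurus`, the `subjects` profiles, and all function names here are assumptions made for illustration.

```python
import math
from collections import Counter


def concept_vector(terms, thesaurus):
    """Map terms to concepts and return relative concept frequencies."""
    counts = Counter(thesaurus[t] for t in terms if t in thesaurus)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(terms, thesaurus, subjects):
    """Return the subject whose concept profile is most similar to the text."""
    vec = concept_vector(terms, thesaurus)
    return max(subjects, key=lambda name: cosine(vec, subjects[name]))


# Hypothetical thesaurus (term -> concept) and subject profiles.
thesaurus = {"goal": "sport", "match": "sport", "stock": "finance", "bank": "finance"}
subjects = {"sports": {"sport": 1.0}, "economy": {"finance": 1.0}}
print(classify(["goal", "match", "bank"], thesaurus, subjects))
```

Because many terms collapse onto few concepts, the vectors compared at classification time stay short, which is the property the abstract credits for real-time operation.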

A Study on Form of Folksonomy Tags in University Libraries (대학도서관 폭소노미 태그의 형태적 특성에 관한 연구)

  • Lee, Sung-Sook
    • Journal of the Korean Society for Library and Information Science / v.42 no.4 / pp.463-480 / 2008
  • This study reviews the characteristics and patterns that emerge when folksonomy tags in university libraries are compared against guidelines for constructing controlled languages, by analyzing the formal characteristics of those tags. The structure and form of the folksonomy were examined using tags from university libraries over a period of 6 months. The tags were analyzed based on thesaurus development guidelines. The results of this research will provide baseline data for the use of folksonomy tag applications in digital libraries.

A Study on the Model of History Ontology: A Focus on Korean Modern Historical Person (역사용어 온톨로지 모형 적용 방안 연구 - 한국근현대사 인물을 중심으로 -)

  • Lee, Hye-Won;Yoon, So-Young
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.22 no.1 / pp.263-280 / 2011
  • The purpose of this study is to construct a history ontology model for historical persons by analyzing issues with the Korean History Thesaurus and interviewing history specialists who use the information systems of the National Institute of Korean History. The study verifies the difference between the two descriptions through a comparative analysis of term concepts in the Korean History Thesaurus and mind maps drawn by history majors. Based on this, we build a history ontology model that meets users' information needs and can be adapted to an information retrieval system. First, to organize the unique features of history, we define classes and attributes and then list considerations for instance input. The study suggests the possibility of new services that combine multiple features using concept extension, which is a strength of ontologies.

Construction of Korean WordNet (한국어 워드넷의 구축)

  • Lim, Sung-Shin;Lee, Eun-Ryoung;Kwon, Hyuk-Chul
    • Annual Conference on Human and Language Technology / 2004.10d / pp.106-111 / 2004
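  • Developing a natural language processing system that understands human language requires a knowledge base for semantic processing. Much effort has gone into bringing human knowledge into computers, and ontologies and thesauri are being built as a result. Abroad, the importance of knowledge bases is recognized and a great deal of research is under way; representative examples include Roget's Thesaurus, WordNet, the EDR concept dictionary, CYC, and Euro WordNet. Among these, the most representative and most widely used is Princeton University's WordNet. WordNet is a lexical database of English, the product of psycholinguistic research on human lexical knowledge, which psychologists and linguists have been building for more than 10 years. This paper introduces a Korean WordNet for nouns, built on the basis of WordNet using an English-Korean dictionary and a Korean dictionary, and presents the basic guidelines considered during its construction.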

A Web-document Recommending System using the Korean Thesaurus (한국어 시소러스를 이용한 웹 문서 추천 에이전트)

  • Seo, Min-Rye;Lee, Song-Wook;Seo, Jung-Yun
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.1 / pp.103-109 / 2009
  • We build a web document recommending agent system that offers a certain number of web documents to each user by monitoring and learning the user's web browsing behavior. We also propose a method of query expansion using the Korean thesaurus. For the queries used to search for new web documents, a candidate set is generated using the Korean thesaurus. From the words in the candidate set, we extract those most correlated with the queries by using TF-IDF and mutual information, and then expand the query. With query expansion, the system can recommend many web documents that are of potential interest to users. We thus conclude that the system with query expansion is more effective than a base system for recommending web documents to users.
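
The query expansion step can be illustrated with a small sketch. It covers only the mutual-information half of the ranking (the TF-IDF weighting is left out), and it assumes pointwise mutual information over document-level co-occurrence; the abstract does not specify either detail. The `thesaurus` mapping, the sample documents, and the function names are hypothetical.

```python
import math


def pmi(a, b, doc_sets):
    """Pointwise mutual information of two words over document co-occurrence.

    Returns None when the words never occur in the same document.
    """
    n = len(doc_sets)
    df_a = sum(a in s for s in doc_sets)
    df_b = sum(b in s for s in doc_sets)
    both = sum(a in s and b in s for s in doc_sets)
    if both == 0:
        return None
    return math.log(both * n / (df_a * df_b))


def expand_query(query, thesaurus, docs, top_k=1):
    """Append the top_k thesaurus candidates most associated with the query."""
    doc_sets = [set(d) for d in docs]
    candidates = {c for q in query for c in thesaurus.get(q, ())} - set(query)
    scored = []
    for c in sorted(candidates):
        values = [v for q in query if (v := pmi(q, c, doc_sets)) is not None]
        if values:
            scored.append((max(values), c))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return list(query) + [c for _, c in scored[:top_k]]


# Hypothetical thesaurus entries and tokenized documents.
thesaurus = {"car": ["automobile", "vehicle", "wagon"]}
docs = [
    ["car", "automobile", "engine"],
    ["car", "automobile"],
    ["vehicle", "road"],
    ["car", "vehicle"],
]
print(expand_query(["car"], thesaurus, docs))
```

Candidates that never co-occur with any query word are dropped rather than scored, so a thesaurus entry that is absent from the collection cannot enter the expanded query.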

Automatic semantic annotation of web documents by SVM machine learning (SVM 기계학습을 이용한 웹문서의 자동 의미 태깅)

  • Hwang, Woon-Ho;Kang, Sin-Jae
    • Journal of Korea Society of Industrial Information Systems / v.12 no.2 / pp.49-59 / 2007
  • This paper describes a system that performs automatic semantic annotation in order to realize the "Semantic Web." Since it is impossible to tag the numerous documents on the web manually, it is necessary to gather a large number of Korean web documents as training data and to extract features using natural language processing techniques and a thesaurus. We then constructed concept classifiers with the SVM (support vector machine) learning algorithm. To suit the characteristics of the Korean language, morphological analysis and syntactic analysis were used to extract feature information. Based on these analyses, concept codes from the Kadokawa thesaurus are assigned, which makes it possible to map similar words and phrases to one concept code when building training vectors. This helped raise the recall of our system. The experimental results show that the system has some potential for semantic annotation.
