• Title/Summary/Keyword: Korean Thesaurus

A Automatic Document Summarization Method based on Principal Component Analysis

  • Kim, Min-Soo;Lee, Chang-Beom;Baek, Jang-Sun;Lee, Guee-Sang;Park, Hyuk-Ro
    • Communications for Statistical Applications and Methods / v.9 no.2 / pp.491-503 / 2002
  • In this paper, we propose an automatic document summarization method based on Principal Component Analysis (PCA), one of the multivariate statistical methods. After extracting thematic words using PCA, we select the sentences containing the extracted thematic words and build the document summary from them. Experimental results on newspaper articles show that the proposed method is superior to methods using either word frequency or an information retrieval thesaurus.
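The pipeline described in this abstract (PCA over word counts, then sentence selection by thematic word) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the stopword list, the choice of the first principal component only, the power-iteration solver, and the names `tokenize`, `leading_component`, and `summarize` are all assumptions made for the example.

```python
import math
import re

# Illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "on"}

def tokenize(sentence):
    """Lowercase a sentence and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]

def leading_component(matrix, iterations=200):
    """Unit-length leading eigenvector of the column covariance of `matrix`,
    found by power iteration. Rows are observations (at least two are needed)."""
    n, d = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in matrix]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Asymmetric start vector, normalized to unit length.
    vec = [1.0 / (j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # all-zero covariance: keep the current vector
            break
        vec = [x / norm for x in nxt]
    return vec

def summarize(sentences, num_words=2):
    """Pick the words with the largest absolute loadings on the first principal
    component, then keep every sentence that contains at least one of them."""
    tokens = [tokenize(s) for s in sentences]
    vocab = sorted({w for t in tokens for w in t})
    # Sentence-by-word count matrix: one row per sentence, one column per word.
    matrix = [[t.count(w) for w in vocab] for t in tokens]
    loadings = leading_component(matrix)
    ranked = sorted(range(len(vocab)), key=lambda j: abs(loadings[j]), reverse=True)
    thematic = [vocab[j] for j in ranked[:num_words]]
    summary = [s for s, t in zip(sentences, tokens) if any(w in t for w in thematic)]
    return thematic, summary
```

Calling `summarize(sentences, num_words=2)` returns the chosen thematic words together with the selected sentences in their original order. A fuller treatment would likely use several components and a linguistically informed tokenizer.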

Using WordNet for the Automatic Construction of Korean Thesaurus (WordNet을 이용한 한국어 시소러스 자동 구축)

  • Lee, Chang-Ki;Lee, Geun-Bae
    • Annual Conference on Human and Language Technology / 1999.10e / pp.156-163 / 1999
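  • Recent research in natural language processing has demonstrated the need for broad and complete lexical knowledge bases. For English, such work has a long history, and the concept hierarchies now in common use include Roget's Thesaurus and WordNet. These resources play an important role in many natural language processing applications, but for other languages no widely used concept hierarchy exists. In this paper, we automatically construct a concept hierarchy for Korean nouns based on Princeton University's WordNet, using a Korean-English dictionary and a Korean monolingual dictionary, and thereby show that a concept hierarchy for a new language can be built automatically from one already built for another language. We first show that the senses of a subset of Korean words extracted from the Korean-English and Korean dictionaries can be linked automatically to WordNet synsets by applying various WSD (Word Sense Disambiguation) methods. We then compare the coverage and accuracy of the results produced by each automatic mapping.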

The type of associative relationships of Thesaurus described in literature of science and technology (과학기술 문헌에 나타난 시소러스의 연관관계 유형에 관한 연구)

  • Song, Yoo-Hwa;Choe, Ho-Seop
    • Annual Conference on Human and Language Technology / 2011.10a / pp.117-122 / 2011
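  • Because there are no established principles or methodology for subdividing the types of associative relationships in a thesaurus, each organization that builds a thesaurus uses its own classification. Research on the facet-indicator models applied to this classification continues, but no empirical case studies supporting their validity can be found. In this study, we take preferred terms and related terms selected by fixed criteria from the thesaurus built by Inspec, find sentences in which both terms co-occur in documents provided by IEL, and propose a model of their associative relationships.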

A Study on the Information Searching Behavior of MEDLINE Retrieval in Medical Librarian (의학전문사서의 정보이용행위에 관한 연구)

  • Lee Jin-Young;Jeong Sang-Kyung
    • Journal of Korean Library and Information Science Society / v.30 no.2 / pp.123-153 / 1999
  • This article seeks ways for searchers of the online MEDLINE in medical libraries to use the data more efficiently, building on studies of search behavior with existing CD-ROM databases. To examine MEDLINE search behavior in medical libraries, we distributed questionnaires to librarians in 60 medical libraries and surveyed the literature and actual practice of data use. The results are as follows: 1) The rate of single-user medical data systems was 53% and that of multi-user systems 43%. As for weekly retrieval time, 75% of users searched for under two hours, 18.3% for between 3 and 8 hours, and 6.7% for over 9 hours. 2) The factors that improve search results are (1) sufficient discussion and interviews between librarians and users, and (2) use of the correct indexing terms, thesaurus, and keywords. In principle, users should search directly; however, librarians searched on their behalf in cases where retrieval was under two hours a week (75%). 3) As for the search fee, 91% was free and 9% was charged. Search effectiveness was also enhanced by means of the Inter-Library Loan Service & Information Network. 4) The medical librarians answered that they need training in applying professional knowledge, medical terms (thesaurus), and electronic media, as well as computer education, interview techniques, and retraining in order to provide satisfactory service. 5) As for satisfaction with MEDLINE use, respondents reported 44.6% for economy, 38.2% for convenience in the time required, and 58.9% for users' search satisfaction. 6) The use of the MEDLINE system enhanced the medical libraries' image and had an effect on users' satisfaction with data use and searching, on data activities, and on research achievement. 7) In the past MeSH was used, but over time CD-ROM MEDLINE searching came to be preferred to online searching.

A Korean Sentence and Document Sentiment Classification System Using Sentiment Features (감정 자질을 이용한 한국어 문장 및 문서 감정 분류 시스템)

  • Hwang, Jaw-Won;Ko, Young-Joong
    • Journal of KIISE:Computing Practices and Letters / v.14 no.3 / pp.336-340 / 2008
  • Sentiment classification is a recent subdiscipline of text classification that is concerned not with topic but with opinion. In this paper, we present a Korean sentence and document classification system using effective sentiment features. Korean sentiment classification starts from constructing effective sentiment feature sets for the positive and negative classes. The synonym information of an English word thesaurus is used to extract effective sentiment features, and the extracted English sentiment features are then translated into Korean features using an English-Korean dictionary. A sentence or a document is represented using the extracted sentiment features and is classified and evaluated with an SVM (Support Vector Machine).

Word Network Analysis based on Mutual Information for Ontology of Korean Rural Planning (한국농촌계획 온톨로지 구축을 위한 상호정보 기반 단어연결망 분석)

  • Lee, Jemyung
    • Journal of Korean Society of Rural Planning / v.23 no.3 / pp.37-51 / 2017
  • Interest in ontologies has been growing, especially in the recent knowledge-based industry, and defining a field-customized semantic word network is essential for building one. In this paper, a word network for an ontology is established from 785 publications of the Korean Society of Rural Planning (KSRP), from 1995 to 2017. Semantic relationships between words in the publications were measured quantitatively with normalized pointwise mutual information, a measure based on information theory. Appearance and co-appearance frequencies of nouns and adjectives in phrases are analyzed on the assumption that a noun phrase represents a single concept. For verification, the KSRP word network was compared with that of WordNet™, a worldwide thesaurus network. The comparison shows that the KSRP word network established in this paper provides semantic relationships between words based on the common concepts of the Korean rural planning research field. With these results, it is expected that the established word network can offer the field of Korean rural planning more opportunities to prepare for the fourth industrial revolution.
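For reference, normalized pointwise mutual information is commonly defined as NPMI(x, y) = ln(p(x, y) / (p(x) p(y))) / (-ln p(x, y)), which ranges from -1 to 1. The sketch below computes it from per-phrase co-occurrence counts. It is only an illustration under assumptions: probabilities are estimated as the fraction of phrases containing a word or a word pair, and the function name `npmi_scores` and the toy input are hypothetical, not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations

def npmi_scores(phrases):
    """Return NPMI for every word pair that co-occurs in at least one phrase.

    `phrases` is a list of word lists. p(x) is the fraction of phrases that
    contain x, and p(x, y) the fraction that contain both x and y.
    """
    n = len(phrases)
    word_counts = Counter()
    pair_counts = Counter()
    for phrase in phrases:
        unique = sorted(set(phrase))  # count each word once per phrase
        word_counts.update(unique)
        pair_counts.update(combinations(unique, 2))  # pairs in alphabetical order
    scores = {}
    for (x, y), count in pair_counts.items():
        p_xy = count / n
        p_x = word_counts[x] / n
        p_y = word_counts[y] / n
        if p_xy == 1.0:
            # Both words occur in every phrase; the limiting value is 1.
            scores[(x, y)] = 1.0
        else:
            scores[(x, y)] = math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)
    return scores
```

Pairs are keyed in alphabetical order, so a lookup such as `scores[("planning", "rural")]` must follow that ordering. Word pairs that only ever appear together score 1, and pairs that co-occur less often than chance score below 0.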

Extending the MARTIF and TEI for Korean Lexical Entities (한국어사전 인코딩체계의 확장에 관한 연구: MARTIF와 TEI를 중심으로)

  • 백지원;최석두
    • Journal of the Korean Society for Information Management / v.18 no.2 / pp.295-322 / 2001
  • The purpose of this study is to present a scheme for encoding all possible lexical entities in dictionaries, glossaries, encyclopedias, thesauri, and similar works. First, it discusses the nature and structure of dictionaries. Second, two current major terminological data encoding schemes, MARTIF and TEI, are analyzed in terms of their flexibility for extension to encompass all lexical entities. Third, an integrated microstructure of dictionaries is presented and compared with MARTIF and TEI for print dictionaries. Finally, the need for extended MARTIF and TEI formats is addressed through 17 suggestions with specific cases, combined with suggestions from two studies on modifying the MARTIF and TEI DTDs for the markup of Korean dictionary entries.

Meta Information Retrieval using Sentence Analysis of Korean Dialogue Style (한국어 대화체 문장 분석을 이용한 메타 정보검색)

  • 박인철
    • Journal of the Korea Computer Industry Society / v.4 no.10 / pp.703-712 / 2003
  • As communication networks develop, the number of documents on the Internet keeps growing, and information retrieval systems that can efficiently acquire the necessary information are required. Most information retrieval systems retrieve documents using a simple keyword or a boolean query of keywords. However, this method is ill-suited to novice users and, compared with a conversational query, is less convenient and conveys the user's intent less precisely. This paper therefore proposes a method to address these problems and designs and implements a meta query processing system for information retrieval using Korean dialogue sentences. The implemented system generates a new boolean query for a given Korean dialogue sentence and resolves lexical ambiguities through morphological analysis, syntactic analysis, and query expansion using a thesaurus.
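The last step in this abstract, turning extracted keywords into a boolean query widened with thesaurus terms, might look roughly like the sketch below. It assumes the keywords have already been produced by morphological and syntactic analysis, which is not shown. The function name `build_boolean_query`, the dictionary-based thesaurus, and the AND/OR output syntax are illustrative assumptions rather than the paper's design.

```python
def build_boolean_query(keywords, thesaurus):
    """Join keywords with AND, expanding each one with OR over its thesaurus terms.

    `keywords` is a list of strings assumed to come from an earlier analysis
    step; `thesaurus` is a hypothetical dict mapping a keyword to related terms.
    """
    groups = []
    for keyword in keywords:
        # Keep the original keyword first, then any distinct related terms.
        terms = [keyword] + [t for t in thesaurus.get(keyword, []) if t != keyword]
        if len(terms) > 1:
            groups.append("(" + " OR ".join(terms) + ")")
        else:
            groups.append(keyword)
    return " AND ".join(groups)


# Example with a toy thesaurus (hypothetical data).
toy_thesaurus = {"thesaurus": ["lexicon"]}
query = build_boolean_query(["thesaurus", "korean"], toy_thesaurus)
```

Keywords without thesaurus entries pass through unchanged, so the expansion can only broaden each AND group and never drops a term the user asked for.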

A Study on the Curriculum of the Library and information Science Education Programs Prepared for the Changing Environment (변화하는도서관환경에 대비한 문헌정보학과의 교과과정 연구)

  • Hahn Bock-Hee
    • Journal of Korean Library and Information Science Society / v.30 no.2 / pp.179-198 / 1999
  • The scope and magnitude of the changes occurring in libraries today are exciting, and the new developments in information technology are challenging. A change in vision, as well as in activities and operations, is required. Librarians need to make full use of information and multimedia technology to support this greatly expanded teaching venture. The 1st through 5th conferences of the Korean Society for Information Management produced 199 articles, which cover 25 sub-subjects of library and information science. Some of the interesting sub-subjects were as follows: Information retrieval, Indexing, Classification, Library management, Information service, Cataloging and Digital library. Professional librarians identified the essential curricula for library and information science education: Introduction to Library Science, Organization of Information Resources, Information Retrieval, Multimedia technology, Information System, Library Management, Networks, Data Base, Indexing & thesaurus, Collection development, User Studies, New Media, and Online Search.

Function-Based Classification System for Public Records of Government-General of Chosun (조선총독부 기록물을 위한 기능분류체계 개발 연구)

  • 설문원
    • Journal of the Korean Society for Information Management / v.20 no.1 / pp.457-488 / 2003
  • Public records produced during the period of the Government-General of Chosun are essential sources for research on modern Korean history. The purpose of this study is to provide a guideline for developing a function-based classification scheme for these records. The paper begins by analyzing archival principles regarding function-based classification and examines the problems of current arrangement practices. Based on these analyses, it suggests a guideline for constructing a classification system and a functional thesaurus for the public records of the Government-General of Chosun. The guideline also covers the functional analysis process and considerations of the conceptual, verbal, and notational aspects of classification.