• Title/Summary/Keyword: Korean Thesaurus

Implementation of Annotation and Thesaurus for Remote Sensing

  • Chae, Gee-Ju;Yun, Young-Bo;Park, Jong-Hyun
    • Proceedings of the KSRS Conference / 2003.11a / pp.222-224 / 2003
  • Many users want to add their own information to data on the web or on a computer without touching the data itself. In remote sensing, the results of image classification generally consist of an image and a text file. To overcome this inconvenience, we propose an annotation method based on XML. The method is efficient and can be applied both to the web and to viewing image classification results that consist of image and text files. A thesaurus is needed because search engines such as Empas, Naver, and Google offer little information on remote sensing and GIS. A search engine cannot simultaneously retrieve information for a concept that goes by many different names. We select remote sensing data from different sources and establish relations among the many terms. To do so, we analyze the meanings of different terms that have similar meanings.

Automatic Text Summarization Using Thesaurus (시소러스를 이용한 문서 자동 요약)

  • 이창범;박혁로
    • Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.352-354 / 2001
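  • Automatic text summarization is the process by which a computer automatically generates a summary of an input document. That is, the computer reduces the complexity of the document, namely its length, while preserving its essential content. Research on automatic summarization is actively under way as one means of providing efficient access to information while addressing information overload. This paper proposes automatic text summarization using a thesaurus built for semantic information retrieval. The proposed method uses the relations between words, namely synonym, near-synonym, hypernym, and hyponym relations, for summarization. A summary is generated in three main stages: association chain formation, key sentence extraction, and summary generation. In an evaluation on manually summarized newspaper articles, the generated summaries agreed with the manual ones 66% of the time on average.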

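The three stages named in the abstract above (chain formation, key sentence extraction, summary generation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy English thesaurus, the function names, and the scoring rule (a sentence scores the total length of the chains it touches) are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy thesaurus: each term maps to a concept label, so that
# synonyms and narrower terms fall under the same concept.
TOY_THESAURUS = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "price": "cost", "cost": "cost", "fee": "cost",
}

def build_chains(sentences):
    """Stage 1: group related word occurrences into chains keyed by concept."""
    chains = defaultdict(list)  # concept -> sentence index of each occurrence
    for idx, sentence in enumerate(sentences):
        for word in sentence.lower().split():
            concept = TOY_THESAURUS.get(word.strip(".,"))
            if concept is not None:
                chains[concept].append(idx)
    return chains

def score_sentences(sentences, chains):
    """Stage 2: score each sentence by the total length of the chains it touches."""
    scores = [0] * len(sentences)
    for members in chains.values():
        for idx in set(members):
            scores[idx] += len(members)
    return scores

def summarize(sentences, k=1):
    """Stage 3: keep the k best-scoring sentences in their original order."""
    chains = build_chains(sentences)
    scores = score_sentences(sentences, chains)
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return [sentences[i] for i in sorted(ranked[:k])]
```

Calling `summarize(sentences, k=2)` on a list of sentence strings returns the two sentences that participate in the longest chains. A real system would replace the whitespace tokenizer with a Korean morphological analyzer and the toy table with an actual thesaurus.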

Automatic Korean to English Cross Language Keyword Assignment Using MeSH Thesaurus (MeSH 시소러스를 이용한 한영 교차언어 키워드 자동 부여)

  • Lee Jae-Sung;Kim Mi-Suk;Oh Yong-Soon;Lee Young-Sung
    • The KIPS Transactions:PartB / v.13B no.2 s.105 / pp.155-162 / 2006
  • The medical thesaurus MeSH (Medical Subject Headings) has long been used as a controlled vocabulary for indexing English medical papers. In this paper, we propose an automatic cross-language keyword assignment method that assigns English MeSH index terms to the abstract of a Korean medical paper. We compare its performance with the indexing performance of human indexers and of the authors. Index terms are assigned by first extracting Korean MeSH terms from the text, converting them to the corresponding English MeSH terms, and then calculating the importance of the terms to select the highest-ranked ones as keywords. For this process, an effective method for solving the spacing variant problem is proposed. Experiments showed that the method solved the spacing variant problem and reduced the thesaurus space by about 42%. The experiments also showed that automatic keyword assignment performs well below human indexers but about as well as authors.
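
The assignment procedure in this abstract (extract Korean terms, map them to English MeSH terms, rank by importance) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' method: the three-entry `KO_TO_EN` table is a hypothetical stand-in for a Korean MeSH mapping, spacing variants are handled simply by deleting all whitespace before matching, and importance is approximated by raw frequency.

```python
import re
from collections import Counter

# Hypothetical toy mapping from Korean terms to English MeSH headings.
KO_TO_EN = {
    "당뇨병": "Diabetes Mellitus",
    "고혈압": "Hypertension",
    "심근 경색": "Myocardial Infarction",
}

def strip_spaces(text):
    """Collapse spacing variants by removing all whitespace."""
    return re.sub(r"\s+", "", text)

def assign_keywords(abstract, top_n=2):
    """Return up to top_n English headings, most frequent first."""
    compact = strip_spaces(abstract)
    counts = Counter()
    for ko_term, en_term in KO_TO_EN.items():
        hits = compact.count(strip_spaces(ko_term))
        if hits:
            counts[en_term] += hits
    # Sort by descending frequency, then alphabetically for stable ties.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]
```

Storing one space-free form per term, rather than every spacing variant, is one plausible way a thesaurus could shrink. The trade-off is that matching on space-free text can produce false hits that span word boundaries.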

A Study on Classification System of Korean Literatures Thesaurus (고전 용어 시소러스의 분류 체계에 관한 연구)

  • Yoo Yeong-Jun
    • Journal of the Korean Society for Library and Information Science / v.40 no.2 / pp.415-434 / 2006
  • This study aims to develop a classification system for the descriptors found in Korean literature. First, the classification structure is organized into six facets, and the system is constructed deductively on the basis of knowledge of Korean literature. The study compared this system with the classification systems of various thesauri in the humanities, and the comparison showed that the system for Korean literature terms benefits from using the facet method. Owing to these merits, the classification system achieves consistent categorization independently and reduces the complexity of the classification structure. By consolidating the common categories, the study also reduced the size of the schedules. Finally, the structure of the classification system was improved in the course of classifying the descriptors.

Development of Online Fashion Thesaurus and Taxonomy for Text Mining (텍스트마이닝을 위한 패션 속성 분류체계 및 말뭉치 웹사전 구축)

  • Seyoon Jang;Ha Youn Kim;Songmee Kim;Woojin Choi;Jin Jeong;Yuri Lee
    • Journal of the Korean Society of Clothing and Textiles / v.46 no.6 / pp.1142-1160 / 2022
  • Text data plays a significant role in understanding and analyzing trends in consumer, business, and social sectors. Text analysis requires a corpus that reflects specific domain knowledge. However, in the field of fashion, professional corpora are insufficient. This study aims to develop a taxonomy and thesaurus that reflect the specialized nature of fashion products. To this end, about 100,000 fashion vocabulary terms were collected by crawling text data from WGSN, Pantone, and online platforms; text was subsequently extracted through preprocessing with Python. The taxonomy was composed of seven attributes of clothes: items, silhouettes, details, styles, colors, textiles, and patterns/prints. The corpus was completed by processing synonyms of terms drawn from fashion books such as dictionaries. Finally, 10,294 vocabulary words, including 1,956 standard Korean words, were classified in the taxonomy. All data was then developed into a web dictionary system. Quantitative and qualitative performance tests of the results were conducted through expert reviews. The performance of the thesaurus was also verified by comparing its text mining results with those obtained using a previously developed corpus. This study contributes to a text data standard and enables meaningful text mining results in the fashion field.

A Comparative Study of Subject Headings Related to Korea and Japan in the Chinese Classified Thesaurus ("중국분류주제사표(中國分類主題詞表)"의 한.일 관련 주제명에 대한 비교 분석)

  • Moon, Ji-Hyun;Kim, Jeong-Hyen
    • Journal of Korean Library and Information Science Society / v.42 no.3 / pp.331-350 / 2011
  • This study extracted the subject headings related to Korea and Japan from the second edition of the Chinese Classified Thesaurus (CCT) and compared the number and characteristics of the headings by subject. The analysis shows that the total number of Korea-related headings, including proper nouns, was 215, which is limited compared with Japan in both number and subject diversity. In particular, the CCT does not accurately reflect the current state of Korea: it uses the word 'Josun' to denote Korea, calls the Korean War the 'Josun War', and records it only under North Korean history. Meanwhile, the Japan-related subject headings include many that show the complicated historical relationship between Japan and China, such as the Manchurian Incident and the Japan-China War.

A SVM-based Spam Filtering System for Short Message Service (SMS) (휴대폰 SMS를 위한 SVM 기반의 스팸 필터링 시스템)

  • Joe, In-Whee;Shim, Hye-Taek
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.9B / pp.908-913 / 2009
  • Mobile phones have become an indispensable part of daily life, and use of the short message service (SMS) on these phones is 1.5 to 2 times that of the voice service. However, the spam filters installed in mobile phones work by registering specific number patterns or words and flagging a message as spam when those numbers or words appear. This method cannot properly filter the various types of spam messages currently being sent. This paper proposes a more powerful and more adaptive spam filtering system using an SVM and a thesaurus. The system isolates words from sample data with a preprocessing module and merges the meanings of the isolated words using a thesaurus. It then generates features for the merged words using chi-square statistics and learns from those features. The proposed system is implemented in a Windows environment, and its performance is confirmed through experiments.
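
The feature-generation step described in this abstract (merge word meanings with a thesaurus, then score the merged words with chi-square statistics) can be sketched in plain Python. The toy thesaurus, the sample messages, and the function names are hypothetical, and the SVM training stage is deliberately left out. The statistic is the standard 2x2 chi-square between term presence and the spam label.

```python
# Hypothetical toy thesaurus mapping variant words to one canonical word.
TOY_THESAURUS = {"loans": "loan", "credit": "loan", "prize": "win", "winner": "win"}

def normalize(message):
    """Tokenize on whitespace and merge words that share a thesaurus entry."""
    return {TOY_THESAURUS.get(token, token) for token in message.lower().split()}

def chi_square(term, docs, labels):
    """2x2 chi-square statistic between term presence and the spam label."""
    a = b = c = d = 0
    for tokens, is_spam in zip(docs, labels):
        present = term in tokens
        if present and is_spam:
            a += 1  # spam containing the term
        elif present:
            b += 1  # non-spam containing the term
        elif is_spam:
            c += 1  # spam without the term
        else:
            d += 1  # non-spam without the term
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def top_features(messages, labels, k):
    """Return the k merged words with the highest chi-square scores."""
    docs = [normalize(m) for m in messages]
    vocab = sorted(set().union(*docs))
    return sorted(vocab, key=lambda t: (-chi_square(t, docs, labels), t))[:k]
```

Because "loans" and "credit" both collapse to "loan", the merged word appears in more spam messages than either variant alone and so receives a stronger score. The selected words would then serve as the feature set for a classifier such as an SVM.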

A Study on the Improvement of Accessibility to Public Records: Based on the Construction of Subject Thesaurus for Presidential Archives (공공기록에 대한 접근성 제고 방안에 관한 연구 - 대통령기록관 주제시소러스 개발 사례를 중심으로 -)

  • Rieh, Hae-Young;Kwon, Yongchan;Seong, Hyojoo;Yoo, Byonghoo
    • Journal of Korean Society of Archives and Records Management / v.14 no.4 / pp.127-151 / 2014
  • Searching by functional classification or provenance is not easy for users, and keyword-based information retrieval returns only simple word matches against record titles. The Presidential Archives of Korea developed a subject classification scheme to make searching its varied records more convenient, and built a subject thesaurus on that scheme using terms that appear in record titles and terms used by users who searched the portal or requested information disclosure. This research presents the development process of the subject thesaurus, along with ways to use it in records management work and services.

Study on the Development of Guidelines for Thesaurus Construction at University Archives: Case Study of Myongji University Archives Center (대학기록관 시소러스 구축 지침의 개발 연구 - 명지대학교 대학사료실의 사례를 중심으로 -)

  • Rieh, Hae-Young;Lee, Mi-Yeong;Lee, Eun-Yeong;Lee, Hyeok-Jun;Lee, Hyeon-Jeong;Choe, Yeong-Sil;Park, Mi-Ja
    • Journal of Korean Society of Archives and Records Management / v.8 no.1 / pp.189-210 / 2008
  • This paper describes the issues that arose, and the solutions considered, in the process of developing guidelines for thesaurus construction. Many of the terms considered in the process were proper names and proper nouns. The thesaurus also needed to function as an authority file. Preferred terms were selected according to usage in the university's official records. Proper names were included for people who held official positions at the university and for people who were the subject of the materials. However, where the system allows combined retrieval across the creator and donor fields, including too many names was considered unnecessary.

A Study on the Expansion of Fundamental Categories Based on Thesaurus International Standards (시소러스 국제표준 기반 기본 범주의 확장에 관한 연구)

  • Chang, Inho
    • Journal of Korean Library and Information Science Society / v.50 no.1 / pp.273-291 / 2019
  • This study aims to extend the fundamental categories of Clause 11, "Facet analysis", of the international standard for thesauri (ISO 25964-1) by analyzing those categories together with the concepts and their scope in a thesaurus described in Clause 5. To this end, fundamental categories were established by partially adjusting existing ones and explicitly adding mental entities, with reference to top-level concepts (YAMATO, the upper ontology of Mizoguchi, and ISO 2788) and existing fundamental categories (PMEST, the FRBR Group 3 entities, and the 13 categories of the CRG). The established categories were then reorganized and structured according to the concreteness/abstraction of Ranganathan's PMEST and the independence/dependence of Mizoguchi's YAMATO. The upper categories were divided into independent and dependent entities, with 28 criteria under the independent entities and 2 under the dependent ones. The results are expected to be reusable as a controlled-vocabulary reference in fields that draw on fundamental categories, such as classification, taxonomies, and thesauri, and as high-level concepts when constructing an ontology for information retrieval.