• Title/Summary/Keyword: 주제용어 (subject terms)


A Comparison Study of Subject Words of Korean Medical Papers: Author Keywords vs MeSH Terms Assigned by MEDLINE (한국 의학학술논문의 저자선정 주제어와 MeSH 용어의 비교 분석 연구)

  • 이춘실;문혜원
    • Proceedings of the Korean Society for Information Management Conference / 2000.08a / pp.67-70 / 2000
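  • This study compared the subject terms selected by authors of Korean medical papers (author terms) with the MeSH terms in the corresponding MEDLINE records. It measured the degree of agreement to determine how accurately Korean medical authors use MeSH terms, and analyzed the characteristics of their usage and the reasons for mismatches. Of the 1,826 author terms used in 415 papers published in the Korean Journal of Parasitology from 1989 to 1998, 35.5% (649 terms) could be regarded as matching the MeSH terms in the MEDLINE records, an average of 1.6 matching terms per paper. Of these, only 10.1% were exact matches. The results indicate that authors of Korean medical papers lack knowledge of how MeSH terms are used, including check tags, the tree structure, and subheadings, all of which are essential for using MeSH terms accurately.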


Automatic Generating Stopword Methods for Improving Topic Model (토픽모델의 성능 향상을 위한 불용어 자동 생성 기법)

  • Lee, Jung-Been;In, Hoh Peter
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.869-872 / 2017
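  • Stopword removal is one of the preprocessing steps applied to unstructured natural-language data collected for information retrieval and text analysis, and it is an easy and effective way to improve model quality. In topic modeling, a technique for extracting the latent topics of diverse text documents, removing stopwords that are outdated or unrelated to the domain or character of the collected documents reduces the coherence of the topic-related words that the topic model learns and generates. As a result, analysts have great difficulty interpreting the resulting topics correctly. To address this problem, this paper proposes a technique that automatically generates stopwords using pointwise mutual information (PMI) extracted from documents in the relevant domain, instead of the commonly used standard stopwords. When topic model quality was measured by perplexity for the generated stopwords and the standard stopwords, 30 stopwords generated by the proposed technique yielded better model performance than 421 standard stopwords.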
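As a rough illustration of the PMI measure named in this abstract, the sketch below computes document-level PMI, PMI(a, b) = log2(p(a, b) / (p(a) p(b))), for every pair of words that co-occur in a document. The abstract does not say how PMI scores are turned into a stopword list, so this shows only the score itself. The function name `pmi_scores` and the toy corpus are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(docs):
    """Return document-level PMI for each pair of co-occurring words.

    Probabilities are estimated as document frequencies divided by the
    number of documents (an assumption; other estimates are possible).
    """
    n_docs = len(docs)
    word_df = Counter()   # number of documents containing each word
    pair_df = Counter()   # number of documents containing each word pair
    for doc in docs:
        words = sorted(set(doc))
        word_df.update(words)
        pair_df.update(combinations(words, 2))

    scores = {}
    for (a, b), n_ab in pair_df.items():
        p_ab = n_ab / n_docs
        p_a = word_df[a] / n_docs
        p_b = word_df[b] / n_docs
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Hypothetical toy corpus: "the" and "of" are spread evenly across topics.
docs = [
    ["topic", "model", "the"],
    ["topic", "model", "of"],
    ["stopword", "list", "the"],
    ["stopword", "list", "of"],
]
scores = pmi_scores(docs)
print(scores[("model", "topic")])  # words that always appear together
print(scores[("the", "topic")])    # a word pair that co-occurs only by chance
```

In this toy corpus the pair ("model", "topic") scores higher than ("the", "topic"), which is the kind of contrast a PMI-based stopword criterion could exploit.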

Comparison and Analysis of Keywords in the Korean Ophthalmic Optics Society Articles to MeSH Terms (한국안광학회지 게재 논문의 주제어와 MeSH 용어의 비교·분석)

  • Kim, Daeyoon;Lee, Min Hyung;Choi, Moonsung
    • Journal of Korean Ophthalmic Optics Society / v.21 no.2 / pp.83-90 / 2016
  • Purpose: This study compares and analyzes the keywords of articles of the Korean Ophthalmic Optics Society against MeSH (Medical Subject Headings) terms, with the aim of improving the understanding and use of MeSH and providing baseline information to the Korean Ophthalmic Optics Society. Methods: A total of 1952 keywords from 409 informative articles published from 2004, Vol 9(1) to 2016, Vol 21(1) were compared with MeSH terms and classified as completely coincident, incompletely coincident, or completely non-coincident. Results: 439 keywords (22.4%) were completely coincident with MeSH terms, 815 keywords (41.8%) were incompletely coincident, and 693 keywords (35.5%) were completely non-coincident. The most frequently used completely coincident keywords were, in order, Myopia, Astigmatism, and Visual acuity. Among the incompletely coincident keywords, Refractive error, Soft contact lens, and Phoria were used most often. The most frequently used completely non-coincident keywords were Accommodative lag and Pseudomonas aeruginosa. Conclusions: Selecting MeSH terms as controlled keywords is highly recommended to increase the retrieval and use of Korean Ophthalmic Optics Society articles in MEDLINE.

Enhancing Document Clustering Method using Synonym of Cluster Topic and Similarity (군집 주제의 유의어와 유사도를 이용한 문서군집 향상 방법)

  • Park, Sun;Kim, Chul-Won
    • Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.1538-1541 / 2011
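  • This paper proposes a method for improving document clustering performance using synonyms of cluster topics and similarity. The proposed method selects the terms of each cluster topic using the semantic features of non-negative matrix factorization (NMF), which lets it represent the internal structure of the document clusters well. It then expands the cluster topic terms with WordNet synonyms, which addresses the problem of representing documents as a bag of words. In addition, it applies cosine similarity between the expanded cluster topic terms and the document set so that documents suited to each cluster topic are grouped well, which improves performance. Experimental results show that document clustering with the proposed method performs better than other document clustering methods.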
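The cosine-similarity step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the topic term lists are written by hand and stand in for terms that the paper obtains from NMF semantic features and WordNet expansion, and the names `cosine`, `assign`, and `topics` are assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign(doc_tokens, topics):
    """Return the name of the topic whose expanded terms best match the document."""
    doc_vec = Counter(doc_tokens)
    return max(topics, key=lambda name: cosine(doc_vec, Counter(topics[name])))


# Hypothetical expanded topic terms; "automobile" plays the role of a synonym
# added to the original topic term "car".
topics = {
    "vehicles": ["car", "automobile", "engine"],
    "finance": ["bank", "loan", "credit"],
}
doc = ["the", "automobile", "engine", "failed"]
print(assign(doc, topics))
```

The document never mentions "car", yet it is still assigned to the "vehicles" topic because the expanded term list contains "automobile". That is the bag-of-words limitation the synonym expansion is meant to address.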

Visualization of Conference Paper Topics and Trends According to Author-Assigned Index Terms (저자 지정 색인 용어에 따른 컨퍼런스 논문 주제 및 동향 시각화)

  • Snowberger, Aaron Daniel;Lee, Choong Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.340-342 / 2022
  • Index terms, or keywords, are an important component of research papers because they give a quick overview of the main subjects covered by highlighting the most important nouns. In this study, we extracted the author-assigned index terms from KIICE Conference Proceedings dating back to 2018 for the seasonal conferences and to 2016 for the international conference (ICFICE). The extracted index terms were standardized and analyzed to understand research topic trends and to identify over- or under-represented research topics. This kind of index term analysis is expected to help researchers not only identify additional potential topics for their own research but also select from a common vocabulary of keywords when they assign index terms to their papers.
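A minimal sketch of the standardize-then-count step mentioned in this abstract is shown below. The abstract does not describe the standardization rules, so the ones used here (lowercasing, treating hyphens as spaces, collapsing whitespace) are assumptions, as are the function names and sample index terms.

```python
from collections import Counter


def standardize(term):
    """Lowercase a term, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(term.lower().replace("-", " ").split())


def term_counts(papers):
    """Count how many papers use each standardized index term."""
    counts = Counter()
    for terms in papers:
        # A set ensures each term is counted at most once per paper.
        counts.update({standardize(t) for t in terms})
    return counts


# Hypothetical index terms for three papers.
papers = [
    ["Deep Learning", "IoT"],
    ["deep-learning", "Image  Processing"],
    ["IoT", "deep learning"],
]
counts = term_counts(papers)
print(counts.most_common(2))  # → [('deep learning', 3), ('iot', 2)]
```

Without standardization, "Deep Learning", "deep-learning", and "deep learning" would be counted as three separate terms, which would hide the most common topic.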


A Study on the Enhancement of Korean Diaspora-related Subject Headings: Focusing on Korean-related Terminology in the National Library of Korea Subject Headings (한인디아스포라 관련 주제명표목 개선 방안 연구 - 국립중앙도서관 주제명표목표의 한인 관련 용어를 중심으로 -)

  • Yeo, Ji-Suk;Yang, Kiduk;ITO, HIROKO;Lee, HyeKyung
    • Journal of Korean Library and Information Science Society / v.53 no.1 / pp.103-124 / 2022
  • This paper suggests a way to improve Korean diaspora-related subject headings based on the analysis of terminology about Koreans in Korean diaspora-related manuscripts and investigation of related terms in the National Library of Korea subject headings. After selecting three KCI journals with high ratios of diaspora-related papers, the study extracted Korean-related terminology from the journal papers and examined their term frequencies. Additional Korean-related terms were investigated by manually reviewing the articles in which extracted terms appear. Based on these analyses, the study proposes several supplemental enhancements to Korean-related topic names in the National Library of Korea's subject headings, such as changing the English notation, adding non-preferred words, and changing the hierarchical relationship of the existing topic names.

A Study on the Factors Influencing Semantic Relation in Building a Structured Glossary (구조적 학술용어사전 데이터베이스 구축에 있어서 용어의 의미관계 형성에 영향을 미치는 요인에 관한 연구)

  • Kwon, Sun-Young
    • Journal of the Korean Society for Library and Information Science / v.48 no.2 / pp.353-378 / 2014
  • The purpose of this study is to identify the factors that affect the formation of semantic relations between terms, and what those factors affect, in order to build the database schema of a terminology dictionary based on structural definitions. For this research, 826,905 keywords from 88,874 social science articles and 985,580 keywords from 125,046 humanities articles published in KCI journals from 2007 to 2011 were collected. From the collected data, the study analyzed subject complexity, structural holes, term frequency, occurrence patterns, and the effect between the number of nodes and the number of patterns derived from the semantic relations of linked terms in the established 'STNet' system. The results from the analyzed data and network patterns are summarized as follows. Betweenness centrality, term frequency, and effective size affect the number of semantic relation nodes. Among these factors, betweenness centrality had the greatest effect, followed by effective size, and term frequency had the least effect. Betweenness centrality, term frequency, and effective size also affect the number of semantic relation types, with term frequency having the greatest effect. Therefore, when building a terminology dictionary, betweenness centrality, term frequency, effective size, and subject complexity should be considered in selecting terms. Taking these factors into account can be expected to improve the quality of the terminology dictionary.

A Study on the Improvement of Accessibility to Public Records: Based on the Construction of Subject Thesaurus for Presidential Archives (공공기록에 대한 접근성 제고 방안에 관한 연구 - 대통령기록관 주제시소러스 개발 사례를 중심으로 -)

  • Rieh, Hae-Young;Kwon, Yongchan;Seong, Hyojoo;Yoo, Byonghoo
    • Journal of Korean Society of Archives and Records Management / v.14 no.4 / pp.127-151 / 2014
  • Searching by functional classification or provenance is not easy for users, and keyword-based retrieval offers only simple word matching against record titles. The Presidential Archive of Korea developed a subject classification scheme to make searching for various records more convenient and, based on that scheme, built a subject thesaurus that draws on the terms appearing in record titles and the terms used by users who searched the portal or requested information disclosure. This paper presents the development process of the subject thesaurus and how it can be used in records management work and services.

A Comparison Study of Subject Words of Korean Medical Journal Papers: Author Keywords vs MeSH Terms Assigned by MEDLINE (한국의학학술 논문의 저자선정 주제어와 MeSH 용어의 비교 분석)

  • 이춘실;문혜원
    • Journal of the Korean Society for Information Management / v.17 no.3 / pp.109-124 / 2000
  • To analyze how accurately authors of Korean medical papers use MeSH terms, the keywords assigned by the authors (author terms) were compared with the MeSH terms listed in the corresponding MEDLINE records. A total of 1,826 author terms were used in the 415 Korean Journal of Parasitology papers published between 1989 and 1998. An average of 4.4 author terms and 9.9 MeSH terms were assigned to each paper. 35.5% of author terms matched MeSH terms exactly or partially, an average of 1.6 terms per paper. Exact matches accounted for only 10.1%. The results show that the major differences between author terms and MeSH terms lie in the use of subheadings and check tags. This indicates that Korean authors in general do not have sufficient knowledge of how to select and use MeSH terms.
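The exact and partial matching counted in this abstract can be illustrated with a small sketch. The abstract does not state the matching rules, so the rule below (case-insensitive equality for an exact match, at least one shared word for a partial match) is an assumption, as are the function names and the sample terms.

```python
def normalize(term):
    """Lowercase a term, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(term.lower().replace("-", " ").split())


def classify(author_term, mesh_terms):
    """Label an author term as 'exact', 'partial', or 'none' against MeSH terms."""
    author = normalize(author_term)
    mesh = [normalize(m) for m in mesh_terms]
    if author in mesh:
        return "exact"
    author_words = set(author.split())
    # Assumed rule: sharing at least one word counts as a partial match.
    if any(author_words & set(m.split()) for m in mesh):
        return "partial"
    return "none"


# Hypothetical terms for one paper.
mesh_terms = ["Clonorchis sinensis", "Praziquantel", "Enzyme-Linked Immunosorbent Assay"]
author_terms = ["clonorchis sinensis", "praziquantel treatment", "ELISA"]

labels = [classify(t, mesh_terms) for t in author_terms]
matched = sum(label != "none" for label in labels)
print(labels)                       # → ['exact', 'partial', 'none']
print(matched / len(author_terms))  # share of author terms with any match
```

The abbreviation "ELISA" stays unmatched under this simple rule even though it names the same concept as a MeSH heading, which shows why string-level comparison understates conceptual agreement.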


A Study on Frequency of Subject on Content of Thesis in Field of Science and Technology (과학기술분야 학위논문 내용목차에 따른 주제어 출현빈도에 관한 연구)

  • Lee, Hye-Young;Kwak, Seung-Jin
    • Journal of the Korean Society for Information Management / v.25 no.1 / pp.191-210 / 2008
  • Subject terms, such as those assigned in subject indexing, are generally used to search for and access documents, so there should be some relationship between a document's full text and its subject terms. This study starts from that question. Master's theses in science and technology were used because their full text follows a relatively consistent format. The study examines where subject terms are located in a thesis and how they are distributed across the parts of the full text: 'Contents', 'Introduction', 'Theory', 'Main subject', 'Conclusion', and 'References'. A thesis consisted of 1226.3 terms on average, and subject terms averaged 12 to 13 terms. 'Contents' and 'Introduction' showed the highest frequency of subject terms.