정보관리학회지 (Journal of the Korean Society for Information Management)
- Vol. 17, No. 1
- Pages 129-148
- 2000
- 1013-0799 (pISSN)
- 2586-2073 (eISSN)
2-포아송 모형을 이용한 한글 주제어 선정에 관한 연구
A Study on the Applicability of 2-Poisson Model for Selecting Korean Subject Words
Abstract
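Using a recently built Korean test collection, this study measured how well the Z value of the 2-Poisson model identifies subject words, and analyzed the correlation between inverse document frequency and the 2-Poisson model. To this end, Z and a modified …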
Experiments were performed on three subsets of a Korean test collection to determine whether the 2-Poisson model's Z value is a good measure for selecting subject words from a document to be indexed. Subject word selection based on the Z value was effective for only one subset, the short-text Science and Technology subset. Correlation analyses between the 2-Poisson model's Z value and the TF.IDF weight for the three subsets showed that the correlation was relatively high for the two test subsets with short texts, namely the Science and Technology subset and the Newspaper subset.
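
The two quantities compared in the abstract can be illustrated with a minimal Python sketch. It assumes the Z value takes the form commonly associated with the 2-Poisson model, Z = (λ1 − λ2) / sqrt(λ1 + λ2), where λ1 and λ2 are the mean within-document frequencies of a term in the two document classes, and it assumes a plain tf × log(N/df) weighting for TF.IDF. The abstract does not state which variants the study used, and the parameter estimation step is not shown. The function names `z_value`, `tf_idf`, and `pearson` are hypothetical.

```python
import math


def z_value(lambda1: float, lambda2: float) -> float:
    """Separation of the two Poisson means (assumed form of the Z measure).

    lambda1: mean frequency of the term in documents "about" the term.
    lambda2: mean frequency of the term in the remaining documents.
    """
    return (lambda1 - lambda2) / math.sqrt(lambda1 + lambda2)


def tf_idf(tf: int, df: int, n_docs: int) -> float:
    """Plain tf * log(N / df) weight (assumed variant, natural logarithm)."""
    return tf * math.log(n_docs / df)


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up illustrative parameters for three terms; not data from the study.
    z_scores = [z_value(4.0, 0.5), z_value(2.0, 1.0), z_value(1.2, 1.0)]
    weights = [tf_idf(5, 10, 1000), tf_idf(3, 100, 1000), tf_idf(2, 500, 1000)]
    print(pearson(z_scores, weights))
```

In such a setup, a term whose two class means are far apart receives a large Z and would be a candidate subject word, and the correlation step checks how closely that ranking tracks the TF.IDF weights over the same terms.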