Journal of the Korean Society for Library and Information Science (한국문헌정보학회지)
Volume 6, Pages 87-103, 1979, pISSN 1225-598X
Shannon's Information Theory and Document Indexing
Shannon의 정보이론과 문헌정보
Abstract
Information storage and retrieval is part of the general communication process. In Shannon's information theory, the information contained in a message is a measure of uncertainty about the information source, and the amount of information is measured by entropy. Indexing is a process that reduces the entropy of the information source, because the document collection is divided into many smaller groups according to the subjects the documents deal with. The significant concepts contained in each document are mapped into the set of all sets of index terms. The index itself is thus formed by paired sets of index terms and documents. Without indexing, the entropy of a document collection consisting of N documents is
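As a rough illustration of the entropy reduction the abstract describes, the following minimal sketch compares the uncertainty about which document is sought before and after a collection is split into subject groups. It rests on assumptions the abstract does not state: every document is taken to be equally likely to be the target, and a group's probability is taken to be proportional to its size. The function names and the example collection (8 documents in 4 groups of 2) are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def collection_entropy(n_documents):
    """Entropy of an unindexed collection.

    ASSUMPTION: each of the n documents is equally likely to be the one sought.
    """
    return entropy([1.0 / n_documents] * n_documents)


def remaining_entropy(group_sizes):
    """Expected entropy left once the subject group is known.

    ASSUMPTION: a group is chosen with probability proportional to its size,
    and documents within a group are equally likely.
    """
    total = sum(group_sizes)
    return sum((size / total) * math.log2(size) for size in group_sizes if size > 0)


# Hypothetical collection: 8 documents indexed into 4 subject groups of 2.
before = collection_entropy(8)
after = remaining_entropy([2, 2, 2, 2])
reduction = before - after

print(f"entropy without indexing: {before:.2f} bits")
print(f"entropy within a known subject group: {after:.2f} bits")
print(f"reduction due to indexing: {reduction:.2f} bits")
```

Under these assumptions, the reduction equals the entropy of the grouping itself, so finer and more evenly sized subject groups remove more uncertainty.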
Keywords