Proceedings of the Korean Society for Language and Information Conference
- Korean Society for Language and Information, 2002: Language, Information, and Computation, Proceedings of the 16th Pacific Asia Conference
- Pages 217-226
- 2002
Generating a Category Set of Words Using a Hierarchical Part-of-speech System and Tagged Corpus
- Kojima, Takeyuki (Dept. of Computer, Information and Communication Science, Tokyo University of Agric. and Tech.);
- Kotani, Yoshiyuki (Dept. of Computer, Information and Communication Science, Tokyo University of Agric. and Tech.)
- Published: 2002.02.01
Abstract
In this paper, we propose a method of generating a proper categorization of morphemes, given a hierarchical part-of-speech system and a corpus tagged with that system. Our method uses hierarchical information in the part-of-speech system and statistical information in the corpus to generate a category set. The statistical information is based on the contexts in which categories occur. First, we specify the format of the given information. Then, we describe an algorithm to generate a proper categorization. Finally, we present the results of our experiments in applying this method. We obtained a moderately proper categorization and found several candidates for improvement.
Keywords