Penn Korean Treebank: Development and Evaluation

  • Han, Chung-hye (Dept. of Linguistics, Simon Fraser University, 8888 University Drive, Burnaby BC V5A 1S6, Canada) ;
  • Han, Na-Rae (Dept. of Linguistics, University of Pennsylvania, 619 Williams Hall, Philadelphia, PA 19104, USA) ;
  • Ko, Eon-Suk (Dept. of Linguistics, University of Pennsylvania, 619 Williams Hall, Philadelphia, PA 19104, USA) ;
  • Palmer, Martha (Dept. of Computer and Information Science, University of Pennsylvania, 256 Moore School, Philadelphia, PA 19104, USA) ;
  • Yi, Heejong (Dept. of Linguistics, University of Delaware, 46 E. Delaware Ave., Newark, DE 19716, USA)
  • Published : 2002.02.01

Abstract

This paper discusses issues in building a 54-thousand-word Korean Treebank with phrase structure annotation, and in developing annotation guidelines based on the morpho-syntactic phenomena represented in the corpus. It also presents the methods employed for quality control and the evaluation of the Treebank.

Keywords