Improvement and Evaluation of the Korean Large Vocabulary Continuous Speech Recognition Platform (ECHOS)

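  • Seok-Bong Kwon (Information and Communications University) ;
  • Sung-Rack Yun (KAIST, Department of Electrical Engineering and Computer Science) ;
  • Gyu-Cheol Jang (KAIST, Department of Electrical Engineering and Computer Science) ;
  • Yong-Rae Kim (Chungbuk National University, Department of Control and Instrumentation Engineering) ;
  • Bong-Wan Kim (Wonkwang University, Speech Information Technology and Industry Promotion Center) ;
  • Hoi-Rin Kim (Information and Communications University) ;
  • Chang-Dong Yoo (KAIST, Department of Electrical Engineering and Computer Science) ;
  • Yong-Ju Lee (Wonkwang University, Speech Information Technology and Industry Promotion Center) ;
  • Oh-Wook Kwon (Chungbuk National University, School of Electrical and Computer Engineering)
  • Published: 2006.09.30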

Abstract

We report evaluation results for ECHOS, a Korean speech recognition platform. The platform has an object-oriented, reusable architecture so that researchers can easily evaluate their own algorithms. It includes all the essential modules needed to build a large vocabulary speech recognizer: noise reduction, end-point detection, feature extraction, hidden Markov model (HMM)-based acoustic modeling, cross-word modeling, n-gram language modeling, n-best search, word graph generation, and Korean-specific language processing. The platform supports both lexical search trees and finite-state networks. It performs a word-dependent n-best search with a bigram language model in the forward search stage, and rescores the resulting lattice with a trigram language model in the backward stage. In an 8000-word continuous speech recognition task, the platform with a lexical tree produces 40% more word errors but takes 50% less recognition time than the HTK platform with a flat lexicon. Incorporating cross-word modeling reduces the recognition errors of ECHOS by 40%. When the number of Gaussian mixtures is increased to 16, ECHOS yields word accuracy comparable to that of Julius, an earlier lexical tree-based platform.
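To illustrate why a lexical tree can cut recognition time relative to a flat lexicon, the following is a minimal sketch, not code from ECHOS. It assumes a toy four-word English lexicon with made-up phone strings; the names `Node`, `build_lexical_tree`, `count_tree_arcs`, `count_flat_arcs`, and `TOY_LEXICON` are hypothetical. It only counts phone arcs: words that share a pronunciation prefix share arcs in the tree, so the search has fewer arcs to evaluate. It does not model the accuracy trade-off reported in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in the lexical (prefix) tree; each edge carries one phone."""
    children: dict = field(default_factory=dict)  # phone -> Node
    words: list = field(default_factory=list)     # words whose pronunciation ends here


def build_lexical_tree(lexicon):
    """Merge pronunciations that share a prefix into a single tree."""
    root = Node()
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            # Reuse the existing arc for this phone, or create a new one.
            node = node.children.setdefault(phone, Node())
        node.words.append(word)
    return root


def count_tree_arcs(node):
    """Count phone arcs in the tree (shared prefixes are counted once)."""
    return sum(1 + count_tree_arcs(child) for child in node.children.values())


def count_flat_arcs(lexicon):
    """Count phone arcs in a flat lexicon (every word keeps its own copy)."""
    return sum(len(phones) for phones in lexicon.values())


# Hypothetical toy lexicon, for illustration only.
TOY_LEXICON = {
    "speech": ["s", "p", "iy", "ch"],
    "speed": ["s", "p", "iy", "d"],
    "spin": ["s", "p", "ih", "n"],
    "tree": ["t", "r", "iy"],
}

tree = build_lexical_tree(TOY_LEXICON)
flat_arcs = count_flat_arcs(TOY_LEXICON)
tree_arcs = count_tree_arcs(tree)
print(f"flat lexicon arcs: {flat_arcs}, lexical tree arcs: {tree_arcs}")
```

In this toy case the three words beginning with "s p" share their first arcs, so the tree needs fewer arcs than the flat lexicon. In a tree the word identity is known only once a word-end node is reached, which is the usual reason language model scores are applied differently in tree-based and flat-lexicon decoders.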

Keywords