MALSORI (대한음성학회지:말소리)
- Issue 67
- Pages 149-166
- 2008
- 1226-1173 (pISSN)
N-gram Based Robust Spoken Document Retrievals for Phoneme Recognition Errors
Korean title: 음소인식 오류에 강인한 N-gram 기반 음성 문서 검색 (N-gram based spoken document retrieval robust to phoneme recognition errors)
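- Lee, Su-Jang (Speech Interface Laboratory, Computer Science Division, Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology (KAIST))
- Park, Kyung-Mi (Speech Interface Laboratory, Computer Science Division, Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology (KAIST))
- Oh, Yung-Hwan (Speech Interface Laboratory, Computer Science Division, Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology (KAIST))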
- Published : 2008.09.30
Abstract
In spoken document retrieval (SDR), subword units (typically phonemes) are used as indexing terms to avoid the out-of-vocabulary (OOV) problem. This makes the indexing and retrieval process independent of any vocabulary, and it requires only a small corpus to train the acoustic model. However, the subword indexing approach has a major drawback: it shows higher word error rates than a large vocabulary continuous speech recognition (LVCSR) system. In this paper, we propose a probabilistic slot detection and n-gram based string matching method for phone-based spoken document retrieval to overcome the high error rates of the phone recognizer. Experimental results show a 9.25% relative improvement in mean average precision (mAP) with a 1.7-fold speedup in comparison with the baseline system.
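To illustrate why n-gram based matching can tolerate phone recognition errors, the following is a minimal sketch, not the authors' implementation. It scores a query against a document by the fraction of query phone n-grams that also occur in the document. The function names, the phone symbols, and the choice of clipped-count overlap as the score are all assumptions made for illustration; the abstract does not specify the scoring function, and the probabilistic slot detection step is not modeled here.

```python
from collections import Counter


def phone_ngrams(phones, n=2):
    """Return a multiset of phone n-grams from a list of phone symbols."""
    return Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))


def ngram_match_score(query_phones, doc_phones, n=2):
    """Fraction of query n-grams found in the document (hypothetical score).

    Counts are clipped, so a query n-gram is credited at most as many
    times as it occurs in the document.
    """
    q = phone_ngrams(query_phones, n)
    d = phone_ngrams(doc_phones, n)
    total = sum(q.values())
    if total == 0:
        # Query shorter than n phones: no n-grams to match.
        return 0.0
    matched = sum(min(count, d[gram]) for gram, count in q.items())
    return matched / total


# Illustrative phone sequences (assumed, not taken from the paper).
query = "k ae t".split()
doc_clean = "dh ax k ae t s ae t".split()
doc_noisy = "dh ax k eh t s ae t".split()  # one substituted phone

print(ngram_match_score(query, doc_clean))  # 1.0
print(ngram_match_score(query, doc_noisy))  # 0.5
```

In this toy case an exact substring search for the query would fail on the noisy document, while the n-gram score still gives partial credit, which is the general intuition behind matching on n-grams when the recognizer output is unreliable.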