MALSORI (Journal of the Korean Society of Phonetic Sciences and Speech Technology)
- Issue 43
- Pages 137-150
- 2002
- 1226-1173 (pISSN)
Acoustic Modeling and Energy-Based Postprocessing for Automatic Speech Segmentation
Abstract
Speech segmentation at the phoneme level is important for corpus-based text-to-speech synthesis. In this paper, we examine acoustic modeling methods for improving the performance of an automatic speech segmentation system based on hidden Markov models (HMMs). We compare monophone and triphone models and evaluate several model training approaches. In addition, we employ an energy-based postprocessing scheme to correct frequent boundary location errors between silence and speech sounds. Experimental results show that our system places 71.3% and 84.2% of boundaries correctly given tolerances of 10 ms and 20 ms, respectively.
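The tolerance-based figures above can be read as the fraction of automatically placed boundaries that fall within a given distance of a reference boundary. The following is a minimal sketch of such a metric, not the authors' evaluation code. It assumes that automatic and reference boundaries are paired one-to-one (as when both segmentations share the same phoneme sequence) and that a deviation exactly equal to the tolerance counts as correct. The function name `boundary_accuracy` and the sample times are hypothetical.

```python
def boundary_accuracy(auto_ms, ref_ms, tolerance_ms):
    """Return the fraction of automatic boundaries within tolerance_ms
    of their paired reference boundaries (all times in milliseconds)."""
    # Assumption: the i-th automatic boundary corresponds to the i-th reference boundary.
    if len(auto_ms) != len(ref_ms):
        raise ValueError("boundary lists must be aligned one-to-one")
    if not ref_ms:
        raise ValueError("no boundaries to score")
    # Assumption: the tolerance is inclusive (deviation <= tolerance counts as correct).
    hits = sum(1 for a, r in zip(auto_ms, ref_ms) if abs(a - r) <= tolerance_ms)
    return hits / len(ref_ms)


# Made-up boundary times in milliseconds, for illustration only.
reference = [120, 250, 400, 610]
automatic = [125, 265, 398, 640]

acc_10 = boundary_accuracy(automatic, reference, 10)
acc_20 = boundary_accuracy(automatic, reference, 20)
print(f"within 10 ms: {acc_10:.1%}, within 20 ms: {acc_20:.1%}")
```

With the sample data, the absolute deviations are 5, 15, 2, and 30 ms, so two of four boundaries fall within 10 ms and three of four within 20 ms.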
Keywords