Speech Sciences (음성과학)
- Volume 10 Issue 2
- Pages 7-25
- 2003
- 1226-5276(pISSN)
Separation of Voiced Sounds and Unvoiced Sounds for Corpus-based Korean Text-To-Speech
한국어 음성합성기의 성능 향상을 위한 합성 단위의 유무성음 분리
- Published : 2003.06.01
Abstract
Accurate prediction of prosodic elements is a key factor in improving the quality of synthesized speech. Prosodic elements include breaks, pitch, duration, and loudness. Pitch, which is realized as the fundamental frequency (F0), is the most important of these for the quality of synthesized speech. However, the previous method for predicting F0 has some problems. If voiced and unvoiced sounds are not classified correctly, the result is incorrect pitch prediction, selection of the wrong triphone unit when synthesizing voiced and unvoiced sounds, and audible clicks or vibration. These errors commonly occur at transitions from a voiced sound to an unvoiced sound or from an unvoiced sound to a voiced sound. Such problems cannot be resolved by grammar-based methods, and they strongly affect the synthesized sound. Therefore, to obtain correct pitch values reliably, this paper proposes a new model for predicting and classifying voiced and unvoiced sounds using the CART (classification and regression tree) tool.
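To make the idea of tree-based voiced/unvoiced classification concrete, the following is a minimal sketch of a CART-style classification tree that chooses binary splits by Gini impurity. It is not the model from the paper: the abstract does not state the feature set, so the two features used here (zero-crossing rate and normalized energy per frame), the toy training values, and the function names `gini`, `best_split`, `build_tree`, and `predict` are all assumptions made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure set)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a tree: inner nodes are (feature, threshold, left, right) tuples,
    leaves are the majority class label (a string)."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )


def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Hypothetical toy frames: (zero-crossing rate, normalized energy).
# These values are invented for illustration, not taken from the paper.
ROWS = [
    (0.05, 0.80), (0.08, 0.70), (0.10, 0.90), (0.12, 0.60),  # voiced
    (0.40, 0.10), (0.55, 0.20), (0.35, 0.15), (0.50, 0.05),  # unvoiced
]
LABELS = ["voiced"] * 4 + ["unvoiced"] * 4

tree = build_tree(ROWS, LABELS)
print(predict(tree, (0.07, 0.75)))
print(predict(tree, (0.45, 0.12)))
```

On this separable toy data a single split is enough, so the tree is only one level deep. A real system would train on labeled corpus data with whatever contextual and acoustic features the authors selected, and would likely need pruning to avoid overfitting.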