Search | Korea Science

Performance of Vocal Tract Area Estimation from Deaf and Normal Children's Speech (청각장애아동과 건청아동의 성도면적 추정 성능)

Kim Se-Hwan;Kim Nam;Kwon Oh-Wook
- MALSORI
- /
- no.56
- /
- pp.159-172
- /
- 2005
This paper analyzes the vocal tract area estimation algorithm used as a part of a speech analysis program to help deaf children correct their pronunciations by comparing their vocal tract shape with normal children's. Assuming that a vocal tract is a concatenation of cylinder tubes with a different cross section, we compute the relative vocal tract area of each tube using the reflection coefficients obtained from linear predictive coding. Then, we obtain the absolute vocal tract area by computing the height of lip opening with a formula modified for children's speech. Using the speech data for five Korean vowels (/a/, /e/, /i/, /o/, and /u/), we investigate the effects of the sampling frequency, frame size, and model order on the estimated vocal tract shape. We compare the vocal tract shapes obtained from deaf and normal children's speech.
PDF

Vocal Tract Area Estimation from Deaf and Normal Children's Speech (청각장애아 및 건청아 음성으로부터 성도 면적 추정)

Kim, Se-Hwan;Kwon, Oh-Wook
- Proceedings of the KSPS conference
- /
- 2005.11a
- /
- pp.51-54
- /
- 2005
This paper analyzes the vocal tract area estimation algorithm used as a part of a speech analysis program to help deaf children correct their pronunciations by comparing their vocal tract shape with normal children's. Assuming that a vocal tract is a concatenation of cylinder tubes with a different cross section, we compute the relative vocal tract area of each tube using the reflection coefficients obtained from linear predictive coding. Then, obtain the absolute vocal tract area by computing the height of lip opening with a formula modified for children's speech. Using the speech data for five Korean vowels (/a/, /e/, /i/, /o/, and /u/), we investigate the effects of the sampling frequency, frame size, and model order. We compare vocal tract shapes obtained from deaf and normal children's speech.
PDF

Vocal Tract Modeling with Unfixed Sectionlength Acoustic Tubes(USLAT) (비고정 구간 길이 음향 튜브를 이용한 성도 모델링)

Kim, Dong-Jun
- The Transactions of The Korean Institute of Electrical Engineers
- /
- v.59 no.6
- /
- pp.1126-1130
- /
- 2010
Speech production can be viewed as a filtering operation in which a sound source excites a vocal tract filter. The vocal tract is modeled as a chain of cylinders of varying cross-sectional area in linear prediction acoustic tube modeling. In this modeling the most common implementation assumes equal length of tube sections. Therefore, to model complex vocal tract shapes, a large number of tube sections are needed. This paper proposes a new vocal tract model with unfixed sectionlengths, which uses the reduced lattice filter for modeling the vocal tract. This model transforms the lattice filter to reduced structure and the Burg algorithm to modified version. When the conventional and the proposed models are implemented with the same order of linear prediction analysis, the proposed model can produce more accurate results than the conventional one. To implement a system within similar accuracy level, it may be possible to reduce the stages of the lattice filter structure. The proposed model produces the more similar vocal tract shape than the conventional one.
https://doi.org/10.5370/KIEE.2010.59.6.1126 인용 PDF KSCI

A Study on Speech Recognition using Vocal Tract Area function and Vector Quantization (성도 면적 함수와 벡터 양자화를 이용한 음성 인식에 관한 연구)

Song, Jei-Hyuck;Kim, Dong-Jun;Park, Sang-Hui
- Proceedings of the KOSOMBE Conference
- /
- v.1993 no.11
- /
- pp.171-174
- /
- 1993
We propose the vocal tract area function as the feature vector of speech recognition. Vocal tract area function is directly related to speech production. The vocal tract area function is not only showing mechanism of speech production but also can be used as an effective feature vector in speech, recognition in this study.
PDF

A Study on Vowel Formant Variation by Vocal Tract Modification (성도 변형에 따른 모음 포먼트의 변화 고찰)

Yang, Byung-Gon
- Speech Sciences
- /
- v.3
- /
- pp.83-92
- /
- 1998
Vowels are classified by vocal tract shapes. These shapes form constriction points along the tract, which have an influence on such vocal tract resonance as $F_l,\;F_2,\;F_3$, and so on. This study reviews the perturbation theory of the tract and determines the corresponding formant frequencies from modified vocal tracts using vocal tract area function. Then, formant variation is observed from the theory. Finally, each set of $F_l,\;F_2,\;and\;F_3$ frequency is input to a speech synthesis software to make a vowel sound. Auditory impression of each sound without any modification of its vocal tract shape is almost the same as the corresponding phonetic symbol. Formant frequencies of $F_l,\;F_2,\;F_3$ vary according to the perturbation theory. Generally, constriction along the node causes formant values to decrease while constriction along the anti-node cause it to increase. Vocal tracts modified by more than $3\;cm^2$ change vowel qualities of /a/ and /i/ into those of f /v/ and /$\varepsilon$/, respectively. This study will be helpful in simulating sounds from modified vocal tracts before any operation. Further studies are desirable to compare vocal tract shapes of various languages and their sounds together.
PDF

Determining the Relative Differences of Emotional Speech Using Vocal Tract Ratio

Wang, Jianglin;Jo, Cheol-Woo
- Speech Sciences
- /
- v.13 no.1
- /
- pp.109-116
- /
- 2006
In this paper, our study focuses on obtaining the differences of emotional speech in three different vocal tract sections. The vocal tract area was computed from the area function of the emotional speech. The total vocal tract was divided into 3 sections (vocal fold section, middle section and lip section) to acquire the differences in each vocal tract section of emotional speech. The experiment data include 6 emotional speeches from 3 males and 3 females. The 6 emotions consist of neutral, happiness, anger, sadness, fear and boredom. The measured difference is computed by the ratio through comparing each emotional speech with the normal speech. The experimental results present that there is not a remarkable difference at lip section, but the fear and sadness have a great change at the vocal fold part.
PDF

A Study on Speech Recognition using Vocal Tract Area Function (성도 면적 함수를 이용한 음성 인식에 관한 연구)

송제혁;김동준
- Journal of Biomedical Engineering Research
- /
- v.16 no.3
- /
- pp.345-352
- /
- 1995
The LPC cepstrum coefficients, which are an acoustic features of speech signal, have been widely used as the feature parameter for various speech recognition systems and showed good performance. The vocal tract area function is a kind of articulatory feature, which is related with the physiological mechanism of speech production. This paper proposes the vocal tract area function as an alternative feature parameter for speech recognition. The linear predictive analysis using Burg algorithm and the vector quantization are performed. Then, recognition experiments for 5 Korean vowels and 10 digits are executed using the conventional LPC cepstrum coefficients and the vocal tract area function. The recognitions using the area function showed the slightly better results than those using the conventional LPC cepstrum coefficients.
PDF

Measurement of the vocal tract area of vowels By MRI and their synthesis by area variation (MRI에 의한 모음의 성도 단면적 측정 및 면적 변이에 따른 합성 연구)

Yang, Byung-Gon
- Speech Sciences
- /
- v.4 no.1
- /
- pp.19-34
- /
- 1998
The author collected and compared midsagittal, coronal, coronal oblique, and transversal images of Korean monophthongs /a, i, e, o, u, i, v/ produced by a healthy male speaker using 1.5 T MR, VISION. Area was measured by computer software after tracing the cross-section at different points along the tract. Results showed that the width of the oral and pharyngeal cavities varied compensatorily from each other on the midsagittal dimension. Formant frequency values estimated from the area functions of the seven vowels showed a strong correlation (r=0.978) with those analyzed from the spoken vowels. Moreover, almost all of 35 students who listened to the synthesized vowels from area data perceived the synthesized vowels as equivalent to the spoken ones. Movement of constriction points of vowel /u/ with wider lip opening sounded /i/ and led to slight changes in vowel quality. Jaw and tongue movement led to major volume variation with an anatomical limitation. Each comer vowel varied systematically from a somewhat constant volume of the average area. Thus, the author proposed that any simulation studies related to vocal tract area variation should reflect its constant volume. The results may be helpful to verify exact measurement of the vocal tract area through vowel synthesis and a simulation study before having any operation of the vocal tract.
PDF

A Vowel Discrimination of Korean Monophthongs [i, e, a, o, u, ${\omega}$] Using Vocal Tract Magnetic Resonance Image and F1/F2 (성도 자기공명 영상과 음향정보(F1/F2)를 이용한 한국어 단모음 [이, 에, 아, 오, 우, 으] 판별)

Seong, Cheol-Jae;Park, Jong-Won;Kim, Gui-Ryong
- MALSORI
- /
- no.56
- /
- pp.103-125
- /
- 2005
We present a new method of measuring the volume and cross-sectional area of the vocal tract from magnetic resonance images. The vocal tract was divided by the 2 constriction points on the horizontal and vertical planes. The ratios of the volumes of the segment vocal tracts to that of the entire vocal tract play a crucial role in discriminating Korean monophthongs in that vowels were successfully discriminated by the ratios. The discriminant analysis also demonstrated that the acoustic parameters F1 and F2, in addition to the segment volumes, serve as significant parameters in discriminating Korean monophthongs.
PDF

A study on speech training aids for Deafs (청각장애자용 발음훈련기기 개발에 관한 연구)

Ahn, Sang-Pil;Lee, Jae-Hyuk;Yoon, Tae-Sung;Park, Sang-Hui
- Proceedings of the KIEE Conference
- /
- 1990.07a
- /
- pp.47-50
- /
- 1990
Deafs cannot speak straight voice as normal people in lack of feedback of their pronunciation, therefore speech training is required. In this study, fundamental frequency, intensity, formant frequencies, vocal tract graphic and vocal tract area function, extracted from speech signal, are used as feature parameter. AR model, whose coefficients are extracted using inverse filtering. is used as speech generation model. In connect ion between vocal tract graphic and speech parameter, articulation distances and articulation distance functions in selected 15-intervals are determined by extracted vocal tract areas and formant frequencies.
PDF

Search Result 20, Processing Time 0.023 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)