• Title/Summary/Keyword: vocal tract ratio

Search Result 9

Determining the Relative Differences of Emotional Speech Using Vocal Tract Ratio

  • Wang, Jianglin;Jo, Cheol-Woo
    • Speech Sciences / v.13 no.1 / pp.109-116 / 2006
  • This study examines how emotional speech differs across three sections of the vocal tract. The vocal tract area was computed from the area function of the emotional speech. The total vocal tract was divided into 3 sections (vocal fold section, middle section and lip section) to measure the differences in each section. The experimental data comprise speech in 6 emotions from 3 males and 3 females. The 6 emotions are neutral, happiness, anger, sadness, fear and boredom. The difference is measured as a ratio obtained by comparing each emotional speech with the normal speech. The experimental results show no remarkable difference at the lip section, whereas fear and sadness show a large change at the vocal fold section.
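
The per-section ratio described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the equal-length split into three sections, the ordering from vocal folds to lips, the use of the mean area per section, and the function name `section_ratios` are all assumptions, and the area values are invented for illustration.

```python
def section_ratios(emotional_area, normal_area, n_sections=3):
    """Return the emotional/normal ratio of mean area for each section.

    Both inputs are area-function samples ordered from the vocal folds
    to the lips (an assumed ordering). Sections are equal-length slices,
    which is also an assumption.
    """
    if len(emotional_area) != len(normal_area):
        raise ValueError("area functions must have the same length")
    n = len(normal_area)
    if n < n_sections:
        raise ValueError("need at least one sample per section")
    # Section boundaries, e.g. [0, 3, 6, 9] for 9 samples and 3 sections.
    bounds = [round(i * n / n_sections) for i in range(n_sections + 1)]
    ratios = []
    for start, end in zip(bounds, bounds[1:]):
        emo_mean = sum(emotional_area[start:end]) / (end - start)
        norm_mean = sum(normal_area[start:end]) / (end - start)
        ratios.append(emo_mean / norm_mean)
    return ratios


# Invented area values (cm^2): only the vocal fold section differs.
normal = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 4.0, 4.0, 4.0]
fear = [1.5, 1.5, 1.5, 2.0, 2.0, 2.0, 4.0, 4.0, 4.0]
ratios = section_ratios(fear, normal)
print(ratios)
```

A ratio near 1 means the section is close to normal speech; values far from 1 mark the sections where the emotion changes the vocal tract shape most.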

Effects of Semi-Occluded Vocal Tract Exercise in Patients with Functional Aphonia (반폐쇄성도훈련이 기능적 실성증 환자의 음성 개선에 미치는 효과)

  • Chae, Hye Rim;Kim, Ji sung;Lee, Dong Wook;Choi, Soeng Hee
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.30 no.1 / pp.48-52 / 2019
  • Background and Objectives: Functional aphonia is characterized by incomplete closure of the vocal folds. Semi-occluded vocal tract exercise (SOVTE) allows the vocal folds to collide smoothly, without damage to vocal fold tissue, to produce normal vocal intensity. The purpose of this study is to report the effect of SOVTE in patients with functional aphonia. Materials and Method: Seven patients diagnosed with functional aphonia were treated with 1-3 voice therapy sessions using the SOVTE techniques of voiced lip trill, humming, and Lax Vox. To assess the effectiveness of SOVTE, cepstral analysis and auditory perceptual assessment were performed before and after voice therapy. Results: F0 (fundamental frequency), CPP (cepstral peak prominence) and L/H ratio (low/high spectral ratio) increased significantly, while the standard deviations of CPP and L/H ratio decreased. In addition, 'Grade', 'Breathiness' and 'Asthenia' on the GRBAS scale decreased significantly after SOVTE (p<0.05). Conclusion: In our study, SOVTE seemed to be effective in eliciting voice quickly and promoting vocal fold vibration without muscular effort in patients with functional aphonia.

An Amplitude Warping Approach to Intra-Speaker Normalization for Speech Recognition (음성인식에서 화자 내 정규화를 위한 진폭 변경 방법)

  • Kim Dong-Hyun;Hong Kwang-Seok
    • Journal of Internet Computing and Services / v.4 no.3 / pp.9-14 / 2003
  • Vocal tract normalization is a successful method for improving the accuracy of inter-speaker normalization. In this paper, we present an intra-speaker warping factor estimation based on pitch-altered utterances. The feature space distributions of untransformed speech from a single speaker's pitch-altered utterances vary because of acoustic differences in the speech produced by the glottis and vocal tract. Utterance variation is of two types: frequency variation and amplitude variation. Among inter-speaker normalization methods, vocal tract normalization is a frequency normalization. Therefore, amplitude variation must also be considered, and it may be possible to determine the amplitude warping factor by calculating the inverse ratio of input to reference pitch. In the recognition results, the reduction in error rate ranges from 0.4% to 2.3% for digit and word decoding.
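
As a rough illustration of the "inverse ratio of input to reference pitch" mentioned in this abstract, the sketch below computes such a factor. The abstract does not say how the factor is applied to the features, so only the ratio itself is shown; the function name and the pitch values are hypothetical.

```python
def amplitude_warping_factor(input_pitch_hz, reference_pitch_hz):
    """Inverse of the input-to-reference pitch ratio.

    input/reference inverted gives reference/input. How this factor is
    applied during recognition is not stated in the abstract.
    """
    if input_pitch_hz <= 0 or reference_pitch_hz <= 0:
        raise ValueError("pitch values must be positive")
    return reference_pitch_hz / input_pitch_hz


# Hypothetical values: an utterance spoken at twice the reference pitch.
factor = amplitude_warping_factor(input_pitch_hz=240.0, reference_pitch_hz=120.0)
print(factor)
```

With these values the factor is below 1 because the input pitch is higher than the reference; an input at the reference pitch gives exactly 1.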

The characteristics of soprano students' voice related to the vocal methods (발성방법에 따른 소프라노 성악도의 음성 특성)

  • Kim, Jungtaek;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.9 no.3 / pp.75-83 / 2017
  • The purpose of this study is to find clues to the risk of voice disorders in soprano students. The subjects were 17 soprano students and 18 general students (women). Each group was recorded phonating the vowels /a/, /i/, and /u/ at the notes C4 and F4. The soprano students alone were then recorded using classical vocalization with vibrato. Formant, formant energy, bandwidth, VAI (vowel area index), VSA (vowel space area) and L/H ratio were analyzed. There was a significant difference in F3: the singers' F3 was measured at around 3 kHz, about 400 Hz higher than that of the general students. However, there was no significant difference in L/H ratio between the soprano students and the general students. There was also a significant difference in F3 between the soprano students' two vocalization methods: in classical vocalization, F3 was about 200 Hz higher than in sustained phonation. The vocal tract was adjusted and the vowel space changed, but there was no significant difference by phonation method in F3 energy, which is the index of the singer's formant. The L/H ratio, which can be a direct indicator of vocal effort, did not differ by phonation method and decreased in all phonation methods as the pitch increased. The C4 and F4 pitches are lower than the singing range of a soprano. When the pitch changes, vocal effort increases as it does in general students, which may be an indicator of vocalization risk. This may be a clue to the vocalization of immature soprano students.

Voice therapy for pitch problems following thyroidectomy without laryngeal nerve injury (신경학적 손상이 없는 갑상선 술 후 음도문제의 음성치료)

  • Ji-sung Kim;Mi-jin Kim
    • Phonetics and Speech Sciences / v.15 no.3 / pp.53-58 / 2023
  • After thyroidectomy, some patients who show normal vocal cord movement still complain of subjective voice problems, which could lead to a decrease in quality of life related to communication. This study aims to investigate the effectiveness of a newly designed voice therapy applying neck exercise and semi-occluded vocal tract exercise (SOVTE) to improve voice problems after thyroidectomy without neurological injury. For this purpose, voice therapy was randomly assigned to 10 women who received thyroidectomy. Acoustic analysis [fundamental frequency, jitter, shimmer, noise-to-harmonics ratio, min Voice Range Profile (VRP), max VRP, VRP] was performed before and after surgery and immediately after voice therapy to compare voice changes. The study showed a statistically significant increase in max VRP and VRP after voice therapy compared to before surgery. These results suggest that the voice therapy methods in this study effectively improve a major symptom of voice problems after thyroidectomy, specifically the reduction in the high-frequency range. However, this study was limited in the number of participants and did not control for the type of surgery. Therefore, further research utilizing larger sample sizes and controlled variables is needed to investigate the long-term effects of voice therapy.

On A Pitch Alteration using the Waveform Symmetry with Time - Frequency Conversion (시간 - 주파수 변환에 의한 파형 대칭 피치변경법)

  • 박형빈
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06c / pp.147-150 / 1998
  • In speech synthesis, high-quality waveform coding is mainly used for synthesis by analysis. Because the parameters of this coding method are not separated into excitation and vocal tract parameters, it is difficult to apply waveform coding to synthesis by rule. To do so, pitch alteration is required for prosody control. In speech synthesis by the conventional PSOLA technique, applying a symmetric window function to an asymmetric speech waveform causes an energy imbalance that depends on the degree of overlap in the pitch interval adjustment. In this paper, to overcome this energy imbalance, we propose a new method that converts an asymmetric waveform into a symmetric one by time-frequency conversion. As a result, we obtain an average spectrum distortion ratio of 6.38% according to the pitch alteration ratio.

A Study on Formants of Vowels for Speaker Recognition (화자 인식을 위한 모음의 포만트 연구)

  • Ahn Byoung-seob;Shin Jiyoung;Kang Sunmee
    • MALSORI / no.51 / pp.1-16 / 2004
  • The aim of this paper is to analyze vowels in voice imitation and disguised voice, and to find the invariable phonetic features of the speaker. We examined the formants of the monophthongs /a, u, i, o, $\omega$, $\varepsilon$, $\Lambda$/. The results of the present study are as follows: (1) Speakers change their vocal tract features. (2) The vowels /a, $\varepsilon$, i/ appear to be suitable for speaker recognition since they show invariable acoustic features during voice modulation. (3) F1 does not change easily compared to higher formants. (4) F3-F2 appears to be a useful component for speaker identification in the vowels /a/ and /$\varepsilon$/, and F4-F2 in the vowel /i/. (5) According to the F-ratio, differences between formants were more useful for speaker recognition than the individual formants of a vowel.
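
The abstract does not define its F-ratio. A common definition in speaker recognition is the variance of the speaker means divided by the average within-speaker variance, and the sketch below assumes that definition. The speaker labels and the F3-F2 values (in Hz) are invented for illustration.

```python
from statistics import mean, pvariance


def f_ratio(values_by_speaker):
    """Between-speaker variance of means over mean within-speaker variance.

    `values_by_speaker` maps a speaker label to repeated measurements of
    one feature (for example F3-F2 of a vowel). Assumed definition, not
    necessarily the one used in the paper.
    """
    speaker_means = [mean(v) for v in values_by_speaker.values()]
    between = pvariance(speaker_means)
    within = mean([pvariance(v) for v in values_by_speaker.values()])
    if within == 0:
        raise ValueError("within-speaker variance is zero")
    return between / within


# Invented F3-F2 measurements (Hz) for two hypothetical speakers.
f3_minus_f2 = {
    "speaker_A": [1000.0, 1020.0, 980.0],
    "speaker_B": [1300.0, 1310.0, 1290.0],
}
score = f_ratio(f3_minus_f2)
print(score)
```

A larger value means the feature separates speakers well relative to how much it varies within one speaker, which is the sense in which a formant difference can be "more useful" than an individual formant.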

An Acoustical Study of English Diphthongs Produced by American Males and Females (미국인 남성과 여성이 발음한 영어이중모음의 음향적 연구)

  • Yang, Byung-Gon
    • Phonetics and Speech Sciences / v.2 no.2 / pp.43-50 / 2010
  • English vowels can be divided into monophthongs and diphthongs depending on the number of vocal tract shapes. Diphthongs are usually produced with more than one shape. This study collects acoustical data on English diphthongs published online by Hillenbrand et al. (1995) and examines acoustic features of the diphthongs for phoneticians and English teachers. Sixty-three American males and females were chosen after excluding subjects with different target vowels or ambiguous formant tracks. The author used Praat to obtain the acoustical data systematically at eleven equidistant timepoints over the diphthongal segment. Obvious errors were corrected based on the spectrographic display of each diphthong. Results show that the formant trajectories of the diphthongs produced by the American males and females appeared quite similar. When the female formant values were uniformly normalized to those of the males, an almost perfect collapse occurred. Secondly, the diphthongal movements in the vowel space appeared nonlinear because of the coarticulatory gesture for the following consonant. Thirdly, the average duration of the diphthongs produced by the females was 1.156 times longer than that of the males, while the pitch ratio between the two groups turned out to be 1.746 with a similar contour over the measurement points. The author concludes that English diphthongs produced by various groups can be compared systematically when the acoustical values are obtained at proportional timepoints. Further studies comparing English diphthongs produced by native and nonnative speakers would be desirable.
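
The two procedures named in this abstract, sampling at eleven equidistant timepoints and uniform normalization of female values to male values, can be sketched as follows. This is an illustrative sketch only: the abstract does not give the scaling factor, so the ratio of group means used here is an assumption, and the function names, segment times and formant values are hypothetical.

```python
def equidistant_timepoints(start_s, end_s, n_points=11):
    """Return n_points evenly spaced times from start_s to end_s inclusive."""
    if n_points < 2:
        raise ValueError("need at least two timepoints")
    step = (end_s - start_s) / (n_points - 1)
    return [start_s + i * step for i in range(n_points)]


def uniform_normalize(female_hz, male_hz):
    """Scale all female values by one factor so the group means match.

    Using the ratio of means as the single factor is an assumption.
    """
    factor = (sum(male_hz) / len(male_hz)) / (sum(female_hz) / len(female_hz))
    return [f * factor for f in female_hz]


# Hypothetical diphthong segment from 0.10 s to 0.30 s.
times = equidistant_timepoints(0.10, 0.30)
# Invented formant values (Hz) at matching measurement points.
scaled = uniform_normalize(female_hz=[600.0, 1800.0], male_hz=[500.0, 1500.0])
print(times)
print(scaled)
```

Sampling at proportional positions rather than fixed absolute times is what lets segments of different durations be compared point by point.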

A Study on A Multi-Pulse Linear Predictive Filtering And Likelihood Ratio Test with Adaptive Threshold (멀티 펄스에 의한 선형 예측 필터링과 적응 임계값을 갖는 LRT의 연구)

  • Lee, Ki-Yong;Lee, Joo-Hun;Song, Iick-Ho;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea / v.10 no.1 / pp.20-29 / 1991
  • A fundamental assumption in the conventional linear predictive coding (LPC) analysis procedure is that the input to an all-pole vocal tract filter is a white process. In the case of periodic inputs, however, a pitch bias error is introduced into the conventional LP coefficients. Multi-pulse (MP) LP analysis can reduce this bias, provided that an estimate of the excitation is available. Since the prediction error of conventional LP analysis can be modeled as the sum of an MP excitation sequence and a random noise sequence, we can view extracting MP sequences from the prediction error as a classical detection and estimation problem. In this paper, we propose an algorithm in which the locations and amplitudes of the MP sequences are first obtained by applying a likelihood ratio test (LRT) to the prediction error, and LP coefficients free of pitch bias are then obtained from the MP sequences. To verify the performance enhancement, we iterate the above procedure with an adaptive threshold at each step.
