• Title/Summary/Keyword: female speakers


The Effects of Pitch Increasing Training (PIT) on Voice and Speech of a Patient with Parkinson's Disease: A Pilot Study

  • Lee, Ok-Bun;Jeong, Ok-Ran;Shim, Hong-Im;Jeong, Han-Jin
    • Speech Sciences / v.13 no.1 / pp.95-105 / 2006
  • The primary goal of therapeutic intervention for dysarthric speakers is to increase speech intelligibility. Deciding which features are critical to increasing intelligibility is very important in speech therapy. The purpose of this study is to examine the effects of pitch increasing training (PIT) on the speech of a subject with Parkinson's disease (PD). The PIT program focuses on increasing pitch while a vowel is sustained at the same loudness. The loudness level is somewhat higher than the habitual loudness. A 67-year-old female with PD participated in the study. Speech therapy was conducted in 4 sessions (200 minutes) over one week. Before and after the treatment, acoustic, perceptual, and speech naturalness evaluations were performed for data analysis. A speech and voice satisfaction index (SVSI) was obtained after the treatment. Results showed improvements in voice quality and speech naturalness. In addition, the satisfaction ratings (SVSI) indicated a positive relationship between improved speech production and the satisfaction of the patient and caregivers.


Computation of Laryngeal Flow and Sound through a Dynamic Model of the Vocal Folds (동적 성대 모델을 이용한 후두 내 유동 및 음향장에 대한 수치 연구)

  • Bae, Young-Min;Moon, Young-J.
    • 한국전산유체공학회:학술대회논문집 / 2008.03b / pp.21-24 / 2008
  • The present study numerically investigates the glottal airflow characteristics as well as the acoustic features of phonation, fully coupled with the dynamic behavior of the vocal folds. The vocal folds are described by a low-dimensional body-cover model characterized by biomechanical parameters such as glottal width, vocal fold stiffness, and subglottal pressure. The flow in the vocal tract is modeled by an incompressible, axisymmetric form of the Navier-Stokes equations (INS), while the acoustic field is predicted by the linearized perturbed compressible equations (LPCE). The computed result shows that a two-mass model of the vocal folds is sufficient to reproduce temporal variations in oral airflow and glottis motion produced by female speakers. It is also found that i) the glottal width has a significant effect on the amplitude of glottal flow, and thus on the amplitude of the acoustic wave in the vocal tract, ii) the vocal fold tension is the main control parameter for the fundamental frequency of phonation, iii) the subglottal pressure plays an appreciable role in reproducing the self-sustained oscillation of the vocal folds, and iv) the strength of pulsating airflow and vortical structures is primarily affected by glottal width and subglottal pressure, and is closely related to pitch, loudness, and voice quality. Finally, a more comprehensive explanation of the difference between one- and two-mass models is presented, with discussion of the effectiveness of vocal fold oscillation and voice quality.


Phonological processes of vowels from orthographic to pronounced words in the Buckeye Corpus by sex and age groups

  • Yang, Byunggon
    • Phonetics and Speech Sciences / v.10 no.2 / pp.25-31 / 2018
  • This paper investigated the phonological processes of monophthongs and diphthongs in the pronounced words present in the Buckeye Corpus and compared the frequency distribution of these processes by sex and age groups to provide a clearer understanding of spoken English to linguists and phoneticians. Both orthographic and pronounced words were extracted from the transcribed label scripts of the Buckeye Corpus using R. Next, the phonological processes of monophthongs and diphthongs in the orthographic and pronounced labels were tabulated using R scripts, and a frequency distribution by vowel process types, as well as sex and age groups, was created. The results revealed that 95% of the orthographic words contained the same number of syllables, whereas 5% had different numbers of vowels, thereby proving that speakers tend to preserve vowels in spontaneous speech. In addition, deletion processes were preferred in natural speech. Most vowel deletions occurred with an unstressed syllable. Chi-square tests were performed to calculate dependence in the distribution of phonological process types for male and female groups and young and old groups. The results showed a very strong correlation. This finding indicates that vowel processes occurred in approximately the same pattern in natural and spontaneous speech data regardless of sex and age, as well as whether or not the vowel processes were identical. Based on these results, the author concludes that an analysis of phonological processes in spontaneous speech corpora can greatly enhance practical understanding of spoken English.
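    A chi-square test of independence on a group-by-process frequency table, as mentioned in the abstract above, can be sketched as follows. This is only an illustration in Python, not the author's R scripts; the function name `chi_square_independence` and the counts are hypothetical, and Cramér's V is added here as one common effect-size measure that the abstract does not name.

```python
def chi_square_independence(table):
    """Return (chi2, dof, cramers_v) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    # Cramér's V rescales chi2 to the range 0..1.
    k = min(len(row_totals), len(col_totals)) - 1
    cramers_v = (chi2 / (n * k)) ** 0.5
    return chi2, dof, cramers_v


# Hypothetical counts (NOT from the paper): rows are sex groups,
# columns are process types (deletion, substitution, insertion).
counts = [[120, 60, 20],
          [118, 62, 20]]
chi2, dof, v = chi_square_independence(counts)
print(f"chi2={chi2:.3f}, dof={dof}, Cramer's V={v:.3f}")
```

    A small chi-square statistic relative to its degrees of freedom would indicate that the two groups distribute their processes in much the same way.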

An Acoustical Study of English Diphthongs Produced by American Males and Females (미국인 남성과 여성이 발음한 영어이중모음의 음향적 연구)

  • Yang, Byung-Gon
    • Phonetics and Speech Sciences / v.2 no.2 / pp.43-50 / 2010
  • English vowels can be divided into monophthongs and diphthongs depending on the number of vocal tract shapes. Diphthongs are usually produced with more than one shape. This study attempts to collect acoustical data of English diphthongs published online by Hillenbrand et al. (1995) and to examine acoustic features of the diphthongs for phoneticians and English teachers. Sixty-three American males and females were chosen after excluding those subjects with different target vowels or ambiguous formant tracks. The author used Praat to obtain the acoustical data systematically at eleven equidistant timepoints over the diphthongal segment. Obvious errors were corrected based on the spectrographic display of each diphthong. Results show that the formant trajectories of the diphthongs produced by the American males and females appeared quite similar. When the female formant values were uniformly normalized to those of the males, an almost perfect collapse occurred. Secondly, the diphthongal movements in the vowel space appeared nonlinear due to the coarticulatory gesture for the following consonant. Thirdly, the average duration of the diphthongs produced by the females was 1.156 times longer than that of the males, while the pitch ratio between the two groups turned out to be 1.746 with a similar contour over measurement points. The author concludes that English diphthongs produced by various groups can be compared systematically when the acoustical values are obtained at proportional timepoints. Further studies comparing English diphthongs produced by native and nonnative speakers would be desirable.
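    The idea of measuring at eleven equidistant timepoints, so that segments of different durations can be compared at proportional positions, can be sketched as below. This is a minimal Python illustration rather than the author's Praat procedure; the function name and the segment boundaries are hypothetical.

```python
def equidistant_timepoints(start, end, n_points=11):
    """Return n_points times evenly spaced from start to end, inclusive."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    step = (end - start) / (n_points - 1)
    return [start + i * step for i in range(n_points)]


# Hypothetical diphthong segment boundaries in seconds (not from the paper).
times = equidistant_timepoints(0.120, 0.380)
for i, t in enumerate(times):
    # With eleven points, index i corresponds to i * 10% of the segment.
    print(f"{i * 10:3d}% -> {t:.3f} s")
```

    Formant or pitch values read at these times line up by relative position, which is what allows a short and a long token to be overlaid.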


A Prosodic Study of Korean Using a Large Database (대용량 데이터베이스를 이용한 한국어 운율 특성에 관한 연구)

  • Kim Jong-Jin;Lee Sook-Hyang
    • The Journal of the Acoustical Society of Korea / v.24 no.2 / pp.117-126 / 2005
  • This study investigates the prosodic characteristics of Korean through the analysis of a large database. One female and one male speaker each read 650 sentences, which were segmentally and prosodically labeled. Statistical analyses were performed on these utterances regarding the tonal pattern and the size of prosodic units, the correlation between the size of higher-level prosodic units and the number of lower-level prosodic units, and the slope and F0 of the falling and rising contours of an accentual phrase. The results showed that the duration and the number of words and syllables of a prosodic unit differed significantly not only between speakers but also between its positions within a higher-level prosodic unit. The number of prosodic units showed a high correlation with the duration and the number of syllables of the higher-level units. The slope of the falling contour within an accentual phrase was inversely proportional to the number of its syllables. The slope differed depending on the first tone type of an accentual phrase, which could be explained by the F0 rise and the different amount of rise between tones when an accentual phrase starts with an H tone. The slope of the falling contour across an accentual phrase boundary showed a constant and larger value compared to that within an accentual phrase. The rising contours at the beginning and end of an accentual phrase were similar in slope but differed in the amount of F0 change: the former showed a larger change. The slope of the rising contour that forms an accentual phrase on its own was inversely proportional to the number of its syllables.

Coordinative movement of articulators in bilabial stop /p/

  • Son, Minjung
    • Phonetics and Speech Sciences / v.10 no.4 / pp.77-89 / 2018
  • Speech articulators are coordinated for the purpose of segmental constriction in terms of a task. In particular, vertical jaw movements repeatedly contribute to consonantal as well as vocalic constriction. The current study explores vertical jaw movements in conjunction with bilabial constriction in bilabial stop /p/ in the context /a/-to-/a/. Revisiting kinematic data of /p/ collected using the electromagnetic midsagittal articulometer (EMMA) method from seven (four female and three male) speakers of Seoul Korean, we examined maximum vertical jaw position, its relative timing with respect to the upper and lower lips, and lip aperture minima. The results of those dependent variables are recapitulated in terms of linguistic (different word boundaries) and paralinguistic (different speech rates) factors as follows. Firstly, maximum jaw height was lower in the across-word boundary condition (across-word < within-word), but it did not differ as a function of different speech rates (comfortable = fast). Secondly, more reduction in the lip aperture (LA) gesture occurred in fast rate, while word-boundary effects were absent. Thirdly, jaw raising was still in progress after the lips' positional extrema were achieved in the within-word condition, while the former was completed before the latter in the across-word condition. Lastly, relative temporal lags between the jaw and the lips (UL and LL) were more synchronous in fast rate, compared to comfortable rate. When these results are considered together, it is possible to posit that speakers are not tolerant of lenition to the extent that it is potentially realized as a labial approximant in either word-boundary condition, while jaw height still manifested a lower jaw position in the across-word boundary condition. Early termination of vertical jaw maxima before vertical lower lip maxima in the across-word condition may be partly responsible for the spatial reduction of jaw raising movements. This may come about as a consequence of an excessive number of factors (e.g., upper lip height (UH), lower lip height (LH), jaw angle (JA)) for the representation of a vector with two degrees of freedom (x, y) engaged in a gesture-based task (e.g., lip aperture (LA)). In the task-dynamic application toolkit, the jaw angle parameter can be assigned numerical values for greater weight in the across-word boundary condition, which in turn gives rise to a lower jaw position. Speech rate-dependent spatial reduction in lip aperture may be resolved by manipulating the activation time of an active tract variable at the gestural score level.

Korean Word Recognition Using Vector Quantization Speaker Adaptation (벡터 양자화 화자적응기법을 사용한 한국어 단어 인식)

  • Choi, Kap-Seok
    • The Journal of the Acoustical Society of Korea / v.10 no.4 / pp.27-37 / 1991
  • This paper proposes ESFVQ (energy subspace fuzzy vector quantization), which employs energy subspaces to reduce the quantizing distortion below that of fuzzy vector quantization. ESFVQ is applied to a speaker adaptation method by which Korean words spoken by unknown speakers are recognized. By generating mapped codebooks with fuzzy histograms for each energy subspace in the training procedure and by decoding a spoken word through ESFVQ in the recognition procedure, we attempt to improve the recognition rate. The performance of ESFVQ is evaluated by measuring the quantizing distortion and the speaker-adaptive recognition rate for DDD telephone area names uttered by 2 males and 1 female. The quantizing distortion of ESFVQ is 22% lower than that of vector quantization and 5% lower than that of fuzzy vector quantization, and the speaker-adaptive recognition rate of ESFVQ is 26% higher than that without speaker adaptation and 11% higher than that of vector quantization.


An Experimental Study on the English Vowel Lengths Using the Praat Software Program (Praat소프트웨어 프로그램을 이용한 영어모음 길이에 관한 실험적 연구)

  • Park, Hee-Suk
    • Journal of Digital Contents Society / v.13 no.3 / pp.279-290 / 2012
  • The purpose of this experimental study is to investigate and compare the lengths of the English diphthongs /eɪ/ and /aɪ/ and the front low vowel /æ/ as produced by native English speakers and Korean college students, using the Praat software program. To do this, English sentences were uttered and recorded by twelve subjects: six Korean subjects and six native English-speaking subjects. All the subjects are female, and their ages range from 23 to 35. Acoustic features (duration) were measured from a sound spectrogram with the help of the Praat software program and analyzed statistically. Results showed that the lengths of the English diphthongs and the front low vowel differed between native English speakers and Korean college students. For the diphthongs /eɪ/ and /aɪ/, Korean subjects produced longer vowels than native subjects did, but the difference was not significant. However, for the English front low vowel /æ/, native subjects produced significantly longer vowels than Korean subjects did. From the overall sums of word and vowel durations for the two subject groups, we found that the differences in the lengths of the three words and of the two diphthongs /eɪ/ and /aɪ/ were not significant, but those of /æ/ were significant.

A Real-Time Embedded Speech Recognition System (실시간 임베디드 음성 인식 시스템)

  • 남상엽;전은희;박인정
    • Journal of the Institute of Electronics Engineers of Korea CI / v.40 no.1 / pp.74-81 / 2003
  • In this study, we implemented a real-time embedded speech recognition system that requires minimal memory for the speech recognition engine and DB. The vocabulary to be recognized consists of 40 commands used in a PCS phone and 10 digits. Speech data spoken by 15 male and 15 female speakers were recorded and analyzed by a short-time analysis method with a window size of 256. The LPC parameters of each frame were computed with the Levinson-Durbin algorithm and transformed into cepstrum parameters. Before the analysis, the speech data are processed by pre-emphasis, which removes the DC component in speech and emphasizes the high-frequency band. The Baum-Welch re-estimation algorithm was used to train the HMM. In testing, the recognition rate was obtained using a likelihood method. We implemented the embedded system by porting the speech recognition engine to an ARM core evaluation board. The overall recognition rate of this system was 95%, while the rate was 96% on the 40 commands and 94% on the 10 digits.
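    The front-end chain described above (pre-emphasis, autocorrelation over a 256-sample frame, Levinson-Durbin recursion, conversion of LPC coefficients to cepstral coefficients) can be sketched in plain Python. This is an illustrative sketch, not the authors' embedded implementation: the pre-emphasis coefficient 0.97, the LPC order 10, the 8 kHz sampling rate, the synthetic test frame, and all function names are assumptions not given in the abstract.

```python
import math


def pre_emphasize(signal, alpha=0.97):
    """First-order high-pass: y[n] = x[n] - alpha * x[n-1] (alpha is assumed)."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


def autocorrelation(frame, max_lag):
    """Return r[0..max_lag] where r[lag] = sum over n of frame[n] * frame[n-lag]."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; x[n] is predicted as sum_k a[k] * x[n-k].

    Returns (a[1..order], final prediction error)."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a[1:], error


def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum of the all-pole model 1 / (1 - sum_k a[k] z^-k), c[1..n_ceps]."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


# Synthetic 256-sample frame (two sinusoids at an assumed 8 kHz rate).
frame = [math.sin(2 * math.pi * 500 * n / 8000)
         + 0.5 * math.sin(2 * math.pi * 1500 * n / 8000) for n in range(256)]
emphasized = pre_emphasize(frame)
lpc, err = levinson_durbin(autocorrelation(emphasized, 10), 10)
cepstrum = lpc_to_cepstrum(lpc, 10)
```

    In a recognizer of this kind, one such cepstral vector per frame would typically serve as the observation sequence for HMM training and scoring.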

External photoglottography, intra-oral air pressure, airflow and acoustic data on the Korean fricatives /s', s/

  • Kim, Hyunsoon;Maeda, Shinji;Honda, Kiyoshi;Crevier-Buchman, Lise
    • Phonetics and Speech Sciences / v.14 no.3 / pp.11-25 / 2022
  • From simultaneous recordings of external photoglottography, intra-oral air pressure (Pio), airflow, and acoustic data from four native Seoul Korean speakers (2 male and 2 female), we have found that the two fricatives are not significantly different in glottal opening peak and airflow peak height either word-initially or word-medially, and that the duration of aspiration is significantly reduced in word-medial /s/ compared to that in word-initial /s/, but not in /s'/. We have also found that the duration of a high Pio plateau is significantly longer in /s/ than in /s'/ both word-initially and word-medially, and that airflow resistance (R=Pio/U) at the onset and offset of a Pio plateau and at the time of airflow peak height is significantly higher in /s'/ than in /s/ across the contexts. However, the differences in Pio peak and F0 are not significant. In addition, the transition time to reach airflow peak height from the offset of a Pio plateau is found to be significantly longer in /s/ than in /s'/ in both word-initial and word-medial positions. The absence of significant differences in glottal opening peak and airflow peak height confirms that /s/ is specified as [-spread glottis] like /s'/. As for the other significant differences, we propose that /s/ is [-tense] and /s'/ is [+tense].