• Title/Summary/Keyword: experimental phonetics

An Experimental Study of Comfortable Pitch and Loudness with Target Matching: Effects on Electroglottographic and Acoustic Measures

  • Choi, Seong Hee
    • Phonetics and Speech Sciences / v.4 no.4 / pp.139-146 / 2012
  • This study was designed to examine comfort levels of pitch and loudness with target matching and their effects on electroglottographic (EGG) and acoustic measures. Twelve speakers, six males and six females, were instructed to produce a sustained vowel /a/ for three seconds at a comfortable pitch and loudness level, first without any further instruction and then with a target matching procedure for either a specified f0 or a specified SPL, presented separately with visual and auditory feedback. The pitch targets were presented by stepping up and down randomly at intervals of 5 Hz, from 150 Hz to 310 Hz for females (33 frequency targets in total) and from 85 Hz to 190 Hz for males (22 frequency targets in total). The loudness levels were 65, 75, 85, and 95 dB (four intensity targets in total) for both males and females. Subjective estimations of comfort were obtained using a 10-point equal-appearing interval rating scale following each phonation. The results showed that males and females demonstrated similar trends in loudness levels, with greatest comfort at 75 dB, whereas pitch comfort ratings showed greater variability, with females having a wider range with target matching. At the individual level, most male and female speakers rated soft phonations as more comfortable than loud ones. On the other hand, most male speakers perceived the highest comfort at pitches below the comfortable pitch levels they phonated under natural conditions. Most female speakers, however, perceived frequency ranges higher than those of the natural condition to be more comfortable, although the comfortable pitch levels in spontaneous phonations were within the comfort ranges determined by targeted phonations. When acoustic measures (%jitter, %shimmer, SNR) and EGG measures (CQ%) were compared between spontaneous comfortable phonations and targeted phonations produced by the same subject at similar f0 and intensity, no significant differences were observed (p>0.05). Thus, target matching procedures may be considered a compatible, alternative method for reducing the variability of comfortable pitch and loudness levels by eliciting consistent comfortable phonations.
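
The target counts stated in the abstract above follow directly from the 5 Hz spacing. A minimal sketch that enumerates the grids, assuming inclusive endpoints (the function name `target_grid` is hypothetical and not from the study):

```python
def target_grid(low_hz, high_hz, step_hz=5):
    """Return the inclusive list of f0 targets from low_hz to high_hz."""
    return list(range(low_hz, high_hz + 1, step_hz))


# Ranges as reported in the abstract: 150-310 Hz (females), 85-190 Hz (males).
female_targets = target_grid(150, 310)
male_targets = target_grid(85, 190)

print(len(female_targets), len(male_targets))  # prints "33 22"
```

The counts match the 33 and 22 frequency targets reported for female and male speakers.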

The influence of Chinese high and level tone and rising tone on the pitch of Sino-Korean words pronounced by Chinese learners: Focusing on synonym with the same letters (중국인의 한국어 한자어 발음에서 보이는 중국어 음평과 양평의 영향: 동형동의어를 중심으로)

  • Liu, Si-Yang;Kim, Young-Joo
    • Phonetics and Speech Sciences / v.3 no.3 / pp.35-47 / 2011
  • The purpose of this study is to examine how the Chinese high and level tone and the rising tone influence the pitch patterns of corresponding Sino-Korean words produced by Chinese learners of Korean. The scope of this research is limited to Chinese learners of Korean, specifically their pronunciation of same-form-same-meaning Sino-Korean words. In this study, Chinese learners pronounced both Chinese words and the corresponding Sino-Korean words. Using software, the learners' pitch patterns were recorded, analyzed, and compared with the tones of the corresponding Chinese words. Experimental results showed that Sino-Korean words beginning with lenis sounds were affected by the Chinese tone sequences 'high and level tone - high and level tone', 'high and level tone - rising tone', 'high and level tone - falling-rising tone', 'high and level tone - falling tone' and 'rising tone - falling tone'. On the other hand, Sino-Korean words beginning with aspirated sounds were affected by the Chinese tone sequences 'rising tone - high and level tone', 'rising tone - rising tone', 'rising tone - falling-rising tone', and 'rising tone - falling tone'. In conclusion, the pitch patterns of Sino-Korean words produced by Chinese learners are affected by both the Chinese high and level tone and the rising tone: words beginning with lenis sounds are more affected by the high and level tone, whereas words beginning with aspirated sounds are more influenced by the rising tone.

How Different are Learner Speech and Loanword Phonology?

  • Kim, Jong-Mi
    • Phonetics and Speech Sciences / v.1 no.3 / pp.3-18 / 2009
  • Do loanword properties emerge in the acquisition of a foreign language and if so, how? Classic studies in adult language learning assumed loanword properties that range from near-ceiling to near-chance level of appearance depending on speech proficiency. The present research argues that such variations reflect different phonological types, rather than speech proficiency. To investigate the difference between learner speech and loanword phonology, the current research analyzes speech data from 92 Korean speakers at five different proficiency levels, who read 19 pairs of English words and sentences that contained loanwords. The experimental method is primarily acoustic: it tests whether the phonological cause seen in loanwords (e.g., the insertion of a vowel at the end of the word stamp) also appears in learner speech, in comparison with native speech from 11 English speakers and 11 Korean speakers. The data cover segment deletion, insertion, substitution, and alternation in both learner speech and native speech. The results indicate that learner speech does not present the loanword properties in many cases, but depends on the types of phonological causes. The relatively easy acquisition of target pronunciation is evidenced in the cases of segment deletion, insertion, substitution, and alternation, except when the loanword property involves the successful command of the target phonology, such as the de-aspiration of [p] in apple. Such a case of difficult learning contrasts sharply with the cases of easy learning in the development of learner speech, particularly beyond the intermediate level of proficiency. Overall, learner speech departs from loanword phonology and develops toward the native speech value, depending on phonological contrasts in the native and foreign languages.

Korean speech sound development in children from bilingual Japanese-Korean environments

  • Kim, Jeoung-Suk;Lee, Jun-Ho;Choi, Yoon-Mi;Kim, Hyun-Gi;Kim, Sung-Hwan;Lee, Min-Kyung;Kim, Sun-Jun
    • Clinical and Experimental Pediatrics / v.53 no.9 / pp.834-839 / 2010
  • Purpose: This study investigates Korean speech sound development, including articulatory error patterns, among Japanese-Korean children whose mothers are Japanese immigrants to Korea. Methods: The subjects were 28 Japanese-Korean children with normal development born to Japanese immigrant women who lived in Jeonbuk province, Korea. They were assessed using the Computerized Speech Lab 4500. The control group consisted of 15 Korean children who lived in the same area. Results: The voice onset times of the consonants /pʰ/, /t/, /tʰ/, and /k*/ among the children were prolonged. The children replaced the lenis sounds with aspirated or fortis sounds rather than replacing the fortis sounds with lenis or aspirated sounds, which is typical among Japanese immigrants. The children showed numerous articulatory errors for /c/ and /I/ sounds (similar to Koreans) rather than errors on /p/ sounds, which are more frequent among Japanese immigrants. The vowel formants of the children showed a significantly prolonged vowel /o/ as compared to that of Korean children (P<0.05). The Japanese immigrants and their children showed a similar substitution /n/ for /ɧ/ [Japanese immigrants (62.5%) vs Japanese-Korean children (14.3%)], which is rarely seen among Koreans. Conclusion: The findings suggest that Korean speech sound development among Japanese-Korean children is influenced not only by the Korean language environment but also by their maternal language. Therefore, appropriate language education programs may be warranted not only for immigrant women but also for their children.

Pronunciation Variation Patterns of Loanwords Produced by Korean and Grapheme-to-Phoneme Conversion Using Syllable-based Segmentation and Phonological Knowledge (한국인 화자의 외래어 발음 변이 양상과 음절 기반 외래어 자소-음소 변환)

  • Ryu, Hyuksu;Na, Minsu;Chung, Minhwa
    • Phonetics and Speech Sciences / v.7 no.3 / pp.139-149 / 2015
  • This paper aims to analyze pronunciation variations of loanwords produced by Korean speakers and to improve the performance of pronunciation modeling of loanwords in Korean by using syllable-based segmentation and phonological knowledge. The loanword text corpus used for our experiment consists of 14.5k words extracted from frequently used words in the set-top box, music, and point-of-interest (POI) domains. First, pronunciations of loanwords in Korean are obtained by manual transcription and used as target pronunciations. The target pronunciations are compared with the standard pronunciation using confusion matrices to analyze pronunciation variation patterns of loanwords. Based on the confusion matrices, three salient pronunciation variations of loanwords are identified: tensification of the fricative [s] and derounding of the rounded vowels [ɥi] and [wɛ]. In addition, a syllable-based segmentation method that considers phonological knowledge is proposed for loanword pronunciation modeling. Performance of the baseline and the proposed method is measured using phone error rate (PER), word error rate (WER), and F-score at various context spans. Experimental results show that the proposed method outperforms the baseline. We also observe that performance degrades when training and test sets come from different domains, which implies that loanword pronunciations are influenced by data domains. It is noteworthy that pronunciation modeling for loanwords is enhanced by reflecting phonological knowledge. The loanword pronunciation modeling in Korean proposed in this paper can be used for automatic speech recognition in application interfaces such as navigation systems and set-top boxes, and for computer-assisted pronunciation training for Korean learners of English.
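
Phone error rate and word error rate, the metrics named in the abstract above, are commonly computed from Levenshtein edit distance. The sketch below is a generic illustration under that assumption, not the authors' evaluation code; the phone labels and function names are hypothetical, and WER is taken here as the fraction of words whose predicted pronunciation differs from the target.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences (lists of labels)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phone_error_rate(refs, hyps):
    """Total phone edits divided by the total number of reference phones."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return errors / sum(len(r) for r in refs)


def word_error_rate(refs, hyps):
    """Fraction of words with at least one phone error."""
    wrong = sum(1 for r, h in zip(refs, hyps) if r != h)
    return wrong / len(refs)


# Hypothetical target (reference) and predicted pronunciations for two words.
refs = [["s", "a"], ["p", "i", "n"]]
hyps = [["s*", "a"], ["p", "i", "n"]]  # "s*" stands for a tensified [s]
print(phone_error_rate(refs, hyps), word_error_rate(refs, hyps))
```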

A Comparison of Parameters of Acoustic Vowel Space in Patients with Parkinson's Disease (파킨슨병 환자의 음향 모음 공간 파라미터 비교)

  • Kang, Young-Ae;Yoon, Kyu-Chul;Lee, Hak-Seung;Seong, Cheol-Jae
    • Phonetics and Speech Sciences / v.2 no.4 / pp.185-192 / 2010
  • The acoustic vowel space has been used as an acoustic parameter in dysarthric speech. The aim of this work was to examine mathematical formulae for acoustic vowel space and to apply these to Korean speakers with idiopathic Parkinson's disease (IPD). Five acoustic parameters were chosen from earlier works and one new parameter was proposed, the pentagonal vowel space. The six parameters were triangular vowel space (3 area), irregular quadrilateral vowel space (4 area), irregular pentagonal vowel space (5 area), vowel articulatory index (VAI), formant centralization ratio (FCR) and F2i/F1u ratio (F2 ratio). An experimental group of 32 IPD patients (male:female = 16:16) and a control group of 20 healthy people (male:female = 8:12) participated in the study and repeated the vowels (/a-i-u-e-o/) three times. A correlation analysis was performed among the six parameters, a two-way ANOVA was done with gender and group as independent factors, and an independent-samples t-test was conducted between the male and the female groups as a post hoc comparison. All parameters were highly correlated with each other, and only the FCR showed a high negative correlation with the others. The results of the ANOVA showed a significant difference in F2 ratio, 3 area, 4 area and 5 area between genders and in 4 area and 5 area between groups. For the male members of the two groups, statistically significant differences were found in all parameters, whereas no such differences were found for the female members. These findings indicated that the vowel space of the female group was wider than the vowel space of the male group. These differences may have been caused by gender-specific speech styles rather than by patho-physiological mechanisms. We also claim that the pentagonal vowel space is better than the other vowel spaces at representing disordered speech in natural speech situations.
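
Several of the parameters named in the abstract above can be illustrated with a short sketch. It assumes polygon areas are computed with the shoelace formula over (F1, F2) points and uses the commonly cited definitions FCR = (F2u + F2a + F1i + F1u) / (F2i + F1a) and VAI = 1 / FCR; the formant values and the pentagon's vertex order are hypothetical, and the paper's exact formulae may differ.

```python
def polygon_area(points):
    """Shoelace formula; points are (F1, F2) vertices listed in perimeter order."""
    n = len(points)
    twice_area = 0.0
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def fcr(f):
    """Formant centralization ratio from a dict of vowel -> (F1, F2) in Hz."""
    return (f["u"][1] + f["a"][1] + f["i"][0] + f["u"][0]) / (f["i"][1] + f["a"][0])


def vai(f):
    """Vowel articulatory index, taken here as the inverse of the FCR."""
    return 1.0 / fcr(f)


# Hypothetical mean formants (F1, F2) in Hz for one speaker.
formants = {"i": (300, 2300), "e": (500, 2000), "a": (800, 1300),
            "o": (500, 900), "u": (350, 800)}

triangle = polygon_area([formants[v] for v in "iau"])    # 3 area
pentagon = polygon_area([formants[v] for v in "ieaou"])  # 5 area
print(triangle, pentagon, fcr(formants), vai(formants))
```

Because the VAI is defined here as the reciprocal of the FCR, the two move in opposite directions, which is consistent with the negative correlation the abstract reports for the FCR.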

Performance comparison of various deep neural network architectures using Merlin toolkit for a Korean TTS system (Merlin 툴킷을 이용한 한국어 TTS 시스템의 심층 신경망 구조 성능 비교)

  • Hong, Junyoung;Kwon, Chulhong
    • Phonetics and Speech Sciences / v.11 no.2 / pp.57-64 / 2019
  • In this paper, we construct a Korean text-to-speech system using the Merlin toolkit, which is an open source system for speech synthesis. In text-to-speech systems, the HMM-based statistical parametric speech synthesis method is widely used, but it is known that the quality of synthesized speech is degraded due to limitations of the acoustic modeling scheme that includes context factors. We propose an acoustic modeling architecture that uses deep neural network techniques, which show excellent performance in various fields. The architectures considered are the fully connected deep feedforward neural network (DNN), recurrent neural network (RNN), gated recurrent unit (GRU), long short-term memory (LSTM), and bidirectional LSTM (BLSTM). Experimental results show that performance is improved by including sequence modeling in the architecture, and that the architecture with LSTM or BLSTM shows the best performance. It has also been found that including delta and delta-delta components in the acoustic feature parameters is advantageous for performance improvement.

Statistical analysis on long-term change of jitter component on continuous speech signal (음성신호의 Jitter 성분의 장시간 변화에 관한 통계적 분석)

  • Jo, Cheolwoo
    • Phonetics and Speech Sciences / v.12 no.4 / pp.73-80 / 2020
  • In this study, a method for measuring the jitter component in continuous speech is presented. In the conventional jitter measurement method, pitch variabilities are commonly measured from sustained vowels. In the case of continuous speech, such as a spoken sentence, the existing measurement method is distorted by the prosody of the sentence. Therefore, we propose a method to reduce the pitch fluctuations caused by prosody in continuous speech. To remove this pitch fluctuation component, a curve representing the fluctuation is obtained via polynomial interpolation of the pitch track in the analysis interval, and the shift is removed according to the curve. Subsequently, the variability of the pitch frequency is obtained by measuring jitter from the pitch trajectory from which the shift has been removed. To measure the effects of the proposed method, parameter values before and after the operations are compared using samples from the Kay Pentax MEEI database. The statistical analysis of the experimental results showed that jitter components in continuous speech can be measured effectively by the proposed method and that the values are comparable to the parameters of sustained vowels from the same speaker.
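
The detrending idea in the abstract above can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes the pitch track is a list of pitch periods in seconds, fits a least-squares polynomial (degree 3 is an arbitrary choice) as the prosodic curve, subtracts it while keeping the mean, and then applies the usual local jitter formula. All function names are hypothetical.

```python
def polyfit(xs, ys, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... via the normal equations."""
    n = degree + 1
    # Augmented matrix [A^T A | A^T y] for the Vandermonde matrix A.
    rows = [[sum(x ** (i + j) for x in xs) for j in range(n)]
            + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


def local_jitter_percent(periods):
    """Mean absolute difference of consecutive periods over the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))


def detrended_jitter_percent(periods, degree=3):
    """Remove a polynomial prosodic trend, then measure local jitter."""
    n = len(periods)
    xs = [k / (n - 1) for k in range(n)]  # normalized time in [0, 1]
    coeffs = polyfit(xs, periods, degree)
    mean_period = sum(periods) / n
    flattened = [t - sum(c * x ** p for p, c in enumerate(coeffs)) + mean_period
                 for x, t in zip(xs, periods)]
    return local_jitter_percent(flattened)


# Synthetic track: a prosodic glide from 8 ms to 12 ms plus a small
# cycle-to-cycle perturbation of +/- 0.02 ms.
track = [0.008 + 0.004 * k / 49 + (0.00002 if k % 2 else -0.00002)
         for k in range(50)]
print(local_jitter_percent(track), detrended_jitter_percent(track))
```

On this synthetic track the raw measure is inflated by the glide, while the detrended measure reflects mainly the cycle-to-cycle perturbation.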

Dialect classification based on the speed and the pause of speech utterances (발화 속도와 휴지 구간 길이를 사용한 방언 분류)

  • Jonghwan Na;Bowon Lee
    • Phonetics and Speech Sciences / v.15 no.2 / pp.43-51 / 2023
  • In this paper, we propose an approach for dialect classification based on the speed and pauses of speech utterances as well as the age and gender of the speakers. Dialect classification is one of the important techniques for speech analysis. For example, an accurate dialect classification model can potentially improve the performance of speaker or speech recognition. In previous studies, deep learning on Mel-Frequency Cepstral Coefficient (MFCC) features has been the dominant approach. We focus on the acoustic differences between regions and conduct dialect classification based on features derived from those differences. Specifically, we extract underexplored additional features, namely the speed and the pauses of speech utterances, along with metadata including the age and the gender of the speakers. Experimental results show that our proposed approach yields higher accuracy, especially with the speech rate feature, than the method using only MFCC features. Incorporating all the proposed features improved accuracy from 91.02% to 97.02% over the previous MFCC-only method.
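
The abstract above does not define its speed and pause features, so the following is only a sketch of one plausible way to derive such features from time-aligned speech segments. The function name, the 0.2 s pause threshold, and the feature names are all assumptions for illustration.

```python
def rate_and_pause_features(segments, n_syllables, min_pause=0.2):
    """Compute simple rate and pause features for one utterance.

    segments: sorted (start, end) times in seconds of speech stretches.
    n_syllables: number of syllables spoken in the utterance.
    min_pause: silent gaps at least this long (seconds) count as pauses.
    """
    total = segments[-1][1] - segments[0][0]
    gaps = [nxt_start - cur_end
            for (_, cur_end), (nxt_start, _) in zip(segments, segments[1:])]
    pauses = [g for g in gaps if g >= min_pause]
    pause_time = sum(pauses)
    return {
        "speech_rate": n_syllables / total,                       # syllables/s incl. pauses
        "articulation_rate": n_syllables / (total - pause_time),  # syllables/s excl. pauses
        "pause_count": len(pauses),
        "mean_pause": pause_time / len(pauses) if pauses else 0.0,
    }


# Hypothetical utterance: three speech stretches, 12 syllables in total.
features = rate_and_pause_features([(0.0, 1.0), (1.5, 2.5), (2.6, 4.0)], 12)
print(features)
```

Features like these could be concatenated with MFCC-based inputs and speaker metadata before classification, though how the paper combines them is not stated in the abstract.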

Perceptual training on Korean obstruents for Vietnamese learners (베트남 한국어 학습자를 위한 한국어 자음 지각 훈련 연구)

  • Hyosung Hwang
    • Phonetics and Speech Sciences / v.15 no.4 / pp.17-26 / 2023
  • This study aimed to reveal how Vietnamese adult learners at three different proficiency levels perceive Korean word-initial obstruents and whether errors can be corrected through perceptual training. To this end, 105 Vietnamese beginner, intermediate, and advanced learners were given perceptual training on Korean word-initial obstruents. The training materials were built from Korean minimal pairs, using natural stimuli recorded by native speakers. Learners in the experimental group performed five self-directed perceptual training sessions of 20 to 40 minutes each over a period of approximately two weeks, while learners in the control group only participated in the pretest and posttest. The results showed a significant improvement in the perception of sounds that were difficult to distinguish before training, and both beginners and advanced learners benefited from the training. This study confirmed that large-scale perceptual training can play an important role in helping Vietnamese learners learn the appropriate acoustic cues to distinguish different sounds in Korean.