• Title/Summary/Keyword: Formant frequency

A study of /l/ velarization in American English based on the Buckeye Corpus (벅아이 코퍼스를 이용한 미국 영어의 /l/ 연구개음화 연구)

  • Sa, Jae-Jin
    • Phonetics and Speech Sciences
    • /
    • v.13 no.2
    • /
    • pp.19-25
    • /
    • 2021
  • It has been widely recognized that there are two varieties of lateral liquid /l/, which are light /l/ (a non-velarized allophone) and dark /l/ (a velarized allophone). However, this categorical view has been challenged in recent studies, both on articulatory and acoustic aspects. The purpose of this study is to investigate whether to consider /l/ velarization as a continuum in American English and provide supporting data. A spontaneous American English speech database called the Buckeye Speech Corpus was used for the material. The formant frequencies of /l/ in each syllable position were measured and analyzed statistically. The formant frequencies of /l/ in each syllable position, especially F2 values, were significantly different from each other. The results showed that there were other significantly different varieties of /l/ in American English, which support the continuum view on /l/ velarization. Regarding the effect of the adjacent vowel, the backness of the adjacent vowels was shown to affect the degree of /l/ velarization, regardless of the syllable position of the lateral liquid. This result will help provide a solid ground for the continuum view.

Statistical Speech Feature Selection for Emotion Recognition

  • Kwon Oh-Wook;Chan Kwokleung;Lee Te-Won
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.4E
    • /
    • pp.144-151
    • /
    • 2005
  • We evaluate the performance of emotion recognition via speech signals when a plain speaker talks to an entertainment robot. For each frame of a speech utterance, we extract the frame-based features: pitch, energy, formant, band energies, mel frequency cepstral coefficients (MFCCs), and velocity/acceleration of pitch and MFCCs. For discriminative classifiers, a fixed-length utterance-based feature vector is computed from the statistics of the frame-based features. Using a speaker-independent database, we evaluate the performance of two promising classifiers: support vector machine (SVM) and hidden Markov model (HMM). For angry/bored/happy/neutral/sad emotion classification, the SVM and HMM classifiers yield 42.3% and 40.8% accuracy, respectively. We show that the accuracy is significant compared to the performance by foreign human listeners.
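
The step from frame-based features to a fixed-length utterance vector can be sketched as follows. This is a minimal illustration, not the authors' implementation: the choice of statistics (mean, standard deviation, minimum, maximum), the simple first-difference "velocity", and the names `deltas` and `utterance_vector` are all assumptions, since the abstract does not specify them.

```python
from statistics import mean, pstdev


def deltas(values):
    """First-order differences between consecutive frames (a simple 'velocity')."""
    return [b - a for a, b in zip(values, values[1:])]


def utterance_vector(frames):
    """Map per-frame tracks (name -> list of values) to one fixed-length vector.

    For each track and its delta, four statistics are appended, so the output
    length depends only on the number of tracks, not on the number of frames.
    """
    vector = []
    for name in sorted(frames):  # sorted for a stable feature order
        track = frames[name]
        for series in (track, deltas(track)):
            vector.extend([mean(series), pstdev(series), min(series), max(series)])
    return vector


# Hypothetical per-frame values for one short utterance.
frames = {
    "pitch": [110.0, 115.0, 120.0, 118.0],
    "energy": [0.2, 0.5, 0.4, 0.3],
}
features = utterance_vector(frames)
```

Because the vector length is independent of utterance duration, utterances of different lengths can be fed to a classifier such as an SVM that expects fixed-size input.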

A Design and Implementation of Speech Recognition Preprocessing System using Formant Frequency (포만트 주파수를 이용한 음성인식 전처리 시스템의 설계 및 구현)

  • 김태욱;한승진;김민성;이정현
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 1999.10b
    • /
    • pp.198-200
    • /
    • 1999
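  • Human speech carries not only information about meaning but also characteristics specific to the speaker's gender. That is, speech can be divided into female speech, which is strong in high frequencies, and male speech. However, existing HMM-based speech recognition systems do not take these characteristics into account and are built with a single HMM. Experiments with the algorithm presented in this paper showed that the formant frequencies of male and female speech differ by 100~30 Hz, and we propose a method that uses this property to distinguish male from female speech. In addition, GMMs were trained separately on male and female speech, and in the recognition stage the input speech was recognized with the male HMM if its formant characteristics indicated male speech and with the female HMM if they indicated female speech. Compared with the existing recognition method, this improved results by 5.2% for male speech and 4.4% for female speech.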

On a Performance Improvement of Speaker Recognition by using the Auditory Characteristics of Speech (음성의 청각특성을 이용한 화자식별시스템의 성능향상에 관한 연구)

  • 이윤주;오세영;배재옥;배명진
    • Proceedings of the IEEK Conference
    • /
    • 1998.10a
    • /
    • pp.1223-1226
    • /
    • 1998
  • The conventional pre-emphasis filter emphasizes all high-frequency components, which reflect speaker characteristics. However, this filter does not reflect the auditory characteristics of the speaker's speech. To emphasize the perceptual characteristics, we propose a speaker recognition system that uses perceptual weighting as the preprocessor, because human hearing is sensitive to formant peaks. This filter both de-emphasizes the low formants and emphasizes the high formants. With the proposed method, the total recognition rate improves by 1.7% over the conventional method.
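
For reference, the conventional pre-emphasis filter that the abstract contrasts with can be sketched as a first-order difference, y[n] = x[n] - a·x[n-1]. This shows only the baseline, not the proposed perceptual weighting filter, whose form the abstract does not give. The coefficient 0.97 is a commonly used value and an assumption here, as is the function name.

```python
def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through."""
    if not signal:
        return []
    out = [signal[0]]
    for prev, cur in zip(signal, signal[1:]):
        out.append(cur - alpha * prev)
    return out


# A constant (lowest-frequency) input is strongly attenuated after the first sample,
# while a sign-alternating (highest-frequency) input is amplified.
constant = pre_emphasis([1.0] * 5)
alternating = pre_emphasis([1.0, -1.0, 1.0, -1.0, 1.0])
```

The two toy inputs illustrate the point made in the abstract: the filter boosts all high-frequency content uniformly, with no special treatment of formant peaks.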

A Study of the Effects of Similarity on L2 Phone Acquisition: An Experimental Study of the Korean Vowels Produced by Japanese Learners

  • Kwon, Sung-Mi
    • Speech Sciences
    • /
    • v.14 no.1
    • /
    • pp.93-103
    • /
    • 2007
  • The aims of this study were to examine the acoustic features of Korean and Japanese vowels, and to determine whether new phones that do not have counterparts in Japanese or similar phones that have counterparts improve more from learning. This study consisted of three parts. In Experiment I, a speech production test was performed to observe the acoustic features of Korean and Japanese vowels. In Experiment II, the speech production of Korean vowels produced by Koreans, advanced Japanese learners of Korean, and beginning Japanese learners of Korean was investigated. In Experiment III, a speech perception study of Korean vowels produced by the two Japanese learner groups was conducted to observe the effect of learning on acquiring L2 phones. The conclusion drawn from the study was that the similar phones produced by Japanese show more similarity with those of Koreans than new phones in terms of F1 and F2, but Japanese learners of Korean displayed more improvement in new phones from learning.

An acoustical analysis method of numeric sounds by Praat (Praat를 이용한 숫자음의 음향적 분석법)

  • Yang, Byung-Gon
    • Speech Sciences
    • /
    • v.7 no.2
    • /
    • pp.127-137
    • /
    • 2000
  • This paper presents a macro script to analyze numeric sounds with a speech analysis shareware, Praat, and analyzes those sounds produced by three students who were born and raised in Pusan. Recording was done in a quiet office. To make a meaningful comparison, dynamic time points in relation to the total duration of voicing segments were determined to measure acoustical values. Results showed strong correlations between repeated productions of numeric sounds within and across the speakers. Very high coefficients were found for the diphthongal numbers (0 and 6), which usually show wide formant variation. This suggests that each speaker produced the numbers quite consistently. Also, the frequency differences between the three subjects were within a perceptually similar range. Identifying a speaker among others may require finding subtle individual differences within this range. Perceptual experiments with synthesized numeric sounds may help resolve the issue.
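
The idea of measuring at time points defined relative to the voiced duration, and then correlating repeated productions, can be sketched in Python as follows. This is not the paper's Praat script. The number of points, the even spacing, the function names, and the formant values are all hypothetical.

```python
import math


def relative_time_points(start, end, n_points=5):
    """Evenly spaced measurement times strictly inside the interval [start, end]."""
    duration = end - start
    return [start + duration * (i + 1) / (n_points + 1) for i in range(n_points)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Two repetitions of the same word with different durations are sampled at the
# same relative positions, so their formant tracks can be compared point by point.
times_rep1 = relative_time_points(0.10, 0.40)
times_rep2 = relative_time_points(0.05, 0.47)

# Hypothetical F2 values (Hz) measured at those points.
f2_rep1 = [1200.0, 1350.0, 1600.0, 1900.0, 2100.0]
f2_rep2 = [1180.0, 1400.0, 1650.0, 1850.0, 2150.0]
r = pearson(f2_rep1, f2_rep2)
```

Sampling at relative rather than absolute times keeps the comparison meaningful when speakers or repetitions differ in speaking rate.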

Correlation between Physical Fatigue and Speech Signals (육체피로와 음성신호와의 상관관계)

  • Kim, Taehun;Kwon, Chulhong
    • Phonetics and Speech Sciences
    • /
    • v.7 no.1
    • /
    • pp.11-17
    • /
    • 2015
  • This paper deals with the correlation between physical fatigue and speech signals. A treadmill task to increase fatigue and a subjective questionnaire for rating tiredness were designed. The results from the questionnaire and the collected bio-signals showed that the designed task imposes physical fatigue. A paired-samples t-test on the speech parameters showed that the parameters that change significantly with fatigue are fundamental frequency, first and second formant frequencies, long-term average spectral slope, smoothed pitch perturbation quotient, relative average perturbation, pitch perturbation quotient, cepstral peak prominence, and harmonics-to-noise ratio. The experimental results indicate that, as physical fatigue accumulates, the mouth opens less and the voice becomes breathy.
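
The paired (two-related-samples) t-test mentioned in the abstract compares each speaker's value before and after the task. A minimal sketch of the test statistic follows. The function name and the measurement values are hypothetical, and the p-value lookup is left out because it needs a t-distribution table or a statistics library.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired samples.

    t = mean(d) / (sd(d) / sqrt(n)), where d holds the per-subject differences.
    """
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical fundamental-frequency values (Hz) for five speakers,
# measured before and after a fatiguing task.
f0_before = [118.0, 205.0, 131.0, 190.0, 122.0]
f0_after = [124.0, 211.0, 133.0, 199.0, 125.0]
t_stat, dof = paired_t(f0_before, f0_after)
```

Pairing by speaker removes between-speaker variation (for example, male versus female pitch ranges), so only the within-speaker change is tested.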

A Comparison of Formant Frequency of Vowels Produced by Cochlear Implanted and Normal-Hearing Children (인공와우이식을 받은 아동과 건청 아동이 산출한 단모음의 음향음성학적 특성)

  • Lee, Joo-Eun;Yi, Bong-Won
    • Proceedings of the KSPS conference
    • /
    • 2007.05a
    • /
    • pp.64-66
    • /
    • 2007
  • The purpose of this study was to compare and analyze some acoustic parameters of cochlear implanted children (N=20, aged 3-10) and to provide basic data on speech rehabilitation for cochlear implanted children. Acoustic analyses of seven Korean monophthongs produced in 4 contexts (V, CV, VC, CVC) were conducted for the cochlear implanted children and normal hearing children (N=20, aged 3-10). Subjects were asked to pronounce a list of vowels three times. The results of this study are as follows: First, in the case of the cochlear implanted group, there were no significant differences in F1 and F2. Second, in the case of the normal hearing group, there were significant differences in F2 of /ㅜ/ between V and CVC, and between VC and CVC. Third, there were significant differences in F1 and F2 between the CI group and the normal hearing group.

A study on the voice command recognition at the motion control in the industrial robot (산업용 로보트의 동작제어 명령어의 인식에 관한 연구)

  • 이순요;권규식;김홍태
    • Journal of the Ergonomics Society of Korea
    • /
    • v.10 no.1
    • /
    • pp.3-10
    • /
    • 1991
  • The teach pendant and keyboard have been used as input devices for control commands in human-robot systems. However, many problems occur when the user is a novice, so a speech recognition system is required for communication between a human and the robot. In this study, Korean voice commands, eight robot commands and ten digits, are described based on broad phonetic analysis. Applying broad phonetic analysis, the phonemes of the voice commands are divided into phoneme groups with similar features, such as plosive, fricative, affricate, nasal, and glide sounds. The feature parameters and their ranges for detecting the phoneme groups are then found by the minimax method. The classification rules consist of combinations of the feature parameters, such as zero crossing rate (ZCR), log energy (LE), up and down (UD), and formant frequency, and their ranges. Voice commands were recognized by the classification rules. The recognition rate was over 90 percent in this experiment. The experiment also showed that the recognition rate for digits was better than that for robot commands.
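
Two of the feature parameters named above, zero crossing rate and log energy, and the idea of a range-based classification rule can be sketched as follows. The range values, the rule format, and the function names are hypothetical. The abstract does not give the actual ranges found by the minimax method, and the UD and formant parameters are omitted.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def log_energy(frame, floor=1e-12):
    """Base-10 log of the frame's summed squared amplitude (floored to avoid log(0))."""
    return math.log10(sum(x * x for x in frame) + floor)


def matches(rule, features):
    """True if every feature named in the rule lies within its (low, high) range."""
    return all(low <= features[name] <= high for name, (low, high) in rule.items())


# Hypothetical rule: noisy, low-energy frames are treated as fricative-like.
fricative_rule = {"zcr": (0.5, 1.0), "log_energy": (-6.0, 0.5)}

frame = [0.3, -0.2, 0.25, -0.3, 0.2, -0.25, 0.3, -0.2]
features = {"zcr": zero_crossing_rate(frame), "log_energy": log_energy(frame)}
is_fricative_like = matches(fricative_rule, features)
```

A full classifier in this style would hold one such rule per phoneme group and return the group whose ranges all match.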

Geophysics of Vowel Space in Bahasa Malaysia and Bahasa Indonesia (말레이시아어와 인도네시아어 모음 공간의 지형도)

  • Park Jeong-Sook;Chun Taihyun;Park Han-Sang
    • Proceedings of the KSPS conference
    • /
    • 2006.05a
    • /
    • pp.63-66
    • /
    • 2006
  • The present study investigates the vowels in Bahasa Malaysia and Bahasa Indonesia in terms of the first two formant frequencies. For this study, we recruited 30 male native speakers of Bahasa Malaysia and Bahasa Indonesia (15 each) and examined 6 vowels (i, e, a, o, u, ə) in various contexts. The present study provides a three-dimensional vowel space by plotting F1, F2, and the frequency of datapoints. This study is significant in that the geophysics of vowel space presents yet another view of the vowel space.