• Title/Summary/Keyword: Formant Frequency


An Experimental Phonetic Analysis on Japanese Vowels of Japanese Natives (일본인 화자의 일본어 모음에 관한 실험음성학적 분석)

  • Lee Jae-Gang
    • MALSORI / no.33_34 / pp.57-69 / 1997
  • In this paper, I examine formant patterns based on LPC analysis. Five Japanese vowels (a, i, u, e, o) are examined in two conditions: in isolation and in carrier sentences. The results for native Japanese speakers show that each vowel forms its own distinct group, and the vowels differ from one another to a statistically significant degree. In the F1 analysis of the vowels grouped by the informant's sex, the vowel (a) shows the greatest standard deviation regardless of sex. In the F2 analysis, each vowel again differs to a statistically significant degree. The large standard deviation of the male speakers' [u] indicates that Japanese males vary greatly in tongue frontness when articulating [u]. Isolated vowels and vowels in carrier sentences show little statistically significant difference in F1 and F2 frequency values. In another contrastive analysis between the isolated vowel group and the carrier-sentence vowel group, whether a vowel is articulated in isolation or in a sentence appears to have little effect on its formant frequency.


A Study on Speaker Recognition using the Peak and valley pitch detection and the Fuzzy (국부 봉우리와 골에 의한 피치 검출과 퍼지를 이용한 화자 인식에 관한 연구)

  • 김연숙;김희주;김경재
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.1 / pp.213-219 / 2004
  • This paper proposes a speaker recognition algorithm that includes a pitch parameter derived from local peaks and valleys. The time-frequency hybrid method for pitch extraction is valuable in that it can improve resolution in the time domain and accuracy in the frequency domain at the same time. The method builds reference patterns using membership functions and recognizes common vocal tract characteristics using fuzzy pattern matching, in order to accommodate the range of time variation in non-linear utterances. For the proposed method, speaker recognition experiments are carried out using vowels and number sounds.

Error Correction and Praat Script Tools for the Buckeye Corpus of Conversational Speech (벅아이 코퍼스 오류 수정과 코퍼스 활용을 위한 프랏 스크립트 툴)

  • Yoon, Kyu-Chul
    • Phonetics and Speech Sciences / v.4 no.1 / pp.29-47 / 2012
  • The purpose of this paper is to show how to convert the label files of the Buckeye Corpus of Conversational Speech [1] into Praat format and to introduce some of the Praat scripts that will enable linguists to study various aspects of spoken American English present in the corpus. During the conversion process, several types of errors were identified and corrected either manually or automatically by the use of scripts. The Praat script tools that have been developed can help extract from the corpus large amounts of phonetic measures such as the VOT of plosives, the formants of vowels, word frequency information, and speech rates that span several consecutive words. The script tools can also extract additional information concerning the phonetic environment of the target words or allophones.

The Technique of Spectrum Flattening by Algorithm for Minimized Harmonics Variance Value (Harmonic 분산값 최소화 알고리즘에 의한 주파수 영역 평탄화 기법)

  • Min, So-Yeon;Kim, Young-Kyu
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.9 / pp.3558-3562 / 2010
  • Exact extraction of the fundamental frequency (pitch) is important in speech signal processing. However, exact pitch extraction from a speech signal is very difficult because of the effects of formants and transitional amplitude. In this paper, the pitch is therefore detected after flattening the spectrum in the frequency domain with a proposed algorithm that minimizes the variance of the harmonics. Experimental results showed that the proposed method performed markedly better than the LPC and cepstrum methods. The results also show that the proposed method is better than the conventional method.

Feature analysis of deaf students' English language by frequency (청각장애학생의 영어 발성 주파수별 특징 분석)

  • Lee, Gun-Min;Park, Hye Jung
    • Journal of the Korean Data and Information Science Society / v.25 no.4 / pp.819-828 / 2014
  • In this paper, we analyze the characteristics of the English vocalization of deaf students and present basic data for the development of personalized English learning aids that reflect these features. We visited special schools for the hearing impaired in Seoul and Daegu and recorded the students' English vocalization. We analyzed the data using Praat, a professional voice analysis program. The voice features of the deaf students' English vocalization were extracted and then compared with those of non-deaf students.

An Analysis of Acoustic Features Caused by Articulatory Changes for Korean Distant-Talking Speech

  • Kim Sunhee;Park Soyoung;Yoo Chang D.
    • The Journal of the Acoustical Society of Korea / v.24 no.2E / pp.71-76 / 2005
  • Compared to normal speech, distant-talking speech is characterized by the acoustic effects of interfering sound and echoes as well as by articulatory changes resulting from the speaker's effort to be more intelligible. In this paper, the acoustic features of distant-talking speech that arise from these articulatory changes are analyzed and compared with those of the Lombard effect. In order to examine the effect of different distances and articulatory changes, speech recognition experiments were conducted for normal speech as well as distant-talking speech at different distances using HTK. The speech data used in this study consist of 4500 distant-talking utterances and 4500 normal utterances from 90 speakers (56 males and 34 females). The acoustic features selected for the analysis were duration, formants (F1 and F2), fundamental frequency, total energy, and energy distribution. The results show that the acoustic-phonetic features of distant-talking speech correspond mostly to those of Lombard speech, in that the main acoustic changes between normal and distant-talking speech are an increase in vowel duration, a shift in the first and second formants, an increase in fundamental frequency, an increase in total energy, and a shift in energy from the low frequency band to the middle or high bands.

Korean Monophthong Development in Normal 4-, 5-, and 6-Years-Olds (4세, 5세, 6세 정상 아동의 한국어 단모음 발달)

  • Kang, Eunyeong
    • Journal of The Korean Society of Integrative Medicine / v.7 no.4 / pp.89-104 / 2019
  • Purpose : The purpose of this study was to investigate the development of Korean vowels by acoustically analyzing whether children between the ages of 4 and 6 produce Korean vowels differently according to their age and gender. Methods : A total of 104 children aged 4~6 years (56 males and 48 females) participated in this study. The participants were classified as either 4, 5, or 6 years old. Vowel speech data were obtained by asking the subjects to pronounce meaningful words in which the vowel in question was located in the first syllable. Speech analysis was performed using the Multi-speech 3700 program. Results : Age, gender, and the vowel being pronounced all had significant effects on intensity. Intensity decreased significantly with increasing age and was significantly higher in male children than in female children. Neither age, gender, nor the vowel being produced affected the fundamental frequency. Age and vowel had considerable effects on the first and second formants, which decreased significantly with age and showed no gender difference. Conclusion : The results of this study showed that children aged 4~6 have similar anatomical structures, but that the maturity of the speech motor skills required to pronounce vowels was correlated with age. The results of this study can be used to evaluate children's speech and develop speech therapy programs.

Development of 3-Ch EGG System Using Modulation and Demodulation Techniques(I) (변복조 방식을 이용한 3-채널 EGG 시스템의 개발(I))

  • Kim, J.M.;Song, C.G.;Lee, M.H.
    • Proceedings of the KOSOMBE Conference / v.1993 no.05 / pp.134-135 / 1993
  • The purpose of this research is to develop an EGG system for quantitative assessment of laryngeal function using speech and electroglottographic data. The designed EGG system is a 4-electrode system in which the excitation current is supplied from the 1st to the 4th electrode. The output signals from the 2nd and 3rd electrodes, which are modulated at the frequency of the excitation current source, are air-pressure waveforms from the vocal folds. After demodulation, we obtain the pitch signals of the modulated waveforms through a differentiator that cuts off frequencies below 0.1 Hz. Conventional pitch extraction methods relied on software processing, but the proposed system is implemented in analog hardware in order to eliminate interference from the low formant frequencies of speech. We will construct a database that discriminates between pathological subjects and control groups for each case. Using the proposed 3-channel EGG system and the LMS algorithm, the distinctive characteristics of laryngeal function in voiced and other regions will be detected from EGG signals and LPC spectra.


A STUDY ON THE IMPLEMENTATION OF ARTIFICIAL NEURAL NET MODELS WITH FEATURE SET INPUT FOR RECOGNITION OF KOREAN PLOSIVE CONSONANTS (한국어 파열음 인식을 위한 피쳐 셉 입력 인공 신경망 모델에 관한 연구)

  • Kim, Ki-Seok;Kim, In-Bum;Hwang, Hee-Yeung
    • Proceedings of the KIEE Conference / 1990.07a / pp.535-538 / 1990
  • The main problem in speech recognition is the enormous variability in acoustic signals due to complex but predictable contextual effects. In plosive consonants especially, it is very difficult to find invariant cues because of various contextual effects, yet humans use these contextual effects as helpful information in plosive consonant recognition. In this paper we experimented with three artificial neural net models for the recognition of plosive consonants. Neural Net Model I used the "Multi-layer Perceptron". Model II used a variation of the "Self-organizing Feature Map Model". Model III used the "Interactive and Competitive Model" to experiment with contextual effects. The recognition experiment was performed on 9 Korean plosive consonants. We used VCV speech chains for the experiment on contextual effects. The speech chains consist of the Korean plosive consonants /g, d, b, K, T, P, k, t, p/ (/ㄱ, ㄷ, ㅂ, ㄲ, ㄸ, ㅃ, ㅋ, ㅌ, ㅍ/) and eight Korean monophthongs. The inputs to the neural net models were several temporal cues (duration of the silence, transition, and VOT) together with the extent of the VC formant transitions, the presence of voicing energy during closure, burst intensity, the presence of aspiration, the amount of low-frequency energy present at voicing onset, and the extent of the CV formant transitions, all taken from the acoustic signals. Model I showed a recognition rate of about 55-67%, Model II about 60%, and Model III about 67%.


A comparative study of the acoustic characteristics of the vowel /a/ between children with spastic and dyskinetic cerebral palsy (경직형과 불수의운동형 뇌성마비아동의 /아/ 모음 음향학적 비교)

  • Jeong, Pil Yeon;Sim, Hyun Sub
    • Phonetics and Speech Sciences / v.12 no.1 / pp.65-74 / 2020
  • This study aims to compare the acoustic characteristics of vowel phonation in children with spastic and dyskinetic cerebral palsy (CP). Thirty-four children aged 4-12 years with CP participated in the study (spastic 26, dyskinetic 8). Voice samples for the acoustic analysis were extracted from a sustained vowel /a/. All acoustic measures were made using Praat. Group differences were compared with an independent t-test, or with the Welch-Aspin test if the equal-variance assumption was not met. The results of this study are as follows. First, maximum phonation time (MPT) was significantly shorter for the dyskinetic CP group than for the spastic CP group. Second, shimmer percent was significantly higher in the dyskinetic CP group than in the spastic CP group. Lastly, there were no significant group differences in either the first or the second formant. These findings indicate that children with dyskinetic CP have poorer respiratory capacity and poorer laryngeal function than children with spastic CP. On the other hand, both groups have a comparable ability to articulate the vowel /a/. The results of the present study help speech-language pathologists identify the speech motor control ability of children with the two types of CP (spastic and dyskinetic) and make an intervention plan suited to a specific type of CP.
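    The Welch-Aspin comparison mentioned in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function name `welch_t` and the sample values are hypothetical and are not data from the study. It computes the unequal-variance t statistic and the Welch-Satterthwaite degrees of freedom for two groups of unequal size.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance uses n - 1).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical phonation times in seconds, made up purely for illustration.
group_one = [6.1, 7.4, 5.8, 8.0, 6.9, 7.2]
group_two = [3.2, 4.8, 2.9, 5.5]
t_stat, dof = welch_t(group_one, group_two)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

    The resulting t and df would then be looked up in a t distribution to obtain a p-value; that step is omitted here to keep the sketch free of third-party dependencies.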