• Title/Summary/Keyword: formant synthesis

Search Result 34, Processing Time 0.028 seconds

A Study on Vowel Formant Variation by Vocal Tract Modification (성도 변형에 따른 모음 포먼트의 변화 고찰)

  • Yang, Byung-Gon
    • Speech Sciences
    • /
    • v.3
    • /
    • pp.83-92
    • /
    • 1998
  • Vowels are classified by vocal tract shapes. These shapes form constriction points along the tract, which have an influence on such vocal tract resonance as $F_l,\;F_2,\;F_3$, and so on. This study reviews the perturbation theory of the tract and determines the corresponding formant frequencies from modified vocal tracts using vocal tract area function. Then, formant variation is observed from the theory. Finally, each set of $F_l,\;F_2,\;and\;F_3$ frequency is input to a speech synthesis software to make a vowel sound. Auditory impression of each sound without any modification of its vocal tract shape is almost the same as the corresponding phonetic symbol. Formant frequencies of $F_l,\;F_2,\;F_3$ vary according to the perturbation theory. Generally, constriction along the node causes formant values to decrease while constriction along the anti-node cause it to increase. Vocal tracts modified by more than $3\;cm^2$ change vowel qualities of /a/ and /i/ into those of f /v/ and /$\varepsilon$/, respectively. This study will be helpful in simulating sounds from modified vocal tracts before any operation. Further studies are desirable to compare vocal tract shapes of various languages and their sounds together.

  • PDF

A Study on the Formant Comparison of Korean Monophthongs according to Age and Gender -A Survey on Patients in Oriental Hospitals- (연령 및 성별에 따른 한국인 단모음 포먼트 비교에 관한 연구 -한방병원 내원환자를 중심으로-)

  • Kim, Young-Su;Kim, Keun Ho;Kim, Jong Yeol;Jang, Jun-Su
    • Phonetics and Speech Sciences
    • /
    • v.5 no.1
    • /
    • pp.73-80
    • /
    • 2013
  • Formant is one of the essential vocal features for research of voice production, recognition and synthesis. Numerous studies were established on foreign languages including English vowels. However, studies related to Korean were done with a limited number of voice data. In this study, we compare four formants according to age and gender using a large number of Korean monophthongs. A total of 2614 Korean speakers participated in our experiments. We summarize statistical results by mean and standard deviation for each formant of five monophthongs. The results show a notable difference in each age and gender group. A quantitative study based on a large dataset is suggested for future studies on Korean speech sounds.

Flattening Techniques for Pitch Detection (피치 검출을 위한 스펙트럼 평탄화 기법)

  • 김종국;조왕래;배명진
    • Proceedings of the IEEK Conference
    • /
    • 2002.06d
    • /
    • pp.381-384
    • /
    • 2002
  • In speech signal processing, it Is very important to detect the pitch exactly in speech recognition, synthesis and analysis. but, it is very difficult to pitch detection from speech signal because of formant and transition amplitude affect. therefore, in this paper, we proposed a pitch detection using the spectrum flattening techniques. Spectrum flattening is to eliminate the formant and transition amplitude affect. In time domain, positive center clipping is process in order to emphasize pitch period with a glottal component of removed vocal tract characteristic. And rough formant envelope is computed through peak-fitting spectrum of original speech signal in frequency domain. As a results, well get the flattened harmonics waveform with the algebra difference between spectrum of original speech signal and smoothed formant envelope. After all, we obtain residual signal which is removed vocal tract element The performance was compared with LPC and Cepstrum, ACF 0wing to this algorithm, we have obtained the pitch information improved the accuracy of pitch detection and gross error rate is reduced in voice speech region and in transition region of changing the phoneme.

  • PDF

Study on the Vehicle Sound Based on the Formant Filter and Musical Harmonics (포먼트 필터와 음악 화성학에 기반한 차량 음질 연구)

  • Chang, Kyoung-Jin;Park, Dong Chul
    • Transactions of the Korean Society for Noise and Vibration Engineering
    • /
    • v.25 no.8
    • /
    • pp.525-531
    • /
    • 2015
  • Driving sound is an effective element to promote the product identity of a vehicle by providing customers with attractive sound which reflects the concept of a vehicle. Recently, major automakers are focusing on the target sound setting so that the sound can represent the brand image as well as the unique concept of a vehicle. In this study, a new method of target setting for the driving sound will be introduced based on using formant filter and musical harmonics characteristics. In addition, a target sound suggested from this method will be realized and verified by using active noise control in vehicle.

Measurement of the vocal tract area of vowels By MRI and their synthesis by area variation (MRI에 의한 모음의 성도 단면적 측정 및 면적 변이에 따른 합성 연구)

  • Yang, Byung-Gon
    • Speech Sciences
    • /
    • v.4 no.1
    • /
    • pp.19-34
    • /
    • 1998
  • The author collected and compared midsagittal, coronal, coronal oblique, and transversal images of Korean monophthongs /a, i, e, o, u, i, v/ produced by a healthy male speaker using 1.5 T MR, VISION. Area was measured by computer software after tracing the cross-section at different points along the tract. Results showed that the width of the oral and pharyngeal cavities varied compensatorily from each other on the midsagittal dimension. Formant frequency values estimated from the area functions of the seven vowels showed a strong correlation (r=0.978) with those analyzed from the spoken vowels. Moreover, almost all of 35 students who listened to the synthesized vowels from area data perceived the synthesized vowels as equivalent to the spoken ones. Movement of constriction points of vowel /u/ with wider lip opening sounded /i/ and led to slight changes in vowel quality. Jaw and tongue movement led to major volume variation with an anatomical limitation. Each comer vowel varied systematically from a somewhat constant volume of the average area. Thus, the author proposed that any simulation studies related to vocal tract area variation should reflect its constant volume. The results may be helpful to verify exact measurement of the vocal tract area through vowel synthesis and a simulation study before having any operation of the vocal tract.

  • PDF

A Study on the Implementation of Korean Synthesis-By-Rule System Using Formant Synthesis Method (포만트합성법을 이용한 한국어 규칙합성시스템의 구현에 관한 연구)

  • 조철우;이태원
    • The Journal of the Acoustical Society of Korea
    • /
    • v.9 no.6
    • /
    • pp.38-44
    • /
    • 1990
  • 포만트 합성법을 이용하여 규칙합성시스템을 구현한 일례를 제시한다. 먼저 음소의 입력을 위한 영문 알파벳과 음소의 대응관계를 설정한 뒤 수집된 자연음성으로부터 포만트 합성을 위한 특징 파라미 터를 추출하여 데이터베이스를 작성하다. 그 다음 이러한 데이터베이스를 이용하여 제시된 음소간을 연 결하는 규칙을 제안하고 음소단위의 합성을 행한다. 합성에는 신호처리 프로세서를 사용한 실시간 포만 트 음성합성기를 구현하여 사용하였다. 합성결과 단독음소와 연결음소에 대하여 합성음성을 얻고 이를 평가하였다.

  • PDF

A Study on the Phoneme Based Analysis of Korean Initial Plosives Using Statistical Method and Perception Tests (통계적 방법과 인지실험을 통한 한국어 초성파열음의 음소단위 분석에 관한 연구)

  • Jo Cheol-Woo;Lee Woo-Sun;Lee Cyu-Ho;Kim Jong-Ahn;Lim Gwang-Il;Lee Tae-Won
    • The Journal of the Acoustical Society of Korea
    • /
    • v.8 no.5
    • /
    • pp.78-85
    • /
    • 1989
  • This paper describes a statistical methods and perception test for extracting the parameters to be used for the synthesis-by-rule of Korean plosives. Formant synthesizer is chosen for the synthesis of the phonemes. Speech materials for the analysis consists of 72 CV monosyllables from the single male speaker. The analysis is done mainly focused on the variation of parameters in time and frequency domain, then perception tests are executed to estimate the effects of variations of the formant transitions.

  • PDF

An Acoustical Study of Korean Diphthongs (한국어 이중모음의 음향학적 연구)

  • Yang Byeong-Gon
    • MALSORI
    • /
    • no.25_26
    • /
    • pp.3-26
    • /
    • 1993
  • The goals of the present study were (3) to collect and analyze sets of fundamental frequency (F0) and formant frequency (F1, F2, F3) data of Korean diphthongs from ten linguistically homogeneous speakers of Korean males, and (2) to make a comparative study of Korean monophthongs and diphthongs. Various definitions, kinds, and previous studies of diphthongs were examined in the introduction. Procedures for screening subjects to form a linguistically homogeneous group, time point selection and formant determination were explained in the following section. The principal findings were as follows: 1. Much variation was observed in the ongliding part of diphthongs. 2. F2 values of (j) group descended while those of [w] group ascended, 3. The average duration of diphthongs were about 110 msec, and there was not much variation between speakers and diphthongs. 4. In a comparative study of monophthongs and diphthongs, Fl and F2 values of the same offgliding part at the third time point almost converged. 5. The gliding of diphthongs was very short beginning from the h-noise. Perceptual studies using speech synthesis are desirable to find major parameters for diphthongs. The results of the present study wi11 be useful in the area of automated speech recognition and computer synthesis of speech.

  • PDF

A Study on the Pitch Extraction Improvement Using LSP for the Synthesis of High Speech Quality (고음질 음성합성을 위한 LSP를 이용한 피치검출 성능향상에 관한 연구)

  • Seo, Ji-Ho;Kim, Jong-Kuk;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.1
    • /
    • pp.69-75
    • /
    • 2010
  • In this paper, the pitch is detected after the elimination of formant ingredients by flattening the spectrum in frequency domain. In order to remove impact of formant and transition frequency in the signal spectrum, formant envelop is made by linear interpolation with any points each sub-band and the spectrum of speech signal is compensated by the reverse of the envelop interpolated linearly after we divide frequency band into several segment based on LSP and detect the points. The experimental result showed the proposed method appeared an outstanding performance in compared with LPC, Cepstrum, Lifter methods. The method reduced the gross error rate 1.30% than the LPC method which appeared a good performance except the proposed method. Also, the proposed method showed low error rate in noise environment.

Speech Synthesis using Diphone Clustering and Improved Spectral Smoothing (다이폰 군집화와 개선된 스펙트럼 완만화에 의한 음성합성)

  • Jang, Hyo-Jong;Kim, Kwan-Jung;Kim, Gye-Young;Choi, Hyung-Il
    • The KIPS Transactions:PartB
    • /
    • v.10B no.6
    • /
    • pp.665-672
    • /
    • 2003
  • This paper describes a speech synthesis technique by concatenating unit phoneme. At that time, a major problem is that discontinuity is happened from connection part between unit phonemes, especially from connection part between unit phonemes recorded by different persons. To solve the problem, this paper uses clustered diphone, and proposes a spectral smoothing technique, not only using formant trajectory and distribution characteristic of spectrum but also reflecting human's acoustic characteristic. That is, the proposed technique performs unit phoneme clustering using distribution characteristic of spectrum at connection part between unit phonemes and decides a quantity and a scope for the smoothing by considering human's acoustic characteristic at the connection part of unit phonemes, and then performs the spectral smoothing using weights calculated along a time axes at the border of two diphones. The proposed technique removes the discontinuity and minimizes the distortion which can be occurred by spectrum smoothing. For the purpose of the performance evaluation, we test on five hundred diphones which are extracted from twenty sentences recorded by five persons, and show the experimental results.