• Title/Abstract/Keywords: vocal tract filter

16 search results (processing time: 0.022 s)

Vocal Tract Modeling with Unfixed Sectionlength Acoustic Tubes (USLAT)

  • 김동준
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • Vol. 59, No. 6
    • /
    • pp.1126-1130
    • /
    • 2010
  • Speech production can be viewed as a filtering operation in which a sound source excites a vocal tract filter. In linear-prediction acoustic tube modeling, the vocal tract is modeled as a chain of cylinders of varying cross-sectional area. The most common implementation assumes tube sections of equal length, so a large number of sections is needed to model complex vocal tract shapes. This paper proposes a new vocal tract model with unfixed section lengths, which uses a reduced lattice filter to model the vocal tract. The model converts the lattice filter to a reduced structure and uses a modified version of the Burg algorithm. When the conventional and proposed models are implemented with the same order of linear prediction analysis, the proposed model produces more accurate results than the conventional one. For a system with a similar level of accuracy, it may therefore be possible to reduce the number of lattice filter stages. The proposed model also produces a more accurate vocal tract shape than the conventional one.
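
For context, the conventional equal-length tube model that this abstract contrasts against can be sketched as follows. This is only an illustration of the standard Burg recursion, not the paper's reduced lattice filter or modified Burg algorithm. The function names `burg_reflection_coefficients` and `tube_areas`, the sign convention for the area ratio, and the synthetic test frame are all assumptions made for this example.

```python
import random


def burg_reflection_coefficients(signal, order):
    """Estimate lattice reflection coefficients with the standard Burg recursion."""
    # Forward and backward prediction errors, aligned so f[i] pairs with b[i].
    f = list(signal[1:])
    b = list(signal[:-1])
    coeffs = []
    for _ in range(order):
        num = -2.0 * sum(fi * bi for fi, bi in zip(f, b))
        den = sum(fi * fi for fi in f) + sum(bi * bi for bi in b)
        k = num / den if den > 0.0 else 0.0
        coeffs.append(k)
        # Lattice update, then shift so the errors stay aligned for the next stage.
        f_next = [fi + k * bi for fi, bi in zip(f, b)]
        b_next = [bi + k * fi for fi, bi in zip(f, b)]
        f, b = f_next[1:], b_next[:-1]
    return coeffs


def tube_areas(coeffs, first_area=1.0):
    """Map reflection coefficients to relative areas of equal-length tube sections.

    Assumed convention: A[i+1] = A[i] * (1 - k[i]) / (1 + k[i]). Other texts
    use the opposite sign or number the sections from the other end.
    """
    areas = [first_area]
    for k in coeffs:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return areas


# Synthetic stand-in for a speech frame: a second-order resonance driven by
# seeded noise (an assumption; real use would pass a windowed speech frame).
rng = random.Random(0)
frame = [0.0, 0.0]
for _ in range(400):
    frame.append(1.3 * frame[-1] - 0.8 * frame[-2] + rng.gauss(0.0, 1.0))

reflection = burg_reflection_coefficients(frame, order=8)
areas = tube_areas(reflection)
```

Each lattice stage here corresponds to one tube section of fixed length, which is why a finer description of the tract shape requires a higher analysis order in the conventional model.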

On the Implementation of Articulatory Speech Simulator Using MRI

  • 조철우
    • Speech Sciences
    • /
    • Vol. 2
    • /
    • pp.45-55
    • /
    • 1997
  • This paper describes the procedure for implementing an articulatory speech simulator that models the human articulatory organs and then synthesizes speech from the model. The images required to construct the vocal tract model were obtained by MRI and used to construct 2D and 3D vocal tract shapes. The 3D vocal tract shapes were constructed by spatially concatenating and interpolating sectional MRI images. The 2D vocal tract shapes were constructed and automatically converted into a digital filter model, from which the corresponding speech sounds were then synthesized. All procedures in this study were carried out in MATLAB.


Vocal Tract Normalization Using The Power Spectrum Warping

  • 유일수;김동주;노용완;홍광석
    • Korean Institute of Electrical Engineers (KIEE): Conference Proceedings
    • /
    • KIEE 2003 Conference Proceedings, Information and Control Section A
    • /
    • pp.215-218
    • /
    • 2003
  • Vocal tract normalization is known to be a successful method for improving the accuracy of speech recognition. A low-complexity, maximum-likelihood-based frequency warping procedure has generally been applied for vocal tract normalization. In this paper, we propose a new power spectrum warping procedure that improves vocal tract normalization performance over the frequency warping procedure. The method can be implemented simply by modifying the power spectrum of the filter bank in Mel-frequency cepstral coefficient (MFCC) analysis. An experimental study compared the proposed method with the well-known frequency warping method. The results show that power spectrum warping gives recognition performance about 50% better than frequency warping.


Implementation of Continuous Utterance Using Buffer Rearrangement for Articulatory Synthesizer

  • 이희승;정명진
    • Korean Institute of Electrical Engineers (KIEE): Conference Proceedings
    • /
    • KIEE 2002 Summer Conference Proceedings D
    • /
    • pp.2454-2456
    • /
    • 2002
  • Because articulatory synthesis models the human vocal organs as precisely as possible, it is potentially the most desirable method for producing various words and languages. This paper proposes a new type of articulatory synthesizer using the Mermelstein vocal tract model and the Kelly-Lochbaum digital filter. Previous research has assumed that the length of the vocal tract, or the number of its cross sections, does not vary during an utterance. However, continuous utterances cannot be implemented easily under this assumption. In this paper, the limitation is overcome by "Buffer Rearrangement" for a dynamic vocal tract.

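
The Kelly-Lochbaum filter named in the abstract above is built from scattering junctions between adjacent tube sections. The following is a minimal sketch of one such junction for pressure waves, assuming the convention r = (A_i - A_{i+1}) / (A_i + A_{i+1}). The function names and the area values are hypothetical, and the paper's "Buffer Rearrangement" scheme is not described in the abstract, so it is not reproduced here.

```python
def junction_reflections(areas):
    """Reflection coefficient at each junction between adjacent tube sections."""
    return [(a - b) / (a + b) for a, b in zip(areas, areas[1:])]


def scatter(r, forward_in, backward_in):
    """One Kelly-Lochbaum scattering junction for pressure waves.

    forward_in arrives from the glottis side, backward_in from the lips side.
    Returns (forward_out, backward_out), the waves leaving the junction.
    """
    forward_out = (1.0 + r) * forward_in - r * backward_in
    backward_out = r * forward_in + (1.0 - r) * backward_in
    return forward_out, backward_out


# Hypothetical cross-sectional areas, ordered from glottis to lips.
areas = [2.0, 1.0, 1.0, 3.0]
coeffs = junction_reflections(areas)
```

A fixed number of sections means a fixed number of junctions and delay buffers, which is the assumption the abstract says makes continuous utterances hard to implement.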

Extraction of Speech Features for Emotion Recognition

  • 권철홍;송승규;김종열;김근호;장준수
    • Phonetics and Speech Sciences
    • /
    • Vol. 4, No. 2
    • /
    • pp.73-78
    • /
    • 2012
  • Emotion recognition is an important technology in the field of human-machine interfaces. To apply speech technology to emotion recognition, this study aims to establish a relationship between emotional groups and their corresponding voice characteristics by investigating various speech features, including features related to the speech source and the vocal tract filter. Experimental results show that the statistically significant speech parameters for classifying the emotional groups are mainly related to the speech source: jitter, shimmer, F0 (F0_min, F0_max, F0_mean, F0_std), harmonic parameters (H1, H2, HNR05, HNR15, HNR25, HNR35), and SPI.
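
Several of the source-related parameters listed above have simple textbook definitions. The sketch below computes local jitter and shimmer as the mean absolute cycle-to-cycle difference divided by the mean, plus basic F0 statistics. The abstract does not say which definitions or tools the authors used, so the formulas, the function name `relative_perturbation`, and the sample values are assumptions for illustration only.

```python
import statistics


def relative_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Hypothetical per-cycle measurements from a sustained vowel.
periods = [0.0100, 0.0102, 0.0099, 0.0101]  # glottal cycle lengths in seconds
amplitudes = [0.80, 0.78, 0.82, 0.79]       # peak amplitude of each cycle

jitter = relative_perturbation(periods)      # period perturbation
shimmer = relative_perturbation(amplitudes)  # amplitude perturbation

f0 = [1.0 / p for p in periods]
f0_summary = {
    "F0_min": min(f0),
    "F0_max": max(f0),
    "F0_mean": statistics.mean(f0),
    "F0_std": statistics.pstdev(f0),
}
```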

A Study on the Estimation of Glottal Spectrum Slope Using the LSP (Line Spectrum Pairs)

  • 민소연;장경아
    • Speech Sciences
    • /
    • Vol. 12, No. 4
    • /
    • pp.43-52
    • /
    • 2005
  • The common form of the pre-emphasis filter is $H(z) = 1 - az^{-1}$, where $a$ typically lies between 0.9 and 1.0 for voiced signals. This value reflects the degree of filtering and equals R(1)/R(0) in the autocorrelation method. This paper proposes a new flattening algorithm to compensate for the high-frequency components that are weakened by vocal cord characteristics. We used the interval information of the LSP to estimate the formant frequencies. After the slope and inverse slope are obtained by linear interpolation between the formant frequencies, the flattening process is applied. Experimental results show that the proposed algorithm flattens the weakened high-frequency components effectively. That is, the flattening characteristics are improved by using the LSP interval information as the flattening factor when compensating for the weakened high-frequency components.

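
The first-order pre-emphasis filter described above, with its coefficient taken from the autocorrelation ratio R(1)/R(0), can be sketched as follows. This covers only the conventional filter stated in the abstract, not the proposed LSP-based flattening algorithm. The function names and the toy frame are assumptions for illustration.

```python
def autocorrelation(x, lag):
    """Unnormalized autocorrelation R(lag) of a finite frame."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))


def pre_emphasis_coefficient(x):
    """Adaptive coefficient a = R(1) / R(0); returns 0.0 for an all-zero frame."""
    r0 = autocorrelation(x, 0)
    return autocorrelation(x, 1) / r0 if r0 > 0.0 else 0.0


def pre_emphasize(x, a):
    """Apply H(z) = 1 - a z^{-1}: y[n] = x[n] - a * x[n-1], taking x[-1] as 0."""
    if not x:
        return []
    return [x[0]] + [x[n] - a * x[n - 1] for n in range(1, len(x))]


frame = [1.0, 1.0, 1.0, 1.0]  # toy frame with strong low-frequency content
a = pre_emphasis_coefficient(frame)
flattened = pre_emphasize(frame, a)
```

For this constant frame, R(0) = 4 and R(1) = 3, so a = 0.75 and every sample after the first is reduced to 0.25. On longer, strongly low-pass frames the ratio moves closer to 1, in line with the 0.9 to 1.0 range cited for voiced signals.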

Modified Mel Frequency Cepstral Coefficient for Korean Children's Speech Recognition

  • 유재권;이경미
    • The Journal of the Korea Contents Association
    • /
    • Vol. 13, No. 3
    • /
    • pp.1-8
    • /
    • 2013
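  • This paper proposes a new feature extraction algorithm to improve speech recognition for young Korean-speaking children. The proposed algorithm combines three methods. First, vocal tract normalization is used to compensate for the acoustic characteristics of young children, whose vocal tracts are shorter than those of adults. Second, uniform bandwidths are used to compensate for the acoustic characteristics of young children's speech, which is concentrated in higher spectral regions than adult speech. Finally, a smoothing filter is used so that the recognizer is robust to noise in real-time environments. Experiments confirmed that the proposed feature extraction technique, which combines these three methods, helps improve speech recognition performance for young children.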

A Preliminary Study on Correlation between Voice Characteristics and Speech Features

  • 한성만;김상범;김종열;권철홍
    • Phonetics and Speech Sciences
    • /
    • Vol. 3, No. 4
    • /
    • pp.85-91
    • /
    • 2011
  • Sasang constitutional medicine uses voice characteristics to diagnose a person's constitution. To classify Sasang constitutional groups using speech information technology, this study aims to establish the relationship between Sasang constitutional groups and their corresponding voice characteristics by investigating various speech feature variables. The speech variables include features related to the speech source and the vocal tract filter. Experimental results show a statistically significant correlation between voice characteristics and some of the speech feature variables.


Development of an Objective Measuring Device for the Nasal Resonance using the Vibratory Sensor

  • 박용재;최홍식;김광문;홍원표
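    • Korean Society of Logopedics and Phoniatrics: Conference Proceedings
    • /
    • Korean Society of Logopedics and Phoniatrics, 2nd Conference (1994), Program and Abstracts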
    • /
    • pp.84-84
    • /
    • 1994
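  • The human voice is produced when the glottal sound generated at the vocal folds is resonated and filtered in the vocal tract. The vocal tract has a main passage, which runs from the larynx through the hypopharynx and oropharynx to the oral cavity, and an auxiliary passage, which runs through the hypopharynx, oropharynx, and nasopharynx to the nasal cavity. During ordinary vowel production, the passage to the oral cavity acts as the main resonating cavity and the nasal passage plays little role, but when nasal consonants such as 'ㄴ, ㅁ, ㅇ' (/n/, /m/, /ŋ/) are produced, the nasal passage acts as the main resonating cavity. (remainder omitted)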


Qualitative Classification of Voice Quality of Normal Speech and Derivation of its Correlation with Speech Features

  • 김정민;권철홍
    • Phonetics and Speech Sciences
    • /
    • Vol. 6, No. 1
    • /
    • pp.71-76
    • /
    • 2014
  • In this paper, the voice quality of normal speech is qualitatively classified into five components: breathy, creaky, rough, nasal, and thin/thick voice. To determine whether a correlation exists between subjective and objective measures of voice, each voice is perceptually evaluated on a 1/2/3 scale by speech processing specialists and acoustically analyzed using speech analysis tools such as Praat, MDVP, and VoiceSauce. The speech parameters include features related to the speech source and the vocal tract filter. The statistical analysis uses a two-independent-samples non-parametric test. The analysis identified a significant correlation between the speech feature parameters and the components of voice quality.
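
The abstract does not name the specific two-independent-samples non-parametric test. One common test of that kind is the Mann-Whitney U test, and the sketch below computes only its U statistic, without a p-value, as an assumed illustration of how an acoustic feature could be compared between two perceptual rating groups. The function name and all data values are hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: pairs where a > b count 1, ties count 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical values of one acoustic feature, grouped by perceptual rating.
low_rated = [0.8, 1.1, 0.9, 1.0]
high_rated = [1.4, 1.2, 1.6, 1.3]

u_high = mann_whitney_u(high_rated, low_rated)
u_low = mann_whitney_u(low_rated, high_rated)
```

The two U values always sum to the product of the sample sizes. A value near that product, or near zero, indicates that one group's feature values are consistently larger than the other's.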