Search | Korea Science

Efficient Compensation of Spectral Tilt for Speech Recognition in Noisy Environment (잡음 환경에서 음성인식을 위한 스펙트럼 기울기의 효과적인 보상 방법)

Cho, Jungho
- The Journal of the Institute of Internet, Broadcasting and Communication / v.17 no.1 / pp.199-206 / 2017
Environmental noise can degrade the performance of a speech recognition system. This paper presents a procedure for cepstrum-based feature compensation that makes the recognition system robust to noise. The approach is based on direct compensation of spectral tilt to remove the effects of additive noise. The noise compensation scheme operates in the cepstral domain by calculating the spectral tilt of the log power spectrum. Spectral compensation is applied in combination with SNR-dependent cepstral mean compensation. Experimental results in the presence of white Gaussian noise, subway noise, and car noise show that the proposed compensation method achieves substantial improvements in recognition accuracy at various SNRs.
https://doi.org/10.7236/JIIBC.2017.17.1.199

An Acoustic Study of Phonation Types in Vowels Following Consonant Clusters in Korean (한국어 자음군의 후행모음에 나타난 발성유형의 음향음성학적 연구)

Park, Han-Sang
- MALSORI / no.64 / pp.53-76 / 2007
This study investigates phonation types of Korean obstruents associated with the vowels immediately following singletons or geminates in intervocalic positions. F0, H1-H2, and spectral tilt were measured from the 20 ms segment at the onset of the vowels for the tokens of /paCa/ and /paCCa/, where the Cs are of the same manner and place of articulation. The results showed a remarkable change in the values of F0, H1-H2, and spectral tilt as the preceding obstruents shift from the lenis singletons to the lenis geminates, which suggests that the spectral characteristics of the vowels following the lenis geminates are not different from those of the vowels following fortis singletons or geminates. Significantly, this study adds data about the spectral characteristics of Korean phonation types.

Electroglottographic Spectral Tilt in Frequency Ranges of Vowel Sound (모음 주파수 범위에 따른 성문전도 스펙트럼 기울기)

Kim, Ji-Hye;Jang, Ae-Lan;Jung, Dong-Keun
- Journal of Sensor Science and Technology / v.24 no.4 / pp.247-251 / 2015
In this study, electroglottographic spectral tilt (EST) was investigated to characterize vocal cord vibration. EST was analyzed from the power spectrum of electroglottographic signals by dividing the frequency analysis range into a full range (0~4 octaves), a low range (0~2 octaves), and a high range (2~4 octaves). EST in all ranges was greater in females than in males. In both the female and male groups, EST of the high range was higher than that of the low range. This result suggests that EST has at least two components and that dividing the frequency range in the analysis of EST is effective for investigating the characteristics of vocal cord vibration.
https://doi.org/10.5369/JSST.2015.24.4.247

The Effect of FIR Filtering and Spectral Tilt on Speech Recognition with MFCC (FIR 필터링과 스펙트럼 기울이기가 MFCC를 사용하는 음성인식에 미치는 효과)

Lee, Chang-Young
- The Journal of the Korea Institute of Electronic Communication Sciences / v.5 no.4 / pp.363-371 / 2010
In an effort to enhance the quality of feature vector classification and thereby reduce the recognition error rate in speaker-independent speech recognition, we study the effect of spectral tilt on the Fourier magnitude spectrum en route to the extraction of MFCC. The effect of FIR filtering of the speech signal on speech recognition is also investigated in parallel. The proposed methods are evaluated in two independent ways: by the Fisher discriminant objective function and by a speech recognition test using a hidden Markov model with fuzzy vector quantization. The experiments show that, with an appropriate choice of the tilt factor, the recognition error rate improves by about 10% in relative terms over the conventional method.

Phonation Type Index k (발성유형지수 k)

Park Hansang
- Proceedings of the KSPS conference / 2002.11a / pp.77-80 / 2002
This study proposes phonation type index k as a descriptor of the overall spectral tilt that is free from the effects of fundamental frequency and vowel quality. The newly proposed phonation type index k provides a simple, single measure of the overall spectral tilt. Phonation type index k can be applied to speech technology. It can also be used in diagnosing patients' voice qualities in speech pathology. The distribution of phonation type index k, which is speaker-dependent, may be useful in forensic phonetics and voice recognition as an indicator of speaker identity.

A Spectral Compensation Method for Noise Robust Speech Recognition (잡음에 강인한 음성인식을 위한 스펙트럼 보상 방법)

Cho, Jung-Ho
- 전자공학회논문지 IE / v.49 no.2 / pp.9-17 / 2012
One of the problems in applying speech recognition systems in the real world is the degradation of performance caused by acoustical distortions. The most important source of acoustical distortion is additive noise. This paper describes a spectral compensation technique based on a spectral peak enhancement scheme followed by an efficient noise subtraction scheme for noise-robust speech recognition. The proposed methods emphasize the formant structure and compensate for the spectral tilt of the speech spectrum while maintaining broad-bandwidth spectral components. The recognition experiments were conducted using noisy speech corrupted by white Gaussian noise, car noise, babble noise, or subway noise. The new technique reduced the average error rate slightly in high SNR (signal-to-noise ratio) environments and significantly reduced the average error rate, by 1/2, in a low SNR (10 dB) environment, compared with the case without spectral compensation.

Glottal Parameters Contributing to the Perception of Loud Voices

Yi, So-Pae;Lee, One-Good;Kim, Hyung-Soon
- Speech Sciences / v.8 no.1 / pp.143-157 / 2001
This paper focused on glottal parameters contributing to the perception of loud voices, because the energy of a voice is not the only effective factor. We used a formant synthesizer to synthesize loud voices. We divided F0 tilt (the tilt of the F0 contour), SQ (Speed Quotient), OQ (Open Quotient), and TL (spectral Tilt Level) into three levels each to obtain different combinations, with default values for the other synthesizer parameters. Analysis of listening tests indicated that F0 tilt, SQ, OQ, and TL, in descending order, had a significant influence on the perception of loud voices. F0 tilt had a far more significant effect than the others. The influence of SQ increased greatly when F0 tilt was excluded as a factor. The interaction between parameters was not significant.

Assessment of autonomic function in Cerebral palsy patients during graded head-up tilt (뇌 손상 환자(Cerebral palsy)의 Head up Tilt 상태에서의 심박변동과 자율 신경 활동 평가)

Choi, J.J.;Cho, S.R.;Lee, J.H.;Lee, M.H.
- Proceedings of the KIEE Conference / 2002.07d / pp.2693-2695 / 2002
In this paper, the power spectral analysis of heart rate variability (HRV) was performed to evaluate the effects of orthostatic stress with head-up tilt on the autonomic nervous system (ANS) for 20 healthy male subjects (age: 245 yr.), and a new method was proposed to assess the autonomic balance. The ECG signals were recorded for 3 minutes in both the supine and 70° head-up tilt positions, and then the HRV signals underwent power spectrum analysis at each position. The results of this study suggest that cardiac autonomic functions, such as sympathetic tone in autonomic balance with the increment of sympathetic tone and the decrement of parasympathetic tone which occur during the head-up tilt position, are not sufficient to overcome the orthostatic stress arising in cerebral palsy.

Analysis of Voice Quality Features and Their Contribution to Emotion Recognition (음성감정인식에서 음색 특성 및 영향 분석)

Lee, Jung-In;Choi, Jeung-Yoon;Kang, Hong-Goo
- Journal of Broadcast Engineering / v.18 no.5 / pp.771-774 / 2013
This study investigates the relationship between voice quality measurements and emotional states, in addition to conventional prosodic and cepstral features. Open quotient, harmonics-to-noise ratio, spectral tilt, spectral sharpness, and band energy are analyzed as voice quality features, and prosodic features related to fundamental frequency and energy are also examined. ANOVA tests and Sequential Forward Selection are used to evaluate significance and verify performance. Classification experiments show that using the proposed features increases overall accuracy and, in particular, that errors between happy and angry decrease. Results also show that adding voice quality features to conventional cepstral features leads to an increase in performance.
https://doi.org/10.5909/JBE.2013.18.5.771

An Acoustic Study of Korean Phonation Types (한국어 발성 유형의 음향음성학적 연구)

Park, Han-Sang
- The Journal of the Acoustical Society of Korea / v.24 no.6 / pp.343-352 / 2005
Phonation type index k (PTI k) presents a single and simplified measure of the spectral tilt, which is free from the effects of fundamental frequency and vowel quality. This study investigates PTI k with vowels /i, e, a, o, u/ obtained from 10 Korean male subjects. Specifically, this study tests the significance of differences in PTI k across positions, phonation types, vowels, and speakers, respectively. The results showed that there was a significant difference in PTI k across positions, phonation types, vowels, and speakers.

