• Title/Abstract/Keywords: Speech improvement

610 search results (processing time: 0.029 s)

Critical Banded Wavelet Packet-Based Spectral Subtractions for Speech Enhancement

  • Chang, Sung-Wook;Yang, Sung-Il
    • The Journal of the Acoustical Society of Korea / Vol. 23, No. 4E / pp. 125-133 / 2004
  • In this paper, we propose a critical banded wavelet packet-based spectral subtraction for speech enhancement. Critical banded wavelet packet, which reflects the human auditory system, may lead to minimization of intelligibility loss and quality improvement of the enhanced speech in the spectral domain, when combined with an appropriate spectral subtraction gain function. The proposed method shows better performance than the conventional one in comparative assessments. We also show that, for effective evaluation of enhanced speech, it is essential to consider the characteristics of speech quality measures.

Improved Acoustic Modeling Based on Selective Data-driven PMC

  • Kim, Woo-Il;Kang, Sun-Mee;Ko, Han-Seok
    • 음성과학 (Speech Sciences) / Vol. 9, No. 1 / pp. 39-47 / 2002
  • This paper proposes an effective method to remedy the acoustic modeling problem inherent in the usual log-normal Parallel Model Composition (PMC) intended for robust speech recognition. In particular, the Gaussian kernels under the prescribed log-normal PMC cannot sufficiently express the corrupted speech distributions. The proposed scheme corrects this deficiency by judiciously selecting the 'fairly' corrupted component and re-estimating it as a mixture of two distributions using data-driven PMC. As a result, some components are merged while an equal number of components are split. Whether to split or merge is determined by measuring the similarity of the corrupted speech model to the clean model and the noise model. The experimental results indicate that the suggested algorithm is effective in representing the corrupted speech distributions and attains consistent improvement across various SNR and noise cases.


An Analysis of Formants Extracted from Emotional Speech and Acoustical Implications for the Emotion Recognition System and Speech Recognition System

  • 이서배
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 3, No. 1 / pp. 45-50 / 2011
  • Formant structure of speech associated with five different emotions (anger, fear, happiness, neutral, sadness) was analysed. Acoustic separability of vowels (or emotions) associated with a specific emotion (or vowel) was estimated using F-ratio. According to the results, neutral showed the highest separability of vowels followed by anger, happiness, fear, and sadness in descending order. Vowel /A/ showed the highest separability of emotions followed by /U/, /O/, /I/ and /E/ in descending order. The acoustic results were interpreted and explained in the context of previous articulatory and perceptual studies. Suggestions for the performance improvement of an automatic emotion recognition system and automatic speech recognition system were made.

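The abstract above ranks emotions and vowels by an F-ratio but does not give its exact definition. The following is a minimal sketch assuming one commonly used form: the variance of the class means divided by the mean within-class variance. The function name `f_ratio` and the formant values are hypothetical and are not data from the paper.

```python
from statistics import mean, pvariance


def f_ratio(groups):
    """Variance of the class means divided by the mean within-class variance.

    Assumed definition; the paper may use a different normalization.
    """
    class_means = [mean(g) for g in groups]
    between = pvariance(class_means)
    within = mean([pvariance(g) for g in groups])
    return between / within


# Hypothetical F1 measurements (Hz) for three vowels; illustration only.
well_separated = [[300.0, 310.0, 290.0], [700.0, 710.0, 690.0], [500.0, 510.0, 490.0]]
overlapping = [[480.0, 520.0, 500.0], [510.0, 490.0, 530.0], [495.0, 505.0, 515.0]]
```

Under this definition, classes whose means are far apart relative to their spread yield a larger value, so `f_ratio(well_separated)` exceeds `f_ratio(overlapping)`.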

Speech/Music Discrimination Using Spectrum Analysis and Neural Network

  • 금지수;임성길;이현수
    • 한국음향학회지 (The Journal of the Acoustical Society of Korea) / Vol. 26, No. 5 / pp. 207-213 / 2007
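  • This study proposes an effective speech/music discrimination method using spectrum analysis and a neural network. The proposed method analyzes the spectrum, extracts the MSDF (Maximum Spectral Duration Feature), a duration feature parameter, from the spectral peak tracks, and combines it with an existing feature parameter, the MFSC (Mel Frequency Spectral Coefficients), to form the features of the speech/music classifier. A neural network is used as the classifier, and the method was evaluated under various conditions: the selection and amount of training patterns, and the neural network configuration. The discrimination results confirmed improved performance over existing methods, as well as stability with respect to training-pattern selection and model configuration. With MSDF and MFSC as feature parameters and training patterns of 50 seconds or more, the classification rate was 94.97% for speech and 92.38% for music, an improvement of 1.25% for speech and 1.69% for music over using MFSC alone.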

Implementation of Variable Threshold Dual Rate ADPCM Speech CODEC Considering the Background Noise

  • 양재석;한경호
    • 대한전기학회 (Korean Institute of Electrical Engineers): Conference Proceedings / Proceedings of the 2000 KIEE Summer Conference, D / pp. 3166-3168 / 2000
  • This paper proposes a variable threshold dual rate ADPCM coding method, modified from the standard ADPCM of ITU G.726, to improve speech quality. By using the ZCR (Zero Crossing Rate), variable threshold dual rate ADPCM achieves better speech quality than single rate ADPCM in noisy environments without increasing complexity. The ZCR is used to divide the input signal samples into two categories (noisy and speech): samples with a higher ZCR are categorized as the noisy region, and samples with a lower ZCR as the speech region. The noisy region uses a higher threshold value and is compressed at 16 kbps to reduce the bit rate, while the speech region uses a lower threshold value and is compressed at 40 kbps to improve speech quality. Compared with conventional ADPCM, which adopts a fixed coding rate, the proposed method improves the noise characteristics without increasing the bit rate. For real-time applications, ZCR calculation was considered as a simple method of obtaining background noise information for speech analysis preprocessing such as FFT, and the experiment showed that this simple calculation can be used without increasing complexity. Dual rate ADPCM can efficiently decrease the amount of transferred data without increasing complexity or reducing speech quality. The results of this paper can therefore be applied to real-time speech applications such as Internet phone or VoIP.

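The ZCR-based rate selection described in the abstract above can be sketched as follows. This is an illustration only: the abstract does not give the frame length, the ZCR decision value, or how the ADPCM thresholds are adapted, so the names `zero_crossing_rate` and `select_bit_rate` and the value `zcr_threshold=0.3` are assumptions. Only the mapping of high ZCR to 16 kbps and low ZCR to 40 kbps comes from the abstract.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def select_bit_rate(frame, zcr_threshold=0.3):
    """Return the coding rate in kbps: 16 for a noisy frame, 40 for a speech frame.

    zcr_threshold is a hypothetical decision value, not taken from the paper.
    """
    return 16 if zero_crossing_rate(frame) > zcr_threshold else 40


# Hypothetical test frames of 160 samples each.
speech_like = [math.sin(2 * math.pi * 5 * n / 160) for n in range(160)]  # slow oscillation
noise_like = [0.1 * (-1) ** n for n in range(160)]  # sign flips on every sample
```

Here `select_bit_rate(speech_like)` selects the 40 kbps branch and `select_bit_rate(noise_like)` the 16 kbps branch. Counting sign changes needs only comparisons and additions, which is consistent with the abstract's claim that ZCR adds little complexity.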

Possibility of Motor Speech Improvement in People With Spinocerebellar Ataxia via Intensive Speech Treatment

  • 박영미
    • 한국콘텐츠학회논문지 (The Journal of the Korea Contents Association) / Vol. 18, No. 11 / pp. 634-642 / 2018
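  • Spinocerebellar ataxia (SCA) is a hereditary, progressive neurological disorder, and people with SCA show ataxic dysarthria caused by cerebellar atrophy. This study examines whether intensive motor speech treatment improves the progressive ataxic dysarthria of a person with SCA and, where improvement occurs, reports the degree of change before and after treatment. After the SPEAK OUT!® treatment program for improving motor speech function was administered to a 55-year-old woman with SCA, MPT and task-specific loudness improved with a large effect size, whereas pitch changed with a small effect size. Pitch range, however, changed with a large effect size. Voice quality improved with a large effect size in jitter, shimmer, and HNR alike, and the vowel space area widened, with the change in F1 being the most prominent. In addition, the VHI score fell from the severe level to the mild level after treatment. With the intensive motor speech treatment program SPEAK OUT!®, increases in loudness, pitch, pitch range, voice quality, and vowel space area were observed in the person with SCA, and the subjective perception of the voice disorder also improved. As these are preliminary results, follow-up studies are needed to verify SPEAK OUT!® more systematically as a treatment for progressive ataxic dysarthria in people with SCA.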

Performance Improvement of Adaptive Noise Cancellation Using a Speech Detector

  • Park, Jang-Sik
    • The Journal of the Acoustical Society of Korea / Vol. 15, No. 2E / pp. 39-44 / 1996
  • The performance of a two-channel adaptive noise canceller is often degraded by weight perturbation due to the speech signal. This paper proposes an adaptive noise canceller that employs a speech detector and two adaptation algorithms, which are switched according to the speech detector. When a highly correlated speech signal is detected, the tap weights of the adaptive filter are adapted by the sign algorithm. When silence is detected, or when the characteristics of the noise propagation channel change, the weights are adapted by the NLMS algorithm. The speech detector uses the power ratio of the input and the output of an adaptive linear prediction-error filter. Computer simulation shows that the proposed method performs better than conventional ones.

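The switching between the sign algorithm and NLMS described in the abstract above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the speech detector is replaced by a boolean flag `speech_present`, the step sizes `mu_nlms` and `mu_sign` are hypothetical, and the standard textbook forms of the NLMS and sign-error updates are assumed.

```python
import random


def anc_step(w, x_buf, d, speech_present, mu_nlms=0.5, mu_sign=1e-4, eps=1e-8):
    """One update of a two-channel adaptive noise canceller.

    w: tap weights; x_buf: recent reference (noise) samples, newest first;
    d: primary-channel sample (speech plus filtered noise).
    Returns the updated weights and the error e, which is the enhanced output.
    """
    y = sum(wi * xi for wi, xi in zip(w, x_buf))  # estimate of the noise in d
    e = d - y
    if speech_present:
        # Sign algorithm: only the sign of e is used, so strong speech in e
        # perturbs the weights by a bounded amount.
        step = mu_sign * ((e > 0) - (e < 0))
        w = [wi + step * xi for wi, xi in zip(w, x_buf)]
    else:
        # NLMS: step normalized by the reference power for fast convergence.
        norm = eps + sum(xi * xi for xi in x_buf)
        w = [wi + mu_nlms * e * xi / norm for wi, xi in zip(w, x_buf)]
    return w, e


def demo_silence_adaptation(n_samples=500, seed=0):
    """Adapt during silence toward a hypothetical 2-tap noise channel."""
    rng = random.Random(seed)
    h = [0.5, -0.3]  # hypothetical noise propagation channel
    w = [0.0, 0.0]
    prev = 0.0
    for _ in range(n_samples):
        cur = rng.uniform(-1.0, 1.0)
        x_buf = [cur, prev]
        d = h[0] * cur + h[1] * prev  # primary input contains filtered noise only
        w, _e = anc_step(w, x_buf, d, speech_present=False)
        prev = cur
    return w
```

Calling `demo_silence_adaptation()` drives the weights toward the assumed channel `[0.5, -0.3]`. In a full system the `speech_present` flag would come from the detector described in the abstract.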

Spectral Pattern Based Robust Speech Endpoint Detection in Noisy Environments

  • 박진수;이윤재;이인호;고한석
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 1, No. 4 / pp. 111-117 / 2009
  • In this paper, a new speech endpoint detector for noisy environments is proposed. According to previous research, the energy feature in the speech region is easily distinguished from that in the speech-absent region. In the conventional method, the endpoint is found by applying an edge detection filter that locates abrupt changes in the feature domain. However, since the frame energy feature is unstable in noisy environments, accurate edge detection is not possible. Therefore, this paper proposes a novel feature extraction method based on the spectrum envelope pattern. The edge detection filter is then applied to the proposed feature to detect the endpoint. Experiments performed in a car noise environment showed a substantial improvement over the conventional method.


Matlab Implementation of Real-time Speech Analysis Tool

  • 박일서;김대현;조철우
    • 대한음성학회지:말소리 (Malsori, Journal of the Korean Society of Phonetic Sciences and Speech Technology) / No. 44 / pp. 93-104 / 2002
  • Many speech analysis tools are available. Among them, real-time analysis tools are very useful for interactive experiments. A real-time speech analysis tool was implemented using Matlab, a very widely used general-purpose signal processing tool. In general, its computational speed is lower than that of code written in conventional programming languages. In particular, real-time analysis including signal input and result output was not possible in the past. However, owing to the improved computing power of PCs and the inclusion of real-time I/O toolboxes in Matlab, real-time analysis is now possible to some extent using Matlab alone. In this experiment, we implemented a real-time speech analysis tool using Matlab, in which pitch and spectral information is computed in real time. The results show that such real-time applications can be implemented easily using Matlab.


Filtering of a Dissonant Frequency for Speech Enhancement

  • Kang, Sang-Ki;Baek, Seong-Joon;Lee, Ki-Yong;Sun, Koeng-Mo
    • The Journal of the Acoustical Society of Korea / Vol. 22, No. 3E / pp. 110-112 / 2003
  • There have been numerous studies on the enhancement of noisy speech signals. In this paper, we propose a completely new speech enhancement scheme: filtering of a dissonant frequency (especially F# in each octave of the tempered scale) based on the fundamental frequency, developed in the frequency domain. To evaluate the performance of the proposed scheme, subjective tests (MOS tests) were conducted. The results indicate that the proposed method provides a significant gain in audible improvement, especially for speech contaminated by colored noise and for speech spoken in a husky voice. Therefore, when the filter is employed as a pre-filter for speech enhancement, the output speech quality and intelligibility are greatly enhanced.