Search | Korea Science

Improvement of Signal-to-Noise Ratio for Speech under Noisy Environment (잡음환경 하에서의 음성의 SNR 개선)

Choi, Jae-Seung
- Journal of the Korea Institute of Information and Communication Engineering / v.17 no.7 / pp.1571-1576 / 2013
This paper proposes an algorithm for improving the signal-to-noise ratio (SNR) of speech signals in noisy environments. The proposed algorithm first estimates the SNR in low-SNR, mid-SNR, and high-SNR regions in order to improve the SNR of speech corrupted by background noise such as white noise and car noise. It then subtracts the noise signal from the noisy speech signal in each band using a spectrum sharpening method. In the experiments, better SNRs are obtained for white noise and car noise than with a conventional spectral subtraction method. The maximum improvement in output SNR over the spectral subtraction method was approximately 4.2 dB for white noise and 3.7 dB for car noise.
https://doi.org/10.6109/jkiice.2013.17.7.1571
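
The abstract above reports its results as an improvement in output SNR. As a minimal sketch of how such a figure is commonly computed, assuming a clean reference signal is available, the following Python measures SNR in dB and the gain of an enhanced signal over the noisy input. The function names are illustrative and are not taken from the paper.

```python
import math


def snr_db(clean, processed):
    """SNR in dB of `processed`, measured against the reference `clean` signal."""
    signal_power = sum(s * s for s in clean)
    # Everything in `processed` that differs from the reference counts as noise.
    noise_power = sum((p - s) ** 2 for s, p in zip(clean, processed))
    if noise_power == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)


def snr_improvement_db(clean, noisy, enhanced):
    """Output SNR minus input SNR, in dB."""
    return snr_db(clean, enhanced) - snr_db(clean, noisy)
```

A positive return value from `snr_improvement_db` means the enhanced signal is closer to the clean reference than the noisy input was.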

Speaker Identification Using Score-based Confidence in Noisy Environments (스코어 기반 관측신뢰도를 이용한 잡음환경하 화자식별)

Min, So-Hee;Song, Min-Gyu;Na, Seung-You;Choi, Seung-Ho;Kim, Jin-Young
- Speech Sciences / v.14 no.4 / pp.145-156 / 2007
The performance of speaker identification is severely degraded in noisy environments. Recently, a probability weighting method based on observation membership was proposed to overcome the noise problem [1]. In [1], the observation confidence was calculated from the SNR with a sigmoid function. However, estimating the SNR requires additional computation, and the estimated SNR is corrupted in dynamic noisy environments. In this paper, we propose methods for estimating the observation confidence based on score-based reliabilities (SBRs) using entropy and dispersion measures. In general, SBRs are obtained from the speaker models' probabilities. The proposed methods are evaluated on the ETRI speaker recognition DB. We compare the performance of the proposed methods with those in [1][8]. The experimental results show that the proposed methods can be applied successfully when the SNR is not available.
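
The abstract above derives an observation confidence from the speaker models' scores instead of from the SNR. The sketch below shows one plausible entropy-based reliability, assuming per-speaker log-likelihoods for a single observation. The normalization (softmax posteriors, entropy scaled by log N) is an assumption made for illustration and is not the formula from the paper.

```python
import math


def entropy_confidence(log_likelihoods):
    """Map per-speaker log-likelihoods to a confidence in [0, 1].

    1.0 means one speaker model dominates; 0.0 means all models score equally.
    """
    n = len(log_likelihoods)
    if n < 2:
        raise ValueError("need at least two speaker models")
    # Softmax with the maximum subtracted for numerical stability.
    peak = max(log_likelihoods)
    weights = [math.exp(ll - peak) for ll in log_likelihoods]
    total = sum(weights)
    posteriors = [w / total for w in weights]
    entropy = -sum(p * math.log(p) for p in posteriors if p > 0.0)
    # Normalize by the maximum possible entropy, log(n).
    return 1.0 - entropy / math.log(n)
```

A frame with a flat score distribution gets a confidence near 0 and could be down-weighted when frame scores are accumulated.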

Speech Recognition in Noisy Environments using Histogram-based Over-estimation (히스토그램 기반의 Over-estimation을 이용한 잡음환경에서의 음성인식)

권영욱
- Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.262-266 / 1998
In speech recognition under noisy environments, reducing the mismatch between training and testing environments is an important issue, and spectral subtraction is a widely used technique because of its simplicity and relatively good performance in noisy environments. In this paper, we introduce a histogram method as a reliable noise estimation approach for spectral subtraction. To deal with the problem of residual noise after spectral subtraction, we propose a new over-estimation technique based on the distribution characteristics of the histogram used for noise estimation. Since the proposed technique decides the degree of over-estimation adaptively according to the measured noise distribution, it can cope with SNR variations more effectively than the conventional over-estimation technique.
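
The two ingredients named in the abstract above, histogram-based noise estimation and over-estimated spectral subtraction, can be sketched as follows. This is a simplified illustration, not the paper's method: the noise power in one frequency bin is taken as the histogram mode over frames, and the over-estimation factor `alpha` is a fixed parameter here, whereas the paper selects it adaptively from the histogram's distribution. The names `num_bins`, `alpha`, and `beta` are assumptions.

```python
def histogram_noise_estimate(frame_powers, num_bins=20):
    """Estimate the noise power of one frequency bin as the histogram mode.

    `frame_powers` holds that bin's power in successive frames.
    """
    lo, hi = min(frame_powers), max(frame_powers)
    if hi == lo:
        return lo
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for p in frame_powers:
        idx = min(int((p - lo) / width), num_bins - 1)
        counts[idx] += 1
    mode = counts.index(max(counts))
    # Return the centre of the most populated histogram cell.
    return lo + (mode + 0.5) * width


def over_subtract(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Power spectral subtraction with over-estimation `alpha` and floor `beta`."""
    return [max(x - alpha * n, beta * n) for x, n in zip(noisy_power, noise_power)]
```

The spectral floor `beta * n` keeps the output power positive in bins where the over-estimated noise exceeds the observed power.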

Parameters Comparison in the speaker Identification under the Noisy Environments (화자식별을 위한 파라미터의 잡음환경에서의 성능비교)

Choi, Hong-Sub
- Speech Sciences / v.7 no.3 / pp.185-195 / 2000
This paper compares the feature parameters used in speaker identification systems under noisy environments. The feature parameters compared are LP cepstrum (LPCC), cepstral mean subtraction (CMS), pole-filtered CMS (PFCMS), adaptive component weighted cepstrum (ACW), and postfilter cepstrum (PF). A GMM-based text-independent speaker identification system is designed for this purpose. A series of experiments shows that the LPCC parameter is adequate for modelling the speaker when the training and testing environments are matched. Under mismatched training and testing conditions, however, the modified parameters are preferable to LPCC. In particular, the CMS and PFCMS parameters are more effective under microphone mismatch conditions, while the ACW and PF parameters are better suited to noisier mismatch conditions.
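
Of the parameters compared in the abstract above, cepstral mean subtraction is the simplest to illustrate. A fixed channel such as a microphone adds a constant offset to every cepstral frame, so removing the per-utterance mean cancels it. The sketch below assumes each frame is a list of cepstral coefficients; the function name is illustrative.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean taken over all frames of one utterance."""
    if not frames:
        return []
    num_frames = len(frames)
    dim = len(frames[0])
    means = [sum(f[k] for f in frames) / num_frames for k in range(dim)]
    return [[f[k] - means[k] for k in range(dim)] for f in frames]
```

Because the mean is removed per utterance, two recordings that differ only by a constant cepstral offset map to the same normalized features.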

Car Noise Cancellation by Using Spectral Subtraction Method Based on a New Speech/nonspeech Classification Function (새로운 음성/비음성 분류함수에 기반한 스펙트럼 차감법에 의한 차량잡음제거)

박영식;이준재;이응주;하영호
- The Journal of Korean Institute of Communications and Information Sciences / v.19 no.6 / pp.994-1003 / 1994
In this paper, a single-input noise cancellation scheme using the spectral subtraction method in an automobile noise environment is proposed. In order to remove the changing automobile noise components from the noisy speech signal, noise in various states is analyzed and its characteristics are presented. For the speech/nonspeech decision and the estimation of the noise spectrum, a classification function is proposed on the basis of the noise analysis. This function provides a precise speech/nonspeech decision and an optimal estimate of the noise spectrum with less computation. With the noise spectrum estimated by the proposed classification function, the clean speech signal is extracted from the noisy speech signal with a high signal-to-noise ratio.

Noise Reduction Algorithm in Speech by Wiener Filter (위너필터에 의한 음성 중의 잡음제거 알고리즘)

Choi, Jae-Seung
- The Journal of the Korea institute of electronic communication sciences / v.8 no.9 / pp.1293-1298 / 2013
This paper proposes a noise reduction algorithm that uses a Wiener filter to remove noise components from noisy speech in order to improve the speech signal. The proposed algorithm first removes the white-noise spectrum from the noisy signal at each frame using a noise reshaping and reduction method. It then enhances the speech signal using a Wiener filter based on linear predictive coding analysis. The algorithm is evaluated using speech and noise data from a Japanese male speaker. Measured by spectral distortion (SD), the experiments confirm that the proposed algorithm is effective for speech contaminated by white noise. The maximum improvement in output SD was 4.94 dB for white noise compared with a previous Wiener filter.
https://doi.org/10.13067/JKIECS.2013.8.9.1293
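
For orientation, the textbook frequency-domain Wiener gain can be sketched as below. This is a generic illustration under stated assumptions (per-bin power spectra and a known noise power estimate), not the LPC-based Wiener filter or the noise reshaping step described in the abstract above. The parameter `min_gain` is a hypothetical floor added for illustration.

```python
def wiener_gains(noisy_power, noise_power, min_gain=0.0):
    """Per-bin Wiener gain G = S / (S + N), with S estimated as max(X - N, 0)."""
    gains = []
    for x, n in zip(noisy_power, noise_power):
        speech = max(x - n, 0.0)  # estimated clean-speech power in this bin
        denom = speech + n
        gain = speech / denom if denom > 0.0 else min_gain
        gains.append(max(gain, min_gain))
    return gains
```

Multiplying each noisy spectral component by its gain attenuates bins dominated by noise while leaving high-SNR bins almost unchanged.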

Auditory Representations for Robust Speech Recognition in Noisy Environments (잡음 환경에서의 음성 인식을 위한 청각 표현)

Kim, Doh-Suk;Lee, Soo-Young;Kil, Rhee-M.
- The Journal of the Acoustical Society of Korea / v.15 no.5 / pp.90-98 / 1996
An auditory model is proposed for robust speech recognition in noisy environments. The model consists of cochlear bandpass filters and nonlinear stages, and represents frequency and intensity information efficiently even in noisy environments. Frequency information of the signal is obtained by zero-crossing intervals, and intensity information is also incorporated by peak detectors and saturating nonlinearities. Also, the robustness of the zero-crossings in estimating frequency is verified by the developed analytic relationship of the variance of the level-crossing interval perturbations as a function of the crossing level values. The proposed auditory model is computationally efficient and free from many unknown parameters compared with other auditory models. Speaker-independent speech recognition experiments demonstrate the robustness of the proposed method.
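
The abstract above obtains frequency information from zero-crossing intervals. As a minimal sketch of that idea for a single band-limited signal, the code below locates upward zero crossings with linear interpolation and takes the inverse of the mean interval as the frequency estimate. The cochlear filterbank, peak detectors, and saturating nonlinearities of the proposed model are not represented, and the function names are assumptions.

```python
import math


def upward_zero_crossings(samples, sample_rate):
    """Times in seconds of upward zero crossings, refined by linear interpolation."""
    times = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if prev < 0.0 <= cur:
            frac = -prev / (cur - prev)  # fractional position between samples
            times.append((i - 1 + frac) / sample_rate)
    return times


def zero_crossing_frequency(samples, sample_rate):
    """Estimate frequency in Hz as the inverse of the mean crossing interval."""
    times = upward_zero_crossings(samples, sample_rate)
    if len(times) < 2:
        return 0.0
    intervals = [b - a for a, b in zip(times, times[1:])]
    return len(intervals) / sum(intervals)


# Usage: a 100 Hz sine sampled at 8 kHz should yield an estimate close to 100 Hz.
tone = [math.sin(2.0 * math.pi * 100.0 * n / 8000.0 + 0.3) for n in range(800)]
estimate = zero_crossing_frequency(tone, 8000.0)
```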

Model adaptation employing DNN-based estimation of noise corruption function for noise-robust speech recognition (잡음 환경 음성 인식을 위한 심층 신경망 기반의 잡음 오염 함수 예측을 통한 음향 모델 적응 기법)

Yoon, Ki-mu;Kim, Wooil
- The Journal of the Acoustical Society of Korea / v.38 no.1 / pp.47-50 / 2019
This paper proposes an acoustic model adaptation method for effective speech recognition in noisy environments. In the proposed algorithm, the noise corruption function is estimated using a DNN (deep neural network), and the function is applied to the model parameter estimation. Experimental results using the Aurora 2.0 framework and database demonstrate that the proposed model adaptation method is more effective than conventional methods in both known and unknown noisy environments. In particular, the experiments in unknown environments show a 15.87% relative improvement in average WER (word error rate).
https://doi.org/10.7776/ASK.2019.38.1.047
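
The abstract above quotes a relative WER improvement. Assuming the usual definition, relative improvement is the WER reduction divided by the baseline WER, as sketched below. The numbers in the usage line are hypothetical and are not results from the paper.

```python
def relative_improvement(baseline_wer, proposed_wer):
    """Relative WER reduction in percent: 100 * (baseline - proposed) / baseline."""
    if baseline_wer <= 0.0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - proposed_wer) / baseline_wer


# Hypothetical values: a drop from 20.0% to 15.0% WER is a 25% relative improvement.
example = relative_improvement(20.0, 15.0)
```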

Speech Enhancement Based on Feature Compensation for Independently Applying to Different Types of Speech Recognition Systems (이기종 음성 인식 시스템에 독립적으로 적용 가능한 특징 보상 기반의 음성 향상 기법)

Kim, Wooil
- Journal of the Korea Institute of Information and Communication Engineering / v.18 no.10 / pp.2367-2374 / 2014
This paper proposes a speech enhancement method that can be applied independently to different types of speech recognition systems. Feature compensation methods are well known to be effective as front-end algorithms for robust speech recognition in noisy environments. To be effective, however, the feature types and speech model employed by a feature compensation method must match those of the speech recognition system. As a result, such methods cannot be employed successfully with a speech recognition system whose specification is "unknown", such as a commercial speech recognition engine. In this paper, a speech enhancement method based on the PCGMM-based feature compensation method is proposed. The experimental results show that the proposed method significantly outperforms conventional front-end algorithms for unknown speech recognition systems over various background noise conditions.
https://doi.org/10.6109/jkiice.2014.18.10.2367

Improvement of Rejection Performance using the Lip Image and the PSO-NCM Optimization in Noisy Environment (잡음 환경 하에서의 입술 정보와 PSO-NCM 최적화를 통한 거절 기능 성능 향상)

Kim, Byoung-Don;Choi, Seung-Ho
- Phonetics and Speech Sciences / v.3 no.2 / pp.65-70 / 2011
Recently, audio-visual speech recognition (AVSR) has been studied to cope with noise problems in speech recognition. In this paper, we propose a novel method of deciding the weighting factors for audio-visual information fusion. We adopt particle swarm optimization (PSO) to determine the weighting factors. The AVSR experiments show that PSO-based normalized confidence measures (NCM) improve the rejection performance for misrecognized words by 33%.

