• Title/Summary/Keyword: Speech Enhancement Algorithm

A New Formulation of Multichannel Blind Deconvolution: Its Properties and Modifications for Speech Separation

  • Nam, Seung-Hyon;Jee, In-Nho
    • The Journal of the Acoustical Society of Korea / v.25 no.4E / pp.148-153 / 2006
  • A new normalized multichannel blind deconvolution (MBD) algorithm is presented for nonstationary convolutive mixtures, and its properties and modifications are discussed in detail. The proposed algorithm normalizes the signal spectrum in the frequency domain to provide faster, stable convergence and improved separation without a whitening effect. Modifications to the proposed algorithm, such as nonholonomic constraints and off-diagonal learning, are also discussed. Simulation results using a real-world recording confirm the superior performance of the proposed algorithm and its usefulness in real-world applications.

New Speech Enhancement Method using Psychoacoustic Criteria (심리 음향 기준을 이용한 새로운 음질 개선 방법)

  • 김대경;박장식;손경식
    • Journal of Korea Multimedia Society / v.4 no.1 / pp.56-66 / 2001
  • Spectral subtraction algorithms using criteria based on human perception have recently been developed. Speech processed with Virag's algorithm sounds more pleasant to a human listener than speech obtained by the classical methods. However, Virag's algorithm requires a robust voice activity detector (VAD). In the extended spectral subtraction (ESS) algorithm, which works without a VAD, the residual noise becomes more noticeable as the SNR decreases. In this paper we propose a new speech enhancement method that combines a Wiener filter and spectral subtraction based on the noise masking characteristics of the human auditory system. No VAD is needed because the noise estimate can be successively updated, even during speech activity, using the Wiener filter. Adjusting the subtraction parameter according to the masking threshold makes the residual noise inaudible. The proposed method is compared with conventional spectral subtraction algorithms. Objective and subjective evaluations of the proposed system are performed with several noise types having different time-frequency distributions. Objective measures, speech spectrograms, and subjective listening tests confirm that speech enhanced with the proposed algorithm is more pleasant to a human listener.
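
The two gain rules this abstract combines can be illustrated per frequency bin. The following is a minimal sketch, not the authors' method: the function names, the default `alpha` and `beta` values, and the use of a noise-scaled spectral floor are assumptions, and the masking-threshold computation that would set `alpha` in the paper is not shown.

```python
import math


def wiener_gain(noisy_power, noise_power):
    """Wiener-style gain SNR / (1 + SNR) for one frequency bin.

    The SNR is estimated as noisy_power / noise_power - 1, clipped at zero
    (an assumption made for this sketch).
    """
    snr = max(noisy_power / noise_power - 1.0, 0.0)
    return snr / (1.0 + snr)


def subtraction_gain(noisy_power, noise_power, alpha=1.0, beta=0.01):
    """Amplitude gain of power spectral subtraction for one frequency bin.

    alpha is the over-subtraction factor and beta sets a spectral floor
    relative to the noise power; both defaults are illustrative only.
    """
    if noisy_power <= 0.0:
        return 0.0
    residual = noisy_power - alpha * noise_power
    floor = beta * noise_power
    return math.sqrt(max(residual, floor) / noisy_power)


if __name__ == "__main__":
    # One bin with noisy power 4 and estimated noise power 1.
    print(wiener_gain(4.0, 1.0))
    print(subtraction_gain(4.0, 1.0))
    # Heavier over-subtraction drives the bin down to the spectral floor.
    print(subtraction_gain(1.0, 1.0, alpha=2.0))
```

In a perceptually motivated scheme, `alpha` would be lowered in bins where the masking threshold is high, since residual noise there is inaudible anyway; here it is simply passed in by the caller.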

Performance Improvement on Hearing Aids Via Environmental Noise Reduction (배경 잡음 제거를 통한 보청 시스템의 성능 향상)

  • 박선준;윤대희;김동욱;박영철
    • The Journal of the Acoustical Society of Korea / v.19 no.2 / pp.61-67 / 2000
  • Recent progress in digital and VLSI technology has offered new possibilities for noticeable advances in hearing aids. Yet environmental noise remains one of the major problems for hearing aid users. This paper reports the speech recognition and speech discrimination performance measured for listeners with sensorineural hearing loss while listening in speech-band noise. In addition, to improve the listening environment of hearing-impaired listeners, environmental noise reduction using speech enhancement techniques is investigated as a front end to conventional hearing aids. The speech enhancement techniques are implemented in a real-time system equipped with a DSP board. The clinical test results suggest that the speech enhancement technique, used as a preprocessing algorithm in digital hearing aids, may work in synergy with gain functions to yield greater SNR improvement.

Noisy Speech Enhancement by Restoration of DFT Components Using Neural Network (신경회로망을 이용한 DFT 성분 복원에 의한 음성강조)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.5 / pp.1078-1084 / 2010
  • This paper presents a speech enhancement system that restores the amplitude and phase components of the discrete Fourier transform (DFT) using a neural network trained with the back-propagation algorithm. First, a neural network is trained using the DFT amplitude and phase components of a noisy speech signal; the proposed system then uses the network to enhance speech signals degraded by white noise. Experimental results demonstrate that speech signals degraded by white noise are enhanced by the proposed system, whose neural network takes the DFT amplitude and phase components as inputs. Based on spectral distortion measurements, the experiments confirm that the proposed system is effective for white noise.

Denoising of Speech Signal Using Wavelet Transform (웨이브렛 변환을 이용한 음성신호의 잡음제거)

  • 한미경;배건성
    • The Journal of the Acoustical Society of Korea / v.19 no.5 / pp.27-34 / 2000
  • This paper deals with speech enhancement methods using the wavelet transform. A cycle-spinning scheme and the undecimated wavelet transform are used for denoising speech signals, and their results are compared with those of the conventional wavelet transform. We apply a soft-thresholding technique to remove additive background noise from noisy speech. The 8-tap symlet wavelet and the pyramid algorithm are used for the wavelet transform. Performance is assessed using average SNR, cepstral distance, and an informal subjective listening test. Experimental results demonstrate that both the cycle-spinning denoising (CSD) method and the undecimated wavelet denoising (UWD) method outperform the conventional wavelet denoising (CWD) method in objective performance measures as well as in the subjective listening test. The two methods also produce fewer of the "clicks" that usually appear in the neighborhood of signal discontinuities.
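
Soft-thresholding and cycle spinning, as named in this abstract, can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's implementation: it uses a one-level Haar transform in place of the 8-tap symlet pyramid, expects an even-length signal, and all function names and the `shifts` parameter are hypothetical.

```python
import math


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def haar_forward(signal):
    """One-level Haar transform (stand-in for the symlet pyramid)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out


def wavelet_denoise(signal, threshold):
    """Soft-threshold the detail coefficients and reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))


def cycle_spin_denoise(signal, threshold, shifts=2):
    """Average the denoised results over circular shifts of the input."""
    n = len(signal)
    total = [0.0] * n
    for k in range(shifts):
        shifted = signal[k:] + signal[:k]                # circular shift by k
        denoised = wavelet_denoise(shifted, threshold)
        restored = denoised[n - k:] + denoised[:n - k]   # undo the shift
        for i in range(n):
            total[i] += restored[i] / shifts
    return total


if __name__ == "__main__":
    noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
    print(cycle_spin_denoise(noisy, threshold=0.2))
```

Averaging over shifts reduces the dependence of the result on how the signal happens to align with the decimation grid, which is the motivation usually given for cycle spinning.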

Pre-Processing for Performance Enhancement of Speech Recognition in Digital Communication Systems (디지털 통신 시스템에서의 음성 인식 성능 향상을 위한 전처리 기술)

  • Seo, Jin-Ho;Park, Ho-Chong
    • The Journal of the Acoustical Society of Korea / v.24 no.7 / pp.416-422 / 2005
  • Speech recognition in digital communication systems performs very poorly because of the spectral distortion caused by speech codecs. In this paper, the spectral distortion introduced by speech codecs is analyzed, and a pre-processing method that compensates for it is proposed to improve speech recognition performance. Three standard speech codecs, IS-127 EVRC, ITU G.729 CS-ACELP, and IS-96 QCELP, are considered for algorithm development and evaluation, and a single method that can be applied to all of them is developed. The performance of the proposed method is evaluated for the three codecs; using speech features extracted from the compensated spectrum, the recognition rate improves by up to 15.6% compared with that obtained using the degraded speech features.

DSP Implementation of Speech Enhancement System Using Microphone Array with Adaptive Post-processing (적응 후처리 과정을 갖는 마이크로폰 배열을 이용한 잡음제거기의 DSP 구현)

  • 권홍석;김시호;배건성
    • Proceedings of the IEEK Conference / 2002.06d / pp.413-416 / 2002
  • In this paper, a speech enhancement system using a microphone array with adaptive post-processing is implemented in real time on a TMS320C6201 DSP. It consists of a delay-and-sum beamformer and adaptive post-processing filters using the NLMS (Normalized Least Mean Square) algorithm. A THS1206 ADC is used to collect the 4-channel microphone signals. The program memory, data ROM, and data RAM of the implemented system occupy 15,744, 748, and 47,540 bytes, respectively. Finally, $21.839 \times 10^6$ clocks per second are required for real-time operation.
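
The two building blocks named in this abstract, a delay-and-sum beamformer and an NLMS adaptive filter, can be sketched as follows. This is an illustrative sketch only: the integer-sample delays, the function names, and the default step size `mu` are assumptions, and how the paper connects the NLMS filters to the beamformer output is not shown here.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay and average them.

    channels[i] is assumed to contain the source delayed by delays[i] samples.
    """
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Adapt an FIR filter so that filtering x approximates d.

    Returns the final weights and the per-sample error signal.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps          # most recent input sample first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y
        norm = sum(b * b for b in buf) + eps   # input power normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors


if __name__ == "__main__":
    source = [float(n) for n in range(10)]
    mic0 = source
    mic1 = [0.0] + source[:-1]      # same source, one sample later
    print(delay_and_sum([mic0, mic1], [0, 1]))
```

The normalization by the input power is what distinguishes NLMS from plain LMS: it keeps the effective step size stable when the input level varies, which matters for speech.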

Improved Minimum Statistics Based on Environment-Awareness for Noise Power Estimation (환경인식 기반의 향상된 Minimum Statistics 잡음전력 추정기법)

  • Son, Young-Ho;Choi, Jae-Hun;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.30 no.3 / pp.123-128 / 2011
  • In this paper, we propose an improved noise power estimation technique for speech enhancement under various noise environments. The previous minimum statistics (MS) algorithm, which tracks the minimum value within a finite search window, uses the optimal power spectrum of the signal for smoothing and adopts the minimum probability. An investigation of previous MS-based methods shows that they assume a fixed minimum search window size regardless of the environment. To adapt the search window size to the environment, we use a noise classification algorithm based on the Gaussian mixture model (GMM). The performance of the proposed enhancement algorithm is evaluated with the ITU-T P.862 perceptual evaluation of speech quality (PESQ) under various noise environments. The results show that the proposed algorithm yields better results than the conventional MS method.
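
The core idea of minimum tracking over a finite search window can be shown for a single frequency bin. This is a simplified sketch, not the algorithm evaluated in the paper: it uses a fixed smoothing constant `alpha` and omits bias compensation, and the function name and default values are assumptions. The GMM-based noise classifier that would choose `window` per environment is not shown; here the window length is just a parameter.

```python
def minimum_statistics(power, alpha=0.8, window=20):
    """Track the noise floor of one frequency bin.

    power  : per-frame noisy power values for the bin
    alpha  : recursive smoothing constant (fixed here, an assumption)
    window : number of past frames searched for the minimum
    Returns (smoothed power, noise power estimate), one value per frame.
    """
    if not power:
        return [], []
    smoothed, estimate = [], []
    prev = power[0]
    for p in power:
        prev = alpha * prev + (1.0 - alpha) * p      # recursive smoothing
        smoothed.append(prev)
        estimate.append(min(smoothed[-window:]))     # minimum in the window
    return smoothed, estimate


if __name__ == "__main__":
    # Noise floor of 1.0 with a short, louder burst standing in for speech.
    frames = [1.0] * 20 + [10.0] * 5 + [1.0] * 20
    _, noise = minimum_statistics(frames, alpha=0.8, window=20)
    print(noise[-1])
```

A window that is too short lets speech energy leak into the noise estimate, while one that is too long reacts slowly when the noise level changes, which is why choosing the window per environment is attractive.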

Speech Enhancement for Voice commander in Car environment (차량환경에서 음성명령어기 사용을 위한 음성개선방법)

  • 백승권;한민수;남승현;이봉호;함영권
    • Journal of Broadcast Engineering / v.9 no.1 / pp.9-16 / 2004
  • In this paper, we present a speech enhancement method that serves as a pre-processor for a voice commander in a car environment. For friendly and safe use of a voice commander in a moving car, non-stationary audio signals such as music and non-candidate speech should be reduced. Our technique is based on two microphones and consists of two parts: Blind Source Separation (BSS) and Kalman filtering. First, BSS operates as a spatial filter to deal with non-stationary signals; car noise is then reduced by Kalman filtering, which acts as a temporal filter. The algorithm's performance is tested on speech recognition, and the results show that our two-microphone technique is a good candidate for a voice commander.

Intelligibility Enhancement of Multimedia Contents Using Spectral Shaping (스펙트럼 성형기법을 이용한 멀티미디어 콘텐츠의 명료도 향상)

  • Ji, Youna;Park, Young-cheol;Hwang, Young-su
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.11 / pp.82-88 / 2016
  • In this paper, we propose an intelligibility enhancement algorithm for multimedia content using spectral shaping. Dialogue signals are essential for understanding the plot of audio-visual media content such as movies and TV. However, non-dialogue components such as sound effects and background music often degrade dialogue clarity. To overcome this problem, this paper aims to improve the dialogue clarity of audio soundtracks, which contain important cues for the visual scenes. In the proposed method, the dialogue components are first detected by a soft masker based on the speech presence probability (SPP), which is widely used in the speech enhancement field. The extracted dialogue signals are then passed to the spectral shaping method, which reallocates the spectral-temporal energy of speech to enhance intelligibility. The total energy is kept unchanged through a loudness normalization process to prevent saturation. The algorithm was evaluated using modeled and real movie soundtracks, and the proposed algorithm was shown to enhance dialogue clarity while preserving the total audio power.