• Title/Summary/Keyword: speech absence probability (음성 부재 확률)

Global Soft Decision Using Probabilistic Outputs of Support Vector Machine for Speech Enhancement (SVM의 확률 출력을 이용한 새로운 Global Soft Decision 기반의 음성 향상 기법)

  • Jo, Q-Haing;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.27 no.2 / pp.75-79 / 2008
  • In this paper, we propose a novel speech enhancement technique using global soft decision (GSD) based on the probabilistic outputs of a support vector machine (SVM). Generally, speech enhancement algorithms that apply soft decision (SD) to gain modification and noise power estimation perform better than those employing hard decision. In particular, the global speech absence probability (GSAP), which is known as an effective measure of speech absence in each frame, has been adopted in SD-based speech enhancement methods. For this reason, we introduce a new GSAP estimated from the probabilistic output of the SVM using a sigmoid function. The performance of the proposed algorithm is evaluated by PESQ and MOS tests under various noise environments, and it yields better results than the conventional GSD scheme.
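
The sigmoid mapping mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Platt-style sigmoid applied to the raw SVM decision value, with positive values meaning speech. The function names and the parameters `a` and `b` are hypothetical placeholders (in practice such parameters would be fitted on training data), and the abstract does not state the exact form the authors use.

```python
import math


def svm_speech_probability(decision_value, a=-1.0, b=0.0):
    """Map a raw SVM decision value to P(speech) with a sigmoid.

    Assumption: Platt-style form 1 / (1 + exp(a * f + b)); a and b are
    illustrative defaults, not values taken from the paper.
    """
    # Clamp the exponent so math.exp cannot overflow for extreme scores.
    z = max(-50.0, min(50.0, a * decision_value + b))
    return 1.0 / (1.0 + math.exp(z))


def gsap_from_svm(decision_value, a=-1.0, b=0.0):
    """Frame-level speech absence probability as the complement of P(speech)."""
    return 1.0 - svm_speech_probability(decision_value, a, b)
```

With these defaults, a decision value of zero gives an absence probability of 0.5, and strongly positive scores push it toward zero.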

Speech Enhancement Based on Improved Minima Controlled Recursive Averaging Incorporating GSAP (전역 음성 부재 확률 기반의 향상된 최소값 제어 재귀평균기법을 이용한 음성 향상 기법)

  • Song, Ji-Hyun;Bang, Dong-Hyeouck;Lee, Sang-Min
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.1 / pp.104-111 / 2012
  • In this paper, we propose a novel method to improve the performance of improved minima controlled recursive averaging (IMCRA). An examination of various noise environments shows that IMCRA has a fundamental drawback in estimating the noise power at the offset regions of continuous speech signals. In particular, it is difficult to obtain robust noise power estimates in non-stationary noise environments whose spectral characteristics change rapidly, such as babble noise. To overcome this drawback, we apply the global speech absence probability (GSAP), conditioned on both the a priori SNR and the a posteriori SNR, to the speech detection algorithm of IMCRA. Using ITU-T P.862 perceptual evaluation of speech quality (PESQ) and a composite measure test as performance criteria, we show that the proposed algorithm yields better results than the conventional IMCRA-based scheme under various noise environments. In particular, for babble noise at 5 dB, the proposed method produced a remarkable improvement over IMCRA (PESQ = 0.026, composite measure = 0.029).

Statistical Model-Based Voice Activity Detection Using the Second-Order Conditional Maximum a Posteriori Criterion with Adapted Threshold (적응형 문턱값을 가지는 2차 조건 사후 최대 확률을 이용한 통계적 모델 기반의 음성 검출기)

  • Kim, Sang-Kyun;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.29 no.1 / pp.76-81 / 2010
  • In this paper, we propose a novel approach to improve the performance of statistical model-based voice activity detection (VAD) using the second-order conditional maximum a posteriori (CMAP) criterion. In our approach, the VAD decision rule is expressed as the geometric mean of likelihood ratios (LRs) with a threshold adapted according to the speech presence probability conditioned on both the current observation and the speech activity decisions in the previous two frames. Experimental results show that the proposed approach yields better results than the statistical model-based VAD and the CMAP-based VAD using the LR test.
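
The decision rule described in this abstract can be sketched as follows. The sketch assumes the usual Gaussian statistical model, in which the per-bin likelihood ratio is exp(γξ/(1+ξ))/(1+ξ) for a posteriori SNR γ and a priori SNR ξ, so the log of the geometric mean is the average of the per-bin log LRs. The threshold table keyed by the previous two decisions is purely hypothetical; the abstract does not give the adapted threshold values or how they are derived.

```python
import math


def log_geometric_mean_lr(post_snr, prior_snr):
    """Average of per-bin log likelihood ratios (log of their geometric mean).

    Assumes the Gaussian model: log LR_k = gamma_k*xi_k/(1+xi_k) - log(1+xi_k).
    """
    logs = [g * x / (1.0 + x) - math.log1p(x)
            for g, x in zip(post_snr, prior_snr)]
    return sum(logs) / len(logs)


# Hypothetical thresholds indexed by the decisions of the previous two
# frames (1 = speech, 0 = non-speech); illustrative values only.
THRESHOLDS = {(0, 0): 0.8, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.2}


def vad_decision(post_snr, prior_snr, prev_two):
    """Return True (speech) when the statistic exceeds the adapted threshold."""
    return log_geometric_mean_lr(post_snr, prior_snr) > THRESHOLDS[prev_two]
```

Lowering the threshold after two speech decisions makes the detector more willing to continue a speech segment, which is one simple way to exploit inter-frame correlation.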

Voice Activity Detection Algorithm Based on the Power Spectral Deviation of Teager Energy in Noisy Environment (잡음환경에서 Teager 에너지의 전력 스펙트럼 편차에 기반한 음성 검출 알고리즘)

  • Park, Yun-Sik;An, Hong-Sub;Lee, Sang-Min
    • The Journal of the Acoustical Society of Korea / v.30 no.7 / pp.396-401 / 2011
  • In this paper, we propose a novel voice activity detection (VAD) algorithm to effectively distinguish speech from nonspeech in various noisy environments. The presented VAD uses the power spectral deviation (PSD) based on Teager energy (TE) instead of the conventional PSD scheme to improve the decision performance for speech segments. In addition, the speech absence probability (SAP) is derived in each frequency subband to modify the PSD for further VAD. The performance of the proposed VAD algorithm is evaluated by objective tests under various environments, and it yields better results than the conventional methods.
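
For reference, the Teager energy mentioned in this abstract is commonly computed with the discrete Teager-Kaiser operator, ψ[n] = x[n]² − x[n−1]·x[n+1]. The sketch below shows only that standard operator on a generic sequence; how the paper combines it with the power spectral deviation is not stated in the abstract, and the function name is a hypothetical choice.

```python
def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns values for the interior samples only, so the result has
    len(x) - 2 entries (empty when fewer than three samples are given).
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
```

A constant sequence has zero Teager energy, while oscillating components contribute energy that grows with both amplitude and frequency.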

Noise-Biased Compensation of Minimum Statistics Method using a Nonlinear Function and A Priori Speech Absence Probability for Speech Enhancement (음질향상을 위해 비선형 함수와 사전 음성부재확률을 이용한 최소통계법의 잡음전력편의 보상방법)

  • Lee, Soo-Jeong;Lee, Gang-Seong;Kim, Sun-Hyob
    • The Journal of the Acoustical Society of Korea / v.28 no.1 / pp.77-83 / 2009
  • This paper proposes a new noise bias compensation for the minimum statistics (MS) method using a nonlinear function and the a priori speech absence probability (SAP) for speech enhancement in non-stationary noisy environments. The MS method is a well-known technique for noise power estimation in non-stationary noisy environments, but it tends to bias the noise estimate below the true noise level. The proposed method combines an adaptive parameter based on a sigmoid function with the a priori SAP for bias compensation. Specifically, we apply the adaptive parameter according to the a posteriori SNR. In addition, when the a priori SAP equals unity, the adaptive bias compensation factor separately increases ${\delta}_{max}$ in each frequency bin, and vice versa. We evaluate the noise power estimation capability in highly non-stationary and various noise environments, as well as the improvement in segmental signal-to-noise ratio (SNR) and the Itakura-Saito Distortion Measure (ISDM) when the method is integrated into spectral subtraction (SS). The results show that the proposed method is superior to the conventional MS approach.

Noisy Speech Enhancement Based on Complex Laplacian Probability Density Function (복소 라플라시안 확률 밀도 함수에 기반한 음성 향상 기법)

  • Park, Yun-Sik;Jo, Q-Haing;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.6 / pp.111-117 / 2007
  • This paper presents a novel approach to speech enhancement based on a complex Laplacian probability density function (pdf). Using a goodness-of-fit (GOF) test, we show that the complex Laplacian pdf is a more suitable model than the conventional Gaussian pdf. The likelihood ratio (LR) is applied to derive the speech absence probability in the speech enhancement algorithm. The performance of the proposed algorithm is evaluated by objective tests, and it yields better results than the conventional Gaussian pdf-based scheme.

Speech Enhancement based on Minima Controlled Recursive Averaging Technique Incorporating Second-order Conditional Maximum a posteriori Criterion (2차 조건 사후 최대 확률 기반 최소값 제어 재귀평균기법을 이용한 음성향상)

  • Kum, Jong-Mo;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.4 / pp.132-138 / 2009
  • In this paper, we propose a novel approach to improve the performance of minima controlled recursive averaging (MCRA) which is based on the second-order conditional maximum a posteriori (CMAP). From an investigation of the MCRA scheme, it is discovered that the MCRA method cannot take full consideration of the inter-frame correlation of voice activity since the noise power estimate is adjusted by the speech presence probability depending on an observation of the current frame. To avoid this phenomenon, the proposed MCRA approach incorporates the second-order CMAP criterion in which the noise power estimate is obtained using the speech presence probability conditioned on both the current observation and the speech activity decisions in the previous two frames. Experimental results show that the proposed MCRA technique based on second-order conditional MAP yields better results compared to the conventional MCRA method.

Speech Enhancement Based on Minima Controlled Recursive Averaging Technique Incorporating Conditional MAP (조건 사후 최대 확률 기반 최소값 제어 재귀평균기법을 이용한 음성향상)

  • Kum, Jong-Mo;Park, Yun-Sik;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.27 no.5 / pp.256-261 / 2008
  • In this paper, we propose a novel approach to improve the performance of minima controlled recursive averaging (MCRA) using the conditional maximum a posteriori criterion. A crucial component of a practical speech enhancement system is the estimation of the noise power spectrum, and one state-of-the-art approach is the MCRA technique. The noise estimate in MCRA is obtained by averaging past spectral power values with a smoothing parameter that is adjusted by the signal presence probability in frequency subbands. We improve MCRA using a speech presence probability that is the a posteriori probability conditioned on both the current observation and the speech presence or absence of the previous frame. Using ITU-T P.862 perceptual evaluation of speech quality (PESQ) and subjective evaluation of speech quality as performance criteria, we show that the proposed algorithm yields better results than the conventional MCRA-based scheme.
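
The recursive averaging step described in this abstract can be sketched as below. It assumes the commonly cited MCRA update, in which the effective smoothing parameter in each frequency bin is α + (1 − α)·p for a presence probability p, so the noise estimate is frozen when speech is certainly present. The function name and the default `alpha_d=0.95` are illustrative assumptions, and the conditional-MAP presence probability proposed in the paper is not reproduced here; `presence_prob` is simply taken as an input.

```python
def mcra_noise_update(noise_psd, noisy_power, presence_prob, alpha_d=0.95):
    """One frame of presence-probability-controlled recursive averaging.

    noise_psd     : previous noise power estimate per frequency bin
    noisy_power   : |Y(k)|^2 of the current frame per frequency bin
    presence_prob : speech presence probability per bin, in [0, 1]
    alpha_d       : base smoothing constant (illustrative default)
    """
    updated = []
    for lam, y2, p in zip(noise_psd, noisy_power, presence_prob):
        # The smoothing factor rises toward 1 as speech presence becomes certain.
        a = alpha_d + (1.0 - alpha_d) * p
        updated.append(a * lam + (1.0 - a) * y2)
    return updated
```

With `presence_prob` equal to 1 the estimate is left unchanged, and with 0 it tracks the noisy power at the base rate set by `alpha_d`.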

Comparison of Recognition Performance for Preprocessing Method of MMSE STSA with Approximated Modified Bessel Function (Modified Bessel 함수 근사화를 적용한 MMSE STSA 전처리 기법의 음성인식 성능 비교)

  • Son Jong Mok;Kim Min Sung;Bae Keun Sung
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.125-128 / 2001
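  • In this study, we introduce an MMSE (Minimum Mean Square Error) STSA (Short-Time Spectral Amplitude) estimator that takes the speech absence probability into account as a preprocessor for handling speech signal distortion, and we evaluate the recognition performance of an HMM (Hidden Markov Model)-based speech recognition system. With real-time implementation of the recognition system in mind, when the MMSE STSA technique is used as a preprocessor for speech enhancement, we approximate the modified Bessel function, which requires a large amount of computation in the MMSE STSA gain calculation.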

A New Unified System of Acoustic Echo and Noise Suppression Incorporating a Novel Noise Power Estimation (새로운 잡음전력 추정 기법을 적용한 음향학적 반향 및 배경잡음 제거 통합시스템)

  • Park, Yun-Sik;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.28 no.7 / pp.680-685 / 2009
  • In this paper, we propose an efficient noise power estimation technique for an integrated acoustic echo and noise suppression system in the frequency domain. The proposed method uses the speech absence probability (SAP) derived from the microphone input signal as the smoothing parameter for updating the noise power. This reduces the noise power estimation error resulting from distortions in the unified structure, where the noise suppression (NS) operation is placed after the acoustic echo suppression (AES) algorithm. In the proposed approach, the smoothing parameter based on the SAP derived from the input signal, rather than from the echo-suppressed signal, therefore stops the noise power estimate from being updated during periods in which the noise spectrum is distorted. The performance of the proposed algorithm is evaluated by objective tests under various environments, and it yields better results than the conventional scheme.