• Title/Summary/Keyword: Voice Activity Detection

Robust Entropy Based Voice Activity Detection Using Parameter Reconstruction in Noisy Environment

  • Han, Hag-Yong;Lee, Kwang-Seok;Koh, Si-Young;Hur, Kang-In
    • Journal of information and communication convergence engineering / v.1 no.4 / pp.205-208 / 2003
  • Voice activity detection is an important problem in speech recognition and speech communication. This paper introduces a new feature parameter, reconstructed from the spectral entropy of information theory, for robust voice activity detection in noisy environments, and then analyzes and compares its performance with the energy method of voice activity detection. In experiments, we confirmed that spectral entropy and its reconstructed parameter are superior to the energy method for robust voice activity detection in various noise environments.
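
As a rough illustration of the spectral entropy feature this abstract refers to, the sketch below normalizes a frame's power spectrum into a probability distribution and takes its Shannon entropy. It covers only plain spectral entropy; the paper's reconstructed parameter is not described in the abstract and is not attempted here. The function names and the naive DFT are assumptions made to keep the example self-contained.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum over the non-negative frequency bins."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (in nats) of the normalized power spectrum."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        return 0.0  # a silent frame carries no spectral distribution
    probs = [p / total for p in power]
    # Skip empty bins, since p * log(p) tends to 0 as p tends to 0.
    return -sum(p * math.log(p) for p in probs if p > 0.0)
```

A frame whose energy is concentrated in a few bins (such as a voiced harmonic structure) tends to give a low value, while a flat, noise-like spectrum tends to give a high one, which is what makes the feature usable for a threshold-based decision.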

Boll's Spectral Subtraction Algorithm by New Voice Activity Detection (새로운 음성 활동 검출법에 의한 Boll의 스펙트럼 차감 알고리즘)

  • 류종훈;김대경;박장식;손경식
    • Journal of Korea Multimedia Society / v.4 no.1 / pp.46-55 / 2001
  • In this paper, a new voice activity detection method that estimates the SNR of speech enhanced with extended spectral subtraction (ESS) is proposed. Voice activity detection is performed by placing a second Wiener filter behind the Wiener filter used in the ESS to estimate the speech and noise power of the first Wiener filter's output signal. The proposed voice activity detection method has a low computational load and performs well under severe input SNR. Boll's spectral subtraction algorithm with the proposed voice activity detection was compared to ESS under several noise environments having different time-frequency distributions. During both speech and non-speech activity, the performance of Boll's spectral subtraction algorithm with the proposed voice activity detection is superior to that of ESS.
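
To show how a voice activity decision typically feeds a spectral subtraction stage, here is a minimal sketch of magnitude subtraction with a VAD-gated noise estimate. It is not the paper's Wiener-filter-based detector: the VAD flags are simply taken as input, and the function name, the `smoothing` factor, and the `floor` value are assumptions chosen for illustration.

```python
def spectral_subtraction(frames, speech_flags, smoothing=0.9, floor=0.0):
    """Subtract a running noise-magnitude estimate from each frame.

    frames: list of magnitude spectra (equal-length lists of floats)
    speech_flags: VAD decisions, True where the frame contains speech
    """
    if not frames:
        return []
    noise = [0.0] * len(frames[0])
    initialised = False
    cleaned = []
    for mags, is_speech in zip(frames, speech_flags):
        if not is_speech:
            # Update the noise estimate only on frames the VAD marks as non-speech.
            if initialised:
                noise = [smoothing * n + (1.0 - smoothing) * m
                         for n, m in zip(noise, mags)]
            else:
                noise = list(mags)
                initialised = True
        # Half-wave rectification: never let a magnitude drop below the floor.
        cleaned.append([max(m - n, floor) for m, n in zip(mags, noise)])
    return cleaned
```

Because the noise estimate is frozen while the flag is True, a missed speech frame contaminates the estimate and a false alarm stalls its adaptation, which is why the quality of the detector matters to the subtraction result.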

Reconstruction Effect of the Spectral Entropy for the Voice Activity Detection (음성 활동 구간 검출을 위한 스펙트랄 엔트로피의 재구성 효과)

  • Kwon HO-Min;Han Hag-Yong;Lee Kwang-Seok;Koh Si-Young;Hur Kang-In
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.25-28 / 2002
  • Voice activity detection is an important problem in speech recognition and communication. This paper introduces a feature parameter reconstructed from the spectral entropy of information theory for robust voice activity detection in noisy environments, and analyzes and compares its performance with the energy method of voice activity detection. In experiments, we confirmed that the spectral entropy is a better feature parameter than the energy method for robust voice activity detection in various noise environments.

Voice Activity Detection Method Using Psycho-Acoustic Model Based on Speech Energy Maximization in Noisy Environments (잡음 환경에서 심리음향모델 기반 음성 에너지 최대화를 이용한 음성 검출 방법)

  • Choi, Gab-Keun;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.28 no.5 / pp.447-453 / 2009
  • This paper introduces a method for detecting voice and its exact end points at low SNR by maximizing voice energy. Conventional VAD (Voice Activity Detection) algorithms estimate the noise level, so they tend to detect the end point inaccurately. Moreover, because they use a relatively long analysis range to reflect temporal changes in the noise, their computing load is too high for practical application. In this paper, the SEM-VAD (Speech Energy Maximization-Voice Activity Detection) method, which uses psycho-acoustical Bark-scale filter banks to maximize voice energy within frames, is introduced. Stable threshold values are obtained in various noise environments (SNR 15 dB, 10 dB, 5 dB, 0 dB). In a voice detection test in a noisy car environment, the PHR (Pause Hit Rate) was 100% in every noise environment, and the FAR (False Alarm Rate) was 0% at SNR 15 dB and 10 dB, 5.6% at SNR 5 dB, and 9.5% at SNR 0 dB.

Statistical Model-Based Voice Activity Detection Based on Second-Order Conditional MAP with Soft Decision

  • Chang, Joon-Hyuk
    • ETRI Journal / v.34 no.2 / pp.184-189 / 2012
  • In this paper, we propose a novel approach to statistical model-based voice activity detection (VAD) that incorporates a second-order conditional maximum a posteriori (CMAP) criterion. As a technical improvement for the first-order CMAP criterion in [1], we consider both the current observation and the voice activity decision in the previous two frames to take full consideration of the interframe correlation of voice activity. This is clearly different from the previous approach [1] in that we employ the voice activity decisions in the second-order (previous two frames) CMAP, which has quadruple thresholds with an additional degree of freedom, rather than the first-order (previous single frame). Also, a soft-decision scheme is incorporated, resulting in time-varying thresholds for further performance improvement. Experimental results show that the proposed algorithm outperforms the conventional CMAP-based VAD technique under various experimental conditions.
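
The idea of conditioning the decision on the previous two frames can be sketched as a threshold lookup keyed by the last two decisions, which yields the four thresholds the abstract mentions. This is only a hard-decision illustration: the threshold values, the function name, and the assumption of silence before the first frame are hypothetical, and the paper's CMAP derivation and soft-decision scheme are not reproduced.

```python
# Hypothetical thresholds, keyed by (decision two frames back, decision one frame back).
EXAMPLE_THRESHOLDS = {
    (False, False): 2.0,  # long silence: require strong evidence for speech
    (False, True): 1.2,
    (True, False): 1.5,
    (True, True): 0.8,    # ongoing speech: easier to stay in the speech state
}


def conditional_vad(likelihood_ratios, thresholds=EXAMPLE_THRESHOLDS):
    """Declare speech when the frame's likelihood ratio exceeds the
    threshold selected by the previous two decisions."""
    decisions = []
    prev2, prev1 = False, False  # assumption: silence before the first frame
    for ratio in likelihood_ratios:
        decision = ratio > thresholds[(prev2, prev1)]
        decisions.append(decision)
        prev2, prev1 = prev1, decision
    return decisions
```

With these example values, a frame whose ratio would be rejected after silence can still be accepted in the middle of a speech run, which is one way interframe correlation can be exploited.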

Voice Activity Detection Based on Entropy in Noisy Car Environment (차량 잡음 환경에서 엔트로피 기반의 음성 구간 검출)

  • Roh, Yong-Wan;Lee, Kue-Bum;Lee, Woo-Seok;Hong, Kwang-Seok
    • Journal of the Institute of Convergence Signal Processing / v.9 no.2 / pp.121-128 / 2008
  • Accurate voice activity detection has a great impact on the performance of speech applications, including speech recognition, speech coding, and speech communication. In this paper, we propose voice activity detection methods that can adapt to various car noise situations during driving. Existing voice activity detection uses methods such as time-domain energy, frequency-domain energy, zero-crossing rate, and spectral entropy, whose weak point is a rapid decline in performance in noisy environments. Building on the existing spectral entropy approach to VAD, we propose voice activity detection methods using MFB (Mel-frequency filter bank) spectral entropy, gradient FFT (Fast Fourier Transform) spectral entropy, and gradient MFB spectral entropy. The MFB is the FFT multiplied by the Mel scale, a nonlinear scale that reflects the characteristics of human sound perception. The proposed MFB spectral entropy method clearly improves the ability to discriminate between speech and non-speech in various noisy car environments, achieving 93.21% accuracy in experiments. Compared to the spectral entropy method, the proposed voice activity detection gives an average improvement in the correct detection rate of more than 3.2%.

Voice Activity Detection Algorithm using Wavelet Band Entropy Ensemble Analysis in Car Noisy Environments (자동차 잡음 환경에서 웨이브렛 밴드 엔트로피 앙상블 분석을 이용한 음성구간 검출 알고리즘)

  • Lee, G.H.;Lee, Y.J.;Kim, M.N.
    • Journal of Korea Multimedia Society / v.16 no.9 / pp.1005-1017 / 2013
  • Voice activity detection is a very important process that separates voice activity from a noisy speech signal for speech enhancement. Over the past few years, many studies have been made on voice activity detection, but it still performs poorly in low signal-to-noise ratio environments or under fluctuating noise such as car noise. In this paper, we propose a new voice activity detection algorithm using ensemble variance based on wavelet band entropy and a soft thresholding method. We conducted experiments across a range of signal-to-noise ratios of car noise to evaluate and confirm the performance of the proposed algorithm.

Dynamic code allocation using voice activity detection in DS-CDMA cellular system (DS-CDMA 셀룰러 시스템에서의 음성검출을 사용한 동적코드할당방식)

  • 유명수;양영님;고종하;이정규
    • The Journal of Korean Institute of Communications and Information Sciences / v.22 no.6 / pp.1302-1310 / 1997
  • In this paper, we propose a dynamic code allocation strategy using voice activity detection and evaluate its performance in a DS-CDMA system. The proposed method allocates codes to mobile terminals according to the residual capacity computed from the SIR at the base station. In a cell with hot-spot traffic loading, we find that the proposed method performs better than a fixed code assignment strategy using voice activity detection. We also find that the proposed method provides a large improvement in blocking probability over the dynamic code assignment strategy without voice activity detection.

Voice Activity Detection Algorithm using Fuzzy Membership Shifted C-means Clustering in Low SNR Environment (낮은 신호 대 잡음비 환경에서의 퍼지 소속도 천이 C-means 클러스터링을 이용한 음성구간 검출 알고리즘)

  • Lee, G.H.;Lee, Y.J.;Cho, J.H.;Kim, M.N.
    • Journal of Korea Multimedia Society / v.17 no.3 / pp.312-323 / 2014
  • Voice activity detection is a very important process that finds voice activity in a noisy speech signal for noise cancelling and speech enhancement. Over the past few years, many studies have been made on voice activity detection, but it performs poorly on sentence-form speech signals in low SNR environments. In this paper, we propose a new voice activity detection algorithm that has a preliminary VAD process using entropy and a main VAD process using fuzzy membership shifted c-means clustering. We conducted experiments in various SNR environments of white noise to evaluate the proposed algorithm and confirmed its good performance.

Voice Activity Detection employing the Generalized Normal-Laplace Distribution (일반화된 정규-라플라스 분포를 이용한 음성검출기)

  • Kim, Sang-Kyun;Kwon, Jang-Woo;Lee, Sangmin
    • Journal of Korea Multimedia Society / v.17 no.3 / pp.294-299 / 2014
  • In this paper, we propose a novel algorithm to improve the performance of voice activity detection (VAD), based on the generalized normal-Laplace (GNL) distribution. In our algorithm, the probability density function (PDF) of the noisy speech signal is represented by the GNL distribution, and the variances of the speech and noise of the GNL distribution are estimated using higher-order moments. Experimental results show that the proposed algorithm yields better results than conventional VAD algorithms.