• Title/Summary/Keyword: Voice activity detection

Human-Robot Interaction in Real Environments by Audio-Visual Integration

  • Kim, Hyun-Don;Choi, Jong-Suk;Kim, Mun-Sang
    • International Journal of Control, Automation, and Systems / v.5 no.1 / pp.61-69 / 2007
  • In this paper, we developed a reliable sound localization system that uses three microphones and includes a VAD (Voice Activity Detection) component, together with a face tracking system that uses a vision camera. Moreover, we proposed a way to integrate the three systems for human-robot interaction, both to compensate for errors in localizing a speaker and to effectively reject unnecessary speech or noise signals arriving from undesired directions. To verify the system's performance, we installed the proposed audio-visual system in a prototype robot called IROBAA (Intelligent ROBot for Active Audition) and demonstrated how the audio-visual system is integrated.

Voice Activity Detection Algorithm Using Speech Periodicity and QSNR in Noisy Environment (음성의 주기성과 QSNR을 이용한 잡음환경에서의 음성검출 알고리즘)

  • Jeong, Ju-Hyun;Song, Hwa-Jeon;Kim, Hyung-Soon
    • Proceedings of the KSPS conference / 2005.11a / pp.59-62 / 2005
  • Voice activity detection (VAD) is important in many areas of speech processing technology. Speech/nonspeech discrimination in noisy environments is a difficult task because the feature parameters used for VAD are sensitive to the surrounding environment. As a result, VAD performance is severely degraded at low signal-to-noise ratios (SNRs). In this paper, a new VAD algorithm is proposed based on the degree of voicing and Quantile SNR (QSNR). These two feature parameters are more robust in noisy environments than other features such as energy and spectral entropy. The effectiveness of the proposed algorithm is evaluated under the diverse noisy environments in the Aurora2 DB. According to our experiments, the proposed VAD outperforms the ETSI Advanced Frontend VAD.
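
The abstract does not say how the degree of voicing is computed. As a rough illustration only, one common periodicity cue is the peak of the normalized autocorrelation over plausible pitch lags. The function name `periodicity`, the 60-400 Hz pitch range, and the 8 kHz sampling rate below are assumptions made for this sketch and are not taken from the paper. The QSNR feature is not sketched because the abstract does not define it.

```python
import numpy as np

def periodicity(frame, fs, f0_min=60.0, f0_max=400.0):
    """Peak normalized autocorrelation over candidate pitch lags (hypothetical helper)."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()              # remove DC so it does not mimic periodicity
    if np.dot(frame, frame) == 0.0:
        return 0.0
    lag_min = max(1, int(fs / f0_max))        # shortest pitch period in samples
    lag_max = min(int(fs / f0_min), len(frame) - 1)
    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        a, b = frame[:-lag], frame[lag:]      # frame and its copy shifted by `lag`
        denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
        if denom > 0.0:
            best = max(best, float(np.dot(a, b) / denom))
    return best

fs = 8000                                     # assumed sampling rate
n = np.arange(240)                            # one 30 ms frame
voiced_like = np.sin(2 * np.pi * 100.0 * n / fs)          # strongly periodic test signal
noise_like = np.random.default_rng(0).standard_normal(240)  # aperiodic test signal

score_voiced = periodicity(voiced_like, fs)
score_noise = periodicity(noise_like, fs)
```

A periodic frame scores close to 1, while white noise scores much lower, which is why a cue of this kind tends to stay usable when frame energy alone does not separate speech from noise.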

Feedback Active Noise Control Based Voice Enhancing Ear-Protection System

  • Moon, Seong-Pil;Chang, Tae-Gyu
    • Journal of Electrical Engineering and Technology / v.12 no.4 / pp.1627-1633 / 2017
  • This paper proposes a voice enhancing ear-protection system based on feedback active noise control (FBANC). The proposed system selectively suppresses the background noise and preserves the talking voice by controlling the adaptive algorithm with a voice activity period detection module. The noise reduction performance of the proposed noise canceling algorithm is analytically derived for the two key parameters that affect performance, i.e., electro-acoustic coupling distance and noise bandwidth. The proposed system is also implemented on a floating-point DSP system, and its performance is experimentally tested for comparison with the analytically derived results. The noise reduction levels achieved for the three noise bandwidth cases, i.e., 10 Hz, 50 Hz, and 90 Hz, are 17.05 dB, 10.54 dB, and 8.99 dB, respectively. The feasibility of the proposed system is also shown by a peak noise reduction of more than 25 dB, achieved while preserving the voice component in the 200-800 Hz frequency range.

Statistical Voice Activity Detection Using Probabilistic Non-Negative Matrix Factorization (확률적 비음수 행렬 인수분해를 사용한 통계적 음성검출기법)

  • Kim, Dong Kook;Shin, Jong Won;Kwon, Kisoo;Kim, Nam Soo
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.8 / pp.851-858 / 2016
  • This paper presents a new statistical voice activity detection (VAD) based on the probabilistic interpretation of nonnegative matrix factorization (NMF). The objective function of the NMF using Kullback-Leibler divergence coincides with the negative log likelihood function of the data if the distribution of the data given the basis and encoding matrices is modeled as Poisson distributions. Based on this probabilistic NMF, the VAD is constructed using the likelihood ratio test assuming that speech and noise follow Poisson distributions. Experimental results show that the proposed approach outperformed the conventional Gaussian model-based and NMF-based methods at 0-15 dB signal-to-noise ratio simulation conditions.
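
The stated equivalence between the KL-divergence objective and the Poisson negative log-likelihood can be written out as follows. The notation is an assumption made here for illustration: V is the nonnegative data matrix with entries v_{ft}, W and H are the basis and encoding matrices, and \lambda_{ft} = [WH]_{ft} is the Poisson mean of each entry, with entries treated as conditionally independent.

```latex
\begin{align}
D_{\mathrm{KL}}(V \,\|\, WH)
  &= \sum_{f,t} \left( v_{ft} \log \frac{v_{ft}}{\lambda_{ft}} - v_{ft} + \lambda_{ft} \right), \\
-\log p(V \mid W, H)
  &= \sum_{f,t} \left( \lambda_{ft} - v_{ft} \log \lambda_{ft} + \log v_{ft}! \right) \\
  &= D_{\mathrm{KL}}(V \,\|\, WH)
     + \sum_{f,t} \left( \log v_{ft}! - v_{ft} \log v_{ft} + v_{ft} \right).
\end{align}
```

The last sum does not depend on W or H, so minimizing the KL objective and maximizing the Poisson likelihood lead to the same factorization.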

Voice Activity Detection Using Modified Power Spectral Deviation Based on Teager Energy (Teager Energy 기반의 수정된 파워 스펙트럼 편차를 이용한 음성 검출)

  • Song, J.H.;Song, Y.R.;Shim, H.M.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology / v.8 no.1 / pp.41-46 / 2014
  • In this paper, we propose a novel voice activity detection (VAD) algorithm using feature vectors based on Teager energy (TE). Specifically, the power spectral deviation (PSD), which is used as the VAD feature in the IS-127 noise suppression algorithm, is obtained after the input signal is transformed by the Teager energy operator. In addition, a TE-based likelihood ratio is derived in each frame to modify the PSD for the subsequent VAD decision. The performance of the proposed VAD algorithm is evaluated by objective tests (total error rate, receiver operating characteristics, perceptual evaluation of speech quality) under various environments. The proposed method yields better results than conventional VAD algorithms in non-stationary noise environments below 5 dB SNR (total error rate decreased by 2.6%, PESQ score improved by 0.053).
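
For readers unfamiliar with the Teager energy operator mentioned above, its standard discrete form is psi[n] = x[n]^2 - x[n-1] * x[n+1]. The sketch below implements only that operator. The function name `teager_energy` and the choice to copy the nearest interior value to the two endpoints are assumptions of this example, and the paper's PSD and likelihood-ratio steps are not reproduced.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    if x.size >= 3:
        # Endpoints have no two-sided neighbors; reuse the nearest interior value.
        psi[0], psi[-1] = psi[1], psi[-2]
    return psi

# For a pure tone A*cos(omega*n), the operator returns the constant A^2 * sin(omega)^2.
A, omega = 2.0, 0.3
tone = A * np.cos(omega * np.arange(200))
psi_tone = teager_energy(tone)
expected = A ** 2 * np.sin(omega) ** 2
```

Because the output depends on both amplitude and frequency, the operator emphasizes components that plain squared-amplitude energy treats identically.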

Voice Activity Detection Based on SVM Classifier Using Likelihood Ratio Feature Vector (우도비 특징 벡터를 이용한 SVM 기반의 음성 검출기)

  • Jo, Q-Haing;Kang, Sang-Ki;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.26 no.8 / pp.397-402 / 2007
  • In this paper, we apply a support vector machine (SVM) that incorporates an optimized nonlinear decision rule over different sets of feature vectors to improve the performance of statistical model-based voice activity detection (VAD). The conventional method performs VAD by setting up statistical models for the speech-absence and speech-presence hypotheses and comparing the geometric mean of the likelihood ratios (LRs) of the individual frequency bands extracted from the input signal with a given threshold. We propose a novel SVM-based VAD technique that treats the LRs computed in each frequency bin as the elements of a feature vector, in order to minimize the classification error probability, instead of using the conventional decision rule based on the geometric mean. In experiments, the SVM-based VAD using the proposed feature showed better results than previously reported VADs in various noise environments.
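
The conventional rule described above can be sketched as follows. The abstract does not give the statistical model, so this example assumes the widely used complex Gaussian model, in which the per-bin log LR is gamma * xi / (1 + xi) - log(1 + xi), with gamma the a posteriori SNR and xi the a priori SNR. The crude estimate xi = max(gamma - 1, 0), the threshold value, and all function names are assumptions made for illustration.

```python
import numpy as np

def log_likelihood_ratios(power_spectrum, noise_power):
    """Per-bin log LR under an assumed complex Gaussian speech/noise model."""
    gamma = np.asarray(power_spectrum, dtype=float) / noise_power  # a posteriori SNR
    xi = np.maximum(gamma - 1.0, 0.0)   # crude a priori SNR estimate (assumption)
    return gamma * xi / (1.0 + xi) - np.log1p(xi)

def geometric_mean_decision(log_lrs, threshold=0.5):
    """Log of the geometric mean of the LRs equals the mean of the log LRs."""
    return float(np.mean(log_lrs)) > threshold

noise_power = np.ones(8)                        # hypothetical noise estimate per bin
noise_frame = log_likelihood_ratios(np.ones(8), noise_power)
speech_frame = log_likelihood_ratios(10.0 * np.ones(8), noise_power)

is_speech_noise_frame = geometric_mean_decision(noise_frame)
is_speech_speech_frame = geometric_mean_decision(speech_frame)
```

In the SVM variant the abstract proposes, a vector such as `speech_frame` would be passed to a trained classifier as a feature vector rather than being averaged into a single score.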

A Study on Voice Activity Detection Using Auditory Scene and Periodic to Aperiodic Component Ratio in CASA System (CASA 시스템의 청각장면과 PAR를 이용한 음성 영역 검출에 관한 연구)

  • Kim, Jung-Ho;Ko, Hyung-Hwa;Kang, Chul-Ho
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.10 / pp.181-187 / 2013
  • When there is background noise or several people are speaking at the same time, the human auditory sense can attend to a specific target speech signal through Auditory Scene Analysis. A CASA system, which models this human auditory faculty, is able to segregate the speech. However, the performance of a CASA system degrades when it fails to determine the correct position of the speech. To correct errors in locating the speech in a CASA system, this paper proposes a voice activity detection algorithm that combines auditory scene analysis with PAR (Periodic to Aperiodic component Ratio). Experiments were conducted to evaluate voice activity detection performance in white noise and car noise environments with the SNR varied from 15 dB to 0 dB. Compared with the existing algorithms (Pitch and Guoning Hu), the proposed algorithm improves voice activity detection accuracy by up to 4% at an SNR of 15 dB and up to 34% at an SNR of 0 dB for white noise and car noise, respectively.

A Study on the Voice Traffic Efficiency and Buffer Management by Priority Control in ATM Multiplexer (ATM 멀티플렉서에서 우선순위 제어에 의한 음성전송효율 및 버퍼관리에 관한 연구)

  • 이동수;최창수;강준길
    • The Journal of Korean Institute of Communications and Information Sciences / v.19 no.2 / pp.354-363 / 1994
  • This paper describes a method for serving voice traffic efficiently in BISDN. Voice is divided into talkspurts and silent periods, and speech activity detection makes it possible to transmit only the talkspurts. This paper describes a voice traffic control algorithm for the ATM network in which a cell discarding method is applied to embedded ADPCM voice data. For traffic control, low priority cells are discarded when the queue exceeds its threshold. To estimate the efficiency of the traffic control algorithm, a computer simulation was performed with cell loss probability, queue length, and mean delay as performance parameters. The embedded ADPCM voice coding and cell discarding improved the voice cell traffic efficiency and the dynamic control over network congestion.
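
The threshold-based discarding described above can be illustrated with a small buffer model. This is only a sketch under stated assumptions: the class name `ThresholdDiscardQueue`, the capacity and threshold values, and the mapping of embedded ADPCM most significant bits to high priority cells and least significant bits to low priority cells are choices made for this example, not details given in the abstract.

```python
from collections import deque

class ThresholdDiscardQueue:
    """FIFO cell buffer that sheds low-priority cells above a queue threshold."""

    def __init__(self, capacity, threshold):
        self.capacity = capacity      # hard buffer limit, in cells
        self.threshold = threshold    # occupancy at which low-priority cells are shed
        self.cells = deque()
        self.dropped = {"high": 0, "low": 0}

    def enqueue(self, cell, priority):
        occupancy = len(self.cells)
        if occupancy >= self.capacity:
            # Buffer full: any arriving cell is lost, whatever its priority.
            self.dropped[priority] += 1
            return False
        if priority == "low" and occupancy >= self.threshold:
            # Congestion: discard low-priority cells first.
            self.dropped["low"] += 1
            return False
        self.cells.append((cell, priority))
        return True

    def dequeue(self):
        return self.cells.popleft() if self.cells else None

# Three voice samples arrive back to back with no service in between.
buffer = ThresholdDiscardQueue(capacity=4, threshold=2)
for i in range(3):
    buffer.enqueue(f"msb-{i}", "high")   # assumed: MSB cell carries the coarse signal
    buffer.enqueue(f"lsb-{i}", "low")    # assumed: LSB cell only refines quality

queued = [cell for cell, _ in buffer.cells]
```

In this run every high priority cell is kept and two low priority cells are shed, so under congestion the voice degrades in quality rather than being interrupted.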

Frequency Domain Acoustic Echo Suppression Based on Boundary Condition (주파수 영역에서 구간조건을 이용한 음향학적 반향 제거)

  • Lee, Kyu-Ho;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.5 / pp.162-166 / 2009
  • In this paper, we propose a novel acoustic echo cancellation (AEC) algorithm in which the suppression parameter of a parametric Wiener filter (PWF) is applied differently depending on the condition of the current period. The PWF uses the suppression parameter to compensate for uncertainty in the estimate of the acoustic echo signal. The existing PWF method, which uses a fixed suppression parameter, causes distortion of the near-end signal during double-talk. To solve this problem, a boundary condition is devised using the decisions of a double-talk detection (DTD) algorithm and a voice activity detector (VAD). The boundary condition makes it possible to treat the single-talk and double-talk cases differently. According to the experimental results, the proposed approach using the boundary condition is effective for acoustic echo cancellation.

Statistical Voice Activity Detector Based on Signal Subspace Model (신호 준공간 모델에 기반한 통계적 음성 검출기)

  • Ryu, Kwang-Chun;Kim, Dong-Kook
    • The Journal of the Acoustical Society of Korea / v.27 no.7 / pp.372-378 / 2008
  • Voice activity detectors (VADs) are important in wireless communication and speech signal processing. In conventional VAD methods, an expression for the likelihood ratio test (LRT) based on statistical models is derived in the discrete Fourier transform (DFT) domain. Speech or noise is then decided by comparing the value of the expression with a threshold. This paper presents a new statistical VAD method based on a signal subspace approach. Probabilistic principal component analysis (PPCA) is employed to obtain a signal subspace model that incorporates a probabilistic model of the noisy signal into the signal subspace method. The proposed approach provides a novel decision rule based on the LRT in the signal subspace domain. Experimental results show that the proposed signal subspace model-based VAD method outperforms methods based on the widely used Gaussian distribution in the DFT domain.