• Title/Summary/Keyword: Interaural cues

Implementation of Sound Source Location Detector (음원 위치 검출기의 구현)

  • 이종혁;김진천
    • Journal of the Korea Institute of Information and Communication Engineering / v.4 no.5 / pp.1017-1025 / 2000
  • The human auditory system has been shown to possess remarkable abilities in the localization and tracking of sound sources. Localization is the result of processing two primary acoustic cues: the interaural time difference (ITD) and the interaural intensity difference (IID) at the two ears. In this paper, we propose the TEPILD (Time Energy Previous Integration Location Detector) model. The TEPILD model consists of a time function generator, an energy function generator, a previous location generator, and an azimuth detector. The time function generator processes the ITD, and the energy function generator processes the IID. The total average accuracy rate is 99.2%. These results are encouraging and show that the proposed model can be applied to a sound source location detector.
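
As a rough illustration of the two cues named in this abstract, the sketch below estimates an ITD as the lag that maximizes the cross-correlation of the two channels, and an IID as the energy ratio in decibels. This is not the TEPILD model itself; the function names, the `max_lag` parameter, and the sign convention are assumptions made for the example.

```python
import math


def estimate_itd_samples(left, right, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    Assumed sign convention: a positive result means the right channel
    lags the left one, i.e. the sound reached the left ear first.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def estimate_iid_db(left, right, eps=1e-12):
    """Return the left/right energy ratio in dB (positive: left is louder)."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    return 10.0 * math.log10((e_left + eps) / (e_right + eps))


# Toy input: an impulse that reaches the right channel 3 samples later
# and at half the amplitude.
left = [0.0] * 32
right = [0.0] * 32
left[10] = 1.0
right[13] = 0.5
itd = estimate_itd_samples(left, right, max_lag=8)
iid = estimate_iid_db(left, right)
```

Dividing the lag by the sampling rate converts it to seconds; mapping the pair of cues to an azimuth would need a head or array model that the abstract does not describe.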

Statistical Model-Based Voice Activity Detection Using Spatial Cues for Dual-Channel Noisy Speech Recognition (이중채널 잡음음성인식을 위한 공간정보를 이용한 통계모델 기반 음성구간 검출)

  • Shin, Min-Hwa;Park, Ji-Hun;Kim, Hong-Kook;Lee, Yeon-Woo;Lee, Seong-Ro
    • Phonetics and Speech Sciences / v.2 no.3 / pp.141-148 / 2010
  • In this paper, a voice activity detection (VAD) method that employs spatial cues is proposed for dual-channel noisy speech recognition. In the proposed method, a probability model for speech presence/absence is constructed using spatial cues obtained from a dual-channel input signal, and speech activity intervals are detected through this probability model. In particular, the spatial cues are composed of the interaural time differences and interaural level differences of the dual-channel speech signals, and the probability model for speech presence/absence is based on a Gaussian kernel density. To evaluate the performance of the proposed VAD method, speech recognition is performed on speech segments that include only the speech intervals detected by the proposed VAD method. The performance of the proposed method is compared with that of several methods: an SNR-based method, a direction-of-arrival (DOA) based method, and a phase vector based method. The speech recognition experiments show that the proposed method outperforms the conventional methods, providing relative word error rate reductions of 11.68%, 41.92%, and 10.15% compared with the SNR-based, DOA-based, and phase vector based methods, respectively.
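
A minimal sketch of how a Gaussian kernel density over spatial cues could yield a speech presence probability is given below. It assumes labeled example cues for speech and for noise, a single shared bandwidth (so the two cues are taken to be normalized to comparable scales), and a Bayes-rule combination with a prior; none of these details come from the abstract, and all names are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density estimate over 2-D (ITD, ILD) cue points.

    Assumes an isotropic Gaussian kernel with one bandwidth for both cues.
    """
    norm = 1.0 / (len(samples) * 2.0 * math.pi * bandwidth ** 2)

    def density(point):
        total = 0.0
        for s in samples:
            d2 = (point[0] - s[0]) ** 2 + (point[1] - s[1]) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        return norm * total

    return density


def speech_presence_probability(cue, speech_density, noise_density,
                                prior_speech=0.5):
    """Posterior probability of speech for one cue point (Bayes' rule)."""
    p_speech = prior_speech * speech_density(cue)
    p_noise = (1.0 - prior_speech) * noise_density(cue)
    if p_speech + p_noise == 0.0:
        return prior_speech  # no evidence either way: fall back to the prior
    return p_speech / (p_speech + p_noise)


# Hypothetical training cues: speech clustered near (0, 0), noise near (5, 5).
speech_kde = gaussian_kde([(0.0, 0.0), (0.2, -0.1)], bandwidth=1.0)
noise_kde = gaussian_kde([(5.0, 5.0), (4.8, 5.3)], bandwidth=1.0)
p_near_speech = speech_presence_probability((0.1, 0.0), speech_kde, noise_kde)
p_near_noise = speech_presence_probability((5.0, 5.0), speech_kde, noise_kde)
```

A frame would then be marked as speech when its probability exceeds a chosen threshold; the threshold and any smoothing across frames are left open here.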

CASA-based Front-end Using Two-channel Speech for the Performance Improvement of Speech Recognition in Noisy Environments (잡음환경에서의 음성인식 성능 향상을 위한 이중채널 음성의 CASA 기반 전처리 방법)

  • Park, Ji-Hun;Yoon, Jae-Sam;Kim, Hong-Kook
    • Proceedings of the IEEK Conference / 2007.07a / pp.289-290 / 2007
  • To improve the performance of a speech recognition system in the presence of noise, we propose a noise-robust front-end that uses two-channel speech signals and separates speech from noise based on computational auditory scene analysis (CASA). The main cues for the separation are the interaural time difference (ITD) and the interaural level difference (ILD) between the two channels. As a result, 39 cepstral coefficients are extracted from the separated speech components. Speech recognition experiments show that the proposed front-end outperforms the ETSI front-end with single-channel speech.
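
One common way to separate speech from noise with these two cues is a binary time-frequency mask, sketched below. The abstract does not say which masking rule is used, so this assumes a target talker directly in front of the microphones (both cues near zero) and hypothetical thresholds `max_abs_itd` and `max_abs_ild`.

```python
def binary_mask(itd, ild, max_abs_itd, max_abs_ild):
    """Build a 0/1 mask over time-frequency units.

    itd and ild are lists of frames, each frame a list of per-band values.
    A unit is kept (1) when both cues are close to zero, which is the
    assumed signature of a frontal target; otherwise it is dropped (0).
    """
    mask = []
    for itd_frame, ild_frame in zip(itd, ild):
        row = []
        for t, l in zip(itd_frame, ild_frame):
            keep = abs(t) <= max_abs_itd and abs(l) <= max_abs_ild
            row.append(1 if keep else 0)
        mask.append(row)
    return mask


def apply_mask(energies, mask):
    """Zero out the time-frequency energies that the mask rejects."""
    return [[e * m for e, m in zip(e_row, m_row)]
            for e_row, m_row in zip(energies, mask)]


# Two frames, three frequency bands (made-up values for illustration).
itd = [[0.0, 0.4, 0.05], [0.02, 0.0, 0.6]]
ild = [[0.5, 0.2, 6.0], [0.1, 0.3, 0.2]]
energies = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
mask = binary_mask(itd, ild, max_abs_itd=0.1, max_abs_ild=1.0)
separated = apply_mask(energies, mask)
```

Cepstral features would then be computed from the retained components only.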

Sound Source Localization Technique at a Long Distance for Intelligent Service Robot (지능형 서비스 로봇을 위한 원거리 음원 추적 기술)

  • Lee Ji-Yeoun;Hahn Min-Soo
    • MALSORI / no.57 / pp.85-97 / 2006
  • This paper proposes an algorithm that estimates the direction of a sound source in real time. The algorithm uses the time difference and sound intensity information of the sound recorded by four microphones. In addition, a Kalman filter is implemented to deal with the noise of the robot itself. The proposed method has a shorter execution time than an existing algorithm, which makes it suitable for a real-time service robot. With the Kalman filter, the signal-to-noise ratio (SNR) relative to background noise is improved to approximately 8 dB, and the azimuth estimate shows a relatively small error, within ±7 degrees.
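
The abstract does not give the state model of its Kalman filter, so the following is only a generic scalar Kalman filter with a random-walk model, applied to a noisy measurement sequence. The parameter names `process_var`, `meas_var`, `x0`, and `p0` are assumptions for this sketch.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Filter a scalar sequence with a random-walk Kalman filter.

    process_var: assumed variance of the state change between samples.
    meas_var:    assumed variance of the measurement noise.
    """
    x, p = x0, p0
    filtered = []
    for z in measurements:
        p = p + process_var        # predict: uncertainty grows
        k = p / (p + meas_var)     # Kalman gain
        x = x + k * (z - x)        # update with the new measurement
        p = (1.0 - k) * p          # updated uncertainty
        filtered.append(x)
    return filtered


# Measurements alternating around a true value of 10 with error +/- 1.
noisy = [10.0 + (1.0 if i % 2 == 0 else -1.0) for i in range(50)]
smoothed = kalman_1d(noisy, process_var=0.01, meas_var=1.0, x0=10.0)
```

A smaller `process_var` relative to `meas_var` gives stronger smoothing at the cost of slower tracking when the source actually moves.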