Single-Channel Speech Separation Using the Time-Frequency Smoothed Soft Mask Filter

Korean title: 시간-주파수 스무딩이 적용된 소프트 마스크 필터를 이용한 단일 채널 음성 분리

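  • Yun-Kyung Lee (이윤경; Department of Control and Instrumentation Engineering, Chungbuk National University)
  • Oh-Wook Kwon (권오욱; School of Electrical and Computer Engineering, Chungbuk National University)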
  • Published: 2008.09.30

Abstract

This paper addresses single-channel speech separation: extracting the speech signal uttered by the speaker of interest from a mixture of speech signals. We propose applying time-frequency smoothing to two existing statistical single-channel speech separation algorithms: the soft mask algorithm and the minimum-mean-square-error (MMSE) algorithm. The proposed method uses two smoothing filters. One is the uniform mask filter, whose filter length is uniform in the time-frequency domain; the other is the mel-scale filter, whose filter length is mel-scaled in the time domain. In our speech separation experiments, the uniform mask filter improves the speaker-to-interference ratio (SIR) by 2.1 dB for the soft mask algorithm and by 1 dB for the MMSE algorithm, whereas the mel-scale filter achieves 1.1 dB and 0.8 dB, respectively, for the same algorithms.
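
The two smoothing filters described in the abstract can be illustrated with a minimal sketch. The abstract does not give window sizes, the exact mel mapping, or how the mask is estimated, so everything below is an assumption made for illustration: the mask is a list of time frames, each a list of frequency-bin values in [0, 1]; the uniform filter is taken to be a moving average over a fixed time-frequency window; the mel-scale filter is taken to be a time-direction moving average whose length per frequency bin follows a common mel formula. The function names (`smooth_uniform`, `mel_half_lengths`, `smooth_time_per_bin`, `apply_mask`) and the parameters `half_t`, `half_f`, and `max_half` are hypothetical and are not taken from the paper.

```python
import math


def hz_to_mel(f_hz):
    # A widely used mel formula; the paper's exact mapping is not stated in the abstract.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def smooth_uniform(mask, half_t=1, half_f=1):
    """Average each cell over a fixed time-frequency window, clipped at the edges."""
    n_t, n_f = len(mask), len(mask[0])
    out = [[0.0] * n_f for _ in range(n_t)]
    for t in range(n_t):
        for f in range(n_f):
            cells = [mask[i][j]
                     for i in range(max(0, t - half_t), min(n_t, t + half_t + 1))
                     for j in range(max(0, f - half_f), min(n_f, f + half_f + 1))]
            out[t][f] = sum(cells) / len(cells)
    return out


def mel_half_lengths(n_f, sample_rate=16000, max_half=3):
    """Hypothetical rule: time half-length per bin grows with the bin's mel value.

    Requires n_f >= 2. The direction and scale of the mapping are assumptions.
    """
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    return [round(max_half * hz_to_mel(nyquist * k / (n_f - 1)) / top)
            for k in range(n_f)]


def smooth_time_per_bin(mask, half_lengths):
    """Average along time only, with a separate half-length for each frequency bin."""
    n_t, n_f = len(mask), len(mask[0])
    out = [[0.0] * n_f for _ in range(n_t)]
    for f in range(n_f):
        h = half_lengths[f]
        for t in range(n_t):
            frames = [mask[i][f] for i in range(max(0, t - h), min(n_t, t + h + 1))]
            out[t][f] = sum(frames) / len(frames)
    return out


def apply_mask(mask, mixture_mag):
    """Scale the mixture magnitude spectrogram by the mask, cell by cell."""
    return [[m * x for m, x in zip(m_row, x_row)]
            for m_row, x_row in zip(mask, mixture_mag)]
```

Under these assumptions, a smoothed mask would be obtained with `smooth_uniform(mask)` or `smooth_time_per_bin(mask, mel_half_lengths(len(mask[0])))` and then passed to `apply_mask` together with the mixture magnitude spectrogram. Because both filters are plain averages, a mask with values in [0, 1] stays in [0, 1] after smoothing.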

Keywords