Simultaneous Speaker and Environment Adaptation by Environment Clustering in Various Noise Environments

Kim, Young-Kuk; Song, Hwa-Jeon; Kim, Hyung-Soon

doi:10.7776/ASK.2009.28.6.566

The Journal of the Acoustical Society of Korea (한국음향학회지)

Volume 28, Issue 6 / Pages 566-571 / 2009 / 1225-4428 (pISSN) / 2287-3775 (eISSN)

The Acoustical Society of Korea (한국음향학회)

다양한 잡음 환경하에서 환경 군집화를 통한 화자 및 환경 동시 적응

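Kim, Young-Kuk (LG Electronics Institute of Technology);
Song, Hwa-Jeon (School of Electronics and Electrical Engineering, Pusan National University);
Kim, Hyung-Soon (School of Electronics and Electrical Engineering, Pusan National University)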

Published : 2009.08.31

https://doi.org/10.7776/ASK.2009.28.6.566

Abstract

This paper proposes a noise-robust, fast speaker adaptation method based on the eigenvoice framework for use in various noisy environments. The method combines de-noising with environment clustering. Because the de-noised adaptation database still contains residual noise, environment clustering groups the noisy adaptation data into similar environments, using the cepstral mean of non-speech segments as the feature vector. The adaptation data in each cluster are then used to build an environment-clustered speaker-adapted (SA) model. At test time, multiple environment-clustered SA models similar to the test environment are selected, and speaker adaptation is performed with an appropriate linear combination of them. In our experiments, the proposed method reduces the error rate by 40-59% compared with the baseline speaker-independent model.

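This paper proposes a fast speaker adaptation method, based on the eigenvoice approach, that is robust to various noise environments. The proposed method builds on noise removal and environment clustering. Because residual noise remains even after noise removal, the speaker adaptation data are classified by noise environment using the cepstral mean of non-speech segments, and an environment model is built for each environment. Once this environment clustering has been built from the adaptation data, when test speech arrives the method finds, among the clustered models, the several environment-clustered speaker-adapted models most similar to the recognition data and performs speaker adaptation through their weighted sum. In adaptation and evaluation experiments, the proposed method achieved a 40-59% reduction in recognition error rate compared with using the speaker-independent model.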
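The clustering and model-selection steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the clustering algorithm or the weight estimation, so plain k-means and inverse-distance weights are assumptions made here for illustration, as are all function names and the representation of each SA model as a single mean vector.

```python
import numpy as np


def nonspeech_cepstral_mean(cepstra, is_speech):
    """Environment feature: mean cepstrum over non-speech frames.

    cepstra: (frames, dims) array; is_speech: boolean mask per frame.
    """
    return cepstra[~is_speech].mean(axis=0)


def kmeans(features, k, iters=20, seed=0):
    """Plain k-means (assumed here; the abstract only says 'a clustering method')."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points (fancy indexing copies).
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    # Recompute labels so they match the final centroids.
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


def select_clusters(test_feature, centroids, n_best=2):
    """Pick the n_best environment clusters closest to the test environment.

    The inverse-distance weights are a hypothetical stand-in; the paper
    estimates the combination within the eigenvoice framework.
    """
    dists = np.linalg.norm(centroids - test_feature, axis=1)
    nearest = np.argsort(dists)[:n_best]
    weights = 1.0 / (dists[nearest] + 1e-8)
    return nearest, weights / weights.sum()


def combine_models(sa_means, nearest, weights):
    """Linear combination of the selected SA models' mean vectors."""
    return np.tensordot(weights, sa_means[nearest], axes=1)
```

In use, one feature vector per adaptation utterance would be computed with `nonspeech_cepstral_mean`, the utterances clustered with `kmeans`, and one SA model trained per cluster; at test time `select_clusters` and `combine_models` would produce the adapted model parameters.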
