Effective Combination of Temporal Information and Linear Transformation of Feature Vector in Speaker Verification

화자확인에서 특징벡터의 순시 정보와 선형 변환의 효과적인 적용

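  • 서창우 (Global School of Media, Soongsil University) ;
  • 조미화 (Global School of Media, Soongsil University) ;
  • 임영환 (Global School of Media, Soongsil University) ;
  • 전성채 (Convergence Technology Research Division, Korea Electrotechnology Research Institute)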
  • Published : 2009.12.30

Abstract

The feature vectors used in conventional speaker recognition (SR) systems may be highly correlated with their neighbors. To improve SR performance, many researchers have adopted linear transformation methods such as principal component analysis (PCA). In general, the linear transformation is applied to the concatenation of the static features and their dynamic features. However, a linear transformation based on both the static and the dynamic features is more complex than one based on the static features alone, because the concatenated vector has a higher order. To overcome this problem, we propose an efficient method that combines linear transformation with the temporal information of the features, reducing complexity while improving performance in speaker verification (SV). The proposed method first applies a linear transformation to the static features using PCA coefficients. The delta parameters that carry the temporal information are then obtained from the transformed features. Because PCA is estimated on the static features alone, the covariance matrix is only 1/4 the size of the one required when PCA is applied to the concatenated static and dynamic features. In addition, the delta parameters are extracted from the linearly transformed features after the dimension of the static features has been reduced. In terms of equal error rate (EER) in SV, the proposed method outperforms both the PCA-based and the conventional methods while requiring less storage space and lower complexity.
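
The ordering described in the abstract (PCA on the static features first, delta parameters from the transformed features second) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the regression-style delta formula, the window width, and the feature dimensions are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def pca_basis(static, n_components):
    """Estimate PCA on static features only (frames x d).

    Returns the mean vector and a (d x n_components) basis whose columns
    are the leading eigenvectors of the d x d covariance matrix.
    """
    mean = static.mean(axis=0)
    cov = np.cov(static - mean, rowvar=False)      # d x d, not 2d x 2d
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]

def deltas(feats, width=2):
    """Regression-style delta parameters over +/- width frames.

    Assumption: this common delta formula with edge padding stands in for
    whatever delta computation the paper actually uses.
    """
    n_frames = feats.shape[0]
    padded = np.pad(feats, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = np.zeros(feats.shape, dtype=float)
    for k in range(1, width + 1):
        ahead = padded[width + k : width + k + n_frames]
        behind = padded[width - k : width - k + n_frames]
        out += k * (ahead - behind)
    return out / denom

def transform_then_delta(static, n_components, width=2):
    """Project static features with PCA, then append deltas of the projection."""
    mean, basis = pca_basis(static, n_components)
    projected = (static - mean) @ basis
    return np.hstack([projected, deltas(projected, width)])
```

With a hypothetical 12-dimensional static vector, the covariance matrix estimated here has 12 × 12 = 144 entries. If PCA were applied to the 24-dimensional concatenation of static and delta features, it would have 24 × 24 = 576 entries, which is the 1/4 ratio stated in the abstract.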

Keywords