Forensic Automatic Speaker Identification System for Korean Speakers

과학수사를 위한 한국인 음성 특화 자동화자식별시스템

  • Received : 2012.07.25
  • Accepted : 2012.09.14
  • Published : 2012.09.30

Abstract

In this paper, we introduce the automatic speaker identification system 'SPO (Supreme Prosecutors Office) Verifier'. SPO Verifier is an automatic speaker recognition system based on a Gaussian mixture model (GMM) and a universal background model (UBM), and it has been developed using utterances from Korean speakers. The system uses a channel compensation algorithm to compensate for recording device characteristics. It also lets users manage reference models built from utterances recorded in various environments, in order to obtain more accurate recognition results. To evaluate the performance of SPO Verifier on Korean speakers, we compared it with one of the most widely used commercial systems in the forensic field. SPO Verifier achieved a lower equal error rate (EER) than the commercial system.
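The EER used for the comparison above is the operating point at which the false acceptance rate and the false rejection rate are equal. The abstract does not describe the evaluation protocol, so the following is only a minimal sketch of how an EER can be estimated from two sets of verification scores. The function name `equal_error_rate` and the example scores are hypothetical and are not taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from same-speaker and different-speaker scores.

    A trial is accepted when its score is >= the threshold. Every observed
    score is tried as a threshold, and the one where the false acceptance
    rate (FAR) and false rejection rate (FRR) are closest is kept.
    Returns (eer, threshold), where eer is the mean of FAR and FRR there.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    best_gap, best_eer, best_threshold = None, None, None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # targets rejected
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer, best_threshold = gap, (far + frr) / 2, t
    return best_eer, best_threshold


# Hypothetical log-likelihood-ratio scores, for illustration only.
genuine_scores = [2.0, 1.5, 0.2, 1.8]     # same-speaker trials
impostor_scores = [-1.0, 0.5, -0.3, 0.1]  # different-speaker trials

eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2f} at threshold {threshold}")  # EER = 0.25 at threshold 0.5
```

With a finite set of trials the two error rates rarely cross exactly, so this sketch reports the average of the two rates at the closest threshold. Other conventions, such as interpolating between neighbouring thresholds, give slightly different values.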
