Non-Keyword Model for the Improvement of Vocabulary Independent Keyword Spotting System

A Non-Keyword Model for Improving Vocabulary-Independent Keyword Spotting Performance

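  • 김민제 (School of Computer Engineering and Information Technology, University of Ulsan)
  • 이정철 (School of Computer Engineering and Information Technology, University of Ulsan)
  • Published: 2006.10.31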

Abstract

We propose two new methods for non-keyword modeling to improve the performance of a speaker- and vocabulary-independent keyword spotting system. The first method is decision tree clustering of monophones at the state level, in place of monophone clustering based on the K-means algorithm. The second method is multi-state, multiple-mixture modeling at the syllable level, rather than a single-state multiple-mixture model, for the non-keyword. To evaluate our methods, we used the ETRI speech DB for training and for the keyword spotting test (closed test). We also conducted an open test to spot 100 keywords in 400 sentences uttered by 4 speakers in an office environment. The experimental results show that the decision tree-based state clustering method improves keyword spotting by 28%/29% (closed/open test) over the monophone clustering method based on the K-means algorithm, and that multi-state non-keyword modeling at the syllable level improves it by 22%/2% (closed/open test) over the single-state non-keyword model. These results show that both proposed methods improve keyword spotting performance.
