A Study on Speech Recognition based on Phoneme for Korean Subway Station Names

  • Received : 2011.01.31
  • Accepted : 2011.06.11
  • Published : 2011.06.26

Abstract

This paper presents a method for implementing phoneme-based speech recognition of Korean subway station names that takes their phonological characteristics into account. To select the optimal phoneme-like unit (PLU) set for this task, PLU sets and pronunciation dictionaries reflecting phonological variations were constructed for four cases, and the recognition rate of each case was evaluated. Among the PLU sets tested, the best recognition rate (97.74%) was obtained with the triphone model when initial and final consonants were treated as separate recognition units and phonological variations were reflected.
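
To make the idea of separate recognition units for initial and final consonants concrete, here is a minimal sketch. It is not the paper's PLU set or any of its four cases: the `_i` and `_f` suffixes, the function names, and the single coda-neutralization rule are assumptions made for illustration. It decomposes each Hangul syllable using the standard Unicode arithmetic and emits one pronunciation-dictionary style unit sequence per station name.

```python
# Illustrative sketch only. The unit labels ("_i", "_f") and the single
# phonological rule below are hypothetical; they are not the paper's PLU set.

CHOSEONG = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")        # 19 initial consonants
JUNGSEONG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")   # 21 vowels
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # none + 27 finals

# Simplified coda neutralization for single final consonants (assumption:
# the word is pronounced in isolation; consonant clusters are left unchanged).
CODA_NEUTRALIZATION = {
    "ㄲ": "ㄱ", "ㅋ": "ㄱ",
    "ㅅ": "ㄷ", "ㅆ": "ㄷ", "ㅈ": "ㄷ", "ㅊ": "ㄷ", "ㅌ": "ㄷ", "ㅎ": "ㄷ",
    "ㅍ": "ㅂ",
}

HANGUL_BASE = 0xAC00   # code point of the first precomposed syllable
HANGUL_LAST = 0xD7A3   # code point of the last precomposed syllable


def decompose(syllable):
    """Split one precomposed Hangul syllable into (initial, vowel, final)."""
    code = ord(syllable)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError(f"not a Hangul syllable: {syllable!r}")
    index = code - HANGUL_BASE
    initial = CHOSEONG[index // (21 * 28)]
    vowel = JUNGSEONG[(index % (21 * 28)) // 28]
    final = JONGSEONG[index % 28]
    return initial, vowel, final


def to_units(name, apply_variation=True):
    """Map a station name to units that keep initial and final consonants distinct."""
    units = []
    for syllable in name:
        initial, vowel, final = decompose(syllable)
        if initial != "ㅇ":              # syllable-initial ㅇ is silent, so skip it
            units.append(initial + "_i")
        units.append(vowel)
        if final:
            if apply_variation:
                final = CODA_NEUTRALIZATION.get(final, final)
            units.append(final + "_f")
    return units


if __name__ == "__main__":
    for station in ["강남", "역삼"]:
        print(station, " ".join(to_units(station)))
```

Tagging position this way means, for example, that ㄱ at the start of a syllable and ㄱ at the end are modelled by different units, which is the distinction the abstract reports as helpful. A real dictionary would need many more phonological rules, including ones that apply across syllable boundaries.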
