Decision Tree State Tying Modeling Using Parameter Estimation of Bayesian Method

  • Oh, SangYeob (오상엽) (Dept. of Computer Media Convergence, Gachon University)
  • Received : 2014.07.24
  • Accepted : 2015.01.20
  • Published : 2015.01.28

Abstract

Recognition accuracy degrades when a model was not defined at the time the recognition model was configured, when a model was added after the recognition model was built, or when a shortage of models forces several units to be clustered into a single model. To address these causes, this paper proposes decision tree state tying modeling using parameter estimation of the Bayesian method. The proposed method searches the models through Bayesian parameter estimation and, from the search results, determines the recognition model according to the maximum probability criterion of decision-tree-based state tying. In experiments on simulation data generated by adding noise to clean speech, the proposed clustering method achieved a speech recognition error rate reduction of 1.29% compared with the baseline model, a slight improvement over the existing clustering approach.
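
The abstract does not give the estimation formulas, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes one-dimensional Gaussian states with known variance and a conjugate Gaussian prior on the mean, so that the Bayesian (MAP) estimate pulls sparsely observed tree leaves toward a shared prior; a new observation is then assigned to the tied state with the maximum log-likelihood. The names `map_mean`, `pick_tied_state`, `leaf_a`, `leaf_b`, the pseudo-count `tau`, and all data values are hypothetical.

```python
import math

def map_mean(samples, prior_mean, tau):
    """MAP estimate of a Gaussian mean under a conjugate Gaussian prior.

    tau behaves like a pseudo-count: the prior mean counts as tau
    virtual observations, so leaves with little data stay near it.
    """
    n = len(samples)
    return (tau * prior_mean + sum(samples)) / (tau + n)

def log_likelihood(x, mean, var):
    """Log density of a one-dimensional Gaussian at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def pick_tied_state(x, tied_states):
    """Return the name of the tied state with maximum log-likelihood for x."""
    return max(tied_states, key=lambda name: log_likelihood(x, *tied_states[name]))

# Hypothetical toy data: observations that reached each decision-tree leaf.
prior_mean = 0.0
leaf_samples = {"leaf_a": [1.9, 2.1, 2.0, 2.2], "leaf_b": [-1.0]}

# Each tied state is (MAP mean, assumed known variance of 1.0).
tied_states = {
    name: (map_mean(xs, prior_mean, tau=2.0), 1.0)
    for name, xs in leaf_samples.items()
}

best = pick_tied_state(1.5, tied_states)
print(best, tied_states)
```

With a single observation, `leaf_b` is shrunk strongly toward the prior mean, which is the behavior one would want for models that are under-trained or added after the initial model build.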
