Semantic Ontology Speech Recognition Performance Improvement using ERB Filter

  • Received : 2014.07.17
  • Reviewed : 2014.10.20
  • Published : 2014.10.28

Abstract

Existing speech recognition algorithms have two weaknesses: they do not fix the order among vocabulary items, and they detect speech inaccurately when noise arises from changes in the recognition environment. Retrieval systems, in turn, fail to identify the information the user wants because keywords carry multiple meanings. In this study, we propose an event-based semantic ontology inference model and, within the proposed system, build a model that extracts speech recognition features using an ERB filter. To evaluate performance, we used subway station and subway train noise. Noise removal was performed on signals at SNRs of -10 dB and -5 dB, and the measured distortion showed improvements of 2.17 dB and 1.31 dB, respectively.
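As a rough illustration of the two technical ingredients named in the abstract, the sketch below computes ERB-spaced center frequencies and mixes noise into speech at a target SNR. It assumes that "ERB" means equivalent rectangular bandwidth and uses the widely cited Glasberg and Moore (1990) formulas; the abstract does not state which filter bank design, channel count, or frequency range the authors used, so the function names and parameters here are hypothetical.

```python
import math

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at frequency f_hz (Glasberg & Moore, 1990)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def hz_to_erb_rate(f_hz):
    """Map a frequency in Hz to the ERB-rate scale (number of ERBs below f_hz)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb / 21.4) - 1.0) * 1000.0 / 4.37

def erb_center_frequencies(f_low, f_high, n_channels):
    """Center frequencies equally spaced on the ERB-rate scale (n_channels >= 2).

    The range and channel count are assumptions, not values from the paper.
    """
    lo, hi = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (hi - lo) / (n_channels - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_channels)]

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that speech power / noise power equals snr_db, then add it."""
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

A test condition like the -10 dB case could be built with `mix_at_snr(speech, noise, -10.0)`, where `speech` and `noise` are equal-length lists of samples. Spacing channels on the ERB-rate scale places more filters at low frequencies, which roughly follows the frequency resolution of human hearing.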
