Performance Improvement of Continuous Digits Speech Recognition Using the Transformed Successive State Splitting and Demi-syllable Pair

Seo Eun-Kyoung; Choi Gab-Keun; Kim Soon-Hyob; Lee Soo-Jeong

Journal of Korea Multimedia Society

Volume 9, Issue 1 / Pages 23-32 / 2006 / 1229-7771 (pISSN) / 2384-0102 (eISSN)

Korea Multimedia Society

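Seo Eun-Kyoung (Department of Computer Engineering, Graduate School, Kwangwoon University);
Choi Gab-Keun (Department of Computer Engineering, Graduate School, Kwangwoon University);
Kim Soon-Hyob (Department of Computer Engineering, Kwangwoon University);
Lee Soo-Jeong (Speech Signal Processing, Kwangwoon University)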

Published: 2006.01.01

Abstract

This paper describes how the recognition of Korean unit digits is optimized by improving both the language model and the acoustic model. For the language model, the grammatical features of Korean unit-digit sentences were analyzed, and each node of the finite state network (FSN) was built from two syllables, which reduced the misrecognition rate. For the acoustic model, the digits are monosyllabic, so their utterance duration is short and coarticulation is heavy; to reduce the misrecognition caused by unclear phones and by inaccurate segmentation of syllables, the demi-syllable pair was used as the recognition unit. To model the features of this recognition unit effectively, a transformed successive state splitting method was used, which splits states by clustering with the K-means algorithm at the feature level. In the experiments, applying the proposed language model raised the recognition rate by 10.5% with the same context-dependent phone model; using the demi-syllable pair as the recognition unit raised it by 12.5% compared with the context-dependent phone model; and applying the transformed successive state splitting raised it by 1.5%.
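The abstract does not spell out the splitting procedure, so the following is only a minimal illustrative sketch of the general idea of splitting one HMM state by K-means clustering of its feature frames. The function names `two_means` and `split_state`, the choice of k = 2, and the diagonal-variance statistics are all assumptions made for illustration, not the authors' method; how a state is selected for splitting and how transitions are rebuilt are not covered.

```python
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean_vec(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_means(frames, iters=20, seed=0):
    """Lloyd's K-means with k = 2 (hypothetical helper).

    Returns a 0/1 cluster label for every frame.
    """
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(frames, 2)]
    labels = [0] * len(frames)
    for _ in range(iters):
        # Assignment step: each frame goes to its nearest centroid.
        labels = [min((0, 1), key=lambda j: sq_dist(f, centroids[j]))
                  for f in frames]
        # Update step: move each centroid to the mean of its members.
        for j in (0, 1):
            members = [f for f, lab in zip(frames, labels) if lab == j]
            if members:
                centroids[j] = mean_vec(members)
    return labels


def split_state(frames):
    """Split the frames aligned to one state into up to two child states.

    Each child is summarized by a mean and a per-dimension variance
    (an assumed diagonal-Gaussian summary). If clustering degenerates
    into a single cluster, only one child is returned.
    """
    labels = two_means(frames)
    children = []
    for j in (0, 1):
        members = [f for f, lab in zip(frames, labels) if lab == j]
        if not members:
            continue
        mu = mean_vec(members)
        var = [sum((x - m) ** 2 for x in col) / len(members)
               for col, m in zip(zip(*members), mu)]
        children.append({"mean": mu, "var": var, "count": len(members)})
    return children


# Toy 2-dimensional "feature frames" forming two obvious groups.
toy_frames = [[0.1, 0.0], [0.0, 0.2], [-0.1, 0.1],
              [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
for child in split_state(toy_frames):
    print(child["count"], child["mean"])
```

In a real recognizer the frames would be the acoustic feature vectors aligned to a state of a demi-syllable pair model; the toy vectors here only exercise the clustering step.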
