A Study on the Korean Broadcasting Speech Recognition

;;;

The Journal of the Acoustical Society of Korea (한국음향학회지)

Vol. 18, No. 1 / pp. 53-60 / 1999 / pISSN 1225-4428 / eISSN 2287-3775

The Acoustical Society of Korea (한국음향학회)

A Study on the Korean Broadcasting Speech Recognition (한국어 방송 음성 인식에 관한 연구)

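Kim Seok-dong (김석동), Division of Computer Science, Hoseo University;
Song Do-seon (송도선), Electronics and Information Division, Woosong Technical College;
Lee Haeng-se (이행세), Department of Electronic Engineering, Ajou University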

Published: 1999.01.01

Abstract

This paper studies speech recognition for Korean broadcast speech and presents methods for large-vocabulary continuous speech recognition. Our main concerns are language modeling and the search algorithm. The acoustic model is a uni-phone (basic phoneme) semi-continuous hidden Markov model (HMM), and the language model is an N-gram model. The search algorithm consists of three phases so that all available acoustic and linguistic information is used. First, a forward Viterbi beam search finds word-end frames and estimates the related scores. Second, a backward Viterbi beam search finds word-begin frames and estimates the related scores. Finally, an A* search combines these two results with the N-gram language model to produce the final recognition result. With these methods, we achieved a maximum word recognition rate of 96.0% and a syllable recognition rate of 99.2% on speaker-independent continuous speech recognition with a vocabulary of about 12,000 words.
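The final phase of the search described in the abstract, an A* search that combines acoustic scores with an N-gram language model, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy word lattice of hypothetical `(start_frame, end_frame, word, acoustic_cost)` edges standing in for the output of the two Viterbi beam passes, a hypothetical bigram cost function `toy_lm_cost`, and costs expressed as negative log probabilities. The heuristic here is the cheapest acoustic-only cost to the end of the utterance, which never overestimates the true remaining cost because language model costs are non-negative.

```python
import heapq
import math
from collections import defaultdict


def backward_heuristic(edges, num_frames):
    """Cheapest acoustic-only cost from each frame to the final frame."""
    h = [math.inf] * (num_frames + 1)
    h[num_frames] = 0.0
    # Visit edges by decreasing start frame so h[end] is final when used.
    for start, end, _word, ac_cost in sorted(edges, key=lambda e: -e[0]):
        h[start] = min(h[start], ac_cost + h[end])
    return h


def astar_decode(edges, num_frames, lm_cost):
    """Return (best word sequence, total cost) under acoustic + bigram costs."""
    h = backward_heuristic(edges, num_frames)
    by_start = defaultdict(list)
    for start, end, word, ac_cost in edges:
        by_start[start].append((end, word, ac_cost))

    # Queue entries: (g + h, g, frame, word history); "<s>" marks sentence start.
    queue = [(h[0], 0.0, 0, ("<s>",))]
    closed = set()
    while queue:
        _f, g, frame, history = heapq.heappop(queue)
        if frame == num_frames:
            return list(history[1:]), g
        state = (frame, history[-1])  # a bigram LM only needs the last word
        if state in closed:
            continue
        closed.add(state)
        for end, word, ac_cost in by_start[frame]:
            if h[end] == math.inf:
                continue  # this edge cannot reach the end of the utterance
            g2 = g + ac_cost + lm_cost(history[-1], word)
            heapq.heappush(queue, (g2 + h[end], g2, end, history + (word,)))
    return None, math.inf


# Hypothetical toy lattice: two competing words in each half of a 4-frame input.
toy_edges = [
    (0, 2, "nine", 1.0),
    (0, 2, "mine", 0.8),
    (2, 4, "news", 1.0),
    (2, 4, "noose", 0.9),
]

# Hypothetical bigram costs (negative log probabilities); unseen pairs cost 5.0.
TOY_BIGRAMS = {
    ("<s>", "nine"): 0.5,
    ("<s>", "mine"): 0.7,
    ("nine", "news"): 0.2,
    ("mine", "news"): 2.0,
    ("nine", "noose"): 3.0,
    ("mine", "noose"): 3.0,
}


def toy_lm_cost(prev_word, word):
    return TOY_BIGRAMS.get((prev_word, word), 5.0)


best_words, best_cost = astar_decode(toy_edges, 4, toy_lm_cost)
print(best_words, round(best_cost, 3))
```

In this toy lattice the acoustically cheapest path is "mine noose", but adding the bigram costs makes "nine news" the best overall hypothesis, which shows why the language model is applied in the final pass rather than relying on acoustic scores alone.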
