• Title/Summary/Keyword: Word error rate

Performance Evaluation of English Word Pronunciation Correction System (한국인을 위한 외국어 발음 교정 시스템의 개발 및 성능 평가)

  • Kim Mu Jung;Kim Hyo Sook;Kim Sun Ju;Kim Byoung Gi;Ha Jin-Young;Kwon Chul Hong
    • MALSORI / no.46 / pp.87-102 / 2003
  • In this paper, we present an English pronunciation correction system for Korean speakers and report experimental results on it. The aim of the system is to detect mispronounced phonemes in spoken words and to give users appropriate correction comments. Several English pronunciation correction systems adopt speech recognition technology, but most of them use conventional speech recognition engines and therefore cannot give phoneme-based correction comments. In our system, we build two kinds of phoneme models: standard native-speaker models and error models for Korean speakers. We also design a phoneme-based recognition network to detect the common mispronunciations of Korean speakers. We obtain a 90% detection rate for phoneme insertions, deletions, and replacements, but do not achieve a high detection rate for diphthong splitting and accents.

On a Reduced-Complexity Inner Decoder for the Davey-MacKay Construction

  • Jiao, Xiaopeng;Armand, M.A.
    • ETRI Journal / v.34 no.4 / pp.637-640 / 2012
  • The Davey-MacKay construction is a promising concatenated coding scheme involving an outer $2^k$-ary code and an inner code of rate $k/n$, for insertion-deletion-substitution channels. Recently, a lookup table (LUT)-based inner decoder for this coding scheme was proposed to reduce the computational complexity of the inner decoder, albeit at the expense of a slight degradation in word error rate (WER) performance. In this letter, we show that negligible deterioration in WER performance can be achieved with an LUT as small as $7{\cdot}2^{k+n-1}$, but no smaller, when the probability of receiving fewer than $n-1$ or more than $n+1$ bits corresponding to one outer code symbol is at least an order of magnitude smaller than the WER when no LUT is used.

Isolated Word Recognition Algorithm Using Lexicon and Multi-layer Perceptron (단어사전과 다층 퍼셉트론을 이용한 고립단어 인식 알고리듬)

  • 이기희;임인칠
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.8 / pp.1110-1118 / 1995
  • Over the past few years, a wide variety of techniques have been developed that enable reliable recognition of speech signals. The multi-layer perceptron (MLP), which has excellent pattern recognition properties, is one of the most versatile networks in the area of speech recognition. This paper describes an automatic speech recognition system that uses both an MLP and a lexicon. In this system, recognition is performed by a network search algorithm that matches words in the lexicon to MLP output scores. We also suggest a recognition algorithm that incorporates the durational information of each phone and whose performance is comparable to that of a conventional continuous HMM (CHMM). The system is evaluated on a database with a 26-word vocabulary collected from 9 speakers. The experimental results show that the proposed algorithm achieves an error rate of 7.3%, which is 5.3 percentage points lower than the 12.6% of the CHMM.

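The abstract above quotes the improvement as an absolute difference between two error rates. As a minimal sketch (the helper name `error_reduction` is hypothetical and not from the paper), the same two figures can also be expressed as a relative reduction, which is how some other abstracts in this listing report their gains:

```python
def error_reduction(baseline_pct: float, proposed_pct: float) -> tuple[float, float]:
    """Return (absolute reduction in percentage points, relative reduction in percent)."""
    absolute = baseline_pct - proposed_pct
    relative = 100.0 * absolute / baseline_pct
    return absolute, relative


# Figures quoted in the abstract: CHMM 12.6%, proposed algorithm 7.3%.
abs_points, rel_percent = error_reduction(12.6, 7.3)
print(f"{abs_points:.1f} points absolute, {rel_percent:.1f}% relative")
# prints "5.3 points absolute, 42.1% relative"
```

Keeping the two measures apart avoids reading "5.3% lower" as a relative figure.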

Design of Bluetooth baseband System (블루투스 기저대역 시스템 설계)

  • 백은창;조현묵
    • Journal of Korea Multimedia Society / v.5 no.2 / pp.206-214 / 2002
  • In this paper, we design and verify a baseband system that performs the various protocol functions of the Bluetooth specification. To verify the developed circuits, various baseband functions are tested using the ModelSim simulator. The developed circuits operate at a 4 MHz main clock. The test suite includes the hop selection function, generation of the sync word, error correction (1/3 rate FEC, 2/3 rate FEC), HEC generation/checking, CRC generation/checking, data whitening/dewhitening, and the packet transmission/reception procedures. The simulation results verify that the developed baseband system conforms to the specification of the Bluetooth system.

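Of the baseband functions listed above, the 1/3 rate FEC is the simplest to illustrate: in Bluetooth it is a repetition code that sends each bit three times, and the receiver recovers each bit by majority vote. The sketch below is a software illustration only, not the paper's circuit; the function names `fec13_encode` and `fec13_decode` are hypothetical.

```python
def fec13_encode(bits: list[int]) -> list[int]:
    """Repeat every input bit three times (rate 1/3 repetition code)."""
    return [b for b in bits for _ in range(3)]


def fec13_decode(coded: list[int]) -> list[int]:
    """Majority-vote each group of three received bits."""
    if len(coded) % 3 != 0:
        raise ValueError("coded length must be a multiple of 3")
    return [1 if sum(coded[i:i + 3]) >= 2 else 0
            for i in range(0, len(coded), 3)]


coded = fec13_encode([1, 0, 1])
coded[4] ^= 1                      # flip one bit inside the second group
assert fec13_decode(coded) == [1, 0, 1]
```

A single bit error per group of three is corrected; two or more errors in the same group are decoded incorrectly.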

The Usage of Phoneme Duration Information for Rejecting Garbage Sentences (소음문장 제거를 위한 음소지속시간 사용)

  • Koo Myoung-Wan;Kim Ho-Kyoung;Park Sung-Joon;Kim Jae-In
    • Proceedings of the KSPS conference / 2003.05a / pp.219-222 / 2003
  • In this paper, we study the use of phoneme duration information for rejecting garbage sentences. First, we build a phoneme duration model in a speech recognition system based on decision-tree state tying. We assume that phone duration has a Gamma distribution. Next, we build a verification module in which a word-level confidence measure is used. Finally, we make a comparative study on phoneme duration with a speech DB obtained from the live system. This DB consists of OOT (out-of-task) and ING (in-grammar) utterances. Using phone duration information improves the OOT recognition rate by 46%, and the error rate is reduced by a further 8.4% when it is combined with the utterance verification module.

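The abstract assumes Gamma-distributed phone durations but does not say how the parameters are estimated or how the score enters the verification module. The following is therefore only a sketch under stated assumptions: a method-of-moments fit and a log-density score, with hypothetical names `fit_gamma` and `gamma_log_pdf` and made-up durations in frames.

```python
import math


def fit_gamma(durations: list[float]) -> tuple[float, float]:
    """Method-of-moments estimate of the Gamma shape and scale (an assumed estimator)."""
    n = len(durations)
    mean = sum(durations) / n
    var = sum((d - mean) ** 2 for d in durations) / n
    if var <= 0:
        raise ValueError("durations must not all be identical")
    return mean * mean / var, var / mean   # (shape, scale)


def gamma_log_pdf(x: float, shape: float, scale: float) -> float:
    """Log density of Gamma(shape, scale) at x > 0."""
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


# Hypothetical training durations (in frames) for one phone.
shape, scale = fit_gamma([4, 6, 8, 10, 12])
typical = gamma_log_pdf(8, shape, scale)
implausible = gamma_log_pdf(30, shape, scale)
assert typical > implausible   # an unusually long phone scores lower
```

A low duration score for many phones in an utterance could then serve as one cue that the utterance is out of task.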

User Adaptive Post-Processing in Speech Recognition for Mobile Devices (모바일 기기를 위한 음성인식의 사용자 적응형 후처리)

  • Kim, Young-Jin;Kim, Eun-Ju;Kim, Myung-Won
    • Journal of KIISE:Computing Practices and Letters / v.13 no.5 / pp.338-342 / 2007
  • In this paper, we propose a user-adaptive post-processing method to improve the accuracy of speaker-dependent, isolated-word speech recognition, particularly for mobile devices. Our method treats the recognition result of the basic recognizer simply as a high-level speech feature and processes it further to obtain the correct recognition result. It learns the correlation between the output of the basic recognizer and the correct final results and uses it to correct the erroneous output of the basic recognizer. A multi-layer perceptron model is built for each frequently misrecognized word. In our experiments, we achieved a significant improvement of 41% in recognition accuracy (41% error correction rate).

Performance Improvement of Rapid Speaker Adaptation Using Bias Compensation and Mean of Dimensional Eigenvoice Models (바이어스 보상과 차원별 Eigenvoice 모델 평균을 이용한 고속화자적응의 성능향상)

  • 박종세;김형순;송화전
    • The Journal of the Acoustical Society of Korea / v.23 no.5 / pp.383-389 / 2004
  • In this paper, we propose bias compensation methods and an eigenvoice method using the mean of dimensional eigenvoice models to improve the performance of eigenvoice-based rapid speaker adaptation under mismatch between the training and test environments. Experimental results for a vocabulary-independent word recognition task (using the PBW 452 DB) show that the proposed methods yield improvements for small amounts of adaptation data. We obtained about 22% to 30% relative improvement with the bias compensation methods as the amount of adaptation data varied from 1 to 50, and obtained 41% relative improvement in error rate with the eigenvoice method using the mean of dimensional eigenvoice models with only a single adaptation word.

SVM-based Utterance Verification Using Various Confidence Measures (다양한 신뢰도 척도를 이용한 SVM 기반 발화검증 연구)

  • Kwon, Suk-Bong;Kim, Hoi-Rin;Kang, Jeom-Ja;Koo, Myong-Wan;Ryu, Chang-Sun
    • MALSORI / no.60 / pp.165-180 / 2006
  • In this paper, we present several confidence measures (CMs) for speech recognition systems to evaluate the reliability of recognition results. We propose heuristic CMs such as the mean log-likelihood score, the N-best word log-likelihood ratio, and likelihood sequence fluctuation, as well as likelihood ratio testing (LRT)-based CMs using several types of anti-models. Furthermore, we propose new algorithms that add weighting terms to the phone-level log-likelihood ratios when merging them into word-level log-likelihood ratios. These weighting terms are computed from the distance between acoustic models and from knowledge-based phoneme classifications. LRT-based CMs perform considerably better than heuristic CMs, and LRT-based CMs using phonetic information achieve a relative reduction in equal error rate of 8% to 13% compared to the baseline LRT-based CMs. We use a support vector machine to fuse several CMs and improve the performance of utterance verification. Our experiments show that selecting CMs with low correlation is more effective than selecting CMs with high correlation.

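The abstract above compares confidence measures by their equal error rate (EER), the operating point at which the false-acceptance and false-rejection rates coincide. A minimal sketch of how an EER can be estimated from two sets of confidence scores follows; the function name, the threshold sweep, and the scores are illustrative assumptions, not the authors' evaluation code.

```python
def equal_error_rate(correct_scores, incorrect_scores):
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    A result is accepted when its confidence score is >= threshold.
    """
    best = None
    for t in sorted(set(correct_scores) | set(incorrect_scores)):
        # False rejection: a correct recognition result scored below the threshold.
        frr = sum(s < t for s in correct_scores) / len(correct_scores)
        # False acceptance: an incorrect result scored at or above the threshold.
        far = sum(s >= t for s in incorrect_scores) / len(incorrect_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical confidence scores for correct and incorrect recognition results.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(eer, threshold)   # prints "0.25 0.6"
```

With finite score sets the two rates rarely match exactly, so the sketch reports the average of the two at the closest threshold.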

Status Report on the Korean Speech Recognition Platform (한국어 음성인식 플랫폼 개발현황)

  • Kwon, Oh-Wook;Kwon, Suk-Bong;Jang, Gyu-Cheol;Yun, Sung-rack;Kim, Yong-Rae;Jang, Kwang-Dong;Kim, Hoi-Rin;Yoo, Chang-Dong;Kim, Bong-Wan;Lee, Yong-Ju
    • Proceedings of the KSPS conference / 2005.11a / pp.215-218 / 2005
  • This paper reports the current status of development of the Korean speech recognition platform (ECHOS). We implement new modules including ETSI feature extraction, backward search with a trigram language model, and utterance verification. The ETSI feature extraction module is implemented by converting the public software to an object-oriented program. We show that trigram language modeling in the backward search pass reduces the word error rate from 23.5% to 22% on a large vocabulary continuous speech recognition task. We validate the utterance verification module by examining word graphs with confidence scores.

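For speech recognition, word error rate is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below implements that standard definition; it is not the scoring tool used for ECHOS, and the function name and example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(prev[j] + 1,         # deletion of a reference word
                          curr[j - 1] + 1,     # insertion of a hypothesis word
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One deleted word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
assert abs(wer - 1 / 6) < 1e-12
```

Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference.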

Quantization Performances and Iteration Number Statistics for Decoding Low Density Parity Check Codes (LDPC 부호의 복호를 위한 양자화 성능과 반복 횟수 통계)

  • Seo, Young-Dong;Kong, Min-Han;Song, Moon-Kyou
    • Journal of the Institute of Electronics Engineers of Korea TC / v.45 no.2 / pp.37-43 / 2008
  • The performance and hardware complexity of LDPC decoders depend on the quantization design parameters, namely the clipping threshold $c_{th}$ and the number of quantization bits $q$, and also on the maximum number of decoding iterations. In this paper, the BER performance of LDPC codes is evaluated as a function of the clipping threshold $c_{th}$ and the number of quantization bits $q$ through simulation studies. By comparing the quantized Min-Sum algorithm with the ideal Min-Sum algorithm, it is shown that the quantized case with $c_{th}=2.5$ and $q=6$ has the best performance, which approaches the ideal case. The decoding complexities are calculated, and the word error rates (WER) are estimated using the pdf obtained through statistical analysis of the iteration numbers. These results can be used to trade off decoding performance against complexity in LDPC decoder design.
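
The abstract does not specify the quantizer, so the following is only a sketch of what clipping at $c_{th}$ with $q$ bits could look like, assuming a symmetric uniform quantizer with a zero level. The function name `quantize_llr` and the level layout are assumptions for illustration, not the authors' design.

```python
def quantize_llr(llr: float, c_th: float = 2.5, q: int = 6) -> float:
    """Clip an LLR to [-c_th, c_th] and round it to a symmetric uniform grid.

    Assumed layout: integer levels -(2**(q-1) - 1) .. +(2**(q-1) - 1),
    so q = 6 gives magnitudes 0..31 plus a sign.
    """
    max_level = 2 ** (q - 1) - 1
    step = c_th / max_level
    clipped = max(-c_th, min(c_th, llr))
    return round(clipped / step) * step


# Large inputs saturate at the clipping threshold; small ones snap to the grid.
for value in (10.0, 0.05, -0.3):
    print(value, "->", quantize_llr(value))
```

A larger $c_{th}$ preserves strong LLRs but coarsens the step for a fixed $q$, which is the kind of trade-off the abstract evaluates.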