• Title/Summary/Keyword: grapheme recognition

Implementation of morphological analyzer and spelling corrector for character recognition post-processing (문자 인식 후처리를 위한 형태소 분석기와 문자 교정기의 구현)

  • 이영화;김규성;김영훈;이상조
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.5 / pp.82-92 / 1997
  • In this paper, we propose a post-processing method that corrects characters misrecognized by a character recognizer, using a morphological analyzer and a spelling corrector. The proposed post-processing consists of three phases. First, the input passes through a morphological analyzer that outputs only the information needed for spelling correction, does not analyze multi-word phrases, and detects the location of the misrecognized character. Second, candidate characters are generated and tagged using a character substitution table and a grapheme substitution/separation table, and the analysis is retried after the misrecognized character has been substituted. Finally, a correction is selected. To build the tables, we investigated misrecognized characters in a corpus. The reliability analysis uses the frequencies of about 100,000 words randomly selected from the corpus. The Korean character recognizer achieves a 93% recognition rate without post-processing; with post-processing, the overall recognition rate of our system exceeds 97%.
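
The substitute-and-retry phase described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the table `SUBSTITUTIONS`, the word set `LEXICON`, and the helper `analyzable` are hypothetical stand-ins, Latin characters replace the paper's Korean character and grapheme tables, and a plain lexicon lookup replaces the morphological analyzer. The paper's frequency-based reliability analysis is not modelled; the sketch simply returns the first candidate that analyzes successfully.

```python
# Hypothetical confusion table: a recognized character mapped to the
# characters it is assumed to be confused with (not taken from the paper).
SUBSTITUTIONS = {
    "l": ["1", "i"],
    "1": ["l", "i"],
    "0": ["o"],
    "o": ["0"],
}

# Hypothetical lexicon standing in for a morphological analyzer.
LEXICON = {"grapheme", "model", "signal", "into"}


def analyzable(word):
    """Stand-in for morphological analysis: succeed if the word is known."""
    return word in LEXICON


def correct(word):
    """Return the word if it analyzes, otherwise the first single-character
    substitution that makes it analyze, otherwise None."""
    if analyzable(word):
        return word
    for pos, ch in enumerate(word):
        for alt in SUBSTITUTIONS.get(ch, []):
            candidate = word[:pos] + alt + word[pos + 1:]
            if analyzable(candidate):  # retry analysis after substitution
                return candidate
    return None


if __name__ == "__main__":
    for raw in ["mode1", "int0", "signal", "xyz"]:
        print(raw, "->", correct(raw))
```

A fuller version would collect every analyzable candidate and rank the candidates by corpus frequency rather than stopping at the first match.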

Hybrid CTC-Attention Based End-to-End Speech Recognition Using Korean Grapheme Unit (한국어 자소 기반 Hybrid CTC-Attention End-to-End 음성 인식)

  • Park, Hosung;Lee, Donghyun;Lim, Minkyu;Kang, Yoseb;Oh, Junseok;Seo, Soonshin;Rim, Daniel;Kim, Ji-Hwan
    • Annual Conference on Human and Language Technology / 2018.10a / pp.453-458 / 2018
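  • This paper proposes end-to-end speech recognition based on a hybrid CTC-Attention model that uses Korean graphemes as the recognition unit. End-to-end speech recognition handles, with a single DNN, a process that conventionally consists of several modules: a DNN-HMM-based acoustic model, an N-gram-based language model, and a WFST-based decoding network. In this paper, a grapheme-level output structure is used to estimate the outputs of the end-to-end model. When the network is built on graphemes, the number of output parameters to be estimated drops from 11,172 to 49, which makes training more efficient. To implement this, we built the end-to-end model by combining CTC and an attention network, the DNN structures most commonly used for end-to-end training. In experiments, the model achieved a syllable error rate of 10.05%.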
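
The figure of 11,172 in this abstract matches the number of precomposed Hangul syllables in Unicode (19 initials × 21 medials × 28 finals, counting "no final"). The following minimal sketch shows how one syllable can be split into grapheme indices with Unicode arithmetic. The function names `decompose` and `to_jamo` are illustrative, and the sketch does not reproduce how the paper arrives at its 49 output units.

```python
CHOSEONG = 19    # initial consonants
JUNGSEONG = 21   # medial vowels
JONGSEONG = 28   # final consonants, including the "no final" slot
SYLLABLE_BASE = 0xAC00  # first precomposed syllable, U+AC00


def decompose(syllable):
    """Return (initial, medial, final) indices of a precomposed Hangul syllable."""
    index = ord(syllable) - SYLLABLE_BASE
    if not 0 <= index < CHOSEONG * JUNGSEONG * JONGSEONG:
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(index, JUNGSEONG * JONGSEONG)
    medial, final = divmod(rest, JONGSEONG)
    return initial, medial, final


def to_jamo(syllable):
    """Return the syllable as a string of Unicode conjoining jamo."""
    initial, medial, final = decompose(syllable)
    jamo = chr(0x1100 + initial) + chr(0x1161 + medial)
    if final:  # index 0 means the syllable has no final consonant
        jamo += chr(0x11A7 + final)
    return jamo


if __name__ == "__main__":
    print(CHOSEONG * JUNGSEONG * JONGSEONG)  # size of the syllable inventory
    print(decompose("한"), [hex(ord(c)) for c in to_jamo("한")])
```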

Study on Implementation of a Handwritten-Character Recognition System in a PDA Using a Neural Hardware (신경망 하드웨어를 이용한 PDA 펜입력 인식시스템의 구현 연구)

  • Kim, Kwang-Hyun;Kang, Deung-Gu;Lee, Tae-Won;Park, Jin;Kim, Young-Chul
    • Proceedings of the IEEK Conference / 1999.06a / pp.492-495 / 1999
  • This paper focuses on the implementation of a handwritten Korean character recognition system using a neural coprocessor for PDA applications. The proposed coprocessor is composed of a digital neural network called DMNN and a RISC-based dedicated controller in order to achieve both high speed and compactness. Two neural networks are used for recognition: one classifies strokes into an extended set of 11 strokes, and the other classifies graphemes. Our experimental results show a recognition rate of 92.1% on 3,000 characters written by 10 persons, which improves to 95.3% when four candidates are considered. The design of the proposed neural coprocessor is verified using an ASIC emulator in preparation for further hardware implementation.

An On-line Speech and Character Combined Recognition System for Multimodal Interfaces (멀티모달 인터페이스를 위한 음성 및 문자 공용 인식시스템의 구현)

  • 석수영;김민정;김광수;정호열;정현열
    • Journal of Korea Multimedia Society / v.6 no.2 / pp.216-223 / 2003
  • In this paper, we present SCCRS (Speech and Character Combined Recognition System) for speaker- and writer-independent on-line multimodal interfaces. The CHMM (Continuous Hidden Markov Model) is known to be a very useful method for both speech recognition and on-line character recognition. In the proposed method, the same CHMM is applied to both speech and character recognition so as to construct a combined system. For this purpose, 115 CHMMs having 3 states and 9 transitions are constructed using the MLE (Maximum Likelihood Estimation) algorithm. Different features are extracted for speech and character recognition: MFCC (Mel Frequency Cepstrum Coefficient) is used for speech in the preprocessing, while a position parameter is used for cursive characters. At the recognition step, the proposed SCCRS employs OPDP (One Pass Dynamic Programming) so as to be a practical combined recognition system. Experimental results show that the recognition rates for spoken phonemes, spoken words, cursive character graphemes, and cursive character words are 51.65%, 88.6%, 85.3%, and 85.6%, respectively, without any language model. This demonstrates the efficiency of the proposed system.

A Study on Machine Printed Character Recognition Based on Character Type Classification (문자형식 분류 기반의 인쇄체 문자인식에 관한 연구)

  • 임길택;김호연
    • Journal of the Institute of Electronics Engineers of Korea CI / v.40 no.5 / pp.266-279 / 2003
  • In this paper, we propose machine-printed character recognition methods that use character type information to divide the characters into clusters. The characters are subdivided into seven types: six are Hangul types defined by how the graphemes are combined, and one covers English characters, numerals, and symbols. According to the character type, we separate the input character image into several recognition units and recognize them using the direction angle feature. Recognition for each character type is completed by combining the recognition units, each of which is recognized by its own neural network. To combine the seven character recognizers, we implemented seven methods: a switching method, an integrating method, and several variants of them. In experiments, we obtained a recognition rate of 98.2% with the simple switching method, 90.54% with the integrating method, and between 97.35% and 98.65% with the five variants.

On-line Recognition of Cursive Korean Characters Based on Hidden Markov Model and Level Building (은닉 마르코프 모델과 레벨 빌딩 알고리즘을 이용한 흘림체 한글의 온라인 인식)

  • Kim, Sang-Gyun;Kim, Gyeong-Hyeon;Lee, Jong-Guk;Lee, Jae-Uk;Kim, Hang-Jun
    • The Transactions of the Korea Information Processing Society / v.3 no.5 / pp.1281-1293 / 1996
  • In this paper, we propose a novel recognition model for on-line cursive Korean characters using HMMs (Hidden Markov Models) and the level building algorithm. The model is constructed as a recognition network of grapheme HMMs and Korean combination rules. Although the network is flexible enough to accommodate the variability of input patterns, its recognition speed suffers from having 11,172 search paths. To solve this problem, we modify the level building algorithm so that it is adapted directly to the Korean combination rules and apply it to the model. The modified algorithm is an efficient network search procedure whose time complexity depends on the number of HMMs for each grapheme, not on the number of paths in the extensive recognition network. A test with 15,000 handwritten characters shows a recognition rate of 90% and a speed of 0.72 seconds per character.

Pronunciation Variation Patterns of Loanwords Produced by Korean and Grapheme-to-Phoneme Conversion Using Syllable-based Segmentation and Phonological Knowledge (한국인 화자의 외래어 발음 변이 양상과 음절 기반 외래어 자소-음소 변환)

  • Ryu, Hyuksu;Na, Minsu;Chung, Minhwa
    • Phonetics and Speech Sciences / v.7 no.3 / pp.139-149 / 2015
  • This paper aims to analyze pronunciation variations of loanwords produced by Korean speakers and to improve the performance of pronunciation modeling of loanwords in Korean by using syllable-based segmentation and phonological knowledge. The loanword text corpus used for our experiment consists of 14.5k words extracted from frequently used words in the set-top box, music, and point-of-interest (POI) domains. First, pronunciations of loanwords in Korean are obtained by manual transcription and used as target pronunciations. The target pronunciations are compared with the standard pronunciations using confusion matrices to analyze the pronunciation variation patterns of loanwords. Based on the confusion matrices, three salient pronunciation variations of loanwords are identified: tensification of the fricative [s] and derounding of the rounded vowels [ɥi] and [wɛ]. In addition, a syllable-based segmentation method that considers phonological knowledge is proposed for loanword pronunciation modeling. The performance of the baseline and the proposed method is measured using phone error rate (PER), word error rate (WER), and F-score at various context spans. Experimental results show that the proposed method outperforms the baseline. We also observe that performance degrades when the training and test sets come from different domains, which implies that loanword pronunciations are influenced by the data domain. Notably, pronunciation modeling for loanwords is enhanced by reflecting phonological knowledge. The loanword pronunciation modeling proposed in this paper can be used for automatic speech recognition in application interfaces such as navigation systems and set-top boxes, and for computer-assisted pronunciation training for Korean learners of English.
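
Phone error rate, as mentioned in this abstract, is commonly computed as the edit distance between predicted and reference phone sequences divided by the total reference length. The sketch below assumes that common definition, since the abstract does not spell out the paper's exact scoring; the names `edit_distance` and `error_rate` and the sample phone sequences are illustrative only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps):
    """Total edit distance over total reference length, across all pairs."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return errors / total


if __name__ == "__main__":
    # Made-up phone sequences, not data from the paper.
    refs = [["s", "a"], ["t", "o"]]
    hyps = [["s", "a"], ["t", "u"]]
    print(error_rate(refs, hyps))
```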

Improvement of The Printed Korean Grapheme Recognition using Meaningful Noises (규칙적인 잡음을 이용한 인쇄체 한글 자소인식 개선)

  • Lee, Jin-Soo;Kwon, Oh-Jun;Bang, Sung-Yang
    • Annual Conference on Human and Language Technology / 1995.10a / pp.143-147 / 1995
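  • Because Hangul has a large number of characters and a two-dimensional structure formed by combining initial, medial, and final graphemes, neural-network-based Hangul recognition often separates the graphemes and then recognizes each one individually. With this approach, if parts other than the desired grapheme are included in the separated grapheme region, training becomes difficult and this becomes a main cause of misrecognition. Many attempts have therefore been made to remove such noise through accurate grapheme separation algorithms or preprocessing, but separating exactly the desired grapheme remains a difficult problem. In this paper, we exploit the fact that such noise is regular: rather than trying to extract only the necessary grapheme region, we deliberately include parts outside it, turning what had been regarded merely as noise into a source of information and thereby resolving several misrecognition cases that it caused. In addition, where the position of a grapheme is irregular, we use an algorithm that fixes its position, which further raises the recognition rate.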

A Method of Machine-Printed Hangul Recognition using Grapheme Recognizer (낱자 특징 기반 자소 인식기를 이용한 인쇄체 한글 인식방법)

  • Jang, SeungIck;Nam, Youn-Seok
    • Proceedings of the Korea Information Processing Society Conference / 2004.05a / pp.351-354 / 2004
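  • In this paper, we propose a method for recognizing low-resolution machine-printed Hangul images using a grapheme recognizer whose input is features extracted from whole characters. The proposed method classifies each input character into one of seven types (six Hangul types and one type for other characters) and then divides the input character into one or two recognition units according to the number of target characters and the grapheme complexity. Each HRU is recognized by a multilayer neural network recognizer that takes direction angle features extracted from the character as input. The final recognition result is then derived by combining the confidences of the multilayer neural network recognizers. In experiments, the proposed method achieved a recognition rate of 98.99%, which is a 15.83% reduction in errors compared with the existing method.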
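
As a worked reading of the last figures, assume the 15.83% is a relative reduction in error rate, which the abstract does not state explicitly. Under that assumption the implied baseline follows from the reported recognition rate (the symbols $e_{\text{new}}$ and $e_{\text{old}}$ are introduced here only for illustration):

```latex
\begin{align*}
e_{\text{new}} &= 100\% - 98.99\% = 1.01\% \\
e_{\text{old}} &= \frac{e_{\text{new}}}{1 - 0.1583} = \frac{1.01\%}{0.8417} \approx 1.20\% \\
\text{baseline recognition rate} &\approx 100\% - 1.20\% = 98.80\%
\end{align*}
```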

Performance of Korean spontaneous speech recognizers based on an extended phone set derived from acoustic data (음향 데이터로부터 얻은 확장된 음소 단위를 이용한 한국어 자유발화 음성인식기의 성능)

  • Bang, Jeong-Uk;Kim, Sang-Hun;Kwon, Oh-Wook
    • Phonetics and Speech Sciences / v.11 no.3 / pp.39-47 / 2019
  • We propose a method to improve the performance of spontaneous speech recognizers by extending their phone set using speech data. In the proposed method, we first extract variable-length phoneme-level segments from broadcast speech signals and convert them to fixed-length latent vectors using a long short-term memory (LSTM) classifier. We then cluster acoustically similar latent vectors and build a new phone set by choosing the number of clusters with the lowest Davies-Bouldin index. We also update the lexicon of the speech recognizer by choosing, for each word, the pronunciation sequence with the highest conditional probability. To analyze the acoustic characteristics of the new phone set, we visualize its spectral patterns and segment durations. Through speech recognition experiments using a larger training data set than in our previous work, we confirm that the new phone set yields better performance than conventional phoneme-based and grapheme-based units in both spontaneous and read speech recognition.
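
The cluster-count selection described in this abstract can be sketched with the standard Davies-Bouldin definition: the average, over clusters, of the worst ratio of summed within-cluster scatter to between-centroid distance. This is a minimal pure-Python illustration on made-up 2-D points, not the paper's LSTM latent vectors or its clustering procedure; the names `centroid`, `davies_bouldin`, and `partitions` are assumptions for the example.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition given as a list of clusters.
    Lower values mean compact, well-separated clusters. Needs >= 2 clusters."""
    centers = [centroid(c) for c in clusters]
    scatters = [sum(math.dist(p, ctr) for p in c) / len(c)
                for c, ctr in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max((scatters[i] + scatters[j]) / math.dist(centers[i], centers[j])
                     for j in range(k) if j != i)
    return total / k


# Toy data: three tight pairs of points spaced 10 units apart.
a = [(0.0, 0.0), (0.0, 1.0)]
b = [(10.0, 0.0), (10.0, 1.0)]
c = [(20.0, 0.0), (20.0, 1.0)]

# Hypothetical candidate partitions keyed by their number of clusters.
partitions = {2: [a + b, c], 3: [a, b, c]}
best_k = min(partitions, key=lambda k: davies_bouldin(partitions[k]))

if __name__ == "__main__":
    for k, part in partitions.items():
        print(k, round(davies_bouldin(part), 4))
    print("chosen number of clusters:", best_k)
```

In practice the candidate partitions would come from running a clustering algorithm at each cluster count. Note that singleton clusters have zero scatter, so the index is only meaningful when clusters contain several points.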