• Title/Summary/Keyword: trigram language model (트라이그램 언어모델)


Korean Word Segmentation and Compound-noun Decomposition Using Markov Chain and Syllable N-gram (마코프 체인 및 음절 N-그램을 이용한 한국어 띄어쓰기 및 복합명사 분리)

  • Kwon, Oh-Wook (권오욱)
    • The Journal of the Acoustical Society of Korea / v.21 no.3 / pp.274-284 / 2002
  • Word segmentation errors that occur in text preprocessing often insert incorrect words into the recognition vocabulary and degrade language models for Korean large-vocabulary continuous speech recognition. We propose an automatic word segmentation algorithm that uses Markov chains and syllable-based n-gram language models to correct word segmentation errors in text corpora. We assume that a sentence is generated from a Markov chain: spaces are generated on self-transitions and non-space characters on the other transitions. The word segmentation of the sentence is then obtained by finding the maximum-likelihood path using syllable n-gram scores. In experiments, the algorithm achieved 91.58% word accuracy and 96.69% syllable accuracy when segmenting 254 sentences of newspaper columns without any spaces. It improved word accuracy from 91.00% to 96.27% for word segmentation correction at line breaks and yielded 96.22% accuracy for compound-noun decomposition.
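The maximum-likelihood search described in this abstract can be illustrated with a small sketch. This is not the paper's implementation: it assumes a syllable trigram model with add-one smoothing, treats the space as an ordinary token, and runs a Viterbi-style search that merges hypotheses sharing the same two-token history. The names `train`, `log_prob`, and `segment` are hypothetical.

```python
import math
from collections import Counter

SPACE, BOS, EOS = " ", "<s>", "</s>"

def train(sentences, order=3):
    """Count syllable n-grams (spaces included as tokens) from spaced text."""
    context_counts, ngram_counts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = [BOS] * (order - 1) + list(sentence) + [EOS]
        vocab.update(tokens)
        for i in range(order - 1, len(tokens)):
            history = tuple(tokens[i - order + 1:i])
            context_counts[history] += 1
            ngram_counts[history + (tokens[i],)] += 1
    return context_counts, ngram_counts, len(vocab), order

def log_prob(model, history, token):
    """Add-one smoothed log P(token | history); the smoothing is an assumption."""
    context_counts, ngram_counts, vocab_size, _ = model
    count = ngram_counts[history + (token,)] + 1
    return math.log(count / (context_counts[history] + vocab_size))

def segment(text, model):
    """Insert spaces into `text` along the highest-scoring path."""
    order = model[3]
    chars = [c for c in text if c != SPACE]
    # Each state is the last (order - 1) tokens; keep only the best path per state.
    beams = {(BOS,) * (order - 1): (0.0, [])}
    for i, ch in enumerate(chars):
        next_beams = {}
        for history, (score, output) in beams.items():
            # Either emit the syllable directly, or emit a space first.
            options = [[ch]] if i == 0 else [[ch], [SPACE, ch]]
            for tokens in options:
                h, s = history, score
                for tok in tokens:
                    s += log_prob(model, h, tok)
                    h = (h + (tok,))[1:]
                if h not in next_beams or s > next_beams[h][0]:
                    next_beams[h] = (s, output + tokens)
        beams = next_beams
    # Close every surviving path with the end-of-sentence token.
    best = max(beams.items(),
               key=lambda kv: kv[1][0] + log_prob(model, kv[0], EOS))
    return "".join(best[1][1])

if __name__ == "__main__":
    toy_model = train(["나는 학교에 간다"] * 3)  # toy corpus, not real data
    print(segment("나는학교에간다", toy_model))
```

Because the state is only the last two tokens, at most two hypotheses survive per syllable, so the search is linear in sentence length. A real system would need a large training corpus and a better smoothing method than add-one.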

Related Works for an Input String Recommendation and Modification on Mobile Environment (모바일 기기의 입력 문자열 추천 및 오타수정 모델을 위한 주요 기술)

  • Lee, Song-Wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2011.05a / pp.602-604 / 2011
  • With the wide use of smartphones and the mobile internet, mobile devices are used for many purposes, such as sending SMS messages, participating in SNS, and retrieving information, and the number of users is growing. The keypad of a mobile device is smaller than that of a desktop computer, so users have difficulty entering sentences quickly and correctly. In this study, we introduce string recommendation and modification techniques that can help a user enter text on mobile devices quickly and correctly. We describe the TRIE dictionary and the n-gram language model, which are the main technologies behind keyword recommendation in online search engines.

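As a rough illustration of how a TRIE dictionary and an n-gram language model can work together for input recommendation, the sketch below looks up completions of a typed prefix in a trie and ranks them by bigram counts conditioned on the previous word. It is a minimal sketch under stated assumptions, not the method of the paper above: the class names `Trie` and `Recommender`, the bigram order, and the tie-breaking rule are all hypothetical choices.

```python
from collections import Counter, defaultdict

class Trie:
    """Character trie that stores a word list for prefix lookup."""

    def __init__(self):
        self.children = {}
        self.is_word = False

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def complete(self, prefix):
        """Return every stored word that starts with `prefix`."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results, stack = [], [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return results

class Recommender:
    """Ranks trie completions with word bigram counts (an assumed scheme)."""

    def __init__(self, sentences):
        self.trie = Trie()
        self.unigrams = Counter()
        self.bigrams = defaultdict(Counter)
        for sentence in sentences:
            words = sentence.split()
            for word in words:
                self.trie.insert(word)
                self.unigrams[word] += 1
            for prev, nxt in zip(words, words[1:]):
                self.bigrams[prev][nxt] += 1

    def recommend(self, prev_word, prefix, k=3):
        candidates = self.trie.complete(prefix)
        # Sort by bigram count, then unigram count, then alphabetically.
        candidates.sort(key=lambda w: (-self.bigrams[prev_word][w],
                                       -self.unigrams[w], w))
        return candidates[:k]

if __name__ == "__main__":
    # Toy corpus for illustration only.
    rec = Recommender(["send a message", "send a mail",
                       "send a message now", "read the manual"])
    print(rec.recommend("a", "m"))
```

The trie keeps prefix lookup proportional to the prefix length plus the size of the matching subtree, while the n-gram counts supply the context that decides which completion is shown first.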