• Title/Summary/Keyword: 문자교정 (character correction)


The Recognition of The Korean Characters Using The Weighted Pattern Cluster (가중치 패턴 클러스터를 이용한 한글 문자 인식)

  • 김도형;이선화;차의영
    • Proceedings of the Korean Information Science Society Conference / 2001.10b / pp.319-321 / 2001
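  • This paper presents a method for recognizing Korean characters in Korean document images captured with a scanner. Each input character is classified into one of six types according to the structural characteristics of Hangul, and the vowel is recognized from its morphological features within each type. To recognize the consonants of each type, weighted pattern clusters are generated, and each consonant is recognized by measuring the similarity between the generated clusters and the original image. Consonants that are likely to be misrecognized receive their final label through a detailed similarity-matching step that corrects misrecognition. In experiments with the proposed algorithm, the final recognition rate was 95.68% on 14,983 commonly used Korean characters captured with a scanner, and the method could later be applied to recognition systems for standardized Korean documents.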
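The cluster-to-image similarity step in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the weighted pattern clusters are built or scored, so everything here is an assumption: a cluster is taken to be a per-pixel ink frequency averaged over binary training samples, and the names `build_cluster`, `similarity`, and `recognize` are hypothetical.

```python
from typing import Dict, List

Grid = List[List[int]]        # binary image: 1 = ink, 0 = background
Weights = List[List[float]]   # per-pixel weights in [0, 1]


def build_cluster(samples: List[Grid]) -> Weights:
    """Assumed cluster model: each pixel's weight is its ink frequency."""
    rows, cols = len(samples[0]), len(samples[0][0])
    n = len(samples)
    return [[sum(s[r][c] for s in samples) / n for c in range(cols)]
            for r in range(rows)]


def similarity(image: Grid, cluster: Weights) -> float:
    """Reward ink on high-weight pixels and background on low-weight ones."""
    total = 0.0
    count = 0
    for img_row, w_row in zip(image, cluster):
        for pixel, w in zip(img_row, w_row):
            total += w if pixel else 1.0 - w
            count += 1
    return total / count


def recognize(image: Grid, clusters: Dict[str, Weights]) -> str:
    """Return the label whose cluster is most similar to the image."""
    return max(clusters, key=lambda label: similarity(image, clusters[label]))


# Toy 2x2 "glyphs"; real inputs would be normalized consonant regions.
clusters = {
    "a": build_cluster([[[1, 1], [0, 0]], [[1, 1], [0, 1]]]),
    "b": build_cluster([[[0, 0], [1, 1]]]),
}
print(recognize([[1, 1], [0, 0]], clusters))
```

A second, finer matching pass for easily confused consonants, as the abstract describes, could reuse `similarity` on a restricted region of the glyph.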


A Correction Algorithm for Misrecognized Words Using N-gram Hangeul Dictionary (N-GRAM 한글 사전을 이용한 오인식 단어의 교정 알고리즘)

  • Lee, Jong-Yun;Oh, Sang-Hun
    • Annual Conference on Human and Language Technology / 1993.10a / pp.271-283 / 1993
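  • This paper presents an algorithm for correcting misrecognized words in an online Hangul recognition system. The correction technique uses an N-gram Hangul dictionary. A misrecognized word is corrected by selecting candidate keys and replacing the word with the most similar word among the selected candidates. The correction draws on the morphological information of the words stored in the dictionary, namely the headword, its part of speech, and its connection rules. The paper presupposes prior work on the Korean morphological analyzer that misrecognition correction requires.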
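As a rough illustration of N-gram based candidate selection, the sketch below indexes dictionary words by character bigrams and replaces a misrecognized word with the most similar candidate. It is a simplification under stated assumptions: the Dice coefficient as the similarity measure and the function names are choices made here, and the part-of-speech and connection-rule checks mentioned in the abstract are left out.

```python
from collections import defaultdict
from typing import Dict, List, Set


def bigrams(word: str) -> Set[str]:
    """Character bigrams with boundary markers (^ start, $ end)."""
    padded = f"^{word}$"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def build_index(dictionary: List[str]) -> Dict[str, Set[str]]:
    """Map each bigram to the dictionary words that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for word in dictionary:
        for gram in bigrams(word):
            index[gram].add(word)
    return index


def correct(word: str, index: Dict[str, Set[str]]) -> str:
    """Replace `word` with the candidate sharing the most bigrams (Dice)."""
    grams = bigrams(word)
    candidates: Set[str] = set()
    for gram in grams:
        candidates |= index.get(gram, set())
    if not candidates:
        return word  # nothing similar in the dictionary; leave unchanged

    def dice(candidate: str) -> float:
        other = bigrams(candidate)
        return 2 * len(grams & other) / (len(grams) + len(other))

    # sorted() makes tie-breaking deterministic.
    return max(sorted(candidates), key=dice)


index = build_index(["컴퓨터", "컴파일", "프린터"])
print(correct("컴퓨티", index))  # last syllable misrecognized
```

Here "컴퓨티" shares two bigrams with "컴퓨터" and only one with "컴파일", so the first is chosen.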


Restoration of Distorted Document Image Due to Depth Variation by Using Structured Light (Structured light를 이용한 깊이 차이에 문자 왜곡 교정)

  • Heo, Hoon;Chae, Ok-Sam
    • Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.487-489 / 2001
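  • When bound books or old documents are digitized, problems arise because the book cannot be opened flat or cannot be placed at the specified position and orientation. In particular, old documents that cannot withstand any force cannot be opened flat, so the document image is distorted by depth differences. This study proposes a method that uses structured light to restore characters distorted by depth differences. It also proposes a method for adapting to changes in the book's position and orientation, so that the constraints on position and orientation during input can be relaxed.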


Korean Mobile Spam Filtering System Considering Characteristics of Text Messages (문자메시지의 특성을 고려한 한국어 모바일 스팸필터링 시스템)

  • Sohn, Dae-Neung;Lee, Jung-Tae;Lee, Seung-Wook;Shin, Joong-Hwi;Rim, Hae-Chang
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.7 / pp.2595-2602 / 2010
  • This paper introduces a mobile spam filtering system that considers the style of short text messages sent to mobile phones when detecting spam. The proposed system relies not only on the occurrence of content words, as previously suggested, but also on style information, which reduces the critical cases in which legitimate messages containing spam words are misclassified as spam. Moreover, the accuracy of spam classification is improved by normalizing the messages through the correction of word spacing and spelling errors. Experimental results on real-world Korean text messages show that the proposed system is effective for Korean mobile spam filtering.

A Quality Evaluation System of a Handwriting String by Global and Local Features (지역특징과 지역특징을 통한 필기문자열의 품질평가시스템)

  • Kim Gye-Young
    • Journal of Internet Computing and Services / v.5 no.6 / pp.121-128 / 2004
  • This paper proposes a system that evaluates the quality of a handwriting string written with an electronic pen. It describes how to retrieve reference data from a database and how to evaluate the quality of a handwriting string using global and local features. It also explains how to optionally grade a handwriting string using the global features and how to diagnose stroke order using the local features. The system can evaluate quality even when the reference and the input are in different languages. We therefore expect the system to be useful not only for handwriting training but also for language learning.


A Spelling Error Correction Model in Korean Using a Correction Dictionary and a Newspaper Corpus (교정사전과 신문기사 말뭉치를 이용한 한국어 철자 오류 교정 모델)

  • Lee, Se-Hee;Kim, Hark-Soo
    • The KIPS Transactions:PartB / v.16B no.5 / pp.427-434 / 2009
  • With the rapid evolution of the Internet and mobile environments, text containing spelling errors such as newly coined words and abbreviations is widely used. These spelling errors make it difficult to develop NLP (natural language processing) applications because they decrease the readability of texts. To resolve this problem, we propose a spelling error correction model that uses a spelling error correction dictionary and a newspaper corpus. One advantage of the proposed model is that the cost of data construction is low, because it uses an easily obtained newspaper corpus as its training corpus. Another is that it needs no additional external modules, such as a morphological analyzer or a word-spacing error correction system, because it uses a simple string matching method based on a correction dictionary. In experiments with a newspaper corpus and a short message corpus collected from real mobile phones, the proposed model performed well on the various evaluation measures (a miscorrection rate of 7.3%, an F1-measure of 97.3%, and a false positive rate of 1.1%).
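The "simple string matching method based on a correction dictionary" can be pictured with a short sketch. It assumes greedy longest-match replacement, which the abstract does not specify, and the dictionary entries are made-up examples. The role of the newspaper corpus in the proposed model is not reproduced here.

```python
from typing import Dict


def correct_text(text: str, corrections: Dict[str, str]) -> str:
    """Scan left to right, replacing the longest matching error string."""
    max_len = max((len(key) for key in corrections), default=0)
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible substring first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in corrections:
                out.append(corrections[piece])
                i += length
                break
        else:
            # No dictionary entry starts here; copy the character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical entries: chat-style abbreviations mapped to standard forms.
corrections = {"ㅇㅋ": "오케이", "방가": "반가워"}
print(correct_text("방가 ㅇㅋ", corrections))
```

Because the lookup works directly on character strings, no morphological analysis or spacing correction is needed before matching, which matches the advantage the abstract claims.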

A Postprocessing Method of Korean Character Recognition by Mis-recognized Morphology Presumption (오인식 형태소 추정에 의한 한국어 문자 인식 후처리 기법)

  • Kim, Young-Hun;Lee, Young-Hwa;Lee, Sang-Jo
    • Journal of the Korean Institute of Telematics and Electronics C / v.36C no.7 / pp.46-55 / 1999
  • We propose a new postprocessing method that uses morphological analysis not only to reduce the frequency of dictionary access but also to improve the recognition rate of a character recognizer. The method estimates the morphological structure of a misrecognized word from the parts of speech already analyzed, and then corrects the morpheme presumed to be misrecognized. Postprocessing at the morpheme level reduces the number of candidates, because a morpheme is shorter than a word, and reduces the frequency of dictionary access, because the candidates need no morphological analysis. Selecting the right candidate requires only a dictionary lookup. The results show that the proposed method reduced the frequency of dictionary access to 60% of that of word-level postprocessing and improved the recognition rate from 94% to 97%.


Candidate Word List and Probability Score Guided for Korean Scene Text Recognition (후보 단어 리스트와 확률 점수에 기반한 한국어 문자 인식 모델)

  • Lee, Yoonji;Lee, Jong-Min
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.73-75 / 2022
  • Scene text recognition is a technology used in artificial intelligence applications such as unmanned robots, autonomous vehicles, and human-computer interaction. However, scene text images are distorted by noise such as illumination changes, low resolution, and blurring. Unlike previous studies that recognized only English, this paper shows strong recognition accuracy across various character types, including English, Korean, special characters, and numbers. Instead of selecting only the class with the highest probability, the method generates candidate words by also considering the second-ranked probability, which makes it possible to correct existing misrecognition problems.
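The idea of keeping the second-ranked class at each position can be sketched as follows. The input format (one probability dictionary per character position), the product-of-probabilities score, and the lexicon check used to pick the final word are all assumptions made for illustration; the abstract does not describe how the candidate list is ranked or filtered.

```python
from itertools import product
from typing import Dict, List, Set, Tuple


def candidate_words(step_probs: List[Dict[str, float]],
                    top_k: int = 2) -> List[Tuple[str, float]]:
    """Combine the top-k characters at each position into scored words."""
    per_step = [
        sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        for probs in step_probs
    ]
    scored = []
    for combo in product(*per_step):
        word = "".join(ch for ch, _ in combo)
        score = 1.0
        for _, prob in combo:
            score *= prob  # assumed score: product of per-position probabilities
        scored.append((word, score))
    scored.sort(key=lambda ws: ws[1], reverse=True)
    return scored


def pick_word(step_probs: List[Dict[str, float]], lexicon: Set[str]) -> str:
    """Return the best-scoring candidate found in the (assumed) lexicon."""
    candidates = candidate_words(step_probs)
    for word, _ in candidates:
        if word in lexicon:
            return word
    return candidates[0][0]  # fall back to the plain top-1 reading


probs = [{"c": 0.9, "e": 0.1}, {"a": 0.45, "o": 0.55}, {"t": 0.8, "l": 0.2}]
print(candidate_words(probs)[0][0])   # greedy top-1 reading
print(pick_word(probs, {"cat"}))      # second-ranked 'a' rescues the word
```

With two classes kept per position, a word of n characters yields 2^n candidates, so a practical system would likely prune low-probability alternatives before expanding.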


English Critique and Verb Dictionary based on Extended Verb Pattern (확장 동사형에 기반한 동사사전과 영어 문장 검사기)

  • 차의영
    • Korean Journal of Cognitive Science / v.3 no.2 / pp.311-328 / 1992
  • The level and accuracy of English sentences generated by a human or machine translator are determined by the content of the verb dictionary and by an effective generation algorithm. Conventional English critiques are not adequate for foreigners because they lack a verb dictionary that includes verb patterns or the important grammatical constraints. In this paper, I propose a structure for a verb dictionary and an English sentence critique based on extended verb patterns, which is useful for checking and correcting mistakes in English sentences generated by a machine translator.

Implementation of morphological analyzer and spelling corrector for character recognition post-processing (문자 인식 후처리를 위한 형태소 분석기와 문자 교정기의 구현)

  • 이영화;김규성;김영훈;이상조
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.5 / pp.82-92 / 1997
  • In this paper, we propose a post-processing method that corrects characters misrecognized by a character recognizer, using a morphological analyzer and a spelling corrector. The proposed post-processing consists of three phases. First, the text passes through a morphological analyzer that outputs only the information needed for spelling correction, does not analyze whole phrases, and detects the location of the misrecognized character. Second, candidate characters are generated and tagged using the information in a character substitution table and a grapheme substitution/separation table. The analysis is then retried after the misrecognized character has been substituted. Finally, a candidate is selected. To build the tables, we investigated misrecognized characters in a corpus; the reliability analysis used the frequencies of about 100,000 words randomly selected from the corpus. The Korean character recognizer achieves a 93% recognition rate without post-processing. With post-processing, the overall recognition rate of our system exceeds 97%.
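The substitute-and-retry phase can be illustrated with a minimal sketch. It swaps in a plain lexicon lookup where the paper uses a morphological analyzer, and it covers only a character substitution table, not the grapheme substitution/separation table. The table contents and the name `correct_word` are hypothetical.

```python
from typing import Dict, List, Set


def correct_word(word: str,
                 confusions: Dict[str, List[str]],
                 lexicon: Set[str]) -> str:
    """Try single-character substitutions until the word is accepted."""
    if word in lexicon:
        return word  # already valid; nothing to correct
    for i, ch in enumerate(word):
        # Alternatives are assumed to be ordered by how often the
        # recognizer confuses them with `ch`.
        for alt in confusions.get(ch, []):
            candidate = word[:i] + alt + word[i + 1:]
            if candidate in lexicon:  # stand-in for re-running the analyzer
                return candidate
    return word  # no accepted candidate; keep the recognizer output


# Hypothetical substitution table and lexicon.
confusions = {"고": ["교", "코"]}
lexicon = {"학교", "코끼리"}
print(correct_word("학고", confusions, lexicon))
```

Ordering each table entry by confusion frequency, for example from counts gathered on a corpus, would make the first accepted candidate also the most likely one.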
