• Title/Summary/Keyword: character classifier

The Font Recognition of Printed Hangul Documents (인쇄된 한글 문서의 폰트 인식)

  • Park, Moon-Ho;Shon, Young-Woo;Kim, Seok-Tae;Namkung, Jae-Chan
    • The Transactions of the Korea Information Processing Society / v.4 no.8 / pp.2017-2024 / 1997
  • This paper addresses the recognition of typeface, character size, and character slope in printed Hangul documents for an IICS (Intelligent Image Communication System). Fixed-size blocks extracted from the documents are analyzed in the frequency domain for typeface classification. The vertical pixel counts and the projection profile of the bounding box are used for character size classification and character slope classification, respectively. An MLP with a variable number of hidden nodes, trained with the error back-propagation algorithm, serves as the typeface classifier, and the Mahalanobis distance is used to classify character size and slope. The experimental results demonstrate the usefulness of the proposed system, with mean rates of 95.19% in typeface classification, 97.34% in character size classification, and 89.09% in character slope classification.
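
The Mahalanobis-distance step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes two-dimensional feature vectors and invented training values; the feature names, class labels, and numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean

def fit_class(samples):
    """Estimate the mean vector and 2x2 sample covariance of 2-D samples."""
    n = len(samples)
    mx = mean(s[0] for s in samples)
    my = mean(s[1] for s in samples)
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(x, model):
    """Squared Mahalanobis distance from x to a class model."""
    (mx, my), (sxx, sxy, syy) = model
    det = sxx * syy - sxy * sxy  # determinant of the covariance matrix
    dx, dy = x[0] - mx, x[1] - my
    # Closed-form inverse of [[sxx, sxy], [sxy, syy]] applied to (dx, dy).
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def classify(x, models):
    """Pick the class whose distribution is closest in Mahalanobis distance."""
    return min(models, key=lambda label: mahalanobis_sq(x, models[label]))

# Hypothetical (height, width) pixel measurements for two character sizes.
training = {
    "10pt": [(20, 18), (21, 19), (19, 18), (22, 20)],
    "14pt": [(28, 25), (30, 27), (29, 25), (31, 28)],
}
models = {label: fit_class(samples) for label, samples in training.items()}
print(classify((21, 19), models))
```

Unlike the Euclidean distance, this measure scales each direction by the spread of the class, so a correlated pair of features does not count twice.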

Spam-Mail Filtering System Using Weighted Bayesian Classifier (가중치가 부여된 베이지안 분류자를 이용한 스팸 메일 필터링 시스템)

  • 김현준;정재은;조근식
    • Journal of KIISE: Software and Applications / v.31 no.8 / pp.1092-1100 / 2004
  • E-mail is regarded as one of the most popular methods for exchanging information because it is easy to use and inexpensive. Meanwhile, the exponentially growing volume of unwanted mail in users' mailboxes has become a major problem. Recognizing this issue, the Korean government established a law to prevent e-mail abuse. In this paper we propose a hybrid spam mail filtering system using a weighted Bayesian classifier, which extends the naive Bayesian classifier with preprocessing and intelligent agents. The system classifies spam mail automatically from training data, without manual definition of message rules. In particular, we improve filtering efficiency by assigning weights to certain features extracted from spam mail. Finally, we compare the efficiency of the naive Bayesian classifier, weighting on e-mail headers, weighting on HTML tags, weighting on hyperlinks, and the combination of these. Compared with the naive Bayesian classifier, the proposed system's precision decreased by 5.7%, while its recall and F-measure increased by 33.3% and 31.2%, respectively.
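
One way to picture field-dependent weighting on top of naive Bayes is the sketch below. It is an illustration only: the abstract does not state how the weights are applied, so the weight values, the field names, and the choice to scale both the training counts and the log-likelihood terms are assumptions.

```python
import math
from collections import Counter

# Hypothetical field weights; the abstract does not give the actual values.
FIELD_WEIGHTS = {"header": 2.0, "html": 1.5, "link": 3.0, "body": 1.0}
LABELS = ("spam", "ham")

def train(messages):
    """messages: list of (label, [(field, token), ...])."""
    counts = {label: Counter() for label in LABELS}
    docs = Counter()
    for label, tokens in messages:
        docs[label] += 1
        for field, token in tokens:
            # A token seen in a heavily weighted field counts more.
            counts[label][token] += FIELD_WEIGHTS[field]
    return counts, docs

def log_posterior(tokens, label, counts, docs, vocab_size):
    total = sum(counts[label].values())
    score = math.log(docs[label] / sum(docs.values()))  # class prior
    for field, token in tokens:
        # Laplace-smoothed likelihood, scaled by the field weight.
        p = (counts[label][token] + 1.0) / (total + vocab_size)
        score += FIELD_WEIGHTS[field] * math.log(p)
    return score

def classify(tokens, counts, docs):
    vocab = set().union(*(counts[label] for label in LABELS))
    return max(LABELS, key=lambda c: log_posterior(tokens, c, counts, docs, len(vocab)))

# Tiny invented training set.
messages = [
    ("spam", [("header", "free"), ("link", "click"), ("body", "offer")]),
    ("spam", [("html", "font"), ("link", "click"), ("body", "free")]),
    ("ham", [("header", "meeting"), ("body", "agenda")]),
    ("ham", [("body", "report"), ("body", "meeting")]),
]
counts, docs = train(messages)
print(classify([("link", "click"), ("body", "free")], counts, docs))
```

Setting every weight to 1.0 reduces this sketch to an ordinary multinomial naive Bayes classifier, which gives a baseline for comparison.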

Vehicle License Plate Recognition System using DCT and LVQ (DCT와 LVQ를 이용한 차량번호판 인식 시스템)

  • 한수환
    • Journal of Intelligence and Information Systems / v.8 no.1 / pp.15-25 / 2002
  • This paper proposes a vehicle license plate recognition system that has a relatively simple structure and is highly tolerant of noise. It uses the DCT (Discrete Cosine Transform) coefficients extracted from the character region of a license plate and an LVQ (Learning Vector Quantization) neural network. The license plate image is extracted from a captured vehicle image based on RGB color information, and the character region is derived from the histogram of the license plate and the relative positions of the individual characters in the plate. The feature vector obtained by applying the DCT to the extracted character region is used as input to the LVQ neural classifier for the recognition process. In the experiment, 109 vehicle images captured under various conditions were tested with the proposed method, and relatively high license plate extraction and recognition rates were achieved.
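
The two ingredients named in the abstract above, DCT features and an LVQ classifier, can be sketched as follows. This is a minimal illustration and not the paper's pipeline: it uses an orthonormal 2-D DCT-II, keeps a hypothetical k-by-k block of low-frequency coefficients, and applies the basic LVQ1 update rule. The abstract does not specify the block size, the number of coefficients, or the LVQ variant.

```python
import math

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def low_freq_features(coeffs, k):
    """Flatten the top-left k x k (low-frequency) coefficients."""
    return [coeffs[u][v] for u in range(k) for v in range(k)]

def lvq1_step(prototypes, x, label, lr=0.1):
    """One LVQ1 update. prototypes is a list of [vector, label] pairs.

    The nearest prototype moves toward x if its label matches,
    and away from x otherwise. Returns the index of that prototype.
    """
    i = min(range(len(prototypes)),
            key=lambda j: sum((w - a) ** 2 for w, a in zip(prototypes[j][0], x)))
    sign = 1.0 if prototypes[i][1] == label else -1.0
    prototypes[i][0] = [w + sign * lr * (a - w) for w, a in zip(prototypes[i][0], x)]
    return i

# A constant 4x4 block has all its energy in the DC coefficient.
coeffs = dct2([[1.0] * 4 for _ in range(4)])
features = low_freq_features(coeffs, 2)
print(features)
```

Classification with trained prototypes is then a nearest-prototype lookup on the feature vector.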

A Hierarchical Neural Network for Printed Hangul Character Recognition (인쇄체 한글문자 인식을 위한 계층적 신경망)

  • 조성배;김진형
    • Korean Journal of Cognitive Science / v.2 no.1 / pp.33-50 / 1990
  • Recently, neural networks have been proposed as computational models for hard problems that the brain appears to solve easily. This paper proposes a hierarchical network that recognizes printed Hangul characters in a practical way, based on various psychological studies. The system is composed of a type classification network and six recognition networks. The former classifies input character images into one of six types by their overall structure, and the latter further classify them into character codes. Experiments with the 990 most frequently used printed Hangul characters confirm the superiority of the proposed system. Overall, the neural network approach turns out to be very reasonable, as shown by a comparison with a statistical classifier and an analysis of misclassification and generalization capability.

Using Naïve Bayes Classifier and Confusion Matrix Spelling Correction in OCR (나이브 베이즈 분류기와 혼동 행렬을 이용한 OCR에서의 철자 교정)

  • Noh, Kyung-Mok;Kim, Chang-Hyun;Cheon, Min-Ah;Kim, Jae-Hoon
    • Annual Conference on Human and Language Technology (한국어정보학회:학술대회논문집) / 2016.10a / pp.310-312 / 2016
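  • To reduce OCR (Optical Character Recognition) errors, this paper proposes a spelling correction system that uses a confusion matrix of correction word pairs and a naïve Bayes classifier. The system corrects only Hangul spelling errors. The corpora used in the experiments are a raw Korean corpus, an OCR output corpus, and an OCR ground-truth corpus. From the raw Korean corpus we built a grapheme-level language model and a prefix corpus for retrieving correction candidates. From the OCR output corpus and the OCR ground-truth corpus we extracted correction word pairs, decomposed them into graphemes to build a confusion matrix, and used it to construct an error model. Correction candidates are found with the prefix corpus, and the naïve Bayes classifier proposes the n most probable candidates. A correction was judged successful if the correct word (eojeol) was among the n candidates. As a result, for an OCR system with a recognition rate of about 97.73%, presenting three correction candidates raised the recognition rate by about 0.28 percentage points to 98.01%. This result covers Hangul errors only; we expect better results if special characters, digits, and other symbols are also handled in future work.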
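
The idea of ranking correction candidates by combining a prior with a grapheme-level error model can be sketched as below. This is a simplified illustration under stated assumptions: a word-frequency prior stands in for the paper's grapheme-level language model, only same-length substitutions are scored, and all probabilities are invented. The syllable decomposition follows the standard Unicode arithmetic for precomposed Hangul syllables.

```python
import math

SLOTS = ("initial", "medial", "final")

def decompose(syllable):
    """Split a precomposed Hangul syllable into (initial, medial, final) jamo indices."""
    code = ord(syllable) - 0xAC00
    if not 0 <= code < 11172:
        raise ValueError("not a precomposed Hangul syllable: %r" % syllable)
    return code // 588, (code % 588) // 28, code % 28

def graphemes(word):
    """Flatten a word into a list of (slot, jamo index) pairs."""
    return [(slot, idx) for ch in word for slot, idx in zip(SLOTS, decompose(ch))]

def log_score(observed, candidate, prior, confusion, p_same=0.97, p_unseen=1e-4):
    """log P(candidate) + sum of log P(observed grapheme | intended grapheme)."""
    obs, cand = graphemes(observed), graphemes(candidate)
    if len(obs) != len(cand):
        return float("-inf")  # this sketch handles substitutions only
    total = math.log(prior[candidate])
    for o, c in zip(obs, cand):
        p = p_same if o == c else confusion.get((c, o), p_unseen)
        total += math.log(p)
    return total

def top_n(observed, prior, confusion, n=3):
    """Return the n highest-scoring candidates for an OCR output word."""
    ranked = sorted(prior, key=lambda w: log_score(observed, w, prior, confusion),
                    reverse=True)
    return ranked[:n]

# Hypothetical prior and a single confusion entry keyed (intended, observed):
# the medial jamo with index 18 misread as the medial jamo with index 13.
prior = {"한글": 0.4, "한국": 0.5, "한강": 0.1}
confusion = {(("medial", 18), ("medial", 13)): 0.02}
print(top_n("한굴", prior, confusion, n=2))
```

In a fuller system the confusion probabilities would be estimated from aligned OCR output and ground-truth pairs, as the abstract describes.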

A Study on Utilizing Smartphone for CMT Object Tracking Method Adapting Face Detection (얼굴 탐지를 적용한 CMT 객체 추적 기법의 스마트폰 활용 연구)

  • Lee, Sang Gu
    • The Journal of the Convergence on Culture Technology / v.7 no.1 / pp.588-594 / 2021
  • With the recent proliferation of video content, content previously expressed as text or pictures is being replaced by video, and emerging platforms are boosting this growth. Because this accelerated growth has made the technology widely accessible, video production and editing techniques once considered the domain of experts can now be used easily by ordinary people. With these developments, tasks such as recording and adjusting, which used to depend on manual human involvement, can be automated through object tracking technology. The process of finding the object to record and then positioning it at the center of the screen can also be automated. However, selecting the object to be tracked is still left to a human, so delays or mistakes can occur when the target is set. Therefore, we propose a novel CMT object tracking technique that incorporates face detection using a Haar cascade classifier. The proposed system can be applied as an effective and robust image tracking system for continuous, real-time object tracking on a smartphone.

Real-Time Vehicle License Plate Recognition System Using Adaptive Heuristic Segmentation Algorithm (적응 휴리스틱 분할 알고리즘을 이용한 실시간 차량 번호판 인식 시스템)

  • Jin, Moon Yong;Park, Jong Bin;Lee, Dong Suk;Park, Dong Sun
    • KIPS Transactions on Software and Data Engineering / v.3 no.9 / pp.361-368 / 2014
  • LPR (license plate recognition) systems have been developed for efficient control of complex traffic environments and are currently used in many places. However, because of lighting, noise, background changes, environmental changes, and damaged plates, they work only in limited environments and are difficult to use in real time. This paper presents a heuristic segmentation algorithm that is robust to noise and illumination changes and introduces a real-time license plate recognition system based on it. In the first step, we detect the plate using Haar-like features and AdaBoost; the integral image and cascade structure make rapid detection possible. In the second step, we determine the type of license plate, applying adaptive histogram equalization and bilateral filtering for denoising, and segment the characters accurately based on adaptive thresholding, pixel projection, and prior knowledge. The last step is character recognition, which uses histogram of oriented gradients (HOG) features with a multi-layer perceptron (MLP) for number recognition and a support vector machine (SVM) as the number and Korean character classifier, respectively. The experimental results show a license plate detection rate of 94.29% and a license plate false alarm rate of 2.94%. For character segmentation, the character hit rate is 97.23% and the character false alarm rate is 1.37%. For character recognition, the average recognition rate is 98.38%. The total average running time of the proposed method is 140 ms, which makes an efficient and robust real-time system possible.
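
The pixel-projection part of the segmentation step described above can be pictured with a short sketch. It assumes an already binarized plate image (1 for character pixels, 0 for background) and splits characters at empty columns. The function name and the `min_width` parameter are hypothetical, and the adaptive thresholding and prior-knowledge rules from the abstract are not modeled here.

```python
def segment_columns(binary, min_width=1):
    """Split a binary image into character column ranges.

    binary: list of rows with 0/1 pixels. Returns (start, end) pairs,
    end exclusive, for runs of columns whose vertical projection is non-zero.
    """
    width = len(binary[0])
    # Vertical projection profile: count of foreground pixels per column.
    profile = [sum(row[c] for row in binary) for c in range(width)]
    segments, start = [], None
    for c, value in enumerate(profile):
        if value > 0 and start is None:
            start = c  # a character run begins
        elif value == 0 and start is not None:
            if c - start >= min_width:
                segments.append((start, c))
            start = None  # the run ended at an empty column
    if start is not None and width - start >= min_width:
        segments.append((start, width))
    return segments

# Tiny invented image with three character-like blobs.
image = [
    [0, 1, 1, 0, 0, 1, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0, 1],
]
print(segment_columns(image))
```

Raising `min_width` discards narrow runs, which is one simple way to reject isolated noise columns.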

A Priori and the Local Font Classification (연역적이고 국부적인 영문자의 폰트 분류법)

  • 정민철
    • Journal of the Korea Academia-Industrial cooperation Society / v.3 no.4 / pp.245-250 / 2002
  • This paper presents an a priori and local font classification method. The method uses ascenders, descenders, and serifs extracted from a word image. The gradient features of those sub-images are extracted and used as input to a neural network classifier that produces the font classification results. The classification determines two font styles (upright or slanted), three font groups (serif, sans serif, or typewriter), and seven font names (PostScript fonts: Avant Garde, Helvetica, Bookman, New Century Schoolbook, Palatino, Times, and Courier). The proposed a priori and local font classification method enables an OCR system composed of various font-specific character segmentation tools and various mono-font character recognizers.

Extraction of Car License Plate Region Using Histogram Features of Edge Direction (에지 영상의 방향성분 히스토그램 특징을 이용한 자동차 번호판 영역 추출)

  • Kim, Woo-Tae;Lim, Kil-Taek
    • Journal of Korea Society of Industrial Information Systems / v.14 no.3 / pp.1-14 / 2009
  • In this paper, we propose a feature vector, and a method of applying it, for extracting the car license plate region. The proposed feature vector is extracted from the direction code histogram of the edge directions of the image's gradient vectors. The extracted feature vector is passed to an MLP classifier that distinguishes characters from garbage, after which the numerals are recognized and the license plate region is located. The experimental results show that the proposed methods can be applied properly to distinguishing characters from garbage, roughly locating the license plate, and recognizing the numerals in the license plate region.
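
A direction code histogram of gradient directions, as named in the abstract above, might look like the following sketch. The details are assumptions: central differences for the gradient, eight direction codes, and a count-based histogram normalized to sum to 1. The paper's actual operator, number of codes, and weighting are not given in the abstract.

```python
import math

def direction_histogram(img, bins=8, min_mag=1e-9):
    """Normalized histogram of quantized gradient directions.

    img: list of rows of grey levels. Border pixels are skipped because
    the central difference needs both neighbours.
    """
    height, width = len(img), len(img[0])
    hist = [0] * bins
    step = 2.0 * math.pi / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            if math.hypot(gx, gy) <= min_mag:
                continue  # flat region, no edge direction
            angle = math.atan2(gy, gx) % (2.0 * math.pi)
            # Round to the nearest direction code, wrapping at 2*pi.
            code = int(angle / step + 0.5) % bins
            hist[code] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else [0.0] * bins

# An invented 4x4 patch with a vertical edge: the gradient points along +x.
patch = [[0, 0, 1, 1] for _ in range(4)]
print(direction_histogram(patch))
```

A vector like this, computed per candidate region, is the kind of input that could be passed to a classifier separating characters from non-character regions.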