• Title/Summary/Keyword: Character Code

A Study on Character Recognition using HMM and the Mason's Theorem

  • Lee Sang-kyu; Hur Jung-youn
    • Proceedings of the IEEK Conference / summer / pp.259-262 / 2004
  • Most character recognition systems extract and recognize feature shapes using either template matching or a statistical method based on the hidden Markov model (HMM). In this paper, we use a modified chain code that has 8 directions but only 4 codes to build the chain code of a handwritten character, and then convert it into a transition chain code by applying an HMM. The transition chain code is analyzed as a signal flow graph using Mason's theorem, which is generally used to calculate the forward gain in automatic control systems. If the specific forward gain and feedback gain are set properly, the forward gain of the transition chain code obtained with Mason's theorem differs for each object to be recognized. The gain data is reorganized into a tree structure, which makes it possible to distinguish different handwritten characters. With this method, a recognition rate of 91% was achieved.
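
As a rough illustration of the chain-code step only, the sketch below computes a standard 8-direction Freeman chain code for a pixel path. The abstract does not say how 8 directions are mapped onto 4 codes, so `fold_to_four` (which merges opposite directions) is purely an assumption, as are all names and the convention that y grows upward. The HMM and Mason's-theorem stages are not covered.

```python
# Standard 8-direction Freeman chain code; list index = direction code.
# Assumed convention: code 0 points along +x, codes increase counter-clockwise.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1),
              (-1, 0), (-1, -1), (0, -1), (1, -1)]


def chain_code(points):
    """Return the 8-direction code for each step between neighbouring points."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Each step must move to one of the 8 neighbouring pixels.
        codes.append(DIRECTIONS.index((x1 - x0, y1 - y0)))
    return codes


def fold_to_four(codes):
    """Hypothetical 8-to-4 reduction: opposite directions share one code."""
    return [c % 4 for c in codes]


stroke = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2)]
print(chain_code(stroke))                # [0, 1, 2, 4]
print(fold_to_four(chain_code(stroke)))  # [0, 1, 2, 0]
```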

A Method for Automatic Detection of Character Encoding of Multi Language Document File (다중 언어로 작성된 문서 파일에 적용된 문자 인코딩 자동 인식 기법)

  • Seo, Min Ji; Kim, Myung Ho
    • KIISE Transactions on Computing Practices / v.22 no.4 / pp.170-177 / 2016
  • Character encoding is a method for converting a document into a binary document file, using a code table, so that it can be stored in a computer. To decode a binary document file back into readable form, one must know the code table that was applied to the file at the encoding stage. Identifying the code table used to encode the file is thus an essential part of decoding. In this paper, we propose a method for automatically detecting the character code of a given binary document file. The method combines several techniques to increase the detection rate: character code range detection, escape character detection, character code characteristic detection, and commonly used word detection. Commonly used word detection relies on multiple word databases, which lets the method achieve a much higher detection rate for multi-language files than other methods. When a language makes up less than 20% of the document, the conventional method achieves about 50% encoding recognition. The proposed method achieves up to 96% encoding recognition regardless of the proportion of each language.
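
The commonly-used-word idea can be sketched as follows. This is a minimal illustration, not the authors' method: the candidate list, the tiny word lists in `COMMON_WORDS`, and the scoring rule are all assumptions, and the range, escape-character, and characteristic checks described in the abstract are reduced here to "does the byte stream decode without error".

```python
# Hypothetical per-language word databases (a real one would be far larger).
COMMON_WORDS = {
    "ko": ["그리고", "입니다"],
    "ja": ["です", "ます"],
    "en": ["the", "and"],
}
# Candidate encodings, tried in order; earlier entries win ties.
CANDIDATES = ["utf-8", "euc-kr", "shift_jis"]


def detect_encoding(data: bytes):
    """Return the candidate whose decoding contains the most common words."""
    best_name, best_score = None, -1
    for name in CANDIDATES:
        try:
            text = data.decode(name)  # strict: invalid byte sequences raise
        except UnicodeDecodeError:
            continue                  # byte ranges rule this encoding out
        score = sum(text.count(word)
                    for words in COMMON_WORDS.values()
                    for word in words)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


sample = "이것은 시험입니다 그리고 끝입니다".encode("euc-kr")
print(detect_encoding(sample))
```

Scoring against every language's word list at once is what lets a mixed-language file still accumulate evidence for the correct encoding.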

Design and Implementation of Conversion System Between ISO/IEC 10646 and Multi-Byte Code Set (ISO/IEC 10646과 멀티바이트 코드 세트간의 변환시스템의 설계 및 구현)

  • Kim, Chul
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.4 / pp.319-324 / 2018
  • In this paper, we designed and implemented a code conversion method between ISO/IEC 10646 and the multi-byte code set. The Universal Multiple-Octet Coded Character Set (UCS) provides codes for more than 65,000 characters, a huge increase over ASCII's capacity of 128 characters. It is applicable to the representation, transmission, interchange, processing, storage, input, and presentation of the written forms of the world's languages. It is therefore important to give customers guidance on code conversion methods when their systems are migrated to an environment that uses the UCS code system, or when the current code systems (ASCII PC code and EBCDIC host code) are used together with the UCS. A code conversion utility, including the mapping table between the UCS and the IBM new host code, is presented to explain the code conversion algorithm and its implementation in the system. The programs run successfully in real system environments and can therefore be delivered to customers during migration from the UCS to the current IBM code system and vice versa.
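
A table-driven conversion of this kind might look like the sketch below. The paper's actual mapping table for the IBM new host code is not given in the abstract, so Python's built-in `cp037` codec (a single-byte EBCDIC code page) is used here as a stand-in; the function and table names are hypothetical.

```python
# Stand-in host code page (assumption): EBCDIC cp037 from Python's stdlib.
HOST_CODEC = "cp037"

# Mapping table: host byte value -> UCS code point, and its inverse.
TO_UCS = {b: ord(bytes([b]).decode(HOST_CODEC)) for b in range(256)}
FROM_UCS = {code_point: b for b, code_point in TO_UCS.items()}


def host_to_ucs(data: bytes):
    """Convert host-code bytes to a list of UCS code points."""
    return [TO_UCS[b] for b in data]


def ucs_to_host(code_points):
    """Convert UCS code points to host-code bytes (KeyError if unmappable)."""
    return bytes(FROM_UCS[cp] for cp in code_points)


host = ucs_to_host([ord(c) for c in "HELLO"])
print(host.hex())  # c8c5d3d3d6
print("".join(chr(cp) for cp in host_to_ucs(host)))  # HELLO
```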

A Study on a New Korean Standard Hangul Syllable Code Considering Line-Code Scrambling (회선부호의 스크램블링을 고려한 새로운 한국표준 한글글자마디부호에 관한 연구)

  • Park, Yo-Seph; Hong, Wan-Pyo
    • The Journal of the Korea institute of electronic communication sciences / v.10 no.12 / pp.1345-1354 / 2015
  • For the Hangul character code defined in the information interchange code standard (KS X 1001, confirmed in 2004), this paper presents new Hangul consonant and vowel code tables designed with AMI/HDB-3 line coding in mind. Comparing the existing and proposed code sets using the 4×4-bit source coding rule and Hangul consonant and vowel frequency-of-use tables (National Institute of the Korean Language), the statistics showed that data processing efficiency improved by 44%.

A Study on the Hangul Character Code System for KS X 1001 Information Interchange considering AMI/HDB-3 Line Encoding and HDLC Flag (AMI/HDB-3 회선부호화 및 HDLC FLAG를 고려한 KS X 1001 정보교환용 한글낱자 부호체계 개선연구)

  • Woo, Je-Teak; Hong, Wan-Pyo
    • The Journal of the Korea institute of electronic communication sciences / v.10 no.1 / pp.65-72 / 2015
  • AMI/HDB-3, a line coding method that uses a scrambling technique, is used primarily for long-distance data transmission. For the Hangul character code defined in the information interchange code standard (KS X 1001; confirmed in 2014), this paper presents new Hangul consonant and vowel code tables that take into account HDLC flag bit or character stuffing at the data link layer and AMI/HDB-3 scrambling at the physical layer, in order to increase data transmission efficiency. Comparing the existing and proposed code sets using the 4×4-bit source coding rule and Hangul consonant and vowel frequency-of-use tables, the statistics showed that data processing efficiency improved by about 22.01%.
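
One way to see why the bit pattern of a code table matters for HDB-3 is to count how many runs of four consecutive zeros a code word produces, since each such run triggers a substitution in the line coder. The helper below is only an illustrative metric; it is an assumption that it resembles what the paper measures, and the abstract's "4×4-bit source coding rule" is not reproduced here.

```python
def count_zero_runs(data: bytes, run_length: int = 4) -> int:
    """Count non-overlapping runs of `run_length` zero bits, MSB first."""
    count = 0
    zeros = 0
    for byte in data:
        for shift in range(7, -1, -1):
            if (byte >> shift) & 1:
                zeros = 0
            else:
                zeros += 1
                if zeros == run_length:
                    count += 1   # one substitution would be needed here
                    zeros = 0
    return count


# '가' in EUC-KR (the 8-bit form of KS X 1001) is 0xB0A1 = 1011 0000 1010 0001.
print(count_zero_runs("가".encode("euc-kr")))  # 2
```

A code table that assigns frequently used characters to code words with fewer such runs would need fewer substitutions on the line.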

Automatic Container Code Recognition from Multiple Views

  • Yoon, Youngwoo; Ban, Kyu-Dae; Yoon, Hosub; Kim, Jaehong
    • ETRI Journal / v.38 no.4 / pp.767-775 / 2016
  • Automatic container code recognition from a captured image is used for tracking and monitoring containers, but often fails when the code is not captured clearly. In this paper, we increase the accuracy of container code recognition using multiple views. A character-level integration method combines recognized codes from different single views to generate a new code. A decision-level integration selects the most probable results from the codes from single views and the new integrated code. The experiment confirmed that the proposed integration works successfully. The recognition from single views achieved an accuracy of around 70% for the test images collected on a working pier, whereas the proposed integration method showed an accuracy of 96%.
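
Character-level integration can be pictured as a per-position vote across views. The sketch below is an assumption-laden illustration rather than the paper's algorithm: it presumes each view yields the same number of characters, already aligned, each with a recognizer confidence, and it sums confidences per candidate character.

```python
from collections import defaultdict


def integrate_views(readings):
    """Combine aligned per-view readings into one code.

    readings: list of views; each view is a list of (char, confidence)
    pairs, all of the same length (alignment is assumed, not computed).
    """
    length = len(readings[0])
    result = []
    for i in range(length):
        scores = defaultdict(float)
        for view in readings:
            char, confidence = view[i]
            scores[char] += confidence  # confidence-weighted vote
        result.append(max(scores, key=scores.get))
    return "".join(result)


views = [
    [("A", 0.9), ("B", 0.4), ("C", 0.8), ("U", 0.7)],
    [("A", 0.8), ("8", 0.5), ("C", 0.9), ("U", 0.6)],
    [("4", 0.3), ("B", 0.6), ("C", 0.7), ("V", 0.4)],
]
print(integrate_views(views))  # ABCU
```

A decision-level step would then choose between this integrated code and the single-view codes, for example by validating a check digit; that step is not shown.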

A Study on Developing the Identification Code System for Korean Sci-Tech Journals for KSCI (KSCI 구축을 위한 국내 학술지 식별체계 연구)

  • 김선호; 김태중
    • Journal of the Korean Society for Library and Information Science / v.37 no.3 / pp.57-77 / 2003
  • The objective of this study is to develop an identification code for Korean sci-tech journals for KSCI. To this end, the study surveyed and analyzed a variety of major international and national identification code systems for serials and information objects, and then developed KOJIC (KOrean Journal Identification Code). KOJIC provides unique, unambiguous identifiers for the titles of Korean journals in all subject areas. Its design principles are simplicity, mnemonics, internationalization, and extensibility of use. KOJIC is a six-character alphanumeric code that includes one check character.

A Study on the Neural Network for the Character Recognition (문자인식을 위한 신경망컴퓨터에 관한 연구)

  • 이창기; 전병실
    • Journal of the Korean Institute of Telematics and Electronics B / v.29B no.8 / pp.1-6 / 1992
  • This paper proposes a neural computer architecture for learning recognition categories of script character patterns. An oriented filter with complex cells preprocesses the input script character and extracts its contour. The contour is normalized and input to the ART network. Top-down attentional and matching mechanisms are critical to self-stabilizing the code learning process. The architecture embodies a parallel search scheme that updates itself adaptively as the learning process unfolds. After learning self-stabilizes in the ART network, recognition time does not grow as a function of code complexity. The vigilance level indicates the similarity between learned patterns and new input patterns. The character recognition system is designed to be adaptable. Simulation of the system showed satisfactory results in recognizing handwritten characters.
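
The role of the vigilance level can be illustrated with the match test used in binary ART (ART1): a stored prototype is accepted only if the fraction of the input it accounts for reaches the vigilance threshold. The abstract does not say which ART variant the paper uses, so this is a sketch under that assumption, with hypothetical names.

```python
def passes_vigilance(input_bits, prototype_bits, vigilance):
    """ART1-style match test: |input AND prototype| / |input| >= vigilance."""
    overlap = sum(i & w for i, w in zip(input_bits, prototype_bits))
    norm = sum(input_bits)
    return norm > 0 and overlap / norm >= vigilance


pattern = [1, 1, 0, 1]
prototype = [1, 0, 0, 1]
# The prototype covers 2 of the 3 active input bits (about 0.67).
print(passes_vigilance(pattern, prototype, 0.6))  # True
print(passes_vigilance(pattern, prototype, 0.8))  # False
```

Raising the vigilance makes the network reject looser matches and create more, finer-grained categories.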

Implementation of the Container ISO Code Recognition System for Real-Time Processing (실시간 처리를 위한 컨테이너 ISO코드 인식시스템의 구현)

  • Choi Tae-Wan
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.8 / pp.1478-1489 / 2006
  • This paper describes a system for extracting ISO codes from container images. A container ISO code recognition system for real-time processing consists of five core parts: container ISO code detection and image acquisition, ISO code region extraction, individual character extraction, character recognition, and a database. Among these, the accuracy of ISO code extraction significantly affects the system's overall recognition rate, and accurate extraction is required under varied weather and environmental conditions. The proposed system binarizes the ISO code's template regions using adaptive thresholding, extracts candidate regions that contain the ISO code distribution, and recognizes the ISO code after selecting a final region through verification based on the character distribution characteristics of ISO codes. Experimental results show that ISO codes can be extracted efficiently by the proposed method.
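
Adaptive thresholding in its simplest form compares each pixel with the mean of its local neighbourhood rather than with one global threshold, which is what makes it robust to uneven lighting. The sketch below is a generic local-mean version in plain Python; the window size, the `offset` parameter, and the bright-text-on-dark-background convention are assumptions, not details from the paper.

```python
def adaptive_threshold(image, window=3, offset=0):
    """Binarize a grayscale image (list of rows) against the local mean."""
    height, width = len(image), len(image[0])
    radius = window // 2
    output = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Neighbourhood clipped at the image borders.
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            values = [image[j][i] for j in rows for i in cols]
            local_mean = sum(values) / len(values)
            output[y][x] = 1 if image[y][x] > local_mean - offset else 0
    return output


image = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]
print(adaptive_threshold(image))  # [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```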

Consideration of Roman Characters in the KS X 1001 Code System for Information Interchange Considering AMI/HDB-3 and HDLC FLAG (AMI/HDB-3 회선부호화 및 HDLC FLAG를 고려한 KS X 1001 정보 교환용 로마문자 부호체계고찰)

  • Hong, Wan-Pyo
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.7 / pp.1017-1023 / 2013
  • In data communications, source codes produced by information devices such as computers are sent over the transmission line as line-coded signals. The AMI method is used as the line coding method for long-distance transmission. Its disadvantage is that bit synchronization is lost when four or more consecutive '0' bits enter the line coder. Scrambling techniques are used to overcome this problem. The HDB-3 scrambling method, standardized by ITU-T, is used in the Korean standard. With HDB-3, each run of four or more consecutive '0' bits must be converted into a specific bit pattern. As a result, when the source codes contain many such '0' bit runs, data transmission efficiency decreases because of the extra processing in the line coder. This paper studies the Roman character code system in KS X 1001, the Korean standard information interchange code for data communication systems. Based on the results, the paper proposes an optimized Roman character code system. The simulation took into account the 4×4-bit character coding rule and statistical data on the frequency of use of Roman characters. The results show that, when the proposed Roman character coding system is applied, data transmission efficiency could be increased to about 134% compared with the existing code system.
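
The difference between plain AMI and HDB-3 described above can be sketched as follows. This is a textbook-style encoder, not code from the paper; the initial polarity and the initial pulse count (treated as even) are assumptions that vary between implementations. Line symbols are represented as +1, 0, and -1.

```python
def ami_encode(bits):
    """Plain AMI: each 1 becomes a pulse of alternating polarity."""
    out, last_pulse = [], -1
    for bit in bits:
        if bit:
            last_pulse = -last_pulse
            out.append(last_pulse)
        else:
            out.append(0)
    return out


def hdb3_encode(bits):
    """HDB-3: AMI with every run of four zeros replaced by 000V or B00V."""
    out = []
    last_pulse = -1      # polarity of the most recent pulse (assumed start)
    pulses_since_v = 0   # pulses since the last violation (assumed even)
    zeros = 0
    for bit in bits:
        if bit:
            last_pulse = -last_pulse
            out.append(last_pulse)
            pulses_since_v += 1
            zeros = 0
            continue
        out.append(0)
        zeros += 1
        if zeros == 4:
            if pulses_since_v % 2 == 0:
                last_pulse = -last_pulse
                out[-4] = last_pulse   # B: a normal alternating pulse
            out[-1] = last_pulse       # V: same polarity as previous pulse
            pulses_since_v = 0
            zeros = 0
    return out


data = [1, 0, 0, 0, 0, 0, 0, 0, 0]
print(ami_encode(data))   # [1, 0, 0, 0, 0, 0, 0, 0, 0]
print(hdb3_encode(data))  # [1, 0, 0, 0, 1, -1, 0, 0, -1]
```

The HDB-3 output never contains four consecutive zero symbols, which is what preserves bit synchronization at the receiver.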