• Title/Summary/Keyword: Hangul text


A Rule-Based Analysis from Raw Korean Text to Morphologically Annotated Corpora

  • Lee, Ki-Yong;Markus Schulze
    • Language and Information / v.6 no.2 / pp.105-128 / 2002
  • Morphologically annotated corpora are the basis for many tasks in computational linguistics. Most current approaches use statistically driven methods of morphological analysis that provide only POS tags. While this is sufficient for some applications, many others need a rule-based full morphological analysis that also yields lemmatization and segmentation. This work thus aims at 〔1〕 introducing a rule-based Korean morphological analyzer called Kormoran, based on the principle of linearity that prohibits any combination of left-to-right or right-to-left analysis or backtracking, and then at 〔2〕 showing how it can be used as a POS tagger by adopting an ordinary preprocessing technique and by filtering out irrelevant morpho-syntactic information in the analyzed feature structures. It is shown that, besides providing a basis for subsequent syntactic or semantic processing, full morphological analyzers like Kormoran have greater power to resolve ambiguities than simple POS taggers. The focus of the present analysis is on Korean text.


Typography for Efficient Visual Flow of Text Focused on Hangul (텍스트의 효율적 시각흐름을 위한 타이포그래피-한글을 중심으로-)

  • 신경주;김지현
    • Archives of design research / v.11 no.3 / pp.187-196 / 1998
  • This study suggests a method of text arrangement that enhances visual perception and thereby helps clarify communication. One hundred subjects, without restriction of gender or profession, participated in each experiment; their reading time was measured to the nearest 0.01 second. An analysis of variance (two-way ANOVA without interaction) was performed for each experiment, and the p value was 0.0001, which implies strong consistency among the test results. The results of the first experiment show a consistent relationship between type size and text line length: the most effective ratio of type size to line length is approximately 1:8. Judging from the results of the second and third experiments, vertical text arrangement appears to be the most efficient for reading regardless of text line length. Thus, for efficient visual flow, keeping the same reading direction is more important than narrowing the eye-movement distance between columns. This research supports the view that, when considering efficient eye movement over text, it is important to understand the variables mentioned above that affect visual interpretation.


Recognition of Various Printed Hangul Images by using the Boundary Tracing Technique (경계선 기울기 방법을 이용한 다양한 인쇄체 한글의 인식)

  • Baek, Seung-Bok;Kang, Soon-Dae;Sohn, Young-Sun
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.1 / pp.1-5 / 2003
  • In this paper, we implemented a system that converts character images of the printed Korean alphabet (Hangul), captured with a black-and-white CCD camera, into editable text documents. We extracted the contour information of each character, which reflects its structural properties, by using the boundary tracing technique, which is robust to noise in character recognition. Using the contour information, we recognized the horizontal and vertical vowels of the character image and classified the character into one of six patterns. The character is then divided into consonant and vowel units. The vowels are recognized by using the maximum length projection. The separated consonants are recognized by comparing the input pattern with a standard pattern that holds the phase information of the boundary line changes. The recognized characters are passed to a word editor in the editable KS Hangul completion-type code.

Secure Steganography Based on Triple-A Algorithm and Hangul-jamo (Triple-A 알고리즘과 한글자모를 기반한 안전한 스테가노그래피)

  • Ji, Seon-Su
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.5 / pp.507-513 / 2018
  • Steganography is a technique that hides messages so that no one except the sender and trusted recipients knows that a secret message exists. This paper uses a 24-bit color image as the cover medium; a 24-bit color image has three components corresponding to red, green, and blue. This paper proposes an image steganography method that uses the Triple-A algorithm to hide the secret (Hangul) message by arbitrarily selecting the number of LSB bits and the color channel to be used. The secret character is divided into chosung, jungsung, and jongsung, and crossover, encryption, and arbitrary insertion positions are applied to enhance robustness and confidentiality. Experimental results show that the proposed method achieves excellent insertion capacity and correlation with an acceptable level of image quality. Considering image quality, it was also confirmed that the size of the LSB should be less than 2.
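
The split of a Hangul character into chosung, jungsung, and jongsung mentioned in this abstract can be illustrated with standard Unicode arithmetic on the precomposed syllable block (U+AC00 to U+D7A3). This is only a minimal sketch of the decomposition step; it is not the paper's Triple-A embedding, and the function names `decompose` and `compose` are hypothetical.

```python
from typing import Tuple

HANGUL_BASE = 0xAC00   # first precomposed syllable, "가"
NUM_CHO = 19           # initial consonants (chosung)
NUM_JUNG = 21          # medial vowels (jungsung)
NUM_JONG = 28          # final consonants (jongsung), index 0 = no final


def decompose(syllable: str) -> Tuple[int, int, int]:
    """Return (chosung, jungsung, jongsung) indices of one Hangul syllable."""
    offset = ord(syllable) - HANGUL_BASE
    if not 0 <= offset < NUM_CHO * NUM_JUNG * NUM_JONG:
        raise ValueError("not a precomposed Hangul syllable")
    cho, rest = divmod(offset, NUM_JUNG * NUM_JONG)
    jung, jong = divmod(rest, NUM_JONG)
    return cho, jung, jong


def compose(cho: int, jung: int, jong: int) -> str:
    """Inverse of decompose: rebuild the syllable from its three indices."""
    return chr(HANGUL_BASE + (cho * NUM_JUNG + jung) * NUM_JONG + jong)
```

Each index fits in 5 bits, so a syllable split this way could in principle be spread over a small number of LSB positions; how the paper actually maps and encrypts the indices is not stated in the abstract.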

A Study of developing Hangul text editor for X-Window using a one-byte Hangul code supporting ISO 2022 (ISO 2022 를 따르는 한 바이트 한글 부호계를 지원하는 X-Window용 한글 문서 편집기 개발 연구)

  • Cho, Chung-Lae;Kim, Kyong-Sok
    • Annual Conference on Human and Language Technology / 1995.10a / pp.73-79 / 1995
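  • Many information-related application areas today follow ISO 2022, but the Hangul code systems currently in use either cannot support ISO 2022 at all or, even when they do support ISO 2022, cannot support Hangul properly. To solve this problem, we designed a new one-byte Hangul code that supports ISO 2022 while also supporting Hangul properly. The new one-byte Hangul code can represent all 11,172 modern Hangul syllables. To represent incomplete syllables, it abandons the existing fill-character method and adopts a separator-character (뗌 글자) method, which is more natural and better suited to the characteristics of Hangul. In this study, we verified the practicality of the new one-byte Hangul code by developing a Hangul text editor for X-Window that supports it. As the operating environment for the editor, we chose the X-Window system running on the Unix operating system, and we implemented the Hangul input/output component as a Motif widget so that other application programs can also easily support the one-byte Hangul code.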


A Reliability Verification of Screening Time Prediction Reporting of 'Cine-Hangeul'

  • Jeon, Byoung-Won
    • Journal of Multimedia Information System / v.7 no.2 / pp.141-146 / 2020
  • Cine-Hangeul is a program that can predict the running time of a movie from its screenplay before production. This paper seeks to verify the prediction reporting function of Cine-Hangeul, which is the standard Korean screenplay format. Moreover, this paper presents a method to increase the accuracy of the Cine-Hangeul reporting function. The objective is to offer a correction method based on scientific evidence, because the current Cine-Hangeul reporting function has many errors. A verification process covering five scenarios and their movies confirmed that the default setting of Cine-Hangeul's screening time prediction reporting produced many errors. Cine-Hangeul analyzes the amount of textual information to predict the duration of each scene and of the dialogue, and thereby helps predict the total running time of the movie. Therefore, if a certain amount of text information is not available, the accuracy is unreliable. The current Cine-Hangeul prediction report is found to be highly efficient when the scenario volume is about 90 to 100 pages. As a result, the verification confirmed that screening time prediction by Cine-Hangeul, a Korean scenario standard format program, can reach the same level of reliability as the actual screening time once the reporting settings are corrected. It also confirmed that when about 50 percent of the default screening time reporting setting is applied, the prediction is almost identical to the actual screening time.

Implementation of an efficient Pocket PC-based Hangul Matching System (Pocket PC기반의 효율적인 한글 정합 시스템 구현)

  • Park Jong-Min;Cho Beom-Joon
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.7 / pp.1546-1552 / 2004
  • Electronic ink is data stored in the form of handwritten text or script, without conversion to ASCII by handwriting recognition, on pen-based computers and Personal Digital Assistants (Pocket PC), in order to support natural and convenient data input. One of the most important issues is how to search electronic ink so that it can be used. We proposed and implemented a script matching algorithm for electronic ink. The proposed matching algorithm separates the input stroke into a set of primitive strokes using the curvature of the stroke curve. After determining the type of each separated stroke, it produces a stroke feature vector. It then calculates the distance between the stroke feature vector of the input strokes and that of the strokes in the database using the dynamic programming technique.

A Verification Method for Handwritten text in Off-line Environment Using Dynamic Programming (동적 프로그래밍을 이용한 오프라인 환경의 문서에 대한 필적 분석 방법)

  • Kim, Se-Hoon;Kim, Gye-Young;Choi, Hyung-Il
    • Journal of KIISE:Software and Applications / v.36 no.12 / pp.1009-1015 / 2009
  • Handwriting verification is a technique for distinguishing a person's genuine handwriting specimens from imitations by comparing two or more texts, using the individuality of that person's handwriting. This paper suggests an effective verification method for handwritten signatures or text in an off-line environment using pattern recognition technology. The core processes of the method studied in this paper are extraction of the letter area, extraction of features based on the structural characteristics of handwritten text, and feature analysis using the DTW (Dynamic Time Warping) algorithm and PCA (Principal Component Analysis). The experimental results show superior performance of the suggested method.
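
The DTW (Dynamic Time Warping) step named in this abstract can be sketched with the textbook dynamic-programming recurrence. The example below is an illustration only: it compares two one-dimensional feature sequences using absolute difference as the local cost, which is an assumption, since the abstract does not state the paper's features or cost function. The name `dtw_distance` is hypothetical.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW cost between two 1-D sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned with an already-used b element
                cost[i][j - 1],      # b[j-1] aligned with an already-used a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]
```

Because the warping path may repeat elements, two sequences that differ only in local stretching (for example, the same stroke profile written more slowly) get a cost of zero, which is the property that makes DTW attractive for comparing handwriting features of different lengths.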

A Study on Conventional Expression of Hangul Ganchal and Email (조선시대 한글 간찰과 이메일의 상투적 표현 고찰)

  • Jeon, Byeong-yong
    • (The)Study of the Eastern Classic / no.49 / pp.431-459 / 2012
  • The purpose of this article is to compare and analyze the conventional expressions of Hangul Ganchal in the Joseon Dynasty and of email. Conventional expressions are used most notably in introductions and conclusions. In the introduction, they are used for addressing and safety greetings, while in the conclusion they are used for the closing address and closing words. In the Joseon Dynasty, the envelope of a Ganchal included only the details of the receiver, because the letter was delivered in person by someone who knew both the receiver and the sender very well. The envelope of a Ganchal corresponds to the screen used for emailing. In an email, we see the name of the sender and the title of the text, and once we click the title, we are able to view the text. The difference between the two is that the Ganchal shows the receiver's details whereas the email shows the sender's details. For addressing in a letter with conventional expressions, we can see that "To~" is used as a humble form and "~께" as an honorific form. We confirmed that conventional expressions for seasonal greetings have not yet settled in either Ganchal or email. The safety greetings comprise the latest news of both the sender and the receiver. In Ganchal, this composition is expressed conventionally, whereas in emails only the receiver's latest news is written and the sender's latest news is hard to find anywhere in the text. In the closing section of a Ganchal, the closing address and closing words were expressed conventionally; in email, however, these were again hard to find. To conclude, in Ganchal the conventional expression developed and became established in the 16th century (Sun-eon), when there was a focus on the native language. In the 17th century (Hyeon-eon), it stood still for some time, and it moved on to the 19th century (Jing-eon), when there was a strong influence of Hangul Ganchal, which resulted in a regression to conservative expression. In general, we are able to confirm that the conventional expression is slowly disappearing.

The Study on the 「Sun Gi Il Il Bun Wi Sa Si (順氣一日分爲四時)」 of the 「Young Chu (靈樞)」 ("영추.순기일일분위사시(靈權.順氣一日分爲四時)"에 대한 연구(硏究))

  • Kim, Young-Ha;Ruk, Sang-Won
    • Journal of Korean Medical classics / v.18 no.1 s.28 / pp.33-48 / 2005
  • The purpose of this study is to translate 「Sun Gi Il Il Bun Wi Sa Si」 in the 「Young Chu (靈樞)」 into modern language, because the original, written in classical language, is hard to understand. We collated the original text against 7 other classic books and classified the annotations of 6 annotated books according to similar content. We divided this volume into 3 chapters and added Hangul suffixes to the original text. The Five types of changes (五變) in the second chapter refer to the mutual relationships among the Five viscera and Color, Time, Day, Note, and Taste. The order of the contents in the second chapter must be unified to follow Color, Time, Day, Note, Taste. The Five types of changes in the third chapter must be revised to the Five types of diseases (五病) on the basis of the 「You Kyoung (類經)」.
