• Title/Summary/Keyword: Hangeul and Hanja

A study on Unifying Hanja Variant Groups of Korea and China for LGR (Label Generation Rule) of Internet Top-Level Hangeul Hanja Domain

  • Kim, Kyongsok
    • International journal of advanced smart convergence / v.7 no.2 / pp.7-21 / 2018
  • The author studied the process of unifying the Hanja variant groups of Korea and China for the LGR (Label Generation Rule) of the Internet top-level Hangeul-Hanja domain, as well as possible confusion between Hangeul syllables and Hanja characters. Of the 3518 Chinese variant groups, Korea and China did not need to review those that include no Korean Hanja character or only one. They reviewed the 304 Chinese variant groups (about 9% of the 3518) that include two or more Korean Hanja characters, and in this way unified the variant groups efficiently. The paper summarizes the unification process, which is the core of the Korea-China coordination, and the almost final unification result. In addition, the author systematically analyzed whether any Hanja character could be confused with a Hangeul syllable and obtained a good result that was not expected at the outset. This kind of systematic analysis has probably not been performed before and appears to be the first attempt, which is one of the contributions of the paper. The author also reviewed how to express the K-LGR in XML for submission to ICANN.
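The selection rule described in this abstract (review only the variant groups that contain two or more Korean Hanja characters) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's procedure: the function name `groups_needing_review` and the placeholder labels in the toy data are hypothetical, and only the counts 304 and 3518 come from the abstract.

```python
from typing import Iterable, List, Set


def groups_needing_review(
    variant_groups: Iterable[Set[str]], korean_hanja: Set[str]
) -> List[Set[str]]:
    """Keep only the groups that contain two or more Korean Hanja characters."""
    return [g for g in variant_groups if len(g & korean_hanja) >= 2]


# Hypothetical toy data: the labels stand in for characters and are
# not taken from the paper.
toy_groups = [{"A1", "A2", "A3"}, {"B1", "B2"}, {"C1"}]
toy_korean = {"A1", "A3", "B1"}

# Only the first group has two members in the Korean set.
to_review = groups_needing_review(toy_groups, toy_korean)

# Share of Chinese variant groups reviewed, using the abstract's counts.
reviewed_share = 304 / 3518 * 100  # roughly 9 percent
```

Groups with zero or one Korean member are skipped because no Korean-side variant relation can arise inside them, which is why the reviewed share stays small.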

A Study on Classification into Hangeul and Hanja in Text Area of Printed Document (인쇄체 문서의 문자영역에서 한글과 한자의 구별에 관한 연구)

  • 심상원;이성범;남궁재찬
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.6 / pp.802-814 / 1993
  • This paper proposes an algorithm for the preprocessing stage of character recognition that classifies characters into Hangeul and Hanja. The classification uses nine structural characteristics of Hanja that are not affected by changes in character size or style, together with rates based on character size. First, blocking is performed to segment the individual characters. Second, the proposed algorithm is applied to the segmented characters. Finally, each character is classified as Hangeul or Hanja. An experiment was carried out on the 2350 Hangeul and 4888 Hanja characters of KS-C 5601, printed in Gothic and Mincho styles. Experiments on a typeface sample book, newspapers, academic society papers, magazines, textbooks and word-processed documents yielded classification rates of 98.8%, 92%, 96%, 98% and 98%, respectively.

Hanja Information in the Entries of Korean Unabridged Dictionary (국어대사전의 표제어에 나타나는 한자 정보)

  • Kim, Cheol-Su
    • The Journal of the Korea Contents Association / v.10 no.4 / pp.438-446 / 2010
  • Language information processing that involves both Hangeul and Hanja requires an electronic dictionary that supports both scripts. This paper examines statistical information on the Hanja entries of the Korean Unabridged Dictionary: the number of entries that include Hanja, based on the KSC-5601 character set; the frequency of the pronunciation and meaning of each Hanja character in the entries; the frequency of Hanja in entries by part of speech; and the average number of Hanja characters per entry. At least one Hanja character appears in 303,951 of the 440,594 entries, or 68.99% of the total. The 440,594 entries contain 858,595 Hanja characters, an average of 1.95 per entry. Since the average entry length is 3.56 syllables and the average number of Hanja characters per entry is 1.95, 54.7% of all characters in the entries are Hanja. Of the 4,888 Hanja character codes, 4,660 are used at least once, whereas 228 never appear in any entry. Five characters appear more than 4,000 times. The 858,595 Hanja characters used across all entries correspond to 471 Hangeul codes.
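The ratios in this abstract follow from its raw counts, and the short check below recomputes them from those figures alone. The helpers `is_hanja` and `count_hanja` are hypothetical illustrations: they detect Hanja by two Unicode blocks (CJK Unified Ideographs and CJK Compatibility Ideographs), which is an assumption made here and not the KSC-5601 based counting used in the paper.

```python
# Counts quoted in the abstract.
entries_total = 440_594
entries_with_hanja = 303_951
hanja_chars = 858_595
avg_syllables_per_entry = 3.56

# Derived ratios.
share_entries = entries_with_hanja / entries_total * 100   # percent of entries
avg_hanja_per_entry = hanja_chars / entries_total           # characters per entry
share_chars = avg_hanja_per_entry / avg_syllables_per_entry * 100


def is_hanja(ch: str) -> bool:
    """Assumption: treat two Unicode ideograph blocks as Hanja."""
    cp = ord(ch)
    return 0x4E00 <= cp <= 0x9FFF or 0xF900 <= cp <= 0xFAFF


def count_hanja(entry: str) -> int:
    """Count the Hanja characters in one dictionary headword."""
    return sum(1 for ch in entry if is_hanja(ch))
```

Applied over a full headword list, summing `count_hanja` and dividing by the number of entries would give the same kind of per-entry average, up to differences between the Unicode blocks assumed here and the character set used in the study.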

An Web-based Mapping by Constructing Database of Geographical Names (지명 데이터베이스 구축을 통한 웹지도화 방안)

  • Kim, Nam-Shin
    • Journal of the Korean association of regional geographers / v.16 no.4 / pp.428-439 / 2010
  • A map of geographical names can help us understand a region, because geographical names reflect how people perceive it. This study aimed to build a web-based map by constructing a database of geographical names. The research covered methods for classifying geographical names, constructing the database, and mapping on the website. Geographical names were classified into four categories (physical geography, cultural and historical geography, economic geography, and other) and further into 18 sub-categories according to classification criteria. Names were entered by collecting them from paper maps and by gathering vernacular place names known only in the local area. The database fields consist of address, coordinates, geographical name (Hangeul, Hanja), classification, explanation, and photographs. The resulting map can present geographical names together with regional geographical information. The results are expected to provide information on the distribution of geographical names and to support regional interpretation.
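The database fields listed in this abstract could be modelled along the following lines. This is only a sketch under stated assumptions: the class and attribute names are hypothetical, the coordinate order (latitude, longitude) is assumed, the 18 sub-categories are left as a free-text field because the abstract does not name them, and the sample record is illustrative rather than taken from the study.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Category(Enum):
    """The four top-level categories named in the abstract."""
    PHYSICAL = "physical geography"
    CULTURAL_HISTORICAL = "cultural and historical geography"
    ECONOMIC = "economic geography"
    OTHER = "other"


@dataclass
class PlaceName:
    address: str
    coordinates: Tuple[float, float]  # assumed order: (latitude, longitude)
    name_hangeul: str
    name_hanja: str
    category: Category
    sub_category: str = ""            # one of 18; names not given in the abstract
    explanation: str = ""
    photographs: List[str] = field(default_factory=list)  # e.g. file paths


# Illustrative record with approximate values, not data from the paper.
sample = PlaceName(
    address="Jeju-do",
    coordinates=(33.36, 126.53),
    name_hangeul="한라산",
    name_hanja="漢拏山",
    category=Category.PHYSICAL,
)
```

Keeping the Hangeul and Hanja forms as separate fields mirrors the "geographical name (Hangeul, Hanja)" entry in the field list and lets a web map search or label by either script.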
