A study on Unifying Hanja Variant Groups of Korea and China for LGR (Label Generation Rule) of Internet Top-Level Hangeul Hanja Domain

  • Kim, Kyongsok (School of Computer Science and Engineering, Pusan National University)
  • Received : 2018.04.10
  • Accepted : 2018.05.01
  • Published : 2018.06.30

Abstract

The author studied the process of unifying the Hanja variant groups of Korea and China for the LGR (Label Generation Rule) of Internet top-level Hangeul-Hanja domains, as well as possible confusion between Hangeul syllables and Hanja characters. Of the 3518 Chinese variant groups, Korea and China did not need to review those that contain no Korean Hanja character or only one. They reviewed only the 304 Chinese variant groups (about 9% of the 3518) that contain two or more Korean Hanja characters, which allowed them to unify the variant groups efficiently. This paper summarizes the unification process, which was the core of the Korea-China coordination, together with the nearly final unification result. In addition, the author systematically analyzed whether any Hanja character could be confused with a Hangeul syllable and obtained a good result that had not been expected at the outset. To the author's knowledge, this kind of systematic analysis has not been performed before, and it is one of the contributions of this paper. The author also reviewed how to express the K-LGR in XML for submission to ICANN.
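The selection rule described in the abstract (review only the Chinese variant groups that contain two or more Korean Hanja characters) can be illustrated with a minimal Python sketch. The function name `groups_to_review`, the three toy variant groups, and the toy Korean repertoire are all hypothetical and chosen only for illustration; they are not taken from the actual C-LGR or K-LGR data, and this is not necessarily how the panels carried out the selection.

```python
def groups_to_review(variant_groups, korean_repertoire):
    """Keep only the variant groups with two or more Korean Hanja characters.

    Groups with zero or one Korean character need no joint review,
    because they cannot relate two Korean characters to each other.
    """
    return [g for g in variant_groups if len(g & korean_repertoire) >= 2]


# Toy data (hypothetical): membership below is invented for illustration
# and does not reflect the real Chinese or Korean repertoires.
chinese_variant_groups = [
    frozenset({"國", "国"}),  # one toy-Korean character -> skipped
    frozenset({"学", "斈"}),  # no toy-Korean character  -> skipped
    frozenset({"體", "体"}),  # two toy-Korean characters -> reviewed
]
toy_korean_repertoire = {"國", "體", "体"}

selected = groups_to_review(chinese_variant_groups, toy_korean_repertoire)

# Share of groups reviewed, using the counts reported in the abstract.
reviewed_share_percent = 100 * 304 / 3518
```

With the counts reported in the abstract, `reviewed_share_percent` comes to roughly 8.6, which the abstract rounds to 9%.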
