Variations of AlexNet and GoogLeNet to Improve Korean Character Recognition Performance

  • Lee, Sang-Geol (Human Resources Development Group for Women Maker in Integrated Engineering, Dongguk University) ;
  • Sung, Yunsick (Dept. of Multimedia Engineering, Dongguk University) ;
  • Kim, Yeon-Gyu (Dept. of Computer Science Engineering, Pusan National University) ;
  • Cha, Eui-Young (Dept. of Computer Science Engineering, Pusan National University)
  • Received : 2017.05.12
  • Accepted : 2017.07.31
  • Published : 2018.02.28

Abstract

Deep learning with convolutional neural networks (CNNs) is being studied in many areas of image recognition, and these studies report excellent performance. In this paper, we compare the performance of two CNN architectures, KCR-AlexNet and KCR-GoogLeNet. The experimental data are taken from PHD08, a large-scale Korean character database. It contains 2,187 samples for each of 2,350 Korean character classes, for a total of 5,139,450 samples. After the final training iteration, KCR-AlexNet showed a top-1 test accuracy of over 98% and KCR-GoogLeNet a top-1 test accuracy of over 99%. To compare classification success rates with commercial optical character recognition (OCR) programs and ensure the objectivity of the experiment, we built an additional Korean character dataset with fonts that are not in PHD08. While the commercial OCR programs showed classification success rates of 66.95% to 83.16%, KCR-AlexNet and KCR-GoogLeNet showed average classification success rates of 90.12% and 89.14%, respectively, both higher than the commercial programs' rates. In terms of time, KCR-AlexNet was faster than KCR-GoogLeNet in training on PHD08, whereas KCR-GoogLeNet was faster in classification.
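As a rough illustration of two quantities in the abstract, the sketch below checks the stated dataset size and shows how a top-1 accuracy could be computed from per-class scores. It is not the authors' evaluation code: the function name `top1_accuracy` and the toy scores and labels are hypothetical, chosen only for this example.

```python
# Dataset size as stated in the abstract: samples per class times number of classes.
SAMPLES_PER_CLASS = 2_187
NUM_CLASSES = 2_350
total_samples = SAMPLES_PER_CLASS * NUM_CLASSES  # 5,139,450


def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the true label.

    scores: list of per-class score lists (one inner list per sample).
    labels: list of true class indices.
    (Hypothetical helper, not taken from the paper.)
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal in length")
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax over classes
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Hypothetical toy scores for three samples over three classes.
toy_scores = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4]]
toy_labels = [1, 0, 1]
toy_accuracy = top1_accuracy(toy_scores, toy_labels)  # 2 of 3 predictions match
```

Top-1 accuracy counts a sample as correct only when the single highest-scoring class is the true class, which is the stricter of the usual top-k measures.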

Keywords

E1JBB0_2018_v14n1_205_f0001.png image

Fig. 1. KCR-AlexNet architecture.

E1JBB0_2018_v14n1_205_f0002.png image

Fig. 2. Inception module for GoogLeNet and KCR-GoogLeNet.

E1JBB0_2018_v14n1_205_f0003.png image

Fig. 3. KCR-GoogLeNet architecture.

E1JBB0_2018_v14n1_205_f0004.png image

Fig. 4. Example of transformed input data from PHD08.

E1JBB0_2018_v14n1_205_f0005.png image

Fig. 5. Comparison between KCR-AlexNet and KCR-GoogLeNet. (a) E1-Set_1, (b) E1-Set_2, (c) E1-Set_3, (d) E1-Set_4, and (e) E1-Set_5.

E1JBB0_2018_v14n1_205_f0006.png image

Fig. 6. Comparison between KCR-AlexNet and KCR-GoogLeNet.

Table 1. KCR-GoogLeNet incarnation of the inception architecture

E1JBB0_2018_v14n1_205_t0001.png image

Table 2. PHD08 composition

E1JBB0_2018_v14n1_205_t0002.png image

Table 3. Five experimental data sets

E1JBB0_2018_v14n1_205_t0003.png image

Table 4. Test accuracies for the last training iteration and average times for a single iteration

E1JBB0_2018_v14n1_205_t0004.png image

Table 5. Fonts used for PHD08 and the new data set

E1JBB0_2018_v14n1_205_t0005.png image

Table 6. Comparison of classification success rates for KCR-AlexNet, KCR-GoogLeNet, and other programs

E1JBB0_2018_v14n1_205_t0006.png image
