Character Recognition Algorithm in Low-Quality Legacy Contents Based on Alternative End-to-End Learning

  • Lee, Sung-Jin (Department of AI Convergence, Chonnam National University)
  • Yun, Jun-Seok (Department of AI Convergence, Chonnam National University)
  • Park, Seon-hoo (Department of SW Engineering, Chonnam National University)
  • Yoo, Seok Bong (Department of AI Convergence, Chonnam National University)

  • Received: 2021.08.23
  • Accepted: 2021.09.23
  • Published: 2021.11.30

Abstract

Character recognition is required on various platforms, such as smart parking and text-to-speech, and many studies are under way to improve its performance. However, when low-quality images are used for character recognition, the resolution of the test images differs from that of the training images, which degrades accuracy. To solve this problem, this paper designs an end-to-end neural network that combines image super-resolution and character recognition, so that recognition performance is robust to data of varying quality, and implements an alternative end-to-end learning algorithm to train the whole network. Alternative end-to-end learning and recognition performance tests were conducted on license plate images, chosen from among various kinds of text images, and the results verified the effectiveness of the proposed algorithm.

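The abstract does not specify the network architecture or the training schedule, so the following is only a toy sketch of the general idea, not the authors' implementation. It assumes one possible reading of "alternative end-to-end learning": a restoration stage and a recognition stage share a joint loss and are updated in alternation, each while the other is held fixed. The two "networks" are reduced to hypothetical scalars `a` (restorer) and `w` (recognizer), and the data, the halved-contrast degradation, and the weight `lam` are all illustrative assumptions.

```python
def make_data(n=9):
    """Toy triples (x, hr, t): degraded input, clean target, recognition target."""
    data = []
    for i in range(n):
        hr = -1.0 + 2.0 * i / (n - 1)  # stand-in for a high-resolution pixel
        x = 0.5 * hr                   # assumed degradation: contrast halved
        t = 3.0 * hr                   # stand-in for the recognizer's target
        data.append((x, hr, t))
    return data


def losses(a, w, data):
    """Return (restoration loss, recognition loss), each a mean squared error."""
    n = len(data)
    sr = sum((a * x - hr) ** 2 for x, hr, _ in data) / n
    rec = sum((w * a * x - t) ** 2 for x, _, t in data) / n
    return sr, rec


def train_alternating(data, rounds=2000, lr=0.5, lam=1.0):
    """Alternate gradient steps on a (restorer) and w (recognizer).

    Both steps descend the same joint loss sr + lam * rec, so the recognition
    error also shapes the restorer; this coupling is the end-to-end part.
    """
    a, w = 1.0, 1.0
    n = len(data)
    for _ in range(rounds):
        # Phase 1: recognizer frozen, update the restorer on the joint loss.
        grad_a = sum(
            2 * (a * x - hr) * x + lam * 2 * (w * a * x - t) * w * x
            for x, hr, t in data
        ) / n
        a -= lr * grad_a
        # Phase 2: restorer frozen, update the recognizer on the joint loss
        # (only the recognition term depends on w).
        grad_w = lam * sum(2 * (w * a * x - t) * a * x for x, _, t in data) / n
        w -= lr * grad_w
    return a, w


if __name__ == "__main__":
    data = make_data()
    a, w = train_alternating(data)
    sr, rec = losses(a, w, data)
    print(f"a={a:.3f} w={w:.3f} sr_loss={sr:.2e} rec_loss={rec:.2e}")
```

In this toy setting the restorer has to undo the halved contrast and the recognizer has to map the restored value to its target, so both losses can be driven toward zero together. In a realistic setting the scalars would be replaced by a super-resolution network and a character recognizer, and the loss weighting and update schedule would need to be tuned.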

Acknowledgement

This study was financially supported by Chonnam National University (Grant number: 2021-2208) and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2020R1G1A1100798).
