An Implementation of a System for Video Translation on Window Platform Using OCR

  • 황선명 (Department of Computer Engineering, Daejeon University) ;
  • 염희균 (Department of Computer Engineering, Daejeon University)
  • Received : 2019.09.21
  • Accepted : 2019.11.24
  • Published : 2019.12.31

Abstract

Advances in machine learning have brought great progress in machine translation and in image analysis techniques such as optical character recognition (OCR). Video translation, which combines the two, has progressed more slowly. In this paper, we develop an image translator that combines existing OCR and translation technology, and we verify its effectiveness. Before development, we identified the functions the system requires and the candidate methods for implementing each, and then tested the performance of each. With the resulting application, users can access translation more conveniently. The work can also help move beyond translation functions confined to the special setting of video translation, toward convenience that is available in any environment.
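The abstract describes the system only at a high level: OCR output is passed to a translator. A minimal sketch of that combination might look like the following. It is not the authors' implementation. The names `TextRegion`, `translate_frame`, `fake_ocr`, and `fake_translate` are hypothetical, the OCR and translation engines are replaced by stand-in functions, and the translation cache is an added assumption (consecutive video frames often show the same text), not something the paper states.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Frame = bytes  # hypothetical stand-in for the pixels of one captured frame


@dataclass
class TextRegion:
    text: str
    box: Tuple[int, int, int, int]  # x, y, width, height in the frame


# Any OCR engine and any translation backend can be plugged in through these types.
OcrFn = Callable[[Frame], List[TextRegion]]
TranslateFn = Callable[[str], str]


def translate_frame(
    frame: Frame,
    ocr: OcrFn,
    translate: TranslateFn,
    cache: Dict[str, str],
) -> List[TextRegion]:
    """Run OCR on one frame and translate each recognized text region."""
    results = []
    for region in ocr(frame):
        source = region.text.strip()
        if not source:
            continue  # skip regions where OCR found no usable text
        if source not in cache:
            # Assumption: cache translations so repeated text is translated once.
            cache[source] = translate(source)
        results.append(TextRegion(cache[source], region.box))
    return results


# Stand-ins so the sketch runs without an OCR engine or a translation service.
def fake_ocr(frame: Frame) -> List[TextRegion]:
    return [TextRegion(frame.decode("utf-8"), (0, 0, 100, 20))]


def fake_translate(text: str) -> str:
    return {"안녕하세요": "Hello"}.get(text, text)
```

Passing the OCR and translation steps as functions keeps the pipeline independent of any particular engine, so candidate methods for each function can be swapped in and compared.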
