An Implementation of a System for Video Translation on Window Platform Using OCR

Implementation of a Windows-Based Video Translation System Using Optical Character Recognition

  • 황선명 (Department of Computer Engineering, Daejeon University)
  • 염희균 (Department of Computer Engineering, Daejeon University)
  • Received : 2019.09.21
  • Accepted : 2019.11.24
  • Published : 2019.12.31

Abstract

As machine learning research has advanced, translation and image analysis techniques such as optical character recognition (OCR) have made great progress. However, video translation, which combines the two, has developed more slowly. In this paper, we develop an image translator that combines existing OCR and translation technology, and we verify its effectiveness. We first identify the functions needed to implement the system and how to implement them, and then test their performance. The resulting application gives users more convenient access to translation and helps ensure that this convenience is available in any environment.
