Efficient Object Classification Scheme for Scanned Educational Book Image

Effective Automatic Object Classification Technology for Educational Book Images

  • Choi, Young-Ju (Dept. of IT Engineering, Sookmyung Women's University) ;
  • Kim, Ji-Hae (Dept. of IT Engineering, Sookmyung Women's University) ;
  • Lee, Young-Woon (Dept. of Computer Converged Electronics Engineering, SunMoon University) ;
  • Lee, Jong-Hyeok (Dept. of IT Engineering, Sookmyung Women's University) ;
  • Hong, Gwang-Soo (Dept. of IT Engineering, Sookmyung Women's University) ;
  • Kim, Byung-Gyu (Dept. of IT Engineering, Sookmyung Women's University)
  • Received : 2017.11.03
  • Accepted : 2017.11.25
  • Published : 2017.11.30

Abstract

Although the copyright industry has grown into a large-scale business with significant social and economic impact, disputes over the ownership and copyright of works continue to arise, and image copyright in particular has received little research attention. In this study, we propose a system that automatically extracts and classifies objects in scanned educational book images by combining conventional document image processing with deep learning. The proposed method first removes noise and then separates regions based on visual attention. It then groups the extracted regions into blocks and classifies each block as either a picture area or a text area. Finally, it extracts the caption area by searching around each classified picture area. In our performance evaluation, the method achieves an average accuracy of 83% for extracting picture and caption areas together, and up to 97% for detecting picture areas alone.
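The abstract does not say how the search around each picture area is carried out. As a minimal sketch, assuming blocks are axis-aligned rectangles that an earlier stage has already labeled as picture or text, the caption search could pick the nearest text block within a fixed pixel margin. The `Block` type, the `gap` and `find_caption` functions, and the `margin` value are all hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Block:
    """Axis-aligned block in page pixel coordinates (hypothetical structure)."""
    x0: int
    y0: int
    x1: int
    y1: int
    kind: str  # "picture" or "text", assumed to come from an earlier classifier


def gap(a: Block, b: Block) -> int:
    """Larger of the horizontal and vertical gaps between two blocks.

    Returns 0 when the blocks touch or overlap.
    """
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0)
    return max(dx, dy)


def find_caption(picture: Block, blocks: List[Block], margin: int = 40) -> Optional[Block]:
    """Return the text block nearest to `picture` within `margin` pixels, or None.

    The margin of 40 pixels is an assumed value, not one reported by the authors.
    """
    candidates = [
        b for b in blocks
        if b.kind == "text" and gap(picture, b) <= margin
    ]
    return min(candidates, key=lambda b: gap(picture, b), default=None)


# Example page: one picture, a short line just below it, and a distant paragraph.
page = [
    Block(100, 100, 500, 400, "picture"),
    Block(120, 410, 480, 440, "text"),   # 10 px below the picture
    Block(100, 600, 500, 900, "text"),   # 200 px below the picture
]
caption = find_caption(page[0], page)
```

Here `caption` is the short text line directly under the picture, while the distant paragraph is ignored because it lies outside the margin. A real system would likely tune the margin to the scan resolution and might also consider text size or position relative to the picture.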

Acknowledgement

Supported by: Korea Copyright Commission
