Accurate Human Localization for Automatic Labelling of Human from Fisheye Images

  • Received: 2017.02.15
  • Reviewed: 2017.04.28
  • Published: 2017.05.31

Abstract

Deep learning networks such as Convolutional Neural Networks (CNNs) have achieved strong performance in many computer vision applications, including image classification and object detection. To deploy a deep learning network on an embedded system with limited processing power and memory, the network may need to be simplified. A simplified network, however, cannot learn every possible scene. One realistic strategy for embedded deep learning is to build a simplified network model optimized for the scene images of the installation site, which in turn requires automatic training for commercialization. In this paper, as an intermediate step toward automatic training in fisheye camera environments, we study precise human localization in fisheye images and propose an accurate human localization method, the Automatic Ground-Truth Labelling Method (AGTLM). AGTLM first localizes candidate human bounding boxes using a GoogLeNet-LSTM approach, verifies them with a GoogLeNet-based CNN, and finally refines them into more accurate and tighter boxes by applying salient object detection. Several experiments demonstrate the improvement of AGTLM in both accuracy and tightness.
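The three-stage pipeline described in the abstract can be illustrated with a short sketch. The Python code below shows only the control flow of the stages; the detector, verifier, and refiner objects, their methods (propose, human_probability, tighten), the 0.5 acceptance threshold, and the Box layout are hypothetical placeholders for this sketch, not the paper's actual implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Box:
    x: int              # top-left corner, pixels
    y: int
    w: int              # width, pixels
    h: int              # height, pixels
    score: float = 0.0  # verifier confidence

def agtlm_label(frame, detector, verifier, refiner,
                accept_thresh: float = 0.5) -> List[Box]:
    """Return tight, verified human boxes for one fisheye frame."""
    # Stage 1: the GoogLeNet-LSTM detector proposes candidate
    # human bounding boxes over the whole fisheye frame.
    candidates = detector.propose(frame)

    labels = []
    for box in candidates:
        crop = frame[box.y:box.y + box.h, box.x:box.x + box.w]

        # Stage 2: a GoogLeNet-based CNN reassures that the candidate
        # actually contains a human; weak candidates are discarded.
        score = verifier.human_probability(crop)
        if score < accept_thresh:
            continue

        # Stage 3: salient object detection tightens the box around
        # the person so the ground-truth label fits precisely.
        tight = refiner.tighten(crop)  # box relative to the crop
        labels.append(Box(box.x + tight.x, box.y + tight.y,
                          tight.w, tight.h, score))
    return labels
```

Cropping each candidate before verification keeps stages 2 and 3 independent of the proposal network, which matches the staged description in the abstract.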

Keywords

