Evaluation of Recurrent Neural Network Variants for Person Re-identification

  • Le, Cuong Vo (School of Electronics and Telecommunications, Hanoi University of Science and Technology)
  • Tuan, Nghia Nguyen (School of Electronics and Telecommunications, Hanoi University of Science and Technology)
  • Hong, Quan Nguyen (School of Electronics and Telecommunications, Hanoi University of Science and Technology)
  • Lee, Hyuk-Jae (Inter-university Semiconductor Research Center, Department of Electrical and Computer Engineering, Seoul National University)
  • Received : 2017.05.03
  • Accepted : 2017.05.26
  • Published : 2017.06.30

Abstract

Person re-identification performs better when spatial features from a single frame are combined with temporal information across frames. Recurrent neural networks (RNNs) are effective at generating highly discriminative sequence-level human representations. In this work, we implement an RNN, three Long Short-Term Memory (LSTM) network variants, and a Gated Recurrent Unit (GRU) on the Caffe deep learning framework, and we conduct experiments to compare their model size and accuracy for person re-identification. We recommend the GRU, as the experimental results show that it achieves the highest accuracy despite having fewer parameters than the others.
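As a rough illustration of how the size of gated recurrent cells can be compared, the sketch below counts trainable parameters under textbook formulations, assuming one weight matrix and one bias per gate block, no peephole connections, and no projection layer. The function names and the input and hidden sizes are hypothetical and are not taken from the paper, and the Caffe implementations evaluated in this work may count parameters differently.

```python
def recurrent_param_count(input_size, hidden_size, num_blocks):
    """Count weights and biases of a recurrent cell.

    Assumption: the cell consists of `num_blocks` affine blocks, each mapping
    the concatenation [x_t, h_{t-1}] to `hidden_size` units with one bias
    vector. num_blocks=1 would correspond to a plain tanh RNN.
    """
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return num_blocks * per_block


# Textbook block counts (assumed, not the paper's variants):
# GRU  = update gate + reset gate + candidate state
# LSTM = input gate + forget gate + output gate + cell candidate
GATED_BLOCKS = {"gru": 3, "lstm": 4}


def compare(input_size, hidden_size):
    """Return a dict mapping cell name to its parameter count."""
    return {
        name: recurrent_param_count(input_size, hidden_size, blocks)
        for name, blocks in GATED_BLOCKS.items()
    }


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    for name, count in compare(input_size=128, hidden_size=128).items():
        print(f"{name}: {count} parameters")
```

Under these assumptions a GRU carries three quarters of the parameters of an LSTM with the same input and hidden sizes, which is the kind of size difference the comparison in this work weighs against accuracy.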
