Deep Convolution Neural Networks in Computer Vision: a Review

  • Received : 2014.04.05
  • Accepted : 2014.11.19
  • Published : 2015.02.28

Abstract

Over the past couple of years, tremendous progress has been made in applying deep learning (DL) techniques to computer vision. In particular, deep convolutional neural networks (DCNNs) have achieved state-of-the-art performance on standard recognition datasets and tasks such as the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC). Among them, GoogLeNet, a radically redesigned DCNN based on the Hebbian principle and scale invariance, set the new state of the art for classification and detection in ILSVRC 2014. Because deep learning encompasses a wide variety of techniques, this review focuses on those directly related to DCNNs, especially those needed to understand the architecture and techniques employed in GoogLeNet.
