A Hybrid Proposed Framework for Object Detection and Classification

  • Received : 2017.11.17
  • Accepted : 2018.08.25
  • Published : 2018.10.31

Abstract

Classifying objects from image content is a major challenge in computer vision. Superpixel information can be used to detect and classify objects in an image based on their locations. In this paper, we propose a methodology that detects and classifies object locations in an image using an enhanced bag of words (BOW) model. The method computes the initial position of each image segment from superpixels and then ranks the segments by region score. This information is then used to extract local and global features with a hybrid approach: Scale Invariant Feature Transform (SIFT) for local features and GIST for global features. To improve classification accuracy, a feature fusion technique combines the local and global feature vectors through a weight parameter. A support vector machine, a supervised learning algorithm, is used as the classifier to evaluate the proposed methodology. The experiments use the Pascal Visual Object Classes Challenge 2007 (VOC2007) dataset. The proposed approach produces high-quality class-independent object locations, with a mean average best overlap (MABO) of 0.833 at 1,500 locations, resulting in a better detection rate. Compared with previous approaches, it gives better classification results for the non-rigid classes.
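The two quantitative ideas in the abstract, weighted feature fusion and MABO, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The fusion rule (L2-normalize each descriptor, scale the local part by `w` and the global part by `1 - w`, then concatenate) is an assumption, since the abstract only states that a weight parameter is used. The MABO function follows the usual definition from the object proposal literature: the best intersection over union (IoU) per ground-truth box, averaged per class and then across classes. All function names and data layouts here are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_features(local_vec, global_vec, w=0.5):
    """Weighted concatenation of a local and a global descriptor.

    Assumption: w in [0, 1] scales the local part and (1 - w) the global
    part; the paper's exact fusion rule may differ.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    local_part = [w * x for x in l2_normalize(local_vec)]
    global_part = [(1.0 - w) * x for x in l2_normalize(global_vec)]
    return local_part + global_part


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_average_best_overlap(ground_truth, proposals):
    """MABO: mean over classes of the average best IoU per ground-truth box.

    ground_truth: list of (image_id, class_name, box) tuples
    proposals:    dict mapping image_id -> list of class-independent boxes
    """
    best_per_class = defaultdict(list)
    for image_id, class_name, gt_box in ground_truth:
        candidates = proposals.get(image_id, [])
        # Best overlap of any proposal with this ground-truth box (0 if none).
        best = max((iou(gt_box, p) for p in candidates), default=0.0)
        best_per_class[class_name].append(best)
    abo = [sum(v) / len(v) for v in best_per_class.values()]
    return sum(abo) / len(abo) if abo else 0.0
```

Under these assumptions, a fused vector would be passed to the classifier in place of either descriptor alone, and `mean_average_best_overlap` would be evaluated on the top-ranked locations per image (1,500 in the reported result).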
