Object Classification based on Weakly Supervised E2LSH and Saliency map Weighting

Zhao, Yongwei; Li, Bicheng; Liu, Xin; Ke, Shengcai

doi:10.3837/tiis.2016.01.021

KSII Transactions on Internet and Information Systems (TIIS)

Volume 10, Issue 1 / Pages 364-380 / 2016 / 1976-7277 (pISSN) / 1976-7277 (eISSN)

Korean Society for Internet Information

Zhao, Yongwei (China National Digital Switching System Engineering and Technological R&D Center) ;
Li, Bicheng (China National Digital Switching System Engineering and Technological R&D Center) ;
Liu, Xin (China National Digital Switching System Engineering and Technological R&D Center) ;
Ke, Shengcai (China National Digital Switching System Engineering and Technological R&D Center)

Received: 2015.07.10
Reviewed: 2015.10.20
Published: 2016.01.31


Abstract

The most popular approach to object classification is based on the bag-of-visual-words model, which has several fundamental problems that restrict its performance: low time efficiency, the synonymy and polysemy of visual words, and the lack of spatial information between visual words. To address these problems, an object classification method based on weakly supervised E2LSH and saliency map weighting is proposed. First, E2LSH (Exact Euclidean Locality Sensitive Hashing) is employed to generate a group of weakly randomized visual dictionaries by clustering the SIFT features of the training dataset, and the selection of hash functions is supervised, inspired by the random forest idea, to reduce the randomness of E2LSH. Second, the graph-based visual saliency (GBVS) algorithm is applied to detect the saliency map of each image and to weight the visual words according to the saliency prior. Finally, a saliency-map-weighted visual language model is used to accomplish object classification. Experimental results on the Pascal 2007 and Caltech-256 datasets indicate that the distinguishability of objects is effectively improved and that our method is superior to the state-of-the-art object classification methods.
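The first two steps of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard p-stable hash function of E2LSH, h(v) = floor((a·v + b) / w) with Gaussian a and uniform b in [0, w), treats the concatenated hash values of one table as a visual word, and assumes a per-keypoint saliency value is already available (for example, sampled from a GBVS map). The function names, the 4-dimensional toy descriptors (real SIFT descriptors are 128-dimensional), and the parameter values are all hypothetical. The weakly supervised selection of hash functions and the visual language model are not reproduced here.

```python
import math
import random
from collections import Counter


def make_hash_family(dim, k, w, rng):
    """Draw k p-stable hash functions; each is a pair (a, b) with
    Gaussian a in R^dim and offset b uniform in [0, w)."""
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
            for _ in range(k)]


def bucket_key(v, family, w):
    """Concatenate the k values floor((a.v + b) / w) into one bucket key.
    In this sketch the bucket key plays the role of a visual word."""
    return tuple(
        math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
        for a, b in family
    )


def saliency_weighted_histogram(descriptors, saliencies, family, w):
    """Add each descriptor's saliency (assumed given) to the bin of its
    visual word, then normalize the histogram to sum to 1."""
    hist = Counter()
    for v, s in zip(descriptors, saliencies):
        hist[bucket_key(v, family, w)] += s
    total = sum(hist.values())
    if total <= 0:
        return {}
    return {word: weight / total for word, weight in hist.items()}


# Toy usage: two identical descriptors and one distant descriptor.
rng = random.Random(0)          # fixed seed so the run is repeatable
W = 4.0                         # bucket width (hypothetical value)
family = make_hash_family(dim=4, k=3, w=W, rng=rng)
descs = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0], [100.0, -50.0, 30.0, 7.0]]
sal = [0.5, 0.3, 0.2]           # hypothetical per-keypoint saliency values
hist = saliency_weighted_histogram(descs, sal, family, W)
for word, weight in hist.items():
    print(word, round(weight, 3))
```

Drawing several independent hash families in this way would give several dictionaries, which is one possible reading of the "group of weakly randomized visual dictionaries" in the abstract. How the paper actually supervises the choice of hash functions is not stated in the abstract.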
