Development of an Image2Vec-Based Image Retrieval Model Using a Generative Adversarial Network (GAN)

  • Jo, Jaechoon (College of Informatics, Korea University) ;
  • Lee, Chanhee (Dept. of Computer Science and Engineering, Korea University) ;
  • Lee, Dongyub (Dept. of Computer Science and Engineering, Korea University) ;
  • Lim, Heuiseok (Dept. of Computer Science and Engineering, Korea University)
  • Received : 2018.10.30
  • Accepted : 2018.12.20
  • Published : 2018.12.28

Abstract

Most information retrieval (IR) research focuses on document search, so typical keyword-based IR systems cannot reflect the visual attributes of images, even though those attributes matter in image search. To overcome this limitation, we developed a model that retrieves similar images based on image vector information, and a system that retrieves similar images from a sketch given as the search query. The proposed system uses a generative adversarial network (GAN) to up-sample the sketch to the image level, converts the image to a vector with a convolutional neural network (CNN), and then retrieves similar images using a vector space model. We trained the model on fashion images and built a fashion image retrieval system. Performance was measured with Precision at k, and the system scored 0.774 and 0.445. We expect the proposed method to improve search results for users who find it difficult to express their image search intent in keywords.
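The retrieval and evaluation steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the CNN output is reduced to a vector by global average pooling (as the caption of Fig. 1 suggests), that similarity in the vector space is cosine similarity, and that a retrieved image counts as relevant when it shares the query's category label. The function names and the toy data are hypothetical, and the GAN up-sampling step is omitted.

```python
import numpy as np


def global_average_pool(feature_map):
    """Collapse an (H, W, C) feature map into a C-dimensional vector
    by averaging over the two spatial axes."""
    return feature_map.mean(axis=(0, 1))


def top_k_similar(query_vec, index_vecs, k=5):
    """Return the indices of the k rows of index_vecs that are most
    cosine-similar to query_vec (assumes no zero-norm vectors)."""
    q = query_vec / np.linalg.norm(query_vec)
    m = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    scores = m @ q  # cosine similarity of each indexed vector to the query
    return np.argsort(-scores)[:k]


def precision_at_k(retrieved_labels, query_label, k):
    """Fraction of the top-k retrieved items whose label matches the query.
    Label equality as the relevance criterion is an assumption."""
    top = retrieved_labels[:k]
    return sum(1 for label in top if label == query_label) / k


# Toy index: four 2-dimensional "image vectors" with hypothetical labels.
index_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = ["shirt", "shirt", "shoe", "shoe"]

query_vec = np.array([1.0, 0.05])  # stands in for the vector of a query image
ranked = top_k_similar(query_vec, index_vecs, k=3)
ranked_labels = [labels[i] for i in ranked]
print(ranked, precision_at_k(ranked_labels, "shirt", k=3))
```

In a full system the index vectors would come from the trained CNN rather than from hand-written arrays, and a sketch query would first pass through the GAN before being embedded.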

Fig. 1. Extraction of Feature in Image using Global Average Pooling Layer

Fig. 2. Image Feature Embedding using T-SNE

Fig. 3. The Structure of Vector-based Image Retrieval and Sketch-based Image Retrieval Model

Table 1. Amazon Categories Sample

Table 2. Result of Precision

Table 3. Result of Precision at 5
