
Sketch-based 3D object retrieval using Wasserstein Center Loss

  • Ji, Myunggeun (Department of Computer Science, Kyonggi University) ;
  • Chun, Junchul (Department of Computer Science, Kyonggi University) ;
  • Kim, Namgi (Department of Computer Science, Kyonggi University)
  • Received : 2018.10.31
  • Accepted : 2018.11.17
  • Published : 2018.12.31

Abstract

Sketch-based 3D object retrieval is a convenient way to search for diverse 3D objects using human-drawn sketches as queries. In this paper, we propose a new method that improves retrieval accuracy by training a Sketch CNN (Convolutional Neural Network) and a Wasserstein CNN with the proposed Wasserstein center loss. The Wasserstein center loss learns a center for each object category and reduces the Wasserstein distance between that center and the features of the same category. The proposed retrieval proceeds in three steps. First, the Wasserstein CNN extracts features from 2D images captured from multiple viewpoints of each 3D object and computes the Wasserstein barycenter of these per-view features to obtain the object's feature. Second, sketch features are extracted with a separate Sketch CNN. Finally, the extracted 3D object features and sketch features are trained with the proposed Wasserstein center loss and used for sketch-based 3D object retrieval. To demonstrate the effectiveness of the proposed method, we evaluate it on two benchmark datasets, SHREC 13 and SHREC 14; the proposed method outperforms existing sketch-based retrieval methods on all standard evaluation metrics.
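Since this record does not spell the loss out in equations or code, the following is only a minimal NumPy sketch of the idea under stated assumptions: features are normalized into discrete probability histograms, the ground cost between histogram bins is the squared distance between bin indices, the Wasserstein distance is approximated with entropy-regularized Sinkhorn iterations, and the class centers are plain per-class averages (the paper learns them). All names and parameters below are illustrative, not the authors' implementation.

```python
# Minimal sketch of a Wasserstein center loss (assumptions noted above).
import numpy as np

def sinkhorn_distance(a, b, M, reg=0.1, n_iters=100):
    """Entropy-regularized Wasserstein distance between histograms a and b."""
    K = np.exp(-M / reg)                 # Gibbs kernel derived from ground cost M
    u = np.ones_like(a)
    for _ in range(n_iters):             # Sinkhorn fixed-point updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = np.diag(u) @ K @ np.diag(v)      # approximate optimal transport plan
    return float(np.sum(P * M))

def wasserstein_center_loss(features, labels, centers, M, reg=0.1):
    """Mean Sinkhorn distance between each feature and its class center."""
    return float(np.mean([sinkhorn_distance(f, centers[y], M, reg)
                          for f, y in zip(features, labels)]))

# Toy usage: 4-bin feature histograms, 2 classes.
d = 4
bins = np.arange(d, dtype=float)
M = (bins[:, None] - bins[None, :]) ** 2                 # ground cost between bins
feats = np.abs(np.random.randn(6, d))
feats /= feats.sum(axis=1, keepdims=True)                # normalize to histograms
labels = np.array([0, 0, 0, 1, 1, 1])
centers = np.stack([feats[labels == c].mean(axis=0) for c in (0, 1)])
print(wasserstein_center_loss(feats, labels, centers, M))
```

In the paper's setting, this distance term is what pulls sketch features and view-based 3D object features of the same category toward a shared learned center.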

Keywords

(Figure 1) Overview of sketch-based 3D object retrieval using the Wasserstein center loss

(Figure 2) Visualization of features learned with the Wasserstein center loss (10 randomly selected classes: dog, hot air balloon, piano, potted plant, floor lamp, dragon, laptop, shoe, snake, bush)
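The record does not state how the 2D visualization in Figure 2 was produced; a common choice for this kind of plot is t-SNE. Below is a hedged scikit-learn/Matplotlib sketch with placeholder feature and label arrays, not the authors' plotting code.

```python
# Hedged sketch: project learned feature vectors to 2D with t-SNE and color
# points by class, one way to produce a plot like Figure 2.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

features = np.random.randn(500, 128)          # placeholder learned features
labels = np.random.randint(0, 10, size=500)   # placeholder class labels

embedded = TSNE(n_components=2, perplexity=30, init="pca",
                random_state=0).fit_transform(features)

plt.scatter(embedded[:, 0], embedded[:, 1], c=labels, cmap="tab10", s=8)
plt.title("2D embedding of learned features (10 classes)")
plt.show()
```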

(Figure 3) Retrieval examples on the SHREC 13 dataset; incorrectly retrieved classes are highlighted in gray (retrieval class: hand)
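Figures 3 and 5 show ranked retrieval lists with wrong-class results marked. As a rough illustration of that step, the sketch below ranks gallery objects by distance to a query-sketch feature and flags class mismatches; it uses plain Euclidean distance and made-up data purely for brevity, whereas the paper ranks with Wasserstein distances between learned features.

```python
# Hedged sketch of ranking gallery objects for one sketch query.
import numpy as np

def rank_gallery(sketch_feat, gallery_feats, gallery_labels, query_label, k=5):
    """Return the top-k gallery labels and whether each matches the query class."""
    dists = np.linalg.norm(gallery_feats - sketch_feat, axis=1)  # query-to-object distances
    order = np.argsort(dists)[:k]                                # best k matches first
    return [(gallery_labels[i], gallery_labels[i] == query_label) for i in order]

rng = np.random.default_rng(0)
gallery_feats = rng.standard_normal((100, 64))        # placeholder 3D object features
gallery_labels = rng.choice(["hand", "cup", "dog"], size=100)
sketch_feat = rng.standard_normal(64)                 # placeholder sketch feature
print(rank_gallery(sketch_feat, gallery_feats, gallery_labels, "hand"))
```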

(Figure 4) Comparison of precision-recall curves on the SHREC 13 dataset
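Figures 4 and 6 compare precision-recall curves. For reference, the sketch below computes the precision-recall points for a single query from its ranked result list; benchmark curves average such points over all queries. Variable names are illustrative.

```python
# Precision-recall points for one query of a retrieval system.
import numpy as np

def precision_recall_points(ranked_labels, query_label):
    """Precision and recall after each rank position of one query's result list."""
    relevant = (np.asarray(ranked_labels) == query_label).astype(float)
    tp = np.cumsum(relevant)                            # true positives up to rank k
    precision = tp / np.arange(1, len(relevant) + 1)
    recall = tp / relevant.sum()
    return precision, recall

prec, rec = precision_recall_points(["hand", "hand", "cup", "hand", "cup"], "hand")
print(np.round(prec, 2), np.round(rec, 2))
```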

(Figure 5) Retrieval examples on the SHREC 14 dataset; incorrectly retrieved classes are highlighted in gray (retrieval class: armchair)

(Figure 6) Comparison of precision-recall curves on the SHREC 14 dataset

(Table 1) Experimental environment

(Table 2) Comparison of NN, FT, ST, E, DCG, and mAP results on the SHREC 13 dataset (%)
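Tables 2 and 3 report the standard SHREC-style retrieval measures. The sketch below computes common definitions of NN, FT, ST, E-measure, normalized DCG, and average precision for a single query; the benchmark's official evaluation code may use slightly different conventions (e.g., list cut-offs or whether the query itself is counted), so treat this as an approximation rather than the paper's evaluation script.

```python
# Common definitions of the retrieval metrics in Tables 2 and 3, for one query.
import numpy as np

def retrieval_metrics(relevant, n_relevant):
    """relevant: 0/1 relevance over the ranked results; n_relevant: class size."""
    relevant = np.asarray(relevant, dtype=float)
    ranks = np.arange(1, len(relevant) + 1)
    nn = relevant[0]                                          # nearest neighbor
    ft = relevant[:n_relevant].sum() / n_relevant             # first tier
    st = relevant[:2 * n_relevant].sum() / n_relevant         # second tier
    p32 = relevant[:32].sum() / 32                            # precision@32
    r32 = relevant[:32].sum() / n_relevant                    # recall@32
    e = 0.0 if p32 + r32 == 0 else 2 * p32 * r32 / (p32 + r32)  # E-measure
    dcg = np.sum(relevant / np.log2(ranks + 1))               # discounted gain
    idcg = np.sum(1.0 / np.log2(np.arange(1, n_relevant + 1) + 1))
    ap = np.sum(np.cumsum(relevant) / ranks * relevant) / n_relevant  # average precision
    return dict(NN=nn, FT=ft, ST=st, E=e, DCG=dcg / idcg, mAP=ap)

print(retrieval_metrics([1, 0, 1, 1] + [0] * 60, n_relevant=3))
```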

(Table 3) Comparison of NN, FT, ST, E, DCG, and mAP results on the SHREC 14 dataset (%)

  22. S. Ferradans, G.-S. Xia, G. Peyre, and J.-F. Aujol, "Static and Dynamic Texture Mixing Using Optimal Transport," Scale Space and Variational Methods in Computer Vision, pp. 137-148, 2013. https://doi.org/10.1007/978-3-642-38267-3_12