A Sketch-based 3D Object Retrieval Approach for Augmented Reality Models Using Deep Learning

Ji, Myunggeun;Chun, Junchul;

doi:10.7472/jksii.2020.21.1.33

Journal of Internet Computing and Services (인터넷정보학회논문지)

Volume 21, Issue 1, pp. 33-43, 2020. pISSN 1598-0170, eISSN 2287-1136

Korean Society for Internet Information (한국인터넷정보학회)

Ji, Myunggeun (Department of Computer Science, Kyonggi University) ;
Chun, Junchul (Professor, Department of Computer Science, Kyonggi University)

지명근 ;
전준철

Received : 2019.07.11
Accepted : 2019.10.21
Published : 2020.02.29

Abstract

Retrieving a 3D model from a 3D database and augmenting the retrieved model in an Augmented Reality (AR) system at the same time has become an issue in developing plausible AR environments conveniently. Sketch-based 3D object retrieval is considered an intuitive way of searching for 3D objects, using human-drawn sketches as queries. In this paper, we propose a novel deep learning based approach to sketch-based 3D object retrieval for Augmented Reality models. We introduce a new method that uses a Sketch CNN, a Wasserstein CNN and a Wasserstein center loss. In particular, the Wasserstein center loss is used to learn the center of each object category and to reduce the Wasserstein distance between the center and the features of the same category. The proposed 3D object retrieval and augmentation consist of three major steps. First, the Wasserstein CNN extracts features from 2D images of a 3D object rendered from various directions, and obtains the features of the 3D data by computing the Wasserstein barycenter of the per-image features. Second, the features of the sketch are extracted using a separate Sketch CNN. Finally, we adopt a sketch-based object matching method to localize the natural marker in the images in order to register a 3D virtual object in the AR system. Using the detected marker, the retrieved 3D virtual object is augmented in the AR system automatically. The experiments show that the proposed method is efficient for retrieving and augmenting objects.

1. Introduction

Object retrieval from a 3D database is a preliminary step for various application areas such as Augmented Reality and 3D scene generation. In particular, sketch-based 3D shape retrieval has been receiving attention in computer vision and graphics applications[1, 2, 3]. Compared with earlier attempts that used keywords as queries, hand-drawn sketches provide a more convenient and more varied way of specifying shapes when retrieving the desired objects. However, previous works, including deep learning approaches, have not achieved satisfactory retrieval rates.

In this paper, we propose a new deep learning based 3D model retrieval method that classifies 3D objects in an efficient and convenient fashion. Specifically, we propose sketch-based 3D model retrieval using CNNs and a newly introduced Wasserstein center loss function. Loss functions such as contrastive loss[4], triplet loss[5] and center loss[6] have been widely used for CNN-based similarity measurement. The Wasserstein center loss introduced in this paper is a method to improve the measuring accuracy of the center loss function. It is a loss function that uses the Wasserstein distance instead of the Euclidean distance between the center of the class and the feature[7].

The proposed Wasserstein CNN uses images rendered from various directions to obtain the features of the 3D model, and then extracts the features of the object by computing the Wasserstein barycenter of the 2D image features of the 3D model. The features of the sketch images are obtained with the sketch CNN.

Section 2 describes the works related to the proposed method. Section 3 introduces the concepts of Wasserstein distance, Wasserstein center and Wasserstein center loss, along with the structure of the Wasserstein CNN and the sketch-based 3D augmentation. Section 4 presents the experimental results of the proposed method. Finally, Section 5 discusses the concluding remarks and future works.

(Figure 1) The framework of the proposed sketch-based 3D model retrieval.

2. Related Works

Sketch-based 3D object retrieval has become a major issue in the field of content-based model retrieval. However, the difficulty of using sketches to retrieve a 3D object is that the sketch of an object is not uniquely defined; it depends on the person’s subjective view. For this reason, 2D sketches of the same 3D object can be drawn in many different ways.

In a study of 2D projections of 3D objects, a composite descriptor called ZFEC, which combines a local region-based Zernike moment, a boundary-based Fourier descriptor, and eccentricity and roundness features, was introduced[2]. In another study, the silhouette of a 3D model is used as a 2D sketch of the model[8]. In a work on sketch-based 3D retrieval by learning features, Eitz et al. matched sketches and 2D projections of 3D objects using Gabor local line-based features and a bag-of-features (BOF) histogram[1]. In addition, Furuya proposed BF-SIFT to describe sketches and 2D projections of 3D objects[9].

Recently, studies of sketch-based 3D object retrieval using CNNs (Convolutional Neural Networks) have been introduced. Wang et al. retrieved 3D objects using two Siamese CNNs to extract features of sketches and 3D objects[3]. Xie et al. extract features from the images of the 3D object with a CNN and compute the Wasserstein center of the features to match objects and sketches[10].

In studies of similarity measures based on loss functions, Hadsell[4] proposed the Contrastive loss and Schroff[5] proposed the Triplet loss for classifying input data. Wen[6] introduced the Center loss for face recognition. He et al.[16] introduced the Triplet-Center loss, which combines the Triplet loss and the Center loss, in sketch-based 3D object retrieval.

However, a drawback of the Contrastive loss is that learning can slow down when the data pairs are not properly designed. The Triplet loss typically requires a long training time because it works on triplets of data.

Meanwhile, sketch-based image matching, a content-based retrieval[23, 24, 25] method that compares database images with sketch images drawn by users, is used to detect a desired object in an input image, and the detected object is used as the natural marker of AR for augmenting a virtual 3D object.

3. Proposed Method

The proposed sketch-based 3D model retrieval system shown in Fig. 1 consists of three major parts: two CNNs and the Wasserstein center loss. The Wasserstein CNN extracts features of 3D models and the sketch CNN extracts features from the sketch. The Wasserstein center loss is used to train on the features obtained from both CNNs.

Sections 3.1 and 3.2 introduce the Wasserstein distance and the Wasserstein center. Section 3.3 explains the Wasserstein CNN and the Sketch CNN, which extract the features of the 3D model and of the user’s sketch, respectively. Section 3.4 describes the Wasserstein center loss used for sketch-based 3D model retrieval.

3.1 Wasserstein Distance

The Wasserstein barycenter[12] is the center point of a set of probability distributions, computed using the Wasserstein distance. The Wasserstein distance, also called the Kantorovich-Rubinstein metric or Earth Mover's Distance, is one of several ways of measuring the distance between probability distributions. Let \(p \in \mathbb{R}^{r \times 1}\) and \(q \in \mathbb{R}^{s \times 1}\) be two probability distributions. The set of transport plans can be defined as follows.

\(R(p, q)=\left\{T \in \mathbb{R}_{+}^{r \times s} ; T \mathbf{1}=p, T^{T} \mathbf{1}=q\right\}\) (1)

In Eq. (1), T is a transport plan, and \(\mathbf{1}\) is a column vector in which all elements are 1. The Wasserstein distance \(D(p,q)\) between p and q can be defined as follows.

\(D(p, q)=\min _{T \in R(p, q)}\langle M, T\rangle\) (2)

In Eq. (2), \(M \in \mathbb{R}^{r \times s}\) is the pairwise distance matrix of p and q, called the ground matrix, and \(\langle M, T\rangle\) is the dot product of M and T. The Wasserstein distance D(p,q) is the cost of the optimal transport plan for moving the mass of p to q. In many cases Eq. (2) does not have a unique solution, so we use Eq. (3)[14], which adds an entropy regularization term.

\(D(p, q)=\min _{T \in R(p, q)}\langle M, T\rangle+\gamma\langle T, \log T\rangle\) (3)

In Eq. (3), \(\langle T, \log T\rangle=\sum_{i, j} T_{i j} \log T_{i j}\) is the negative entropy of T and \(\gamma\) is the regularization parameter. The optimal solution of the above equation is obtained as shown in Eq. (4) below.

\(\widehat{T}=\operatorname{diag}(u) K \operatorname{diag}(v)\) (4)

In Eq. (4), \(K=e^{-M/\gamma}\) (element-wise), and the vectors u and v are computed with the Sinkhorn algorithm[15].
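The scaling iteration behind Eq. (4) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `sinkhorn_plan` and `transport_cost`, the toy histograms, the ground matrix \(|i-j|\) and the value of \(\gamma\) are all assumptions chosen for illustration. The sketch alternately rescales u and v so that the rows of \(\operatorname{diag}(u) K \operatorname{diag}(v)\) sum to p and its columns sum to q, and then reports the transport part \(\langle M, T\rangle\) of the regularized objective.

```python
import math

def sinkhorn_plan(p, q, M, gamma, n_iter=500):
    """Approximate the entropy-regularized transport plan diag(u) K diag(v)."""
    r, s = len(p), len(q)
    # Gibbs kernel K = exp(-M / gamma), applied element-wise.
    K = [[math.exp(-M[i][j] / gamma) for j in range(s)] for i in range(r)]
    u, v = [1.0] * r, [1.0] * s
    for _ in range(n_iter):
        # Rescale rows so that the plan's row sums match p.
        u = [p[i] / sum(K[i][j] * v[j] for j in range(s)) for i in range(r)]
        # Rescale columns so that the plan's column sums match q.
        v = [q[j] / sum(K[i][j] * u[i] for i in range(r)) for j in range(s)]
    return [[u[i] * K[i][j] * v[j] for j in range(s)] for i in range(r)]

def transport_cost(M, T):
    """Dot product <M, T> of the ground matrix and a transport plan."""
    return sum(m * t for m_row, t_row in zip(M, T) for m, t in zip(m_row, t_row))

# Toy example (assumed values): two histograms on three bins, cost |i - j|.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.2, 0.7]
M = [[abs(i - j) for j in range(3)] for i in range(3)]
T = sinkhorn_plan(p, q, M, gamma=0.5)
cost = transport_cost(M, T)
```

A smaller \(\gamma\) brings the plan closer to the unregularized optimum of Eq. (2) but slows convergence of the iteration; the value used here is only a convenient choice for a small example.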

3.2 Wasserstein Center

The Wasserstein barycenter is the center point of a set of probability distributions, computed using the Wasserstein distance. Given a set of probability distributions \(p_{i} \in \mathbb{R}^{r \times 1}, i=1,2, \cdots, n\), the barycenter \(p_b\) of this set is defined as follows[11].

\(\operatorname{argmin}_{p_{b}} \sum_{i=1}^{n} \lambda_{i} D\left(p_{b}, p_{i}\right)\) (5)

In Eq. (5), \(D(p_b, p_i)\) is the Wasserstein distance between \(p_b\) and \(p_i\), and \(\lambda_i\) is the weight. The Wasserstein center \(p^t_b\) can be obtained by iterating the following updates:

\(\begin{array}{l} p_{b}^{t}=\prod_{i=1}^{n}\left(K^{T} a_{i}^{t}\right)^{\lambda_{i}} \\ c_{i}^{t+1}=\frac{p_{b}^{t}}{K^{T} a_{i}^{t}} ; a_{i}^{t+1}=\frac{p_{i}}{K c_{i}^{t+1}} \end{array}\) (6)

In Eq. (6), \(p^t_b\) is the t-th iterate of the Wasserstein center \(p_b\), and \(c^{t+1}_i\) and \(a^{t+1}_i\) are auxiliary variables[17]. The products, powers and divisions are taken element-wise.
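The barycenter iteration can be sketched in a few lines of pure Python. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `wasserstein_barycenter`, the five-bin histograms, the squared-distance ground matrix and \(\gamma\) are hypothetical, and the update for \(a_i\) is written in the standard iterative Bregman projection form of [11], in which \(p_i\) is divided element-wise by \(K c_i\).

```python
import math

def wasserstein_barycenter(hists, M, gamma, weights=None, n_iter=300):
    """Entropy-regularized barycenter of histograms sharing one support."""
    n, r = len(hists), len(hists[0])
    lam = weights if weights is not None else [1.0 / n] * n
    K = [[math.exp(-M[i][j] / gamma) for j in range(r)] for i in range(r)]
    c = [[1.0] * r for _ in range(n)]  # auxiliary scalings c_i, start at 1
    p_b = [1.0 / r] * r
    for _ in range(n_iter):
        # a_i = p_i / (K c_i), element-wise
        a = [[hists[i][k] / sum(K[k][j] * c[i][j] for j in range(r))
              for k in range(r)] for i in range(n)]
        # kta[i] = K^T a_i
        kta = [[sum(K[j][k] * a[i][j] for j in range(r))
                for k in range(r)] for i in range(n)]
        # p_b = prod_i (K^T a_i)^(lambda_i), a weighted geometric mean
        p_b = [math.prod(kta[i][k] ** lam[i] for i in range(n))
               for k in range(r)]
        # c_i = p_b / (K^T a_i), element-wise
        c = [[p_b[k] / kta[i][k] for k in range(r)] for i in range(n)]
    return p_b

# Toy example (assumed values): a left-heavy histogram and its mirror image.
p1 = [0.6, 0.2, 0.1, 0.05, 0.05]
p2 = list(reversed(p1))
M = [[(i - j) ** 2 / 16.0 for j in range(5)] for i in range(5)]
bary = wasserstein_barycenter([p1, p2], M, gamma=0.2)
```

With equal weights and mirrored inputs the result is symmetric and places more mass on the middle bin than on the outer bins, which is the behavior that distinguishes a Wasserstein barycenter from a plain element-wise average of the two histograms.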

3.3 Wasserstein CNN

As shown in Fig. 1, the features of the 3D model are extracted from rendered multi-view images of the target model using a CNN, and the Wasserstein center is computed from these features[10]. In the first stage, in order to extract the features of the model, 12 images are rendered from the 3D model at 30-degree rotation intervals, as illustrated in Fig. 2. These images are fed into the CNN to obtain the 3D feature of the model.

(Figure 2) An example of 12 multi-view images from a 3D model

(Figure 3) The structure of the Wasserstein CNN and the sketch CNN

The proposed Wasserstein CNN consists of four major parts: a CNN for extracting features from each view, a Wasserstein barycenter step for obtaining the features of the 3D model by computing the Wasserstein center of all views, CNN2 for mapping the obtained 3D features to the same domain as the sketch features, and a classifier for classifying the mapped features. Fig. 3(a) shows the structure of the proposed Wasserstein CNN.

Meanwhile, the sketch CNN consists of three major parts: a CNN for extracting features from the user’s sketch, CNN2 for mapping the obtained features to the same domain as the 3D features, and a classifier for classifying the mapped features. Fig. 3(b) shows the structure of the sketch CNN.

3.4 Wasserstein Center Loss

The Wasserstein center loss builds on the center loss, which has been used in face recognition to complement the Softmax loss in supervised learning. The center loss learns the center of each class and minimizes the distance between that center and each feature of the class.

The formula of the center loss can be defined as Eq. (7).

\(L_{C}=\frac{1}{2} \sum_{i=1}^{m}\left\|x_{i}-c_{y_{i}}\right\|_{2}^{2}\) (7)

In Eq. (7), \(x_i\) is the input feature, \(c_{y_{i}} \in \mathbb{R}^{d}\) is the center of class \(y_i\), d is the dimension of the feature, and m is the number of features.
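For reference, the Euclidean center loss of Eq. (7) and the center update proposed by Wen et al.[6] can be sketched as follows. The function names, the update rate `alpha` and the toy features are assumptions made for illustration. In that formulation the gradient with respect to a feature is \(x_i - c_{y_i}\), and each center moves by \(-\alpha \Delta c_j\) with \(\Delta c_j = \sum_i \delta(y_i=j)(c_j - x_i) / (1+\sum_i \delta(y_i=j))\).

```python
def center_loss(features, labels, centers):
    """Eq. (7): half the summed squared distance of each feature to its class center."""
    return 0.5 * sum(
        sum((x_k - c_k) ** 2 for x_k, c_k in zip(x, centers[y]))
        for x, y in zip(features, labels)
    )

def update_centers(features, labels, centers, alpha=0.5):
    """One center update step in the style of Wen et al.: c_j <- c_j - alpha * delta_c_j."""
    updated = {}
    for j, c in centers.items():
        members = [x for x, y in zip(features, labels) if y == j]
        # delta_c_j averages (c_j - x_i) over the class, with 1 added to the count.
        delta = [sum(c_k - x[k] for x in members) / (1 + len(members))
                 for k, c_k in enumerate(c)]
        updated[j] = [c_k - alpha * d_k for c_k, d_k in zip(c, delta)]
    return updated

# Toy example (assumed values): three 2D features in two classes.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
labels = [0, 0, 1]
centers = {0: [0.0, 0.0], 1: [0.0, 0.0]}

loss_before = center_loss(features, labels, centers)
new_centers = update_centers(features, labels, centers)
loss_after = center_loss(features, labels, new_centers)
```

Each update pulls a center toward the mean of its class members, so the loss on the same batch decreases after the step.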

The Wasserstein center loss is a loss function that uses the Wasserstein distance instead of the Euclidean distance between the center of the class and the feature[7]. It can be defined as follows.

\(L_{W C}=\sum_{i=1}^{m} D\left(x_{i}, c_{y_{i}}\right)\) (8)

In Eq. (8), \(x_i\) is either a feature of a 3D model obtained with the Wasserstein CNN or a feature of a sketch obtained with the Sketch CNN. \(c_{y_{i}} \in \mathbb{R}^{d}\) is the common center of the 3D model features and sketch features of class \(y_i\). The gradient of \(L_{WC}\) with respect to \(x_i\) and the update of \(c_{y_{i}}\) are defined as follows.

\(\begin{array}{c} \frac{\partial L_{W C}}{\partial x_{i}}=D\left(x_{i}, c_{y_{i}}\right) \\ \Delta c_{j}=\frac{\sum_{i=1}^{m} \delta\left(y_{i}=j\right) \cdot D\left(x_{i}, c_{y_{i}}\right)}{1+\sum_{i=1}^{m} \delta\left(y_{i}=j\right)} \end{array}\) (9)

In Eq. (9), δ is 1 when the condition is true and 0 otherwise. In this work, we use both the Wasserstein center loss and cross entropy[18] as the loss, and the total loss L is defined by

\(\begin{aligned} L &=L_{W C}+L_{C} \\ &=\sum_{i=1}^{m} D\left(x_{i}, c_{y_{i}}\right)-\sum_{i=1}^{m} y_{i} \log \left(y_{i}^{\prime}\right) \end{aligned}\) (10)

In Eq. (10), \(L_C\) denotes the cross-entropy loss (not the center loss of Eq. (7)), \(y_i\) is the true class label, and \(y'_i\) is the class predicted for the 3D model or sketch.
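How the two terms of the total loss combine can be shown with a small sketch. It assumes, beyond what the text states, that the classifier produces raw scores (logits) that are turned into probabilities by a softmax and that labels are one-hot, so that the cross entropy of one sample reduces to the negative log-probability of its true class. The function names and the numbers are hypothetical, and the Wasserstein distances are passed in as precomputed values.

```python
import math

def cross_entropy(logits, label):
    """Negative log softmax probability of the true class (one-hot label)."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def total_loss(distances, logits_batch, labels):
    """Sum of per-sample Wasserstein distances to class centers plus cross entropy."""
    l_wc = sum(distances)
    l_ce = sum(cross_entropy(z, y) for z, y in zip(logits_batch, labels))
    return l_wc + l_ce

# Toy example (assumed values): two samples, two classes, uninformative logits.
distances = [0.1, 0.2]           # D(x_i, c_{y_i}) for each sample
logits_batch = [[0.0, 0.0], [0.0, 0.0]]
labels = [0, 1]
loss = total_loss(distances, logits_batch, labels)
```

The first term pulls features toward their class centers, while the second keeps features of different classes separable by the classifier.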

Fig. 4 compares, using t-SNE[19], the clustering of 10 classes obtained with the Wasserstein center loss alone and with the combination of the Wasserstein center loss and cross entropy.

(Figure 4) Clustering results using the Wasserstein center loss only and the combination of the Wasserstein center loss and cross-entropy loss

3.5 Sketch-based 3D Augmentation

The main idea of the proposed sketch-based 3D augmentation is that the sketch drawn by the user is itself treated as the natural marker for AR, and the 3D object retrieved by the proposed sketch-based retrieval method is registered on the detected natural marker. It is realized in the following order: detect the sketch in the input video images, retrieve the 3D model using the sketch CNN, and augment the 3D model on the detected natural marker.

To augment a 3D object in AR, the sketch is first detected in the input image and its features are extracted using the sketch CNN. Subsequently, the matching 3D object is retrieved by comparing the extracted sketch features with the features of the 3D models in the database, which were registered in advance using the Wasserstein CNN.

Sketch-based matching compares features between the sketch image drawn by the user and the input video images. To convert a video image into a sketch-like image, we use the Canny edge detection algorithm. Once the edge image is obtained, the next step is to compare features between the sketch and the edge image using the SURF (Speeded Up Robust Features) algorithm. SURF uses an integer approximation of the determinant-of-Hessian blob detector, which can be computed with 3 integer operations using a precomputed integral image. Its feature descriptor is based on the sum of the Haar wavelet responses around the point of interest. In SURF, square-shaped (box) filters are used as an approximation of Gaussian smoothing. The integral image, which makes such box filters fast to evaluate, is defined as:

\(S(x, y)=\sum_{i=0}^{x} \sum_{j=0}^{y} I(i, j)\) (11)
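The integral image of Eq. (11) and its use for rectangle sums can be sketched as follows. The function names are hypothetical and the image is stored as a list of rows, so a pixel is addressed as `img[y][x]`. Once S is built, the sum over any axis-aligned rectangle needs at most four lookups regardless of the rectangle size.

```python
def integral_image(img):
    """S[y][x] = sum of img over all pixels with row <= y and column <= x."""
    h, w = len(img), len(img[0])
    S = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            S[y][x] = row_sum + (S[y - 1][x] if y > 0 else 0)
    return S

def box_sum(S, x0, y0, x1, y1):
    """Sum of the image over columns x0..x1 and rows y0..y1 (inclusive)."""
    total = S[y1][x1]
    if x0 > 0:
        total -= S[y1][x0 - 1]      # remove the strip left of the box
    if y0 > 0:
        total -= S[y0 - 1][x1]      # remove the strip above the box
    if x0 > 0 and y0 > 0:
        total += S[y0 - 1][x0 - 1]  # add back the corner removed twice
    return total

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
S = integral_image(img)
lower_right = box_sum(S, 1, 1, 2, 2)  # 5 + 6 + 8 + 9
```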

The sum of the original image within a rectangle can be evaluated quickly using the integral image. SURF uses a blob detector based on the Hessian matrix to find points of interest, and it also uses the determinant of the Hessian to select the scale. Given a point \(p=(x, y)\) in an image I, the Hessian matrix \(H(p, \sigma)\) at point p and scale \(\sigma\) is defined as follows:

\(H(p, \sigma)=\left(\begin{array}{cc} L_{x x}(p, \sigma) & L_{x y}(p, \sigma) \\ L_{x y}(p, \sigma) & L_{y y}(p, \sigma) \end{array}\right)\) (12)

In Eq. (12), \(L_{xx}(p,\sigma)\) etc. are the second-order derivatives of the Gaussian-smoothed grayscale image at scale \(\sigma\). Finally, the exact shape of the marker is extracted by GrabCut, which uses a user-specified bounding box around the object to be segmented. GrabCut estimates the color distribution of the target object and that of the background using a Gaussian mixture model.
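The determinant-of-Hessian response used for interest point detection can be illustrated on a synthetic image. The sketch below is a simplification and not SURF itself: it estimates the second derivatives with central finite differences on an already smooth image instead of SURF's box-filter approximation, and the function name `hessian_det`, the image size and the blob parameters are assumptions.

```python
import math

def hessian_det(img, x, y):
    """Determinant of the 2x2 Hessian at (x, y), from central finite differences."""
    lxx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
    lyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
    lxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    return lxx * lyy - lxy ** 2

# Synthetic test image (assumed values): one Gaussian blob centered at (5, 5).
size, cx, cy, sigma = 11, 5, 5, 2.0
blob = [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
         for x in range(size)] for y in range(size)]

# Evaluate the response at every interior pixel and pick the strongest one.
responses = {(x, y): hessian_det(blob, x, y)
             for y in range(1, size - 1) for x in range(1, size - 1)}
peak = max(responses, key=responses.get)
```

The response is largest at the blob center, where both principal curvatures have the same sign, which is why local maxima of the determinant are used as interest points.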

Fig. 5 shows an example of sketch-based 3D object augmentation in the AR system.

(Figure 5) An example of the sketch-based 3D object augmentation

4. Experimental Results

In this section, the experimental results of the proposed Wasserstein CNN and Sketch CNN and the evaluation of the retrieved 3D objects are discussed. In the experiments, we adopt the SHREC 13[2] and SHREC 14[20] datasets for retrieving 3D objects.

The SHREC 13 dataset consists of 7,200 sketches and 1,258 3D objects in 90 classes. SHREC 14 is an extended version of SHREC 13 with 13,680 sketches and 8,987 3D objects in 171 classes. For each of SHREC 13 and SHREC 14, we use 50 sketches per class for training and the remaining 30 sketches per class for testing the proposed method.

The test environment is listed in Table 1. The view images rendered from the 3D objects and the sketch images are \(224 \times 224\) in size. The proposed system is implemented using Python, the OpenCV library and the PyTorch deep learning library. For the test of sketch-based 3D augmentation, a Logitech C920 PRO HD web camera is used.

(Table 1) Environments of the experiments

For the test, 12 multi-view images rendered from different directions of each 3D object are used. ResNet-18[21] is used as the CNN structure of both the Wasserstein CNN and the Sketch CNN. In the Wasserstein CNN, the multi-view images are fed into ResNet-18, the Wasserstein center is computed from the 512-dimensional outputs of the ResNet-18 ‘average pool’ layer, and the result is fed into CNN2. In the Sketch CNN, the sketch is fed into ResNet-18 and the 512-dimensional output of the ‘average pool’ layer is fed into CNN2. CNN2 in both networks consists of layers with 512, 300 and 100 outputs, all of which use the ReLU function. The classifier outputs as many classes as the dataset contains.

Fig. 6 shows examples of sketch-based 3D object retrieval on SHREC 13 and SHREC 14, respectively. Sketches from the hand and chair classes are used as queries, and most of the retrieved results correctly match the 3D object class of the sketch.

The proposed Wasserstein center loss (WCL) method is compared with 3D object retrieval methods such as Fourier descriptor (FDC), Edge-based Fourier spectrum descriptor (EFSD), Sketch-based retrieval with view cluster (SBR-VC)[8], Cross domain morphology ranking (CDMR), Siamese network (Siamese)[3], Learned Wasserstein center representation (LWBR)[10], Depth correlation metric learning (DCML)[22], and Triplet center loss (TCL)[16]. For the evaluation, the precision-recall curve (PR-Curve), nearest neighbor (NN), first tier (FT), second tier (ST), E-measure (E), discounted cumulated gain (DCG) and mean average precision (mAP) are used on the two datasets. (Table 2) and (Table 3) show the comparative results between currently available methods and the proposed method. The proposed method shows a higher accuracy in retrieving 3D objects.
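Several of these measures can be computed directly from the ranked list returned for one query. The sketch below uses commonly used definitions, which are an assumption here since the text does not spell them out: with C relevant models in the database, NN checks the top result, FT and ST are the recall within the top C and top 2C results, and average precision averages the precision at the rank of each relevant model (mAP is its mean over all queries). The function name and the toy ranking are hypothetical.

```python
def retrieval_metrics(ranked_labels, query_label):
    """NN, FT, ST and average precision for one query's ranked result list."""
    relevant = [int(label == query_label) for label in ranked_labels]
    n_rel = sum(relevant)
    if n_rel == 0:
        raise ValueError("the database holds no model of the query class")
    nn = float(relevant[0])                  # is the top result correct?
    ft = sum(relevant[:n_rel]) / n_rel       # recall within the top C results
    st = sum(relevant[:2 * n_rel]) / n_rel   # recall within the top 2C results
    hits, ap = 0, 0.0
    for rank, rel in enumerate(relevant, start=1):
        if rel:
            hits += 1
            ap += hits / rank                # precision at this relevant rank
    return {"NN": nn, "FT": ft, "ST": st, "AP": ap / n_rel}

# Toy example (assumed values): class labels of the models ranked for a chair sketch.
ranking = ["chair", "hand", "chair", "tree", "chair", "hand"]
scores = retrieval_metrics(ranking, "chair")
```

In this toy ranking two of the three chairs appear within the top three results, so FT is 2/3, while all three appear within the top six, so ST is 1.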

(Figure 6) An example of the proposed sketch-based 3D object retrieval

(Table 2) Comparison of NN, FT, ST, E, DCG, and mAP results in SHREC 13 dataset (%)

(Table 3) Comparison of NN, FT, ST, E, DCG, and mAP results in SHREC 14 dataset (%)

Fig. 7 illustrates the precision-recall (PR) curves of LWBR, EFSD, SBR-VC, FDC and the proposed WCL. The proposed WCL achieves higher precision and recall than the other methods.

(Figure 7) Comparative precision-recall results of the proposed WCL and other methods.

(Figure 8) 3D object retrieval results using TCL and the proposed WCL

(Figure 9) Sketch-based 3D object augmentation in AR system

Fig. 8 compares the 3D object retrieval results of TCL[16], which is considered the state of the art in 3D object retrieval, and the proposed method.

When the two methods are applied to SHREC 13 and SHREC 14, the proposed method retrieves more correct 3D objects, which are marked in blue, than TCL.

Finally, the retrieved 3D object is augmented on the sketch, which serves as the natural marker in the AR system. As illustrated in Fig. 9, 3D objects of an airplane, a human hand and a tree are augmented on the corresponding sketch images.

5. Concluding Remarks

In this paper, we proposed a deep learning based approach to sketch-based 3D object retrieval for Augmented Reality models. We utilize a Sketch CNN, a Wasserstein CNN and a Wasserstein center loss for retrieving a 3D object from a sketch. Two networks based on ResNet extract the features of the 3D data and of the user-drawn sketch. The Wasserstein barycenter of the features of 2D images rendered from various directions of the 3D data is then computed. A second CNN, called 'CNN2', maps the Wasserstein barycenter and the sketch features to the corresponding outputs. To train the two networks, a loss function based on the Wasserstein distance of the outputs is adopted. In terms of retrieval accuracy, the proposed method shows improved performance on both the SHREC 13 and SHREC 14 datasets. Moreover, we proposed a sketch-based object matching scheme to localize the natural marker in the images in order to register a 3D virtual object in Augmented Reality. Using the detected sketch as a marker, the retrieved 3D object is augmented in AR automatically. The experiments show that the proposed method is efficient for retrieving and augmenting objects.

References

[1] M. Eitz, R. Richter, T. Boubekeur, K. Hildebrand, and M. Alexa, "Sketch-based shape retrieval," ACM Transactions on Graphics, vol. 31, no. 4, pp. 1-10, 2012. https://doi.org/10.1145/2185520.2335382
[2] B. Li, Y. Lu, A. Godil, T. Schreck, B. Bustos, A. Ferreira, T. Furuya, M. J. Fonseca, H. Johan, T. Matsuda, R. Ohbuchi, P. B. Pascoal, and J. M. Saavedra, "A comparison of methods for sketch-based 3D shape retrieval," Computer Vision and Image Understanding, vol. 119, pp. 57-80, 2014. https://doi.org/10.1016/j.cviu.2013.11.008
[3] Fang Wang, Le Kang, and Yi Li, "Sketch-based 3D shape retrieval using Convolutional Neural Networks," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. https://doi.org/10.1109/cvpr.2015.7298797
[4] R. Hadsell, S. Chopra, and Y. LeCun, "Dimensionality Reduction by Learning an Invariant Mapping," 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2 (CVPR'06), 2006. https://doi.org/10.1109/cvpr.2006.100
[5] F. Schroff, D. Kalenichenko, and J. Philbin, "FaceNet: A unified embedding for face recognition and clustering," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. https://doi.org/10.1109/cvpr.2015.7298682
[6] Y. Wen, K. Zhang, Z. Li, and Y. Qiao, "A Discriminative Feature Learning Approach for Deep Face Recognition," Lecture Notes in Computer Science, pp. 499-515, 2016. https://doi.org/10.1007/978-3-319-46478-7_31
[7] A. Rolet, M. Cuturi, and G. Peyre, "Fast dictionary learning with a smoothed wasserstein loss," International Conference on Artificial Intelligence and Statistics, Cadiz, Spain, pp. 630-638, 2016. http://www.jmlr.org/proceedings/papers/v51/rolet16.pdf
[8] B. Li, Y. Lu, A. Godil, T. Schreck, M. Aono, H. Johan, J. M. Saavedra, and S. Tashiro, "Shrec'13 track: Large scale sketch-based 3D shape retrieval," Eurographics Workshop on 3D Object Retrieval, Girona, Spain, pp. 89-96, 2013. https://dx.doi.org/10.2312/3DOR/3DOR13/089-096
[9] T. Furuya and R. Ohbuchi, "Ranking on cross-domain manifold for sketch-based 3D model retrieval," International Conference on Cyberworlds, Yokohama, Japan, pp. 274-281, 2013. https://doi.org/10.1109/cw.2013.60
[10] J. Xie, G. Dai, F. Zhu, and Y. Fang, "Learning Barycentric Representations of 3D Shapes for Sketch-Based 3D Shape Retrieval," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. https://doi.org/10.1109/cvpr.2017.385
[11] J.-D. Benamou, G. Carlier, M. Cuturi, L. Nenna, and G. Peyre, "Iterative Bregman Projections for Regularized Transportation Problems," SIAM Journal on Scientific Computing, vol. 37, no. 2, pp. A1111-A1138, 2015. https://doi.org/10.1137/141000439
[12] V. I. Bogachev and A. V. Kolesnikov, "The Monge-Kantorovich problem: achievements, connections, and perspectives," Russian Mathematical Surveys, vol. 67, no. 5, pp. 785-890, 2012. https://doi.org/10.1070/rm2012v067n05abeh004808
[13] Y. Rubner, C. Tomasi, and L. J. Guibas, "The Earth Mover's Distance as a metric for image retrieval," International Journal of Computer Vision, vol. 40, no. 2, pp. 99-121, 2000. https://doi.org/10.1023/a:1026543900054
[14] M. Cuturi, "Sinkhorn distances: Lightspeed computation of optimal transport," Advances in Neural Information Processing Systems, Lake Tahoe, Nevada, USA, pp. 2292-2300, 2013. https://papers.nips.cc/paper/4927-sinkhorn-distances-lightspeed-computation-of-optimal-transport.pdf
[15] R. Sinkhorn, "Diagonal Equivalence to Matrices with Prescribed Row and Column Sums," The American Mathematical Monthly, vol. 74, no. 4, p. 402, 1967. https://doi.org/10.2307/2314570
[16] He, Xinwei, et al., "Triplet-Center Loss for Multi-View 3D Object Retrieval," arXiv preprint arXiv:1803.06189, 2018. http://openaccess.thecvf.com/content_cvpr_2018/Camera Ready/1632.pdf
[17] N. Bonneel, G. Peyre, and M. Cuturi, "Wasserstein barycentric coordinates," ACM Transactions on Graphics, vol. 35, no. 4, pp. 1-10, 2016. https://doi.org/10.1145/2897824.2925918
[18] P.-T. de Boer, D. P. Kroese, S. Mannor, and R. Y. Rubinstein, "A Tutorial on the Cross-Entropy Method," Annals of Operations Research, vol. 134, no. 1, pp. 19-67, 2005. https://doi.org/10.1007/s10479-005-5724-z
[19] L. van der Maaten and G. Hinton, "Visualizing high-dimensional data using t-SNE," Journal of Machine Learning Research, vol. 9, pp. 2579-2605, 2008. http://www.jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf
[20] B. Li, Y. Lu, C. Li, A. Godil, T. Schreck, M. Aono, M. Burtscher, H. Fu, T. Furuya, H. Johan, J. Liu, R. Ohbuchi, A. Tatsuma, and C. Zou, "Extended large scale sketch-based 3D shape retrieval," Eurographics Workshop on 3D Object Retrieval, Strasbourg, France, pp. 121-130, 2014. http://dx.doi.org/10.2312/3dor.20141058
[21] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. https://doi.org/10.1109/cvpr.2016.90
[22] S. Ferradans, G.-S. Xia, G. Peyre, and J.-F. Aujol, "Static and Dynamic Texture Mixing Using Optimal Transport," Scale Space and Variational Methods in Computer Vision, pp. 137-148, 2013. https://doi.org/10.1007/978-3-642-38267-3_12
[23] K.V. Shriram, P.L.K. Priyadarsini, and A. Baskar, "An intelligent system of content-based image retrieval for crime investigation," Int. J. of Advanced Intelligence Paradigms, vol. 7, no. 3/4, pp. 264-279, 2015. https://doi.org/10.1504/IJAIP.2015.073707
[24] Eitz, M., Hildebrand, K., et al., "Sketch-Based Image Retrieval: Benchmark and Bag-of-Features Descriptors," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 11, pp. 1624-1636, 2010. https://doi.org/10.1109/TVCG.2010.266
[25] Loris Nanni, Alessandra Lumini, and Sheryl Brahnam, "Ensemble of shape descriptors for shape retrieval and classification," Int. J. of Advanced Intelligence Paradigms, vol. 6, no. 2, pp. 136-156. https://doi.org/10.1504/IJAIP.2014.062177
