1. Introduction
With the rapid growth in the number of vehicles, fake plate vehicles have become more common, and a large number of surveillance cameras have been installed on roads to monitor such violations [1]. However, it is very difficult to process so many videos in time with traditional manual methods, so it is of great significance to design an automatic fake plate vehicle detection algorithm for surveillance video. Owing to the installation position and angle of the surveillance camera, often only the vehicle face area can be captured, as shown in Fig. 1.
Fig. 1. The captured vehicle image
Obviously, the vehicle face area contains fewer features than the whole vehicle, which makes multi-category vehicle recognition more difficult, so we need to pay more attention to several key regions in vehicle face images, such as the logo, grille, lights and rearview mirrors. In other words, the vehicle can be recognized by these key regions, as shown in Fig. 2. From the above analysis, it is very important to establish basis images of these key regions, so that each vehicle image can be represented as a linear superposition of the basis images. In addition, we hope that, while acquiring a suitable set of basis images, the new features obtained by the decomposition are conducive to the correct recognition of vehicle face images. Therefore, the major innovation of this paper is that, for multi-category vehicle face images with limited annotations, feature bases representing the key regions of the vehicle face are acquired through an improved NMF model, which meets the requirements on both the basis images and the new features.
Fig. 2. The relation between the whole vehicle face and some key regions
The remainder of this paper is organized as follows. In Section 2, related work is reviewed. The original feature extraction from vehicle face images is described in Section 3. In Section 4, an improved NMF method for vehicle face recognition is proposed. Section 5 solves the proposed NMF objective function with a projected gradient algorithm. The effectiveness of the proposed algorithm is verified experimentally in Section 6. Finally, the conclusion is drawn in Section 7.
2. Related Work
The development of vehicle recognition technology has mainly gone through two key stages, based on traditional hand-crafted feature extraction and on deep learning respectively.
(1) Hand-crafted feature extraction and classification. As an important global feature, color information has been widely used in vehicle recognition. For instance, Baek [2] and Kim [3] extracted color histograms as vehicle features from the RGB and HSV color spaces respectively. However, the vehicle color in a captured image is susceptible to lighting, so Hu [4] introduced the specular-free image and the weighted-light-influence image, which make the extracted color features more robust to illumination variation. Besides color information, the texture, edge and shape of the vehicle were also used as important global features by Chen [5], Negri [6] and Zhang [7]. With the increasing number of vehicle types, some vehicles of different types show only minor visual differences, so it is necessary to describe vehicle details by extracting local features. Lam [8] proposed a multi-scale spatial model to describe local vehicle texture; also using multi-scale theory, Psyllos [9] realized vehicle logo recognition based on the Scale-Invariant Feature Transform (SIFT) feature. For non-cooperative factors such as partial occlusion and attitude or angle variation, the Deformable Part Model (DPM) and feature descriptors were adopted by Li [10] and Zhang [11], making the recognition algorithms robust to these factors. In addition, to represent the vehicle structure better, Leotta [12] and Yebes [13] modeled the vehicle in three-dimensional space.
(2) Vehicle recognition based on deep learning. The purpose of deep learning is to analyze an image layer by layer, from shallow to deep, by simulating the brain, thereby improving recognition accuracy [14]. In recent years, deep learning has been widely used in vehicle recognition. For instance, Zhang [15] and Hu [16] achieved accurate vehicle body color recognition by combining Convolutional Neural Network (CNN) models with a spatial pyramid model; Liu [17] and Hu [18] achieved vehicle recognition based on Fast Region Convolutional Neural Networks (Fast R-CNN) and Boltzmann machines respectively; in addition, the Deep Belief Network (DBN) model was adopted by Wu [19] and Wang [20] for vehicle classification. In brief, the deep learning models mentioned above have all achieved certain results in varying degrees.
From the above analysis, we can see that most researchers have focused on recognition based on the whole vehicle image, while there are relatively few studies on vehicle face recognition at present. Therefore, it is of great significance to propose an effective vehicle face recognition algorithm.
3. Original Feature Extraction
Influenced by illumination variation, the same vehicle may show some color difference across captured images, as shown in Fig. 3, which reduces the effectiveness of color-based recognition algorithms. In addition, when the number of annotated samples is limited, deep learning based algorithms also struggle to achieve good results. Therefore, we pay more attention to local regions with significant features, such as the logo, grille, lights and rearview mirrors.
Fig. 3. Vehicle color variation under different light intensities
According to image processing knowledge, regions are surrounded by edges, and edges are formed by high-frequency pixels with similar gradient directions, so it is critical to represent the frequency information of these pixels accurately in the original feature extraction [21]. Because the Histogram of Oriented Gradient (HOG) takes into account both the frequency and the direction features, it is reasonable to use HOG as the original feature of the vehicle face image.
First, the vehicle face area is segmented from the captured image based on the Faster R-CNN model [22] and normalized to N × N pixels in size, as shown in Fig. 4;
Fig. 4. Image preprocessing
Then, the preprocessed image is divided into blocks, each of which is M × M pixels in size, where adjacent blocks overlap by T pixels. As a result, the number of blocks k can be obtained by Eq.1.
\(k=\left(\left\lfloor\frac{N-M}{M-T}\right\rfloor+1\right) \times\left(\left\lfloor\frac{N-M}{M-T}\right\rfloor+1\right)\) (1)
In addition, supposing the number of angle intervals is t, the dimension of the original feature vector is n = k × t.
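As a quick check of Eq.1, the block count k and the feature dimension n can be computed directly. The following minimal Python sketch uses the empirical settings reported later in Section 6.2:

def hog_dimensions(N, M, T, t):
    # Number of blocks k (Eq.1) and original feature dimension n = k * t
    blocks_per_side = (N - M) // (M - T) + 1  # floor((N - M)/(M - T)) + 1
    k = blocks_per_side ** 2
    return k, k * t

k, n = hog_dimensions(N=256, M=32, T=8, t=8)  # settings from Section 6.2
print(k, n)  # 100 800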
4. Feature Dimension Reduction Based on Improved NMF
It is very important for vehicle recognition to acquire effective feature bases that represent the key regions of the vehicle face. Common dimension reduction methods include Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) [23-24], in which the elements of the decomposition matrices can be positive or negative. The negative elements are acceptable mathematically, but they are hard to interpret when building basis images because they lack physical meaning [25]. For example, a face image can be constructed as a linear superposition of basis images, where the pixel values and the weight coefficients in the factorization matrices should be nonnegative. Therefore, it is more appropriate to adopt an NMF based dimension reduction method, whose idea is that a given nonnegative matrix can be approximately represented by the product of two nonnegative matrices, as in Eq.2 [26],
\(\boldsymbol{Y}_{n \times m} \approx \boldsymbol{U}_{n \times r} \boldsymbol{V}_{r \times m}, \quad \text { s.t. } u_{k i} \geq 0, v_{i j} \geq 0\) (2)
where the column vectors of Y are the original feature vectors, the column vectors of U and V represent the basis vectors and the weight coefficient vectors respectively, and the decomposition error should be small enough, as in Eq.3.
\(\boldsymbol{U}^{*}, \boldsymbol{V}^{*}=\underset{\boldsymbol{U}, \boldsymbol{V}}{\arg \min } \frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{V}\|_{2}\) (3)
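For reference, the following is a minimal sketch of the standard multiplicative-update NMF of [26] for Eq.2 and Eq.3, assuming the squared Frobenius norm as the error measure (the small eps that keeps the updates well defined is an implementation detail); the improved model of this paper follows in Eqs. 4-7:

import numpy as np

def nmf(Y, r, n_iter=500, eps=1e-9):
    # Standard NMF: nonnegative Y (n x m) is approximated by U (n x r) V (r x m)
    n, m = Y.shape
    rng = np.random.default_rng(0)
    U = rng.random((n, r))
    V = rng.random((r, m))
    for _ in range(n_iter):
        U *= (Y @ V.T) / (U @ V @ V.T + eps)  # multiplicative update keeps U nonnegative
        V *= (U.T @ Y) / (U.T @ U @ V + eps)  # multiplicative update keeps V nonnegative
    return U, V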
In order to make the decomposition more conducive to accurate recognition, it is necessary to add some appropriate constraint conditions to the decomposition besides the nonnegative constraint. Here, we can consider this problem from the following three aspects.
(1) Weighted constraint. The decomposed basis vectors can be considered to represent the different regions of vehicle face, and the important degrees of these regions are different during recognition, so it is reasonable to add weighted constraints to basis vectors, where the objective function Eq.3 can be improved to Eq.4,
\(\boldsymbol{U}^{*}, \boldsymbol{Z}^{*}, \boldsymbol{V}^{*}=\underset{U, V, Z}{\arg \min } \frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}\|_{2}\) (4)
where Z represents the weight matrix.
(2) Sparse constraint. Generally, only a small number of basis vectors play important roles in recognition, which are considered to represent the key regions of vehicle face, so it is necessary to add sparse constraint to the weight matrix Z, and the objective function Eq.4 can be improved to Eq.5.
\(\boldsymbol{U}^{*}, \boldsymbol{Z}^{*}, \boldsymbol{V}^{*}=\underset{\boldsymbol{U}, \boldsymbol{V}, \boldsymbol{Z}}{\arg \min }\left\{\frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}\|_{2}+\frac{\alpha}{2}\|\boldsymbol{Z}\|_{0}\right\}\) (5)
According to compressed sensing [27], solving the matrix 0-norm is an NP-hard problem, so we use the 2-norm instead of the 0-norm to impose sparsity on the matrix, and Eq.5 is further improved to Eq.6,
\(\boldsymbol{U}^{*}, \boldsymbol{Z}^{*}, \boldsymbol{V}^{*}=\underset{\boldsymbol{U}, \boldsymbol{V}, \boldsymbol{Z}}{\arg \min }\left\{\frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}\|_{2}+\frac{\alpha}{2}\|\boldsymbol{Z}\|_{2}\right\}\) (6)
where α is a balance parameter.
(3) Classification property constraint. According to pattern recognition theory, the features of the samples with the same label should be as similar as possible [28-29]. Therefore, we add the within-class similarity and inter-class distinction measures to the objective function, and the final objective function is shown as Eq.7,
\(\boldsymbol{U}^{*}, \boldsymbol{Z}^{*}, \boldsymbol{V}^{*}=\underset{\boldsymbol{U}, \boldsymbol{V}, \boldsymbol{Z}}{\arg \min }\left\{\frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}\|_{2}+\frac{\alpha}{2}\|\boldsymbol{Z}\|_{2}+\frac{\beta}{2}\left(f_{i}(\boldsymbol{V})-f_{e}(\boldsymbol{V})\right)\right\}\) (7)
where β is another balance parameter besides α, and \(f_{i}(\boldsymbol{V})\) and \(f_{e}(\boldsymbol{V})\) represent the within-class similarity and the inter-class distinction measures respectively. Their detailed functional forms are given below.
The within-class similarity function fi(V):
The auxiliary matrix A is required as shown in Eq.8,
\(\boldsymbol{A}=\left[\begin{array}{cccc} \boldsymbol{A}_{1} & & & \\ & \boldsymbol{A}_{2} & & \\ & & \ddots & \\ & & & \boldsymbol{A}_{c} \end{array}\right]_{m \times m}\) (8)
\(\boldsymbol{A}_{i}=\left[\begin{array}{cccc} \frac{1}{d_{i}} & \frac{1}{d_{i}} & \cdots & \frac{1}{d_{i}} \\ \frac{1}{d_{i}} & \frac{1}{d_{i}} & \cdots & \frac{1}{d_{i}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{1}{d_{i}} & \frac{1}{d_{i}} & \cdots & \frac{1}{d_{i}} \end{array}\right]_{d_{i} \times d_{i}}, \quad i=1,2, \ldots, c\) (9)
\(\boldsymbol{V} \boldsymbol{A}=\left[\begin{array}{ccccccc} \overline{\boldsymbol{V}}_{1} & \overline{\boldsymbol{V}}_{1} & \cdots & \overline{\boldsymbol{V}}_{1} & \overline{\boldsymbol{V}}_{2} & \cdots & \overline{\boldsymbol{V}}_{c} \end{array}\right]_{r \times m}\) (10)
where c is the number of labels in Y, \(d_{i}\) is the number of samples with label i, and \(\overline{\boldsymbol{V}}_{i}\) represents the average feature vector of the samples with label i.
\(f_{i}(\boldsymbol{V})=\|\boldsymbol{V}-\boldsymbol{V} \boldsymbol{A}\|_{2}\) (11)
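The matrix A of Eq.8 can be built directly from the training labels; a minimal sketch, assuming (as Eq.8 implies) that the columns of Y are sorted so that samples of the same class are contiguous:

import numpy as np

def within_class_averaging_matrix(labels):
    # A of Eq.8: block diagonal, one d_i x d_i block filled with 1/d_i per class
    labels = np.asarray(labels)
    m = labels.size
    A = np.zeros((m, m))
    start = 0
    for lab in np.unique(labels):  # classes assumed sorted and contiguous
        d = int(np.count_nonzero(labels == lab))
        A[start:start + d, start:start + d] = 1.0 / d
        start += d
    return A

Multiplying V by A replaces each column with its class mean (Eq.10), so the measure in Eq.11 captures how far the samples scatter around their class means.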
The inter-class distinction function fe(V):
The auxiliary matrix B is required as shown in Eq.12,
\(\boldsymbol{B}=\left[\begin{array}{cccc} \frac{1}{m} & \frac{1}{m} & \cdots & \frac{1}{m} \\ \frac{1}{m} & \frac{1}{m} & \cdots & \frac{1}{m} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{1}{m} & \frac{1}{m} & \cdots & \frac{1}{m} \end{array}\right]_{m \times m}\) (12)
\(\boldsymbol{V} \boldsymbol{B}=\left[\begin{array}{cccc} \overline{\boldsymbol{V}} & \overline{\boldsymbol{V}} & \cdots & \overline{\boldsymbol{V}} \end{array}\right]\) (13)
where \(\overline{\boldsymbol{V}}\) is the average feature vector of all samples.
\(\boldsymbol{V} \boldsymbol{A}-\boldsymbol{V} \boldsymbol{B}=\left[\begin{array}{ccccccc} \overline{\boldsymbol{V}}_{1}-\overline{\boldsymbol{V}} & \overline{\boldsymbol{V}}_{1}-\overline{\boldsymbol{V}} & \cdots & \overline{\boldsymbol{V}}_{1}-\overline{\boldsymbol{V}} & \overline{\boldsymbol{V}}_{2}-\overline{\boldsymbol{V}} & \cdots & \overline{\boldsymbol{V}}_{c}-\overline{\boldsymbol{V}} \end{array}\right]_{r \times m}\) (14)
\(f_{e}(\boldsymbol{V})=\|\boldsymbol{V} \boldsymbol{A}-\boldsymbol{V} \boldsymbol{B}\|_{2}\) (15)
In summary, the objective function can be written in the form of Eq.16.
\(\begin{aligned} \boldsymbol{U}^{*}, \boldsymbol{Z}^{*}, \boldsymbol{V}^{*} &=\underset{U, Z, V}{\arg \min } J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V}) \\ &=\underset{U, Z, V}{\arg \min } \frac{1}{2}\|\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}\|_{2}+\frac{\alpha}{2}\|\boldsymbol{Z}\|_{2}+\frac{\beta}{2}\left(\|\boldsymbol{V}-\boldsymbol{V} \boldsymbol{A}\|_{2}-\|\boldsymbol{V} \boldsymbol{A}-\boldsymbol{V} \boldsymbol{B}\|_{2}\right) \end{aligned}\) (16)
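Collecting the terms, the objective of Eq.16 can be evaluated directly. Below is a sketch that reads the norms as squared Frobenius norms, as the trace form in Eq.17 implies (B is the constant 1/m matrix of Eq.12):

import numpy as np

def objective(Y, U, Z, V, A, B, alpha, beta):
    # J(U, Z, V) of Eq.16 with squared Frobenius norms (cf. Eq.17)
    fro2 = lambda X: float(np.sum(X * X))
    recon = 0.5 * fro2(Y - U @ Z @ V)                # reconstruction error
    sparse = 0.5 * alpha * fro2(Z)                   # 2-norm sparsity surrogate
    discrim = 0.5 * beta * (fro2(V - V @ A) - fro2(V @ A - V @ B))
    return recon + sparse + discrim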
5. Objective Function Solution Based on Projected Gradient Method
In order to take the partial derivatives conveniently, the function \(J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V})\) can be rewritten as Eq.17,
\(\begin{aligned} J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V})=& \frac{1}{2} \operatorname{tr}\left[(\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V})^{T}(\boldsymbol{Y}-\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V})\right]+\frac{\alpha}{2} \operatorname{tr} \boldsymbol{Z}^{T} \boldsymbol{Z}+\\ & \frac{\beta}{2}\left\{\operatorname{tr}\left[(\boldsymbol{V}-\boldsymbol{V} \boldsymbol{A})^{T}(\boldsymbol{V}-\boldsymbol{V} \boldsymbol{A})\right]-\operatorname{tr}\left[(\boldsymbol{V} \boldsymbol{A}-\boldsymbol{V} \boldsymbol{B})^{T}(\boldsymbol{V} \boldsymbol{A}-\boldsymbol{V} \boldsymbol{B})\right]\right\} \end{aligned}\) (17)
The partial derivatives are shown in Eq.18, Eq.19 and Eq.20.
\(\frac{\partial J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V})}{\partial \boldsymbol{U}}=-\boldsymbol{Y} \boldsymbol{V}^{T} \boldsymbol{Z}^{T}+\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V} \boldsymbol{V}^{T} \boldsymbol{Z}^{T}\) (18)
\(\frac{\partial J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V})}{\partial \boldsymbol{Z}}=-\boldsymbol{U}^{T} \boldsymbol{Y} \boldsymbol{V}^{T}+\boldsymbol{U}^{T} \boldsymbol{U} \boldsymbol{Z} \boldsymbol{V} \boldsymbol{V}^{T}+\alpha \boldsymbol{Z}\) (19)
\(\begin{aligned} \frac{\partial J(\boldsymbol{U}, \boldsymbol{Z}, \boldsymbol{V})}{\partial \boldsymbol{V}} &=-\boldsymbol{Z}^{T} \boldsymbol{U}^{T} \boldsymbol{Y}+\boldsymbol{Z}^{T} \boldsymbol{U}^{T} \boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}+\beta \boldsymbol{V}-\beta \boldsymbol{V} \boldsymbol{A}^{T}-\beta \boldsymbol{V} \boldsymbol{A}+\beta \boldsymbol{V} \boldsymbol{A} \boldsymbol{B}^{T}+\\ & \beta \boldsymbol{V} \boldsymbol{B} \boldsymbol{A}^{T}-\beta \boldsymbol{V} \boldsymbol{B} \boldsymbol{B}^{T} \end{aligned}\) (20)
Then, the multiplicative update rules can finally be deduced, as shown in Eq.21, Eq.22 and Eq.23.
\(u_{i j} \leftarrow u_{i j} \frac{\left(\boldsymbol{Y} \boldsymbol{V}^{T} \boldsymbol{Z}^{T}\right)_{i j}}{\left(\boldsymbol{U} \boldsymbol{Z} \boldsymbol{V} \boldsymbol{V}^{T} \boldsymbol{Z}^{T}\right)_{i j}}\) (21)
\(z_{i j} \leftarrow z_{i j} \frac{\left(\boldsymbol{U}^{T} \boldsymbol{Y} \boldsymbol{V}^{T}\right)_{i j}}{\left(\boldsymbol{U}^{T} \boldsymbol{U} \boldsymbol{Z} \boldsymbol{V} \boldsymbol{V}^{T}+\alpha \boldsymbol{Z}\right)_{i j}}\) (22)
\(v_{i j} \leftarrow v_{i j} \frac{\left(\boldsymbol{Z}^{T} \boldsymbol{U}^{T} \boldsymbol{Y}+\beta \boldsymbol{V} \boldsymbol{A}+\beta \boldsymbol{V} \boldsymbol{A}^{T}+\beta \boldsymbol{V} \boldsymbol{B} \boldsymbol{B}^{T}\right)_{i j}}{\left(\boldsymbol{Z}^{T} \boldsymbol{U}^{T} \boldsymbol{U} \boldsymbol{Z} \boldsymbol{V}+\beta \boldsymbol{V}+\beta \boldsymbol{V} \boldsymbol{A} \boldsymbol{B}^{T}+\beta \boldsymbol{V} \boldsymbol{B} \boldsymbol{A}^{T}\right)_{i j}}\) (23)
After the iteration rules are determined, the training and recognition methods of the proposed NMF model can be summarized as shown in Method 1 and Method 2 respectively.
Method 1
Input: All original feature vectors and their labels in Y , the rank r , and the two balance parameters α and β .
Step.1: Initialize U(0) , Z(0) , V(0) , the maximum number of iterations nmax , and the error threshold e . In addition, the count variable t is set as 0.
Step.2: t = t +1 .
Step.3: Solve J(U(t), Z(t), V(t)).
if J(U(t), Z(t), V(t)) < e or t > nmax
goto Step.5
else
goto Step.4
Step.4: Update U , Z and V .
\(u_{i j}^{(t+1)} \leftarrow u_{i j}^{(t)} \frac{\left(\boldsymbol{Y}\left(\boldsymbol{V}^{(t)}\right)^{T}\left(\boldsymbol{Z}^{(t)}\right)^{T}\right)_{i j}}{\left(\boldsymbol{U}^{(t)} \boldsymbol{Z}^{(t)} \boldsymbol{V}^{(t)}\left(\boldsymbol{V}^{(t)}\right)^{T}\left(\boldsymbol{Z}^{(t)}\right)^{T}\right)_{i j}}\)
\(z_{i j}^{(t+1)} \leftarrow z_{i j}^{(t)} \frac{\left(\left(\boldsymbol{U}^{(t)}\right)^{T} \boldsymbol{Y}\left(\boldsymbol{V}^{(t)}\right)^{T}\right)_{i j}}{\left(\left(\boldsymbol{U}^{(t)}\right)^{T} \boldsymbol{U}^{(t)} \boldsymbol{Z}^{(t)} \boldsymbol{V}^{(t)}\left(\boldsymbol{V}^{(t)}\right)^{T}+\alpha \boldsymbol{Z}^{(t)}\right)_{i j}}\)
\(v_{i j}^{(t+1)} \leftarrow v_{i j}^{(t)} \frac{\left(\left(\boldsymbol{Z}^{(t)}\right)^{T}\left(\boldsymbol{U}^{(t)}\right)^{T} \boldsymbol{Y}+\beta \boldsymbol{V}^{(t)} \boldsymbol{A}+\beta \boldsymbol{V}^{(t)} \boldsymbol{A}^{T}+\beta \boldsymbol{V}^{(t)} \boldsymbol{B} \boldsymbol{B}^{T}\right)_{i j}}{\left(\left(\boldsymbol{Z}^{(t)}\right)^{T}\left(\boldsymbol{U}^{(t)}\right)^{T} \boldsymbol{U}^{(t)} \boldsymbol{Z}^{(t)} \boldsymbol{V}^{(t)}+\beta \boldsymbol{V}^{(t)}+\beta \boldsymbol{V}^{(t)} \boldsymbol{A} \boldsymbol{B}^{T}+\beta \boldsymbol{V}^{(t)} \boldsymbol{B} \boldsymbol{A}^{T}\right)_{i j}}\)
goto Step.2
Step.5: Acquire U∗ , Z∗ and V∗ .
End training.
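A minimal implementation sketch of Method 1 with the updates of Eqs.21-23 follows; the eps added to the denominators avoids division by zero and is an implementation detail, not part of the formulation:

import numpy as np

def train_improved_nmf(Y, A, B, r, alpha, beta, n_max=25000, e=1e-4, eps=1e-9):
    # Method 1: alternate the multiplicative updates of Eqs.21-23
    n, m = Y.shape
    rng = np.random.default_rng(0)
    U, Z, V = rng.random((n, r)), rng.random((r, r)), rng.random((r, m))
    for _ in range(n_max):
        U *= (Y @ V.T @ Z.T) / (U @ Z @ V @ V.T @ Z.T + eps)               # Eq.21
        Z *= (U.T @ Y @ V.T) / (U.T @ U @ Z @ V @ V.T + alpha * Z + eps)   # Eq.22
        num = Z.T @ U.T @ Y + beta * (V @ A + V @ A.T + V @ B @ B.T)       # Eq.23
        den = Z.T @ U.T @ U @ Z @ V + beta * (V + V @ A @ B.T + V @ B @ A.T) + eps
        V *= num / den
        J = (0.5 * np.sum((Y - U @ Z @ V) ** 2) + 0.5 * alpha * np.sum(Z ** 2)
             + 0.5 * beta * (np.sum((V - V @ A) ** 2) - np.sum((V @ A - V @ B) ** 2)))
        if J < e:  # stopping test of Step.3 (Eq.16)
            break
    return U, Z, V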
Method 2
Input: The original feature vector of the unknown vehicle face image Yw , and the similarity threshold ξ .
Step 1: Solve \(\boldsymbol{V}_{w}=\left(\boldsymbol{Z}^{* T} \boldsymbol{U}^{* T} \boldsymbol{U}^{*} \boldsymbol{Z}^{*}\right)^{-1} \boldsymbol{Z}^{* T} \boldsymbol{U}^{* T} \boldsymbol{Y}_{w}\) .
Step 2: if \(i=\underset{j=1,2, \ldots, c}{\arg \max }\ D\left(\boldsymbol{V}_{w}, \overline{\boldsymbol{V}}_{j}\right)\) and \(D\left(\boldsymbol{V}_{w}, \overline{\boldsymbol{V}}_{i}\right) \geq \xi\)
The label of Yw is i ;
else
There is no matching result of Yw in the data set;
where \(D\left(\boldsymbol{V}_{i}, \boldsymbol{V}_{j}\right)=\frac{\left\langle\boldsymbol{V}_{i}, \boldsymbol{V}_{j}\right\rangle}{\left\|\boldsymbol{V}_{i}\right\| \cdot\left\|\boldsymbol{V}_{j}\right\|}\).
According to [25], the iteration has been proved to be convergent.
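Method 2 amounts to a least-squares projection (Step 1) followed by a cosine-similarity match (Step 2). Below is a minimal sketch, where class_means is assumed to hold the per-class average feature vectors computed from the columns of V*:

import numpy as np

def recognize(Yw, U, Z, class_means, xi):
    # Step 1: Vw = (Z^T U^T U Z)^{-1} Z^T U^T Yw (least-squares projection)
    UZ = U @ Z
    Vw = np.linalg.solve(UZ.T @ UZ, UZ.T @ Yw)
    # Step 2: cosine similarity D against every class mean
    def D(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    sims = [D(Vw, vbar) for vbar in class_means]
    i = int(np.argmax(sims))
    return i if sims[i] >= xi else None  # None: no matching vehicle in the data set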
6. Experiment
6.1 Data Set
At present, there is no large-scale public data set of vehicle face images, so we build a new data set, where all vehicle face images were taken from 22 surveillance cameras which were distributed on different roads. The number of captured images is 103028, of which 80197 are effective, and some of the samples are shown in Fig. 5.
Fig. 5. Partial samples in data set
Among the effective samples, there are 4136 pairs of vehicle face images in which both images show the same vehicle, so we select these as the positive test samples. In addition, 5000 pairs of vehicle face images in which the two images show different vehicles are selected as the negative test samples.
6.2 Parameters Setting
In the proposed algorithm, some parameters are determined according to experience, and the others are acquired based on experimental results.
(1) The experience-based parameters.
Based on research experience with other recognition problems, empirical values can be given for some parameters of the proposed algorithm. In Eq.1, N = 256, M = 32 and T = 8; in Eq.2, the number of training samples m = 600; in addition, the number of HOG angle intervals t and the maximum number of iterations nmax are set as 8 and 25000 respectively. Substituting these values into Eq.1 gives k = (⌊(256−32)/(32−8)⌋+1)² = 100 blocks, so the original feature dimension is n = k × t = 800.
(2) The experiment-based parameters.
Besides the above empirical parameters, four parameters r, α, β and ξ need to be determined experimentally; that is, by comparing the recognition performance under different parameter values, we obtain the values that give the best recognition effect.
From the above analysis, it is reasonable to determine the parameters r, α and β according to Eq.24,
\(r^{*}, \alpha^{*}, \beta^{*}=\underset{r, \alpha, \beta}{\arg \min }\left(F_{\text {far }}(r, \alpha, \beta)+F_{f r r}(r, \alpha, \beta)\right)\) (24)
where r/n ∈ {0.2, 0.3, ..., 0.7}, α, β ∈ {10, 1, 0.1, 0.01}, and Ffar and Ffrr represent the False Accept Rate (FAR) and the False Reject Rate (FRR) respectively; the experimental results are shown in Fig. 6.
Fig. 6. Comparison of recognition performances under different parameters
From Fig. 6, when r = 0.4n, α = 0.1, and β = 1, the best recognition performance can be achieved.
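In practice, Eq.24 is an exhaustive grid search over the listed candidate values. A hedged sketch follows, where evaluate_far_frr is a hypothetical callback that trains the model with the given parameters and measures FAR and FRR on the test pairs:

import itertools

def select_parameters(evaluate_far_frr, n):
    # Eq.24: minimize FAR + FRR over the candidate grid for (r, alpha, beta)
    r_ratios = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
    weights = [10, 1, 0.1, 0.01]
    best, best_cost = None, float("inf")
    for ratio, alpha, beta in itertools.product(r_ratios, weights, weights):
        far, frr = evaluate_far_frr(r=int(ratio * n), alpha=alpha, beta=beta)
        if far + frr < best_cost:
            best, best_cost = (int(ratio * n), alpha, beta), far + frr
    return best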
Unlike the parameters r, α and β, the similarity threshold ξ can be determined according to Eq.25
\(\xi^{*}=\underset{\xi}{\arg \max }\left(G_{g a r}(\xi)-F_{f a r}(\xi)\right)\) (25)
where Ggar represents Genuine Accept Rate (GAR), and the experimental result is shown in Fig. 7.
Fig. 7. The GAR and FAR curves under different thresholds
It can be seen that when ξ = 0.86, the best classification result can be acquired.
6.3 Comparison and Analysis of Algorithms
After determining all parameters of the algorithm, we compare the proposed algorithm, by FAR-FRR curves, with several dimension reduction methods, namely PCA [30], LDA [24], Sparse NMF (SNMF) [31], Discriminant NMF (DNMF) [25] and t-SNE [32], as well as with some existing vehicle recognition algorithms based on the color feature [3], the SIFT feature [9], the 3D model [13] and CNN [15,17]; the comparison results are shown in Fig. 8.
Fig. 8. The performance comparison result of different algorithms
From Fig. 8, we can see that the proposed algorithm outperforms the other algorithms in performance, where the main reasons are as follows:
(1) PCA is unsupervised; although it reduces feature dimensions effectively, it contributes little to classification, so its recognition effect is unsatisfactory. LDA, by contrast, is supervised and reduces dimensions according to feature differences, so LDA-based recognition outperforms PCA-based recognition. As analyzed in Section 4, NMF has been used more and more because of its physical interpretability, and according to pattern recognition theory, adding sparse and discriminant constraints to the NMF model can further improve the recognition effect. In addition, t-SNE is a non-linear dimensionality reduction method that retains the local features of vehicle face images well, which also improves recognition. In the proposed algorithm, we add a classification property constraint to the NMF model according to the special characteristics of vehicle face images, which makes the features after dimension reduction more conducive to recognition.
(2) Under different illumination conditions, vehicles of the same color show a certain degree of color difference in the captured images, which weakens the effectiveness of color-based recognition algorithms; in other words, these algorithms have weak robustness to illumination variation. Key points such as SIFT points are another important feature besides color, but some vehicle face images contain relatively few feature points, which reduces the feature effectiveness and makes recognition difficult. Different from such manually extracted features, more effective features can be extracted automatically by deep learning methods such as CNN; however, only a limited number of samples in the data set are annotated, so over-fitting can easily arise during training, which greatly affects the generality of the model. Finally, since only the vehicle face region is captured, features extracted from 3D models are not very effective.
According to the above experimental results and analysis, better recognition results can be achieved by the proposed algorithm in terms of both accuracy and robustness.
7. Conclusion
Based on the proposed idea that the vehicle type can be determined by a few key regions of the vehicle face image, we acquire a set of effective feature bases through the improved NMF model and achieve correct recognition of the vehicle face, that is, accurate detection of fake plate vehicles. Therefore, the proposed algorithm is of great significance both in theoretical research and in practical application. However, although good results have been achieved, some problems remain to be solved in order to further improve the universality of the algorithm: the scale of the data set needs to be expanded, and vehicles of the same type with different license plates need to be differentiated precisely.
References
- M. Swathy, P. S. Nirmala and P. C. Geethu, "Survey on vehicle detection and tracking techniques in video surveillance," International Journal of Computer Applications, vol. 160, no. 7, pp. 22-25, 2017. https://doi.org/10.5120/ijca2017913086
- N. Baek, S. M. Park, K. J. Kim and S. B. Park, "Vehicle color classification based on the support vector machine method," in Proc. of International Conference on Intelligent Computing, pp. 1133-1139, August 21-24, 2007.
- K. J. Kim, S. M. Park and Y. J. Choi, "Deciding the number of color histogram bins for vehicle color recognition," in Proc. of Asia-Pacific Services Computing Conference, pp. 134-138, December 9-12, 2008.
- W. Hu, J. Yang, L. Bai and L. X. Yao, "A new approach for vehicle color recognition based on specular-free image," in Proc. of Sixth International Conference on Machine Vision, pp. 90671Q, December 24, 2013.
- P. Chen, X. Bai and W. Liu, "Vehicle color recognition on urban road by feature context," IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 5, pp. 2340-2346, 2014. https://doi.org/10.1109/TITS.2014.2308897
- P. Negri, X. Clady, M. Milgram and R. Poulenard, "An oriented-contour point based voting algorithm for vehicle type classification," in Proc. of International Conference on Pattern Recognition, pp. 574-577, August 20-24, 2006.
- B. Zhang, "Reliable classification of vehicle types based on cascade classifier ensembles," IEEE Transactions on Intelligent Transportation Systems, vol. 14, no. 1, pp. 322-332, 2013. https://doi.org/10.1109/TITS.2012.2213814
- W. W. L. Lam, C. C. C. Pang and N. H. C. Yung, "Vehicle-Component Identification Based on Multiscale Textural Couriers," IEEE Transactions on Intelligent Transportation Systems, vol. 8, no. 4, pp. 681-694, 2007. https://doi.org/10.1109/TITS.2007.908144
- A. P. Psyllos, C. N. E. Anagnostopoulos and E. Kayafas, "Vehicle Logo Recognition Using a SIFT-Based Enhanced Matching Scheme," IEEE Transactions on Intelligent Transportation Systems, vol. 11, no. 2, pp. 322-328, 2010. https://doi.org/10.1109/TITS.2010.2042714
- L. J. Li, H. Su, E. P. Xing and L. Fei-Fei, "Object bank: a high-level image representation for scene classification and semantic feature sparsification," Advances in Neural Information Processing Systems, pp. 1378-1386, 2010.
- N. Zhang, R. Farrell, F. Iandola and T. Darrell, "Deformable part descriptors for fine-grained recognition and attribute prediction," in Proc. of 2013 IEEE International Conference on Computer Vision, pp. 729-736, December 1-8, 2013.
- M. J. Leotta and J. L. Mundy, "Vehicle surveillance with a generic, adaptive, 3D vehicle model," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 7, pp. 1457-1469, 2011. https://doi.org/10.1109/TPAMI.2010.217
- J. J. Yebes, L. M. Bergasa and M. Garca-garrido, "Visual object recognition with 3D-aware features in KITTI urban scenes," Sensors, vol. 15, no. 4, pp. 9228-9250, 2015. https://doi.org/10.3390/s150409228
- Y. Lecun, Y. Bengio and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, 2015. https://doi.org/10.1038/nature14539
- Q. Zhang, Z. Li, J. F. Li, J. Zhang, H. Zhang and X. G. Li, "Vehicle color recognition using Multiple-Layer Feature Representations of lightweight convolutional neural network," Signal Processing, vol. 147, pp. 146-153, 2018. https://doi.org/10.1016/j.sigpro.2018.01.021
- C. Hu, X. Bai, L. Qi and P. Chen, "Vehicle Color Recognition With Spatial Pyramid Deep Learning," IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 5, pp. 1-10, 2015. https://doi.org/10.1109/TITS.2015.2393752
- M. Liu, C. Yu, H. F. Ling and J. Lei, "Hierarchical joint CNN-based models for fine-grained cars recognition," in Proc. of International Conference on Cloud Computing and Security, pp. 337-347, July 29-31, 2016.
- A. Hu, H. Li, F. Zhang and W. Zhang, "Deep Boltzmann machines based vehicle recognition," in Proc. of The 26th Chinese Control and Decision Conference, pp. 3033-3038, May 31-June 2, 2014.
- Y. Y. Wu and C. M. Tsai, "Pedestrian, bike, motorcycle, and vehicle classification via deep learning: deep belief network and small training set," in Proc. of 2016 International Conference on Applied System Innovation, pp. 1-4, May 26-31, 2016.
- H. Wang, Y. F. Cai and L. Chen, "A Vehicle Detection Algorithm Based on Deep Belief Network," The Scientific World Journal, vol. 2014, pp. 1-7, 2014.
- Z. H. Liu, Z. H. Lai, W. H. Ou, K. B. Zhang and R. J. Zheng, "Structured optimal graph based sparse feature extraction for semi-supervised learning," Signal Processing, vol. 170, 107456, 2020. https://doi.org/10.1016/j.sigpro.2020.107456
- S. Q. Ren, K. M. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39. no. 6, pp. 1137-1149, 2017. https://doi.org/10.1109/TPAMI.2016.2577031
- A. Sharma, K. K. Paliwal and G. C. Onwubolu, "Class-dependent PCA, MDC and LDA: A combined classifier for pattern classification," Pattern Recognition, vol. 39, no. 7, pp. 1215-1229, 2006. https://doi.org/10.1016/j.patcog.2006.02.001
- Z. H. Liu, J. J. Wang, G. Liu and L. Zhang, "Discriminative low-rank preserving projection for dimensionality reduction," Applied Soft Computing, vol. 85, 105768, 2019. https://doi.org/10.1016/j.asoc.2019.105768
- J. Sun, X. B. Cai and F. M. Sun, "Dual graph-regularized Constrained Nonnegative Matrix Factorization for Image Clustering," KSII Transactions on Internet and Information Systems, vol. 11, no.5, pp. 2607-2627, 2017. https://doi.org/10.3837/tiis.2017.05.017
- M. H. Wan, Z. H. Lai, Z. Ming and G. W. Yang, "An improve face representation and recognition method based on graph regularized non-negative matrix factorization," Multimedia Tools and Applications, vol. 78, no. 15, pp. 22109-22126, 2019. https://doi.org/10.1007/s11042-019-7454-2
- D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, 2006. https://doi.org/10.1109/TIT.2006.871582
- M. H. Wan, M. Li, G. W. Yang, S. Gai and Z. Jin, "Feature extraction using two-dimensional maximum embedding difference," Information Sciences, vol. 274, pp. 55-69, 2014. https://doi.org/10.1016/j.ins.2014.02.145
- M. H. Wan, Z. H. Lai, G. W. Yang, Z. J. Yang, F. L. Zhang and H. Zheng, "Local graph embedding based on maximum margin criterion via fuzzy set," Fuzzy Sets and Systems, vol. 318, pp. 120-131, 2017. https://doi.org/10.1016/j.fss.2016.06.001
- Y. Tang, C. Z. Zhang, R. S. Gu, P. Li and B. Yang, "Vehicle detection and recognition for intelligent traffic surveillance system," Multimedia Tools and Applications, vol. 76, no. 4, pp. 5817-5832, 2017. https://doi.org/10.1007/s11042-015-2520-x
- F. Esposito, N. Gillis and D. Del Buono, "Orthogonal joint sparse NMF for microarray data analysis," Journal of Mathematical Biology, vol. 79, no. 4, pp. 223-247, 2019. https://doi.org/10.1007/s00285-019-01355-2
- L. Hajderanj, I. Weheliye and D. Chen, "A New Supervised t-SNE with Dissimilarity Measure for Effective Data Visualization and Classification," in Proc. of the 2019 8th International Conference on Software and Information Engineering, pp. 232-236, April 9-12, 2019.