
An Eye Location based Head Posture Recognition Method and Its Application in Mouse Operation

  • Chen, Zhe (School of Information and Communication Engineering, Dalian University of Technology) ;
  • Yang, Bingbing (School of Information and Communication Engineering, Dalian University of Technology) ;
  • Yin, Fuliang (School of Information and Communication Engineering, Dalian University of Technology)
  • Received : 2014.08.27
  • Accepted : 2015.01.22
  • Published : 2015.03.31

Abstract

An eye location based head posture recognition method is proposed in this paper. First, the face is detected using a skin color method, and the eyebrow and eye areas are located based on the gray gradient in the face. Next, the pupil circles are determined using the edge detection circle method. Finally, head postures are recognized based on eye location information. The proposed method has high recognition precision, is robust to facial expressions and different head postures, and can be used for mouse operation. The experimental results demonstrate the validity of the proposed method.

Keywords

1. Introduction

Human-Computer Interaction (HCI) has found intensive applications in fields such as computers, intelligent robots [1-3], mobile phones, consumer appliances [5-7], games [8-10], and information retrieval [22-23]. Traditional HCI devices include the mouse and keyboard. In recent years, speech recognition and hand gestures have gradually become popular in some applications; however, on some occasions they are limited. For example, speech recognition performs poorly in a car or aircraft due to background noise, and it is difficult to apply hand gestures or speech recognition in high-altitude operations. In surgery, a doctor must use the hands to hold a scalpel or other surgical instruments, and speech is muffled by a mask, so both speech recognition and hand gestures perform poorly in this circumstance. In addition, for people with language disorders or arm disabilities, speech recognition or hand gestures also cannot be used. The head and eyes are relatively stable organs, little affected by the outside environment, such as background noise and facial expression; therefore, HCI techniques based on head posture and eye position have been developed in recent years.

Head pose estimation and recognition methods can be broadly divided into five categories: template methods, appearance methods, classification methods, regression methods, and geometric methods. The template methods [1-3] first create a template for a head and then estimate and identify the head pose. The appearance methods [4,5] usually assume that there is a certain mapping between three- and two-dimensional face poses, and obtain this mapping by training on a large number of head or face images. The classification or regression methods [24-27] learn a mapping from a low-dimensional manifold to the head angles, and head postures are then recognized using this mapping. The geometric methods [28-30] rely on the detection of facial features, such as the corners of the nose or mouth, to estimate or recognize the head posture. However, these head posture recognition methods are either not robust to occlusion, scaling, and other interferences, or require a heavy training burden. To cope with these problems, this paper uses eye location information to recognize the head posture. Moreover, unlike depth-information or multiple-camera based head posture recognition methods, the proposed method uses only one camera and 2D eye location information.

Eyes are important feature organs in the head and face, and can play an important supporting role in head posture recognition. Research on eye location in head or face images has made significant progress in recent years. In 2008, Bacivarov et al. [12] used the appearance method to locate eyes. Their method extends the conventional statistical appearance model to a component-based active appearance model (AAM) by combining a global model (the two eyes together) with sub-models (the two eyes separately); however, it needs a lot of training data, which results in a slow computational procedure. In 2010, Cristiani et al. used the inverse compositional algorithm (ICA) to build the best-fitting active appearance model, which is then used to detect and locate human eyes [6]. In 2010, Kasinski et al. used Haar cascade classifiers augmented with some simple knowledge-based rules to locate human eyes [9-11]. This method has low computational complexity, but it is easily disturbed by eyebrows, face pose, and noise. In 2012, Mehta et al. used the eye shape and the template method to position eyes [7,8]. This method designs a generic eye template and then matches it in the eye search region, but different templates must be designed for different illumination conditions and backgrounds, and the model cannot adapt to different people. In 2013, Akinlar et al. proposed an edge detection circle (ED-Circle) method [13] to locate eyes, which has low computational complexity and high location precision.

The existing head posture recognition methods are either not robust to occlusion, scaling, and other interferences, or require a heavy training burden. To remedy these problems, a head posture recognition method based on eye location is proposed in this paper. The face is first detected by a skin color model, and the eyebrow and eye areas are located based on their large gray gradients in the face image. Next, the pupil circle is determined by the edge detection circle (ED-Circle) method. Finally, head postures are recognized based on eye location.

The rest of this paper is organized as follows. The head posture recognition method based on eye location is proposed in Section 2; simulation results and analysis are given in Section 3; conclusions are drawn in Section 4.

 

2. Head posture recognition method based on eye location

The existing head posture recognition methods are either not robust to occlusion, scaling, and other interferences, or require a heavy training burden. To resolve these problems, a head posture recognition method based on eye location is proposed. The block diagram of the proposed head posture recognition scheme is shown in Fig. 1, which includes five modules: color correction, face detection, eyebrow and eye area location, pupil positioning, and head posture recognition. The color correction module corrects the image color, the face detection module detects the face region, the eyebrow and eye area location module locates the eye region, the pupil positioning module precisely locates the eyes, and the posture recognition module identifies a head posture based on eye location information.

Fig. 1. Block diagram of the head posture recognition system based on eye location

Each module of the proposed method will be described in detail in following subsections.

2.1 Color correction

Captured video images may have a color cast, which degrades the performance of skin-color-based face detection. To remedy this problem, the raw image needs color correction. In the proposed scheme, the Gray World theory is used to correct the image color.

Let R(i, j), G(i, j), and B(i, j) denote the Red, Green, and Blue color values of the pixel at (i, j), respectively; M and N are the length and width of the image, and Ra, Ga, and Ba are the averages of the R, G, and B color components over the image, i.e., Ra = (1/(MN)) Σi Σj R(i, j), and similarly for Ga and Ba.

Let the gray world value be V = Ra + Ga + Ba. The values of R(i, j), G(i, j), and B(i, j) are adjusted with the gray world value as R1(i, j) = V·R(i, j)/(3Ra), G1(i, j) = V·G(i, j)/(3Ga), and B1(i, j) = V·B(i, j)/(3Ba).

Let f = max(R1(i, j), G1(i, j), B1(i, j)) over the image. If f > 255, then R1(i, j), G1(i, j), and B1(i, j) should be scaled down by the factor 255/f.
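As a concrete illustration, the following sketch implements the color correction of this subsection. Since the display equations are not reproduced here, the per-channel gains V/(3Ra), V/(3Ga), V/(3Ba) and the 255/f rescaling are our reading of the Gray World steps above.

```python
import numpy as np

def gray_world_correction(img):
    """Gray World color correction as outlined in Section 2.1.

    img is an H x W x 3 uint8 array in R, G, B channel order. The gains
    and the >255 rescaling follow the textual description above.
    """
    img = img.astype(np.float64)
    # Ra, Ga, Ba: channel averages over the M x N image.
    ra, ga, ba = img[..., 0].mean(), img[..., 1].mean(), img[..., 2].mean()
    v = ra + ga + ba                                   # gray world value V
    # Adjust each channel so its mean moves toward the common gray V/3.
    gains = np.array([v / (3 * ra), v / (3 * ga), v / (3 * ba)])
    out = img * gains
    # If the largest corrected value f exceeds 255, scale the image down.
    f = out.max()
    if f > 255:
        out *= 255.0 / f
    return out.astype(np.uint8)
```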

2.2 Face detection based on skin color model

In order to locate the eyes and estimate the head posture, the human face first needs to be detected and located. The skin color of a human face is not easily affected by factors such as facial expression, position, and orientation, so a skin-color-based face detection method is used.

To solve the color bias problem caused by a colored light source, the R, G, and B components of an input color image are adjusted using the adjusted Y, Cb, Cg, and Cr components. The image is converted from the RGB color space into the YCbCrCg color space [14].

To apply the ellipsoid skin model, the YCbCrCg color space is further transformed into the CbrCbgCgr color space [14].

The ellipsoid skin model is then evaluated in the CbrCbgCgr space [14], where l is the value of the ellipsoid equation and P(i, j) is the pixel label; P(i, j) is used to judge whether the pixel is a skin pixel.

To trade off between computational complexity and detection accuracy, the fast face localization method based on the skin color model [14] is adopted; the details are described in [14].
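The ellipsoid skin test can be sketched as follows. The color-space transform coefficients and the ellipsoid parameters are given in [14] and are not reproduced in the text, so CENTER and AXES below are placeholders to be replaced with the values from [14].

```python
import numpy as np

# Placeholder ellipsoid parameters: the real values come from Ref. [14].
CENTER = np.array([0.0, 0.0, 0.0])   # ellipsoid center in CbrCbgCgr space
AXES = np.array([1.0, 1.0, 1.0])     # ellipsoid semi-axis lengths

def is_skin(cbr, cbg, cgr):
    """Return True when the pixel falls inside the skin ellipsoid.

    l is the value of the ellipsoid equation; l <= 1 means the point
    (cbr, cbg, cgr) lies inside the model and the pixel is labeled skin.
    """
    p = np.array([cbr, cbg, cgr])
    l = np.sum(((p - CENTER) / AXES) ** 2)
    return l <= 1.0
```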

2.3 Position of eyebrows and eye area

In this paper, gray-level gradient information is used to locate the eyebrow and eye area in the face image. Since the eyes lie in the upper part of the captured image, only the upper part of the image is traversed to reduce the computational load. Since the eyebrow and eye areas exhibit large gray gradients in the face, the gradient value of area B is computed first; area B is then moved by a certain step within area A to obtain area C, whose gradient value is computed in turn, and so on until area D is reached. In Fig. 2, the quantity computed for each area such as B is its total gray gradient value.

Fig. 2. Traversal diagram

The 3×3 gray-level matrix of an image at pixel (i, j) is shown in Fig. 3, from which eight directional gradients are computed,

where f(i, j) is the gray value of the image at point (i, j), and Gx1(i, j), Gx2(i, j), Gy1(i, j), Gy2(i, j), Gz1(i, j), Gz2(i, j), Gz3(i, j), and Gz4(i, j) are the gradient values in the x direction, the y direction, and the 45° diagonal directions, respectively. The average of the absolute values of these eight gradients is computed as the gradient value at (i, j) (Eq. (9)).

Fig. 3. Gray-level matrix of the image

Selecting a sub-block k (such as B in Fig. 2) from the gradient matrix produced by Eq. (9), where the size of the sub-block is determined by the image size, the total gray gradient value of sub-block k is computed as the sum of the gradient values over its M1×N1 pixels (Eq. (10)), where M1 and N1 are the length and width of sub-block k, respectively.

The total gray gradient value of each sub-block is computed by the integral-image principle [16], so the total gray gradient values of all sub-blocks in the whole image are obtained by traversing the image only once, which reduces the computational load. Selecting the sub-blocks whose gray gradient values are larger than the others and merging them into a rectangle yields a candidate eyebrow and eye area.
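A minimal sketch of this stage follows. The exact 3×3 difference stencils of Eq. (9) are not reproduced here, so each of the eight gradients is assumed to be the absolute difference between the center pixel and one of its eight neighbors, consistent with the two x, two y, and four diagonal directions described above.

```python
import numpy as np

def gradient_map(f):
    """Average absolute gradient (Eq. (9)) at every interior pixel,
    assuming each gradient is a center-to-neighbor difference."""
    f = f.astype(np.float64)
    g = np.zeros_like(f)
    c = f[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]
    acc = np.zeros_like(c)
    for di, dj in shifts:
        acc += np.abs(f[1 + di:f.shape[0] - 1 + di,
                        1 + dj:f.shape[1] - 1 + dj] - c)
    g[1:-1, 1:-1] = acc / 8.0
    return g

def block_sums(g, m1, n1):
    """Total gradient of every m1 x n1 sub-block via an integral image,
    so the whole image is traversed only once [16]."""
    ii = np.pad(g, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    return ii[m1:, n1:] - ii[:-m1, n1:] - ii[m1:, :-n1] + ii[:-m1, :-n1]
```

The position of the maximum of block_sums then gives the highest-gradient sub-block, from which the candidate eyebrow and eye rectangle is grown.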

To further accelerate the processing speed, the face area is downsampled [15], which reduces the positioning time of the eyebrow and eye area.
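For instance, the pyramid reduction of [15] can be applied with OpenCV's pyrDown; the number of levels below is an assumed parameter.

```python
import cv2

def downsample_face(face_img, levels=1):
    """Reduce the face region with Gaussian pyramid steps [15] before
    locating the eyebrow and eye area; each step halves both dimensions."""
    for _ in range(levels):
        face_img = cv2.pyrDown(face_img)
    return face_img
```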

2.4 Pupil circle detection

2.4.1 Edge detection circle

To detect the pupil circle, edge segments are first detected by the parameter-free edge drawing method [17] and converted into line segments [18]. The motivation of this step comes from the observation that any circular shape can be approximated by a consecutive set of lines, and these lines can easily be turned into circular arcs by a simple post-processing step. Then, entire or partial circles are detected by the circular arc detection method. Finally, candidate circles are obtained by arc joining. The edge detection circle method is described in detail in [13].

2.4.2 Pupil circle discrimination

Usually, after edge detection circle processing, there are many candidate circles, including the pupil circles. It might seem that pupil detection can be finished based on the fact that the total gray value of the pupil circle is larger than that of the other circles. In fact, the pupil circle cannot be identified by this fact alone, because some circles in the eyebrow area also have large gray values. Thus, in this paper, the orientation of the gradient is used to eliminate these eyebrow circles.

The pupil circle detection method is presented in the following:

After the eyebrow and eye area are located, output image 4 in Fig. 1 is evenly divided into two parts, g(i1, j1) and p(i2, j2). For the k-th candidate circle, a combination Fk of its gray value Hk and gray gradient value Sk is calculated by Eq. (11),

where w is the weight factor, 0 < w < 1.

Each candidate circle in the two parts g(i1, j1) and p(i2, j2) is evaluated with Eq. (11). The circle whose Fk value is the largest among all candidates is identified as a pupil circle.
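Since Eq. (11) is not reproduced here, the sketch below assumes the natural weighted-sum form Fk = w·Hk + (1 − w)·Sk, which is consistent with the sweep of w in Fig. 6 and its optimum near 0.5.

```python
import numpy as np

def pupil_circle(candidates, gray, grad, w=0.5):
    """Score each ED-Circle candidate (cx, cy, r) inside one half of
    output image 4 and return the highest-scoring circle.

    Assumed form of Eq. (11): F_k = w * H_k + (1 - w) * S_k, with H_k
    the total gray value and S_k the total gray-gradient value inside
    circle k; w = 0.5 is the experimentally best weight (Fig. 6).
    """
    h, wd = gray.shape
    jj, ii = np.meshgrid(np.arange(wd), np.arange(h))
    best, best_f = None, -np.inf
    for cx, cy, r in candidates:
        mask = (ii - cy) ** 2 + (jj - cx) ** 2 <= r ** 2
        hk = gray[mask].sum()        # H_k: total gray value in circle k
        sk = grad[mask].sum()        # S_k: total gradient value in circle k
        fk = w * hk + (1 - w) * sk   # assumed Eq. (11)
        if fk > best_f:
            best, best_f = (cx, cy, r), fk
    return best
```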

2.5 Head posture recognition method based on eye location

Most current head posture recognition methods use depth information or multiple cameras to recognize the head posture, but these methods need expensive hardware devices or impose a heavy training burden. To remedy these problems, the proposed method uses only one camera and 2D eye location information to recognize the head posture.

Eyes are relatively stable organs in the face and can be used to recognize the head posture. To estimate and recognize the head posture, the geometry model of head posture recognition is designed as shown in Fig. 4. Seven head postures, namely look at front, slant to left, slant to right, rise up, look down, gaze to right, and gaze to left, are shown in Figs. 5(a)~(g).

Fig. 4. Geometry structure of the head model

Fig. 5. Seven postures of the head based on eye information

When judging postures other than slanting to the left or right, the coordinates of the left and right eyes in the previous frame are defined as (ipl, jpl) and (ipr, jpr), and those in the current frame as (icl, jcl) and (icr, jcr), respectively. Let the size of the face image be P×Q pixels.

(1) Recognitions of head slant to left and right

The geometry structures of the head slanting to the left or right are shown in Figs. 5(b) and (c). After the eyes are positioned, the Cartesian coordinates of the left eye (icl, jcl) and the right eye (icr, jcr) are available, and the inclination is θ = arctan[(icr - icl)/(jcr - jcl)]. Assume that θ1 and θ2 are the slant thresholds used to judge whether the head slants to the left or right. Thus, if θ ∈ (θ1, θ2), the head slants to the right; if θ ∈ (-θ2, -θ1), the head slants to the left.
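A sketch of the slant test follows; the division is guarded against jcr = jcl (a perfectly level head, where θ approaches ±90° and falls outside both ranges), and the default thresholds are the experimental values θ1 = 50° and θ2 = 75° from Section 3.1.

```python
import math

def slant_posture(left_eye, right_eye, theta1=50.0, theta2=75.0):
    """Recognize head slant from the located eye centers (Section 2.5(1)).

    left_eye = (icl, jcl), right_eye = (icr, jcr); angles in degrees.
    """
    icl, jcl = left_eye
    icr, jcr = right_eye
    dy = jcr - jcl
    if dy == 0:
        return None  # level eyes: theta -> +/-90 deg, outside both ranges
    theta = math.degrees(math.atan((icr - icl) / dy))
    if theta1 < theta < theta2:
        return "slant right"
    if -theta2 < theta < -theta1:
        return "slant left"
    return None  # no slant detected
```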

(2) Recognitions of head rise up and look down

The geometry structures of the head rising up or looking down are shown in Figs. 5(d) and (e). If (jcr - jpr) ≥ T1P and (jcl - jpl) ≥ T1P, the head posture is asserted to be up; if (jpr - jcr) ≥ T1P and (jpl - jcl) ≥ T1P, the head posture is asserted to be down, where T1 is a threshold.

(3) Recognitions of head gaze to left and right

The geometry structures of gazing to the right or left are shown in Figs. 5(f) and (g). If icl < ipl, icr < ipr, jcl ≈ jpl, jcr ≈ jpr, and (icr - icl) ≤ T2(ipr - ipl), the gaze orientation is asserted to be left; if icl > ipl, icr > ipr, jcl ≈ jpl, jcr ≈ jpr, and (icr - icl) ≤ T2(ipr - ipl), the gaze orientation is asserted to be right, where T2 is a threshold.
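The two frame-difference tests above can be sketched as follows. The tolerance eps for the "approximately equal" row comparisons is an assumed value, since the text does not specify one.

```python
def vertical_and_gaze_posture(prev_l, prev_r, cur_l, cur_r,
                              P, T1=0.07, T2=0.8, eps=3):
    """Recognize rise up / look down and gaze left / right from eye
    coordinates in consecutive frames (Sections 2.5(2) and 2.5(3)).

    prev_l = (ipl, jpl), prev_r = (ipr, jpr) are the previous-frame eye
    centers, cur_l = (icl, jcl), cur_r = (icr, jcr) the current ones,
    and P is the face-image height.
    """
    (ipl, jpl), (ipr, jpr) = prev_l, prev_r
    (icl, jcl), (icr, jcr) = cur_l, cur_r
    # (2) Rise up / look down: both eyes move vertically by at least T1*P.
    if jcr - jpr >= T1 * P and jcl - jpl >= T1 * P:
        return "rise up"
    if jpr - jcr >= T1 * P and jpl - jcl >= T1 * P:
        return "look down"
    # (3) Gaze left / right: rows stay roughly unchanged, both eyes shift
    # the same way horizontally, and the apparent inter-eye distance
    # shrinks to at most T2 times its previous value.
    rows_stable = abs(jcl - jpl) <= eps and abs(jcr - jpr) <= eps
    if rows_stable and (icr - icl) <= T2 * (ipr - ipl):
        if icl < ipl and icr < ipr:
            return "gaze left"
        if icl > ipl and icr > ipr:
            return "gaze right"
    return None  # no vertical or gaze posture detected
```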

The proposed method can improve the head posture recognition precision, and is robust to occlusion, scaling and other interferences.

2.6 Seven postures of head and its application in mouse operation

The traditional mouse is an important HCI device, but since it requires an arm to perform HCI operations, it has some limitations for users. Head postures can be used to simulate mouse operation. In this paper, the seven kinds of head postures define seven operations, i.e., UP, DOWN, LEFT, RIGHT, OK, CANCEL, and NO-OPERATION. The mapping relations between the head postures and the mouse operations are shown in Table 1. Even though the precision of the virtual mouse is lower than that of the traditional mouse, it is friendlier and easier to use, and it makes interaction with a computer possible when the hands cannot be used for some reason; for example, people with arm disabilities can operate a computer simply with the virtual mouse.

Table 1. Mapping relations between head postures and mouse operations
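As an illustration of how the recognized postures drive a virtual mouse, the mapping below is an assumed example: Table 1's actual assignments are not reproduced in the text, so only the seven operation names are taken from it and the posture-to-operation pairing is hypothetical.

```python
# Hypothetical posture-to-operation assignments; the paper's actual
# Table 1 pairings are not reproduced in the text.
POSTURE_TO_MOUSE = {
    "rise up":     "UP",
    "look down":   "DOWN",
    "gaze left":   "LEFT",
    "gaze right":  "RIGHT",
    "slant right": "OK",
    "slant left":  "CANCEL",
    "look front":  "NO-OPERATION",
}

def mouse_operation(posture):
    """Translate a recognized head posture into a virtual mouse event."""
    return POSTURE_TO_MOUSE.get(posture, "NO-OPERATION")
```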

 

3. Simulations and Result Discussions

To verify the effectiveness of the proposed method, some experiments are carried out. In the simulations, 120 images from the IMM face database, issued by the Technical University of Denmark [19], are used. The size of each image is 640×480 pixels; these images are frontal faces with no occlusion of the eyes or head. Self-made face databases with simple and complex backgrounds are also used. Images in the self-made simple-background database are 640×480 pixels, and images in the self-made complex-background database are 640×480 and 760×460 pixels. The images in the simple-background database are also frontal faces without occlusion of the eyes or head. There are 600 face images in the complex-background database, covering subjects wearing glasses, subjects with head coverings, and normal faces, with 200 images in each category.

3.1 The experiments of pupil circle detection

Experiments on pupil circle detection are carried out under diffuse light with simple and complex backgrounds. The pupil circle detection results with different weights w are shown in Fig. 6. It is observed from Fig. 6 that when w is approximately 0.5, the pupil detection performance is the best, so w is set to 0.5 in the following experiments. If w deviates from 0.5, the pupil circle may be falsely located in the eyebrow area.

Fig. 6. Detection performance of eye location with different weights w

Experiments on head posture recognition with different θ, T1, and T2 are shown in Figs. 7(a)~(c). It is observed from Figs. 7(a)~(c) that when θ ∈ (50°, 85°), T1 ∈ (0.04, 0.1), and T2 ∈ (0.75, 0.95), the head posture recognition performance is the best. Considering that a person's head naturally tilts to a certain extent, the values of θ1, θ2, T1, and T2 are set to 50°, 75°, 0.07, and 0.8, respectively.

Fig. 7. Head posture recognition performance with different weight (w) and thresholds (θ, T1, T2)

3.2 The experiments of the human eye location

3.2.1 Experiments result of human face detection and location

In order to illustrate the effectiveness of the proposed method, the face detection and location results of the Ref. [14] method are reported in Table 2.

Table 2. Results of face detection and location with the Ref. [14] method

From Table 2, the face location precision of the Ref. [14] method is high and is sufficient for eye location and other operations such as face recognition and expression analysis. The face location method is therefore robust to different backgrounds, different genders, occlusion, etc.

3.2.2 Experiments of the human eye location

To illustrate the performance of the proposed method, the template method [7], the projection function method [9], and the appearance method [12] are also implemented and compared with the proposed method, and a quantitative measure is used to verify its effectiveness.

For performance comparison, the normalized error is defined as [20]

e = max(dl, dr) / d,

where dl and dr are the Euclidean distances between the found left and right eye centers and their ground-truth positions, d is the Euclidean distance between the eyes in the ground truth, and max(·) is the maximum function.
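A direct implementation of this measure might look as follows.

```python
import numpy as np

def normalized_error(found_l, found_r, true_l, true_r):
    """Normalized eye-location error e = max(dl, dr) / d [20], where
    dl, dr are the distances from the found eye centers to the ground
    truth and d is the true inter-eye distance. Roughly, e < 0.25 means
    the eye area was found, e < 0.1 the iris area, e < 0.05 the pupil.
    """
    dl = np.linalg.norm(np.subtract(found_l, true_l))
    dr = np.linalg.norm(np.subtract(found_r, true_r))
    d = np.linalg.norm(np.subtract(true_l, true_r))
    return max(dl, dr) / d
```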

Experiments on human eye location are carried out under diffuse light with simple and complex backgrounds. The eye location results of different methods on different databases are shown in Table 3, Table 4, and Fig. 8. It is observed from Table 3 that, among Refs. [7,9,12], the Ref. [7] method performs obviously better than the Ref. [9] and Ref. [12] methods, while the proposed method has higher location accuracy than the Ref. [7] method; that is, the performance of the proposed method is the best among these eye location methods. Fig. 8 also shows that the proposed method has the best location performance, whether for high-precision localization (e<0.05), detection within the iris area (e<0.1), or coarse detection of the eye area (e<0.25).

Table 3. Comparison of normalized error scores of different methods on different databases

Table 4. Eye location results on different face databases when e<0.1

Fig. 8. Comparison of normalized errors of different methods

When e<0.1, the eye location results on the different face databases are shown in Table 4. Simulation results of the proposed method on the three face databases are shown in Figs. 9 and 10, where the green rectangle in Fig. 9 marks the eyebrow and eye area; Fig. 9 shows the location results of the eyebrow and eye region, and Fig. 10 shows the final eye location results. From Fig. 10, it is observed that the proposed method is robust to head gesture (Figs. 10(c), 10(d), 10(f), 10(k), and 10(l)), facial expression (Fig. 10(b)), hair occlusion (Figs. 10(g), 10(h), and 10(j)), and mustaches (Figs. 10(a), 10(c), and 10(e)). Figs. 10(g) and 10(h) show the eye location results for the same person at different image sizes, which shows that the proposed method is robust to image scaling. Figs. 10(i), 10(n), and 10(o) show that the proposed method is robust to glasses. In Figs. 10(i) and 10(n), the proposed method successfully locates the eye even though it is partly occluded, so the method is robust to occlusion. Fig. 10(o) shows that the proposed method is robust to hats. The Ref. [7] method gives good location results with suitable templates; however, many templates must be designed for different conditions, i.e., it is not robust to different people or backgrounds. The Ref. [9] and Ref. [12] methods give poor eye location results on female images, especially when the subject has long hair over the forehead.

Fig. 9. Detection results of face, eyebrow, and eye area: (a)~(f) simulation results on the IMM face database, (g)~(h) the self-made simple-background face database, and (i)~(o) the self-made complex-background face database

Fig. 10. Results of eye location. (a)~(f) IMM face database: (a) a bearded man; (b) a woman; (c) a man whose eyes gaze to the left; (d) a man whose eyes gaze to the right; (e) a man whose eyes gaze to the right; (f) a woman whose eyes gaze to the right. (g)~(h) Self-made simple-background face database: (g)~(h) a woman. (i)~(o) Self-made complex-background face database: (i) a man wearing glasses; (j) a woman with long hair; (k) a man whose hair partly occludes an eye; (l) a man whose head and eyes are up; (m) a woman whose head or eyes gaze to the right; (n) a man whose eye is partly occluded; (o) a man wearing glasses and a hat

3.3 Recognitions of head postures in mouse operation

Recognition experiments for the seven head postures are carried out under diffuse light, with simple and complex backgrounds and one person in the video image. To verify the effectiveness of the proposed method, three performance measures [21] are used.

The miss rate of head postures is defined as Mr = mr / Ps.

The false positive rate of head postures is defined as Fr = fr / Ns.

The true recognition rate of head postures is defined as Tr = s / Ps,

where mr is the number of false negatives, fr is the number of false positives, s is the number of correctly recognized head postures, Ps is the number of positive samples, and Ns is the number of negative samples; positive samples are images containing eyes and a head, and negative samples are images containing neither.
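A sketch computing the three measures, under the definitions reconstructed above (the display equations are not reproduced in the text; these are the standard forms for the variables listed):

```python
def posture_metrics(mr, fr, s, Ps, Ns):
    """Measures of Section 3.3: miss rate Mr = mr/Ps, false positive
    rate Fr = fr/Ns, and true recognition rate Tr = s/Ps."""
    return {"Mr": mr / Ps, "Fr": fr / Ns, "Tr": s / Ps}

# Example with the Table 5 setup of 150 positive and 100 negative
# samples per posture (the counts 6, 3, 144 are illustrative only).
print(posture_metrics(mr=6, fr=3, s=144, Ps=150, Ns=100))
```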

The recognition results of the seven head postures are shown in Table 5. Each head posture in Table 5 includes 150 positive sample images and 100 negative sample images, so the total test set includes 1050 positive and 700 negative sample images.

Table 5. Recognition results of the seven head postures (Tr: true recognition rate, Mr: miss rate, Fr: false positive rate)

From Table 5, we can see that the Ref. [7] method has better recognition performance than the Ref. [9] and Ref. [12] methods, while the recognition performance of the proposed method is better than that of the Ref. [7] method; i.e., the proposed method not only has good recognition precision but also a low miss rate and false positive rate, so its performance is stable. It is also observed from Table 5 that when the head is down, the recognition precision for DOWN is a little low, because the pupil is easily obscured by the eyelid, which causes eye location to fail and then results in low head posture recognition precision.

 

4. Conclusions

In this paper, a head posture recognition method based on eye location is proposed. The eyes are first located using large gradient and gray values, and then head postures are recognized based on eye position information. Finally, the proposed method is used to emulate typical mouse operations. Simulation results show that the proposed method has high recognition precision, a low miss rate, and a low false positive rate. The proposed method can be used in human-computer interaction applications.

References

  1. Ma. B. P, Yang. X, Shan. S. G, "Head Pose Estimation Via Background Removal," in Proc. of 5th Int. Workshop on Intelligent Science and Intelligent Data Engineering, pp.140-147, Oct. 2013.
  2. Murphy C. E, Trivedi M M, "Head pose estimation and augmented reality tracking: an integrated system and evaluation for monitoring driver awareness," IEEE Trans. on Intelligent Transportation Systems, vol.11, no.2, pp. 300-311, Jun. 2010. https://doi.org/10.1109/TITS.2010.2044241
  3. Cai, Q, Sankaranarayanan. A, Zhang. Q, Zhang. Z, Liu. Z, "Real time head pose tracking from multiple cameras with a generic model," in Proc. of 2010 IEEE Computer Society Conf. on Vision and Pattern Recognition Workshops, pp. 25-32, Jun. 2010.
  4. Huang. D, Storer. M, DeLaTorre. F, Bischof. H, "Supervised local subspace learning for continuous head pose estimation," in Proc. of 2011 IEEE Conf. on Computer Vision and Pattern Recognition, pp. 2921-2928, Jun. 2011.
  5. Fu. Y, Huang. T. S, "Graph embedded analysis for head pose estimation." in Proc. of 7th Int. Conf. on Automatic Face and Gesture Recognition, pp. 6-8, Apr. 2006.
  6. Cristiani. A, Porta. M, Gandini. D, Bertolotti. G. M, Serbedzija. N, "Driver drowsiness identification by means of passive techniques for eye detection and tracking," in Proc. of IEEE 4th Int. Conf. on Self-Adaptive and Self-Organizing Systems Workshop, pp. 142-146, Sep. 2010.
  7. Mehta. R, Shrivastava. M, "Real time eye template detection and tracking," Int. Journal of Advanced Computer Research, vol.2, no.2, pp. 18-22, Jun. 2012. http://www.theaccents.org/ijacr/papers/current_jun_2012/3.pdf
  8. Choi. J. K, Chung. S. T, "Reliable and fast eye detection," in Proc. of 9th Int. Conf. on Signal Processing, Computational Geometry and Artificial Vision, pp. 78-82, 2009. http://dl.acm.org/citation.cfm?id=1627548
  9. Lu. L, Yang. Y, Wang. L, Tang. B, "Eye location based on gray projection," in Proc. of IEEE 3rd Int. Symposium on Intelligent Information Technology Application, pp.58-60, Nov. 2009.
  10. Yan. B, Zhang. X, Gao. L, "Improvement on pupil positioning algorithm in eye tracking technique," in Proc. of IEEE Int. Conf. on Information Engineering and Computer Science, pp. 1-4, Dec. 2009.
  11. Kasinski. A, Schmidt. A, "The architecture and performance of the face and eyes detection system based on the Haar cascade classifiers," Pattern Analysis & Applications, vol.13, no.2, pp. 197-211, May, 2010. https://doi.org/10.1007/s10044-009-0150-5
  12. Bacivarov. I, Ionita. M, Corcoran. P, "Statistical models of appearance for eye tracking and eye-blink detection and measurement," IEEE Trans. on Consumer Electronics, vol.54, no.3, pp.1312-1320, Aug. 2008. https://doi.org/10.1109/TCE.2008.4637622
  13. Akinlar C, Topal C, "ED-Circles: A real-time circle detector with a false detection control," Pattern Recognition, vol.46, no.3, pp.725-740, March, 2013. https://doi.org/10.1016/j.patcog.2012.09.020
  14. Li W, Jiao F, He C, "Face detection using ellipsoid skin model," Emerging Technologies for Information Systems, Computing, and Management, Lecture Notes in Electrical Engineering, vol 236, pp.513-521, 2013. https://doi.org/10.1007/978-1-4614-7010-6_58
  15. Burt. P, Adelson. E, "The Laplacian pyramid as a compact image code," IEEE Trans. on Communications, vol.31, no.4, pp. 532-540, Apr. 1983. https://doi.org/10.1109/TCOM.1983.1095851
  16. Viola. P, Jones. M, "Rapid object detection using a boosted cascade of simple features," in Proc. of IEEE Int. Conf. on Computer Vision and Pattern Recognition, pp. 511-518, Dec. 2001.
  17. Akinlar C, Topal C, "EDPF: A real-time parameter-free edge segment detector with a false detection control," Int. Journal of Pattern Recognition and Artificial Intelligence, vol. 26, no.1, pp.1-22, May 2012.
  18. Akinlar C, Topal C, "EDLines: A real-time line segment detector with a false detection control," Pattern Recognition Letters, vol.32, no.13, pp.1633-1642, Oct. 2011. https://doi.org/10.1016/j.patrec.2011.06.001
  19. Nordstrom M M, Larsen M, Sierakowski J, et al, "The IMM face database-an annotated dataset of 240 face images," Technical University of Denmark, DTU Informatics, Building 321, 2004. http://orbit.dtu.dk/fedora/objects/orbit:79272/datastreams/file_3069003/content
  20. Jesorsky O, Kirchberg K J, Frischholz R W, "Robust face detection using the Hausdorff distance," in Proc. of 3rd Int. Conf. on Audio- and Video-based Biometric Person Authentication, pp.90-95, Jun. 2001.
  21. Dalal N, Triggs B, "Histograms of oriented gradients for human detection," in Proc. of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 886-893, Jun. 2005.
  22. Li Z, Liu J, Zhu X, "Multi-modal multi-correlation person-centric news retrieval," in Proc. of 19th ACM Int. Conf. on Information and Knowledge Management, pp.179-188, Oct. 2010.
  23. Li Z, Liu J, Lu H, "Sparse constraint nearest neighbour selection in cross-media retrieval," in Proc. of IEEE Int.Conference on Image Processing, pp.1465-1468, Sep. 2010.
  24. Oyini Mbouna R, Kong S G, Chun M G, "Visual analysis of eye state and head pose for driver alertness monitoring," IEEE Transactions on Intelligent Transportation Systems, vol.14, no.3, pp. 1462-1469, Sep. 2013. https://doi.org/10.1109/TITS.2013.2262098
  25. Huang D, Storer M, De la Torre F, "Supervised local subspace learning for continuous head pose estimation," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2921-2928, Jun. 2011.
  26. Fanelli G, Gall J, Van Gool L, "Real time head pose estimation with random regression forests," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, pp. 617-624, Jun. 2011.
  27. Yan Y, Subramanian R, Ricci, "Evaluating multi-task learning for multi-view head pose classification in interactive environments," in Proc. of IEEE Int. Conference on Pattern Recognition, pp.4182-4187, Aug. 2014.
  28. Valenti R, Sebe N, Gevers T, "Combining head pose and eye location information for gaze estimation," IEEE Transactions on Image Processing, vol.21, no.2, pp.802-815, Jan. 2012. https://doi.org/10.1109/TIP.2011.2162740
  29. Zhang W, Wang Z, Xu J, "A Method of gaze direction estimation considering head posture," Int. Journal of Signal Processing, Image Processing and Pattern Recognition, vol.6, no.2, pp.103-111, Apr. 2013. http://www.sersc.org/journals/IJSIP/vol6_no2/9.pdf
  30. Zhang Y, Yue J, Dong Y, "Robust eye openness and head pose estimation based on multi modal camera," in Proc. of IEEE Workshop on Advanced Research and Technology in Industry Applications, pp.418-422, Sep. 2014.