1. Introduction
Psychological studies have shown that humans have a small but significant ability to recognize people they know well by their gait. This ability has encouraged research into gait as a means of biometric identification. Early studies on point light displays (PLD) [1], which isolate motion by removing all other context from the observed subjects, confirmed this ability.
Commonly used biometrics based on fingerprints, face, iris etc. have two obvious deficiencies: they perform badly at low image resolutions and they need active user participation. Gait, on the other hand, does not suffer from these deficiencies. It can be captured with ordinary equipment without the individual’s awareness or even consent. The main deficiencies of gait as a biometric are its unknown level of uniqueness and the factors that change gait characteristics. These can be external (changes of view, direction or speed of movement, illumination conditions, weather, clothing, footwear, terrain etc.) or internal (changes due to illness, injuries, aging, pregnancy etc.). Problems are also caused by uncertain measurements, occlusions and the use of noninvasive acquisition techniques (without sensors or markers). All these deficiencies negatively influence recognition performance in real-life environments, which is still too weak for efficient use in biometry.
Gait biometry offers high potential in applications from the area of security and surveillance [2], in medicine for diagnosis of gait-related conditions such as Parkinson’s disease, hemiplegia and stroke and for fall detection [3], in sports for improvement of athletic capabilities and training evaluation [4], in the area of motion capture [5], traffic control [6] etc.
In this paper we introduce two state-of-the-art methods, one based on averaged silhouettes [7] and one based on a probabilistic spatio-temporal model built on trajectories of tracked points [8]. We test them on data that reflect the effects of walking speed changes and evaluate the consequences for recognition results. In response we suggest an improvement that mitigates the effects of walking speed variation on recognition results.
In the second section we present related work, in the third section we introduce the baseline gait recognition methods, in the fourth section we propose an improvement for walking speed variation and in the fifth section we present the results of our experiments together with an analysis of identified problems and possible solutions. Finally, we conclude with directions for future work.
2. Related Work
Gait is defined as every coordinated cyclical combination of one’s gestures that compose human motion (e.g. walking, running etc.). The area of gait recognition tries to recognize a certain expressed property on the basis of gait characteristics, which are usually observed through gait cycles. A gait cycle begins when one foot contacts the ground and ends when the same foot contacts the ground again.
Methods can be categorized into two main groups. Model based approaches [9-11] build a model of the human body or its movement in 3D and acquire gait features from this model (features like step dimensions, cadence, human skeleton, body dimensions, locations and orientations of body parts, joint kinematics etc.). The methods in this group mostly focus on gait dynamics and less on the appearance of individuals, which makes them more resistant to problems like changes of view and scale, but in general they do not achieve results as good as those of methods that also consider appearance. Furthermore, such methods are computationally demanding and especially susceptible to problems like occlusions.
Model-free approaches [7,12,13] acquire gait parameters by performing measurements directly on 2D images, without adopting a specific model of the human body or motion. Feature correspondence in consecutive images is obtained by prediction of speed, shape, texture and color. They mostly use geometric representations like silhouettes, optical flow, joint trajectories, history of movement etc. These methods do not rely only on gait dynamics, but also measure the individual during movement, and in doing so take the appearance of the individual into consideration. They are therefore less sensitive to influence factors that result in variations of gait dynamics (e.g. aging, illness, walking speed change) but more susceptible to factors that result in changes of appearance (e.g. clothing, obesity, hairstyle etc.), changes of view and direction of movement.
Our work focuses on variations of walking speed, one of the more important influence factors that affect gait recognition results. It is almost always present in real environments and therefore requires special attention. Several different approaches to handling walking speed changes exist in the literature.
Authors in [13] researched the influence of walking speed changes on recognition performance based on cadence and step length and suggested an improvement by silhouette normalization. They proposed a stride normalization of double-support gait silhouettes based on a statistical relation between the walking speed and the stride. They used the baseline algorithm [12] on only five silhouettes of the gait cycle (two single-support images and three double-support images) for recognition and discarded the other images.
Authors in [14] proposed HMM-based time-normalized gait feature extraction with standard gait poses and tested it on the slow and fast walking data. However, the method does not consider spatial changes (e.g. stride changes).
Authors in [15] introduced a spatio-temporal Shape Variation-Based (SVB) frieze pattern representation for gait, which captures motion information over time and represents normalized frame difference over gait cycles. A temporal symmetry map of gait patterns is constructed and combined with vertical/horizontal SVB frieze patterns for measuring the dissimilarity between gait sequences.
Authors in [16] proposed an approach based on Dynamic Time Warping (DTW), which used a set of DTW functions to represent the distribution of gait patterns using uniform and wrapped Gaussian distributions.
Authors in [17] proposed Cubic Higher-order Local Auto-Correlation (CHLAC), a three-way (x-, y-, and time-axis) autocorrelation method that extracts spatio-temporal local geometric features to characterize motions. It is relatively robust against variations in walking speed, since it only uses the sums of local features over a gait sequence, and thus does not explicitly use the phase information of the gait. Researchers have assumed that walking speed does not change much within or across gait sequences.
Authors in [18] separated static and dynamic features from gait silhouettes by fitting a human model, then created a factorization based speed transformation model for the dynamic features using a training set of multiple persons for multiple speeds. The model can transform the dynamic features from a reference speed to another arbitrary speed.
Changes of view represent an even bigger problem. Although attempts at using multiple-view geometry exist in the literature [19], the main problem here is that the same movement can have a drastically different appearance when observed from different angles. Because of this, appearance based methods cannot be efficiently generalized to multiple camera systems. In general, a separate classifier is required for each view. Other approaches use view transformations [20] or multi-view fusion [21] to achieve view-invariance for minor view changes.
3. Baseline Gait Recognition
Our work is based on two gait recognition methods that serve as a baseline for evaluating the walking speed normalization step (described below) performed on top of these methods. The methods were chosen as representatives of different approaches to gait recognition. The first group concentrates on using appearance for recognition purposes, whereas the other mostly considers gait dynamics. In this work we show how methods from both groups can benefit from the proposed walking speed normalization.
The basic approach presented in [12] directly compares gallery and probe gait cycles by computing the Euclidean distance between corresponding silhouettes within the two cycles. Because this method achieves encouraging results under similar capturing conditions for gallery and probe, it is often used as the baseline algorithm for evaluating new methods. The upgraded method [7] introduces the averaged silhouette image. An averaged silhouette is the pixel-wise average of the silhouettes of one gait cycle (Fig. 1), and all gait cycles together represent the individual’s gait. Such an approach eliminates the problems of comparing subjects with different cycle periods and different choices of initial pose for the gait cycle. We implemented and tested the method based on averaged silhouettes as a representative of the appearance based gait recognition methods.
Fig. 1. Averaged silhouette image (right) composed out of silhouettes of an entire gait cycle (left).
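The averaged silhouette idea can be illustrated with a minimal NumPy sketch. It assumes the silhouettes of one gait cycle are already segmented, size-normalized and aligned binary arrays, and that gallery and probe are compared by Euclidean distance as in the baseline; the function names and the toy data are hypothetical and not taken from [7].

```python
import numpy as np

def averaged_silhouette(silhouettes):
    """Pixel-wise mean of aligned binary silhouettes from one gait cycle."""
    stack = np.asarray(silhouettes, dtype=float)  # shape: (frames, height, width)
    return stack.mean(axis=0)

def nearest_gallery_subject(probe_avg, gallery):
    """Return the gallery key whose averaged silhouette has the smallest Euclidean distance to the probe."""
    return min(gallery, key=lambda name: np.linalg.norm(gallery[name] - probe_avg))

# Toy 2x3 "silhouettes" standing in for two frames of one gait cycle.
cycle = [np.array([[1, 0, 0], [1, 1, 0]]),
         np.array([[1, 0, 0], [0, 1, 1]])]
avg = averaged_silhouette(cycle)

# Hypothetical gallery with one averaged silhouette per subject.
gallery = {"subject_a": avg, "subject_b": np.zeros((2, 3))}
match = nearest_gallery_subject(avg, gallery)
```

Because every cycle collapses into a single image, the number of frames per cycle and the starting pose no longer matter for the comparison.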
The next method [8] is based on optical flow tracking of random points in extracted silhouettes. Points are tracked based on the texture information of the small area around them. The points and their movements form point trajectories in time (Fig. 2). A probabilistic spatio-temporal model is built by Principal Component Analysis (PCA), Expectation Maximization (EM) and Gaussian mixtures. This method concentrates on using gait dynamics for recognition purposes.
Fig. 2. Tracking of points in image sequence: a. original image, b. extracted silhouette, c. tracked points, d. trajectories of tracked points.
The method consists of four basic modules. The first module performs motion segmentation by background subtraction and subsequently extracts human silhouettes.
In the next module gait features are extracted. Gait features are represented by trajectories of tracked points, which are described with spatio-temporal curves. Since the curves are periodic, the period corresponding to gait cycle length can be estimated by autocorrelation and voting. The trajectories are then sliced into shorter pieces based on that period, phase aligned and interpolated to the same length. Since these slices correspond to the gait cycle period, they are also cyclic. Cyclic curves are described by the curve centroid (1), given by the arithmetic mean of the curve points, and the centroid-subtracted curve shape vector (2):
\[ \mathbf{o} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{x}_i \tag{1} \]
\[ \mathbf{x} = (\mathbf{x}_1 - \mathbf{o},\ \mathbf{x}_2 - \mathbf{o},\ \ldots,\ \mathbf{x}_N - \mathbf{o}) \tag{2} \]
where $\mathbf{o}$ represents the curve centroid, $\mathbf{x}$ the curve shape, $N$ the curve length corresponding to cycle length and $\mathbf{x}_i = (x_i, y_i)$ the original curve points.
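The feature extraction steps can be sketched as follows. This is only an illustration under stated assumptions: the period is taken as the highest autocorrelation peak after the autocorrelation first turns negative, voting is reduced to picking the most frequent period, phase alignment is omitted, and all function names and the synthetic trajectory are hypothetical.

```python
import numpy as np

def estimate_period(signal):
    """Estimate the period (in frames) of one trajectory coordinate by autocorrelation."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]  # lags 0 .. len(x)-1
    negative = np.flatnonzero(acf < 0)
    if negative.size == 0:
        raise ValueError("no periodicity found")
    start = int(negative[0])       # skip the main lobe around lag 0
    stop = len(x) // 2 + 1         # keep enough overlap for each lag
    return start + int(np.argmax(acf[start:stop]))

def vote_period(periods):
    """Simple voting: the most frequent period estimate wins."""
    return int(np.bincount(np.asarray(periods, dtype=int)).argmax())

def resample_curve(points, length):
    """Linearly interpolate an (n, 2) curve slice to a fixed number of points."""
    pts = np.asarray(points, dtype=float)
    src = np.linspace(0.0, 1.0, len(pts))
    dst = np.linspace(0.0, 1.0, length)
    return np.column_stack([np.interp(dst, src, pts[:, d]) for d in range(pts.shape[1])])

def centroid_and_shape(curve):
    """Centroid o (arithmetic mean) and centroid-subtracted shape vector x."""
    o = curve.mean(axis=0)
    return o, (curve - o).ravel()

# Synthetic trajectory with a 20-frame cycle (stand-in for one tracked point).
t = np.arange(100)
traj = np.column_stack([10 + 3 * np.sin(2 * np.pi * t / 20),
                        5 + np.cos(2 * np.pi * t / 20)])
period = vote_period([estimate_period(traj[:, 0]), estimate_period(traj[:, 1])])
cycle = resample_curve(traj[:period], 16)
centroid, shape = centroid_and_shape(cycle)
```

Resampling every slice to the same length is what makes curves from cycles of different duration directly comparable.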
In the next module the gait model is built. Curve shapes are first projected into a 4D PCA space. A single point in PCA space represents one cyclic curve shape. Then a 6D curve descriptor is composed for each curve out of the curve centroid (2D) and the PCA shape (4D). The distribution of these 6D vectors is described by a Gaussian mixture model obtained by the EM algorithm. The gait model is therefore represented by a mixture of 15 Gaussians describing the curve descriptor distribution. The models of different walking subjects are finally stored in the gallery.
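A sketch of how the 6D descriptors might be assembled is given below. It assumes PCA is computed by singular value decomposition of the mean-centered shape vectors; the helper names and the random stand-in data are hypothetical, and the EM fit of the 15-component mixture is not shown.

```python
import numpy as np

def fit_pca(shapes, n_components=4):
    """Return the mean shape and the leading principal directions (rows)."""
    X = np.asarray(shapes, dtype=float)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def curve_descriptors(centroids, shapes, mean, basis):
    """Concatenate the 2D centroid with the 4D PCA coefficients of each curve."""
    coeffs = (np.asarray(shapes, dtype=float) - mean) @ basis.T
    return np.hstack([np.asarray(centroids, dtype=float), coeffs])

# Random stand-ins for 50 cyclic curves of 16 points each (32-element shape vectors).
rng = np.random.default_rng(0)
shapes = rng.normal(size=(50, 32))
centroids = rng.normal(size=(50, 2))

mean, basis = fit_pca(shapes)
descriptors = curve_descriptors(centroids, shapes, mean, basis)  # shape (50, 6)
```

The resulting descriptor matrix is what a Gaussian mixture would then be fitted to, for instance with an off-the-shelf EM implementation.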
In the recognition module, we obtain the probe cycle curves in the same way and evaluate them against each gait model from the gallery. Statistical estimates of how well the probe fits each gait model from the gallery are then obtained based on the Bayesian rule (3):
where gi is the model from the gallery, Y is the matrix of probe curve descriptors and yn are the rows of this matrix corresponding to curve descriptors. The model with maximal estimate is chosen as a match.
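The matching step can be sketched under the assumption that the Bayesian rule takes the usual form, with the posterior of model gi proportional to its prior times the product of the mixture densities of the rows yn, and that the rows are treated as independent. With equal priors this reduces to picking the model with the largest summed log-likelihood. The function names, the tuple layout of a model and the toy gallery are hypothetical.

```python
import numpy as np

def gmm_log_density(Y, weights, means, covariances):
    """Log density of each row of Y under a Gaussian mixture with full covariances."""
    Y = np.asarray(Y, dtype=float)
    d = Y.shape[1]
    per_component = []
    for w, mu, cov in zip(weights, means, covariances):
        diff = Y - mu
        solved = np.linalg.solve(cov, diff.T).T        # cov^-1 (y - mu) for every row
        maha = np.einsum("ij,ij->i", diff, solved)     # squared Mahalanobis distance
        _, logdet = np.linalg.slogdet(cov)
        per_component.append(np.log(w) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha))
    return np.logaddexp.reduce(np.vstack(per_component), axis=0)

def best_match(Y, gallery, priors=None):
    """Score the probe descriptors Y against every model; return the best name and all scores."""
    scores = {}
    for name, (weights, means, covariances) in gallery.items():
        log_prior = 0.0 if priors is None else float(np.log(priors[name]))
        scores[name] = log_prior + float(gmm_log_density(Y, weights, means, covariances).sum())
    return max(scores, key=scores.get), scores

# Toy gallery: two single-component "mixtures" in the 6D descriptor space.
eye = np.eye(6)
gallery = {
    "subject_a": ([1.0], [np.zeros(6)], [eye]),
    "subject_b": ([1.0], [np.full(6, 5.0)], [eye]),
}
probe = np.zeros((3, 6))   # three probe descriptors lying at subject_a's mean
match, scores = best_match(probe, gallery)
```

Working with log densities avoids numerical underflow when the product runs over many probe descriptors.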
4. Walking Speed Normalization
Changes in walking speed present one of the major influence factors with negative impact on recognition results. When a person changes walking speed, dynamic features (e.g. stride length and joint angles) are changed, while static features (e.g. thigh and shin lengths) remain unchanged.
The first direct consequence of such a change is reflected in gait cycle duration (i.e. execution rate). The faster we walk, the shorter the time required to complete the gait cycle. Methods must therefore consider this effect in order to be able to compare the gaits of humans walking at different speeds. Methods can handle this implicitly by observing the gait cycle as one spatio-temporal sample. In the case of averaged silhouettes [7] all silhouettes of one gait cycle are consolidated into a single averaged image. Another option is to consider only specific key frames and ignore all others, similar to the way the authors in [13,15] took only some frames from double-support and single-support mid-stance phases. Some methods explicitly adapt to changed gait cycle duration. Authors in [14] use gait dynamics normalization, authors in [16] use dynamic time warping and authors in [8] use temporal re-sampling by interpolating point trajectories to the same length.
Most of the aforementioned methods, however, do not consider the spatial (anthropometric) changes caused by varying walking speed. It can be observed that changes in walking speed are reflected in a changed step size [22]. Fast walk means larger steps, whereas slow walk means smaller steps. These changes can be handled either by invariant features [16] or by scaling based transformations [13], where the authors normalized double-support frames in the lower silhouette part to adjust to the target walking speed.
We used a similar idea on the averaged silhouettes method. Since step size is proportional to the maximal silhouette width in the area of the legs, averaged silhouettes need to be either shrunk or stretched by a specific factor to adapt to the target speed (Fig. 3). Such normalization enables better comparison of silhouettes taken from subjects with different walking speeds.
Fig. 3. Side-view averaged silhouette examples for fast walk (left) and slow walk (right).
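A possible form of this silhouette normalization is sketched below. The assumptions are ours: the scaling factor is taken as the ratio of target to source leg width, the whole image is stretched horizontally about its centre column with linear interpolation, and the row range covering the legs is supplied by the caller. The function names and toy image are hypothetical.

```python
import numpy as np

def leg_width(avg, leg_rows, threshold=0.5):
    """Widest horizontal extent (in pixels) of the silhouette within the leg rows."""
    cols = np.flatnonzero((avg[leg_rows] > threshold).any(axis=0))
    return 0 if cols.size == 0 else int(cols[-1] - cols[0] + 1)

def scale_width(avg, s):
    """Stretch (s > 1) or shrink (s < 1) an image horizontally about its centre column."""
    h, w = avg.shape
    centre = (w - 1) / 2.0
    cols = np.arange(w)
    src = centre + (cols - centre) / s   # source column sampled for each output column
    return np.vstack([np.interp(src, cols, row, left=0.0, right=0.0) for row in avg])

# Toy averaged silhouette: a 5-pixel-wide block in the lower two rows.
avg = np.zeros((4, 11))
avg[2:, 3:8] = 1.0
legs = slice(2, 4)

source_width = leg_width(avg, legs)      # 5 pixels
target_width = 7                         # hypothetical width observed at the target speed
s = target_width / source_width
scaled = scale_width(avg, s)
```

In this toy case the thresholded leg width grows from 5 to 7 pixels while the image size stays the same, so the scaled silhouette can still be compared pixel by pixel with a gallery image.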
Furthermore, we adapted this normalization for the trajectory based probabilistic spatio-temporal model, where a similar effect can be achieved by transforming the trajectory curves (Fig. 4). We transform the curve shape in the dominant direction of walking, which corresponds to the orientation of the main principal vector obtained by the PCA step of the spatio-temporal model method. Since curve shape is described by displacements of curve points from the centroid (see (1), (2)), the normalization can be achieved by multiplication by the speed-change factor s, which we previously assessed from the difference of maximal curve widths of the two gait examples being compared in the area of the legs. To apply the transformation to an arbitrary walking direction (as opposed to the right-to-left walking direction usually observed from the side view, as in the case of averaged silhouettes), the curve must first be rotated to achieve alignment with the x axis, then scaled and finally rotated back to the original orientation (Fig. 5).
Fig. 4. Side-view trajectories based method examples for fast walk (left) and slow walk (right).
Fig. 5. Curve rotation and scaling: a. original curve under walking direction ϕ, b. curve rotated by −ϕ to align with the x axis, c. curve scaled along the x axis, d. scaled curve rotated back to the original walking direction ϕ.
Curve points xi are modified by the geometric transformation (4). R is the rotation by angle ϕ (5) and the scaling matrix S is composed of the previously assessed speed-change factor s, which scales the curve in the direction of the x axis (6).
In the case of side view, as demonstrated in Fig. 4, the dominant walking direction is already aligned with the x axis, therefore ϕ is 0.
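The rotate, scale and rotate-back sequence can be sketched as a single 2x2 matrix. The sketch assumes the transformation has the form R(ϕ) S R(−ϕ) with S = diag(s, 1) and that it is applied to centroid-subtracted curve points; the function names are hypothetical.

```python
import numpy as np

def rotation(phi):
    """2x2 counter-clockwise rotation matrix for angle phi (radians)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s], [s, c]])

def normalize_speed(shape_points, s, phi=0.0):
    """Scale centroid-subtracted curve points by s along the walking direction phi."""
    R = rotation(phi)
    S = np.diag([s, 1.0])
    T = R @ S @ R.T   # rotate by -phi, scale along x, rotate back by phi
    return np.asarray(shape_points, dtype=float) @ T.T

# Side view (phi = 0): only the x displacement is scaled.
side = normalize_speed([[1.0, 1.0]], s=2.0)

# Walking direction along the y axis (phi = 90 degrees): only y is scaled.
vertical = normalize_speed([[1.0, 1.0]], s=2.0, phi=np.pi / 2)

# A displacement that lies along the walking direction is stretched by exactly s.
phi = 0.3
along = normalize_speed([[np.cos(phi), np.sin(phi)]], s=1.5, phi=phi)
```

Displacements perpendicular to the walking direction are left unchanged, which matches the idea that only the step size varies with speed.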
5. Evaluation and Results
The described methods were tested on the MoBo gait database [23], which contains 25 subjects walking on a treadmill, captured from six different view angles and in four different walking types (slow walk at 3.3 km/h, fast walk at 4.5 km/h, incline and object carrying).
Several experiments were performed to illustrate the differences in recognition performance on unchanged and changed walking speeds for gallery and probe. In the case of unchanged walking speed, we split the video sequences into two halves, one for training the gallery and the other for the probe, whereas under changed conditions the whole sequences were used, since gallery and probe are taken from different video groups.
In Table 1 and Table 2 we present the results of the averaged silhouette method and the probabilistic spatio-temporal model, respectively. Both methods exhibit good performance under unchanged capturing conditions and indicate that use for biometric identification is possible. However, in real-life applications we usually have to deal with an uncontrolled environment, where capturing conditions constantly change. Gallery models are usually built in a controlled environment (in the lab), whereas probe examples are captured under some other, changed conditions. The results shown in Table 1 and Table 2 (left column) confirm this assumption: relatively good performance is achieved for unchanged walking speeds and considerably worse performance for experiments with changed walking speeds.
Table 1. Recognition results for Averaged Silhouette (AS) method on unchanged walking speed (F/F and S/S) and changed walking speed (F/S and S/F) with and without walking-speed normalization (S – slow walk, F – fast walk). Rank 1: correct hit from gallery was found in the best estimated model, rank 5: correct hit was found among 5 best estimated models.
Table 2. Recognition results for Probabilistic Spatio-Temporal Model (PSTM) with and without walking speed normalization.
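The rank 1 and rank 5 measures used in the table captions can be computed as in the following sketch. It assumes a score matrix with one row per probe and one column per gallery model, where a higher score means a better fit; the function name and the toy scores are hypothetical and are not results from the experiments.

```python
import numpy as np

def rank_k_rate(scores, true_index, k):
    """Fraction of probes whose correct gallery model is among the k best-scoring models."""
    scores = np.asarray(scores, dtype=float)          # shape: (probes, gallery models)
    order = np.argsort(-scores, axis=1)[:, :k]        # indices of the k highest scores per probe
    hits = (order == np.asarray(true_index)[:, None]).any(axis=1)
    return float(hits.mean())

# Toy scores for three probes against a three-model gallery.
scores = [[0.90, 0.10, 0.00],
          [0.20, 0.50, 0.30],
          [0.40, 0.35, 0.25]]
truth = [0, 2, 1]   # index of the correct gallery model for each probe

rank1 = rank_k_rate(scores, truth, k=1)
rank2 = rank_k_rate(scores, truth, k=2)
```

Here only the first probe is matched at rank 1, while all three are matched once the two best models are considered.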
In the previous section we described the idea of mitigating the effects of walking speed changes on recognition results by a transformation based on walking speed normalization. The results shown in Table 1 and Table 2 (right columns) exhibit significant improvement over the un-normalized versions for both methods, which confirms the assumption that walking speed changes are reflected in step size and that methods can benefit from such normalization. In the case of averaged silhouettes we observe an average performance gain of 16%, whereas in the case of the probabilistic spatio-temporal model the average performance gain is 24% compared to the baseline method performance.
For all the experiments, the results of averaged silhouette method are better than the results of probabilistic spatio-temporal model, which is due to the increased role that appearance plays in silhouette based methods. Silhouette shapes do not change much when changing walking speed. On the other hand, gait dynamics, which plays an important role in probabilistic spatio-temporal model, is much more affected by this change.
In Table 3 we provide the results of other state-of-the-art works treating walking speed changes, as reported by authors for same or similar experiments. Some authors used different gait databases with different walking speeds (see results below for additional info). Such comparison is possible as long as we concentrate on comparing the performance gain and the difference between fast and slow walking speed is similar to the difference in our experiment.
Table 3. Comparison to other works treating walking speed changes. Methods are presented by columns BL, GDN, GSVT, FT, SVBFP, ASn and PSTMn. MINUS (–) denotes the lack of reported results. Row PG states the performance gain obtained by specially treated walking speed changes as opposed to no such treatment for F/S experiment only. For GDN and GSVT baseline performance is used, as they are silhouette based and no other baseline performance is reported.
Baseline algorithm (BL) [12] does not specially treat walking speed changes, but serves as performance gain baseline for some algorithms. Other methods are Gait Dynamics Normalization (GDN) [14], Gait Silhouette Volume Transformation (GSVT) [18], Feature Transformation (FT) [13], Shape Variation Based Frieze Patterns (SVBFP) [15] and previously described normalized Averaged Silhouette (ASn) and normalized Probabilistic Spatio-Temporal Model (PSTMn). Row PG states the performance gain of specially treated walking speed changes as opposed to no special treatment for F/S experiment only, as this experiment is reported for most methods. If not specifically reported, baseline performance was used for this purpose.
The results in Table 3, especially the bottom row, indicate that the walking speed normalization presented in this paper is beneficial for both method types: appearance based methods and methods that concentrate on modeling gait dynamics. With its use, methods that do not account for walking speed changes can move closer to the performance level of methods that do. It is also possible that methods that already handle walking speed changes in some other manner could benefit from this normalization.
6. Conclusion
In this paper we compared several state-of-the-art methods for gait recognition, gave testing results and clearly indicated problems caused by changed walking speed, which gait recognition methods should solve in order to be useful in biometric applications. Furthermore, we demonstrated how walking speed changes can be mitigated by normalization based on geometric transformations and showed how methods from different groups (appearance based and gait dynamics based methods) can benefit from such transformations and improve their recognition results.
Although MoBo gait database is relatively small, it still serves the purpose of demonstrating the method performance, especially since it contains several influence factors (e.g. speed change, view change, object carrying condition, incline walk) that can be evaluated with respect to proposed methods and the proof of concept can be shown even on such database. Nevertheless, we are aware that experiments on larger database would be required to confirm the exact percentages of improved recognition performance for more precise result analysis and comparison.
Also, there are further problems that need to be addressed before methods can be used in biometric applications. Especially those caused by changes of view remain prominent. Some possibilities employing multiple view geometry are given in [19], another possibility is the employment of view-based feature transformations as described in [20], and feature selection process can be adopted [24] to only use features that remain invariant across changed conditions when performing recognition.
Finally, we can conclude that although today’s gait recognition methods show promising results from the aspect of recognition at a distance, a number of unsolved problems still need to be addressed before they can be successfully used in biometric applications. There is still no generic method that solves all of these problems at once. Methods rather specialize in solving each problem separately, which offers a slow but steady stepwise approach to the final goal of usable recognition performance in real-life environments.
References
[1] G. Johansson, "Visual motion perception," Scientific American, vol. 232, no. 6, pp. 76-88, 1976.
[2] I. Bouchrika, M. S. Nixon, "People detection and recognition using gait for automated visual surveillance," IET Conference on Crime and Security, pp. 576-581, 2006.
[3] B. Pogorelc, Z. Bosnić, M. Gams, "Automatic recognition of gait-related health problems in the elderly using machine learning," Multimedia Tools and Applications, vol. 58, no. 2, pp. 333-354, 2012. https://doi.org/10.1007/s11042-011-0786-1
[4] R. Bartlett, "Artificial intelligence in sports biomechanics: New dawn or false hope?," Journal of Sports Science and Medicine, vol. 5, pp. 474-479, 2006.
[5] T. B. Moeslund, A. Hilton, V. Krüger, "A survey of advances in vision-based human motion capture and analysis," Computer Vision and Image Understanding, vol. 104, no. 2-3, pp. 90-126, 2006. https://doi.org/10.1016/j.cviu.2006.08.002
[6] I. Bouchrika, J. N. Carter, M. S. Nixon, R. Morzinger, G. Thallinger, "Using Gait Features for Improving Walking People Detection," in Proc. of International Conference on Pattern Recognition, pp. 3097-3100, 2010.
[7] Z. Liu, S. Sarkar, "Simplest Representation Yet for Gait Recognition: Averaged Silhouette," in Proc. of International Conference on Pattern Recognition, vol. 4, pp. 211-214, 2004.
[8] M. Peternel, A. Leonardis, "Visual Learning and Recognition of a Probabilistic Spatio-Temporal Model of Cyclic Human Locomotion," in Proc. of International Conference on Pattern Recognition, vol. 4, pp. 146-149, 2004.
[9] H. Lu, K. Plataniotis, A. Venetsanopoulos, "A full-body layered deformable model for automatic model-based gait recognition," EURASIP Journal on Advances in Signal Processing, pp. 1-14, 2008.
[10] J. H. Yoo, M. S. Nixon, "Automated Markerless Analysis of Human Gait Motion for Recognition and Classification," ETRI Journal of Information, Telecommunications & Electronics, vol. 33, no. 2, pp. 259-266, 2011.
[11] G. Ariyanto, M. S. Nixon, "Marionette mass-spring model for 3D gait biometrics," in Proc. of International Conference on Biometrics (ICB), pp. 354-359, 2012.
[12] S. Sarkar, P. J. Phillips, Z. Liu, I. R. Vega, P. Grother, K. W. Bowyer, "The humanID gait challenge problem: data sets, performance, and analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 2, pp. 162-177, 2005. https://doi.org/10.1109/TPAMI.2005.39
[13] R. Tanawongsuwan, A. Bobick, "Modelling the effects of walking speed on appearance-based gait recognition," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 783-790, 2004.
[14] Z. Liu, S. Sarkar, "Improved gait recognition by gait dynamics normalization," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 6, pp. 863-876, 2006. https://doi.org/10.1109/TPAMI.2006.122
[15] S. Lee, R. Collins, "Shape Variation-Based Frieze Pattern for Robust Gait Recognition," IEEE Conference on Computer Vision and Pattern Recognition, 2007.
[16] A. Veeraraghavan, A. Srivastava, A. K. Roy-Chowdhury, R. Chellappa, "Rate-invariant recognition of humans and their activities," IEEE Transactions on Image Processing, vol. 18, no. 6, pp. 1326-1339, 2009. https://doi.org/10.1109/TIP.2009.2017143
[17] T. Kobayashi, N. Otsu, "Three-way auto-correlation approach to motion recognition," Pattern Recognition Letters, vol. 30, no. 3, pp. 212-221, 2009. https://doi.org/10.1016/j.patrec.2008.09.006
[18] A. Tsuji, Y. Makihara, Y. Yagi, "Silhouette transformation based on walking speed for gait identification," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, pp. 717-722, 2010.
[19] Y. Iwashita, R. Baba, K. Ogawara, R. Kurazume, "Person Identification from Spatio-temporal 3D Gait," in Proc. of International Conference on Emerging Security Technologies, pp. 30-35, 2010.
[20] W. Kusakunniran, Q. Wu, J. Zhang, H. Li, "Cross-view and multi-view gait recognitions based on view transformation model using multi-layer perceptron," Pattern Recognition Letters, vol. 33, no. 7, pp. 882-889, 2011.
[21] I. F. Nizami, S. Hong, H. Lee, S. Ahn, K. Toh, E. Kim, "Multi-view Gait Recognition Fusion Methodology," in Proc. of IEEE Conference on Industrial Electronics and Applications, pp. 2101-2105, 2008.
[22] M. P. Murray, "Gait As A Total Pattern of Movement," American Journal of Physical Medicine, vol. 46, no. 1, pp. 290-333, 1967.
[23] R. Gross, J. Shi, "The CMU Motion of Body (MoBo) Database," Tech. report, Robotics Institute, Carnegie Mellon University, 2001.
[24] R. Martin-Felez, T. Xiang, "Gait Recognition by Ranking," European Conference on Computer Vision, pp. 328-341, 2012.