Lip Recognition Using Active Shape Model and Gaussian Mixture Model

  • 장경식 (Department of Multimedia Engineering, Dong-Eui University)
  • 이임건 (Department of Film and Visual Engineering, Dong-Eui University)
  • Published: 2003.06.01

Abstract

This paper proposes an effective method for recognizing the shape of human lips. A lip shape is represented as a set of points based on the Point Distribution Model (PDM). The lip model is obtained by applying Principal Component Analysis (PCA), and the distribution of the model's shape parameters is estimated with a Gaussian Mixture Model (GMM), whose maximum-likelihood parameters are determined by the Expectation Maximization (EM) algorithm. The lip contour model is built from the gray-value changes at each point of the lip and in the regions around it, and is used when searching for the lip shape in an image. Experiments on a number of images gave encouraging results.
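
As a rough illustration of the EM step mentioned in the abstract, the following sketch fits a Gaussian mixture to a set of scalar values. It is not the authors' implementation: the function names (`gaussian_pdf`, `fit_gmm_em`), the restriction to a single one-dimensional shape parameter, the two-component mixture, the quantile-based initialisation, and the synthetic data are all assumptions made here for brevity.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, k=2, iters=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture to `data` with EM (hypothetical helper)."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialisation: place the means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [max(overall_var, min_var)] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])

        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component received no mass; leave it unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)

    return weights, means, variances


# Synthetic stand-in for one shape parameter observed over many training shapes.
rng = random.Random(0)
b = [rng.gauss(-2.0, 0.5) for _ in range(200)] + [rng.gauss(1.5, 0.8) for _ in range(200)]
weights, means, variances = fit_gmm_em(b)
```

In a full shape model the parameters would form a vector per training shape, so the mixture components would need mean vectors and covariance matrices rather than scalar means and variances; the E-step and M-step keep the same structure.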
