Automatic Coarticulation Detection for Continuous Sign Language Recognition

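  • Hee-Deok Yang (양희덕; Department of Computer Engineering, Chosun University);
  • Seong-Whan Lee (이성환; Department of Computer and Communication Engineering, Korea University)
  • Published: 2009.01.15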

Abstract

Sign language spotting is the task of detecting and recognizing meaningful signs within a continuous stream of hand gestures. It is difficult because signs vary widely in both motion and hand shape. Moreover, in natural signed utterances, signs appear interspersed with transitional movements between vocabulary signs and with non-sign patterns (which include out-of-vocabulary signs, epentheses, and other movements that do not correspond to signs). This paper proposes a novel method for designing a threshold model within a conditional random field (CRF). The proposed model serves as an adaptive threshold for distinguishing signs in the vocabulary from non-sign patterns. To further improve spotting and recognition accuracy, the system also includes a hand appearance-based sign verification method, a short-sign detector, and a subsign reasoning method. In experiments, the proposed method achieved an 88% spotting rate on continuous data and a 94% recognition rate on isolated (pre-segmented) data, compared with 74% and 90%, respectively, for a CRF without the threshold model, short-sign detector, subsign reasoning, and hand appearance-based sign verification.
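The abstract does not describe how the threshold model is built, so the following is only a minimal sketch of the general idea of an adaptive threshold label, not the authors' CRF. It assumes that some sequence model has already produced a score per frame for each vocabulary sign and for a separate non-sign label; a frame counts as part of a sign only when the best sign score exceeds that frame's non-sign score. The label names, the `spot_signs` function, the `min_len` parameter, and the example scores are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

NON_SIGN = "non-sign"  # hypothetical name for the threshold (non-sign) label


def spot_signs(
    frame_scores: List[Dict[str, float]], min_len: int = 1
) -> List[Tuple[str, int, int]]:
    """Return (label, start_frame, end_frame) for each spotted sign segment.

    A frame is assigned to a vocabulary sign only if that sign's score is
    higher than the score of the non-sign label in the same frame, so the
    threshold adapts from frame to frame instead of being a fixed constant.
    """
    segments: List[Tuple[str, int, int]] = []
    current: Optional[str] = None  # label of the segment being built
    start = 0
    for t, scores in enumerate(frame_scores):
        threshold = scores[NON_SIGN]
        best_label, best_score = max(
            ((lab, s) for lab, s in scores.items() if lab != NON_SIGN),
            key=lambda item: item[1],
        )
        label = best_label if best_score > threshold else None
        if label != current:
            # Close the previous segment if it was a sign and long enough.
            if current is not None and t - start >= min_len:
                segments.append((current, start, t - 1))
            current, start = label, t
    # Close a segment that runs to the end of the sequence.
    if current is not None and len(frame_scores) - start >= min_len:
        segments.append((current, start, len(frame_scores) - 1))
    return segments


# Hypothetical per-frame scores for a two-sign vocabulary.
frames = [
    {"HELLO": 0.2, "THANKS": 0.1, NON_SIGN: 0.7},
    {"HELLO": 0.6, "THANKS": 0.1, NON_SIGN: 0.3},
    {"HELLO": 0.7, "THANKS": 0.1, NON_SIGN: 0.2},
    {"HELLO": 0.3, "THANKS": 0.2, NON_SIGN: 0.5},
    {"HELLO": 0.1, "THANKS": 0.8, NON_SIGN: 0.1},
]
print(spot_signs(frames))
```

In this toy setup, raising `min_len` discards very short segments. That is the simplest possible treatment of short signs and is not meant to represent the short-sign detector mentioned in the abstract.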
