Recognizing User Engagement and Intentions based on the Annotations of an Interaction Video


  • Jang, Minsu (Electronics and Telecommunications Research Institute) ;
  • Park, Cheonshu (Electronics and Telecommunications Research Institute) ;
  • Lee, Dae-Ha (Electronics and Telecommunications Research Institute) ;
  • Kim, Jaehong (Electronics and Telecommunications Research Institute) ;
  • Cho, Young-Jo (Electronics and Telecommunications Research Institute)
  • Received : 2014.02.15
  • Accepted : 2014.03.30
  • Published : 2014.06.01

Abstract

A pattern classifier-based approach for recognizing the internal states of human participants in interactions is presented along with its experimental results. The approach consists of collecting video recordings of human-human or human-robot interactions and then analyzing the videos based on human-coded annotations. The annotations cover both social signals directly observed in the recordings and the internal states of the human participants indirectly inferred from those observed signals. A pattern classifier is then trained on the annotation data and tested. In our experiments on human-robot interaction, 7 video recordings were collected and annotated with 20 social signals and 7 internal states. Several experiments were performed, yielding recall rates of 84.83% for interaction engagement, 93% for concentration intention, and 81% for task comprehension level with a C4.5-based decision tree classifier.
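The pipeline described above (annotated social signals as features, inferred internal states as labels, a C4.5-style decision tree, recall as the evaluation metric) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the synthetic features and labels, the feature encoding, and the use of scikit-learn's entropy-criterion tree as a stand-in for C4.5 are all assumptions.

```python
# Minimal sketch of the classification step, assuming binary per-segment
# social-signal features and a binary engagement label (both hypothetical).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the annotation data: 20 observed social signals
# (e.g., gaze-at-robot, nodding, smiling) per annotated video segment.
n_segments, n_signals = 500, 20
X = rng.integers(0, 2, size=(n_segments, n_signals))

# Synthetic engagement labels loosely tied to a few signals, for illustration only.
y = ((X[:, 0] + X[:, 3] + X[:, 7]) >= 2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# The entropy criterion approximates C4.5's information-gain splitting;
# true C4.5 (gain ratio, error-based pruning) would need a dedicated implementation.
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X_train, y_train)

print("engagement recall:", recall_score(y_test, clf.predict(X_test)))
```

In the reported setup, the same train-and-evaluate procedure would be repeated per target state (interaction engagement, concentration intention, task comprehension level), each with its own label column derived from the annotations.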

