Accelerometer-based Gesture Recognition for Robot Interface

  • 장민수 (Robot/Cognitive System Research Department, Electronics and Telecommunications Research Institute)
  • 조용석 (Department of Electronic and Information Engineering, Konyang University)
  • 김재홍 (Robot/Cognitive System Research Department, Electronics and Telecommunications Research Institute)
  • 손주찬 (Robot/Cognitive System Research Department, Electronics and Telecommunications Research Institute)
  • Received : 2011.02.09
  • Accepted : 2011.02.26
  • Published : 2011.03.31

Abstract

Vision- and voice-based technologies are commonly utilized for human-robot interaction. However, it is widely recognized that the performance of vision- and voice-based interaction systems deteriorates sharply in real-world situations due to environmental and user variance. Users must be highly cooperative to obtain reasonable performance, which significantly limits the usability of vision- and voice-based human-robot interaction technologies. As a result, touch screens remain the major medium of human-robot interaction in real-world applications. To broaden the usability of robots for various services, alternative interaction technologies should be developed to complement the weaknesses of vision- and voice-based approaches. In this paper, we propose an accelerometer-based gesture interface as one such alternative: accelerometers are effective at detecting the movements of the human body, their performance is not limited by environmental factors such as lighting conditions or a camera's field of view, and they are now widely available in many mobile devices.

We tackle the problem of classifying the acceleration signal patterns of the 26 letters of the English alphabet, an essential repertoire for realizing robot-based education services. Recognizing 26 English handwriting patterns from accelerometers is very challenging because of the large number of pattern classes and the complexity of each pattern. The most difficult comparable problem previously undertaken was recognizing the acceleration signal patterns of the 10 handwritten digits; most prior studies dealt with sets of 8~10 simple, easily distinguishable gestures useful for controlling home appliances, computer applications, robots, and so on. Good features are essential for successful pattern recognition. To improve discriminative power over the complex alphabet patterns, we extract 'motion trajectories' from the input acceleration signal and use them as the main feature. Investigative experiments showed that trajectory-based classifiers performed 3%~5% better than those using raw features such as the acceleration signal itself or statistical summaries. To minimize trajectory distortion, we apply a simple but effective set of smoothing and band-pass filters.

It is well known that acceleration patterns for the same gesture differ greatly across performers. To address this, our system applies online incremental learning so that it adapts to each user's distinctive motion characteristics. The system is based on instance-based learning (IBL), in which each training sample is memorized as a reference pattern. Brute-force incremental learning in IBL accumulates reference patterns without bound, which not only slows down classification but also degrades recall performance. Regarding the latter, we observed that as the number of reference patterns grows, some of them contribute increasingly to false positive classifications. We therefore devised an algorithm that optimizes the reference pattern set based on the positive and negative contribution of each reference pattern: it runs periodically and removes reference patterns with a very low positive contribution or a high negative contribution.
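As a rough illustration of the trajectory feature described above, the following sketch smooths and band-pass filters a 3-axis acceleration stream and integrates it twice into a motion trajectory. The filter design (a Butterworth band-pass via SciPy), the cutoff frequencies, and the smoothing window are illustrative assumptions; the abstract specifies only that a simple set of smoothing and band-pass filters is applied. Only the 100 Hz, 3-axis sampling is taken from the paper.

```python
# Sketch: raw acceleration -> filtered signal -> motion trajectory.
# Assumed parameters: 0.5-15 Hz Butterworth band-pass, 5-sample moving
# average. The paper does not disclose its exact filter parameters.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 100.0  # sampling rate reported in the paper: 100 Hz on 3 axes

def smooth(accel, window=5):
    """Moving-average smoothing applied independently per axis."""
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda x: np.convolve(x, kernel, mode="same"), 0, accel)

def band_pass(accel, low=0.5, high=15.0, order=2):
    """Zero-phase band-pass filter: suppresses drift (gravity, integration
    bias) below `low` and hand jitter above `high`."""
    b, a = butter(order, [low / (FS / 2), high / (FS / 2)], btype="band")
    return filtfilt(b, a, accel, axis=0)

def trajectory(accel):
    """accel: (T, 3) array of acceleration samples -> (T, 3) trajectory.
    Double integration of the filtered signal approximates the path traced
    by the hand while writing a letter."""
    filtered = band_pass(smooth(accel))
    velocity = np.cumsum(filtered, axis=0) / FS
    return np.cumsum(velocity, axis=0) / FS
```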
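The instance-based learner with contribution-based pruning could be organized as in the sketch below. The 1-nearest-neighbour rule, the Euclidean distance over length-normalized trajectories, and the pruning thresholds are all assumptions made for illustration; the abstract states only that each training sample is memorized as a reference pattern, that positive and negative contributions are tracked, and that patterns with a very low positive or a high negative contribution are removed periodically.

```python
# Sketch: instance-based learning (IBL) with periodic pruning of the
# reference pattern set. Distance measure and thresholds are assumptions.
import numpy as np

def resample(traj, length=64):
    """Length-normalize a (T, 3) trajectory by linear interpolation so
    that gestures of different durations become comparable."""
    t_old = np.linspace(0.0, 1.0, len(traj))
    t_new = np.linspace(0.0, 1.0, length)
    return np.stack(
        [np.interp(t_new, t_old, traj[:, d]) for d in range(3)], axis=1)

class IBLGestureClassifier:
    def __init__(self):
        # Each reference: [pattern, label, positive_count, negative_count].
        self.refs = []

    def learn(self, traj, label):
        """Online incremental learning: memorize the sample as a reference."""
        self.refs.append([resample(traj), label, 0, 0])

    def classify(self, traj, true_label=None):
        """1-nearest-neighbour classification. When the true label is known,
        update the winning reference's contribution counts."""
        if not self.refs:
            raise ValueError("no reference patterns learned yet")
        x = resample(traj)
        best = min(range(len(self.refs)),
                   key=lambda i: np.linalg.norm(x - self.refs[i][0]))
        pred = self.refs[best][1]
        if true_label is not None:
            self.refs[best][2 if pred == true_label else 3] += 1
        return pred

    def prune(self, min_pos=1, max_neg=3):
        """Run periodically: drop references with very low positive or high
        negative contribution. Thresholds are illustrative; in practice,
        freshly memorized references would be exempted until they have
        participated in enough classifications."""
        self.refs = [r for r in self.refs
                     if r[2] >= min_pos and r[3] <= max_neg]
```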
Experiments were performed on 6,500 gesture patterns collected from 50 adults aged 30~50. Each letter was performed 5 times per participant using a Nintendo® Wii™ remote, and the acceleration signal was sampled at 100 Hz on 3 axes. The mean recall rate over all letters was 95.48%. Some letters recorded very low recall rates and exhibited high pairwise confusion. The major confusion pairs were D (88%) and P (74%), I (81%) and U (75%), and N (88%) and W (100%); though W was recalled perfectly, it contributed heavily to the false positive classification of N. Compared with major previous results from VTT (96% for 8 control gestures), CMU (97% for 10 control gestures) and Samsung Electronics (97% for 10 digits and a control gesture), the performance of our system is superior considering the number of pattern classes and the complexity of the patterns. Using our gesture interaction system, we conducted 2 case studies of robot-based edutainment services. The services were implemented on various robot platforms and mobile devices including the iPhone™. The participating children exhibited improved concentration and more active reactions to the services with our gesture interface. To assess the effectiveness of the gesture interface, the children took a test after experiencing an English teaching service; those who played with the gesture interface-based robot content scored 10% better than those taught conventionally. We conclude that the accelerometer-based gesture interface is a promising technology for growing real-world robot-based services and content by complementing the limits of today's conventional interfaces, e.g. touch screens, vision and voice.

Vision- or voice-recognition technologies are generally used for interaction with a robot itself or with the content it carries. However, vision and voice recognition still face technical and environmental difficulties, and real deployments often require considerable cooperation from the user. For this reason, interaction with robots is currently developed mainly around touch-screen interfaces. To expand and diversify robot services in the future, interface technologies that can complement the existing vision- and voice-centered ones need to be developed. This paper introduces the development of accelerometer-based gesture recognition technology for use as a robot interface. We evaluate its performance on the comparatively difficult problem of recognizing the 26 letters of the English alphabet and present cases in which the developed technology was applied to robots. As more devices with built-in accelerometers become available and are used as robot interfaces, we expect that robot interfaces and content, currently centered on touch screens, can be extended in diverse forms.

Keywords

References

  1. 김계경, 김혜진, 조수현, 이재연, "Gesture Recognition Technology for Human-Robot Interaction", Electronics and Telecommunications Trends, Vol.20, No.2 (2005).
  2. 곽근창, 김혜진, 배경숙, 윤호섭, "Audio-based Human-Robot Interaction Technology", Electronics and Telecommunications Trends, Vol.22, No.2 (2007).
  3. Milner, B., "Handwriting recognition using acceleration-based motion detection", IEE Colloquium on Document Image Processing and Multimedia (Ref. No. 1999/041), 1999.
  4. Sawada, H. and S. Hashimoto, "Gesture Recognition Using an Acceleration Sensor and Its Application to Musical Performance Control", Electronics and Communications in Japan, Part III, Vol.80, No.5 (1997), 9-17. https://doi.org/10.1002/(SICI)1520-6440(199705)80:5<9::AID-ECJC2>3.0.CO;2-J
  5. Farella, E., L. Benini, B. Ricco and A. Acquaviva, "MOCA: A Low-Power, Low-Cost Motion Capture System Based on Integrated Accelerometers", Advances in Multimedia, 2007.
  6. Benbasat, A. Y. and J. A. Paradiso, "An Inertial Measurement Framework for Gesture Recognition and Applications", International Gesture Workshop on Gesture and Sign Language in Human-Computer Interaction, London, 2001.
  7. Wilson, D. H. and A. Wilson, "Gesture Recognition Using The XWand", Technical Report CMU-RI-TR-04-57, Robotics Institute, Carnegie Mellon University, 2004.
  8. Kela, J., P. Korpipaa, J. Mantyjarvi, S. Kallio, G. Savino, L. Jozzo and D. Marca, "Accelerometer-based gesture control for a design environment", Personal and Ubiquitous Computing, Vol.10, No.5 (2006), 285-299. https://doi.org/10.1007/s00779-005-0033-8
  9. Bailador, G., D. Roggen, G. Troster and G. Trivino, "Real time gesture recognition using Continuous Time Recurrent Neural Networks", In Proceedings of the ICST 2nd international conference on Body area networks, 2007.
  10. Oh, J. K., S. J. Cho, W. C. Bang, W. Chang, E. S. Choi, J. Yang, J. K. Cho and D. Y. Kim, "Inertial Sensor Based Recognition of 3-D Character Gestures with an Ensemble of Classifiers", In Proceedings of Ninth International Workshop on Frontiers in Handwriting Recognition, 2004.
  11. Lim, J. G., Y. I. Sohn and D. S. Kwon, "Realtime Accelerometer Signal Processing of End Point Detection and Feature Extraction for Motion Detection", International Federation of Automatic Control-Human Machine System, Seoul, Korea, 2007.
  12. Liu, J., Z. Wang, L. Zhong, J. Wickramasuriya and V. Vasudevan, "uWave: Accelerometer-based Personalized Gesture Recognition", TR0630-08, Rice University and Motorola Labs, 2008.
  13. Leong, T. S., J. Lai, J. Panza, P. Pong and J. Hong, "Wii Want to Write: An Accelerometer Based Gesture Recognition System", 2009.
  14. Kratz, L., M. Smith and F. J. Lee, "Wiizards: 3D gesture recognition for game play input", Proceedings of the 2007 Conference on Future Play, Toronto, Canada, 2007.
  15. Kallio, S., J. Kela, P. Korpipaa and J. Mantyjarvi, "User Independent Gesture Interaction for Small Handheld Devices", International Journal of Pattern Recognition and Artificial Intelligence, Vol.20, No.4 (2006), 505-524. https://doi.org/10.1142/S0218001406004776
  16. Nittono, H., "Event-Related Brain Potentials Corroborate Subjectively Optimal Delay in Computer Response to a User's Action", HCII 2007, LNAI 4562 (2007), 575-581.
  17. Wobbrock, J. O., A. D. Wilson and Y. Li, "Gestures without libraries, toolkits, or training: a $1 recognizer for user interface prototypes", In Proceedings of the 20th Annual ACM Symposium on User Interface Software and Technology (2007), 159-168.
