Therapeutic Robot Action Design for ASD Children Using Speech Data

  • Lee, Jin-Gyu (Dept. of Electrical Engineering, Semyung University) ;
  • Lee, Bo-Hee (Dept. of Electrical Engineering, Semyung University)
  • Received : 2018.12.04
  • Accepted : 2018.12.18
  • Published : 2018.12.31

Abstract

A cat robot for Autism Spectrum Disorder (ASD) treatment was designed, and a field test was conducted. The robot expressed emotions through touch-based interaction and performed plausible emotional expressions based on an Artificial Neural Network (ANN). However, these operations were difficult to use across the various healing activities. In this paper, we describe an action design that can be used in a variety of contexts and lets the robot react flexibly in various situations. As the necessary elements, a speech data collection method and a speech recognition system using an ANN are proposed, and the classification results are analyzed after experiments. In future work, the ANN will be improved by collecting more diverse voice data to raise its accuracy, and its effectiveness will be checked through field tests.

In a previous study, a robot that can be used for the treatment of children exhibiting the various characteristic symptoms of autism spectrum disorder was designed, built, and field-tested; that robot expressed emotions through touch-based interaction with the children. For emotion education and therapy, its actions were designed using an artificial neural network driven by this touch interaction. However, such physical contact is difficult to use in the early stage of therapeutic activities, so early treatment effects were hard to obtain. This paper therefore complements the action scheme and describes an action design in which fast interaction through speech data makes therapeutic activities possible from the start and allows the robot to respond flexibly in various situations. As the necessary elements, a speech data collection method and a speech recognition structure using an artificial neural network were designed, and the classification results were analyzed through experiments. In future work, the designed neural network will be improved by collecting diverse voice data to raise its accuracy, and the effectiveness of the actions will be examined through field tests.
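The abstract describes classifying collected speech data with a neural network so the robot can react to spoken commands. A minimal sketch of such a recurrent classifier is given below; the command set, feature size (13 MFCC-like coefficients per frame), and network dimensions are illustrative assumptions, not the paper's actual configuration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTMClassifier:
    """Minimal LSTM forward pass over a speech-feature sequence."""
    def __init__(self, n_features, n_hidden, n_classes):
        # One weight matrix per gate, acting on [h, x] concatenated.
        self.W = {g: rng.standard_normal((n_hidden, n_hidden + n_features)) * 0.1
                  for g in ("f", "i", "o", "c")}
        self.b = {g: np.zeros(n_hidden) for g in ("f", "i", "o", "c")}
        self.Wy = rng.standard_normal((n_classes, n_hidden)) * 0.1
        self.n_hidden = n_hidden

    def forward(self, x_seq):
        h = np.zeros(self.n_hidden)
        c = np.zeros(self.n_hidden)
        for x in x_seq:                                  # one feature frame per step
            z = np.concatenate([h, x])
            f = sigmoid(self.W["f"] @ z + self.b["f"])   # forget gate
            i = sigmoid(self.W["i"] @ z + self.b["i"])   # input gate
            o = sigmoid(self.W["o"] @ z + self.b["o"])   # output gate
            g = np.tanh(self.W["c"] @ z + self.b["c"])   # candidate cell state
            c = f * c + i * g
            h = o * np.tanh(c)
        logits = self.Wy @ h                             # classify final hidden state
        p = np.exp(logits - logits.max())
        return p / p.sum()                               # class probabilities

COMMANDS = ["come", "sit", "hello"]                      # hypothetical command set
model = TinyLSTMClassifier(n_features=13, n_hidden=16, n_classes=len(COMMANDS))
frames = rng.standard_normal((40, 13))                   # e.g. 40 frames of features
probs = model.forward(frames)
print(COMMANDS[int(np.argmax(probs))], probs.round(3))
```

In a real pipeline the weights would be learned by backpropagation through time, and the predicted class would index into an action table like Table 1 to trigger the corresponding robot motion.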

Fig. 1. Robot field test.

Fig. 2. Robot structure and skin.

Fig. 3. Controller block diagram.

Fig. 4. Representative emotional actions.

Fig. 5. Data graph for three representative emotions.

Fig. 6. Speech data acquisition and transmission.

Fig. 7. Spectrograms of three speech samples.

Fig. 8. Control process.

Fig. 9. Sampled speech and noise data graph.

Fig. 10. Designed ANN structure.

Fig. 11. LSTM learning process.

Fig. 12. Robot actions using speech commands.

Fig. 13. Comparison of desired values and LSTM outputs.

Table 1. Action table of the cat robot.
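The spectrograms of Fig. 7 can in principle be computed with a short-time Fourier transform over windowed audio frames. The sketch below shows this with numpy only; the sample rate, frame length, and hop size are illustrative assumptions, and a synthetic 440 Hz tone stands in for recorded speech.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-negative frequency bins of each frame
    return np.abs(np.fft.rfft(frames, axis=1)).T      # (freq_bins, time_frames)

fs = 8000                                  # assumed sample rate in Hz
t = np.arange(fs) / fs                     # one second of audio
tone = np.sin(2 * np.pi * 440 * t)         # a 440 Hz test tone instead of speech
S = spectrogram(tone)
peak_bin = int(S.mean(axis=1).argmax())    # strongest frequency bin on average
peak_hz = peak_bin * fs / 256              # bin spacing = fs / frame_len = 31.25 Hz
print(S.shape, peak_hz)
```

The energy concentrates in the bin nearest 440 Hz, which is how distinct spoken commands produce visibly different spectrogram patterns for the classifier to learn from.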
