LSTM(Long Short-Term Memory)-Based Abnormal Behavior Recognition Using AlphaPose

  • Received : 2020.12.18
  • Accepted : 2021.03.06
  • Published : 2021.05.31

Abstract

Human action recognition identifies what a person is doing from the movement of their joints. To this end, we use computer vision techniques from image processing. Combined with deep learning and CCTV, action recognition can serve as a safety-accident response service and can be applied at safety management sites. However, relatively few existing studies recognize actions from human joint keypoints extracted with deep learning, and it has been difficult to manage workers continuously and systematically at safety management sites. To address these problems, this paper proposes a method that recognizes risky behavior using only joint keypoints and joint motion information. AlphaPose, a pose estimation method, was used to extract the joint keypoints of the body. The extracted keypoints were fed sequentially into a Long Short-Term Memory (LSTM) model so that it could learn from them as continuous data. Evaluating recognition accuracy showed that the "Lying Down" action was recognized with high accuracy.
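The keypoint-to-LSTM flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a pure-Python forward pass of a single LSTM layer with untrained random weights, and it omits training entirely. The 17-joint layout, the hidden size, the 30-frame clip length, and every action label other than "Lying Down" are assumptions made for illustration, and random coordinates stand in for real AlphaPose output.

```python
import math
import random

NUM_KEYPOINTS = 17            # assumption: COCO-style skeleton with 17 joints
FEATURES = NUM_KEYPOINTS * 2  # one (x, y) pair per joint
HIDDEN = 8                    # assumption: small hidden size for illustration
# Only "Lying Down" appears in the abstract; the other labels are hypothetical.
ACTIONS = ["Standing", "Walking", "Lying Down"]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def init_params(rng):
    """Untrained weights for the four LSTM gates plus a linear output layer."""
    params = {}
    for name in "ifgo":  # input, forget, candidate, output
        params["W" + name] = random_matrix(rng, HIDDEN, FEATURES + HIDDEN)
        params["b" + name] = [0.0] * HIDDEN
    params["W_out"] = random_matrix(rng, len(ACTIONS), HIDDEN)
    return params


def gate(params, name, z, activation):
    pre = matvec(params["W" + name], z)
    return [activation(p + b) for p, b in zip(pre, params["b" + name])]


def lstm_step(params, x, h, c):
    """One LSTM time step: update hidden state h and cell state c from frame x."""
    z = x + h  # concatenate the current frame with the previous hidden state
    i = gate(params, "i", z, sigmoid)
    f = gate(params, "f", z, sigmoid)
    g = gate(params, "g", z, math.tanh)
    o = gate(params, "o", z, sigmoid)
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def flatten_pose(keypoints):
    """Turn [(x, y), ...] for one frame into a flat feature vector."""
    return [coord for point in keypoints for coord in point]


def classify(params, frames):
    """Feed frames in temporal order, then softmax over the final hidden state."""
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    for frame in frames:
        h, c = lstm_step(params, frame, h, c)
    logits = matvec(params["W_out"], h)
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
params = init_params(rng)
# Stand-in for pose-estimator output: 30 frames of normalized joint coordinates.
clip = [
    flatten_pose([(rng.random(), rng.random()) for _ in range(NUM_KEYPOINTS)])
    for _ in range(30)
]
probs = classify(params, clip)
print(dict(zip(ACTIONS, probs)))
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how per-frame keypoints are flattened and consumed one time step at a time. In practice the weights would be learned from labeled clips, typically with a deep learning framework in place of hand-written gates.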

Acknowledgement

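This research was supported by a grant (2020-MOIS31-014) from the Fundamental Technology Development Program for Extreme Disaster Response, funded by the Ministry of the Interior and Safety (MOIS, Korea).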
