• Title/Summary/Keyword: Human Pose Recognition


Viewpoint Unconstrained Face Recognition Based on Affine Local Descriptors and Probabilistic Similarity

  • Gao, Yongbin; Lee, Hyo Jong
    • Journal of Information Processing Systems, v.11 no.4, pp.643-654, 2015
  • Face recognition under controlled settings, such as limited viewpoint and illumination change, can achieve good performance nowadays. However, real-world face recognition remains challenging. In this paper, we propose combining the Affine Scale Invariant Feature Transform (Affine SIFT) and probabilistic similarity for face recognition under large viewpoint changes. Affine SIFT extends the SIFT algorithm to detect affine-invariant local descriptors: it generates a series of different viewpoints using affine transformations, thereby allowing for a viewpoint difference between the gallery face and the probe face. However, the human face is not planar, as it contains significant 3D depth, so Affine SIFT does not work well for significant pose changes. To complement this, we combine it with probabilistic similarity, which computes the log-likelihood between the probe and gallery faces based on a sum-of-squared-differences (SSD) distribution obtained in an offline learning process. Our experimental results show that the framework achieves notably better recognition accuracy than the compared algorithms on the FERET database.
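
A minimal sketch of the viewpoint-simulation idea behind Affine SIFT, assuming OpenCV's SIFT implementation (cv2.SIFT_create); the tilt values and rotation angles are illustrative placeholders, not the authors' parameterization:

```python
import cv2
import numpy as np

def affine_sift_descriptors(gray, tilts=(1.0, 1.4, 2.0), angles=range(0, 180, 45)):
    """Simulate viewpoints with affine warps, then run SIFT on each warp.

    gray: 8-bit grayscale image. Returns keypoints and stacked descriptors
    pooled over all simulated viewpoints.
    """
    sift = cv2.SIFT_create()
    h, w = gray.shape[:2]
    all_kps, all_descs = [], []
    for tilt in tilts:
        for angle in angles:
            # Rotate about the image center, then compress the x-axis
            # to mimic an out-of-plane camera tilt.
            M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
            M[0] /= tilt
            warped = cv2.warpAffine(gray, M, (w, h))
            kps, descs = sift.detectAndCompute(warped, None)
            if descs is not None:
                all_kps.extend(kps)
                all_descs.append(descs)
    return all_kps, np.vstack(all_descs) if all_descs else None
```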

Human Activity Recognition using View-Invariant Features and Probabilistic Graphical Models (시점 불변인 특징과 확률 그래프 모델을 이용한 인간 행위 인식)

  • Kim, Hyesuk; Kim, Incheol
    • Journal of KIISE, v.41 no.11, pp.927-934, 2014
  • In this paper, we propose an effective method for recognizing daily human activities from a stream of three-dimensional body poses obtained with Kinect-like RGB-D sensors. The body pose data provided by the Kinect SDK or OpenNI suffer from both view variance and scale variance, since they are represented in a 3D Cartesian coordinate system whose origin lies at the center of the Kinect. To resolve these problems and obtain view-invariant and scale-invariant features, we transform the pose data into a spherical coordinate system whose origin is placed at the center of the subject's hip, and then perform scale normalization using the length of the subject's arm. To effectively represent the complex internal structure of high-level daily activities, we utilize the Hidden-state Conditional Random Field (HCRF), a type of probabilistic graphical model. Through various experiments on two datasets, KAD-70 and CAD-60, we demonstrate the high performance of our method and its implementation.
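
A minimal numpy sketch of the normalization step described above, assuming each frame's joints arrive as 3D points in the Kinect camera frame; the polar-axis convention and the arm-length argument are illustrative:

```python
import numpy as np

def normalize_pose(joints, hip_center, arm_length):
    """Hip-centered spherical coordinates with arm-length scaling.

    joints:     (J, 3) joint positions in the Kinect Cartesian frame
    hip_center: (3,) position of the subject's hip center
    arm_length: scalar used for scale normalization
    """
    rel = joints - hip_center                       # view normalization
    r = np.linalg.norm(rel, axis=1)
    safe_r = np.maximum(r, 1e-9)                    # avoid division by zero
    theta = np.arccos(np.clip(rel[:, 2] / safe_r, -1.0, 1.0))  # polar angle
    phi = np.arctan2(rel[:, 1], rel[:, 0])          # azimuth
    return np.stack([r / arm_length, theta, phi], axis=1)      # scale-normalized
```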

Design and Development of the Multiple Kinect Sensor-based Exercise Pose Estimation System (다중 키넥트 센서 기반의 운동 자세 추정 시스템 설계 및 구현)

  • Cho, Yongjoo; Park, Kyoung Shin
    • Journal of the Korea Institute of Information and Communication Engineering, v.21 no.3, pp.558-567, 2017
  • In this research, we developed an efficient real-time human exercise pose estimation system using multiple Kinects. The main objective is to measure and recognize the user's posture (such as a knee curl or lunge) more accurately by employing Kinects at the front and the sides. In particular, the system is designed in an extensible, modular way that allows additional postures to be supported in the future. It is configured as multiple clients and a Unity3D server. Each client processes Kinect skeleton data and sends it to the server. The server performs a multiple-Kinect calibration process and then applies a pose estimation algorithm built on a Kinect-based posture recognition model, using feature extraction and weighted averaging of feature values across the different Kinects. This paper presents the design and implementation of the system and describes how to build and run an interactive Unity3D exergame.
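
A minimal sketch of the weighted-averaging fusion mentioned above, assuming each Kinect contributes the same feature vector along with a scalar reliability weight (e.g. favoring the sensor with the best view of the joint); the weighting scheme is an assumption:

```python
import numpy as np

def fuse_kinect_features(features, weights):
    """Weighted average of per-Kinect feature vectors.

    features: (n_kinects, n_features) per-sensor feature values
    weights:  (n_kinects,) reliability weights (hypothetical choice)
    """
    w = np.asarray(weights, dtype=float)
    w /= w.sum()                          # normalize weights to sum to 1
    return np.asarray(features, dtype=float).T @ w   # (n_features,) fused vector
```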

Investigation of image preprocessing and face covering influences on motion recognition by a 2D human pose estimation algorithm (모션 인식을 위한 2D 자세 추정 알고리듬의 이미지 전처리 및 얼굴 가림에 대한 영향도 분석)

  • Noh, Eunsol; Yi, Sarang; Hong, Seokmoo
    • Journal of the Korea Academia-Industrial cooperation Society, v.21 no.7, pp.285-291, 2020
  • In manufacturing, humans are being replaced with robots, but expert skills remain difficult to convert into data, making them hard to transfer to industrial robots. One approach is visual motion recognition, but physical features may be judged differently depending on the image data. This study aimed to improve the accuracy of vision methods for estimating human posture. Three OpenPose vision models were applied: MPII, COCO, and COCO+foot. To identify the effects of face-covering accessories and image preprocessing on the Convolutional Neural Network (CNN) structure, the presence or absence of accessories, image size, and filtering were set as the parameters affecting the identification of a person's posture. For each parameter, image data were applied to the three models, and the errors between the actual and predicted values, as well as the percentage of correct keypoints (PCK), were calculated. The COCO+foot model showed the lowest sensitivity to all three parameters. A reduction in image size of up to 50% (from 3024×4032 to 1512×2016 pixels) was considered acceptable. Emboss filtering, in combination with MPII, provided the best results (errors reduced to under 60 pixels).
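
A minimal sketch of the percentage-of-correct-keypoints (PCK) measure reported above; thresholding by a fraction of a reference length is the standard convention, but the exact threshold used in the paper is not given here, so alpha is an assumption:

```python
import numpy as np

def pck(pred, gt, ref_length, alpha=0.5):
    """Fraction of keypoints predicted within alpha * ref_length pixels
    of the ground truth.

    pred, gt: (K, 2) keypoint coordinates
    ref_length: reference scale, e.g. head or torso size in pixels
    """
    dists = np.linalg.norm(pred - gt, axis=1)   # per-keypoint pixel error
    return float(np.mean(dists <= alpha * ref_length))
```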

Efficient Object Tracking System Using the Fusion of a CCD Camera and an Infrared Camera (CCD카메라와 적외선 카메라의 융합을 통한 효과적인 객체 추적 시스템)

  • Kim, Seung-Hun; Jung, Il-Kyun; Park, Chang-Woo; Hwang, Jung-Hoon
    • Journal of Institute of Control, Robotics and Systems, v.17 no.3, pp.229-235, 2011
  • To build a robust object tracking and identification system for an intelligent robot and/or home system, heterogeneous sensor fusion between a visible-light system and an infrared system is proposed. The proposed system separates the object by combining the ROIs (Regions of Interest) estimated from two different images, based on a heterogeneous sensor that combines an ordinary CCD camera and an IR (infrared) camera. The human body and face are detected in both images using different algorithms, such as histogram, optical flow, skin-color model, and Haar model. The pose of the human body is also estimated from the body detection result in the IR image using the PCA algorithm together with the AdaBoost algorithm. The results from each detection algorithm are then fused to extract the best detection result. To verify the heterogeneous sensor fusion system, experiments were conducted in various environments. The experimental results indicate good tracking and identification performance regardless of environmental changes. The application area of the proposed system is not limited to robots or home systems; it also extends to surveillance and military systems.
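
One way to read the ROI-combination step is as an agreement check between the boxes found in the visible and IR images; a minimal sketch under that assumption, with the IoU gate as a hypothetical choice:

```python
def fuse_rois(roi_rgb, roi_ir, min_iou=0.3):
    """Accept a detection only when the CCD and IR ROIs agree.

    Each ROI is (x, y, w, h). Returns the intersection box when the
    overlap is sufficient, else None. The IoU threshold is an assumption.
    """
    ax, ay, aw, ah = roi_rgb
    bx, by, bw, bh = roi_ir
    ix, iy = max(ax, bx), max(ay, by)
    ix2, iy2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    iw, ih = max(0, ix2 - ix), max(0, iy2 - iy)
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    if union == 0 or inter / union < min_iou:
        return None          # the two sensors disagree; reject
    return (ix, iy, iw, ih)  # fused region of interest
```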

Deep learning-based Human Action Recognition Technique Considering the Spatio-Temporal Relationship of Joints (관절의 시·공간적 관계를 고려한 딥러닝 기반의 행동인식 기법)

  • Choi, Inkyu; Song, Hyok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2022.05a, pp.413-415, 2022
  • Since human joints, as components of the body, provide useful information for analyzing human behavior, many studies have been conducted on human action recognition using joint information. However, recognizing human actions that change from moment to moment using only independent per-joint information is a very complex problem. Therefore, a method for extracting additional information for learning, and an algorithm that considers the current state based on past states, are needed. In this paper, we propose a human action recognition technique that considers the positional relationships of connected joints and the change of each joint's position over time. Using a pre-trained joint extraction model, the position of each joint is obtained, and bone information is extracted as the difference vector between connected joints. In addition, a simplified neural network is constructed for the two types of inputs, and spatio-temporal features are extracted by adding an LSTM. Experiments on a dataset of nine actions confirmed that recognition using the temporal and spatial relationship features of the joints outperforms recognition using single-joint information alone.
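
A minimal PyTorch sketch of the two-stream idea described above: bone vectors are computed as differences between connected joints, and both streams feed an LSTM. The skeleton edges, layer sizes, and merge strategy are illustrative assumptions:

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical skeleton connectivity: (parent, child) joint index pairs.
EDGES = [(0, 1), (1, 2), (2, 3), (1, 4), (4, 5)]

def bone_vectors(joints):
    """joints: (T, J, 3) sequence -> (T, len(EDGES), 3) bone vectors."""
    return np.stack([joints[:, b] - joints[:, a] for a, b in EDGES], axis=1)

class JointBoneLSTM(nn.Module):
    """Joint and bone streams concatenated before an LSTM, loosely
    following the description above; sizes are illustrative."""
    def __init__(self, joint_dim, bone_dim, hidden=64, n_classes=9):
        super().__init__()
        self.lstm = nn.LSTM(joint_dim + bone_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, joints, bones):      # (B, T, joint_dim), (B, T, bone_dim)
        x = torch.cat([joints, bones], dim=-1)
        out, _ = self.lstm(x)              # spatio-temporal features
        return self.head(out[:, -1])       # classify from the last time step
```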


Interaction Intent Analysis of Multiple Persons using Nonverbal Behavior Features (인간의 비언어적 행동 특징을 이용한 다중 사용자의 상호작용 의도 분석)

  • Yun, Sang-Seok; Kim, Munsang; Choi, Mun-Taek; Song, Jae-Bok
    • Journal of Institute of Control, Robotics and Systems, v.19 no.8, pp.738-744, 2013
  • According to cognitive science research, the interaction intent of humans can be estimated through an analysis of their expressed behaviors. This paper proposes a novel methodology for reliable intention analysis of humans based on this approach. To identify intention, eight behavioral features are extracted from four characteristics of human-human interaction, and we outline a set of core components of nonverbal human behavior. These nonverbal behaviors are handled by recognition modules over multimodal sensors, one per modality: localizing the speaker's sound source in the auditory part, recognizing the frontal face and facial expression in the vision part, and estimating human trajectories, body pose and lean, and hand gestures in the spatial part. As a post-processing step, temporal confidence reasoning is used to improve recognition performance, and an integrated human model quantitatively classifies the intention from the multi-dimensional cues by applying weight factors. Thus, interactive robots can make informed engagement decisions to interact effectively with multiple persons. Experimental results show that the proposed scheme works successfully between human users and a robot in human-robot interaction.
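
A minimal sketch of the final weighted integration step, assuming each recognition module emits a confidence in [0, 1]; the linear weighting stands in for the paper's integrated human model and is an assumption:

```python
import numpy as np

def intention_score(cue_confidences, weights):
    """Combine nonverbal-cue confidences (sound source, face, pose,
    gesture, ...) into a single engagement score in [0, 1].

    cue_confidences: (n_cues,) per-module confidences
    weights:         (n_cues,) importance weights (hypothetical values)
    """
    c = np.asarray(cue_confidences, dtype=float)
    w = np.asarray(weights, dtype=float)
    return float(np.dot(c, w) / w.sum())   # normalized weighted sum
```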

Driver Assistance System By the Image Based Behavior Pattern Recognition (영상기반 행동패턴 인식에 의한 운전자 보조시스템)

  • Kim, Sangwon; Kim, Jungkyu
    • Journal of the Institute of Electronics and Information Engineers, v.51 no.12, pp.123-129, 2014
  • With the development of various convergence devices, cameras are being used in many systems, such as security systems and driver assistance devices, and many people are exposed to these systems. Such a system should therefore be able to recognize human behavior and provide useful functions based on the information obtained from the detected behavior. In this paper we use a machine learning approach based on 2D images and propose human behavior pattern recognition methods. The proposed methods can provide valuable information to support useful functions based on the recognized behavior. The first is "phone call behavior" recognition: if the black-box camera focused on the driver recognizes a phone-call pose, it can warn the driver for safe driving. The second is "looking ahead" recognition for driving safety, for which we propose a decision rule and method to determine whether the driver is looking ahead. The paper also demonstrates the usefulness of the proposed methods with real-time experimental results.
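
A minimal sketch of what a looking-ahead decision rule could look like, assuming head yaw and pitch angles are already estimated; the paper derives its rule from 2D image features, so the inputs and thresholds here are purely illustrative:

```python
def is_looking_ahead(head_yaw_deg, head_pitch_deg, yaw_tol=20.0, pitch_tol=15.0):
    """Thresholded decision rule: the driver counts as looking ahead
    when the head pose stays within a tolerance cone of straight ahead.
    Angle inputs and tolerance values are hypothetical."""
    return abs(head_yaw_deg) <= yaw_tol and abs(head_pitch_deg) <= pitch_tol
```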

Three-dimensional Head Tracking Using Adaptive Local Binary Pattern in Depth Images

  • Kim, Joongrock; Yoon, Changyong
    • International Journal of Fuzzy Logic and Intelligent Systems, v.16 no.2, pp.131-139, 2016
  • Recognition of human motion has become a main area of computer vision due to its potential for human-computer interfaces (HCI) and surveillance. Among existing recognition techniques for human motion, head detection and tracking is the basis for all human motion recognition. Various approaches have been tried to detect and trace the position of the human head precisely in two-dimensional (2D) images. However, it is still a challenging problem because human appearance varies greatly with pose and images are affected by illumination changes. To enhance the performance of head detection and tracking, real-time three-dimensional (3D) data acquisition sensors such as time-of-flight cameras and the Kinect depth sensor have recently been used. In this paper, we propose an effective feature extraction method, called the adaptive local binary pattern (ALBP), for depth-image-based applications. In contrast to the well-known conventional local binary pattern (LBP), the proposed ALBP not only extracts shape information without texture in depth images, but is also invariant to distance changes in range images. We apply the proposed ALBP to head detection and tracking in depth images to show its effectiveness and usefulness.
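
A minimal sketch of a distance-adaptive LBP over a 3×3 neighborhood, where the comparison threshold scales with the center depth; the linear rule k * depth is an assumption standing in for the paper's exact adaptation:

```python
import numpy as np

def adaptive_lbp(depth, k=0.01):
    """Per-pixel 8-bit LBP code whose threshold grows with the center
    depth, so codes stay stable as the subject moves nearer or farther."""
    d = depth.astype(np.float64)
    h, w = d.shape
    center = d[1:-1, 1:-1]
    thresh = k * center                          # distance-adaptive threshold
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h - 2, w - 2), dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = d[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor - center > thresh).astype(np.int32) << bit
    return codes.astype(np.uint8)                # 8-bit ALBP code per pixel
```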

Back-Propagation Neural Network Based Face Detection and Pose Estimation (오류-역전파 신경망 기반의 얼굴 검출 및 포즈 추정)

  • Lee, Jae-Hoon; Jun, In-Ja; Lee, Jung-Hoon; Rhee, Phill-Kyu
    • The KIPS Transactions: Part B, v.9B no.6, pp.853-862, 2002
  • Face detection can be defined as follows: given an arbitrary digitized image or image sequence, determine whether any human face is present, and if so, return its location, direction, size, and so on. This technique underlies many applications, such as face recognition, facial expression analysis, and head gesture recognition, and is one of their important quality factors. However, detecting faces in a given image is considerably difficult because facial expression, pose, facial size, lighting conditions, and so on change the overall appearance of faces, making it hard to detect them rapidly and exactly. Therefore, this paper proposes fast and exact face detection that overcomes these restrictions by using a neural network. The proposed system can detect faces rapidly regardless of facial expression, background, and pose. To this end, face detection is performed by a neural network, and detection response time is shortened by reducing the search region and decreasing the neural network's computation time. The search region is reduced using skin-color segmentation and frame differencing, and the computation time is decreased by reducing the size of the network's input vector; Principal Component Analysis (PCA) is used to reduce the dimension of the data. In addition, the pose is estimated from the extracted facial image and the eye region is located, which provides further information about the face. The experiments measured success rate and processing time using the squared Mahalanobis distance. Both still images and image sequences were tested; for skin-color segmentation, the success rate differed depending on the camera setting. Pose estimation experiments were carried out under the same conditions, and the presence or absence of glasses led to different results in eye region detection. The results show a satisfactory detection rate and processing time for a real-time system.
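
A minimal sketch of the skin-color search-region reduction step, assuming OpenCV and a common YCrCb skin heuristic; the Cr/Cb bounds are a textbook choice, not the paper's trained skin model:

```python
import cv2
import numpy as np

def skin_mask(bgr):
    """Rough skin-color segmentation in YCrCb, used to shrink the
    face-search region before the neural network runs.

    bgr: 8-bit BGR image. Returns a binary mask (255 = skin-like pixel).
    """
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    lower = np.array([0, 133, 77], dtype=np.uint8)    # heuristic Cr/Cb bounds
    upper = np.array([255, 173, 127], dtype=np.uint8)
    return cv2.inRange(ycrcb, lower, upper)
```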