• Title/Abstract/Keywords: Pose Recognition

278 search results (processing time: 0.028 s)

Development of Kinect-Based Pose Recognition Model for Exercise Game

  • 박경신
    • KIPS Transactions on Computer and Communication Systems / Vol. 5, No. 10 / pp.303-310 / 2016
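  • Exergames, functional exercise games such as Wii Sports and Xbox Fitness that have players move their bodies as they would in the real activity, have recently become popular. In such motion-based exercise games, pose recognition is very important for determining how accurately the user performed an exercise pose and how much correction the pose needs. In this study, we propose an exercise pose recognition model that, in order to recognize the user's pose in exercise program content for the elderly, extracts feature points from the skeleton model provided by the Kinect sensor and generates a feature vector for each. This paper describes the design and implementation of the proposed model and demonstrates its feasibility through a simple experiment. In the experiment, the overall average match rate across 12 exercise poses performed by 10 participants was about 94.52%.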
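
The abstract above does not specify how its feature vectors are built or compared, so the following is only a minimal sketch under stated assumptions: each feature vector is taken to be the unit direction of a bone between two skeleton joints, and a pose counts as matching when that direction lies within an angular tolerance of a reference pose. The joint names, the bone list, and the 20-degree tolerance are all hypothetical.

```python
import math

# Hypothetical bone list; the joint names are illustrative, not the paper's.
BONES = [
    ("shoulder_left", "elbow_left"),
    ("elbow_left", "wrist_left"),
    ("shoulder_right", "elbow_right"),
    ("elbow_right", "wrist_right"),
]


def bone_vector(joints, start, end):
    """Unit vector pointing from joint `start` to joint `end`."""
    sx, sy, sz = joints[start]
    ex, ey, ez = joints[end]
    dx, dy, dz = ex - sx, ey - sy, ez - sz
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm == 0.0:
        raise ValueError(f"joints {start} and {end} coincide")
    return (dx / norm, dy / norm, dz / norm)


def feature_vectors(joints):
    """One unit direction per bone in BONES."""
    return [bone_vector(joints, a, b) for a, b in BONES]


def match_ratio(user, reference, max_angle_deg=20.0):
    """Fraction of bones whose direction is within the angular tolerance."""
    matched = 0
    for u, r in zip(feature_vectors(user), feature_vectors(reference)):
        # Clamp the dot product so rounding error cannot break acos.
        cos = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, r))))
        if math.degrees(math.acos(cos)) <= max_angle_deg:
            matched += 1
    return matched / len(BONES)


# Assumed reference pose: both arms stretched out sideways (metres).
REFERENCE = {
    "shoulder_left": (-0.2, 1.4, 2.0), "elbow_left": (-0.5, 1.4, 2.0),
    "wrist_left": (-0.8, 1.4, 2.0),
    "shoulder_right": (0.2, 1.4, 2.0), "elbow_right": (0.5, 1.4, 2.0),
    "wrist_right": (0.8, 1.4, 2.0),
}
# Same pose, but the left forearm points straight up instead of sideways.
USER = dict(REFERENCE, wrist_left=(-0.5, 1.7, 2.0))

print(match_ratio(REFERENCE, REFERENCE))
print(match_ratio(USER, REFERENCE))
```

Using bone directions rather than raw joint positions makes the comparison independent of where the user stands and of limb length, which is one reason such features are a common choice for skeleton data.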

Enhanced Sign Language Transcription System via Hand Tracking and Pose Estimation

  • Kim, Jung-Ho;Kim, Najoung;Park, Hancheol;Park, Jong C.
    • Journal of Computing Science and Engineering / Vol. 10, No. 3 / pp.95-101 / 2016
  • In this study, we propose a new system for constructing parallel corpora for sign languages, which are generally under-resourced in comparison to spoken languages. In order to achieve scalability and accessibility regarding data collection and corpus construction, our system utilizes deep learning-based techniques and predicts depth information to perform pose estimation on hand information obtainable from video recordings by a single RGB camera. These estimated poses are then transcribed into expressions in SignWriting. We evaluate the accuracy of hand tracking and hand pose estimation modules of our system quantitatively, using the American Sign Language Image Dataset and the American Sign Language Lexicon Video Dataset. The evaluation results show that our transcription system has a high potential to be successfully employed in constructing a sizable sign language corpus using various types of video resources.

An Improved Approach for 3D Hand Pose Estimation Based on a Single Depth Image and Haar Random Forest

  • Kim, Wonggi;Chun, Junchul
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 9, No. 8 / pp.3136-3150 / 2015
  • Vision-based 3D tracking of the articulated human hand is one of the major issues in human-computer interaction applications and in understanding the control of robot hands. This paper presents an improved approach for tracking and recovering the 3D position and orientation of a human hand using the Kinect sensor. The basic idea of the proposed method is to solve an optimization problem that minimizes the discrepancy in 3D shape between an actual hand observed by Kinect and a hypothesized 3D hand model. Since each 3D hand pose has 23 degrees of freedom, hand articulation tracking carries an excessive computational burden in minimizing the 3D shape discrepancy between an observed hand and a 3D hand model. To address this, we first created a 3D hand model that represents the hand with 17 different parts. Second, a Random Forest classifier was trained on synthetic depth images generated by animating the developed 3D hand model and was then used for Haar-like feature-based classification rather than per-pixel classification. The classification results were used to estimate the joint positions of the hand skeleton. Experiments showed that the proposed method improved hand part recognition rates and ran at 20-30 fps. The results confirmed its practical use in classifying hand areas, and the method successfully tracked and recovered the 3D hand pose in real time.
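
The abstract above does not give the exact Haar-like features it uses. As a rough illustration of the general technique only, the sketch below evaluates a two-rectangle Haar-like feature on a depth image through an integral image (summed-area table), so each rectangle sum costs four lookups. The function names and the tiny synthetic depth map are assumptions made for this example.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Left half minus right half of a window (width is assumed even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Synthetic 4x4 depth map in millimetres: a near surface on the left
# (500 mm) next to a far background on the right (900 mm).
depth = [[500, 500, 900, 900] for _ in range(4)]
ii = integral_image(depth)
print(rect_sum(ii, 0, 0, 4, 4))
print(haar_two_rect_horizontal(ii, 0, 0, 4, 4))
```

A strongly negative or positive response marks a depth edge inside the window. In a pipeline like the one described, such responses would presumably serve as inputs to the Random Forest, but that wiring is not shown here.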

MPEG-U-based Advanced User Interaction Interface Using Hand Posture Recognition

  • Han, Gukhee;Choi, Haechul
    • IEIE Transactions on Smart Processing and Computing / Vol. 5, No. 4 / pp.267-273 / 2016
  • Hand posture recognition is an important technique for enabling a natural and familiar interface in the human-computer interaction (HCI) field. This paper introduces a hand posture recognition method using a depth camera. Moreover, the hand posture recognition method is integrated with the Moving Picture Experts Group Rich Media User Interface (MPEG-U) Advanced User Interaction (AUI) Interface (MPEG-U part 2), which can provide a natural interface on a variety of devices. The proposed method first detects the positions and lengths of all open fingers, and then recognizes the hand posture from the pose of one or two hands, as well as the number of folded fingers, when a user presents a gesture representing a pattern in the AUI data format specified in MPEG-U part 2. The AUI interface represents a user's hand posture in the compliant MPEG-U schema structure. Experimental results demonstrate the performance of the hand posture recognition system and verify that the AUI interface is compatible with the MPEG-U standard.

Viewpoint Unconstrained Face Recognition Based on Affine Local Descriptors and Probabilistic Similarity

  • Gao, Yongbin;Lee, Hyo Jong
    • Journal of Information Processing Systems / Vol. 11, No. 4 / pp.643-654 / 2015
  • Face recognition under controlled settings, such as limited viewpoint and illumination change, can now achieve good performance. However, real-world face recognition remains challenging. In this paper, we propose combining Affine Scale Invariant Feature Transform (SIFT) and probabilistic similarity for face recognition under a large viewpoint change. Affine SIFT is an extension of the SIFT algorithm that detects affine-invariant local descriptors. It generates a series of different viewpoints using affine transformations, which allows for a viewpoint difference between the gallery face and the probe face. However, the human face is not planar, as it contains significant 3D depth, so Affine SIFT does not work well for significant changes in pose. To complement this, we combine it with probabilistic similarity, which obtains the log likelihood between the probe and gallery faces based on the sum of squared differences (SSD) distribution in an offline learning process. Our experimental results show that our framework achieves impressively better recognition accuracy than the other algorithms compared on the FERET database.

Efficient 3D Model based Face Representation and Recognition Algorithm using Pixel-to-Vertex Map (PVM)

  • Jeong, Kang-Hun;Moon, Hyeon-Joon
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 5, No. 1 / pp.228-246 / 2011
  • A 3D model based approach for a face representation and recognition algorithm has been investigated as a robust solution for pose and illumination variation. Since a generative 3D face model consists of a large number of vertices, a 3D model based face recognition system is generally inefficient in computation time and complexity. In this paper, we propose a novel 3D face representation algorithm based on a pixel to vertex map (PVM) to optimize the number of vertices. We explore shape and texture coefficient vectors of the 3D model by fitting it to an input face using inverse compositional image alignment (ICIA) to evaluate face recognition performance. Experimental results show that the proposed face representation and recognition algorithm is efficient in computation time while maintaining reasonable accuracy.

Object Recognition Method for Industrial Intelligent Robot

  • 김계경;강상승;김중배;이재연;도현민;최태용;경진호
    • Journal of the Korean Society for Precision Engineering / Vol. 30, No. 9 / pp.901-908 / 2013
  • The introduction of industrial intelligent robots using vision sensors has attracted interest in automated factories. 2D and 3D vision sensors have been used to recognize objects and to estimate object pose for packaging parts into a complete whole. However, this is not a trivial task because of illumination and the variety of object types. Illumination distorts the object image, which lowers recognition reliability. In this paper, a recognition method for objects with complex shapes is proposed. An accurate object region is detected from a combined binary image obtained using a DoG filter and local adaptive binarization. The object is recognized using a neural network trained with object classes subdivided according to object type and rotation angle. A predefined shape model of the object and the maximal slope are used to estimate the object pose. Performance was evaluated on the ETRI database, and a recognition rate of 96% was obtained.

Investigation of image preprocessing and face covering influences on motion recognition by a 2D human pose estimation algorithm

  • 노은솔;이사랑;홍석무
    • Journal of the Korea Academia-Industrial cooperation Society / Vol. 21, No. 7 / pp.285-291 / 2020
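  • In the manufacturing industry, human labor is being replaced by robots, but expert skills are difficult to convert into data and therefore cannot be applied to industrial robots. Such data can be acquired with vision-based motion recognition, but the result may vary depending on the image data. This study therefore considers the factors that affect vision-based human pose estimation in order to find ways to improve accuracy. Among vision methods, three OpenPose models were used: MPII, COCO, and COCO + foot. To examine the effects of face covering and image preprocessing on the CNN (Convolutional Neural Network)-based OpenPose architecture, the presence of accessories, image size, and filtering were set as parameters. Image data for each parameter were applied to the three models, and the influence was assessed using the distance error between the ground-truth and predicted values and the PCK (Percentage of Correct Keypoints). As a result, the COCO + foot model was the least sensitive to the three parameters. In addition, an image size ratio of 50% or more (reduced from the original 3024 × 4032 to 1512 × 2016) was most appropriate, and only the MPII model improved with emboss filtering, with the mean distance error decreasing by up to 60 pixels.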
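
The two evaluation measures named in the abstract above, distance error and PCK, can be sketched as follows. The abstract does not state which reference length or threshold ratio was used, so `reference_length` and `alpha` here are assumptions (a head-segment length with a ratio of 0.5 is one common convention), and the keypoint coordinates are made up for illustration.

```python
import math


def keypoint_errors(predicted, ground_truth):
    """Euclidean distance in pixels between each predicted and true keypoint."""
    return [math.dist(p, g) for p, g in zip(predicted, ground_truth)]


def mean_distance_error(predicted, ground_truth):
    """Average per-keypoint distance error in pixels."""
    errors = keypoint_errors(predicted, ground_truth)
    return sum(errors) / len(errors)


def pck(predicted, ground_truth, reference_length, alpha=0.5):
    """Fraction of keypoints whose error is at most alpha * reference_length."""
    threshold = alpha * reference_length
    errors = keypoint_errors(predicted, ground_truth)
    return sum(1 for e in errors if e <= threshold) / len(errors)


# Made-up 2D keypoints in pixels (not data from the paper).
ground_truth = [(100, 100), (200, 100), (150, 300), (150, 400)]
predicted = [(103, 104), (200, 100), (150, 360), (156, 408)]

print(keypoint_errors(predicted, ground_truth))
print(mean_distance_error(predicted, ground_truth))
print(pck(predicted, ground_truth, reference_length=100))
```

The two measures complement each other: one badly misplaced keypoint inflates the mean distance error, while PCK only records that the keypoint fell outside the tolerance.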

Object Pose Estimation and Motion Planning for Service Automation System

  • 권영우;이동영;강호선;최지욱;이인호
    • The Journal of Korea Robotics Society / Vol. 19, No. 2 / pp.176-187 / 2024
  • Recently, automated solutions using collaborative robots have been emerging in various industries. Their primary functions include Pick & Place, Peg-in-Hole, fastening and assembly, and welding, which are being utilized and researched in various fields. The application of these robots varies depending on the characteristics of the gripper attached to the end of the collaborative robot. To grasp a variety of objects, a gripper with a high degree of freedom is required. In this paper, we propose a service automation system using a multi-degree-of-freedom gripper, collaborative robots, and vision sensors. Assuming various products are placed at a checkout counter, we use three cameras to recognize the objects, estimate their poses, and generate grasping points. The multi-degree-of-freedom gripper grasps the objects at these points, and experiments are conducted on barcode recognition, a key task in service automation. To recognize objects and estimate their 6D poses, we used a CNN (Convolutional Neural Network)-based algorithm and point clouds. Using the recognized object's 6D pose information, we generate grasping points for the multi-degree-of-freedom gripper and perform re-grasping in a direction that facilitates barcode scanning. The experiment was conducted with four selected objects, progressing through identification, 6D pose estimation, and grasping, and recording the success or failure of barcode recognition to demonstrate the effectiveness of the proposed system.
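
A 6D pose combines a 3D rotation with a 3D translation. The sketch below shows the basic step of using an estimated pose to move a grasp point defined in the object's own frame into the camera frame. It is not the paper's grasp-planning method: the rotation is restricted to yaw about the z axis for brevity, and every value and name is a hypothetical placeholder.

```python
import math


def rotation_z(yaw_rad):
    """3x3 rotation matrix for a rotation of yaw_rad about the z axis."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def transform_point(rotation, translation, point):
    """Apply p_camera = R * p_object + t to a single 3D point."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Assumed pose estimate: object rotated 90 degrees about z and located at
# (0.4, 0.1, 0.6) m in the camera frame.
R = rotation_z(math.pi / 2)
t = (0.4, 0.1, 0.6)

# Hypothetical grasp point defined in the object's own frame (metres).
grasp_in_object = (0.05, 0.0, 0.02)
grasp_in_camera = transform_point(R, t, grasp_in_object)
print(grasp_in_camera)
```

In a real system the rotation would be a full 3D rotation, and a further camera-to-robot transform would be chained in the same way before commanding the gripper.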

Hard Example Generation by Novel View Synthesis for 3-D Pose Estimation

  • 김민지;김성찬
    • IEMEK Journal of Embedded Systems and Applications / Vol. 19, No. 1 / pp.9-17 / 2024
  • It is widely recognized that for 3D human pose estimation (HPE), dataset acquisition is expensive and the augmentation techniques of conventional visual recognition tasks are of limited effectiveness. We address these difficulties by presenting a simple but effective method that augments input images in terms of viewpoints when training a 3D HPE model. Our intuition is that meaningful variants of the input images for HPE could be obtained by viewing a human instance in the images from an arbitrary viewpoint different from that in the original images. The core idea is to synthesize new images that have self-occlusion and thus are difficult to predict at different viewpoints even with the same pose as the original example. We incorporate this idea into the training procedure of the 3D HPE model as an augmentation stage of the input samples. We show that a strategy for augmenting the synthesized examples should be carefully designed in terms of the frequency of performing the augmentation and the selection of viewpoints for synthesizing the samples. To this end, we propose a new metric to measure the prediction difficulty of input images for 3D HPE in terms of the distance between corresponding keypoints on both sides of a human body. Extensive exploration of the space of augmentation probability choices and example selection according to the proposed distance metric leads to a performance gain of up to 6.2% on Human3.6M, the well-known pose estimation dataset.
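
The abstract above describes its difficulty metric only as a distance between corresponding keypoints on both sides of the body. One plausible reading, offered purely as an assumption and not as the authors' definition, is the mean image-plane distance between left/right keypoint pairs: when the pairs nearly coincide, the body is seen side-on and self-occlusion is likely. The keypoint names and coordinates below are hypothetical.

```python
import math

# Hypothetical left/right keypoint pairs; names are illustrative only.
PAIRS = [
    ("left_shoulder", "right_shoulder"),
    ("left_hip", "right_hip"),
    ("left_knee", "right_knee"),
]


def left_right_distance(keypoints_2d):
    """Mean pixel distance between corresponding left and right keypoints."""
    dists = [math.dist(keypoints_2d[a], keypoints_2d[b]) for a, b in PAIRS]
    return sum(dists) / len(dists)


# Frontal view: left and right keypoints are well separated.
frontal = {
    "left_shoulder": (100, 100), "right_shoulder": (160, 100),
    "left_hip": (110, 200), "right_hip": (150, 200),
    "left_knee": (110, 300), "right_knee": (150, 300),
}
# Side view: each pair nearly overlaps, which suggests self-occlusion.
side = {
    "left_shoulder": (128, 100), "right_shoulder": (132, 100),
    "left_hip": (128, 200), "right_hip": (132, 200),
    "left_knee": (128, 300), "right_knee": (132, 300),
}

print(left_right_distance(frontal))
print(left_right_distance(side))
```

Under this reading, candidate viewpoints with a smaller score would be treated as harder examples and could be favored when selecting views to synthesize.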