• Title/Summary/Keyword: 2D-3D pose estimation

View-Invariant Body Pose Estimation based on Biased Manifold Learning (편향된 다양체 학습 기반 시점 변화에 강인한 인체 포즈 추정)

  • Hur, Dong-Cheol;Lee, Seong-Whan
    • Journal of KIISE:Software and Applications / v.36 no.11 / pp.960-966 / 2009
  • A manifold represents the relationships among high-dimensional data samples in a low-dimensional space. In human pose estimation, manifolds are built in a low-dimensional space to process image data and 3D body configuration data. Manifold learning, the process of building such a manifold, is vulnerable to silhouette variations, which arise from changes in view, person, and distance, and from noise. Representing all of these silhouette variations in a single manifold is impossible. In this paper, we focus on the silhouette variation caused by view change. Previous view-invariant pose estimation methods based on manifold learning took one of two approaches: modeling manifolds for all viewpoints, or extracting view factors from mapping functions. Because they rely on unsupervised learning, however, these methods do not support one-to-one mapping between silhouettes and the corresponding body configurations, and both modeling the manifolds and extracting the view factors are very complex. We therefore propose a method based on three manifolds: a view manifold, a pose manifold, and a body configuration manifold. We build the manifolds with biased manifold learning and then learn mapping functions among the spaces (2D image space, pose manifold space, view manifold space, body configuration manifold space, and 3D body configuration space). In our experiments, the method estimated various body poses from 24 viewpoints.

UV Mapping Based Pose Estimation of Furniture Parts in Assembly Manuals (UV-map 기반의 신경망 학습을 이용한 조립 설명서에서의 부품의 자세 추정)

  • Kang, Isaac;Cho, Nam Ik
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.667-670 / 2020
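  • Recently, fields such as augmented reality and robotics have come to require estimation of an object's pose in addition to detection of its location. Datasets that include object pose information are far scarcer than datasets that include only location information, which makes neural network architectures difficult to apply, but a number of machine-learning-based pose estimation algorithms have appeared recently. In this paper, we build on the architecture of one of them, the Dense 6d Pose Object detector (DPOD) [11], to estimate the poses of furniture parts drawn in furniture assembly manuals. DPOD [11] takes an RGB image as input, estimates the pixels belonging to the region of the object whose pose is to be estimated, and, at each of those pixels, estimates the UV map value of the object's 3D model. Once as many 2D-3D correspondences as there are pixels have been generated in this way, the transformation matrix between the object in the RGB image and the object's 3D model is obtained with the RANSAC and PnP algorithms. We trained the neural network on RGB images obtained by projecting 3D models of the furniture parts to 2D under 24 predefined pose candidates, and at evaluation time estimated the poses of furniture parts in actual assembly manuals. In experiments on the assembly manual for IKEA's Stefan chair, we obtained an ADD score of 100%, and also obtained 100% accuracy when an estimate is counted as correct if, among the pose candidates, it is closest to the ground-truth pose. When the proposed network is used together with an object detection network that locates furniture parts in the assembly manual and a retrieval network that identifies the type of each part, the pose of each furniture part can ultimately be estimated.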
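
The ADD score mentioned in this abstract can be illustrated with a short sketch. The abstract does not define the metric, so this assumes the common convention: ADD is the mean distance between model points transformed by the ground-truth pose and by the estimated pose, and an estimate counts as correct when that distance is below 10% of the model diameter. The function names, the 10% ratio, and the unit cube used as a stand-in model are illustrative assumptions, not details from the paper.

```python
import math

def transform(R, t, p):
    """Apply the rigid transform x -> R x + t to one 3D point."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def add_error(points, R_gt, t_gt, R_est, t_est):
    """Mean distance between each model point under the two poses."""
    total = 0.0
    for p in points:
        total += math.dist(transform(R_gt, t_gt, p), transform(R_est, t_est, p))
    return total / len(points)

def model_diameter(points):
    """Largest distance between any two model points."""
    return max(math.dist(p, q) for p in points for q in points)

def add_correct(points, R_gt, t_gt, R_est, t_est, ratio=0.1):
    """True when the ADD error is below ratio * diameter (ratio is assumed)."""
    err = add_error(points, R_gt, t_gt, R_est, t_est)
    return err < ratio * model_diameter(points)

# Hypothetical model: the eight corners of a unit cube.
cube = [[x, y, z] for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
origin = [0.0, 0.0, 0.0]

# A small translation error passes the threshold; a large one does not.
near = add_correct(cube, identity, origin, identity, [0.05, 0.0, 0.0])
far = add_correct(cube, identity, origin, identity, [0.5, 0.0, 0.0])
```

A percentage ADD score is then the fraction of test images for which `add_correct` holds.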

A Switched Visual Servoing Technique Robust to Camera Calibration Errors for Reaching the Desired Location Following a Straight Line in 3-D Space (카메라 교정 오차에 강인한 3차원 직선 경로 추종을 위한 전환 비주얼 서보잉 기법)

  • Kim, Do-Hyoung;Chung, Myung-Jin
    • The Journal of Korea Robotics Society / v.1 no.2 / pp.125-134 / 2006
  • This paper considers the problem of designing a servo system that reaches the desired location along a straight line while keeping all features in the field of view. It also considers robustness to errors in the camera calibration parameters. The proposed approach switches from position-based visual servoing (PBVS) to image-based visual servoing (IBVS) and allows the camera path to follow a straight line. To achieve this, a pose estimation method is required; the camera's target pose is estimated from the acquired images without knowledge of the object. A switched control law moves the camera, mounted on a robot end-effector, to the vicinity of the desired location along a straight line in Cartesian space and then positions it at the desired pose with robustness to camera calibration error. Finally, simulation results show the feasibility of the proposed visual servoing technique.

A 3D Face Reconstruction and Tracking Method using the Estimated Depth Information (얼굴 깊이 추정을 이용한 3차원 얼굴 생성 및 추적 방법)

  • Ju, Myung-Ho;Kang, Hang-Bong
    • The KIPS Transactions:PartB / v.18B no.1 / pp.21-28 / 2011
  • A 3D face shape derived from 2D images can be useful in many applications, such as face recognition, face synthesis, and human-computer interaction. To obtain one, we develop a fast 3D Active Appearance Model (3D-AAM) method that uses depth estimation. The training images include specific 3D face poses that differ greatly from one another. The depth information of the landmarks is estimated from the training image sequence using an approximated Jacobian matrix, and it is added at the test phase to handle the 3D pose variations of the input face. Our experimental results show that the proposed method fits the face shape, including variations in facial expression and 3D pose, more efficiently than the typical AAM, and that it can estimate an accurate 3D face shape from images.

Object Recognition-based Global Localization for Mobile Robots (이동로봇의 물체인식 기반 전역적 자기위치 추정)

  • Park, Soon-Yyong;Park, Mignon;Park, Sung-Kee
    • The Journal of Korea Robotics Society / v.3 no.1 / pp.33-41 / 2008
  • We present a new global localization method for robot navigation based on object recognition technology. To do this, we model an indoor environment with a stereo camera using the following visual cues: view-based image features for object recognition and their 3D positions for object pose estimation. We also use the depth information along the horizontal centerline of the image, through which the optical axis passes; this is similar to the data from a 2D laser range finder. We can therefore build a hybrid local node for a topological map that is composed of a metric map of the indoor environment and an object location map. Based on this modeling, we propose a coarse-to-fine strategy for estimating the global location of a mobile robot. The coarse pose is obtained by means of object recognition and SVD-based least-squares fitting, and the refined pose is then estimated with a particle filtering algorithm. Through real experiments, we show that the proposed method can be an effective vision-based global localization algorithm.
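
The coarse step in this abstract mentions SVD-based least-squares fitting. A minimal sketch of one standard form of that technique, the Kabsch/Arun solution for the rigid transform between matched 3D point sets, is given here. The abstract does not state the paper's exact formulation, so the function name, the use of NumPy, and the sample points are assumptions.

```python
import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares R, t such that dst[i] ~= R @ src[i] + t (Kabsch/Arun)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0] * (H.shape[0] - 1) + [d])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

# Hypothetical matched landmarks: model points and the same points as observed
# after a 30-degree rotation about the z axis plus a translation.
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.5, -1.0, 2.0])
model_pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float)
observed_pts = model_pts @ R_true.T + t_true

R_est, t_est = fit_rigid_transform(model_pts, observed_pts)
```

With noise-free correspondences the recovered `R_est` and `t_est` match the true transform up to floating-point error; with noisy matches the result is the least-squares fit, which a later stage such as a particle filter could refine.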

Bundle Adjustment and 3D Reconstruction Method for Underwater Sonar Image (수중 영상 소나의 번들 조정과 3차원 복원을 위한 운동 추정의 모호성에 관한 연구)

  • Shin, Young-Sik;Lee, Yeong-jun;Cho, Hyun-Taek;Kim, Ayoung
    • The Journal of Korea Robotics Society / v.11 no.2 / pp.51-59 / 2016
  • In this paper, we present (1) an analysis of imaging sonar measurements for two-view relative pose estimation of an autonomous vehicle and (2) a bundle adjustment and 3D reconstruction method using imaging sonar. Sonar has been a popular sensor for underwater applications because it is robust to water turbidity and limited visibility. While vision-based methods have been applied to many ground vehicles for motion estimation and 3D reconstruction, imaging sonar poses challenges in estimating relative sensor-frame motion. We focus on the fact that sonar measurements are inherently ambiguous. This paper illustrates the source of the ambiguity in sonar measurements and summarizes the assumptions made for sonar-based robot navigation. For validation, we generated synthetic seafloors of varying complexity and analyzed the error in the motion estimation.

Point Cloud Registration Algorithm Based on RGB-D Camera for Shooting Volumetric Objects (체적형 객체 촬영을 위한 RGB-D 카메라 기반의 포인트 클라우드 정합 알고리즘)

  • Kim, Kyung-Jin;Park, Byung-Seo;Kim, Dong-Wook;Seo, Young-Ho
    • Journal of Broadcast Engineering / v.24 no.5 / pp.765-774 / 2019
  • In this paper, we propose a point cloud matching algorithm for multiple RGB-D cameras. Precisely estimating camera position is a general problem in computer vision. Existing 3D model generation methods require a large number of cameras or expensive 3D cameras, and the conventional approach of obtaining the camera extrinsic parameters from two-dimensional images has a large estimation error. We propose a method that uses depth images and function optimization to obtain coordinate transformation parameters with an error within a valid range, in order to generate an omnidirectional three-dimensional model using eight low-cost RGB-D cameras.

User Detection and Main Body Parts Estimation using Inaccurate Depth Information and 2D Motion Information (정밀하지 않은 깊이정보와 2D움직임 정보를 이용한 사용자 검출과 주요 신체부위 추정)

  • Lee, Jae-Won;Hong, Sung-Hoon
    • Journal of Broadcast Engineering / v.17 no.4 / pp.611-624 / 2012
  • Gesture is the most intuitive means of communication other than voice, so much research has addressed methods for controlling a computer with gesture input in place of the keyboard or mouse. In this research, user detection and estimation of the main body parts are among the most important steps. In this paper, we propose a method for detecting users and estimating their main body parts from inaccurate depth information, for use in pose estimation. The user detection method uses both 2D and 3D depth information, which makes it robust to changes in lighting and to noise; it processes 2D signals as 1D signals, which makes it well suited to real-time use; and it uses information about previously detected objects, which makes it more accurate and robust. We also present a main body parts estimation method that uses 2D contour information, 3D depth information, and tracking. Experimental results show that the proposed user detection method is more robust than methods that use only 2D information and detects objects accurately even from inaccurate depth information. The proposed main body parts estimation method also overcomes two drawbacks: the inability of methods that use only 2D contour information to detect main body parts in occluded areas, and the sensitivity of methods that use color information to changes in illumination or environment.

Human Motion Tracking based on 3D Depth Point Matching with Superellipsoid Body Model (타원체 모델과 깊이값 포인트 매칭 기법을 활용한 사람 움직임 추적 기술)

  • Kim, Nam-Gyu
    • Journal of Digital Contents Society / v.13 no.2 / pp.255-262 / 2012
  • Human motion tracking algorithms are receiving attention in many research areas, such as human-computer interaction, video conferencing, surveillance analysis, and game and entertainment applications. Over the last decade, various tracking technologies for each application have been demonstrated and refined, among them real-time computer vision and image processing and advanced man-machine interfaces. In this paper, we introduce cost-effective, real-time human motion tracking algorithms based on matching the 3D points of a depth image against a given superellipsoid body representation. The body model is built with a parametric volume modeling method based on superellipsoids and consists of 18 articulated joints. For more accurate estimation, we first compute an initial inverse kinematics solution from the classified body-part information and then refine this initial pose with a 3D point matching algorithm.
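
To make the superellipsoid body representation concrete, here is a sketch of the standard superellipsoid inside-outside function, which returns a value below 1 for points inside the volume, 1 on the surface, and above 1 outside. The abstract does not give the paper's parameterization, so the function name, the parameter names (`size`, `e1`, `e2`), and the sample values are assumptions used for illustration only.

```python
def inside_outside(point, size, e1=1.0, e2=1.0):
    """Standard superellipsoid inside-outside function in the part's local frame.

    point: (x, y, z) coordinates relative to the superellipsoid center.
    size:  (a, b, c) semi-axis lengths.
    e1, e2: shape exponents; 1.0 and 1.0 give an ordinary ellipsoid, and
            values approaching 0 give an increasingly box-like shape.
    """
    x, y, z = point
    a, b, c = size
    xy = abs(x / a) ** (2.0 / e2) + abs(y / b) ** (2.0 / e2)
    return xy ** (e2 / e1) + abs(z / c) ** (2.0 / e1)

# Hypothetical limb segment with semi-axes 1, 2, 3 and one depth point.
limb_size = (1.0, 2.0, 3.0)
depth_point = (0.7, 1.4, 2.1)

as_ellipsoid = inside_outside(depth_point, limb_size)              # e1 = e2 = 1
as_boxlike = inside_outside(depth_point, limb_size, e1=0.5, e2=0.5)
```

The same depth point lies outside the ellipsoid but inside the box-like variant, which shows how the two exponents change the volume that a body part occupies. A point matching step could use the distance of this value from 1 as a rough measure of how far a depth point is from the model surface.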

Real-time 3D Pose Estimation of Both Human Hands via RGB-Depth Camera and Deep Convolutional Neural Networks (RGB-Depth 카메라와 Deep Convolution Neural Networks 기반의 실시간 사람 양손 3D 포즈 추정)

  • Park, Na Hyeon;Ji, Yong Bin;Gi, Geon;Kim, Tae Yeon;Park, Hye Min;Kim, Tae-Seong
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.686-689 / 2018
  • 3D hand pose estimation (HPE) is an important technology for smart human-computer interfaces. In this study, we present a deep-learning-based hand pose estimation system that recognizes, in real time, the 3D poses of both hands captured by a single RGB-Depth camera. The system consists of four stages. First, skin detection and depth cutting algorithms detect and extract both hands from the RGB and depth images. Second, a convolutional neural network (CNN) classifier distinguishes the right hand from the left. The CNN classifier consists of three convolution layers and two fully connected layers and takes the extracted depth images as input. Third, a trained CNN regressor, composed of multiple convolutional, pooling, and fully connected layers, estimates the hand joints from the extracted depth images of the left and right hands. The CNN classifier and regressor are trained on a dataset of 22,000 depth images. Finally, the 3D pose of each hand is reconstructed from the estimated hand joint information. In tests, the CNN classifier distinguished right and left hands with 96.9% accuracy, and the CNN regressor estimated 3D hand joint information with a mean error of 8.48 mm. The proposed hand pose estimation system can be used in a variety of applications, including virtual reality (VR), augmented reality (AR), and mixed reality (MR) applications.