• Title/Summary/Keyword: 2D-3D pose estimation

Search Results: 86

Robust 3-D Motion Estimation Based on Stereo Vision and Kalman Filtering (스테레오 시각과 Kalman 필터링을 이용한 강인한 3차원 운동추정)

  • 계영철
    • Journal of Broadcast Engineering / v.1 no.2 / pp.176-187 / 1996
  • This paper deals with the accurate estimation of the 3-D pose (position and orientation) of a moving object with reference to the world frame (or robot base frame), based on a sequence of stereo images taken by cameras mounted on the end-effector of a robot manipulator. This work is an extension of the previous work [1]. Emphasis is given to 3-D pose estimation relative to the world (or robot base) frame in the presence of not only the measurement noise in 2-D images [1] but also the camera position errors due to the random noise in the joint angles of the robot manipulator. To this end, a new set of discrete linear Kalman filter equations is derived, based on the following: 1) the orientation error of the object frame due to measurement noise in 2-D images is modeled with reference to the camera frame by analyzing the noise propagation through 3-D reconstruction; 2) an extended Jacobian matrix is formulated by combining the result of 1) with the orientation error of the end-effector frame due to joint angle errors through the robot's differential kinematics; and 3) the rotational motion of the object, which is nonlinear in nature, is linearized based on quaternions. Motion parameters are computed from the estimated quaternions using the iterated least-squares method. Simulation results show a significant reduction of estimation errors and demonstrate accurate convergence of the estimated motion parameters to the true values.

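The entry above hinges on a discrete linear Kalman filter over a quaternion-linearized orientation error. As a loose, generic illustration of that filtering machinery only (the paper's state model, extended Jacobian, and noise covariances are not reproduced, and everything named below is an assumption), a single predict/update cycle in numpy might look like this:

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a discrete linear Kalman filter.

    x : state estimate (n,)        P : state covariance (n, n)
    z : measurement (m,)           F : state transition (n, n)
    H : measurement matrix (m, n)  Q, R : process / measurement noise covariances
    """
    # Predict
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q

    # Update
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```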

Combining an Edge-Based Method and a Direct Method for Robust 3D Object Tracking

  • Lomaliza, Jean-Pierre;Park, Hanhoon
    • Journal of Korea Multimedia Society / v.24 no.2 / pp.167-177 / 2021
  • In the field of augmented reality, edge-based methods have been widely used to track textureless 3D objects. However, edge-based methods are inherently vulnerable to cluttered backgrounds. Another way to track textureless or poorly textured 3D objects is to directly align the image intensities of the 3D object between consecutive frames. Although such direct methods enable more reliable and stable tracking than local features such as edges, they are more sensitive to occlusion and less accurate than edge-based methods. We therefore propose a method that combines an edge-based method and a direct method to leverage the advantages of each approach. Experimental results show that the proposed method is highly robust to both fast camera (or object) movements and occlusion while still running in real time at a frame rate of 18 Hz. The tracking success rate and tracking accuracy were improved by up to 84% and 1.4 pixels, respectively, compared to using either the edge-based method or the direct method alone.
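
The combination described above can be pictured as a joint cost that stacks edge-distance residuals and photometric (direct) residuals before refining the pose. The weighting and residual interfaces below are illustrative assumptions, not the authors' fusion rule:

```python
import numpy as np
from scipy.optimize import least_squares

def combined_residuals(pose, edge_residuals_fn, photo_residuals_fn,
                       w_edge=1.0, w_photo=0.5):
    """Stack edge-distance and photometric residuals for a joint pose refinement.

    edge_residuals_fn(pose)  -> distances from projected model edges to image edges
    photo_residuals_fn(pose) -> intensity differences between warped and reference pixels
    The weights and the simple concatenation are hypothetical choices.
    """
    return np.concatenate([
        w_edge * np.asarray(edge_residuals_fn(pose)),
        w_photo * np.asarray(photo_residuals_fn(pose)),
    ])

# Hypothetical refinement of a 6-DoF pose vector using both cues at once:
# pose_opt = least_squares(combined_residuals, pose_init,
#                          args=(edge_residuals_fn, photo_residuals_fn)).x
```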

High-Quality Depth Map Generation of Humans in Monocular Videos (단안 영상에서 인간 오브젝트의 고품질 깊이 정보 생성 방법)

  • Lee, Jungjin;Lee, Sangwoo;Park, Jongjin;Noh, Junyong
    • Journal of the Korea Computer Graphics Society / v.20 no.2 / pp.1-11 / 2014
  • The quality of a 2D-to-3D conversion depends on the accuracy of the depth assigned to scene objects. Manual depth painting for given objects is labor intensive because every frame must be painted. A human is one of the most challenging objects for a high-quality conversion, as the human body is an articulated figure with many degrees of freedom (DOF). In addition, various styles of clothes, accessories, and hair create very complex silhouettes around the 2D human object. We propose an efficient method for estimating visually pleasing depths of a human in every frame of a monocular video. First, a 3D template model is matched to the person in the video using a small number of user-specified correspondences. Our pose estimation with sequential joint angle constraints reproduces a wide range of human motions (e.g., spine bending) by allowing the use of a fully skinned 3D model with a large number of joints and DOFs. The initial depth of the 2D object is assigned from the matching results and then propagated toward areas where depth is missing to produce a complete depth map. To handle complex silhouettes and appearances effectively, we introduce a partial depth propagation method based on color segmentation that preserves detail in the results. We compared our results with depth maps painted by experienced artists; the comparison shows that our method efficiently produces viable depth maps of humans in monocular videos.
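
The partial depth propagation step can be sketched as filling each color segment with the depth values already known inside it. The sketch below assumes SLIC superpixels from scikit-image and simple per-segment averaging; the paper's propagation scheme is more elaborate:

```python
import numpy as np
from skimage.segmentation import slic

def propagate_depth(image, sparse_depth, n_segments=400):
    """Fill missing depth by averaging known depth within each color segment.

    image        : (H, W, 3) RGB frame
    sparse_depth : (H, W) array, NaN where depth is unknown
    Segmentation backend (SLIC) and per-segment averaging are assumptions.
    """
    labels = slic(image, n_segments=n_segments, start_label=0)
    dense = sparse_depth.copy()
    for seg_id in np.unique(labels):
        mask = labels == seg_id
        known = sparse_depth[mask]
        known = known[~np.isnan(known)]
        if known.size:
            # assign the segment's mean known depth to its unknown pixels
            dense[mask & np.isnan(sparse_depth)] = known.mean()
    return dense
```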

CNN3D-Based Bus Passenger Prediction Model Using Skeleton Keypoints (Skeleton Keypoints를 활용한 CNN3D 기반의 버스 승객 승하차 예측모델)

  • Jang, Jin;Kim, Soo Hyung
    • Smart Media Journal / v.11 no.3 / pp.90-101 / 2022
  • Buses are a popular means of transportation, so thorough preparation for passenger safety management is needed. However, existing safety systems remain insufficient: for example, a fatal accident occurred in 2018 when a bus departed without recognizing an elderly passenger approaching to board. Sensors on the rear-door steps can prevent pinching accidents, but such systems do not prevent accidents that occur while passengers are boarding or alighting, as in the case above. If the boarding and alighting intentions of bus passengers could be predicted, it would help in developing safety systems that prevent such accidents. However, studies that predict passengers' boarding and alighting intentions are scarce. In this paper, we therefore propose a 1×1 CNN3D-based model that predicts boarding and alighting intentions from passengers' skeleton keypoints, which are extracted with UDP-Pose from images captured by a camera mounted on the bus. The proposed model achieves approximately 1-2% higher accuracy than RNN and LSTM models in predicting passengers' boarding and alighting intentions.
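
As a rough sketch of feeding skeleton keypoints into a 3D CNN for boarding/alighting intention classification, the toy PyTorch model below may help; its input layout, channel sizes, and kernel shapes are assumptions and do not reproduce the paper's 1×1 CNN3D architecture:

```python
import torch
import torch.nn as nn

class IntentNet3D(nn.Module):
    """Toy 3D CNN over a sequence of skeleton keypoints.

    Input is assumed to be (batch, 1, frames, joints, 2): a stack of
    per-frame 2D keypoints.
    """
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=(3, 3, 1), padding=(1, 1, 0)),
            nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=(3, 3, 1), padding=(1, 1, 0)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),     # global pooling over time, joints, coords
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)

# Example: batch of 4 clips, 8 frames, 17 joints, (x, y) per joint
clip = torch.randn(4, 1, 8, 17, 2)
logits = IntentNet3D()(clip)   # shape (4, 2): board vs. alight scores
```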

Fast Structure Recovery and Integration using Scaled Orthographic Factorization (개선된 직교분해기법을 사용한 구조의 빠른 복원 및 융합)

  • Yoon, Jong-Hyun;Park, Jong-Seung;Lee, Sang-Rak;Noh, Sung-Ryul
    • 한국HCI학회:학술대회논문집 / 2006.02a / pp.486-492 / 2006
  • This paper proposes a method for estimating 3D structure from 2D coordinates obtained by tracking feature points in video, together with a method for integrating the recovered structures using four or more common points. The shape is estimated from feature points shared across the frames of the video, and the features are tracked with the Lucas-Kanade method. The scaled orthographic factorization method is used to estimate the 3D coordinates; it recovers the 3D points while simultaneously computing the camera position and orientation. Since each recovered piece of data is only a part of the whole, a complete model is obtained by integration. The partial reconstructions are integrated by transforming their different coordinate systems into a common reference frame, and the integration depends on the camera position and orientation corresponding to the camera motion. The entire integration process is linear, running in less than 0.5 seconds on average with an average integration error below 0.1 cm.

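The factorization step described above builds on the classic rank-3 factorization of the tracked-point measurement matrix. The numpy sketch below shows only that basic orthographic factorization, without the metric upgrade or the scaled-orthographic refinements of the paper; the interface is an assumption:

```python
import numpy as np

def factorize(W):
    """Rank-3 factorization of a 2F x P measurement matrix W.

    W stacks the x/y image coordinates of P tracked points over F frames.
    Returns affine motion M (2F x 3) and shape S (3 x P).
    """
    W = W - W.mean(axis=1, keepdims=True)       # move the point centroid to the origin
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U3, s3, Vt3 = U[:, :3], s[:3], Vt[:3]       # keep the rank-3 part
    M = U3 * np.sqrt(s3)                        # affine camera motion
    S = np.sqrt(s3)[:, None] * Vt3              # affine 3D shape, so W ~ M @ S
    return M, S
```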

FBX Format Animation Generation System Combined with Joint Estimation Network using RGB Images (RGB 이미지를 이용한 관절 추정 네트워크와 결합된 FBX 형식 애니메이션 생성 시스템)

  • Lee, Yujin;Kim, Sangjoon;Park, Gooman
    • Journal of Broadcast Engineering / v.26 no.5 / pp.519-532 / 2021
  • Recently, in various fields such as games, movies, and animation, there has been increasing demand for content that uses motion capture to build body models and create characters expressed in 3D space. To avoid the filming costs of marker-based joint placement, studies have explored generating animation with RGB-D cameras, but problems with pose estimation accuracy and equipment cost remain. In this paper, we therefore propose a system that feeds RGB images into a joint estimation network and converts the results into 3D data to create FBX-format animations, reducing the equipment cost required for animation creation while increasing joint estimation accuracy. First, the 2D joints are estimated from the RGB image, and their 3D coordinates are estimated from these values. The result is converted into quaternions, the rotations are applied, and an FBX-format animation is created. To measure the accuracy of the proposed method, we verified the system by comparing the animation generated by the proposed system against an animation generated from the 3D positions of markers attached to the body.
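
Converting estimated 3D joint positions into per-joint quaternions is a key step before writing the FBX animation. A common rotation-between-vectors helper is sketched below as an assumption about the general idea, not the authors' exact conversion:

```python
import numpy as np

def quat_between(v_from, v_to, eps=1e-8):
    """Quaternion (w, x, y, z) rotating v_from onto v_to.

    In an animation pipeline, v_from would be a bone's rest direction and
    v_to the direction implied by the estimated 3D joint positions.
    """
    a = v_from / (np.linalg.norm(v_from) + eps)
    b = v_to / (np.linalg.norm(v_to) + eps)
    w = 1.0 + float(a @ b)
    if w < eps:                       # vectors are opposite; pick any orthogonal axis
        axis = np.cross(a, [1.0, 0.0, 0.0])
        if np.linalg.norm(axis) < eps:
            axis = np.cross(a, [0.0, 1.0, 0.0])
        q = np.concatenate([[0.0], axis])
    else:
        q = np.concatenate([[w], np.cross(a, b)])
    return q / np.linalg.norm(q)
```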

Ordinal Depth Based Deductive Weakly Supervised Learning for Monocular 3D Human Pose Estimation (단안 이미지로부터 3D 사람 자세 추정을 위한 순서 깊이 기반 연역적 약지도 학습 기법)

  • Youngchan Lee;Gyubin Lee;Wonsang You
    • Proceedings of the Korea Information Processing Society Conference / 2024.05a / pp.826-829 / 2024
  • For 3D human pose estimation, large amounts of training data have been collected and deep learning models have been studied extensively owing to the technology's applicability in many fields, whereas research on animal pose estimation has been very limited due to the scarcity of 3D animal data. As a preliminary study toward animal pose estimation, this work proposes a deep learning method that estimates 3D human pose from a single image without 3D training data. To this end, we construct a deductive-learning-based teacher-student model that is trained on virtual multi-view data generated from 2D pose data using a pre-trained multi-view learning model. In addition, instead of keypoint depth information, we apply a loss function based on ordinal depth information labeled from 2D images. To evaluate whether the proposed model is applicable to animal data, experiments were conducted using human data. The results show that the proposed method improves 3D pose estimation performance over existing monocular-image-based models.
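
A loss based on ordinal (closer/farther) depth labels between keypoint pairs typically takes a pairwise ranking form. The PyTorch sketch below shows one such form as an assumption; the paper's exact loss and labeling scheme may differ:

```python
import torch

def ordinal_depth_loss(pred_depth, pairs, labels, margin=0.0):
    """Pairwise ordinal depth loss over keypoint pairs.

    pred_depth : (N,) predicted depths of N keypoints
    pairs      : (P, 2) long tensor of keypoint index pairs (i, j)
    labels     : (P,) +1 if keypoint i is closer than j, -1 if farther,
                 0 if at roughly the same depth
    """
    di = pred_depth[pairs[:, 0]]
    dj = pred_depth[pairs[:, 1]]
    diff = di - dj
    rank_term = torch.log1p(torch.exp(labels * diff + margin))  # penalize wrong ordering
    eq_term = diff.pow(2)                                       # pull equal-depth pairs together
    return torch.where(labels == 0, eq_term, rank_term).mean()
```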

A Model-based 3-D Pose Estimation Method from Line Correspondences of Polyhedral Objects

  • Kang, Dong-Joong;Ha, Jong-Eun
    • 제어로봇시스템학회:학술대회논문집 / 2003.10a / pp.762-766 / 2003
  • In this paper, we present a new approach to the problem of estimating the 3-D location and orientation of a camera from a matched set of 3-D model and 2-D image features. An iterative least-squares method is used to solve for rotation and translation simultaneously, because conventional methods that solve for rotation first and then translation do not provide good solutions. We derive an error equation that uses roll-pitch-yaw angles to represent the rotation matrix. To minimize the error equation, the Levenberg-Marquardt algorithm is combined with a uniform sampling strategy over the rotation space to avoid getting stuck in local minima. Experimental results using real images are presented.

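The optimization described above, Levenberg-Marquardt over roll-pitch-yaw plus translation with uniformly sampled rotation restarts, can be scaffolded as follows. The residual function (which would encode the line-correspondence errors) is left to the caller, so this is only an assumed optimization shell, not the paper's formulation:

```python
import numpy as np
from scipy.optimize import least_squares

def estimate_pose(residual_fn, n_rot_samples=8, seed=0):
    """Levenberg-Marquardt pose fit with uniformly sampled rotation restarts.

    residual_fn(params) -> residual vector, with params = (roll, pitch, yaw,
    tx, ty, tz); it must return at least 6 residuals for method='lm'.
    """
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_rot_samples):
        rpy0 = rng.uniform(-np.pi, np.pi, size=3)   # uniform sample of rotation space
        x0 = np.concatenate([rpy0, np.zeros(3)])    # start translation at the origin
        result = least_squares(residual_fn, x0, method='lm')
        if best is None or result.cost < best.cost:
            best = result                           # keep the lowest-cost restart
    return best.x, best.cost
```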

A Study on the Application of ColMap in 3D Reconstruction for Cultural Heritage Restoration

  • Byong-Kwon Lee;Beom-jun Kim;Woo-Jong Yoo;Min Ahn;Soo-Jin Han
    • Journal of the Korea Society of Computer and Information / v.28 no.8 / pp.95-101 / 2023
  • Colmap is an innovative technology that is highly effective as a tool for 3D reconstruction tasks and excels at constructing intricate 3D models from images and their corresponding metadata. Colmap generates 3D models by merging 2D images, camera position data, depth information, and more, achieving detailed and precise 3D reconstructions of real-world objects. In addition, Colmap leverages GPUs for rapid processing, allowing efficient operation even on large data sets. In this paper, we present a method for collecting 2D images of traditional Korean towers and reconstructing them into 3D models using Colmap, and we apply this technology to the restoration of traditional stone towers in South Korea. As a result, we confirmed the potential applicability of Colmap in the field of cultural heritage restoration.
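
For reference, a typical sparse reconstruction with the COLMAP command-line tool chains feature extraction, matching, and mapping. The wrapper below assumes the `colmap` binary is on PATH and uses placeholder paths; dense reconstruction and any project-specific settings from the paper are omitted:

```python
import os
import subprocess

def run_sparse_reconstruction(image_dir, workspace):
    """Run COLMAP's standard sparse-reconstruction steps via its CLI.

    image_dir : folder of input photos (e.g. images of a stone pagoda)
    workspace : output folder for the feature database and sparse model
    """
    os.makedirs(workspace, exist_ok=True)
    db = os.path.join(workspace, "database.db")
    sparse = os.path.join(workspace, "sparse")
    os.makedirs(sparse, exist_ok=True)
    steps = [
        ["colmap", "feature_extractor", "--database_path", db, "--image_path", image_dir],
        ["colmap", "exhaustive_matcher", "--database_path", db],
        ["colmap", "mapper", "--database_path", db, "--image_path", image_dir,
         "--output_path", sparse],
    ]
    for cmd in steps:
        subprocess.run(cmd, check=True)  # raise if any step fails

# Hypothetical usage:
# run_sparse_reconstruction("images/pagoda", "workspace/pagoda")
```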

The Estimation of Hand Pose Based on Mean-Shift Tracking Using the Fusion of Color and Depth Information for Marker-less Augmented Reality (비마커 증강현실을 위한 색상 및 깊이 정보를 융합한 Mean-Shift 추적 기반 손 자세의 추정)

  • Lee, Sun-Hyoung;Hahn, Hern-Soo;Han, Young-Joon
    • Journal of the Korea Society of Computer and Information / v.17 no.7 / pp.155-166 / 2012
  • This paper proposes a new method for estimating hand pose through a Mean-Shift tracking algorithm that fuses color and depth information, for marker-less augmented reality. In marker-less augmented reality, most previous studies detect the hand region using skin color against simple experimental backgrounds. Because finger features must be detected on the hand, the hand poses that can be measured by cameras are considerably restricted. The proposed method, in contrast, can easily detect the hand pose against complex backgrounds through a new Mean-Shift tracking method that fuses the color and depth information from a 3D sensor. The proposed pose estimation uses the center of gravity and two random points on the hand, without imposing strong constraints. The proposed Mean-Shift tracking method yields about 50 pixels less error than a conventional tracking method that uses only color values. Augmented reality experiments show that the proposed method performs as well as a marker-based approach against complex backgrounds.
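
The fused color-depth Mean-Shift step can be pictured as running OpenCV's meanShift on a probability map that multiplies a skin-color back-projection by a depth gate. That multiplicative fusion and the parameter choices below are illustrative assumptions, not the paper's exact formulation:

```python
import cv2
import numpy as np

def track_hand(frame_bgr, depth, window, hue_hist, depth_ref, depth_tol=150):
    """One Mean-Shift step on a probability map fusing color and depth.

    hue_hist  : normalized hue histogram of the initial hand region (cv2.calcHist)
    window    : current search window as (x, y, w, h)
    depth_ref : expected hand depth (same units as `depth`, e.g. millimeters)
    """
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    backproj = cv2.calcBackProject([hsv], [0], hue_hist, [0, 180], 1)
    # keep only pixels whose depth is close to the expected hand depth
    depth_gate = np.abs(depth.astype(np.float32) - depth_ref) < depth_tol
    prob = (backproj * depth_gate).astype(np.uint8)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1.0)
    _, window = cv2.meanShift(prob, window, criteria)
    return window
```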