• Title/Summary/Keyword: 2D-3D pose estimation

A Method for 3D Human Pose Estimation based on 2D Keypoint Detection using RGB-D information (RGB-D 정보를 이용한 2차원 키포인트 탐지 기반 3차원 인간 자세 추정 방법)

  • Park, Seohee;Ji, Myunggeun;Chun, Junchul
    • Journal of Internet Computing and Services / v.19 no.6 / pp.41-51 / 2018
  • Recently, in the field of video surveillance, deep learning-based methods have been applied to intelligent video surveillance systems, enabling robust detection of various events such as crime, fire, and other abnormal phenomena. However, since occlusion occurs due to the loss of 3D information when the 3D real world is projected onto a 2D image, the occlusion problem must be considered in order to accurately detect objects and estimate their poses. Therefore, in this paper, we detect moving objects while resolving the occlusion problem in the object detection process by adding depth information to the existing RGB information. Then, using a convolutional neural network on the detected region, the positions of the 14 keypoints of the human joints are predicted. Finally, to address the self-occlusion problem that arises during pose estimation, a method for 3D human pose estimation is described that extends the estimation range to 3D space using the predicted 2D keypoints and a deep neural network. The 2D and 3D pose estimation results of this research can serve as basic data for future human behavior recognition and contribute to the development of industrial technology.
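
The abstract above couples 2D keypoint detection with depth to reach 3D coordinates. As an illustration of the geometric half of that idea only, here is a minimal sketch of back-projecting detected 2D joints into camera space with a depth map and pinhole intrinsics; the function name, intrinsic values, and synthetic data are assumptions, and the paper's learned 3D lifting network is not reproduced.

```python
# Minimal sketch: lift 2D keypoints to 3D camera coordinates using a depth
# map and assumed pinhole intrinsics (fx, fy, cx, cy). Hypothetical helper,
# not the paper's network-based method.
import numpy as np

def lift_keypoints_to_3d(keypoints_2d, depth_map, fx, fy, cx, cy):
    """keypoints_2d: (N, 2) array of (u, v) pixels; depth_map: (H, W) meters.
    Returns an (N, 3) array of (X, Y, Z) camera-space coordinates."""
    points_3d = []
    for u, v in keypoints_2d:
        z = depth_map[int(round(v)), int(round(u))]  # depth at the keypoint
        x = (u - cx) * z / fx                        # back-project along x
        y = (v - cy) * z / fy                        # back-project along y
        points_3d.append((x, y, z))
    return np.asarray(points_3d)

# Synthetic example: 14 joints over a flat depth map at 2 m.
kps = np.random.uniform(100, 400, size=(14, 2))
depth = np.full((480, 640), 2.0)
print(lift_keypoints_to_3d(kps, depth, fx=525.0, fy=525.0, cx=320.0, cy=240.0))
```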

Deep Learning-Based Outlier Detection and Correction for 3D Pose Estimation (3차원 자세 추정을 위한 딥러닝 기반 이상치 검출 및 보정 기법)

  • Ju, Chan-Yang;Park, Ji-Sung;Lee, Dong-Ho
    • KIPS Transactions on Software and Data Engineering / v.11 no.10 / pp.419-426 / 2022
  • In this paper, we propose a method to improve the accuracy of 3D human pose estimation models across various movements. Existing human pose estimation models suffer from jitter, inversion, swap, and miss errors that produce incorrect coordinates when estimating human poses, lowering the accuracy with which exact joint coordinates are detected. We propose a method consisting of a detection step and a correction step to handle these problems: a deep learning-based outlier detection method effectively detects outliers among human pose coordinates during movement, and a rule-based correction method corrects the outliers according to a simple rule. Experiments using 2D golf swing motion data show that the proposed method is effective across various motions and demonstrate the possibility of extension from 2D to 3D coordinates.
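
To make the detect-then-correct pipeline concrete, here is a hedged sketch in which a crude displacement-threshold detector stands in for the paper's deep learning-based outlier detector, and linear interpolation over neighboring frames stands in for one possible simple correction rule; the threshold and the synthetic track are assumptions.

```python
# Hedged stand-ins for the paper's two stages: threshold-based outlier
# detection on frame-to-frame displacement, then rule-based correction by
# interpolating flagged frames from their unflagged neighbors.
import numpy as np

def detect_outliers(track, threshold=30.0):
    """track: (T, 2) array of one joint's (x, y) per frame.
    Returns a boolean mask of frames flagged as outliers."""
    disp = np.linalg.norm(np.diff(track, axis=0), axis=1)
    mask = np.zeros(len(track), dtype=bool)
    mask[1:] = disp > threshold
    return mask

def correct_outliers(track, mask):
    """Replace flagged frames by interpolating from unflagged neighbors."""
    fixed = track.copy()
    good = np.where(~mask)[0]
    bad = np.where(mask)[0]
    for axis in range(track.shape[1]):
        fixed[bad, axis] = np.interp(bad, good, track[good, axis])
    return fixed

track = np.cumsum(np.random.randn(100, 2), axis=0)  # smooth synthetic motion
track[40] += 200.0                                  # inject a jitter spike
clean = correct_outliers(track, detect_outliers(track))
```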

The Estimation of the Transform Parameters Using the Pattern Matching with 2D Images (2차원 영상에서 패턴매칭을 이용한 3차원 물체의 변환정보 추정)

  • 조택동;이호영;양상민
    • Journal of the Korean Society for Precision Engineering / v.21 no.7 / pp.83-91 / 2004
  • The determination of camera position and orientation from known correspondences between 3D reference points and their images is known as pose estimation in computer vision and as space resection in photogrammetry. This paper discusses the estimation of transform parameters using a pattern matching method with 2D images only. In general, 3D reference points or lines are needed to find the 3D transform parameters, but this method is applied without them: it uses only two images to find the transform parameters between them. The algorithm is simulated using Visual C++ on Windows 98.
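
In the same two-image spirit, the sketch below recovers a planar transform between two views by matching features: ORB matching with a RANSAC homography, a modern stand-in rather than the paper's 2004 pattern matching implementation. The image paths are placeholders.

```python
# Modern stand-in: estimate the transform between two images from feature
# matches alone, with no 3D reference points or lines.
import cv2
import numpy as np

img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)  # placeholder path

orb = cv2.ORB_create(1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
print("Estimated transform between the two images:\n", H)
```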

Markerless camera pose estimation framework utilizing construction material with standardized specification

  • Harim Kim;Heejae Ahn;Sebeen Yoon;Taehoon Kim;Thomas H.-K. Kang;Young K. Ju;Minju Kim;Hunhee Cho
    • Computers and Concrete / v.33 no.5 / pp.535-544 / 2024
  • In the rapidly advancing landscape of computer vision (CV) technology, there is burgeoning interest in its integration with the construction industry. Camera calibration is the process of deriving the intrinsic and extrinsic parameters that govern how coordinates in the 3D real world are projected onto the 2D image plane, where the intrinsic parameters are internal factors of the camera and the extrinsic parameters are external factors such as the position and rotation of the camera. Camera pose estimation, or extrinsic calibration, which estimates the extrinsic parameters, is essential for CV applications in construction, since it can be used for indoor navigation of construction robots and for field monitoring by restoring depth information. Traditionally, camera pose estimation relied on target objects such as markers or patterns. However, these marker- or pattern-based methods are often time-consuming because a target object must be installed for each estimation. As a solution to this challenge, this study introduces a novel framework that facilitates camera pose estimation using standardized materials commonly found on construction sites, such as concrete forms. The proposed framework obtains 3D real-world coordinates by referring to construction materials with known specifications, extracts the corresponding 2D image-plane coordinates through keypoint detection, and derives the camera's pose through the perspective-n-point (PnP) method, which computes the extrinsic parameters by matching 3D-2D coordinate pairs. This framework represents a substantial advancement, as it streamlines the extrinsic calibration process and thereby potentially enhances the efficiency of CV technology application and data collection at construction sites. The approach holds promise for expediting and optimizing various construction-related tasks by automating and simplifying the calibration procedure.
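
The PnP step at the heart of this framework can be sketched with OpenCV's solvePnP: given the 3D corner coordinates of a material with standardized dimensions and their detected 2D image locations, it recovers the extrinsic parameters. The panel size, intrinsics, and pixel coordinates below are illustrative assumptions, not values from the paper.

```python
# Sketch of extrinsic calibration from a standardized construction material.
import cv2
import numpy as np

# Four corners of a concrete form panel, e.g. 0.6 m x 1.2 m (assumed size).
object_points = np.array([[0.0, 0.0, 0.0],
                          [0.6, 0.0, 0.0],
                          [0.6, 1.2, 0.0],
                          [0.0, 1.2, 0.0]], dtype=np.float64)

# Corresponding keypoints detected in the image (placeholder pixels).
image_points = np.array([[320.0, 240.0],
                         [480.0, 250.0],
                         [470.0, 430.0],
                         [310.0, 420.0]], dtype=np.float64)

K = np.array([[800.0, 0.0, 320.0],  # assumed intrinsic matrix
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
dist = np.zeros(5)                  # assume negligible lens distortion

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)
R, _ = cv2.Rodrigues(rvec)          # rotation matrix from Rodrigues vector
camera_position = -R.T @ tvec       # camera center in world coordinates
print("Camera position:", camera_position.ravel())
```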

Fast Hand Pose Estimation with Keypoint Detection and Annoy Tree (Keypoint Detection과 Annoy Tree를 사용한 2D Hand Pose Estimation)

  • Lee, Hui-Jae;Kang, Min-Hye
    • Proceedings of the Korean Society of Computer Information Conference / 2021.01a / pp.277-278 / 2021
  • Research on hand gesture recognition has recently been active, but most approaches require 3D information including depth, meaning that existing work does not operate without a depth camera. This project proposes a methodology for hand gesture recognition through hand keypoint detection on 2D images, without a depth camera. The InterHand2.6M dataset [1] provided by Facebook is used for training. The proposed method proceeds in two main stages. First, hand detection is performed via object detection; because the dataset was captured against dark backgrounds and detection performance degrades in real-world environments, an image-composition augmentation technique is proposed to address this. Second, 21 hand keypoints are obtained via keypoint detection. After generating meaningful vectors through experiments, an Annoy (Approximate Nearest Neighbors Oh Yeah) tree is built, post-processing is performed with the generated Annoy trees, and the final pose estimation is completed. Pose estimation using the Annoy tree was faster than using a neural network (NN) while delivering comparable performance.
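
The Annoy lookup stage can be sketched as follows, assuming each hand has already been reduced to a fixed-length feature vector derived from its 21 keypoints; the vector dimensionality and the random reference vectors are illustrative assumptions.

```python
# Sketch: replace a neural-network classifier with an approximate
# nearest-neighbor lookup over reference pose vectors using Annoy.
import numpy as np
from annoy import AnnoyIndex

DIM = 42  # e.g. 21 keypoints x (x, y); assumed layout
index = AnnoyIndex(DIM, "euclidean")

# Populate the index with reference pose vectors (random stand-ins here).
reference_vectors = np.random.rand(500, DIM).astype(np.float32)
for i, vec in enumerate(reference_vectors):
    index.add_item(i, vec)
index.build(10)  # more trees: slower build, better recall

# At inference time, look up the nearest reference poses for a query hand.
query = np.random.rand(DIM).astype(np.float32)
ids, dists = index.get_nns_by_vector(query, 5, include_distances=True)
print("Nearest reference poses:", ids, dists)
```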

Hard Example Generation by Novel View Synthesis for 3-D Pose Estimation (3차원 자세 추정 기법의 성능 향상을 위한 임의 시점 합성 기반의 고난도 예제 생성)

  • Minji Kim;Sungchan Kim
    • IEMEK Journal of Embedded Systems and Applications / v.19 no.1 / pp.9-17 / 2024
  • It is widely recognized that for 3D human pose estimation (HPE), dataset acquisition is expensive and the effectiveness of augmentation techniques from conventional visual recognition tasks is limited. We address these difficulties with a simple but effective method that augments input images in terms of viewpoint when training a 3D HPE model. Our intuition is that meaningful variants of the input images for HPE can be obtained by viewing a human instance from an arbitrary viewpoint different from that of the original image. The core idea is to synthesize new images that exhibit self-occlusion, and are thus difficult to predict, at viewpoints different from the original example even though the pose is the same. We incorporate this idea into the training procedure of the 3D HPE model as an augmentation stage for the input samples. We show that the strategy for augmenting the synthesized examples must be carefully designed in terms of how frequently the augmentation is performed and how viewpoints are selected for synthesis. To this end, we propose a new metric that measures the prediction difficulty of input images for 3D HPE in terms of the distance between corresponding keypoints on the two sides of the human body. Extensive exploration of the space of augmentation probabilities and of example selection according to the proposed distance metric leads to a performance gain of up to 6.2% on Human3.6M, the well-known pose estimation dataset.
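
One hedged reading of the proposed difficulty metric is sketched below: corresponding left/right keypoints that nearly overlap in the image suggest self-occlusion, and hence a harder example. The joint pairing, the 17-joint layout, and the mean aggregation are assumptions for illustration, not the paper's exact definition.

```python
# Assumed difficulty score: mean image-space distance between corresponding
# left/right joints; smaller distances indicate likely self-occlusion.
import numpy as np

# (left, right) joint index pairs in an assumed 17-joint skeleton layout.
SIDE_PAIRS = [(5, 6), (7, 8), (9, 10), (11, 12), (13, 14), (15, 16)]

def prediction_difficulty(keypoints_2d):
    """keypoints_2d: (17, 2) array; a smaller score means a harder example."""
    dists = [np.linalg.norm(keypoints_2d[l] - keypoints_2d[r])
             for l, r in SIDE_PAIRS]
    return float(np.mean(dists))

pose = np.random.uniform(0, 256, size=(17, 2))
print("Difficulty score (smaller = harder):", prediction_difficulty(pose))
```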

Localization of a Monocular Camera using a Feature-based Probabilistic Map (특징점 기반 확률 맵을 이용한 단일 카메라의 위치 추정방법)

  • Kim, Hyungjin;Lee, Donghwa;Oh, Taekjun;Myung, Hyun
    • Journal of Institute of Control, Robotics and Systems / v.21 no.4 / pp.367-371 / 2015
  • In this paper, a novel localization method for a monocular camera using a feature-based probabilistic map is proposed. The pose of a camera is generally estimated from 3D-to-2D correspondences between a 3D map and the image plane through the PnP algorithm. In the computer vision community, an accurate 3D map for camera pose estimation is generated by optimization over a large image dataset. In the robotics community, a camera pose is estimated by probabilistic approaches even when features are scarce, but an extra system is needed because the camera alone cannot estimate the full state of the robot pose. Therefore, we propose an accurate localization method for a monocular camera that uses a probabilistic approach in the case of an insufficient image dataset, without any extra system. In our system, features from a probabilistic map are projected onto the image plane using a linear approximation. By minimizing the Mahalanobis distance between the features projected from the probabilistic map and the features extracted from a query image, the pose of the monocular camera is refined from an initial pose obtained by the PnP algorithm. The proposed algorithm is demonstrated through simulations in 3D space.
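
The refinement criterion can be sketched as a cost function: the sum of squared Mahalanobis distances between map features projected into the image and features extracted from the query image, each with a 2x2 covariance. The synthetic data below are assumptions; the paper minimizes such a cost starting from a PnP initial pose.

```python
# Sketch of the Mahalanobis-distance cost evaluated at one candidate pose.
import numpy as np

def mahalanobis_cost(projected, observed, covariances):
    """projected, observed: (N, 2) pixel arrays; covariances: (N, 2, 2)."""
    cost = 0.0
    for p, o, S in zip(projected, observed, covariances):
        d = p - o
        cost += float(d @ np.linalg.inv(S) @ d)  # squared Mahalanobis term
    return cost

N = 20
projected = np.random.rand(N, 2) * 640           # map features in the image
observed = projected + np.random.randn(N, 2) * 2.0
covs = np.tile(np.eye(2) * 4.0, (N, 1, 1))       # assumed 2-pixel std dev
print("Cost at current pose:", mahalanobis_cost(projected, observed, covs))
```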

A Framework for Real Time Vehicle Pose Estimation based on synthetic method of obtaining 2D-to-3D Point Correspondence

  • Yun, Sergey;Jeon, Moongu
    • Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.904-907 / 2014
  • In this work we present a robust and fast approach to estimating 3D vehicle pose that provides results under specific traffic surveillance conditions: a single fixed CCTV camera located relatively high above the ground, with its pitch axis parallel to the reference plane and a known camera focal length. The benefit of our framework is that it requires no prior training or camera calibration and does not rely heavily on a 3D model shape, as most common techniques do. It also handles poorly shaped objects, since we focus on low-resolution surveillance scenes. The pose estimation task is posed as a PnP problem, which we solve with the well-known POSIT algorithm [1]. To use this algorithm, at least four non-coplanar point correspondences are required; to find them, we propose a set of techniques based on model and scene geometry. Our framework can be applied to real-time video sequences, and the estimated vehicle poses are shown on real image scenes.
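
POSIT needs at least four non-coplanar model-to-image correspondences. The original cvPOSIT lived in OpenCV's legacy C API, so the sketch below substitutes cv2.solvePnP with the EPnP solver, which also accepts four non-coplanar points; the box-shaped vehicle model and pixel coordinates are illustrative assumptions.

```python
# Sketch: vehicle pose from four non-coplanar correspondences. EPnP stands
# in for the legacy POSIT implementation.
import cv2
import numpy as np

# Four non-coplanar points of a crude vehicle box model (meters, assumed).
model_points = np.array([[0.0, 0.0, 0.0],
                         [4.2, 0.0, 0.0],   # length along the body
                         [0.0, 1.8, 0.0],   # width
                         [0.0, 0.0, 1.5]],  # height, off the base plane
                        dtype=np.float64)

image_points = np.array([[300.0, 220.0],    # placeholder detections
                         [420.0, 230.0],
                         [310.0, 260.0],
                         [295.0, 160.0]], dtype=np.float64)

K = np.array([[700.0, 0.0, 320.0],          # assumed known camera focus
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])

ok, rvec, tvec = cv2.solvePnP(model_points, image_points, K, np.zeros(5),
                              flags=cv2.SOLVEPNP_EPNP)
print("Vehicle rotation (Rodrigues):", rvec.ravel())
print("Vehicle translation:", tvec.ravel())
```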

3D Pose Estimation from Selective View for 3D Volumetric Data Deformation (3 차원 볼류메트릭 데이터 변형을 위한 선택적 시점에서의 3 차원 포즈 추정)

  • Lee, Sol;Kim, Ji-Hyun;Park, Jung-Tak;Park, Byung-Seo;Seo, Young-Ho
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.156-157 / 2022
  • In this paper, 2D pose estimation results from selectively chosen viewpoints are registered to produce a highly accurate 3D skeleton. 3D data from multiple frames is projected in 36 directions at 10-degree intervals, and only the 2D pose estimation results from viewpoints with high confidence are selected and registered in 3D. By varying the number of viewpoints used and analyzing the effect on accuracy, the minimum number of viewpoints yielding high accuracy was determined experimentally. In addition, comparison of the registered 3D skeleton against a motion capture sensor confirmed that the proposed algorithm improves the accuracy of 3D pose estimation.
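
A hedged sketch of the fusion idea follows: given per-view 2D joint estimates with confidences and each view's 3x4 projection matrix, keep only the most confident views per joint and triangulate by linear (DLT) least squares. The view count, selection rule, and synthetic demo data are assumptions; the paper's registration procedure is not reproduced.

```python
# Sketch: confidence-based view selection followed by DLT triangulation.
import numpy as np

def triangulate_joint(proj_mats, points_2d):
    """proj_mats: list of (3, 4) matrices; points_2d: (V, 2). Returns (3,)."""
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        rows.append(u * P[2] - P[0])   # standard DLT constraints
        rows.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    X = Vt[-1]
    return X[:3] / X[3]

def fuse_views(proj_mats, poses_2d, confidences, top_k=6):
    """poses_2d: (V, J, 2); confidences: (V, J). Fuses each joint in 3D."""
    joints = []
    for j in range(poses_2d.shape[1]):
        best = np.argsort(confidences[:, j])[-top_k:]  # most confident views
        joints.append(triangulate_joint([proj_mats[i] for i in best],
                                        poses_2d[best, j]))
    return np.asarray(joints)

# Synthetic demo; 36 views at 10-degree intervals would be built the same way.
V, J = 8, 17
proj_mats = [np.hstack([np.eye(3), 0.1 * np.random.randn(3, 1)])
             for _ in range(V)]
gt = np.random.randn(J, 3) + np.array([0.0, 0.0, 5.0])
proj = np.stack([(P @ np.c_[gt, np.ones(J)].T).T for P in proj_mats])
poses_2d = proj[..., :2] / proj[..., 2:3]
conf = np.random.rand(V, J)
print(fuse_views(proj_mats, poses_2d, conf, top_k=4).shape)  # (17, 3)
```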

High-quality Texture Extraction for Point Clouds Reconstructed from RGB-D Images (RGB-D 영상으로 복원한 점 집합을 위한 고화질 텍스쳐 추출)

  • Seo, Woong;Park, Sang Uk;Ihm, Insung
    • Journal of the Korea Computer Graphics Society / v.24 no.3 / pp.61-71 / 2018
  • When triangular meshes are generated from point clouds reconstructed in global space through camera pose estimation against captured RGB-D streams, the quality of the resulting meshes improves as more triangles are used. However, 3D reconstructed models beyond a certain size begin to suffer from unsightly artefacts due to the insufficient precision of RGB-D sensors, as well as from significant burdens in memory requirements and rendering cost. In this paper, for the generation of 3D models appropriate for real-time applications, we propose an effective technique that extracts high-quality textures for moderate-sized meshes from the captured colors associated with the reconstructed point sets. In particular, we show that textures can be generated effectively for 3D models reconstructed from captured RGB-D image streams via a simple method based on the mapping between the 3D global space resulting from camera pose estimation and the 2D texture space.
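
The core 3D-to-2D mapping can be sketched by projecting reconstructed points into a captured RGB frame under its estimated camera pose and sampling colors at the projected pixels. Nearest-pixel sampling and the synthetic inputs are assumptions; the paper builds full texture maps for meshes rather than per-point colors.

```python
# Sketch: sample per-point colors by projecting world points into one frame.
import numpy as np

def sample_colors(points_3d, R, t, K, image):
    """Project (N, 3) world points through pose (R, t) and intrinsics K,
    then read the color at each projected pixel of an (H, W, 3) image."""
    cam = (R @ points_3d.T + t.reshape(3, 1)).T  # world -> camera space
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                  # perspective divide
    h, w = image.shape[:2]
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return image[v, u]                           # nearest-pixel colors

pts = np.random.randn(100, 3) + np.array([0.0, 0.0, 3.0])
K = np.array([[525.0, 0.0, 320.0], [0.0, 525.0, 240.0], [0.0, 0.0, 1.0]])
frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
colors = sample_colors(pts, np.eye(3), np.zeros(3), K, frame)
```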