• Title/Summary/Keyword: 2D-3D pose estimation

Search Result 86, Processing Time 0.029 seconds

Investigation of image preprocessing and face covering influences on motion recognition by a 2D human pose estimation algorithm (모션 인식을 위한 2D 자세 추정 알고리듬의 이미지 전처리 및 얼굴 가림에 대한 영향도 분석)

  • Noh, Eunsol;Yi, Sarang;Hong, Seokmoo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.7
    • /
    • pp.285-291
    • /
    • 2020
  • In manufacturing, humans are being replaced with robots, but expert skills remain difficult to convert to data, making them difficult to apply to industrial robots. One method is by visual motion recognition, but physical features may be judged differently depending on the image data. This study aimed to improve the accuracy of vision methods for estimating the posture of humans. Three OpenPose vision models were applied: MPII, COCO, and COCO+foot. To identify the effects of face-covering accessories and image preprocessing on the Convolutional Neural Network (CNN) structure, the presence/non-presence of accessories, image size, and filtering were set as the parameters affecting the identification of a human's posture. For each parameter, image data were applied to the three models, and the errors between the actual and predicted values, as well as the percentage correct keypoints (PCK), were calculated. The COCO+foot model showed the lowest sensitivity to all three parameters. A <50% (from 3024×4032 to 1512×2016 pixels) reduction in image size was considered acceptable. Emboss filtering, in combination with MPII, provided the best results (reduced error of <60 pixels).

Real-time 3D Volumetric Model Generation using Multiview RGB-D Camera (다시점 RGB-D 카메라를 이용한 실시간 3차원 체적 모델의 생성)

  • Kim, Kyung-Jin;Park, Byung-Seo;Kim, Dong-Wook;Kwon, Soon-Chul;Seo, Young-Ho
    • Journal of Broadcast Engineering
    • /
    • v.25 no.3
    • /
    • pp.439-448
    • /
    • 2020
  • In this paper, we propose a modified optimization algorithm for point cloud matching of multi-view RGB-D cameras. In general, in the computer vision field, it is very important to accurately estimate the position of the camera. The 3D model generation methods proposed in the previous research require a large number of cameras or expensive 3D cameras. Also, the methods of obtaining the external parameters of the camera through the 2D image have a large error. In this paper, we propose a matching technique for generating a 3D point cloud and mesh model that can provide omnidirectional free viewpoint using 8 low-cost RGB-D cameras. We propose a method that uses a depth map-based function optimization method with RGB images and obtains coordinate transformation parameters that can generate a high-quality 3D model without obtaining initial parameters.

Motion Prior-Guided Refinement for Accurate Baseball Player Pose Estimation (스윙 모션 사전 지식을 활용한 정확한 야구 선수 포즈 보정)

  • Seunghyun Oh;Heewon Kim
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2024.05a
    • /
    • pp.615-616
    • /
    • 2024
  • 현대 야구에서 타자의 스윙 패턴 분석은 상대 투수가 투구 전략을 수립하는데 상당히 중요하다. 이미지 기반의 인간 포즈 추정(HPE)은 대규모 스윙 패턴 분석을 자동화할 수 있다. 그러나 기존의 HPE 방법은 빠르고 가려진 신체 움직임으로 인해 복잡한 스윙 모션을 정확하게 추정하는 데 어려움이 있다. 이러한 문제를 극복하기 위해 스윙 모션에 대한 사전 정보를 활용하여 야구 선수의 포즈를 보정하는 방법(BPPC)을 제안한다. BPPC는 동작 인식, 오프셋 학습, 3D에서 2D 프로젝션 및 동작 인지 손실 함수를 통해 스윙 모션에 대한 사전 정보를 반영하여 기성 HPE 모델 결과를 보정한다. 실험에 따르면 BPPC는 벤치마크 데이터셋에서 기성 HPE 모델의 2D 키포인트 정확도를 정량적 및 정성적으로 향상시키고, 특히 신뢰도 점수가 낮고 부정확한 키포인트를 크게 보정했다.

Infrared camera calibration-based passive marker pose estimation method (적외선 카메라 칼리브레이션 기반 패시브 마커 자세 추정 방법)

  • Park, Byung-Seo;Kim, Dong-Wook;Seo, Young-Ho
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2022.06a
    • /
    • pp.203-204
    • /
    • 2022
  • 본 논문에서는 다수의 적외선 카메라의 2D 패시브마커 영상을 이용한 3차원 리지드 바디(Rigid Body) 자세추정 방법을 제안한다. 1차로 개별 카메라의 내부 변수를 구하기 위해 체스보드를 이용한 칼리브레이션 과정을 수행하고, 2차 보정 과정에서 3개의 적외선 마커가 있는 삼각형 구조물을 모든 카메라가 관찰 가능하도록 움직인 후 프레임별 누적된 데이터를 계산하여 카메라 간의 상대적인 위치정보의 보정 및 업데이트를 진행한다. 이 후 각 카메라의 좌표계를 3D월드 좌표계로 변환하는 과정을 통해 3개 마커의 3차원 좌표를 복원하여 각 마커간 거리를 계산하여 실제 거리와의 차이를 비교한 결과 1mm 내외의 오차를 측정하였다.

  • PDF

Towards 3D Modeling of Buildings using Mobile Augmented Reality and Aerial Photographs (모바일 증강 현실 및 항공사진을 이용한 건물의 3차원 모델링)

  • Kim, Se-Hwan;Ventura, Jonathan;Chang, Jae-Sik;Lee, Tae-Hee;Hollerer, Tobias
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.46 no.2
    • /
    • pp.84-91
    • /
    • 2009
  • This paper presents an online partial 3D modeling methodology that uses a mobile augmented reality system and aerial photographs, and a tracking methodology that compares the 3D model with a video image. Instead of relying on models which are created in advance, the system generates a 3D model for a real building on the fly by combining frontal and aerial views. A user's initial pose is estimated using an aerial photograph, which is retrieved from a database according to the user's GPS coordinates, and an inertial sensor which measures pitch. We detect edges of the rooftop based on Graph cut, and find edges and a corner of the bottom by minimizing the proposed cost function. To track the user's position and orientation in real-time, feature-based tracking is carried out based on salient points on the edges and the sides of a building the user is keeping in view. We implemented camera pose estimators using both a least squares estimator and an unscented Kalman filter (UKF). We evaluated the speed and accuracy of both approaches, and we demonstrated the usefulness of our computations as important building blocks for an Anywhere Augmentation scenario.

Photorealistic Real-Time Dense 3D Mesh Mapping for AUV (자율 수중 로봇을 위한 사실적인 실시간 고밀도 3차원 Mesh 지도 작성)

  • Jungwoo Lee;Younggun Cho
    • The Journal of Korea Robotics Society
    • /
    • v.19 no.2
    • /
    • pp.188-195
    • /
    • 2024
  • This paper proposes a photorealistic real-time dense 3D mapping system that utilizes a neural network-based image enhancement method and mesh-based map representation. Due to the characteristics of the underwater environment, where problems such as hazing and low contrast occur, it is hard to apply conventional simultaneous localization and mapping (SLAM) methods. At the same time, the behavior of Autonomous Underwater Vehicle (AUV) is computationally constrained. In this paper, we utilize a neural network-based image enhancement method to improve pose estimation and mapping quality and apply a sliding window-based mesh expansion method to enable lightweight, fast, and photorealistic mapping. To validate our results, we utilize real-world and indoor synthetic datasets. We performed qualitative validation with the real-world dataset and quantitative validation by modeling images from the indoor synthetic dataset as underwater scenes.

Efficient Intermediate Joint Estimation using the UKF based on the Numerical Inverse Kinematics (수치적인 역운동학 기반 UKF를 이용한 효율적인 중간 관절 추정)

  • Seo, Yung-Ho;Lee, Jun-Sung;Lee, Chil-Woo
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.47 no.6
    • /
    • pp.39-47
    • /
    • 2010
  • A research of image-based articulated pose estimation has some problems such as detection of human feature, precise pose estimation, and real-time performance. In particular, various methods are currently presented for recovering many joints of human body. We propose the novel numerical inverse kinematics improved with the UKF(unscented Kalman filter) in order to estimate the human pose in real-time. An existing numerical inverse kinematics is required many iterations for solving the optimal estimation and has some problems such as the singularity of jacobian matrix and a local minima. To solve these problems, we combine the UKF as a tool for optimal state estimation with the numerical inverse kinematics. Combining the solution of the numerical inverse kinematics with the sampling based UKF provides the stability and rapid convergence to optimal estimate. In order to estimate the human pose, we extract the interesting human body using both background subtraction and skin color detection algorithm. We localize its 3D position with the camera geometry. Next, through we use the UKF based numerical inverse kinematics, we generate the intermediate joints that are not detect from the images. Proposed method complements the defect of numerical inverse kinematics such as a computational complexity and an accuracy of estimation.

Unsupervised Monocular Depth Estimation Using Self-Attention for Autonomous Driving (자율주행을 위한 Self-Attention 기반 비지도 단안 카메라 영상 깊이 추정)

  • Seung-Jun Hwang;Sung-Jun Park;Joong-Hwan Baek
    • Journal of Advanced Navigation Technology
    • /
    • v.27 no.2
    • /
    • pp.182-189
    • /
    • 2023
  • Depth estimation is a key technology in 3D map generation for autonomous driving of vehicles, robots, and drones. The existing sensor-based method has high accuracy but is expensive and has low resolution, while the camera-based method is more affordable with higher resolution. In this study, we propose self-attention-based unsupervised monocular depth estimation for UAV camera system. Self-Attention operation is applied to the network to improve the global feature extraction performance. In addition, we reduce the weight size of the self-attention operation for a low computational amount. The estimated depth and camera pose are transformed into point cloud. The point cloud is mapped into 3D map using the occupancy grid of Octree structure. The proposed network is evaluated using synthesized images and depth sequences from the Mid-Air dataset. Our network demonstrates a 7.69% reduction in error compared to prior studies.

Automatic Camera Pose Determination from a Single Face Image

  • Wei, Li;Lee, Eung-Joo;Ok, Soo-Yol;Bae, Sung-Ho;Lee, Suk-Hwan;Choo, Young-Yeol;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society
    • /
    • v.10 no.12
    • /
    • pp.1566-1576
    • /
    • 2007
  • Camera pose information from 2D face image is very important for making virtual 3D face model synchronize with the real face. It is also very important for any other uses such as: human computer interface, 3D object estimation, automatic camera control etc. In this paper, we have presented a camera position determination algorithm from a single 2D face image using the relationship between mouth position information and face region boundary information. Our algorithm first corrects the color bias by a lighting compensation algorithm, then we nonlinearly transformed the image into $YC_bC_r$ color space and use the visible chrominance feature of face in this color space to detect human face region. And then for face candidate, use the nearly reversed relationship information between $C_b\;and\;C_r$ cluster of face feature to detect mouth position. And then we use the geometrical relationship between mouth position information and face region boundary information to determine rotation angles in both x-axis and y-axis of camera position and use the relationship between face region size information and Camera-Face distance information to determine the camera-face distance. Experimental results demonstrate the validity of our algorithm and the correct determination rate is accredited for applying it into practice.

  • PDF

Facial Feature Extraction with Its Applications

  • Lee, Minkyu;Lee, Sangyoun
    • Journal of International Society for Simulation Surgery
    • /
    • v.2 no.1
    • /
    • pp.7-9
    • /
    • 2015
  • Purpose In the many face-related application such as head pose estimation, 3D face modeling, facial appearance manipulation, the robust and fast facial feature extraction is necessary. We present the facial feature extraction method based on shape regression and feature selection for real-time facial feature extraction. Materials and Methods The facial features are initialized by statistical shape model and then the shape of facial features are deformed iteratively according to the texture pattern which is selected on the feature pool. Results We obtain fast and robust facial feature extraction result with error less than 4% and processing time less than 12 ms. The alignment error is measured by average of ratio of pixel difference to inter-ocular distance. Conclusion The accuracy and processing time of the method is enough to apply facial feature based application and can be used on the face beautification or 3D face modeling.