• Title/Summary/Keyword: 3D Pose, AR

AI-Based Object Recognition Research for Augmented Reality Character Implementation (증강현실 캐릭터 구현을 위한 AI기반 객체인식 연구)

  • Seok-Hwan Lee;Jung-Keum Lee;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.6
    • /
    • pp.1321-1330
    • /
    • 2023
  • This study addresses the problem of 3D pose estimation for multiple human subjects from a single image, in the context of developing characters that can be used in augmented reality. In the existing top-down approach, all people in the image are first detected and then each is reconstructed independently; the problem is that this can produce inconsistent results, such as overlap or depth-order mismatches between the reconstructed people. The goal of this study is to solve these problems and develop a single network that provides a consistent 3D reconstruction of all humans in a scene. A key design choice is the integration of the SMPL parametric human body model into the top-down framework. On top of this, two losses are introduced: a distance-field-based collision loss and a depth-ordering loss (sketched below). The first prevents overlap between reconstructed people; the second reasons about occlusion and adjusts the depth ordering of people so that the rendering is consistent with the annotated instance segmentation. This provides depth supervision to the network without explicit 3D annotation of the images. Experimental results show that the proposed methodology outperforms existing methods on standard 3D pose benchmarks, and that the proposed losses enable more consistent reconstruction from natural images.
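
The depth-ordering loss described above can be illustrated with a minimal sketch. The tensor shapes, the margin value, and the way occlusion pairs are encoded are illustrative assumptions, not the paper's implementation; in the paper the pairs would be derived from annotated instance segmentation.

```python
import torch

def depth_order_loss(pred_depths, front_idx, back_idx, margin=0.1):
    """Penalize person pairs whose predicted depths contradict the
    known occlusion order (front_idx[k] occludes back_idx[k])."""
    # A violation occurs when the occluding person is not closer by `margin`.
    violation = pred_depths[front_idx] - pred_depths[back_idx] + margin
    return torch.relu(violation).mean()

# Toy usage: person 0 occludes person 1, but the prediction says otherwise.
depths = torch.tensor([2.5, 2.0], requires_grad=True)
loss = depth_order_loss(depths, torch.tensor([0]), torch.tensor([1]))
loss.backward()  # gradients push person 0 forward and person 1 back
```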

Detecting Complex 3D Human Motions with Body Model Low-Rank Representation for Real-Time Smart Activity Monitoring System

  • Jalal, Ahmad;Kamal, Shaharyar;Kim, Dong-Seong
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.3
    • /
    • pp.1189-1204
    • /
    • 2018
  • Detecting and capturing 3D human structure from intensity-based image sequences is an inherently challenging problem that has attracted the attention of many researchers, especially in real-time activity recognition (Real-AR). Such Real-AR systems have been significantly enhanced by depth sensors, which provide richer information than the RGB video sensors used in conventional systems. This study proposes a depth-based, routine-logging Real-AR system to identify daily human activity routines and make the surroundings an intelligent living space. The system proceeds in stages: data collection with a depth camera, feature extraction based on joint information, and training/recognition of each activity (see the sketch below). In addition, the recognition mechanism locates and pinpoints the learned activities and produces routine logs. Evaluation on depth datasets (a self-annotated dataset and MSRAction3D) demonstrated that the proposed system achieves better recognition rates and greater robustness than state-of-the-art methods. The system is readily applicable to behavior monitoring, humanoid-robot systems, and e-medical therapy systems.
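
One plausible reading of the joint-based features and the low-rank representation named in the title is sketched below: pairwise joint distances per frame, compressed by truncated SVD. The feature choice and the rank are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np

def pairwise_distance_features(skeleton):
    """skeleton: (T, J, 3) depth-camera joint positions over T frames."""
    T, J, _ = skeleton.shape
    diff = skeleton[:, :, None, :] - skeleton[:, None, :, :]  # (T, J, J, 3)
    dists = np.linalg.norm(diff, axis=-1)                     # (T, J, J)
    iu = np.triu_indices(J, k=1)
    return dists[:, iu[0], iu[1]]                             # (T, J*(J-1)/2)

def low_rank_representation(features, rank=10):
    """Truncated SVD keeps the dominant motion modes and suppresses noise."""
    U, s, _ = np.linalg.svd(features - features.mean(axis=0),
                            full_matrices=False)
    return U[:, :rank] * s[:rank]                             # (T, rank)

# Toy usage: 100 frames of a random 20-joint skeleton.
feats = pairwise_distance_features(np.random.rand(100, 20, 3))
codes = low_rank_representation(feats, rank=10)  # compact per-frame descriptor
```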

A Study on u-GIS Outdoor Augmented Reality System Development (u-GIS 야외 증강현실 시스템 개발에 관한 연구)

  • Kim, Jeong-Hwan;Kim, Shin-Hyoung;Kil, Woo-Sung
    • Journal of Korea Spatial Information System Society
    • /
    • v.11 no.1
    • /
    • pp.183-188
    • /
    • 2009
  • In this paper, we present a method for the development of a u-GIS outdoor augmented reality (AR) system. The proposed system consists of three parts: first, sensor acquisition and calibration for AR; second, camera- and sensor-based tracking for AR; and third, integration of sensor information with 3D models. Through the u-GIS AR system, we combine the spatial information of real and virtual spaces.

Augmented Reality Service Based on Object Pose Prediction Using PnP Algorithm

  • Kim, In-Seon;Jung, Tae-Won;Jung, Kye-Dong
    • International Journal of Advanced Culture Technology
    • /
    • v.9 no.4
    • /
    • pp.295-301
    • /
    • 2021
  • Digital media technology is developing steadily alongside Fourth Industrial Revolution convergence technologies and mobile devices. Combining deep learning with augmented reality can provide more convenient and lively services through the interaction of 3D virtual images with the real world. We combine deep-learning-based pose prediction with augmented reality technology: the network predicts the eight vertices of the object's bounding box in the image. Using the eight predicted 2D vertices (x, y), the corresponding eight 3D mesh vertices (x, y, z), and the intrinsic parameters of the smartphone camera, we compute the extrinsic parameters of the camera with the PnP algorithm (sketched below). From the extrinsic parameters we calculate the distance to the object and its rotation, and apply them to the AR content. Our method provides the service in a web environment, making it highly accessible to users and the system easy to maintain. Because the augmented reality service uses the consumer's smartphone camera, it can be applied to various business fields.
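
The PnP step maps naturally onto OpenCV, as in the minimal sketch below. The box size, intrinsics, and pose are placeholder values; in the service described above, the eight 2D vertices would come from the deep network's prediction rather than from cv2.projectPoints.

```python
import numpy as np
import cv2

# Eight 3D bounding-box vertices of the object mesh (object coordinates, metres).
object_points = np.array([[x, y, z] for x in (-0.1, 0.1)
                          for y in (-0.1, 0.1)
                          for z in (-0.1, 0.1)], dtype=np.float64)

# Smartphone intrinsic matrix from calibration (fx, fy, cx, cy in pixels).
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

# Stand-in for the network output: project the box at a known pose.
rvec_true = np.array([0.1, 0.2, 0.0])
tvec_true = np.array([0.05, -0.02, 0.8])
image_points, _ = cv2.projectPoints(object_points, rvec_true, tvec_true, K, None)

# Recover the camera extrinsics from the 2D-3D correspondences.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
if ok:
    distance = float(np.linalg.norm(tvec))  # camera-to-object distance
    R, _ = cv2.Rodrigues(rvec)              # 3x3 rotation for posing the AR content
```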

3D Stereoscopic Augmented Reality with a Monocular Camera (단안카메라 기반 삼차원 입체영상 증강현실)

  • Rho, Seungmin;Lee, Jinwoo;Hwang, Jae-In;Kim, Junho
    • Journal of the Korea Computer Graphics Society
    • /
    • v.22 no.3
    • /
    • pp.11-20
    • /
    • 2016
  • This paper introduces an effective method for generating 3D stereoscopic images that give immersive 3D experiences to viewers using mobile binocular HMDs. Most previous AR systems with monocular cameras share a common limitation: the same real-world image is presented to both of the viewer's eyes, without parallax. In this paper, based on the assumption that viewers focus on the marker in a marker-based AR scenario, we recover the binocular disparity of the camera image and the virtual object using the pose information of the marker. The basic idea is to generate binocular disparity for the real-world image and the virtual object, where the image is placed on a 2D plane in 3D defined by the marker's pose (see the sketch below). For non-marker areas of the image, we apply blur effects, decreasing their sharpness to reduce visual discomfort. Our user studies show that, compared to previous binocular AR systems, the proposed method provides a stronger sense of depth, a higher sense of reality, and better visual comfort.
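
A minimal sketch of the core computation, assuming a pinhole model: the marker's depth, taken from its estimated pose, fixes the disparity of the image plane, and each eye's view is shifted by half of it in opposite directions. The focal length, eye separation, and depth are illustrative values, not the paper's.

```python
import numpy as np

def disparity_px(focal_px, eye_separation_m, depth_m):
    """Binocular disparity (in pixels) of a fronto-parallel plane at depth_m."""
    return focal_px * eye_separation_m / depth_m

def stereo_views(image, disparity):
    """Left/right views from one image: shift the plane half the disparity
    each way (nearest-pixel roll, for brevity)."""
    half = int(round(disparity / 2))
    return np.roll(image, half, axis=1), np.roll(image, -half, axis=1)

# Marker estimated 0.5 m away, 800 px focal length, 63 mm eye separation.
d = disparity_px(800, 0.063, 0.5)  # ~100 px
left, right = stereo_views(np.zeros((720, 1280, 3), np.uint8), d)
```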

Real-time Human Pose Estimation using RGB-D images and Deep Learning

  • Rim, Beanbonyka;Sung, Nak-Jun;Ma, Jun;Choi, Yoo-Joo;Hong, Min
    • Journal of Internet Computing and Services
    • /
    • v.21 no.3
    • /
    • pp.113-121
    • /
    • 2020
  • Human Pose Estimation (HPE), which localizes the human body joints, has high potential for high-level applications in the field of computer vision. The main challenges of real-time HPE are occlusion, illumination change, and the diversity of pose appearance. A single RGB image can be fed into an HPE framework to reduce computation cost, since it requires only a depth-independent device such as a common camera, webcam, or phone camera. However, HPE based on a single RGB image cannot overcome the above challenges, owing to the inherent limitations of color and texture. Depth information, in contrast, localizes the human body parts in 3D coordinates and can be used to address these challenges, but depth-based HPE requires a depth sensor, which imposes space constraints and extra cost. Moreover, its results are less reliable, due to the need for pose initialization and less stable frame tracking. Therefore, this paper proposes a new HPE method that is robust to self-occlusion. Although many body parts can be occluded by other parts, this paper focuses only on head self-occlusion. The proposed method combines an RGB-image-based HPE framework with a depth-information-based HPE framework. We evaluated its performance with the COCO Object Keypoint Similarity (OKS) metric, sketched below. By combining the strengths of the RGB-based and depth-based approaches, our RGB-D HPE method achieved a mAP of 0.903 and a mAR of 0.938, outperforming both the RGB-based and depth-based baselines.
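
For reference, the OKS measure used in this evaluation can be computed as below. The COCO definition uses a per-keypoint constant k_i for each joint type; this sketch uses a single constant for brevity.

```python
import numpy as np

def oks(pred, gt, visible, area, k=0.07):
    """pred, gt: (J, 2) keypoints in pixels; visible: (J,) bool mask;
    area: object segment area in px^2 (s^2 in the COCO definition)."""
    d2 = np.sum((pred - gt) ** 2, axis=1)
    e = d2 / (2.0 * area * k ** 2)
    return float(np.mean(np.exp(-e[visible])))

# Toy usage: 17 COCO keypoints, predictions about 3 px off on average.
gt = np.random.rand(17, 2) * 200
pred = gt + np.random.randn(17, 2) * 3
print(oks(pred, gt, np.ones(17, dtype=bool), area=200 * 200))
```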

Medical Digital Twin-Based Dynamic Virtual Body Capture System (메디컬 디지털 트윈 기반 동적 가상 인체 획득 시스템)

  • Kim, Daehwan;Kim, Yongwan;Lee, Kisuk
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.10
    • /
    • pp.1398-1401
    • /
    • 2020
  • We present the concept of a Medical Digital Twin (MDT) that can predict and analyze medical diseases through computer simulation, and we introduce a dynamic virtual body capture system for creating it. The MDT is a technology that creates a 3D digital virtual human body reflecting an individual's medical and biometric information. The virtual human body is composed of a static virtual body, which reflects the individual's internal and external information, and a dynamic virtual body, which reflects the individual's motion. In particular, we describe an early version of the dynamic virtual body capture system, which enables continuous simulation of musculoskeletal diseases.

Real-time 3D Pose Estimation of Both Human Hands via RGB-Depth Camera and Deep Convolutional Neural Networks (RGB-Depth 카메라와 Deep Convolution Neural Networks 기반의 실시간 사람 양손 3D 포즈 추정)

  • Park, Na Hyeon;Ji, Yong Bin;Gi, Geon;Kim, Tae Yeon;Park, Hye Min;Kim, Tae-Seong
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2018.10a
    • /
    • pp.686-689
    • /
    • 2018
  • 3D Hand Pose Estimation (HPE) is an important technology for smart human-computer interfaces. This study presents a deep-learning-based hand pose estimation system that recognizes the 3D pose of both hands in real time from a single RGB-Depth camera. The system consists of four stages. First, both hands are detected and extracted from the RGB and depth images using skin detection and depth-cutting algorithms. Second, a Convolutional Neural Network (CNN) classifier is used to distinguish the right hand from the left; it consists of three convolution layers and two fully connected layers and takes the extracted depth image as input (a sketch follows below). Third, a trained CNN regressor, composed of multiple convolutional, pooling, and fully connected layers, estimates the hand joints from the extracted depth images of the left and right hands. The CNN classifier and regressor are trained on a dataset of 22,000 depth images. Finally, the 3D pose of each hand is reconstructed from the estimated joint information. In tests, the CNN classifier distinguished the right and left hands with 96.9% accuracy, and the CNN regressor estimated the 3D hand joint positions with an average error of 8.48 mm. The proposed hand pose estimation system can be used in a wide range of applications, including virtual reality (VR), augmented reality (AR), and mixed reality (MR).
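
A minimal sketch of the left/right-hand CNN classifier as described (three convolution layers, two fully connected layers, a cropped depth image as input). The channel widths and the 96x96 crop size are assumptions, since the abstract does not specify them.

```python
import torch
import torch.nn as nn

class HandSideClassifier(nn.Module):
    """Distinguishes right vs. left hand from a cropped depth patch."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # conv 1
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # conv 2
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # conv 3
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 12 * 12, 128), nn.ReLU(),  # fully connected 1
            nn.Linear(128, 2),                        # fully connected 2
        )

    def forward(self, depth_patch):  # (N, 1, 96, 96) cropped depth image
        return self.classifier(self.features(depth_patch))

logits = HandSideClassifier()(torch.randn(4, 1, 96, 96))  # (4, 2) scores
```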

A study on the implementation of Korea's traditional pagoda WebXR service

  • Byong-Kwon Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.1
    • /
    • pp.69-75
    • /
    • 2024
  • This study focuses on enhancing the understanding of the form and characteristics of traditional towers, or 'pagodas,' by utilizing WebXR technology to enable users to explore 3D models and experience them in virtual reality on the web. Traditional towers in Korea pose challenges for direct on-site verification due to their size, making it difficult to examine the structure and features of each level. To address these issues, this research aims to provide users with a WebXR service that allows them to remotely explore and analyze towers without geographical or temporal constraints. The research methodology involves utilizing WebAR to offer a web-based service where users can directly view the original form of the tower's 3D model using smart devices both online and on-site. However, outdoor conditions may affect performance, and to address this, a tower-outline detection and matching technique was employed. Consequently, we propose a remote support service for traditional towers, allowing users to remotely access information and features of various towers nationwide on the web. Meanwhile, on-site visits can involve experiencing augmented reality representations of towers using smart devices.

Extrinsic calibration using a multi-view camera (멀티뷰 카메라를 사용한 외부 카메라 보정)

  • Kim, Kiyoung;Kim, Sehwan;Park, Jong-Il;Woo, Woontack
    • Proceedings of the IEEK Conference
    • /
    • 2003.11a
    • /
    • pp.187-190
    • /
    • 2003
  • In this paper, we propose an extrinsic calibration method for a multi-view camera that obtains an optimal pose in 3D space. Conventional calibration algorithms do not guarantee calibration accuracy at mid-to-long range, because pixel errors increase with the distance between the camera and the calibration pattern. To compensate for these errors, we first apply Tsai's algorithm to each lens to obtain initial extrinsic parameters. We then estimate the extrinsic parameters using distance vectors obtained from the structural cues of the multi-view camera. Finally, we iteratively carry out a non-linear optimization over the relationship between the camera coordinates and the world coordinates (a sketch follows below). The optimized camera parameters can be used to generate 3D panoramic virtual environments and to support AR applications.
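
The final refinement step can be sketched as a reprojection-error minimization. SciPy's least_squares stands in for the authors' optimizer, and the pattern points and poses below are synthetic placeholders; only the overall structure, an initial estimate followed by iterative non-linear optimization, follows the abstract.

```python
import numpy as np
import cv2
from scipy.optimize import least_squares

def reprojection_residuals(params, world_pts, image_pts, K):
    """Residuals between observed pixels and world points projected
    with extrinsics packed as [rvec (3), tvec (3)]."""
    proj, _ = cv2.projectPoints(world_pts, params[:3], params[3:], K, None)
    return (proj.reshape(-1, 2) - image_pts).ravel()

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
world_pts = np.random.rand(20, 3)                  # calibration pattern points
rvec_gt, tvec_gt = np.array([0.1, 0.0, 0.0]), np.array([0.0, 0.0, 2.0])
image_pts = cv2.projectPoints(world_pts, rvec_gt, tvec_gt, K, None)[0].reshape(-1, 2)

# Perturbed initial extrinsics, standing in for the Tsai-algorithm output.
x0 = np.hstack([rvec_gt + 0.05, tvec_gt + 0.1])
result = least_squares(reprojection_residuals, x0,
                       args=(world_pts, image_pts, K))
rvec_opt, tvec_opt = result.x[:3], result.x[3:]    # refined extrinsics
```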
