• Title/Summary/Keyword: 3D방송 (3D broadcasting)

Search Result 1,332, Processing Time 0.031 seconds

Object Detection Based on Hellinger Distance IoU and Objectron Application (Hellinger 거리 IoU와 Objectron 적용을 기반으로 하는 객체 감지)

  • Kim, Yong-Gil;Moon, Kyung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.2 / pp.63-70 / 2022
  • Although 2D object detection has improved greatly in recent years with advances in deep learning methods and the use of large labeled image datasets, 3D object detection from 2D imagery remains a challenging problem in applications such as robotics, owing to the lack of data and the diversity of object appearances and shapes within a category. Google recently announced Objectron, which has a novel data pipeline using mobile augmented reality session data. However, it is also a 2D-driven 3D object detection technique. This study explores a more mature 2D object detection method and applies its 2D projection to the Objectron 3D lifting system. Most object detection methods use bounding boxes to encode and represent object shape and location. In this work, we explore a stochastic representation of object regions using Gaussian distributions. We also present a similarity measure for the Gaussian distributions based on the Hellinger distance, which can be viewed as a stochastic Intersection-over-Union. Our experimental results show that the proposed Gaussian representations are closer to the annotated segmentation masks in available datasets. Thus, the low accuracy that is one of several limitations of Objectron can be mitigated.
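
The Hellinger-based similarity described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the closed-form Hellinger distance between two Gaussians is standard, but the box-to-Gaussian mapping (mean at the box centre, covariance of a uniform distribution over the box) and all function names are assumptions made here.

```python
import numpy as np

def box_to_gaussian(x1, y1, x2, y2):
    """Represent an axis-aligned box as a 2D Gaussian (assumed mapping).

    Assumption: the mean is the box centre and the covariance is that of a
    uniform distribution over the box (side**2 / 12 on each axis).
    """
    mean = np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0])
    cov = np.diag([(x2 - x1) ** 2 / 12.0, (y2 - y1) ** 2 / 12.0])
    return mean, cov

def hellinger_distance(mean_p, cov_p, mean_q, cov_q):
    """Hellinger distance between two multivariate Gaussians, in [0, 1]."""
    cov_avg = (cov_p + cov_q) / 2.0
    diff = mean_p - mean_q
    # Bhattacharyya coefficient of the two Gaussians (closed form).
    scale = (np.linalg.det(cov_p) ** 0.25 * np.linalg.det(cov_q) ** 0.25
             / np.sqrt(np.linalg.det(cov_avg)))
    exponent = -0.125 * diff @ np.linalg.solve(cov_avg, diff)
    bc = scale * np.exp(exponent)
    # Clamp to guard against tiny negative values from rounding.
    return float(np.sqrt(max(0.0, 1.0 - bc)))

def hellinger_similarity(box_a, box_b):
    """IoU-like score: 1 for identical boxes, near 0 for distant boxes."""
    return 1.0 - hellinger_distance(*box_to_gaussian(*box_a),
                                    *box_to_gaussian(*box_b))
```

Unlike a box IoU, this score stays non-zero and smooth when two boxes do not overlap, which is one reason a distribution-based measure may be attractive as a training signal.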

Development of Photogrammetric Rectification Method Applying Bayesian Approach for High Quality 3D Contents Production (고품질의 3D 콘텐츠 제작을 위한 베이지안 접근방식의 사진측량기반 편위수정기법 개발)

  • Kim, Jae-In;Kim, Taejung
    • Journal of Broadcast Engineering / v.18 no.1 / pp.31-42 / 2013
  • This paper proposes a photogrammetric rectification method based on a Bayesian approach that eliminates vertical parallax between stereo images to minimize the visual fatigue caused by 3D content. Image rectification consists of two phases: geometry estimation and epipolar transformation. For geometry estimation, a coplanarity-based relative orientation algorithm is used. To ensure robustness against the mismatches and localization errors that arise from automated tie point extraction, a Bayesian approach is applied by introducing several prior constraints. For epipolar transformation, a perspective transformation based on the collinearity condition is used to minimize distortion of the resulting images and modification of the input images. Other algorithms were compared to evaluate performance: for geometry estimation, the traditional relative orientation algorithm, the 8-point algorithm, and a stereo calibration algorithm; for epipolar transformation, the Hartley algorithm and the Bouguet algorithm. The evaluation results showed that the proposed algorithm produced results with high accuracy, robustness to error sources, and minimal image modification.
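
The quantity that rectification aims to remove, vertical parallax between matched tie points, can be measured with a short sketch like the one below. The function names and the choice of mean absolute row difference as the summary statistic are assumptions made for illustration; the abstract does not state which error measure the authors used.

```python
import numpy as np

def vertical_parallax(left_pts, right_pts):
    """Per-point vertical parallax (in pixels) for matched tie points.

    left_pts, right_pts: (N, 2) sequences of (x, y) pixel coordinates, where
    row i of each array is the same scene point seen in each image.
    """
    left_pts = np.asarray(left_pts, dtype=float)
    right_pts = np.asarray(right_pts, dtype=float)
    return left_pts[:, 1] - right_pts[:, 1]

def mean_abs_vertical_parallax(left_pts, right_pts):
    """Summary score: 0 means every tie point lies on the same image row."""
    return float(np.mean(np.abs(vertical_parallax(left_pts, right_pts))))
```

For a well-rectified pair this score should be close to zero, while the horizontal offsets (the disparities) are left untouched.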

Using Skeleton Vector Information and RNN Learning Behavior Recognition Algorithm (스켈레톤 벡터 정보와 RNN 학습을 이용한 행동인식 알고리즘)

  • Kim, Mi-Kyung;Cha, Eui-Young
    • Journal of Broadcast Engineering / v.23 no.5 / pp.598-605 / 2018
  • Behavior recognition is a technology that recognizes human behavior from data and can be used in applications such as detecting risky behavior through video surveillance systems. Conventional behavior recognition algorithms have relied on 2D camera images, multi-mode sensors, multi-view setups, or 3D equipment. With two-dimensional data, the recognition rate for behavior in three-dimensional space was low, while the other methods were difficult to deploy because of complicated equipment configurations and expensive additional equipment. In this paper, we propose a method of recognizing human behavior using only CCTV images and only RGB and depth information, without additional equipment. First, a skeleton extraction algorithm is applied to extract the points of joints and body parts. We then apply equations that transform these points into vectors, including displacement vectors and relational vectors, and learn the sequential vector data with an RNN model. Applying the trained model to various datasets and checking the behavior recognition accuracy confirmed that performance similar to that of existing algorithms using 3D information can be achieved with 2D information alone.
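
One plausible reading of the displacement and relational vectors mentioned in this abstract is sketched below. The definitions are assumptions, since the abstract does not give the equations: here a displacement vector is a joint's movement between consecutive frames, and a relational vector is the offset between two joints in the same frame. The function name and the `pairs` argument are hypothetical.

```python
import numpy as np

def skeleton_features(joints, pairs):
    """Build a per-frame feature sequence from 2D skeleton joints.

    joints: (T, J, 2) array of joint positions over T frames.
    pairs:  list of (a, b) joint-index pairs (a hypothetical choice).
    Returns an array of shape (T - 1, 2 * J + 2 * len(pairs)).
    """
    joints = np.asarray(joints, dtype=float)
    # Displacement: how far each joint moved since the previous frame.
    displacement = joints[1:] - joints[:-1]        # (T-1, J, 2)
    # Relational: offset from joint a to joint b within the same frame.
    a, b = np.array(pairs).T
    relational = joints[1:, b] - joints[1:, a]     # (T-1, P, 2)
    steps = displacement.shape[0]
    return np.concatenate(
        [displacement.reshape(steps, -1), relational.reshape(steps, -1)],
        axis=1,
    )
```

Each row of the result is one time step, so the array can be passed to a recurrent model as a sequence; the model itself is outside the scope of this sketch.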

Design and Implementation of a Realistic Multi-View Scalable Video Coding Scheme (실감형 다시점 스케일러블 비디오 코딩 방법의 설계 및 구현)

  • Park, Min-Woo;Park, Gwang-Hoon
    • Journal of Broadcast Engineering / v.14 no.6 / pp.703-720 / 2009
  • This paper proposes a realistic multi-view scalable video coding scheme designed to meet users' interest in 3D content services and for use in future computing environments. Future video coding schemes should support realistic services that let users feel 3D presence through stereoscopic or multi-view video, and should also accomplish so-called one-source multi-use services in order to comprehensively support diverse transmission environments and terminals. Unlike most video coding methods, which support only two-dimensional display, the proposed coding scheme can support such realistic services. This paper designs and implements the proposed coding scheme by integrating the Multi-view Video Coding scheme and the Scalable Video Coding scheme, and then demonstrates through simulation the feasibility of realizing 3D services. The simulation results show that the proposed structure remarkably improves random access performance with almost the same coding efficiency.

Study on Virtual Reality (VR) Operating System Prototype (가상환경(VR) 운영체제 프로토타입 연구)

  • Kim, Eunsol;Kim, Jiyeon;Yoo, Eunjin;Park, Taejung
    • Journal of Broadcast Engineering / v.22 no.1 / pp.87-94 / 2017
  • This paper presents a prototype of a virtual reality operating system (VR OS) concept that uses a head-mounted display (HMD) and hand gesture recognition technology, built on a game engine (Unity3D). We designed and implemented a simple multitasking thread mechanism on top of the real-time environment provided by the Unity3D game engine. Our virtual reality operating system receives user input from a hand gesture recognition device (Leap Motion) to simulate the mouse and keyboard, and provides output via a head-mounted display (Oculus Rift DK2). As a result, our system provides users with a broader and more immersive work environment by implementing a 360-degree workspace.

An Analysis of Visual Fatigue Caused From Distortions in 3D Video Production (3D 영상의 제작 왜곡이 시청 피로도에 미치는 영향 분석)

  • Jang, Hyung-Jun;Kim, Yong-Goo
    • Journal of Broadcast Engineering / v.17 no.1 / pp.1-16 / 2012
  • In order to improve the workflow of 3D video production, this paper analyzes the visual fatigue caused by distortions introduced at the 3D video production stage through a set of subjective visual assessment tests. To establish a set of objective indicators for subjective visual tests, the various distortions at the production stage are investigated and categorized into 7 representative visual-fatigue-producing factors. To conduct visual assessment tests for each selected category, 4 test video clips are produced by combining different extents of camera movement and object movement in the scene. Each test video is distorted to reflect each of the 7 selected visual-fatigue-producing factors, with 7 levels of distortion for each factor, resulting in 196 five-second video clips for testing. Based on these test materials and Recommendation ITU-R BT.1438, subjective visual assessment tests are conducted with 101 participants. The test results provide the relative importance and tolerance limit of each visual-fatigue-producing factor, each of which corresponds to various distortions in the 3D video production field.
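
The size of the test set follows from the design described in this abstract: 4 source clips, 7 distortion factors, and 7 distortion levels per factor. A small sketch that enumerates the combinations is shown below; the integer labels for clips, factors, and levels are placeholders, not names from the paper.

```python
from itertools import product

clips = range(1, 5)     # 4 source clips
factors = range(1, 8)   # 7 visual-fatigue-producing factors
levels = range(1, 8)    # 7 distortion levels per factor

# One test item per (clip, factor, level) combination.
test_set = list(product(clips, factors, levels))
clip_seconds = 5
total_seconds = len(test_set) * clip_seconds

print(len(test_set))    # 196
print(total_seconds)    # 980
```

At 5 seconds per clip, the full set amounts to 980 seconds of test material per viewer, before any rest or scoring intervals.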

Implementation of User Gesture Recognition System for manipulating a Floating Hologram Character (플로팅 홀로그램 캐릭터 조작을 위한 사용자 제스처 인식 시스템 구현)

  • Jang, Myeong-Soo;Lee, Woo-Beom
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.2 / pp.143-149 / 2019
  • Floating holograms are a technology that provides rich 3D stereoscopic images in wide spaces such as advertising venues and concerts. In addition, they can reduce the inconvenience of 3D glasses, eye strain, and spatial distortion, and allow viewers to enjoy 3D images with excellent realism and presence. Therefore, this paper implements a user gesture recognition system for manipulating a floating hologram character that can be used in small-space devices. The proposed method detects the face region using a Haar feature-based cascade classifier and recognizes user gestures in real time using the position at which each gesture occurs, obtained from the gesture difference image. Each classified gesture is then mapped to a character motion in the floating hologram to manipulate the character's actions. To evaluate the performance of the proposed system, we built a floating hologram display device and repeatedly measured the recognition rate of each gesture, including body shaking, walking, hand shaking, and jumping. As a result, the average recognition rate was 88%.
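
Locating a gesture from a difference image can be sketched as below. This is only an illustration of frame differencing under stated assumptions: grayscale `uint8` frames, a fixed threshold of 30, and the centroid of changed pixels as the gesture position are all choices made here, not details taken from the paper. Face detection and gesture classification are not covered.

```python
import numpy as np

def motion_centroid(prev_frame, curr_frame, threshold=30):
    """Return the (x, y) centroid of pixels that changed between two frames.

    Frames are 2D grayscale arrays of the same shape. Returns None when no
    pixel differs by more than `threshold`.
    """
    # Widen the dtype first so uint8 subtraction cannot wrap around.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    mask = diff > threshold
    if not mask.any():
        return None
    ys, xs = np.nonzero(mask)
    return float(xs.mean()), float(ys.mean())
```

Comparing this centroid with the detected face position (for example, above, beside, or below the face) is one simple way a position-based rule could separate gestures such as hand shaking and jumping.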

360 RGBD Image Synthesis from a Sparse Set of Images with Narrow Field-of-View (소수의 협소화각 RGBD 영상으로부터 360 RGBD 영상 합성)

  • Kim, Soojie;Park, In Kyu
    • Journal of Broadcast Engineering / v.27 no.4 / pp.487-498 / 2022
  • A depth map is an image that encodes distance information in 3D space on a 2D plane and is used in various 3D vision tasks. Most existing depth estimation studies use narrow-FoV images, in which a significant portion of the entire scene is lost. In this paper, we propose a technique for generating 360° omnidirectional RGBD images from a sparse set of narrow-FoV images. The proposed generative adversarial network-based image generation model estimates the relative FoV with respect to the entire panoramic image from a small number of non-overlapping images and produces a 360° RGB image and depth image simultaneously. In addition, it shows improved performance by configuring the network to reflect the spherical characteristics of 360° images.

Stereo image compression based on error concealment for 3D television (3차원 텔레비전을 위한 에러 은닉 기반 스테레오 영상 압축)

  • Bak, Sungchul;Sim, Donggyu;Namkung, Jae-Chan;Oh, Seoung-jun
    • Journal of Broadcast Engineering / v.10 no.3 / pp.286-296 / 2005
  • This paper presents a stereo-based image compression and transmission system for realistic 3D television. In the proposed system, a disparity map is extracted from an input stereo image pair, and the extracted disparity map and one of the two input images are transmitted or stored at a local or remote site. However, correspondences cannot be determined in occluded areas, so it is not easy to recover 3D information in such regions. In this paper, a reconstructed-image compensation algorithm based on error block concealment and in-loop filtering is proposed to minimize the reconstruction error when generating the stereo image pair. The effectiveness of the proposed algorithm is shown in terms of the objective accuracy of the reconstructed images for several real stereo image pairs.
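
Why occlusions leave gaps when one view is rebuilt from the other view plus a disparity map can be seen in a small sketch. The code below is an assumption-laden illustration, not the paper's algorithm: it uses integer disparities defined on the left image, the convention that a left pixel at column x appears at column x - d in the right image, and a plain forward warp. The concealment and in-loop filtering steps are not implemented.

```python
import numpy as np

def reconstruct_right(left, disparity):
    """Forward-warp a grayscale left image into the right view.

    left:      (H, W) image.
    disparity: (H, W) integer disparities for the left image.
    Returns (right, holes), where holes marks pixels that received no value.
    """
    h, w = left.shape
    right = np.zeros_like(left)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            xr = x - int(disparity[y, x])
            if 0 <= xr < w:
                # If two pixels land on the same target, the later one has
                # the larger disparity (it is nearer), so it overwrites.
                right[y, xr] = left[y, x]
                filled[y, xr] = True
    return right, ~filled
```

The returned `holes` mask marks the regions where no correspondence exists, which is where a compensation step such as the error concealment described in the abstract would have to supply the missing pixels.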

Screen Disparity and Size Perception Function of Various 3D Stimuli (양안시차에 따른 다양한 3D 자극의 크기지각 예측함수 개발)

  • Park, JongJin;Li, Hyung-Chul O.;Kim, ShinWoo
    • Journal of Broadcast Engineering / v.18 no.1 / pp.66-76 / 2013
  • Although there have been many advances in the development of 3D displays for various purposes, 3D content is not yet used on those displays as widely as expected. One well-known obstacle to the enjoyment of 3D content is visual fatigue, but another major issue is image distortion in 3D content. In previous research, Shin, Li, & Kim (2012) reported a systematic linear relationship between screen disparity and the perceived size of a simple object whose retinal size was constant across different disparities. In this research, we sought to generalize the previous finding by using various 3D stimuli to test the relationship between screen disparity and size perception. Consistent with previous findings, our data indicated that perceived size changes linearly as a function of screen disparity, and the linearity was observed for all stimulus types used in this research. We describe the empirical relationship between screen disparity and size perception in the form of a prediction function for size perception in which visual angle is the predictor. This function will be very useful in creating 3D content, since one can make reasonable predictions of the to-be-perceived size of an object being filmed from the screen disparity of the camera setting.