• Title/Summary/Keyword: 깊이 맵 (depth map)

Depth Map Extraction from the Single Image Using Pix2Pix Model (Pix2Pix 모델을 활용한 단일 영상의 깊이맵 추출)

  • Gang, Su Myung;Lee, Joon Jae
    • Journal of Korea Multimedia Society / v.22 no.5 / pp.547-557 / 2019
  • To extract a depth map from a single image, a number of CNN-based deep learning methods have been proposed in recent research. In this study, the GAN structure of Pix2Pix is maintained; the model converges well because it combines a generator and a discriminator. However, the convolutions in this model take a long time to compute, so we change the convolutions in the generator to depthwise convolutions to improve speed while preserving the result. Specifically, the seven down-sampling convolutional hidden layers in the generator U-Net are changed to depthwise convolutions. This type of convolution decreases the number of parameters and also speeds up computation. The proposed model shows depth map predictions similar to those of the existing structure, while inference time is reduced by 64%.
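The modification described above, replacing the generator's down-sampling convolutions with depthwise convolutions, can be illustrated with a minimal PyTorch sketch. The block name, channel sizes, and the added pointwise projection below are assumptions for illustration, not the paper's exact layer configuration.

```python
import torch.nn as nn

class DepthwiseDownBlock(nn.Module):
    """One U-Net down-sampling block with a depthwise separable convolution
    in place of a standard strided convolution (illustrative only)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # Depthwise: one 4x4 stride-2 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=4, stride=2,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise 1x1 convolution mixes channels and changes channel count.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.norm = nn.BatchNorm2d(out_ch)
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        return self.act(self.norm(self.pointwise(self.depthwise(x))))

# Parameter comparison against a standard 4x4 strided convolution.
std = nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1, bias=False)
dws = DepthwiseDownBlock(128, 256)
print(sum(p.numel() for p in std.parameters()))   # 524288
print(sum(p.numel() for p in dws.parameters()))   # 35328 (depthwise + pointwise + batch norm)
```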

Multi-view Video Acquisition Workflow in Real Scene (실사 환경에서의 다시점 영상 획득 워크플로우)

  • Bongho Lee;Joonsoo Kim;Jun Young Jeong;Kuk Jin Yun;Won-Sik Cheong
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.11a / pp.154-156 / 2022
  • This paper presents a workflow for acquiring and generating camera-array-based multi-view stereoscopic video of real scenes, and reports the experiments conducted to verify it. Specifically, it describes a pipeline consisting of an action-cam-based convergent rig, acquisition synchronization, camera calibration, and depth map extraction, along with acquisition experiments on two pieces of indoor and outdoor content carried out for verification.
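Camera calibration is one step of the workflow above. As a rough illustration only, the OpenCV sketch below estimates per-camera intrinsics from checkerboard captures; the pattern size, square size, and file layout are hypothetical and unrelated to the authors' rig.

```python
import glob
import cv2
import numpy as np

# Hypothetical checkerboard: 9x6 inner corners, 25 mm squares.
PATTERN = (9, 6)
SQUARE_MM = 25.0

objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

obj_points, img_points = [], []
for path in glob.glob("cam00/checker_*.png"):       # hypothetical file layout
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)
print("Intrinsics K:\n", K)
```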

Shadow Playing Contents Development by Using Kinect for Interactive Learning (키넥트를 사용한 체감형 학습 그림자 놀이 콘텐츠 개발)

  • Son, Jong-Deok;Lee, Byung-Gook
    • Proceedings of the Korea Information Processing Society Conference / 2011.11a / pp.464-466 / 2011
  • In this paper, we apply image processing and computer vision techniques to Microsoft's Kinect to build content for effective tangible (motion-based) learning. The familiar hand shadow play is adapted so that participants' movements trigger interactions, and a mean-shift segmentation algorithm is applied to the depth map to detect the region closest to the camera. As a kind of cultural content, we expect extended versions of this tangible content to be used in many fields.
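The depth-based interaction described above (finding the region nearest to the camera) can be sketched as follows. This is a simplified stand-in, not the authors' implementation: mean-shift filtering flattens the depth map, and the nearest flattened band is then isolated; all parameter values are arbitrary.

```python
import cv2
import numpy as np

def nearest_region_mask(depth_mm: np.ndarray, band: int = 5) -> np.ndarray:
    """Binary mask of the region nearest to the camera in a Kinect-style
    depth map (HxW uint16, millimetres, 0 = no reading)."""
    valid = depth_mm > 0
    # cv2.pyrMeanShiftFiltering expects an 8-bit 3-channel image, so the
    # depth map is normalised and replicated across channels first.
    d8 = cv2.normalize(depth_mm, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    flat = cv2.pyrMeanShiftFiltering(cv2.merge([d8, d8, d8]), sp=15, sr=20)[:, :, 0]

    nearest = flat[valid].min()                   # closest flattened depth value
    mask = ((flat <= nearest + band) & valid).astype(np.uint8)

    # Keep only the largest connected blob of the near band.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    if n <= 1:
        return mask
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
    return (labels == largest).astype(np.uint8)
```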

Acquisition Workflow of Multiview Stereoscopic Video at Real and CG Environment (실사 및 CG 환경에서의 다시점 입체영상 획득 기술)

  • Jeong, Jun Young;Yun, Kug Jin;Cheong, Won-Sik
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.51-53 / 2022
  • Research continues on immersive media that supports six degrees of freedom (6DoF), which allows positional changes and motion parallax, beyond the three degrees of freedom (DoF) that only allow rotation around a fixed position. In particular, to provide smooth view changes, virtual view synthesis is widely used: images at positions that were never actually captured are generated from multi-view video composed of several texture (or color) and depth map images sampled at specific positions (MVD: Multiview Video plus Depth). This paper describes how to acquire multi-view video, a representative data format for immersive media, in real-scene and computer graphics (CG) environments.
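The virtual view synthesis mentioned above warps a reference texture into a new camera using its depth map (DIBR). The NumPy sketch below is a minimal forward-warping example with a z-buffer and no hole filling or view blending; the pinhole camera model and metric depth are assumptions.

```python
import numpy as np

def warp_to_virtual_view(color, depth, K_ref, K_virt, R, t):
    """Forward-warp one reference view (color HxWx3, depth HxW in metres)
    into a virtual camera. K_ref/K_virt are 3x3 intrinsics; R, t map
    reference-camera coordinates to virtual-camera coordinates."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1)

    # Back-project reference pixels to 3D, then project into the virtual view.
    pts_ref = np.linalg.inv(K_ref) @ pix * depth.reshape(1, -1)
    pts_virt = R @ pts_ref + t.reshape(3, 1)
    proj = K_virt @ pts_virt
    z = proj[2]
    x = np.round(proj[0] / z).astype(int)
    y = np.round(proj[1] / z).astype(int)

    out = np.zeros_like(color)
    zbuf = np.full((h, w), np.inf)
    ok = (z > 0) & (x >= 0) & (x < w) & (y >= 0) & (y < h)
    src = color.reshape(-1, 3)
    for i in np.flatnonzero(ok):              # simple scatter with z-test
        if z[i] < zbuf[y[i], x[i]]:
            zbuf[y[i], x[i]] = z[i]
            out[y[i], x[i]] = src[i]
    return out
```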

Effect of Creative Thinking through Art Collaboration Class (아트 콜라보레이션 수업을 통한 창의적 사고의 효과)

  • An, Ji-Su;Huh, Yoon-Jung
    • Journal of the Korea Convergence Society / v.10 no.7 / pp.121-131 / 2019
  • Art collaboration advertising uses creative works that are not directly related to a product and recreates the product as something more valuable, with creative thinking acting as the core value. The purpose of this study is to investigate the effect of creative thinking using mind maps and the SCAMPER technique in an art collaboration class. After running an art collaboration advertising class for six middle school students, we analyzed the relationship between creative techniques and creativity through the students' activities and work. The results were as follows. First, the creative thinking of students who experienced art collaboration showed flexibility and originality in SCAMPER, and fluency in mind mapping. Second, throughout the course we were able to observe elaboration, in which rough ideas were fleshed out and developed in depth. This study will contribute to research on improving students' creative convergence by bringing two or more areas together and having their core competencies collaborate.

Stereoscopic Free-viewpoint Tour-Into-Picture Generation from a Single Image (단안 영상의 입체 자유시점 Tour-Into-Picture)

  • Kim, Je-Dong;Lee, Kwang-Hoon;Kim, Man-Bae
    • Journal of Broadcast Engineering / v.15 no.2 / pp.163-172 / 2010
  • Free-viewpoint video delivers active content in which users can see images rendered from viewpoints they choose. Its applications are found in broad areas, especially museum tours, entertainment, and so forth. As a new free-viewpoint application, this paper presents a stereoscopic free-viewpoint TIP (Tour Into Picture) in which users can navigate the inside of a single image by controlling a virtual camera and utilizing depth data. Unlike conventional TIP methods that provide 2D images or video, the proposed method provides users with 3D stereoscopic, free-viewpoint content. Navigating a picture with stereoscopic viewing delivers a more realistic and immersive perception. The method uses semi-automatic processing to make a foreground mask, a background image, and a depth map. The second step is to navigate the single picture and obtain rendered images by perspective projection. For free-viewpoint viewing, a virtual camera supporting translation, rotation, look-around, and zooming is operated. In experiments, the proposed method was tested with 'Danopungjun', one of the famous paintings of the Chosun Dynasty. The free-viewpoint software is developed based on MFC Visual C++ and the OpenGL libraries.
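The paper's renderer is built on MFC Visual C++ and OpenGL; the NumPy sketch below only illustrates the view and projection matrices behind such a navigable virtual camera, where translation moves the eye, look-around moves the target, and zooming narrows the field of view. All numeric values are arbitrary.

```python
import numpy as np

def look_at(eye, target, up=(0.0, 1.0, 0.0)):
    """Right-handed view matrix for a virtual camera (OpenGL convention)."""
    eye, target, up = map(np.asarray, (eye, target, up))
    f = target - eye; f = f / np.linalg.norm(f)           # forward
    s = np.cross(f, up); s = s / np.linalg.norm(s)        # right
    u = np.cross(s, f)                                    # true up
    view = np.eye(4)
    view[0, :3], view[1, :3], view[2, :3] = s, u, -f
    view[:3, 3] = -view[:3, :3] @ eye
    return view

def perspective(fov_y_deg, aspect, near, far):
    """Perspective projection; narrowing fov_y_deg acts as a zoom-in."""
    t = 1.0 / np.tan(np.radians(fov_y_deg) / 2.0)
    proj = np.zeros((4, 4))
    proj[0, 0] = t / aspect
    proj[1, 1] = t
    proj[2, 2] = (far + near) / (near - far)
    proj[2, 3] = 2 * far * near / (near - far)
    proj[3, 2] = -1.0
    return proj

# Hypothetical navigation step: translate the eye sideways and zoom in slightly.
view = look_at(eye=(0.2, 0.0, 2.5), target=(0.0, 0.0, 0.0))
proj = perspective(fov_y_deg=40.0, aspect=4 / 3, near=0.1, far=100.0)
clip = proj @ view @ np.array([0.0, 0.0, -1.0, 1.0])      # project one scene point
print(clip[:3] / clip[3])                                 # normalised device coords
```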

A Fast Algorithm of the Belief Propagation Stereo Method (신뢰전파 스테레오 기법의 고속 알고리즘)

  • Choi, Young-Seok;Kang, Hyun-Soo
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.5 / pp.1-8 / 2008
  • The belief propagation method, which has been studied recently, yields good performance in disparity extraction. The method, in which a target function is modeled as an energy function based on a Markov random field (MRF), solves the stereo matching problem by finding the disparity that minimizes the energy function. MRF models provide a robust and unified framework for vision problems such as stereo and image restoration. The belief propagation method produces quite accurate results, but it is difficult to implement in real time because of its higher computational complexity compared with other stereo methods. To relieve this problem, this paper proposes a fast algorithm for the belief propagation method. The energy function consists of a data term and a smoothness term. The data term usually corresponds to the difference in brightness between correspondences, and the smoothness term indicates the continuity of adjacent pixels. Smoothness information is created from messages, which are conventionally assigned to four different message arrays for the pixel positions adjacent in four directions, and processing these four message arrays accounts for 80 percent of the whole program execution time. The proposed algorithm dramatically reduces the processing time required for message calculation, since the messages are produced not in four arrays but in a single array. In the last step of the disparity extraction process, the messages are read from the single integrated array, and this requires 1/4 of the computational complexity of the conventional method. Our method is evaluated by comparing its disparity error rates with those of the conventional method. Experimental results show that the proposed method remarkably reduces the execution time while barely increasing the disparity error.
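A generic min-sum belief propagation pass can be sketched as follows to show the single-array message storage idea. This is not the authors' optimized algorithm; it simply keeps all four directional messages in one array (`msg`) instead of four separate ones, and the truncated-linear smoothness model and parameters are illustrative.

```python
import numpy as np

def bp_stereo(data_cost, iters=5, lam=1.0, tau=2.0):
    """Loopy min-sum belief propagation on a 4-connected grid.
    data_cost: (H, W, D) matching cost per pixel and disparity hypothesis."""
    H, W, D = data_cost.shape
    # msg[k, y, x, :] is the message sent INTO pixel (y, x) from its
    # neighbour in direction k: 0=from above, 1=from below, 2=from left, 3=from right.
    msg = np.zeros((4, H, W, D), dtype=np.float32)
    d = np.arange(D)
    V = lam * np.minimum(np.abs(d[:, None] - d[None, :]), tau)   # smoothness cost

    def send(h):                       # h: belief at the sender minus one incoming message
        # m(d) = min_d' [ h(d') + V(d', d) ], done as a full O(D^2) minimisation.
        m = (h[..., :, None] + V[None, None]).min(axis=2)
        return m - m.min(axis=-1, keepdims=True)                 # normalise for stability

    for _ in range(iters):
        belief = data_cost + msg.sum(axis=0)
        new = np.zeros_like(msg)
        # Message into (y, x) from above is sent by (y-1, x), excluding what
        # (y-1, x) previously received from below (msg[1]); other directions analogous.
        new[0, 1:, :]  = send(belief - msg[1])[:-1, :]
        new[1, :-1, :] = send(belief - msg[0])[1:, :]
        new[2, :, 1:]  = send(belief - msg[3])[:, :-1]
        new[3, :, :-1] = send(belief - msg[2])[:, 1:]
        msg = new

    return (data_cost + msg.sum(axis=0)).argmin(axis=-1)         # disparity map
```

A data cost can be built, for example, from per-disparity absolute intensity differences between the rectified left and right images.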

Fast View Synthesis Using GPGPU (GPGPU를 이용한 고속 영상 합성 기법)

  • Shin, Hong-Chang;Park, Han-Hoon;Park, Jong-Il
    • Journal of Broadcast Engineering / v.13 no.6 / pp.859-874 / 2008
  • In this paper, we develop a fast view synthesis method that generates multiple intermediate views in real time for a 3D display system when the camera geometry and depth maps of the reference views are given in advance. The proposed method achieves faster view synthesis than previous GPU approaches by processing in parallel all of the computations required for view synthesis. Specifically, we use $CUDA^{TM}$ (by NVIDIA) to control the GPU device. To increase processing speed, we adapted all of the view synthesis steps to the single instruction multiple data (SIMD) structure that is a main feature of CUDA, maximized the use of the high-speed memories on the GPU device, and optimized the implementation. As a result, we could synthesize 9 intermediate view images of 720 by 480 pixels within 0.128 seconds.
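Because every output pixel is computed independently, view synthesis maps naturally onto CUDA's SIMD model. The NumPy sketch below shows only the blending stage for one intermediate view from two already-warped references; it illustrates the per-pixel structure and is not the authors' CUDA code.

```python
import numpy as np

def blend_intermediate(view_left, view_right, alpha):
    """Blend two already-warped reference views into one intermediate view.
    alpha in [0, 1] is the normalised position between the left (0) and
    right (1) cameras; holes (all-zero pixels) in one view are filled from the other."""
    left_valid = view_left.any(axis=-1, keepdims=True)
    right_valid = view_right.any(axis=-1, keepdims=True)
    both = left_valid & right_valid
    out = np.where(both,
                   (1 - alpha) * view_left + alpha * view_right,
                   np.where(left_valid, view_left, view_right))
    return out.astype(view_left.dtype)

# Hypothetical use: nine intermediate positions between two reference cameras
# (the warped inputs warp_L and warp_R are assumed to exist already).
# intermediates = [blend_intermediate(warp_L, warp_R, a)
#                  for a in np.linspace(0.1, 0.9, 9)]
```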

3D Model Reconstruction Algorithm Using a Focus Measure Based on Higher Order Statistics (고차 통계 초점 척도를 이용한 3D 모델 복원 알고리즘)

  • Lee, Joo-Hyun;Yoon, Hyeon-Ju;Han, Kyu-Phil
    • Journal of Korea Multimedia Society / v.16 no.1 / pp.11-18 / 2013
  • This paper presents an SFF (shape from focus) algorithm that uses a new focus measure based on higher-order statistics for exact depth estimation. Since conventional SFF-based 3D depth reconstruction algorithms use SML (sum of modified Laplacian) as the focus measure, their performance strongly depends on the image characteristics: they are efficient only for richly textured, well-focused images. Therefore, this paper adopts a new focus measure using HOS (higher-order statistics) in order to extract focus values from images with relatively poor texture and focus. An initial best-focus area map is generated from this measure. Thereafter, area refinement, thinning, and corner detection are successively applied to extract the locally best-focused points. Finally, a 3D model is reconstructed from the carefully selected points by Delaunay triangulation.
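The focal-stack search described above can be sketched as follows. The paper's exact HOS measure is not reproduced; the code uses a local fourth-order central moment as an illustrative HOS-style measure, and the refinement, thinning, corner detection, and Delaunay steps are omitted.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def hos_focus_measure(img, win=7):
    """Illustrative HOS-style focus measure: the local fourth-order central
    moment of grey levels in a win x win window (not the paper's exact measure)."""
    img = img.astype(np.float64)
    mean = uniform_filter(img, win)
    # E[(x - mu)^4] = E[x^4] - 4*mu*E[x^3] + 6*mu^2*E[x^2] - 3*mu^4
    return (uniform_filter(img ** 4, win)
            - 4 * mean * uniform_filter(img ** 3, win)
            + 6 * mean ** 2 * uniform_filter(img ** 2, win)
            - 3 * mean ** 4)

def depth_from_focus(stack, win=7):
    """stack: sequence of S greyscale images focused at S known lens steps.
    Returns the index of the best-focused slice for every pixel."""
    measures = np.stack([hos_focus_measure(f, win) for f in stack], axis=0)
    return measures.argmax(axis=0)      # initial best-focus (depth index) map
```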

Screen Content Coding Analysis to Improve Coding Efficiency for Immersive Video (몰입형 비디오 압축을 위한 스크린 콘텐츠 코딩 성능 분석)

  • Lee, Soonbin;Jeong, Jong-Beom;Kim, Inae;Lee, Sangsoon;Ryu, Eun-Seok
    • Journal of Broadcast Engineering / v.25 no.6 / pp.911-921 / 2020
  • Recently, MPEG-I (Immersive) has been exploring compression performance through standardization projects for immersive video. The MPEG Immersive Video (MIV) standard is intended to provide limited 6DoF based on depth image-based rendering (DIBR). MIV encodes basic views and packs the residual information of additional views into atlases as collections of patches. Atlases have distinct characteristics depending on the kind of views they contain, which must be taken into account for compression efficiency. In this paper, a performance comparison of screen content coding tools such as intra block copy (IBC) is conducted, based on the repeating patterns found across the various views and patches. It is demonstrated that the proposed method improves coding performance by around a 15.74% BD-rate reduction in MIV.
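The BD-rate figure cited above is the Bjontegaard delta rate between two rate-distortion curves. The helper below computes it with the standard third-order fit in the log-rate domain; the sample RD points are hypothetical and not taken from the paper.

```python
import numpy as np

def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test):
    """Bjontegaard delta rate (%): average bit-rate difference of the test
    RD curve against the reference over their overlapping PSNR range.
    Negative values mean the test configuration needs less bit rate."""
    lr_ref, lr_test = np.log(rates_ref), np.log(rates_test)
    p_ref = np.polyfit(psnr_ref, lr_ref, 3)       # cubic fit of log-rate vs PSNR
    p_test = np.polyfit(psnr_test, lr_test, 3)

    lo = max(min(psnr_ref), min(psnr_test))       # overlapping PSNR interval
    hi = min(max(psnr_ref), max(psnr_test))
    P_ref, P_test = np.polyint(p_ref), np.polyint(p_test)

    avg_ref = (np.polyval(P_ref, hi) - np.polyval(P_ref, lo)) / (hi - lo)
    avg_test = (np.polyval(P_test, hi) - np.polyval(P_test, lo)) / (hi - lo)
    return (np.exp(avg_test - avg_ref) - 1.0) * 100.0

# Hypothetical 4-point RD curves (kbps, PSNR in dB):
delta = bd_rate([1200, 2400, 4800, 9600], [34.1, 36.8, 39.2, 41.5],
                [1000, 2000, 4100, 8300], [34.0, 36.9, 39.3, 41.6])
print(f"BD-rate: {delta:.2f}%")    # negative => bit-rate saving, as reported in the paper
```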