• Title/Summary/Keyword: 3차원 방송 (3D broadcasting)

Search Results: 582

Band Selection Algorithm based on Expected Value for Pixel Classification (픽셀 분류를 위한 기댓값 기반 밴드 선택 알고리즘)

  • Chang, Duhyeuk; Jung, Byeonghyeon; Heo, Junyoung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.6 / pp.107-112 / 2022
  • In an embedded system such as a drone, it is difficult to store, transfer, and analyze an entire hyper-spectral image on a server in real time because doing so takes a lot of power and time. Therefore, the hyper-spectral image data is transmitted to the server after dimension-reduction or compression pre-processing. Feature selection methods are used to send only the bands relevant to the analysis purpose, but these algorithms usually take a long time, depending on image size, even though their efficiency is high. In this paper, by improving the temporal disadvantage of the band selection algorithm, processing time was reduced from 24 hours to around 60-180 seconds for a 40000×682-resolution image of 8 GB, and RAM usage was significantly reduced from 7.6 GB to 2.3 GB by using 45 of the 150 bands. Nevertheless, in terms of pixel classification performance, more than 98% of the analysis results matched those of the previous method.
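The abstract does not spell out the exact expected-value criterion, so the Python sketch below illustrates the general idea under stated assumptions: per-band statistics are estimated from a random pixel subsample rather than the full cube, and the top-k bands by a simple variance-to-mean score are kept. The scoring rule and sample size are placeholders, not the paper's.

```python
# Hypothetical sketch of expected-value-based band selection; the paper's
# exact scoring rule is not given in the abstract, so a variance-to-mean
# criterion on a pixel subsample stands in for it.
import numpy as np

def select_bands(cube: np.ndarray, k: int = 45, sample: int = 10000,
                 seed: int = 0) -> np.ndarray:
    """cube: (rows, cols, bands) hyperspectral image; returns k band indices."""
    rng = np.random.default_rng(seed)
    pixels = cube.reshape(-1, cube.shape[-1]).astype(np.float64)
    # Estimate per-band statistics from a random subsample instead of the
    # full image -- this is what keeps time and memory low on embedded HW.
    idx = rng.choice(pixels.shape[0], size=min(sample, len(pixels)), replace=False)
    sub = pixels[idx]
    mean = sub.mean(axis=0)              # expected value per band
    var = sub.var(axis=0)
    score = var / (mean + 1e-9)          # favor informative, non-flat bands
    return np.argsort(score)[::-1][:k]   # indices of the top-k bands

# e.g. keep 45 of 150 bands before transmitting the cube to the server:
# bands = select_bands(hyperspectral_cube, k=45)
# reduced = hyperspectral_cube[..., bands]
```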

Lossless Coding of Audio Spectral Coefficients Using Selective Bit-Plane Coding (선택적 비트 플레인 부호화를 이용한 오디오 주파수 계수의 무손실 부호화 기술)

  • Yoo, Seung-Kwan; Park, Ho-Chong; Oh, Seoung-Jun; Ahn, Chang-Beom; Sim, Dong-Gyu; Beak, Seung-Kwon; Kang, Kyoung-Ok
    • The Journal of the Acoustical Society of Korea / v.27 no.1 / pp.18-25 / 2008
  • In this paper, a new lossless coding method for the spectral coefficients of an audio codec is proposed. A conventional lossless coder uses Huffman coding that exploits the statistical characteristics of the spectral coefficients, but its simple structure limits its coding efficiency. To overcome this limitation, a new lossless coding scheme with better performance is proposed, consisting of a bit-plane transform and run-length coding. In the proposed scheme, the spectral coefficients are first transformed, bit plane by bit plane, into a 1-D bit-stream with better correlation properties, which is then run-length coded and finally Huffman coded. In addition, coding performance is further increased by dividing the entire frequency range into three groups and applying the proposed bit-plane coding selectively to each group. The performance of the proposed coding scheme, measured as a theoretical number of bits based on entropy, shows up to 6% improvement over the conventional lossless coder used in the AAC audio codec.
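As a rough illustration of the scheme's first two stages, the sketch below performs an MSB-first bit-plane scan of quantized magnitudes and run-length codes the resulting bit-stream; the frequency grouping and the final Huffman stage are omitted, and nothing here reproduces the paper's exact codec.

```python
# Minimal sketch: bit-plane transform of quantized spectral magnitudes into a
# 1-D bit-stream, then run-length coding. The paper additionally splits the
# spectrum into three groups and Huffman-codes the runs.
import numpy as np

def bitplane_runlength(coeffs: np.ndarray) -> list:
    """coeffs: 1-D array of non-negative quantized magnitudes -> run lengths."""
    coeffs = np.asarray(coeffs, dtype=np.uint32)
    nbits = int(coeffs.max()).bit_length() or 1
    # Scan plane by plane from the most significant bit: neighbouring
    # coefficients have similar magnitudes, so each plane has long runs.
    bits = []
    for p in range(nbits - 1, -1, -1):
        bits.extend(((coeffs >> p) & 1).tolist())
    # Run-length code the 1-D bit-stream (lengths of runs of equal bits).
    runs, current, length = [], bits[0], 0
    for b in bits:
        if b == current:
            length += 1
        else:
            runs.append(length)
            current, length = b, 1
    runs.append(length)
    return runs  # these run lengths would then be Huffman coded
```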

Multi-view Video Coding using View Interpolation (영상 보간을 이용한 다시점 비디오 부호화 방법)

  • Lee, Cheon; Oh, Kwan-Jung; Ho, Yo-Sung
    • Journal of Broadcast Engineering / v.12 no.2 / pp.128-136 / 2007
  • Since a multi-view video is a set of video sequences captured by multiple array cameras for the same three-dimensional scene, it can provide images from multiple viewpoints through geometrical manipulation and intermediate view generation. Although multi-view video lets us experience a more realistic feeling through a wide range of images, the amount of data to be processed increases in proportion to the number of cameras; therefore, efficient coding methods are needed. One possible approach to multi-view video coding is to generate an intermediate image using a view interpolation method and to use the interpolated image as an additional reference frame. The previous view interpolation method for multi-view video coding employs fixed-size block matching over a pre-determined disparity search range; however, if the disparity search range is not proper, disparity errors may occur. In this paper, we propose an efficient view interpolation method using initial disparity estimation, variable block-based estimation, and pixel-level estimation with adjusted search ranges. In addition, we propose a multi-view video coding method based on H.264/AVC that exploits the intermediate image. The intermediate images improved by about 1-4 dB with the proposed method compared to the previous view interpolation method, and coding efficiency improved by about 0.5 dB compared to the reference model.
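To make the interpolation idea concrete, here is a toy Python sketch under simplifying assumptions: rectified grayscale views, one fixed block size, and a plain SAD search, i.e. only the first of the paper's three refinement stages (no variable blocks or pixel-level refinement).

```python
# Toy intermediate-view synthesis between two rectified views: estimate a
# per-block disparity by SAD block matching, then place the average of the
# matched blocks at the half-disparity position of the virtual middle view.
import numpy as np

def block_disparity(left, right, block=8, max_d=32):
    h, w = left.shape
    disp = np.zeros((h // block, w // block), dtype=np.int32)
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            ref = left[y:y + block, x:x + block].astype(np.int32)
            best, best_d = None, 0
            for d in range(min(max_d, x) + 1):   # search leftwards in right view
                cand = right[y:y + block, x - d:x - d + block].astype(np.int32)
                sad = np.abs(ref - cand).sum()
                if best is None or sad < best:
                    best, best_d = sad, d
            disp[by, bx] = best_d
    return disp

def interpolate_midview(left, right, disp, block=8):
    mid = np.zeros_like(left, dtype=np.float64)
    h, w = left.shape
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            half = disp[by, bx] // 2
            xs = max(x - 2 * half, 0)            # matched block in right view
            xm = max(x - half, 0)                # half-shifted mid position
            l = left[y:y + block, x:x + block].astype(np.float64)
            r = right[y:y + block, xs:xs + block].astype(np.float64)
            mid[y:y + block, xm:xm + block] = (l + r) / 2.0
    return mid.astype(left.dtype)
```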

Multi-modal Emotion Recognition using Semi-supervised Learning and Multiple Neural Networks in the Wild (준 지도학습과 여러 개의 딥 뉴럴 네트워크를 사용한 멀티 모달 기반 감정 인식 알고리즘)

  • Kim, Dae Ha; Song, Byung Cheol
    • Journal of Broadcast Engineering / v.23 no.3 / pp.351-360 / 2018
  • Human emotion recognition is a research topic receiving continuous attention in the computer vision and artificial intelligence domains. This paper proposes a method for classifying human emotions through multiple neural networks based on multi-modal signals consisting of image, landmark, and audio captured in a wild environment. The proposed method has the following features. First, the learning performance of the image-based network is greatly improved by employing both multi-task learning and semi-supervised learning that exploit the spatio-temporal characteristics of videos. Second, a model for converting one-dimensional (1D) facial landmark information into two-dimensional (2D) images is newly proposed, and a CNN-LSTM network based on this model is proposed for better emotion recognition. Third, based on the observation that audio signals are often very effective for specific emotions, we propose an audio deep learning mechanism robust to those specific emotions. Finally, a so-called emotion adaptive fusion is applied to enable synergy among the multiple networks. The proposed network improves emotion classification performance by appropriately integrating existing supervised and semi-supervised learning networks. In the fifth attempt on the given test set of the EmotiW2017 challenge, the proposed method achieved a classification accuracy of 57.12%.
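The fusion step lends itself to a small sketch. Assuming each modality network emits a softmax vector, an "emotion adaptive" combination can weight each class per modality (e.g. boosting audio on the emotions it detects well). The class list and weights below are invented for illustration, not the paper's values.

```python
# Illustrative emotion-adaptive fusion: per-class, per-modality weights are
# applied to each network's softmax output before summing.
import numpy as np

CLASSES = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

def emotion_adaptive_fusion(probs_by_modality, class_weights):
    """Both args: dicts mapping modality name -> length-7 vector."""
    fused = np.zeros(len(CLASSES))
    for modality, probs in probs_by_modality.items():
        fused += class_weights[modality] * probs
    return CLASSES[int(np.argmax(fused))]

# Usage with made-up network outputs p_img, p_lmk, p_aud (softmax vectors):
# pred = emotion_adaptive_fusion(
#     {"image": p_img, "landmark": p_lmk, "audio": p_aud},
#     {"image": np.ones(7), "landmark": np.ones(7),
#      "audio": np.array([1.5, 1, 1, 1, 1, 1.5, 1])})  # audio boosted on 2 classes
```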

A Prototype Architecture of an Interactive Service System for Digital Hologram Videos (디지털 홀로그램 비디오를 위한 인터랙티브 서비스 시스템의 프로토타입 설계)

  • Seo, Young-Ho; Lee, Yoon-Hyuk; Yoo, Ji-Sang; Kim, Man-Bae; Choi, Hyun-Jun; Kim, Dong-Wook
    • Journal of Broadcast Engineering / v.17 no.4 / pp.695-706 / 2012
  • The purpose of this paper is to propose a service system for digital hologram video, which has not been reported before. The system assumes the existing service framework for 2-dimensional or 3-dimensional image/video, which includes data acquisition, processing, transmission, reception, and reconstruction. It also includes a function to serve the digital hologram from the viewer's viewpoint by tracking the viewer's face: the image information at the virtual viewpoint corresponding to the viewer's viewpoint is generated to obtain the corresponding hologram. In this paper, only a prototype covering the major functions is implemented, which includes a camera system for data acquisition, camera calibration and image rectification, depth/intensity image enhancement, intermediate view generation, digital hologram generation, and holographic image reconstruction by both simulation and optical apparatus. The implemented prototype showed that it takes about 352 ms to generate one frame of a digital hologram and reconstruct the image by simulation, or about 183 ms when the image is reconstructed by the optical apparatus instead of simulation.
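For the hologram-generation stage, a minimal point-source computer-generated hologram (CGH) sketch is shown below: every object point taken from a depth/intensity image adds a spherical wavefront to the hologram plane. The wavelength and pixel pitch are illustrative, and this O(points × pixels) loop is far slower than the optimized pipeline the 352 ms figure implies.

```python
# Point-source CGH: accumulate the complex field of spherical waves emitted by
# object points, then keep the phase as a phase-only hologram.
import numpy as np

def point_source_cgh(points, holo_shape=(256, 256),
                     pitch=10e-6, wavelength=532e-9):
    """points: iterable of (x, y, z, amplitude) in metres."""
    k = 2 * np.pi / wavelength
    h, w = holo_shape
    ys = (np.arange(h) - h / 2) * pitch
    xs = (np.arange(w) - w / 2) * pitch
    X, Y = np.meshgrid(xs, ys)
    field = np.zeros(holo_shape, dtype=np.complex128)
    for x0, y0, z0, amp in points:
        r = np.sqrt((X - x0) ** 2 + (Y - y0) ** 2 + z0 ** 2)
        field += amp * np.exp(1j * k * r) / r   # spherical wave from the point
    return np.angle(field)                      # phase-only hologram
```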

Accurate Camera Calibration Method for Multiview Stereoscopic Image Acquisition (다중 입체 영상 획득을 위한 정밀 카메라 캘리브레이션 기법)

  • Kim, Jung Hee; Yun, Yeohun; Kim, Junsu; Yun, Kugjin; Cheong, Won-Sik; Kang, Suk-Ju
    • Journal of Broadcast Engineering / v.24 no.6 / pp.919-927 / 2019
  • In this paper, we propose an accurate camera calibration method for acquiring multi-view stereoscopic images. Generally, camera calibration is performed using checkerboard-structured patterns. A checkerboard pattern simplifies the feature point extraction process and exploits the known lattice structure, which results in accurate estimation of the relations between points on the 2-dimensional image and points in 3-dimensional space. Since the estimation accuracy of the camera parameters depends on feature matching, accurate detection of the checkerboard corners is crucial. Therefore, we propose a method that achieves accurate camera calibration through precise detection of the checkerboard corners. The proposed method detects checkerboard corner candidates using 1-dimensional Gaussian filters, followed by a corner refinement process that removes outliers from the candidates and detects the checkerboard corners at sub-pixel accuracy. To verify the proposed method, we examine reprojection errors and camera location estimation results to confirm the estimation accuracy of the intrinsic and extrinsic camera parameters.
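The paper's 1-D Gaussian candidate detector is its own contribution, so as a comparable baseline the sketch below uses OpenCV's standard pipeline: detect checkerboard corners, refine them to sub-pixel accuracy, calibrate, and report the RMS reprojection error that the paper uses as its accuracy measure. The pattern size and file names are assumptions.

```python
# Baseline calibration with OpenCV: corner detection, sub-pixel refinement,
# and intrinsic/extrinsic estimation with reprojection-error feedback.
import cv2
import numpy as np

PATTERN = (9, 6)   # inner corners per row/column (assumed board geometry)
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in ["view0.png", "view1.png"]:          # placeholder image names
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if not found:
        continue
    # Refine each candidate corner to sub-pixel accuracy in an 11x11 window.
    corners = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    obj_points.append(objp)
    img_points.append(corners)

rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)   # lower means more accurate corners
```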

Point Cloud Rendering Quality Improvement Method based on Depth and Inverse Warping (깊이정보와 역변환 기반의 포인트 클라우드 렌더링 품질 향상 방법)

  • Lee, Heejea; Yun, Junyoung; Park, Jong-Il
    • Journal of Broadcast Engineering / v.26 no.6 / pp.714-724 / 2021
  • Point cloud content is immersive content recorded by acquiring points, with three-dimensional location information, and colors corresponding to a real environment and objects. When point cloud content consisting of three-dimensional points with position and color information is enlarged and rendered, the gaps between the points widen and empty holes occur. In this paper, we propose a method for improving the quality of point cloud content through inverse-transformation-based interpolation that uses depth information for the holes that arise from the gaps between points during enlargement. Points on the back side are rendered within the holes created by the gaps between points, hindering the interpolation method; to solve this, we first remove the points corresponding to the back side of the point cloud. Next, a depth map is extracted at the viewpoint where the empty holes occur. Finally, an inverse transform is performed to fetch pixels from the original data. Rendering content with the proposed method improved rendering quality by 1.2 dB in average PSNR compared to the conventional method of increasing the point size to fill the blank areas.
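A sketch of the inverse-warping fill is given below, under assumed names and a pinhole model: every hole pixel in the enlarged rendering is un-projected with its depth, mapped through the inverse render transform into the source view, and re-projected to fetch the original color. The matrices and interfaces are illustrative, not the paper's implementation.

```python
# Depth-based inverse warping: back-project hole pixels to 3-D, transform them
# into the source view, and sample the original colors there.
import numpy as np

def fill_holes(render_rgb, render_depth, hole_mask, K_render, K_src,
               T_render_to_src, src_rgb):
    """hole_mask: bool (h, w); K_*: 3x3 intrinsics; T: 4x4 rigid transform."""
    ys, xs = np.nonzero(hole_mask)
    z = render_depth[ys, xs]
    ok_z = z > 0
    ys, xs, z = ys[ok_z], xs[ok_z], z[ok_z]
    # Un-project hole pixels to the render view's 3-D camera coordinates.
    pix = np.stack([xs, ys, np.ones_like(xs)]).astype(np.float64)
    cam = np.linalg.inv(K_render) @ pix * z
    cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
    # Inverse warp into the source view and re-project with its intrinsics.
    uv = K_src @ (T_render_to_src @ cam_h)[:3]
    u = np.round(uv[0] / uv[2]).astype(int)
    v = np.round(uv[1] / uv[2]).astype(int)
    ok = (u >= 0) & (u < src_rgb.shape[1]) & (v >= 0) & (v < src_rgb.shape[0])
    out = render_rgb.copy()
    out[ys[ok], xs[ok]] = src_rgb[v[ok], u[ok]]   # fill holes with source pixels
    return out
```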

Changes in the Emotion by the Expressive Definition of Visual Contents (영상콘텐츠의 표현밀도에 따른 감정의 변화)

  • Kim, Se-Hwa
    • The Journal of the Korea Contents Association / v.10 no.1 / pp.192-201 / 2010
  • This research examines the expressive definition of visual contents, varying the screen resolution and the distance between the subject and the screen, and how these changes affect the emotions of viewers. Visual images captured from an HDTV screen were shown to 61 students attending a university in the Busan area, and the SAM evaluation method was used to measure three emotion dimensions: pleasure, arousal, and dominance. Comparing resolutions, viewing high-resolution content rather than low-resolution content shifted responses toward pleasure, arousal, and dominance. Showing varying resolutions also had a more volatile emotional effect than consistently showing the same resolution. Post-hoc multiple comparisons showed a tendency for emotions to become unpleasant and less aroused when high-resolution content was shown first and then switched to low-resolution content. The control variables showed no significant effects. In post-hoc multiple comparisons of short, medium, and long distances between the subject and the screen, the short distance yielded higher pleasure, arousal, and dominance scores than the others. A multivariate test showed a positive correlation between resolution and distance for happiness and excitement.

Development of a Haptic Modeling and Editing Tool (촉감 모델링 및 편집 툴 개발)

  • Seo, Yong-Won; Lee, Beom-Chan; Cha, Jong-Eun; Kim, Jong-Phil; Ryu, Je-Ha
    • Proceedings of the HCI Society of Korea Conference / 2007.02a / pp.373-378 / 2007
  • Recently, the haptics field has been widely studied in medicine, education, the military, entertainment, and broadcasting, as it makes digital content touchable by providing tactile feedback. However, although haptics has many advantages, such as giving users a more realistic and natural interaction by adding tactile feedback to audiovisual information, it is still unfamiliar to general users. One reason is the lack of content capable of haptic interaction. Moreover, interest in virtual environments has recently increased, and many attempts are being made to combine haptic technology with virtual environments, so the demand for haptic modeling is also growing. In general, haptic modeling consists of graphic models endowed with material properties. Graphic modeling can be done with common modeling tools (MAYA, 3D MAX, etc.), but the associated haptic models must be added manually, one by one, after the content has been created. Graphic modeling can be intuitive because the user works while checking the result visually; similarly, for haptic modeling to be intuitive, it should proceed while the user directly feels the tactile sensation. In addition, because graphic modeling and haptic modeling are not performed at the same time, creating haptic content takes a long time and is not intuitive. Furthermore, such modeling, including haptic modeling, must be done quickly for high productivity. For these reasons, a new interface for haptic modeling is needed. This paper describes a haptic modeler that allows haptic content capable of haptic interaction to be created and manipulated intuitively. In the haptic modeler, using a 3-degree-of-freedom haptic device, the user can create and manipulate 3-dimensional content (static, dynamic, or deformable 2D, 2.5D, and 3D scenes) while touching it in real time, and can intuitively edit the surface haptic properties of the content through a haptic user interface (HUI). The haptic user interface includes the conventional 2-dimensional graphical user interface operated with a mouse, plus an additional 3-dimensional user interface composed of buttons, radio buttons, sliders, and a joystick that can be operated with the haptic device. The user changes the surface haptic property values of the content by manipulating each component, and can set the values intuitively by touching a part of the haptic user interface and feeling the resulting sensation in real time. In addition, an XML-based file format is provided, so the created content can be saved, and saved content can be loaded or added to other content. Such a system should greatly help even people unfamiliar with haptics to perform haptic modeling intuitively.
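The abstract only states that the file format is XML-based; the sketch below invents plausible tag and attribute names to show what saving surface haptic properties (stiffness, damping, friction) per object might look like.

```python
# Hypothetical XML scene format for haptic surface properties; every tag and
# attribute name here is an assumption for illustration.
import xml.etree.ElementTree as ET

def save_haptic_scene(path, objects):
    """objects: list of dicts with a mesh reference and surface properties."""
    root = ET.Element("HapticScene")
    for obj in objects:
        node = ET.SubElement(root, "Object", name=obj["name"], mesh=obj["mesh"])
        ET.SubElement(node, "Surface",
                      stiffness=str(obj["stiffness"]),            # N/m
                      damping=str(obj["damping"]),
                      static_friction=str(obj["static_friction"]),
                      dynamic_friction=str(obj["dynamic_friction"]))
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)

save_haptic_scene("scene.xml", [{
    "name": "cube", "mesh": "cube.obj", "stiffness": 800.0,
    "damping": 0.5, "static_friction": 0.6, "dynamic_friction": 0.4,
}])
```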


Estimation of a Driver's Physical Condition Using Real-time Vision System (실시간 비전 시스템을 이용한 운전자 신체적 상태 추정)

  • Kim, Jong-Il; Ahn, Hyun-Sik; Jeong, Gu-Min; Moon, Chan-Woo
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.9 no.5 / pp.213-224 / 2009
  • This paper presents a new algorithm for estimating a driver's physical condition using a real-time vision system and validates it experimentally on real facial image data. The system relies on face recognition to robustly track the center points and sizes of the driver's two pupils and the two side edge points of the mouth. The face recognition combines color statistics in the YUV color space with a geometrical model of a typical face. The system can classify head rotation in all viewing directions, detect eye/mouth occlusion, eye blinking, and eye closure, and recover the three-dimensional gaze direction of the eyes. These are used to determine the carelessness and drowsiness of the driver. Experimental results demonstrate the validity and applicability of the proposed method for estimating a driver's physical condition.
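The abstract does not give the decision rule that turns the tracked features into a drowsiness/carelessness verdict, so the sketch below uses a common stand-in: a PERCLOS-style fraction of recent frames with closed eyes, plus an off-road gaze timer. Thresholds are placeholders.

```python
# Stand-in driver-state logic: eye-closure ratio over a sliding window for
# drowsiness, consecutive off-road-gaze frames for carelessness.
from collections import deque

class DriverStateEstimator:
    def __init__(self, window=90, perclos_thresh=0.3, gaze_off_limit=45):
        self.eye_hist = deque(maxlen=window)   # 1 = eyes closed in that frame
        self.gaze_off = 0                      # consecutive off-road frames
        self.perclos_thresh = perclos_thresh
        self.gaze_off_limit = gaze_off_limit

    def update(self, eyes_closed: bool, gaze_on_road: bool) -> dict:
        self.eye_hist.append(1 if eyes_closed else 0)
        self.gaze_off = 0 if gaze_on_road else self.gaze_off + 1
        perclos = sum(self.eye_hist) / len(self.eye_hist)
        return {"drowsy": perclos > self.perclos_thresh,
                "careless": self.gaze_off > self.gaze_off_limit}

# Per-frame usage with the pupil/gaze tracker's outputs:
# est = DriverStateEstimator()
# state = est.update(eyes_closed=False, gaze_on_road=True)
```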
