• Title/Summary/Keyword: Broadcast Media

A New Calibration of 3D Point Cloud using 3D Skeleton (3D 스켈레톤을 이용한 3D 포인트 클라우드의 캘리브레이션)

  • Park, Byung-Seo;Kang, Ji-Won;Lee, Sol;Park, Jung-Tak;Choi, Jang-Hwan;Kim, Dong-Wook;Seo, Young-Ho
    • Journal of Broadcast Engineering / v.26 no.3 / pp.247-257 / 2021
  • This paper proposes a new technique for calibrating a multi-view RGB-D camera rig using a 3D (three-dimensional) skeleton. Calibrating a multi-view camera requires consistent feature points, and those feature points must be accurate to obtain a high-accuracy calibration result. We use the human skeleton as the source of feature points, since it can be obtained easily with state-of-the-art pose estimation algorithms. The proposed RGB-D calibration algorithm uses the joint coordinates of the 3D skeleton produced by the pose estimation algorithm as feature points. Because the body information captured by each camera may be incomplete, the skeletons predicted from those images may also be incomplete. After efficiently integrating the many incomplete skeletons into a single skeleton, the multi-view cameras are calibrated by using the integrated skeleton to obtain a camera transformation matrix. To increase calibration accuracy, multiple skeletons collected over time are used in an iterative optimization. Experiments demonstrate that a multi-view camera rig can be calibrated from a large number of incomplete skeletons.
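
The calibration step described here ultimately comes down to recovering a rigid transformation between cameras from matched 3D joint coordinates. A minimal SVD-based (Kabsch) sketch of that sub-step is shown below; the array names (`src_joints`, `dst_joints`) are illustrative assumptions, and the skeleton integration and temporal optimization the abstract mentions are omitted.

```python
import numpy as np

def rigid_transform_from_joints(src_joints: np.ndarray, dst_joints: np.ndarray):
    """Estimate the rotation R and translation t that map src_joints onto
    dst_joints (both of shape (N, 3)) via the Kabsch / SVD method."""
    src_c = src_joints.mean(axis=0)
    dst_c = dst_joints.mean(axis=0)
    H = (src_joints - src_c).T @ (dst_joints - dst_c)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))              # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```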

Dense-Depth Map Estimation with LiDAR Depth Map and Optical Images based on Self-Organizing Map (라이다 깊이 맵과 이미지를 사용한 자기 조직화 지도 기반의 고밀도 깊이 맵 생성 방법)

  • Choi, Hansol;Lee, Jongseok;Sim, Donggyu
    • Journal of Broadcast Engineering / v.26 no.3 / pp.283-295 / 2021
  • This paper proposes a self-organizing-map-based method for generating a dense depth map from color images and a LiDAR depth map. The proposed depth map upsampling method consists of an initial depth prediction step for the areas not covered by the LiDAR measurements and a depth filtering step. In the initial depth prediction step, stereo matching is performed on two color images to predict initial depth values. In the depth map filtering step, to reduce the error of each predicted depth value, a self-organizing map update is applied to the predicted depth pixel using the measured depth pixels around it. In this self-organizing map process, each weight is determined by the spatial distance between the predicted and measured depth pixels and the difference between their color values. For performance comparison, the proposed method was compared with the bilateral filter and the k-nearest neighbor method, which are widely used for depth map upsampling. Compared to the bilateral filter and the k-nearest neighbor method, the proposed method reduced the error by about 6.4% and 8.6% in terms of MAE, and by about 10.8% and 14.3% in terms of RMSE.
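
The weighting rule described, based on the spatial distance between the predicted and measured depth pixels and their color difference, can be illustrated with a small per-pixel update routine. This is a rough sketch under assumed array names, not the authors' self-organizing map implementation:

```python
import numpy as np

def refine_predicted_depth(pred_depth, lidar_depth, color, y, x, win=5,
                           sigma_s=2.0, sigma_c=10.0, lr=0.5):
    """Pull one predicted depth value toward nearby measured (LiDAR) depths,
    weighting each neighbour by spatial distance and color similarity."""
    h, w = pred_depth.shape
    num, den = 0.0, 0.0
    for dy in range(-win, win + 1):
        for dx in range(-win, win + 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and lidar_depth[ny, nx] > 0:
                d_s = np.hypot(dy, dx)
                d_c = np.linalg.norm(color[ny, nx].astype(float)
                                     - color[y, x].astype(float))
                wgt = (np.exp(-d_s**2 / (2 * sigma_s**2))
                       * np.exp(-d_c**2 / (2 * sigma_c**2)))
                num += wgt * lidar_depth[ny, nx]
                den += wgt
    if den > 0:  # move the prediction toward the weighted neighbourhood estimate
        pred_depth[y, x] += lr * (num / den - pred_depth[y, x])
    return pred_depth[y, x]
```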

Watermarking for Digital Hologram by a Deep Neural Network and its Training Considering the Hologram Data Characteristics (딥 뉴럴 네트워크에 의한 디지털 홀로그램의 워터마킹 및 홀로그램 데이터 특성을 고려한 학습)

  • Lee, Juwon;Lee, Jae-Eun;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of Broadcast Engineering / v.26 no.3 / pp.296-307 / 2021
  • A digital hologram (DH) is an ultra-high value-added video content that encodes 3D information in 2D data, so its intellectual property rights must be protected for distribution. For this purpose, this paper proposes a watermarking method for DHs using a deep neural network. The method provides watermark (WM) invisibility and attack robustness, and it is blind: no host information is used during WM extraction. The proposed network consists of four sub-networks: pre-processing for the host, pre-processing for the WM, WM embedding, and WM extraction. Considering that a DH has strong high-frequency components, the network expands the WM data to the host resolution, rather than shrinking the host to the WM size, and concatenates it with the host to embed the WM. In addition, during training, the difference in performance according to the data distribution properties of DHs is identified, and a method of selecting the training data set that performs best across all types of DH is presented. The proposed method is tested against various types and strengths of attacks to show its performance. It also operates independently of the resolution of the host DH and the WM data, which gives it high practicality.
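
The key architectural idea, expanding the watermark to the host resolution and concatenating it with the host, can be sketched as a toy PyTorch module. The layer sizes, channel counts, and the 4x upsampling factor are illustrative assumptions, not the paper's network:

```python
import torch
import torch.nn as nn

class WMEmbedder(nn.Module):
    """Toy embedder: expand the watermark to the host resolution, concatenate
    the two feature maps, and predict a watermarked hologram."""
    def __init__(self, ch=32):
        super().__init__()
        self.host_pre = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.wm_expand = nn.Sequential(             # expand WM instead of shrinking host
            nn.ConvTranspose2d(1, ch, 4, stride=4), nn.ReLU())
        self.embed = nn.Conv2d(2 * ch, 1, 3, padding=1)

    def forward(self, host, wm):
        f = torch.cat([self.host_pre(host), self.wm_expand(wm)], dim=1)
        return self.embed(f)                        # watermarked hologram

# e.g. a 256x256 host and a 64x64 watermark (the 4x upsampling matches the sizes)
out = WMEmbedder()(torch.randn(1, 1, 256, 256), torch.randn(1, 1, 64, 64))
```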

Image Quality for TV Genre Depending on Viewers Experience (시청자 경험에 의한 TV장르별 화질)

  • Park, YungKyung
    • Journal of Broadcast Engineering / v.26 no.3 / pp.308-320 / 2021
  • Conventional image quality studies have focused on 'naturalness' and have relied on memory colors. Memory colors are formed mainly for familiar objects with prior experience, and the more faithfully these memories are reflected, the more natural the reproduced image appears. In particular, the brightness and saturation of memory colors play an important role in increasing both the naturalness and the preference of image quality. Existing image quality studies have therefore concentrated on natural objects and people for which memory colors exist. We extracted representative images for each genre (sports, documentaries, news, entertainment and music, and movies), adjusted the brightness, contrast, and saturation of each image, and conducted an experiment to evaluate perceived quality. Based on situational context, the results indicate that genres of television content can be divided into two categories: proximate and indirect experiences. Proximate experience best characterizes outdoor sports, dramas, and nature documentaries, whose image quality showed a strong correlation with brightness and contrast. Indirect experience best characterizes news, music shows, and SF/action movies; image quality perception for indirect experiences was closely related to, and optimized by, contrast and saturation.
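
Stimuli of the kind described (brightness, contrast, and saturation variants of a representative frame) can be generated with Pillow. The file name and the adjustment factors below are hypothetical, not the study's stimulus set:

```python
from itertools import product
from PIL import Image, ImageEnhance

# Hypothetical source frame for one genre; the paper's stimuli are not reproduced here.
base = Image.open("sports_frame.png").convert("RGB")

# Brightness / contrast / saturation variants around the original (factor 1.0).
for b, c, s in product([0.8, 1.0, 1.2], repeat=3):
    img = ImageEnhance.Brightness(base).enhance(b)
    img = ImageEnhance.Contrast(img).enhance(c)
    img = ImageEnhance.Color(img).enhance(s)    # "Color" controls saturation in Pillow
    img.save(f"stimulus_b{b}_c{c}_s{s}.png")
```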

Deep Learning-based Keypoint Filtering for Remote Sensing Image Registration (원격 탐사 영상 정합을 위한 딥러닝 기반 특징점 필터링)

  • Sung, Jun-Young;Lee, Woo-Ju;Oh, Seoung-Jun
    • Journal of Broadcast Engineering / v.26 no.1 / pp.26-38 / 2021
  • In this paper, DLKF (Deep Learning Keypoint Filtering), a deep learning-based keypoint filtering method for accelerating image registration of remote sensing images, is proposed. The complexity of conventional feature-based image registration arises in the feature matching step. To reduce this complexity, the proposed method keeps only the keypoints that lie on artificial structures among all keypoints produced by the detector, so that feature matching is performed only on keypoints detected on the artificial structures of the image. To reduce the number of keypoints while preserving essential ones, keypoints adjacent to the boundaries of artificial structures are preserved, downscaled images are used, and image patches are cropped with overlap to suppress the noise at patch boundaries introduced by the image segmentation method. As a result, the proposed method improves both the speed and the accuracy of registration. To verify the performance of DLKF, its speed and accuracy were compared with those of conventional keypoint extraction methods using KOMPSAT-3 satellite remote sensing images. Compared with the widely used SIFT-based registration method, the SURF-based method, which improves the speed of SIFT, reduced the number of keypoints by about 18% and ran 2.6 times faster, but its accuracy degraded from 3.42 to 5.43. In contrast, the proposed DLKF reduced the number of keypoints by about 82% and improved the speed by about 20.5 times, with the accuracy only decreasing to 4.51.
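
The filtering idea, keeping only keypoints that fall on or next to an artificial-structure mask, can be sketched with OpenCV SIFT. The mask is assumed to come from a separate segmentation network and is simply passed in; this is not the DLKF implementation itself:

```python
import cv2
import numpy as np

def filter_keypoints_by_mask(gray: np.ndarray, structure_mask: np.ndarray):
    """Detect SIFT keypoints and keep only those lying on (or adjacent to)
    an artificial-structure segmentation mask."""
    sift = cv2.SIFT_create()
    kps, desc = sift.detectAndCompute(gray, None)
    # dilate so keypoints adjacent to structure boundaries are preserved
    dilated = cv2.dilate(structure_mask, np.ones((5, 5), np.uint8))
    h, w = structure_mask.shape
    keep = [i for i, kp in enumerate(kps)
            if dilated[min(int(round(kp.pt[1])), h - 1),
                       min(int(round(kp.pt[0])), w - 1)] > 0]
    return [kps[i] for i in keep], (desc[keep] if desc is not None else None)
```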

Group-based Adaptive Rendering for 6DoF Immersive Video Streaming (6DoF 몰입형 비디오 스트리밍을 위한 그룹 분할 기반 적응적 렌더링 기법)

  • Lee, Soonbin;Jeong, Jong-Beom;Ryu, Eun-Seok
    • Journal of Broadcast Engineering / v.27 no.2 / pp.216-227 / 2022
  • The MPEG-I (Immersive) group is working on a standardization project for immersive video that provides six degrees of freedom (6DoF). The MPEG Immersive Video (MIV) standard is intended to provide limited 6DoF based on depth image-based rendering (DIBR). Many efficient coding methods have been proposed for MIV, but efficient transmission strategies have received little attention in MPEG-I. This paper proposes a group-based adaptive rendering method for immersive video streaming. Each group can be transmitted independently using group-based encoding, enabling adaptive transmission depending on the user's viewport. In the rendering process, the proposed method derives per-group weights for view synthesis and allocates the high-quality bitstream according to the given viewport. The proposed method is implemented in the Test Model for Immersive Video (TMIV). In the experiments it demonstrates 17.0% Bjontegaard-delta rate (BD-rate) savings on peak signal-to-noise ratio (PSNR) and 14.6% on Immersive Video PSNR (IV-PSNR) across various end-to-end evaluation metrics.
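
A simple way to picture the viewport-dependent allocation is to rank groups by how closely their representative viewing directions align with the current viewport and give the closest groups the high-quality bitstream. The dot-product weighting and the names below are illustrative assumptions, not the TMIV implementation:

```python
import numpy as np

def select_group_qualities(group_centers, viewport_dir, high_budget=2):
    """Rank pre-encoded groups by alignment with the user's viewport direction
    and assign the top-ranked groups the high-quality bitstream."""
    viewport_dir = viewport_dir / np.linalg.norm(viewport_dir)
    weights = np.array([np.dot(c / np.linalg.norm(c), viewport_dir)
                        for c in group_centers])     # larger = more aligned
    order = np.argsort(-weights)
    return {int(g): ("high" if rank < high_budget else "low")
            for rank, g in enumerate(order)}

# e.g. four view groups whose representative viewing directions are known
print(select_group_qualities([np.array([1, 0, 0]), np.array([0, 1, 0]),
                              np.array([-1, 0, 0]), np.array([0, -1, 0])],
                             np.array([0.9, 0.1, 0.0])))
```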

Change Attention-based Vehicle Scratch Detection System (변화 주목 기반 차량 흠집 탐지 시스템)

  • Lee, EunSeong;Lee, DongJun;Park, GunHee;Lee, Woo-Ju;Sim, Donggyu;Oh, Seoung-Jun
    • Journal of Broadcast Engineering / v.27 no.2 / pp.228-239 / 2022
  • In this paper, we propose an unmanned vehicle scratch detection deep learning model for car-sharing services. Conventional scratch detection pipelines consist of two steps: 1) a deep learning module that detects scratches in images taken before and after rental, and 2) a manual matching process for finding newly generated scratches. To build a fully automatic scratch detection model, we propose a one-step unmanned scratch detection deep learning model, implemented by applying transfer learning and fine-tuning to a deep learning model that detects changes in satellite images. In the targeted car-sharing service, specular reflection strongly affects scratch detection performance, since the brightness of the glossy automobile surface is anisotropic and a non-expert user takes the pictures with an ordinary camera. To reduce detection errors caused by specularly reflected light, we propose a preprocessing step that removes specular reflection components. For data taken with mobile phone cameras, the proposed system provides high matching performance both subjectively and objectively. The scores for change detection metrics such as precision, recall, F1, and kappa are 67.90%, 74.56%, 71.08%, and 70.18%, respectively.
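
In its simplest form, specular-reflection preprocessing could mask bright, low-saturation pixels and inpaint them. The sketch below is a generic stand-in for the paper's preprocessing, with arbitrarily chosen thresholds, not the authors' algorithm:

```python
import cv2
import numpy as np

def suppress_specular(bgr: np.ndarray, v_thr=230, s_thr=40):
    """Treat bright, low-saturation pixels as specular highlights and inpaint them."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    mask = ((hsv[..., 2] > v_thr) & (hsv[..., 1] < s_thr)).astype(np.uint8) * 255
    return cv2.inpaint(bgr, mask, 3, cv2.INPAINT_TELEA)
```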

Detection Scheme Based on Gauss - Seidel Method for OTFS Systems (OTFS 시스템을 위한 Gauss - Seidel 방법 기반의 검출 기법)

  • Cha, Eunyoung;Kim, Hyeongseok;Ahn, Haesung;Kwon, Seol;Kim, Jeongchang
    • Journal of Broadcast Engineering / v.27 no.2 / pp.244-247 / 2022
  • In this paper, the performance of decoding schemes for the orthogonal time frequency space (OTFS) system, which can improve robustness in high-speed mobile environments, is compared: linear MMSE filtering in the frequency domain, linear MMSE filtering in the time domain, and the reinforcement Gauss-Seidel algorithm. The reinforcement Gauss-Seidel algorithm improves the bit error rate performance by suppressing noise enhancement. The simulation results show that the performance of the decoding scheme using the linear MMSE filter in the frequency domain is severely degraded by Doppler shift as the mobile speed increases. In addition, at speeds of 120 km/h and 500 km/h, the decoding scheme using the reinforcement Gauss-Seidel algorithm outperforms the decoding schemes using linear MMSE filters in the frequency and time domains.
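
Gauss-Seidel detection of this kind can be written, in its generic textbook form, as iterative sweeps over the regularized normal equations of the channel; the sketch below is that generic form, not the paper's reinforcement variant:

```python
import numpy as np

def gauss_seidel_detect(H, y, noise_var, n_iter=20):
    """Solve (H^H H + sigma^2 I) x = H^H y with Gauss-Seidel sweeps,
    a low-complexity alternative to direct MMSE matrix inversion."""
    A = H.conj().T @ H + noise_var * np.eye(H.shape[1])
    b = H.conj().T @ y
    x = np.zeros(H.shape[1], dtype=complex)
    for _ in range(n_iter):
        for i in range(A.shape[0]):
            s = A[i] @ x - A[i, i] * x[i]     # sum over j != i with latest values
            x[i] = (b[i] - s) / A[i, i]
    # the estimated symbols would then be sliced to the constellation
    return x
```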

Character Detection and Recognition of Steel Materials in Construction Drawings using YOLOv4-based Small Object Detection Techniques (YOLOv4 기반의 소형 물체탐지기법을 이용한 건설도면 내 철강 자재 문자 검출 및 인식기법)

  • Sim, Ji-Woo;Woo, Hee-Jo;Kim, Yoonhwan;Kim, Eung-Tae
    • Journal of Broadcast Engineering / v.27 no.3 / pp.391-401 / 2022
  • As deep learning-based object detection and recognition research has advanced recently, its range of application to industry and daily life is expanding, but deep learning-based systems for construction are still far less studied. Material take-off in construction is still done manually, so erroneous quantity calculations occur because the process is time-consuming and accurate tallying is difficult. A fast and accurate automatic drawing recognition system is required to solve this problem. We therefore propose an AI-based automatic drawing recognition and tallying system that detects and recognizes steel materials in construction drawings. To detect steel materials in construction drawings accurately, we propose data augmentation techniques and a spatial attention module that improve small object detection performance on top of YOLOv4. The text in each detected steel material region is then recognized, and the number of steel materials is aggregated from the predicted characters. Experimental results show that the proposed method increases accuracy and precision by 1.8% and 16%, respectively, compared with the conventional YOLOv4. The proposed method achieved a precision of 0.938, a recall of 1.0, an AP0.5 of 99.4%, and an AP0.5:0.95 of 67%. Character recognition accuracy reached 99.9% by constructing and training on a dataset containing the fonts used in construction drawings, compared with 75.6% when using the existing dataset. The average time required per image was 0.013 seconds for detection, 0.65 seconds for character recognition, and 0.16 seconds for aggregation, resulting in 0.84 seconds in total.
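
A CBAM-style spatial attention block of the kind attached here to YOLOv4 for small characters can be sketched in a few lines of PyTorch. The kernel size and placement are illustrative assumptions, not the authors' exact module:

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Pool over channels, convolve, and reweight the feature map spatially
    so small, informative regions are emphasized."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)        # channel-wise average
        mx, _ = x.max(dim=1, keepdim=True)       # channel-wise max
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn                          # spatially re-weighted features

y = SpatialAttention()(torch.randn(1, 64, 52, 52))   # e.g. a YOLO feature map
```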

Comparison of Artificial Intelligence Multitask Performance using Object Detection and Foreground Image (물체탐색과 전경영상을 이용한 인공지능 멀티태스크 성능 비교)

  • Jeong, Min Hyuk;Kim, Sang-Kyun;Lee, Jin Young;Choo, Hyon-Gon;Lee, HeeKyung;Cheong, Won-Sik
    • Journal of Broadcast Engineering / v.27 no.3 / pp.308-317 / 2022
  • Research is underway to efficiently reduce the size of video data transmitted and stored for image analysis using deep learning-based machine vision technology. MPEG (Moving Picture Experts Group) has launched a standardization project called VCM (Video Coding for Machines) and is studying video coding for machines rather than for humans. We study a multitask setting in which several tasks are performed on a single input image. Instead of running the object detection that each task requires separately, the proposed pipeline runs object detection only once and feeds the result to every task as input. In this paper, we propose this pipeline for efficient multitasking and perform comparative experiments on the compression efficiency of the input image, the execution time, and the accuracy of the results to verify its efficiency. The experiments show that the size of the input data decreased by more than 97.5% while the accuracy of the results decreased only slightly, confirming the feasibility of efficient multitasking.
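
The pipeline idea, running object detection once and reusing the detections for every downstream task, can be sketched generically. `detector` and the per-task functions are placeholders, not the models used in the paper:

```python
# Sketch of the "detect once, reuse for every task" pipeline idea.
def run_multitask(image, detector, tasks):
    detections = detector(image)          # single shared object-detection pass
    results = {}
    for name, task_fn in tasks.items():
        # each downstream task consumes the shared detections (e.g. cropped
        # foreground regions) instead of re-running its own detector
        results[name] = task_fn(image, detections)
    return results
```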