• Title/Summary/Keyword: Broadcast Media


A Technique for Interpreting and Adjusting Depth Information of each Plane by Applying an Object Detection Algorithm to Multi-plane Light-field Image Converted from Hologram Image (Light-field 이미지로 변환된 다중 평면 홀로그램 영상에 대해 객체 검출 알고리즘을 적용한 평면별 객체의 깊이 정보 해석 및 조절 기법)

  • Young-Gyu Bae;Dong-Ha Shin;Seung-Yeol Lee
    • Journal of Broadcast Engineering, v.28 no.1, pp.31-41, 2023
  • Directly converting the focal depth and image size of a computer-generated hologram (CGH), which is obtained by calculating the interference pattern of light from a 3D image, is known to be quite difficult because the CGH bears little visual similarity to the original image. This paper proposes a method for separately converting each focal length of a given CGH composed of multi-depth images. First, the proposed technique converts the 3D image reproduced from the CGH into a Light-Field (LF) image, a set of 2D images observed from various angles, and the position of the moving object in each observed view is located using the object detection algorithm YOLOv5 (You-Only-Look-Once version 5). Then, by adjusting the positions of the objects, a depth-transformed LF image and CGH are generated. Numerical simulations and experimental results show that the proposed technique can change the focal length within a range of about 3 cm without significant loss of image quality when applied to an image with an original depth of 10 cm, using a spatial light modulator with a pixel size of 3.6 μm and a resolution of 3840×2160.
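
As a rough illustration of the per-view processing described above, the following Python sketch runs the public YOLOv5 model over a set of LF views and applies a simple angle-proportional horizontal shift to emulate a depth change. The view filenames, observation angles, and the linear depth-to-disparity model are assumptions for illustration; they are not the paper's CGH conversion pipeline.

```python
# Sketch: detect the object in each light-field (LF) view with YOLOv5 and
# shift the view to emulate a change of reconstruction depth.
# Assumptions: views stored as "view_00.png"... (hypothetical names); the
# linear depth-to-disparity model is a simplification, not the paper's method.
import numpy as np
import torch
from PIL import Image

model = torch.hub.load("ultralytics/yolov5", "yolov5s")  # public pretrained model

def detect_center(image_path):
    """Return (x, y) center of the highest-confidence detection, or None."""
    det = model(image_path).xyxy[0]            # rows: [x1, y1, x2, y2, conf, cls]
    if det.shape[0] == 0:
        return None
    x1, y1, x2, y2, *_ = det[det[:, 4].argmax()].tolist()
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0

def shift_view(image_path, view_angle_rad, delta_depth_m, pixel_pitch_m=3.6e-6):
    """Horizontally shift a view by the disparity a depth change would cause."""
    disparity_px = int(round(np.tan(view_angle_rad) * delta_depth_m / pixel_pitch_m))
    img = np.asarray(Image.open(image_path))
    return np.roll(img, disparity_px, axis=1)   # crude shift; boundary handling omitted

view_angles = np.linspace(-0.05, 0.05, 9)       # hypothetical observation angles (rad)
for i, ang in enumerate(view_angles):
    path = f"view_{i:02d}.png"                  # hypothetical LF view filenames
    print(f"view {i}: detected object center = {detect_center(path)}")
    # Crude stand-in for moving only the detected object: shift the whole view.
    shifted = shift_view(path, ang, delta_depth_m=0.03)  # emulate a ~3 cm focus change
    Image.fromarray(shifted).save(f"shifted_{i:02d}.png")
```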

Large-view-volume Multi-view Ball-lens Display using Optical Module Array (광학 모듈 어레이를 이용한 넓은 시야 부피의 다시점 볼 렌즈 디스플레이)

  • Gunhee Lee;Daerak Heo;Jeonghyuk Park;Minwoo Jung;Joonku Hahn
    • Journal of Broadcast Engineering, v.28 no.1, pp.79-89, 2023
  • A multi-view display is regarded as the most practical technology for providing a three-dimensional effect to a viewer because it can present an appropriate viewpoint according to the observer's position. However, most multi-view displays with flat form factors have the disadvantage that a viewer can watch 3D images only within a limited frontal viewing angle. In this paper, we propose a spherical display using a ball lens with spherical symmetry that provides full parallax by extending the viewing zone to 360 degrees. In the proposed system, each projection lens is designed to be packaged into a small optical module, and the modules are arranged in a spherical configuration around the ball lens to provide both vertical and horizontal parallax. Through the optical modules, the image is formed at the center of the ball lens, and 3D content is clearly imaged at a size of about 0.65 times the diameter of the ball lens when the viewer watches it within the viewing window. The feasibility of a 360-degree full-parallax display that overcomes the spherical aberration of a ball lens and provides a wide field of view is therefore confirmed experimentally.
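
The spherical arrangement of modules around the ball lens can be illustrated with a small geometry sketch. The module count and radii below are invented for illustration; only the 0.65x image-size figure comes from the abstract.

```python
# Sketch: place optical modules on a sphere around a ball lens and compute
# each module's pointing direction toward the lens center. Module count and
# radii are illustrative; this is not the paper's optical design.
import numpy as np

def fibonacci_sphere(n):
    """Roughly uniform points on a unit sphere (golden-angle spiral)."""
    i = np.arange(n)
    phi = np.pi * (3.0 - np.sqrt(5.0)) * i          # golden-angle increment
    z = 1.0 - 2.0 * (i + 0.5) / n
    r = np.sqrt(1.0 - z * z)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

ball_diameter = 0.10                 # 10 cm ball lens (illustrative)
module_radius = 0.30                 # modules 30 cm from the center (illustrative)
positions = module_radius * fibonacci_sphere(60)

# Each module aims at the ball-lens center, so its direction is simply -position.
directions = -positions / np.linalg.norm(positions, axis=1, keepdims=True)

# The abstract reports that 3D content is imaged at about 0.65x the lens diameter.
image_extent = 0.65 * ball_diameter
print(f"modules: {len(positions)}, imaged content extent ≈ {image_extent * 100:.1f} cm")
```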

Early Prediction of Fine Dust Concentration in Seoul using Weather and Fine Dust Information (기상 및 미세먼지 정보를 활용한 서울시의 미세먼지 농도 조기 예측)

  • HanJoo Lee;Minkyu Jee;Hakdong Kim;Taeheul Jun;Cheongwon Kim
    • Journal of Broadcast Engineering, v.28 no.3, pp.285-292, 2023
  • Recently, the impact of fine dust on health has become a major topic. Fine dust is dangerous because it can penetrate the body and affect the respiratory system without being filtered out by the mucous membrane of the nose. Since fine dust is directly tied to industrial activity, it is practically impossible to remove it completely. Therefore, if the concentration of fine dust can be predicted in advance, pre-emptive measures can be taken to minimize its impact on the human body. Fine dust can travel more than 600 km in a day, so it affects not only neighboring areas but also distant regions. In this paper, wind direction and speed data and a time-series prediction model were used to predict the concentration of fine dust in Seoul, and the correlation between the fine dust concentration in Seoul and that of each region was examined. In addition, predictions were made using the fine dust concentrations of each region together with that of Seoul. The lowest MAE (mean absolute error) among the prediction results was 12.13, about 15.17% better than the MAE of 14.3 reported in previous studies.
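
A minimal sketch of the kind of time-series setup described above: a small PyTorch LSTM trained with an MAE (L1) objective on windowed features. The feature layout, window length, and random stand-in data are assumptions, not the paper's configuration.

```python
# Sketch: a minimal PyTorch LSTM regressor for next-step PM concentration in
# Seoul from lagged PM, wind-speed, and wind-direction features of nearby
# regions. Data here is random; the feature layout is illustrative only.
import torch
import torch.nn as nn

class PMForecaster(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, window, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])      # predict from the last time step

window, n_features = 24, 12               # 24 h of 12 features (hypothetical)
x = torch.randn(256, window, n_features)  # stand-in for real observations
y = torch.randn(256, 1)                   # stand-in for future PM concentration

model = PMForecaster(n_features)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()                      # MAE, the metric reported in the abstract

for epoch in range(5):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    print(f"epoch {epoch}: training-batch MAE = {loss.item():.3f}")
```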

A Study on the Use of Contrast Agent and the Improvement of Body Part Classification Performance through Deep Learning-Based CT Scan Reconstruction (딥러닝 기반 CT 스캔 재구성을 통한 조영제 사용 및 신체 부위 분류 성능 향상 연구)

  • Seongwon Na;Yousun Ko;Kyung Won Kim
    • Journal of Broadcast Engineering, v.28 no.3, pp.293-301, 2023
  • Medical data collection and management remain unstandardized and largely manual, and studies are being conducted to classify CT data using deep learning to address this problem. However, most studies develop models based only on the axial plane, the default CT slice orientation. Unlike general images, CT images depict only human anatomy, so simply reconstructing the CT scan in other planes can provide richer anatomical features. This study explores ways to achieve higher performance through various methods of converting CT scans to 2D images beyond the axial plane. The training set consisted of 1,042 CT scans from five body parts, and 179 test scans plus 448 scans from external datasets were collected for model evaluation. To develop the deep learning model, we used InceptionResNetV2 pre-trained on ImageNet as the backbone and re-trained all layers of the model. In the experiments, the reconstruction-data model achieved 99.33% accuracy in body-part classification, 1.12% higher than the axial-only model, while the axial model was higher only for the brain and neck in contrast-agent classification. In conclusion, more accurate performance can be achieved by training on data that exposes better anatomical features than by training on axial slices alone.
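
The two ingredients described above, reslicing a CT volume beyond the axial plane and fine-tuning an ImageNet-pretrained InceptionResNetV2 with all layers trainable, might be sketched as follows. The windowing, input size, and random volume are placeholders rather than the paper's pipeline.

```python
# Sketch: reslice a CT volume into axial/coronal/sagittal planes and fine-tune
# an ImageNet-pretrained InceptionResNetV2 on the resulting 2D images.
# The volume, preprocessing, and class count are placeholders.
import numpy as np
import tensorflow as tf

def reslice(volume):
    """volume: (Z, Y, X) CT array -> one middle slice per anatomical plane."""
    z, y, x = volume.shape
    return {
        "axial":    volume[z // 2, :, :],
        "coronal":  volume[:, y // 2, :],
        "sagittal": volume[:, :, x // 2],
    }

def to_model_input(slice_2d, size=299):
    """Window, normalize, replicate to 3 channels, and resize for the backbone."""
    img = np.clip(slice_2d, -1000, 1000)                   # crude HU windowing
    img = (img + 1000) / 2000.0
    img = np.repeat(img[..., None], 3, axis=-1).astype("float32")
    return tf.image.resize(img[None], (size, size))

backbone = tf.keras.applications.InceptionResNetV2(
    include_top=False, weights="imagenet", input_shape=(299, 299, 3), pooling="avg")
backbone.trainable = True                                   # re-train all layers, as in the abstract
model = tf.keras.Sequential([backbone, tf.keras.layers.Dense(5, activation="softmax")])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

volume = np.random.randint(-1000, 1000, size=(120, 512, 512)).astype("float32")
batch = tf.concat([to_model_input(s) for s in reslice(volume).values()], axis=0)
print(model.predict(batch).shape)          # (3, 5): one prediction per plane
```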

SHVC-based Texture Map Coding for Scalable Dynamic Mesh Compression (스케일러블 동적 메쉬 압축을 위한 SHVC 기반 텍스처 맵 부호화 방법)

  • Naseong Kwon;Joohyung Byeon;Hansol Choi;Donggyu Sim
    • Journal of Broadcast Engineering, v.28 no.3, pp.314-328, 2023
  • In this paper, we propose a texture map compression method based on the hierarchical coding scheme of SHVC to support scalability in dynamic mesh compression. The proposed method generates multiple-resolution texture maps by downsampling the high-resolution texture map and encodes them with SHVC, effectively eliminating the redundancy among the resolutions. The dynamic mesh decoder then supports scalability of the mesh data by decoding the texture map at the resolution appropriate to the receiver's performance and network environment. To evaluate the proposed method, it is applied to the V-DMC (Video-based Dynamic Mesh Coding) reference software TMMv1.0, and the proposed scalable encoder/decoder is compared with a TMMv1.0-based simulcast method. Experimental results show average gains of -7.7% and -5.7% in point-cloud-based BD-rate (Luma PSNR) under the AI and LD conditions, respectively, compared to the simulcast method, confirming that the proposed method can effectively support texture map scalability for dynamic mesh data.
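
The BD-rate figures quoted above are computed with the standard Bjøntegaard method, which can be sketched as follows; the rate/PSNR points in the example are illustrative, not the paper's measurements.

```python
# Sketch: the standard Bjontegaard delta-rate (BD-rate) computation used to
# report gains such as the -7.7% / -5.7% figures in the abstract.
# The sample rate/PSNR points are illustrative only.
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference (%) of test vs. anchor over the overlapping PSNR range."""
    log_ra, log_rt = np.log(rate_anchor), np.log(rate_test)
    poly_a = np.polyfit(psnr_anchor, log_ra, 3)    # cubic fit of log-rate vs. PSNR
    poly_t = np.polyfit(psnr_test, log_rt, 3)
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_a = np.polyval(np.polyint(poly_a), [lo, hi])
    int_t = np.polyval(np.polyint(poly_t), [lo, hi])
    avg_diff = ((int_t[1] - int_t[0]) - (int_a[1] - int_a[0])) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0

# Illustrative rate-distortion points (kbps, Luma PSNR in dB)
simulcast = ([1200, 2400, 4800, 9600], [32.1, 34.8, 37.2, 39.5])
scalable  = ([1100, 2200, 4400, 8800], [32.3, 35.0, 37.4, 39.6])
print(f"BD-rate: {bd_rate(*simulcast, *scalable):.2f}%")   # negative = bit saving
```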

Material Image Classification using Normal Map Generation (Normal map 생성을 이용한 물질 이미지 분류)

  • Nam, Hyeongil;Kim, Tae Hyun;Park, Jong-Il
    • Journal of Broadcast Engineering, v.27 no.1, pp.69-79, 2022
  • In this study, we propose a method of generating and utilizing a normal map image, which represents the surface characteristics of a material in an image, to improve the classification accuracy of the original material image. First, (1) to generate a normal map that reflects the surface properties of the material in an image, we use a U-Net with attention-R2 gates as the generator and a Pix2Pix-based method whose reconstruction loss measures the similarity between the generated normal map and the original normal map. Next, (2) we propose a network that improves the classification accuracy of the original material image by feeding the generated normal map into the attention gate of the classification network. For normal maps generated from the Pixar dataset, the similarity to the ground-truth normal maps is evaluated, and results obtained with reconstruction losses based on different similarity metrics are compared. In addition, evaluation of material image classification on the MINC-2500 and FMD datasets, together with comparative experiments against previous studies, confirms that the proposed method distinguishes materials more accurately. The method proposed in this paper is expected to serve as a basis for various image processing tasks and network designs that identify materials within an image.
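
A possible reading of the reconstruction loss described in (1), combining an L1 term with a cosine-similarity term between generated and ground-truth normal maps, is sketched below in PyTorch. The weighting and exact formulation are assumptions, not the paper's loss.

```python
# Sketch: a reconstruction loss for normal-map generation that combines L1
# distance with cosine similarity between generated and ground-truth normals,
# in the spirit of the Pix2Pix-based training described in the abstract.
# The weighting and exact formulation are assumptions.
import torch
import torch.nn.functional as F

def normal_map_recon_loss(pred, target, l1_weight=1.0, cos_weight=1.0):
    """
    pred, target: (B, 3, H, W) tensors whose channels encode surface normals.
    Returns a scalar: L1 term plus (1 - mean cosine similarity) term.
    """
    l1 = F.l1_loss(pred, target)
    # Normalize per pixel so the cosine term compares normal directions only.
    pred_n = F.normalize(pred, dim=1, eps=1e-6)
    target_n = F.normalize(target, dim=1, eps=1e-6)
    cos = (pred_n * target_n).sum(dim=1)           # (B, H, W) cosine per pixel
    return l1_weight * l1 + cos_weight * (1.0 - cos.mean())

# Toy usage with random tensors standing in for generator output and ground truth.
pred = torch.randn(2, 3, 64, 64, requires_grad=True)
target = torch.randn(2, 3, 64, 64)
loss = normal_map_recon_loss(pred, target)
loss.backward()
print(loss.item())
```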

Analysis of Transfer Learning Effect for Automatic Dog Breed Classification (반려견 자동 품종 분류를 위한 전이학습 효과 분석)

  • Lee, Dongsu;Park, Gooman
    • Journal of Broadcast Engineering, v.27 no.1, pp.133-145, 2022
  • Compared to the continuously increasing dog population and industry size in Korea, systematic analysis of related data and research on breed classification methods are very limited. In this paper, an automatic breed classification method using deep learning is proposed for 14 major dog breeds raised domestically. To this end, dog images are collected and a dataset is built for deep learning training, and a breed classification algorithm is created by performing transfer learning with VGG-16 and ResNet-34 as backbone networks. To examine the transfer learning effect of the two models on dog images, we compared using the pre-trained weights as-is with updating all weights. When fine-tuning was performed with the VGG-16 backbone, the final model achieved a Top-1 accuracy of about 89% and a Top-3 accuracy of about 94%. The domestic dog breed classification method and dataset construction proposed in this paper have the potential to be used for various applications, such as classifying abandoned and lost dogs in animal protection centers or supporting the pet-feed industry.
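
The two transfer-learning variants compared above (frozen pre-trained weights versus updating all weights) can be set up with torchvision as sketched below; the hyperparameters are illustrative.

```python
# Sketch: the two transfer-learning variants compared in the abstract, built
# from torchvision backbones with a new 14-way head for the dog breeds.
# Set freeze_backbone=True for fixed pre-trained features, False to fine-tune
# all weights. Illustrative only.
import torch.nn as nn
from torchvision import models

NUM_BREEDS = 14

def build_vgg16(freeze_backbone: bool):
    model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
    if freeze_backbone:
        for p in model.features.parameters():
            p.requires_grad = False
    model.classifier[6] = nn.Linear(4096, NUM_BREEDS)       # new 14-way head
    return model

def build_resnet34(freeze_backbone: bool):
    model = models.resnet34(weights=models.ResNet34_Weights.IMAGENET1K_V1)
    if freeze_backbone:
        for p in model.parameters():
            p.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, NUM_BREEDS)  # new head stays trainable
    return model

vgg_finetune = build_vgg16(freeze_backbone=False)           # update all weights
resnet_frozen = build_resnet34(freeze_backbone=True)        # pre-trained features only
print(sum(p.requires_grad for p in resnet_frozen.parameters()), "trainable tensors")
```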

Latent Shifting and Compensation for Learned Video Compression (신경망 기반 비디오 압축을 위한 레이턴트 정보의 방향 이동 및 보상)

  • Kim, Yeongwoong;Kim, Donghyun;Jeong, Se Yoon;Choi, Jin Soo;Kim, Hui Yong
    • Journal of Broadcast Engineering, v.27 no.1, pp.31-43, 2022
  • Traditional video compression has developed on the basis of hybrid coding methods combining motion prediction, residual coding, and quantization. With the rapid development of artificial neural networks in recent years, research on neural-network-based image and video compression is also progressing rapidly, showing competitive performance against traditional video codecs. This paper presents a new method for improving the performance of such a neural-network-based video compression model. Starting from the rate-distortion optimization framework with an auto-encoder and entropy model adopted by existing learned video compression models, the proposed method shifts those components of the latent representation that are difficult for the entropy model to estimate before the compressed latents are transmitted from the encoder to the decoder, and finally compensates for the distortion of the lost information. In this way, the existing neural-network-based video compression framework MFVC (Motion Free Video Compression) is improved: the BDBR (Bjøntegaard Delta Rate) computed against H.264 reaches -27%, nearly twice the bit saving of MFVC (-14%). The proposed method has the advantage of being widely applicable not only to MFVC but also to other neural-network-based image or video compression models that use latent representations and an entropy model.
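
A loose illustration of the latent shifting and compensation idea, as described in the abstract, is sketched below: latent elements with low estimated likelihood are replaced before entropy coding and restored at the decoder from side information. This is an interpretation for illustration only, with an assumed threshold and compensation scheme; it is not the authors' MFVC modification.

```python
# Sketch: find latent elements the entropy model assigns low likelihood,
# replace ("shift") them before entropy coding, and compensate at the decoder
# using small side information. Interpretation for illustration only.
import torch

def shift_latents(y, likelihoods, threshold=0.05):
    """Zero-out hard-to-model elements and return side info to restore them."""
    hard = likelihoods < threshold                 # elements costly to entropy-code
    side_values = y[hard]                          # values the decoder will need
    y_shifted = y.clone()
    y_shifted[hard] = 0.0                          # 'shift' them to a cheap symbol
    return y_shifted, hard, side_values

def compensate(y_shifted, hard_mask, side_values):
    """Decoder-side compensation: put the transmitted values back."""
    y_rec = y_shifted.clone()
    y_rec[hard_mask] = side_values
    return y_rec

y = torch.randn(1, 192, 16, 16)                    # toy latent tensor
likelihoods = torch.rand_like(y)                   # stand-in for entropy-model outputs
y_shifted, mask, side = shift_latents(y, likelihoods)
y_rec = compensate(y_shifted, mask, side)
print(mask.float().mean().item(), torch.allclose(y_rec, y))
```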

Data Augmentation for Tomato Detection and Pose Estimation (토마토 위치 및 자세 추정을 위한 데이터 증대기법)

  • Jang, Minho;Hwang, Youngbae
    • Journal of Broadcast Engineering, v.27 no.1, pp.44-55, 2022
  • In order to automatically provide information about fruits in agriculture-related broadcasting content, instance segmentation of the target fruits is required, and information on the 3D pose of each fruit can also be used meaningfully. This paper presents research on providing such information about tomatoes in video content. A large amount of data is required to train instance segmentation, but sufficient training data is difficult to obtain, so training data is generated with a data augmentation technique based on a small number of real images. Compared to using only real images, detection performance improves when training on synthetic images created by separating foreground and background; training on images augmented with conventional image pre-processing techniques yields even higher performance than the foreground/background-separated synthetic images. To estimate the pose from the detection result, a point cloud is obtained with an RGB-D camera, cylinder fitting based on least-squares minimization is performed, and the tomato pose is estimated from the axial direction of the cylinder. Various experiments show that the proposed approach effectively performs detection, instance segmentation, and cylinder fitting of the target object.
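
The cylinder-axis step of the pose estimation can be approximated with a PCA/SVD principal-axis computation, a common initialization for the least-squares cylinder fit the abstract describes. The random point cloud below stands in for an RGB-D point cloud of a detected tomato.

```python
# Sketch: approximate a cylinder axis and radius from a segmented point cloud
# with a PCA (SVD) principal-axis step, a common initialization for
# least-squares cylinder fitting. Synthetic points stand in for RGB-D data.
import numpy as np

def fit_cylinder_axis(points):
    """points: (N, 3) array. Returns (centroid, unit axis, mean radius)."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    # Principal direction of the cloud = right-singular vector with largest value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0] / np.linalg.norm(vt[0])
    # Radius estimate: mean distance from each point to the axis line.
    proj = centered @ axis                       # scalar projection onto the axis
    radial = centered - np.outer(proj, axis)     # component perpendicular to axis
    radius = np.linalg.norm(radial, axis=1).mean()
    return centroid, axis, radius

# Toy cylinder-like cloud: points around a vertical axis with noise.
theta = np.random.uniform(0, 2 * np.pi, 2000)
z = np.random.uniform(-0.05, 0.05, 2000)
pts = np.stack([0.025 * np.cos(theta), 0.025 * np.sin(theta), z], axis=1)
pts += np.random.normal(scale=0.001, size=pts.shape)

c, a, r = fit_cylinder_axis(pts)
print("axis ≈", np.round(a, 3), " radius ≈ %.3f m" % r)
```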

QRAS-based Algorithm for Omnidirectional Sound Source Determination Without Blind Spots (사각영역이 없는 전방향 음원인식을 위한 QRAS 기반의 알고리즘)

  • Kim, Youngeon;Park, Gooman
    • Journal of Broadcast Engineering, v.27 no.1, pp.91-103, 2022
  • Determining sound source characteristics such as volume, direction, and distance is one of the important techniques for unmanned systems such as autonomous vehicles, robots, and AI speakers. There are multiple methods for determining the direction and distance to a sound source, e.g., using radar, lidar, ultrasonic waves, or an RF signal combined with sound. These methods require transmitting signals and cannot accurately identify sound sources generated in regions obstructed by obstacles. In this paper, we implement and evaluate a method of detecting and identifying sound in the audible frequency band by recognizing the volume, direction, and distance of a sound source generated in the surroundings, including regions that are not visible. A cross-shaped sound source recognition algorithm, which is commonly used for identifying a sound source, can measure the volume and locate the direction of the sound source, but it suffers from blind spots; a further serious limitation is its inability to determine the distance to the sound source. To overcome these limitations of the existing method, we propose a QRAS-based algorithm that uses a rectangular arrangement. This method can determine the volume, direction, and distance to the sound source, an improvement over the cross-shaped algorithm. The QRAS-based algorithm for omnidirectional sound source determination (OSSD) uses six AITDs derived from four microphones deployed in a rectangular configuration. It solves the existing problems of cross-shaped algorithms, such as blind spots, and it can determine the distance to the sound source. Experiments demonstrate that the proposed QRAS-based algorithm for OSSD can reliably determine the sound volume along with the direction and distance to the sound source while avoiding blind spots.
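
The arrival-time differences that the four-microphone rectangular array provides for its six pairs can be estimated per pair with GCC-PHAT, as sketched below. This is a generic TDOA estimator for illustration, not the paper's QRAS algorithm itself.

```python
# Sketch: GCC-PHAT time-delay estimation between one microphone pair, the kind
# of arrival-time-difference measurement a four-microphone rectangular array
# yields for each of its six pairs. Generic estimator for illustration.
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None, interp=16):
    """Return the estimated delay (s) of sig relative to ref via GCC-PHAT."""
    n = sig.shape[0] + ref.shape[0]
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)
    cc = np.fft.irfft(R / (np.abs(R) + 1e-15), n=interp * n)
    max_shift = interp * n // 2
    if max_tau is not None:
        max_shift = min(int(interp * fs * max_tau), max_shift)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / float(interp * fs)

# Toy example: broadband noise delayed by 12 samples (0.25 ms) at the second mic.
fs = 48000
src = np.random.randn(int(0.1 * fs))
mic_a = src
mic_b = np.roll(src, 12)                            # 12 / 48000 s = 0.25 ms delay
print("estimated delay: %.6f s" % gcc_phat(mic_b, mic_a, fs, max_tau=0.001))
```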