• Title/Summary/Keyword: 해상 객체 검출 (marine object detection)

Search Result 28, Processing Time 0.028 seconds

Compression Error Compensation Method for Multi-Resolution Feature Map (다해상도 피처 맵 압축 손상 보상 방법)

  • Kwon, Naseong;Lee, Minhun;Choi, Hansol;Park, Seungjin;Oh, Seoung-Jun;Kim, Younhee;Lee, Jooyoung;Jeong, SeYoon;Sim, Donggyu
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.1343-1345 / 2022
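  • In this paper, we propose a method for compensating compression errors in multi-resolution pyramid feature maps. When the packed C-layer feature maps are compressed with a video codec, the proposed method computes the difference between the original and reconstructed feature maps of the low-resolution layer and adds it to the feature map of the high-resolution layer, thereby compensating for errors introduced during encoding. To evaluate the proposed method, object detection performance was measured on 1,000 images from the OpenImageV6 dataset. Compared with the C-layer feature map compression method, the proposed method achieves a 35.10% improvement in BD-rate measured with bpp and mAP.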

Design of ToF-Stereo Fusion Sensor System for 3D Spatial Scanning (3차원 공간 스캔을 위한 ToF-Stereo 융합 센서 시스템 설계)

  • Yun Ju Lee;Sun Kook Yoo
    • Smart Media Journal / v.12 no.9 / pp.134-141 / 2023
  • In this paper, we propose a ToF-Stereo fusion sensor system for 3D space scanning that increases the recognition rate of 3D objects, guarantees object detection quality, and is robust to the environment. The system fuses the sensing values of the ToF sensor and the stereo RGB sensor, so that if one sensor stops operating, the other can continue to detect objects. Because the quality of each sensor varies with sensing distance, sensing resolution, light reflectivity, and illuminance, the system includes a module that adjusts sensor function based on reliability estimation. The system combines the two sensing values, estimates their reliability, and adjusts sensor function according to that reliability when fusing them, thereby improving the quality of the 3D space scan.
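
The fusion idea in this abstract can be sketched as follows. The abstract does not state the fusion rule, so the reliability-weighted average, the function name `fuse_depth`, and the convention that `None` marks a sensor with no reading are all assumptions made for illustration.

```python
from typing import Optional


def fuse_depth(
    tof_depth: Optional[float],
    stereo_depth: Optional[float],
    tof_reliability: float,
    stereo_reliability: float,
) -> Optional[float]:
    """Fuse two depth readings (metres) using reliability weights in [0, 1].

    None means the sensor produced no value; the other sensor is then
    used alone. The weighted average is an assumed rule, not the paper's.
    """
    readings = []
    if tof_depth is not None and tof_reliability > 0.0:
        readings.append((tof_depth, tof_reliability))
    if stereo_depth is not None and stereo_reliability > 0.0:
        readings.append((stereo_depth, stereo_reliability))
    if not readings:
        return None  # neither sensor produced a usable value
    total_weight = sum(weight for _, weight in readings)
    return sum(depth * weight for depth, weight in readings) / total_weight


both = fuse_depth(2.0, 2.4, 0.75, 0.25)      # weighted toward the ToF reading
stereo_only = fuse_depth(None, 2.4, 0.0, 0.5)  # ToF dropped out
```

With both sensors available the result lies between the two readings, closer to the more reliable one; when one reading is missing, the remaining sensor's value is returned unchanged.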

Comparative Analysis of CNN Deep Learning Model Performance Based on Quantification Application for High-Speed Marine Object Classification (고속 해상 객체 분류를 위한 양자화 적용 기반 CNN 딥러닝 모델 성능 비교 분석)

  • Lee, Seong-Ju;Lee, Hyo-Chan;Song, Hyun-Hak;Jeon, Ho-Seok;Im, Tae-ho
    • Journal of Internet Computing and Services / v.22 no.2 / pp.59-68 / 2021
  • As rapidly advancing artificial intelligence (AI) technologies are applied to marine environments such as ships, research on CNN-based models specialized for digital video has been active. Real-time processing is critical in E-Navigation services, which combine various technologies to detect floating objects that pose a collision risk, reduce human error, and prevent fires inside ships. Adding functions, however, requires higher-performance processors, which raises prices and places a cost burden on shipowners. This study therefore proposes a method that processes information at high speed while maintaining accuracy by applying quantization to a deep learning model. First, videos were pre-processed for the detection of floating objects at sea so that video data could be passed efficiently to the input of the deep learning model. Second, quantization, one of the techniques for making deep learning models lightweight, was applied to reduce memory usage and increase processing speed. Finally, the proposed model, with video pre-processing and quantization applied, was deployed on various embedded boards to measure its accuracy and processing speed. The proposed method reduced memory usage by a factor of four and increased processing speed by about four to five times while maintaining the original recognition accuracy.
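
A fourfold memory reduction is what one would expect from storing 32-bit floating-point weights as 8-bit integers. The abstract does not give the bit width or the quantization scheme, so the symmetric per-tensor int8 scheme below, along with the names `quantize_int8` and `dequantize`, is an assumption used only to illustrate the principle.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: float weights -> int8 codes plus one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest step and clamp to the int8 range used here.
    codes = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate float weights."""
    return [c * scale for c in codes]


weights = array("f", [-0.82, -0.31, 0.0, 0.07, 0.45, 0.82])  # float32 storage
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

float_bytes = len(weights) * weights.itemsize  # 4 bytes per float32 weight
int8_bytes = len(codes) * codes.itemsize       # 1 byte per int8 code
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Here `float_bytes` is four times `int8_bytes`, and `max_error` stays within half a quantization step (`scale / 2`), which is why accuracy can often be preserved. Speed gains on real hardware depend on integer arithmetic support, which this sketch does not model.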

A Feature Map Compression Method for Multi-resolution Feature Map with PCA-based Transformation (PCA 기반 변환을 통한 다해상도 피처 맵 압축 방법)

  • Park, Seungjin;Lee, Minhun;Choi, Hansol;Kim, Minsub;Oh, Seoung-Jun;Kim, Younhee;Do, Jihoon;Jeong, Se Yoon;Sim, Donggyu
    • Journal of Broadcast Engineering / v.27 no.1 / pp.56-68 / 2022
  • In this paper, we propose a compression method for multi-resolution feature maps for video coding for machines (VCM). The proposed method removes the redundancy between the channels and resolution levels of the multi-resolution feature map through a PCA-based transformation. The basis vectors and mean vector used for the transformation, and the transform coefficients it produces, are compressed with a VVC-based coder and DeepCABAC according to their respective characteristics. To evaluate the proposed method, object detection performance was measured on the OpenImageV6 and COCO 2017 validation sets, and BD-rates computed from bpp and mAP were compared against the MPEG-VCM anchor and the feature map compression anchor proposed in this paper. In the experiments, the proposed method shows a 25.71% BD-rate improvement over the feature map compression anchor on OpenImageV6. Furthermore, for large objects in the COCO 2017 validation set, the BD-rate improves by up to 43.72% compared to the MPEG-VCM anchor.

A Small-area Hardware Implementation of EGML-based Moving Object Detection Processor (EGML 기반 이동객체 검출 프로세서의 저면적 하드웨어 구현)

  • Sung, Mi-ji;Shin, Kyung-wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.12 / pp.2213-2220 / 2017
  • This paper proposes an efficient approach to the hardware implementation of a moving object detection (MOD) processor using an effective Gaussian mixture learning (EGML)-based background subtraction method. The arithmetic units used in background generation were implemented with LUT-based approximation to reduce hardware complexity. The hardware resources used for background subtraction and for Gaussian probability density calculation were shared. The MOD processor was verified by FPGA-in-the-loop simulation using MATLAB/Simulink. MOD performance was evaluated on six types of video defined in the IEEE CDW-2014 dataset, yielding an average recall of 0.7700, an average precision of 0.7170, and an average F-measure of 0.7293. The MOD processor was implemented with 882 slices and $146 \times 36$ kbits of block RAM on a Virtex5 FPGA, a 60% hardware reduction compared to a conventional EGML-based design. The MOD processor was estimated to operate with a 75 MHz clock, giving real-time processing of $800 \times 600$ video at a frame rate of 39 fps.
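
The throughput figures in this abstract can be cross-checked with simple arithmetic. The sketch below only divides the quoted clock rate by the quoted pixel rate; the helper name `cycles_per_pixel` is hypothetical, and the reading that the design spends roughly four clock cycles per pixel is an inference, not something the abstract states.

```python
def cycles_per_pixel(clock_hz: float, width: int, height: int, fps: float) -> float:
    """Clock cycles available per pixel at a given resolution and frame rate."""
    return clock_hz / (width * height * fps)


pixel_rate = 800 * 600 * 39                      # pixels per second at 39 fps
budget = cycles_per_pixel(75e6, 800, 600, 39)    # 75 MHz clock from the abstract
block_ram_kbits = 146 * 36                       # total block RAM quoted

print(pixel_rate)        # 18720000
print(round(budget, 2))  # 4.01
print(block_ram_kbits)   # 5256
```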

A Technique for Interpreting and Adjusting Depth Information of each Plane by Applying an Object Detection Algorithm to Multi-plane Light-field Image Converted from Hologram Image (Light-field 이미지로 변환된 다중 평면 홀로그램 영상에 대해 객체 검출 알고리즘을 적용한 평면별 객체의 깊이 정보 해석 및 조절 기법)

  • Young-Gyu Bae;Dong-Ha Shin;Seung-Yeol Lee
    • Journal of Broadcast Engineering / v.28 no.1 / pp.31-41 / 2023
  • Directly converting the focal depth and image size of a computer-generated hologram (CGH), which is obtained by calculating the interference pattern of light from a 3D image, is known to be quite difficult because the CGH bears little resemblance to the original image. This paper proposes a method for separately converting each focal length of a given CGH composed of multi-depth images. First, the proposed technique converts the 3D image reproduced from the CGH into a light-field (LF) image, a set of 2D images observed from various angles, and checks the positions of the moving objects in each observed view using the object detection algorithm YOLOv5 (You Only Look Once, version 5). Then, by adjusting the positions of the objects, the depth-transformed LF image and CGH are generated. Numerical simulations and experimental results show that the proposed technique can change the focal length within a range of about 3 cm without significant loss of image quality when applied to an image with an original depth of 10 cm, using a spatial light modulator with a pixel size of 3.6 µm and a resolution of 3840×2160.

Performance Analysis of Face Recognition by Face Image resolutions using CNN without Backpropergation and LDA (역전파가 제거된 CNN과 LDA를 이용한 얼굴 영상 해상도별 얼굴 인식률 분석)

  • Moon, Hae-Min;Park, Jin-Won;Pan, Sung Bum
    • Smart Media Journal / v.5 no.1 / pp.24-29 / 2016
  • To meet the needs of high-level intelligent surveillance systems, a system must be able to extract and classify objects to obtain precise information about them. Face recognition is the representative method for identifying a person, but its recognition rate changes with environmental factors such as illumination, background, and camera angle. In this paper, we analyze the robustness of face recognition to changes in distance through a variety of experiments. The experiments used real face images captured at distances of 1 m to 5 m. Face recognition based on Linear Discriminant Analysis (LDA) shows the best performance, 75.4% on average, when a large number of face images per person is used for training. However, face recognition based on a Convolutional Neural Network (CNN) shows the best performance, 69.8% on average, when fewer than five face images per person are available. In addition, the recognition rate for low-resolution faces decreases rapidly when the face image is smaller than $15 \times 15$.

Automated Analyses of Ground-Penetrating Radar Images to Determine Spatial Distribution of Buried Cultural Heritage (매장 문화재 공간 분포 결정을 위한 지하투과레이더 영상 분석 자동화 기법 탐색)

  • Kwon, Moonhee;Kim, Seung-Sep
    • Economic and Environmental Geology / v.55 no.5 / pp.551-561 / 2022
  • Geophysical exploration methods are very useful for generating high-resolution images of underground structures, and they can be applied to investigate buried cultural properties and determine their exact locations. In this study, image feature extraction and image segmentation methods were applied to automatically distinguish the structures of buried relics in high-resolution ground-penetrating radar (GPR) images obtained at the center of the Silla Kingdom, Gyeongju, South Korea. The main purpose of the feature extraction analyses is to identify circular features from building remains and linear features from ancient roads and fences. Feature extraction is implemented with the Canny edge detection and Hough transform algorithms. We applied the Hough transform to the edge image produced by the Canny algorithm to determine the locations of the target features. However, the Hough transform requires different parameter settings for each survey sector. For image segmentation, we applied the connected-component labeling algorithm and object-based image analysis using Orfeo Toolbox (OTB) in QGIS. The connected-component labeled image shows that the signals associated with the target buried relics are effectively connected and labeled. However, multiple labels are often assigned to a single structure in the given GPR data. Object-based image analysis was conducted using Large-Scale Mean-Shift (LSMS) image segmentation. In this analysis, a vector layer containing pixel values for each segmented polygon was estimated first and then used to build a train-validation dataset by assigning the polygons to one class for the buried relics and another class for the background field. With the Random Forest Classifier, the polygons in the LSMS segmentation layer can be successfully classified into those of the buried relics and those of the background. Thus, we propose that the automatic classification methods applied to GPR images of buried cultural heritage in this study can be useful for obtaining consistent analysis results when planning excavations.
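
The observation that a single structure often receives several labels follows from how connected-component labeling works: any gap in the signal splits one physical feature into separate components. The following is a minimal sketch of that effect, assuming 4-connectivity and a toy binary grid; the function `label_components` and the data are illustrative and are not the study's implementation.

```python
from collections import deque


def label_components(grid):
    """Label 4-connected foreground regions of a binary grid; returns (labels, count)."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or labels[r][c] != 0:
                continue  # background, or already part of a labeled region
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of this region
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] == 1 and labels[ny][nx] == 0:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# A wall-like linear feature whose signal drops out in one column.
wall = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
]
labels, count = label_components(wall)
```

Here `count` is 2 even though the row represents one wall, because the single missing pixel separates it into two components. Closing such gaps before labeling, or grouping labels afterwards, would be one way to address this, though the abstract does not say which approach the authors took.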