• Title/Summary/Keyword: Object detection optimization method


Image Stitching focused on Priority Object using Deep Learning based Object Detection (딥러닝 기반 사물 검출을 활용한 우선순위 사물 중심의 영상 스티칭)

  • Rhee, Seongbae; Kang, Jeonho; Kim, Kyuheon
    • Journal of Broadcast Engineering / v.25 no.6 / pp.882-897 / 2020
  • Recently, the use of immersive media content such as panorama and 360° video has been increasing. Because a single conventional camera has a limited viewing angle, such content is usually produced by image stitching, which combines images captured by multiple cameras into one image with a wide field of view. However, when the parallax between the cameras is large, parallax distortion may appear in the stitched image and disturb the viewer's immersion, so a stitching method that overcomes parallax distortion is required. Existing seam-optimization-based stitching methods address parallax distortion by reflecting object location information in an energy function or in object segment information, but their applicability can be limited by the initial seam location, the background, the performance of the object detector, and the placement of objects. Therefore, in this paper we propose an image stitching method that overcomes these limitations by using deep-learning-based object detection to add a weight, set differently according to the type of object, to the energy value.
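To make the weighting idea concrete, below is a minimal sketch (not the authors' implementation) of adding class-dependent penalties from a generic detector's boxes to a gradient-based seam energy map; the class names, weight values, and detection format are assumptions for illustration. A dynamic-programming seam search over this map would then prefer paths through the background and low-priority objects.

```python
import numpy as np

# Hypothetical per-class weights: larger values push the seam away from that object type.
CLASS_WEIGHTS = {"person": 10.0, "car": 5.0}

def weighted_seam_energy(gray, detections):
    """Base seam energy (gradient magnitude) plus class-dependent object penalties.

    gray       : 2-D float array, grayscale overlap region of the two images
    detections : list of (class_name, (x1, y1, x2, y2)) boxes from any deep-learning detector
    """
    gy, gx = np.gradient(gray)
    energy = np.abs(gx) + np.abs(gy)          # plain gradient-magnitude energy
    for cls, (x1, y1, x2, y2) in detections:
        # Add the weight over the detected box so the seam avoids important objects first.
        energy[y1:y2, x1:x2] += CLASS_WEIGHTS.get(cls, 1.0)
    return energy
```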

Research on Artificial Intelligence Based De-identification Technique of Personal Information Area at Video Data (영상데이터의 개인정보 영역에 대한 인공지능 기반 비식별화 기법 연구)

  • In-Jun Song; Cha-Jong Kim
    • IEMEK Journal of Embedded Systems and Applications / v.19 no.1 / pp.19-25 / 2024
  • This paper proposes an artificial-intelligence-based method for optimizing the detection of personal-information regions in an embedded system, in order to de-identify personal information in video data. To raise the detection rate for personal-information regions, a gyro sensor first collects the shooting angle while the video is captured, and the collected angle is used to rotate each frame into a horizontal image. On this basis, learning models were built while varying the image resolution of the training data and the training method of the learning engine, and the optimal model was selected and evaluated experimentally. For de-identification, a shuffling-based masking method was used, and the masking information was encrypted with a double key to prevent restoration by others; the original image can still be recovered with a security key when reuse is required. This secures the personal-information regions while preserving usability through restoration of the original image. The results are expected to contribute to the industrial use of video data without leakage of personal information and to reducing the cost of personal-information protection in industrial fields that use video.
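As a rough illustration of the horizontal-correction step described above, the sketch below rotates a frame by the roll angle reported by a gyro sensor. The assumption that the gyro provides a roll angle in degrees, and the sign convention, are mine rather than the paper's.

```python
import cv2

def level_frame(frame, roll_deg):
    """Rotate a frame so the horizon is level, given the camera roll angle (degrees)
    collected from the device's gyro sensor at capture time."""
    h, w = frame.shape[:2]
    center = (w / 2.0, h / 2.0)
    # Rotating by the negative roll angle cancels the camera tilt (sign convention assumed).
    m = cv2.getRotationMatrix2D(center, -roll_deg, 1.0)
    return cv2.warpAffine(frame, m, (w, h))
```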

Active Contour Model Based Object Contour Detection Using Genetic Algorithm with Wavelet Based Image Preprocessing

  • Mun, Kyeong-Jun; Kang, Hyeon-Tae; Lee, Hwa-Seok; Yoon, Yoo-Sool; Lee, Chang-Moon; Park, June-Ho
    • International Journal of Control, Automation, and Systems / v.2 no.1 / pp.100-106 / 2004
  • In this paper, we present a novel, rapid approach to detecting the boundaries of brain tumors and deformities in medical images using a genetic algorithm with wavelet-based preprocessing. Contour detection is formulated as an optimization problem that seeks the contour of the object by minimizing an energy function based on an active contour model. However, the segmentation contour of a brain tumor cannot be detected when a stronger intensity gradient exists elsewhere than at the tumor or deformity of interest, so we propose a method for discerning brain tumors and deformities from unwanted adjacent tissues. The proposed method is suitable for medical image analysis because an exact contour of the tumor and deformities supports precise diagnosis.
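For reference, a generic active-contour (snake) energy of the kind such a genetic algorithm would minimize might look like the sketch below; the specific internal and external terms, weights, and GA encoding used in the paper are not reproduced here.

```python
import numpy as np

def contour_energy(points, grad_mag, alpha=1.0, beta=1.0):
    """Snake-style energy of a closed contour given as an (N, 2) array of (x, y) points.

    Internal term : squared distance between neighbouring points (smoothness).
    External term : negative image gradient magnitude sampled at the points,
                    so the energy drops when the contour sits on strong edges.
    A genetic algorithm can minimize this by evolving candidate point sets.
    """
    nxt = np.roll(points, -1, axis=0)
    internal = np.sum((nxt - points) ** 2)
    xs = np.clip(points[:, 0].astype(int), 0, grad_mag.shape[1] - 1)
    ys = np.clip(points[:, 1].astype(int), 0, grad_mag.shape[0] - 1)
    external = -np.sum(grad_mag[ys, xs])
    return alpha * internal + beta * external
```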

A Segmentation Method for a Moving Object on A Static Complex Background Scene. (복잡한 배경에서 움직이는 물체의 영역분할에 관한 연구)

  • Park, Sang-Min; Kwon, Hui-Ung; Kim, Dong-Sung; Jeong, Kyu-Sik
    • The Transactions of the Korean Institute of Electrical Engineers A / v.48 no.3 / pp.321-329 / 1999
  • Moving-object segmentation extracts a moving object of interest from consecutive image frames and has been used for factory automation, autonomous navigation, video surveillance, and VOP (Video Object Plane) detection in MPEG-4. This paper proposes a new segmentation method in which difference images computed from three consecutive input frames are used to obtain both a coarse object area (AI) and its movement area (OI). The AI is extracted by removing the background using background area projection (BAP), and parts missing from the AI are recovered with the help of the OI. The boundary information of the OI confines the missing parts of the object and provides initial curves for active contour optimization. The optimized contours, combined with the AI, form the boundaries of the moving object. Experimental results for a fast-moving object on a complex background scene are included.

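The movement-area idea in the abstract above resembles classic three-frame differencing; the sketch below shows only that generic technique (the paper's AI/OI computation and background area projection are not reproduced), and the threshold value is an arbitrary placeholder.

```python
import cv2

def three_frame_motion_mask(prev_f, cur_f, next_f, thresh=25):
    """Generic three-frame differencing: a pixel is marked as moving only if the middle
    frame differs from both its neighbours (all frames grayscale, same size)."""
    d1 = cv2.absdiff(cur_f, prev_f)
    d2 = cv2.absdiff(cur_f, next_f)
    _, m1 = cv2.threshold(d1, thresh, 255, cv2.THRESH_BINARY)
    _, m2 = cv2.threshold(d2, thresh, 255, cv2.THRESH_BINARY)
    return cv2.bitwise_and(m1, m2)   # coarse movement area around the middle frame
```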

Online Video Synopsis via Multiple Object Detection

  • Lee, JaeWon; Kim, DoHyeon; Kim, Yoon
    • Journal of the Korea Society of Computer and Information / v.24 no.8 / pp.19-28 / 2019
  • In this paper, an online video summarization algorithm based on multiple object detection is proposed. As crime has risen with rapid urbanization, the demand for public safety has grown and the installation of surveillance cameras such as closed-circuit television (CCTV) has increased in many cities. However, retrieving and analyzing the huge amount of video data from numerous CCTVs takes considerable time and labor, so there is a growing demand for intelligent video recognition systems that can automatically detect and summarize the events occurring on CCTV. Video summarization generates a synopsis of a long original video so that users can watch it in a short time. The proposed method consists of two stages: the object extraction step detects objects in the video and extracts the specific objects desired by the user, and the video summary step creates the final synopsis video from the extracted objects. Whereas existing methods do not consider the interaction between objects in the original video when generating the synopsis, the proposed object clustering algorithm effectively preserves those interactions in the synopsis video. This paper also proposes an online optimization method that can efficiently summarize the large number of objects appearing in long videos. Experimental results show that the proposed method outperforms the existing video synopsis algorithm.
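One simple way to keep interacting objects together when their tracks are rearranged for a synopsis is to group tracks that overlap in space and time, as sketched below. This is only an illustration of the idea, not the clustering algorithm proposed in the paper; the track format and IoU threshold are assumptions.

```python
from collections import defaultdict

def group_interacting_tracks(tracks, iou_thresh=0.1):
    """Greedily group object tracks so that tracks whose boxes overlap in any shared
    frame end up in the same cluster (clusters can then be time-shifted together
    when building the synopsis, preserving interactions).

    tracks : dict track_id -> {frame_idx: (x1, y1, x2, y2)}
    Returns a list of sets of track ids.
    """
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        union = area(a) + area(b) - inter
        return inter / union if union else 0.0

    ids = list(tracks)
    parent = {i: i for i in ids}            # union-find structure over track ids

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            a, b = tracks[ids[i]], tracks[ids[j]]
            shared = set(a) & set(b)        # frames where both tracks are present
            if any(iou(a[f], b[f]) >= iou_thresh for f in shared):
                parent[find(ids[i])] = find(ids[j])

    clusters = defaultdict(set)
    for t in ids:
        clusters[find(t)].add(t)
    return list(clusters.values())
```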

Computer Vision-based Continuous Large-scale Site Monitoring System through Edge Computing and Small-Object Detection

  • Kim, Yeonjoo; Kim, Siyeon; Hwang, Sungjoo; Hong, Seok Hwan
    • International conference on construction engineering and project management / 2022.06a / pp.1243-1244 / 2022
  • In recent years, the growing interest in off-site construction has led factories in the construction sector to scale up their manufacturing and production processes. Consequently, continuous large-scale site monitoring in low-variability environments, such as prefabricated-component production plants (precast concrete production), has become increasingly important. Although many studies on computer-vision-based site monitoring have been conducted, challenges remain in deploying this technology for large-scale field applications. One issue is collecting and transmitting vast amounts of video data: continuous monitoring relies on real-time video collection and analysis, which requires excessive computational resources and network traffic. Another is integrating objects of different sizes and scales into a single scene. A plant production environment contains objects of various sizes and types (e.g., workers, heavy equipment, and materials), and they must be detected simultaneously for effective monitoring; with existing object detection algorithms, however, it is difficult to detect objects with large size differences at once because doing so requires collecting and training on massive amounts of object images at various scales. This study therefore developed a large-scale site monitoring system that combines edge computing with small-object detection to solve these problems. Edge computing is a distributed information-technology architecture in which image or video data are processed near their source rather than on a centralized server or cloud; by running inference on the AI computing modules attached to the CCTVs and sending only the processed information to the server, excessive network traffic can be avoided. Small-object detection here refers to detecting objects of different sizes by cropping the raw image, with the number of rows and columns for splitting chosen according to the target object size, so that small objects can be detected in the cropped and magnified images and then expressed in the original image. For inference, this study used the YOLO-v5 algorithm, which is known for its fast processing speed and widely used for real-time object detection. The method effectively detected large objects as well as small objects that were difficult to detect with existing algorithms. When the large-scale site monitoring system was tested, it performed well in detecting small objects, such as workers in a large-scale view of a construction site, which existing algorithms detected inaccurately. Our next goal is to incorporate safety monitoring and risk analysis algorithms into the system, such as collision-risk estimation based on the time-to-collision concept, and to optimize safe routes by accumulating workers' paths and inferring risky areas from their trajectory patterns. Through such developments, this continuous large-scale site monitoring system can guide a construction plant's safety management more effectively.

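A minimal sketch of the tile-and-detect idea described above is given below; `detect_fn` stands in for any detector (e.g. a YOLOv5 wrapper) returning boxes in crop coordinates, and the row/column counts, non-overlapping tiling, and absence of cross-tile non-maximum suppression are simplifications.

```python
def detect_with_tiling(image, detect_fn, rows=2, cols=3):
    """Split the frame into rows x cols crops, run the detector on each crop, and map
    the resulting boxes back to full-image coordinates so small objects can be found
    in the magnified crops. Remainder pixels at the right/bottom edge are ignored in
    this sketch, and results would normally be merged with non-maximum suppression.
    """
    h, w = image.shape[:2]
    th, tw = h // rows, w // cols
    results = []
    for r in range(rows):
        for c in range(cols):
            y0, x0 = r * th, c * tw
            crop = image[y0:y0 + th, x0:x0 + tw]
            for (x1, y1, x2, y2, score, cls) in detect_fn(crop):
                results.append((x1 + x0, y1 + y0, x2 + x0, y2 + y0, score, cls))
    return results
```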

Performance Evaluation of YOLOv5 Model according to Various Hyper-parameters in Nuclear Medicine Phantom Images (핵의학 팬텀 영상에서 초매개변수 변화에 따른 YOLOv5 모델의 성능평가)

  • Min-Gwan Lee; Chanrok Park
    • Journal of the Korean Society of Radiology / v.18 no.1 / pp.21-26 / 2024
  • One of the best-known deep learning models for object detection is the you-only-look-once version 5 (YOLOv5) framework, which is based on a one-stage architecture and achieves accurate lesion detection through its bottleneck CSP layers and skip connections. The purpose of this study was to evaluate the performance of the YOLOv5 framework under various hyperparameters on positron emission tomography (PET) phantom images. The dataset of 500 slices was obtained from the QIN PET segmentation challenge, and ground-truth bounding boxes were generated with the labelImg software. The hyperparameters for network training were varied over the optimization function (SGD, Adam, and AdamW), the activation function (SiLU, LeakyReLU, Mish, and Hardswish), and the YOLOv5 model size (nano, small, large, and xlarge). Intersection over union (IoU) was used for performance evaluation. As a result, the best performance was obtained with AdamW as the optimization function, Hardswish as the activation function, and the nano model size. In conclusion, we confirmed the usefulness of the YOLOv5 network for object detection in nuclear medicine images.
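The IoU metric used in the evaluation is the standard ratio of the overlap to the union of the predicted and ground-truth boxes, for example:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```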

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun; Yang, Seong-Hun; Oh, Seung-Jin; Kang, Jinbeom
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly, and with it the requirements for analysis and utilization. Because many industries lack the skilled manpower to analyze video, machine learning and artificial intelligence are actively used to assist. In this situation, the demand for computer vision technologies such as object detection and tracking, action detection, emotion detection, and re-identification (Re-ID) has also grown rapidly. However, object detection and tracking suffer from many conditions that degrade performance, such as occlusion and the re-appearance of an object after it leaves the recording location; consequently, action and emotion detection models built on top of them also have difficulty extracting data for each object. In addition, deep learning pipelines composed of several models suffer from performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and the AWS Rekognition emotion recognition service. The proposed system uses single-linkage hierarchical clustering for Re-ID together with processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, offers near-real-time processing, and prevents tracking failures caused by object departure and re-appearance, occlusion, and similar events. By continuously linking the action and facial emotion detection results of each object to the same identity, videos can be analyzed efficiently. The re-identification model extracts a feature vector from the bounding box of each object detected by the tracking model in every frame and applies single-linkage hierarchical clustering over the feature vectors from past frames to identify objects whose tracks were lost. Through this process, an object that re-appears after leaving the scene or being occluded can be re-tracked, and the action and facial emotion results of the newly recognized object can be linked to those of the object that appeared earlier. To improve processing performance, we introduce a per-object bounding-box queue and a feature queue that reduce RAM requirements while maximizing GPU memory throughput, and we introduce the IoF (Intersection over Face) algorithm, which links the facial emotions recognized by AWS Rekognition to the object tracking information. The academic significance of this study is that a two-stage re-identification model can run in near real time, without sacrificing accuracy to simple metrics, even in the high-cost setting of simultaneous action and facial emotion detection. The practical implication is that industrial fields that require action and facial emotion detection but struggle with tracking failures can analyze video effectively with the proposed system. The proposed model, with its high re-identification accuracy and processing performance, can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where integrating tracking information with extracted metadata creates great industrial and business value. In future work, to measure tracking performance more precisely, we plan to run experiments on the MOT Challenge dataset, which is used by many international conferences, to investigate the cases the IoF algorithm cannot handle and develop a complementary algorithm, and to apply the model to datasets from various fields related to intelligent video analysis.
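The abstract does not spell out the IoF formula; one plausible reading of "Intersection over Face", sketched below, normalizes the overlap between the face box and the tracked object box by the face-box area. This interpretation is an assumption, not the paper's definition.

```python
def iof(face_box, object_box):
    """Intersection over Face: overlap area divided by the face-box area, so a face fully
    contained in a tracked person box scores 1.0 regardless of the person box size.
    (Assumed interpretation of the paper's IoF; the exact definition may differ.)
    Boxes are (x1, y1, x2, y2)."""
    ix1, iy1 = max(face_box[0], object_box[0]), max(face_box[1], object_box[1])
    ix2, iy2 = min(face_box[2], object_box[2]), min(face_box[3], object_box[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    face_area = (face_box[2] - face_box[0]) * (face_box[3] - face_box[1])
    return inter / face_area if face_area else 0.0
```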

Feature Detection using Measured 3D Data and Image Data (3차원 측정 데이터와 영상 데이터를 이용한 특징 형상 검출)

  • Kim, Hansol; Jung, Keonhwa; Chang, Minho; Kim, Junho
    • Journal of the Korean Society for Precision Engineering / v.30 no.6 / pp.601-606 / 2013
  • 3D scanning is a technique for measuring the 3D shape information of an object. The shape information obtained by 3D scanning is expressed either as a point cloud or as polygon mesh data and is widely used in areas such as reverse engineering and quality inspection. 3D scanning should be performed as accurately as possible, and detecting the features on an object from the scanned data is essential for scanning the object's shape more precisely. In this study, we propose a method for locating features more accurately, based on an extended Biplane SNAKE with global optimization. In each iteration, the feature lines obtained by the extended Biplane SNAKE are projected into each image plane and moved toward the features in the corresponding image. We applied this approach to real models to verify the proposed optimization algorithm.
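The projection step mentioned above (projecting 3-D feature lines into each image plane) corresponds to a standard pinhole-camera projection, sketched below with assumed intrinsics K and pose (R, t); the extended Biplane SNAKE optimization itself is not reproduced.

```python
import numpy as np

def project_points(points_3d, K, R, t):
    """Project 3-D feature-line points into an image plane with a pinhole camera model.

    points_3d : (N, 3) array of points in world coordinates
    K         : (3, 3) intrinsic matrix
    R, t      : rotation (3, 3) and translation (3,) taking world to camera coordinates
    Returns an (N, 2) array of pixel coordinates.
    """
    cam = points_3d @ R.T + t          # world -> camera coordinates
    uvw = cam @ K.T                    # camera -> homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]    # perspective divide
```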

Optical System Design for Thermal Target Recognition by Spiral Scanning [TRSS]

  • Kim, Jai-Soon; Yoon, Jin-Kyung; Lee, Ho-Chan; Lee, Jai-Hyung; Kim, Hye-Kyung; Lee, Seung-Churl; Ahn, Keun-Ok
    • Journal of the Optical Society of Korea / v.8 no.4 / pp.174-181 / 2004
  • Various systems have been developed that perform target recognition and position detection simultaneously using infrared detectors. In this paper, the detection system TRSS (Thermal target Recognition by Spiral Scanning) adopts a linear-array uncooled IR detector and uses a fast spiral scanning method to detect the relative position of target objects that radiate in the IR wavelength region. It can detect the thermal energy radiating from a 9 m target object at distances up to 200 m, and at the minimum approach distance of 50 m the detector's maximum field is fully filled by a target object of the same size. We investigate two types of lens systems, a singlet and a doublet, each including one aspheric surface and a freely positioned aperture stop. Many designs of an F/1.5 system with a ±5.2° field at EFL = 20 mm and 30 mm for the single-element and double-element lens systems, respectively, are compared in terms of resolution performance (MTF) as the aspheric surface and stop position are varied during optimization. The optimum design is established taking mechanical boundary conditions and manufacturing considerations into account.