• Title/Summary/Keyword: object-based image classification (객체기반 영상분류)

A Study on Residual U-Net for Semantic Segmentation based on Deep Learning (딥러닝 기반의 Semantic Segmentation을 위한 Residual U-Net에 관한 연구)

  • Shin, Seokyong;Lee, SangHun;Han, HyunHo
    • Journal of Digital Convergence / v.19 no.6 / pp.251-258 / 2021
  • This paper proposes an encoder-decoder model that uses residual learning to improve the accuracy of U-Net-based semantic segmentation. U-Net is a deep learning-based semantic segmentation method used mainly in applications such as autonomous vehicles and medical image analysis. Because its encoder is shallow, the conventional U-Net loses features during feature compression. This loss deprives the model of the context information needed to classify objects and reduces segmentation accuracy. To address this, the proposed method extracts context information efficiently through an encoder that uses residual learning, which helps prevent the feature loss and vanishing-gradient problems of the conventional U-Net. The number of down-sampling operations in the encoder is also reduced to limit the loss of spatial information in the feature maps. In experiments on the Cityscapes dataset, the proposed method improved segmentation results by about 12% compared with the conventional U-Net.
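
The residual learning mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' network: it assumes small dense layers in place of convolutions, and the names `residual_block`, `w1`, and `w2` are hypothetical. It only shows the skip connection `y = relu(x + F(x))`, which lets the input features pass through even when the learned branch `F` contributes nothing.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(w: Matrix, v: Vector) -> Vector:
    """Matrix-vector product; each row of w yields one output value."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def residual_block(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """Compute relu(x + W2 * relu(W1 * x)).

    The skip connection adds the input x back to the branch output,
    so the features in x are preserved when the branch output is zero.
    Assumes w1 and w2 are square so the shapes of x and F(x) match.
    """
    branch = linear(w2, relu(linear(w1, x)))
    return relu([xi + fi for xi, fi in zip(x, branch)])


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(residual_block([1.0, 2.0], zeros, zeros))        # branch is zero
    print(residual_block([1.0, 2.0], identity, identity))  # branch equals x
```

With all-zero branch weights the block returns its non-negative input unchanged, which is the property that eases gradient flow through deeper encoders.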

Object Tracking Based on Exactly Reweighted Online Total-Error-Rate Minimization (정확히 재가중되는 온라인 전체 에러율 최소화 기반의 객체 추적)

  • JANG, Se-In;PARK, Choong-Shik
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.53-65 / 2019
  • Object tracking is an important step in building video-based surveillance systems and is considered as essential a task as object detection and recognition. Various machine learning methods (e.g., least squares, the perceptron, and support vector machines) can be applied to different tracking system designs. Generative methods (e.g., principal component analysis) have commonly been used because of their simplicity and effectiveness, but they focus only on modeling the target object. Because of this limitation, discriminative methods (e.g., binary classification) were adopted to distinguish the target object from the background. Among machine learning methods for binary classification, total error rate minimization has proven successful. It can reach a global minimum because it uses a quadratic approximation to a step function, whereas other methods (e.g., support vector machines) seek local minima using nonlinear functions (e.g., the hinge loss). This quadratic approximation gives total error rate minimization favorable properties for solving the optimization problems of binary classification. However, total error rate minimization was originally formulated in a batch setting, which restricts it to offline learning applications. With limited computing resources, offline learning cannot handle large-scale data sets. Online learning, by contrast, can update its solution without storing all training samples during learning. As large-scale data sets proliferate, online learning has become an essential property for many applications.
Since object tracking must handle data samples in real time, online total error rate minimization methods are needed to address tracking problems efficiently. An online version of total error rate minimization was developed for this purpose, but it relies on an approximate reweighting technique. Despite the approximation, this online version achieved good performance in biometric applications. However, the method assumes that total error rate minimization is achieved only asymptotically, as the number of training samples approaches infinity. Even under this assumption, the approximation can keep accumulating learning errors as the number of training samples grows. As a result, the approximate online solution can converge to a wrong solution, which can cause significant errors when applied to surveillance systems. In this paper, we propose an exact reweighting technique that recursively updates the solution of total error rate minimization in an online manner. In contrast to the approximately reweighted online method, this yields an exactly reweighted online total error rate minimization. The proposed exact online learning method is then applied to object tracking problems. Our object tracking system adopts particle filtering, and its observation model combines generative and discriminative methods to leverage the advantages of both. In our experiments, the proposed object tracking system achieves promising performance on 8 public video sequences compared with competing object tracking systems. A paired t-test is also reported to evaluate the quality of the results.
The proposed online learning method can be extended to deep learning architectures, covering both shallow and deep networks. Moreover, other online learning methods that require exact reweighting can adopt the proposed reweighting technique. Beyond object tracking, the proposed method can readily be applied to object detection and recognition. The proposed methods can therefore contribute to the online learning community as well as to the object tracking, detection, and recognition communities.

Analysis on Topographic Normalization Methods for 2019 Gangneung-East Sea Wildfire Area Using PlanetScope Imagery (2019 강릉-동해 산불 피해 지역에 대한 PlanetScope 영상을 이용한 지형 정규화 기법 분석)

  • Chung, Minkyung;Kim, Yongil
    • Korean Journal of Remote Sensing / v.36 no.2_1 / pp.179-197 / 2020
  • Topographic normalization reduces terrain effects on reflectance by adjusting the brightness values of image pixels so that pixels covering the same land cover have equal values. Topographic effects are induced by the imaging conditions and tend to be large in high mountainous regions. Image analysis of mountainous terrain, such as wildfire damage assessment, therefore requires appropriate topographic normalization to yield accurate results. However, most previous studies evaluated topographic normalization on satellite images with moderate to low spatial resolution, and the alleviation of topographic effects on multi-temporal high-resolution images has not been sufficiently addressed. In this study, terrain normalization was evaluated for each band to select the optimal combination of techniques for rapid and accurate wildfire damage assessment using PlanetScope images. PlanetScope has considerable potential in disaster management because it supports rapid image acquisition, providing daily 3 m resolution images with global coverage. To compare topographic normalization techniques, seven widely used methods were applied to both pre-fire and post-fire images. The analysis of the bi-temporal images suggests an optimal combination of techniques that can be applied to images with different land-cover compositions. A vegetation index was then calculated from the images normalized with the proposed method. Wildfire damage detection results were obtained by thresholding the index and showed improved detection accuracy for both object-based and pixel-based image analysis. In addition, a burn severity map was constructed to verify the effects of topographic correction on a continuous distribution of brightness values.
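
The abstract does not name the seven methods it compares. As an illustration only, the sketch below shows the cosine correction, a commonly cited topographic normalization, for a single pixel. The function names and the `min_cos_i` clamp (which avoids division by very small values on steeply shaded slopes) are assumptions for this example and are not taken from the paper.

```python
import math


def cos_incidence(solar_zenith_deg: float, solar_azimuth_deg: float,
                  slope_deg: float, aspect_deg: float) -> float:
    """Cosine of the local solar incidence angle on a tilted surface."""
    sz = math.radians(solar_zenith_deg)
    sl = math.radians(slope_deg)
    daz = math.radians(solar_azimuth_deg - aspect_deg)
    return (math.cos(sz) * math.cos(sl)
            + math.sin(sz) * math.sin(sl) * math.cos(daz))


def cosine_correction(reflectance: float, solar_zenith_deg: float,
                      solar_azimuth_deg: float, slope_deg: float,
                      aspect_deg: float, min_cos_i: float = 0.05) -> float:
    """Scale reflectance to what a flat surface would show.

    corrected = reflectance * cos(solar zenith) / cos(incidence angle)
    min_cos_i is a hypothetical safety clamp, not part of the formula.
    """
    cos_i = max(cos_incidence(solar_zenith_deg, solar_azimuth_deg,
                              slope_deg, aspect_deg), min_cos_i)
    return reflectance * math.cos(math.radians(solar_zenith_deg)) / cos_i


if __name__ == "__main__":
    # Same observed reflectance on a sun-facing and a shaded 20-degree slope.
    print(cosine_correction(0.2, 40.0, 150.0, 20.0, 150.0))  # darkened
    print(cosine_correction(0.2, 40.0, 150.0, 20.0, 330.0))  # brightened
```

Sun-facing slopes are scaled down and shaded slopes are scaled up, while flat terrain is left unchanged. In practice the same operation would be applied per band and per pixel using slope and aspect rasters derived from an elevation model.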

2-Stage Detection and Classification Network for Kiosk User Analysis (디스플레이형 자판기 사용자 분석을 위한 이중 단계 검출 및 분류 망)

  • Seo, Ji-Won;Kim, Mi-Kyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.5 / pp.668-674 / 2022
  • Machine learning techniques that use visual data are highly useful in industry and service fields such as scene recognition, fault detection, security, and user analysis. Among these, user analysis based on CCTV video is one of the most practical uses of vision data. Many studies on lightweight artificial neural networks have also been published to improve usability in mobile and embedded environments. In this study, we propose a network that combines object detection and classification for mobile graphics processing units. The network detects pedestrians and faces, and classifies age and gender from each detected face. The proposed network is built on MobileNet, YOLOv2, and skip connections. The detection and classification models are trained individually and then combined into a 2-stage structure. An attention mechanism is also used to improve detection and classification ability. An Nvidia Jetson Nano is used to run and evaluate the proposed system.

Development of A Multi-sensor Fusion-based Traffic Information Acquisition System with Robust to Environmental Changes using Mono Camera, Radar and Infrared Range Finder (환경변화에 강인한 단안카메라 레이더 적외선거리계 센서 융합 기반 교통정보 수집 시스템 개발)

  • Byun, Ki-hoon;Kim, Se-jin;Kwon, Jang-woo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.16 no.2 / pp.36-54 / 2017
  • The purpose of this paper is to develop a multi-sensor fusion-based traffic information acquisition system that is robust to environmental changes. The system combines the characteristics of each sensor and is more robust to environmental changes than a video detector. Moreover, it is not affected by the time of day, and it has a lower maintenance cost than an inductive-loop traffic detector. This is accomplished by combining radar-based object tracking information, video detector-based vehicle classification information, and reliable object detections from an infrared range finder. To demonstrate the effectiveness of the proposed system, experiments were conducted for 6 hours over 5 days, during the daytime and early evening, on a pedestrian-accessible road. According to the experimental results, the system achieves 88.7% classification accuracy and a 95.5% vehicle detection rate. If the system's parameters are optimized to adapt to changes in the experimental environment, it is expected to contribute to the advancement of ITS.

Development of Open Set Recognition-based Multiple Damage Recognition Model for Bridge Structure Damage Detection (교량 구조물 손상탐지를 위한 Open Set Recognition 기반 다중손상 인식 모델 개발)

  • Kim, Young-Nam;Cho, Jun-Sang;Kim, Jun-Kyeong;Kim, Moon-Hyun;Kim, Jin-Pyung
    • KSCE Journal of Civil and Environmental Engineering Research / v.42 no.1 / pp.117-126 / 2022
  • Bridge structures in Korea are continuously increasing in number and size, and the number of old bridges that have been in service for more than 30 years is also rising steadily. Bridge aging is treated as a serious social problem not only in Korea but around the world, and the existing manpower-centered inspection method is revealing its limitations. Various bridge damage detection studies using deep learning-based image processing algorithms have recently been conducted, but because of the limitations of bridge damage data sets, most of them are restricted to a single damage type, cracks, and rely on closed-set classification models. When such a detection method is applied to actual bridge images, serious misrecognition can occur because of input images from unknown classes, such as backgrounds or other objects. In this study, five types of bridge damage, including cracks, were defined and a data set was built. A deep learning model was trained on it, and an open set recognition-based multiple damage recognition model for bridges was constructed by applying the OpenMax algorithm. Classification and recognition performance were then evaluated on an open set that included untrained images, and the results were analyzed.

Standardization Strategy on 3D Animation Contents (3D 애니메이션 콘텐츠의 SCORM 기반 표준화 전략)

  • Jang, Jae-Kyung;Kim, Sun-Hye;Kim, Ho-Sung
    • Proceedings of the Korea Contents Association Conference / 2006.11a / pp.218-222 / 2006
  • In producing 3D animation with digital technology, productivity and reusability need to be increased by managing the production pipeline systematically through the standardization of animation content. To this end, we aim to develop an animation content management system that can manage all kinds of information in the production pipeline, based on the e-learning standard SCORM and taking production, publication, and re-editing into account. A scene, as the unit of visual semantics, is standardized into an object that contains metadata about the scene's place, cast, weather, season, time, and viewpoint. The content metadata includes a great deal of information on copyright, publication, description, and so on, so it plays an important role in management and publication. If an effective metadata management system, such as an ontology, is implemented, multimedia content can be searched powerfully, which in turn will encourage the production and publication of UCC. Using the metadata of content objects, users and producers can easily search for and reuse content. They can then choose content objects according to their preferences and produce their own creative animation by reorganizing and packaging the selected objects.

Deep Learning Based Rescue Requesters Detection Algorithm for Physical Security in Disaster Sites (재난 현장 물리적 보안을 위한 딥러닝 기반 요구조자 탐지 알고리즘)

  • Kim, Da-hyeon;Park, Man-bok;Ahn, Jun-ho
    • Journal of Internet Computing and Services / v.23 no.4 / pp.57-64 / 2022
  • If the inside of a building collapses due to a disaster such as fire, collapse, or natural disaster, the physical security inside the building is likely to become ineffective. In this situation, physical security is needed to minimize human casualties and physical damage in the collapsed building. This paper therefore proposes an algorithm that minimizes damage in a disaster situation by fusing existing research, which detects obstacles and collapsed areas in a building, with a deep learning-based object detection algorithm that minimizes human casualties. The existing research uses a single camera to determine whether the corridor in which the robot is currently located has collapsed and to detect obstacles that interfere with search and rescue operations. Objects inside the collapsed building have irregular shapes because of debris or the collapse itself, and they are classified and detected as obstacles. We also propose a method to detect rescue requesters, the most important resource in a disaster situation, and thereby minimize human casualties. To this end, we collected open-source disaster images and image data of disaster situations, and calculated the accuracy with which various deep learning-based object detection algorithms detect rescue requesters. Our analysis of these algorithms found that YOLOv4 achieves an accuracy of 0.94, indicating that it is the most suitable for use in actual disaster situations. This paper will help in performing efficient search and rescue in disaster situations and in achieving a high level of physical security, even in collapsed buildings.

Automatic Facial Expression Recognition using Tree Structures for Human Computer Interaction (HCI를 위한 트리 구조 기반의 자동 얼굴 표정 인식)

  • Shin, Yun-Hee;Ju, Jin-Sun;Kim, Eun-Yi;Kurata, Takeshi;Jain, Anil K.;Park, Se-Hyun;Jung, Kee-Chul
    • Journal of Korea Society of Industrial Information Systems / v.12 no.3 / pp.60-68 / 2007
  • In this paper, we propose an automatic facial expression recognition system that analyzes facial expressions (happiness, disgust, surprise, and neutral) using tree structures based on heuristic rules. The facial region is first obtained using a skin-color model and connected-component (CC) analysis. The positions of the user's eyes are then localized using a neural network (NN)-based texture classifier, and the remaining facial features are localized using heuristics. After the facial features are detected, facial expression recognition is performed using a decision tree. To assess the validity of the proposed system, we tested it on 180 facial images from the MMI, JAFFE, and VAK databases. The results show that the system achieves an accuracy of 93%.

Optimization of Deep Learning Model Based on Genetic Algorithm for Facial Expression Recognition (얼굴 표정 인식을 위한 유전자 알고리즘 기반 심층학습 모델 최적화)

  • Park, Jang-Sik
    • The Journal of the Korea institute of electronic communication sciences / v.15 no.1 / pp.85-92 / 2020
  • Deep learning shows outstanding performance in image and video analysis tasks such as object classification, object detection, and semantic segmentation. This paper analyzes how the performance of deep learning models is affected by the characteristics of the training dataset, and proposes a method for selecting the activation function and optimization algorithm of a deep learning model for facial expression classification. Classification performance is compared and analyzed by applying various algorithms for each component of the deep learning model to the CK+, MMI, and KDEF datasets. The simulation results show that a genetic algorithm can be an effective solution for optimizing the components of a deep learning model.
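
A genetic algorithm that searches over activation functions and optimizers, as the abstract describes, might look like the sketch below. Everything specific here is an assumption made for illustration: the candidate lists, the `toy_fitness` lookup table (a stand-in for the validation accuracy that a real run would obtain by training a model), and the selection, crossover, and mutation settings. None of it is taken from the paper.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical search space: one gene per model component.
ACTIVATIONS = ["relu", "tanh", "sigmoid", "elu"]
OPTIMIZERS = ["sgd", "adam", "rmsprop"]
Genome = Tuple[str, str]  # (activation, optimizer)


def toy_fitness(genome: Genome) -> float:
    """Made-up score standing in for validation accuracy."""
    act_score = {"relu": 0.30, "tanh": 0.20, "sigmoid": 0.10, "elu": 0.25}
    opt_score = {"sgd": 0.40, "adam": 0.60, "rmsprop": 0.50}
    return act_score[genome[0]] + opt_score[genome[1]]


def evolve(fitness: Callable[[Genome], float], generations: int = 20,
           pop_size: int = 8, mutation_rate: float = 0.2,
           seed: int = 0) -> Genome:
    """Return the fittest genome found by a small elitist GA."""
    rng = random.Random(seed)
    pop: List[Genome] = [(rng.choice(ACTIVATIONS), rng.choice(OPTIMIZERS))
                         for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children: List[Genome] = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child: Genome = (a[0], b[1])  # crossover: one gene from each parent
            if rng.random() < mutation_rate:  # mutate the activation gene
                child = (rng.choice(ACTIVATIONS), child[1])
            if rng.random() < mutation_rate:  # mutate the optimizer gene
                child = (child[0], rng.choice(OPTIMIZERS))
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = evolve(toy_fitness)
    print(best, toy_fitness(best))
```

Because the surviving parents are carried over unchanged, the best score in the population never decreases from one generation to the next. With a real model, `fitness` would be the expensive step, so caching the scores of genomes already evaluated would be a natural refinement.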