• Title/Summary/Keyword: Detection Model (검출 모델)

Search Results: 1,728, Processing Time: 0.031 seconds

Robust Face Detection Using Hybrid Filters and Convolutional Neural Networks (복합형 필터와 CNN 모델을 이용한 효과적인 얼굴 검출 기법)

  • Cho, Il-Gook;Park, Hyun-Jung;Kim, Ho-Joon
    • Annual Conference of KIPS
    • /
    • 2005.05a
    • /
    • pp.451-454
    • /
    • 2005
  • In this paper, we introduce a face-pattern detection technique that combines a modified CNN (Convolutional Neural Network) model with multiple filters. It targets the real-time recognition of indoor images in robot-vision applications. The multiple filters, introduced to improve the efficiency of the detection process, reduce the number and extent of candidate regions. In the proposed model, the CNN includes a Gabor Transform layer so that basic feature maps of the image are generated in the first stage of detection. For more robust detection, an illumination-correction technique is implemented as a preprocessing stage of the system. The validity of the proposed approach is examined through experiments on real images.

  • PDF
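The Gabor-transform first stage described in the abstract can be sketched in isolation. The kernel parameters below (9x9 size, wavelength, bandwidth, four orientations) are illustrative assumptions, not values reported in the paper:

```python
import numpy as np

def gabor_kernel(size=9, theta=0.0, lam=4.0, sigma=2.0, gamma=0.5):
    """Real part of a Gabor filter: a Gaussian-windowed cosine grating."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates by the filter orientation theta
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / lam)
    return envelope * carrier

def feature_map(image, kernel):
    """'Valid' 2-D correlation of the image with the kernel."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Four orientations, as a typical first-stage Gabor bank
image = np.random.default_rng(0).random((32, 32))
bank = [gabor_kernel(theta=t) for t in (0, np.pi/4, np.pi/2, 3*np.pi/4)]
maps = [feature_map(image, k) for k in bank]
print([m.shape for m in maps])  # four 24x24 feature maps
```

A full detector would feed such orientation-selective feature maps into the subsequent CNN layers rather than use them directly.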

Performance analysis of YOLOv5 and Faster R-CNN for real-time crosswalk pedestrian detection (심층 신경망을 이용한 실시간 횡단보도 보행자 검출 방법 분석)

  • Bang, Junho;Park, Min-Ki;Song, Chaeyong;Choi, Haechul
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2022.06a
    • /
    • pp.1184-1186
    • /
    • 2022
  • Various methods have been studied to prevent pedestrian traffic accidents at crosswalks. In this paper, we introduce a video-based, deep neural network crosswalk pedestrian detection method for reducing pedestrian accidents under flashing-signal conditions. We implement crosswalk pedestrian detectors based on various versions of YOLOv5 and Faster R-CNN, compare their execution times (the focus of this experiment), and evaluate their mAP@0.5 to determine the most suitable model. In terms of real-time processing, the YOLOv5s model achieved 84 fps, the best performance for real-time pedestrian detection. Since conditions at a crosswalk change rapidly, the YOLOv5s model, which recorded the fastest processing speed, is judged the most suitable for a real-time crosswalk pedestrian detection system.

  • PDF
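The fps comparison underlying this study can be reproduced with a minimal timing harness. The `dummy_detect` stand-in below is hypothetical; real measurements would call YOLOv5s or Faster R-CNN inference instead:

```python
import time

def measure_fps(detect, frames):
    """Average frames-per-second of a per-frame detector over a clip."""
    start = time.perf_counter()
    for frame in frames:
        detect(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

# Stand-in "detector": real code would run model inference here.
def dummy_detect(frame):
    time.sleep(0.001)  # pretend inference takes at least ~1 ms
    return []

fps = measure_fps(dummy_detect, frames=[None] * 50)
print(f"{fps:.1f} fps")
```

For a fair comparison, such a harness should use identical input resolution and include any pre/post-processing that runs per frame.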

Statistical Voice Activity Detector Based on Signal Subspace Model (신호 준공간 모델에 기반한 통계적 음성 검출기)

  • Ryu, Kwang-Chun;Kim, Dong-Kook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.27 no.7
    • /
    • pp.372-378
    • /
    • 2008
  • Voice activity detectors (VAD) are important in wireless communication and speech signal processing. In conventional VAD methods, an expression for the likelihood ratio test (LRT) based on statistical models is derived in the discrete Fourier transform (DFT) domain; speech or noise is then decided by comparing the value of this expression with a threshold. This paper presents a new statistical VAD method based on a signal subspace approach. Probabilistic principal component analysis (PPCA) is employed to obtain a signal subspace model that incorporates a probabilistic model of the noisy signal into the signal subspace method. The proposed approach provides a novel decision rule based on the LRT in the signal subspace domain. Experimental results show that the proposed signal subspace model based VAD method outperforms those based on the widely used Gaussian distribution in the DFT domain.
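The paper derives its LRT in a PPCA signal subspace; as a generic illustration only, the decision rule itself reduces to comparing a log-likelihood ratio against a threshold. The diagonal-Gaussian frame models below are stand-ins, not the paper's subspace models:

```python
import numpy as np

def log_gaussian(x, mean, var):
    """Log density of a diagonal Gaussian at feature vector x."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def vad_decide(frame, speech_model, noise_model, threshold=0.0):
    """LRT: declare speech if log p(x|speech) - log p(x|noise) > threshold."""
    llr = log_gaussian(frame, *speech_model) - log_gaussian(frame, *noise_model)
    return bool(llr > threshold), float(llr)

# Toy per-frame features: speech frames are louder than noise frames
speech_model = (np.array([5.0, 5.0]), np.array([1.0, 1.0]))  # (mean, var)
noise_model = (np.array([0.0, 0.0]), np.array([1.0, 1.0]))
is_speech, llr = vad_decide(np.array([4.5, 5.2]), speech_model, noise_model)
print(is_speech, round(llr, 2))
```

The threshold trades missed speech against false alarms; in practice it is tuned on held-out data.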

Rapid Detection of Important Events in Baseball Video Using Multi-Modal Analysis (멀티 모달 분석을 통한 야구 동영상에서의 실시간 중요 이벤트 검출 알고리즘)

  • Lee, Jin-Ho;Kim, Hyoung-Gook
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2009.11a
    • /
    • pp.133-136
    • /
    • 2009
  • In this paper, we propose an algorithm that detects important event scenes in baseball video in real time. The proposed algorithm analyzes the visual information to extract pitching and close-up scenes and detect play segments, and analyzes the audio information to detect audio event segments. To detect the pitching scene that opens a play segment, an offline model and an online model are used together so that high performance is maintained across diverse environments. The audio event segments, detected from intervals where the announcer's intonation and the crowd's cheering intensify, are combined with the play segments obtained from the visual analysis to raise the accuracy of important-event detection. Experiments show that the proposed algorithm needs 0.024 seconds to process one second of video and achieves a recall of 0.89 and a precision of 0.975.

  • PDF
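Recall and precision for event detection can be scored by matching detected timestamps to ground truth; the one-second matching tolerance below is an assumed parameter for illustration:

```python
def precision_recall(predicted, ground_truth, tolerance=1.0):
    """Greedily match predicted event times to ground-truth times
    within a tolerance (seconds), then score precision and recall."""
    unmatched = list(ground_truth)
    tp = 0
    for p in predicted:
        hit = next((g for g in unmatched if abs(p - g) <= tolerance), None)
        if hit is not None:
            unmatched.remove(hit)
            tp += 1
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Toy event timestamps (seconds): one missed event, one false alarm
p, r = precision_recall(predicted=[10.2, 55.0, 90.1],
                        ground_truth=[10.0, 30.0, 90.0])
print(p, r)  # 2 of 3 predictions correct, 2 of 3 events found
```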

A Bi-directional Information Learning Method Using Reverse Playback Video for Fully Supervised Temporal Action Localization (완전지도 시간적 행동 검출에서 역재생 비디오를 이용한 양방향 정보 학습 방법)

  • Huiwon Gwon;Hyejeong Jo;Sunhee Jo;Chanho Jung
    • Journal of IKEEE
    • /
    • v.28 no.2
    • /
    • pp.145-149
    • /
    • 2024
  • Recently, research on temporal action localization has been actively conducted. In this paper, unlike existing methods, we propose two approaches that learn bidirectional information by creating reverse playback videos for fully supervised temporal action localization. One approach creates training data by combining reverse playback videos with forward playback videos, while the other trains separate models on videos with different playback directions. Experiments were conducted on the THUMOS-14 dataset using TALLFormer. When both reverse and forward playback videos were used as training data, performance was 5.1% lower than that of the existing method; on the other hand, a model ensemble showed a 1.9% improvement.

A Study on Hair Line Detection Using Feature Point Matching in Hair Beauty Fashion Design (헤어 뷰티 패션 디자인 선별을 위한 특징 점 정합을 이용한 헤어 라인 검출)

  • 송선희;나상동;배용근
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.7 no.5
    • /
    • pp.934-940
    • /
    • 2003
  • In this paper, a feature-point detection system for hair beauty fashion design is proposed. Hair models and the face are represented as a graph whose nodes are placed at facial feature points labeled by their Gabor features and whose edges describe their spatial relations. A flexible feature-matching scheme is proposed to establish feature correspondence between the hair models and the input image. The matching works like a random diffusion process in the image space, employing a locally competitive and globally cooperative mechanism. The system works well on face images with complicated backgrounds, pose variations, and distortion by accessories. We demonstrate the benefits of our approach through its implementation in a face identification system.

A Lightweight Deep Learning Model for Text Detection in Fashion Design Sketch Images for Digital Transformation

  • Ju-Seok Shin;Hyun-Woo Kang
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.10
    • /
    • pp.17-25
    • /
    • 2023
  • In this paper, we propose a lightweight deep learning architecture tailored for efficient text detection in fashion design sketch images. Given the increasing prominence of Digital Transformation in the fashion industry, there is a growing emphasis on harnessing digital tools for creating fashion design sketches. As digitization becomes more pervasive in the fashion design process, the initial stages of text detection and recognition take on pivotal roles. In this study, a lightweight network was designed by building upon existing text detection deep learning models, taking into consideration the unique characteristics of apparel design drawings. Additionally, a separately collected dataset of apparel design drawings was added to the training data for the model. Experimental results underscore the performance of the proposed model, which outperforms existing text detection models by approximately 20% when applied to fashion design sketch images. As a result, this paper is expected to contribute to Digital Transformation in the field of clothing design through its work on optimizing deep learning models and detecting specialized text information.

Scream Sound Detection Based on Universal Background Model Under Various Sound Environments (다양한 소리 환경에서 UBM 기반의 비명 소리 검출)

  • Chung, Yong-Joo
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.12 no.3
    • /
    • pp.485-492
    • /
    • 2017
  • The Gaussian mixture model (GMM) has been one of the most popular methods for scream sound detection. In the conventional approach, the training data are divided into scream and non-scream sounds, and a GMM is trained for each in the training process. Motivated by the observation that scream sound detection is very similar to speaker recognition, we propose using the universal background model (UBM), which has been quite successful in speaker recognition, for scream sound detection. Experimental results show that the UBM yields better performance than the traditional GMM.
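The scream-versus-background decision reduces to an average log-likelihood ratio between two mixture models. The 1-D toy features and hand-rolled mixtures below are illustrative stand-ins; a real system would use MFCC-style features and GMMs trained (or MAP-adapted) from data:

```python
import math

def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of scalar feature x under a 1-D Gaussian mixture."""
    total = 0.0
    for w, m, v in zip(weights, means, variances):
        total += w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
    return math.log(total)

def scream_score(frames, scream_gmm, ubm):
    """Average per-frame log-likelihood ratio: scream model vs. UBM."""
    return sum(gmm_loglik(x, *scream_gmm) - gmm_loglik(x, *ubm)
               for x in frames) / len(frames)

# Toy 1-D features: scream frames concentrate at high values, while the
# UBM (trained on all sounds) is broader.
scream_gmm = ([0.5, 0.5], [8.0, 10.0], [1.0, 1.0])  # weights, means, vars
ubm = ([0.5, 0.5], [0.0, 9.0], [4.0, 4.0])
score = scream_score([8.5, 9.7, 9.0], scream_gmm, ubm)
print(score > 0.0)  # frames near the scream modes score above the UBM
```

A positive average ratio flags the segment as a scream; the operating threshold is tuned for the deployment environment.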

Non-intrusive Calibration for User Interaction based Gaze Estimation (사용자 상호작용 기반의 시선 검출을 위한 비강압식 캘리브레이션)

  • Lee, Tae-Gyun;Yoo, Jang-Hee
    • Journal of Software Assessment and Valuation
    • /
    • v.16 no.1
    • /
    • pp.45-53
    • /
    • 2020
  • In this paper, we describe a new method for acquiring calibration data from the user interactions that occur continuously during web browsing, and for performing calibration naturally while estimating the user's gaze. The proposed non-intrusive calibration is a tuning process that adapts the pre-trained gaze estimation model to a new user using the acquired data. To achieve this, a generalized CNN model for gaze estimation is trained first; non-intrusive calibration then adapts it quickly to new users through online learning. In experiments, the gaze estimation model was calibrated with combinations of various user interactions to compare performance, and improved accuracy was achieved compared to existing methods.
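The paper fine-tunes a pre-trained CNN online; as a drastically simplified sketch, the same idea can be shown by adapting only an output offset of a hypothetical linear gaze model from user-interaction samples (e.g. click positions assumed to coincide with gaze):

```python
import numpy as np

def calibrate(model_bias, features, targets, lr=0.5, steps=20):
    """Online per-user calibration: nudge the pre-trained model's output
    offset toward the user's interaction targets via SGD on squared error."""
    bias = model_bias.copy()
    for _ in range(steps):
        for x, t in zip(features, targets):
            pred = x + bias          # stand-in for a pre-trained model's output
            bias -= lr * (pred - t)  # gradient step on 0.5 * ||pred - t||^2
    return bias

# Toy setup: the pre-trained model is consistently off by (+0.1, -0.2)
# for this user, so the learned offset should converge to (-0.1, +0.2).
rng = np.random.default_rng(1)
feats = [rng.random(2) for _ in range(5)]
targs = [x + np.array([-0.1, 0.2]) for x in feats]  # true gaze points
bias = calibrate(np.zeros(2), feats, targs)
print(np.round(bias, 2))  # converges near [-0.1, 0.2]
```

In the paper's setting, the analogous step updates the CNN's weights online rather than a single offset.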

Object Detection Performance Analysis between On-GPU and On-Board Analysis for Military Domain Images

  • Du-Hwan Hur;Dae-Hyeon Park;Deok-Woong Kim;Jae-Yong Baek;Jun-Hyeong Bak;Seung-Hwan Bae
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.8
    • /
    • pp.157-164
    • /
    • 2024
  • In this paper, we discuss the feasibility of deploying a deep learning-based detector on a resource-limited board. Although many studies evaluate detectors on machines with high-performance GPUs, evaluation on boards with limited computation resources is still insufficient. Therefore, in this work, we implement deep learning detectors and deploy them on a compact board by parsing and optimizing the detectors. To characterize the performance of deep learning-based detectors on limited resources, we monitor several detectors under different H/W resources. On the COCO detection dataset, we compare and analyze the on-board and on-GPU detection models in terms of several metrics: mAP, power consumption, and execution speed (FPS). To demonstrate the effect of applying our detector to the military domain, we evaluate the models on our own dataset of thermal images reflecting flight battle scenarios. As a result, we investigate the strengths of the deep learning-based on-board detector and show that deep learning-based vision models can contribute in flight battle scenarios.