• Title/Summary/Keyword: DeepSORT

Development of recognition and alert system for dangerous road object using deep learning algorithms (딥러닝 영상인식을 이용한 도로 위 위험 객체 알림 시스템)

  • Kim, Joong-wan;Jo, Hyun-jun;Hwang, Bo-ouk;Jeong, Jun-ho;Choi, Jong-geon;Yun, Tae-jin
    • Proceedings of the Korean Society of Computer Information Conference / 2022.07a / pp.479-480 / 2022
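  • On roads where vehicles travel at high speed, stopped vehicles and fallen objects cause serious accidents, so countermeasures are needed. A vehicle that stops suddenly cannot be anticipated, and although patrol teams periodically collect fallen objects, an immediate response is difficult. To address this problem, this paper develops a system that applies real-time deep learning object recognition to recognize stopped vehicles and fallen objects on the road and provides information about them. The system is implemented on a desktop PC using YOLOX, a real-time object recognition algorithm, and deepSORT, a real-time object tracking algorithm. The developed system provides recognition results for stopped vehicles and fallen objects. Because it can be applied to video from already-installed CCTV cameras, it is expected to enable low-cost recognition of dangerous road situations over a wide area.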

Dynamic characteristics monitoring of wind turbine blades based on improved YOLOv5 deep learning model

  • W.H. Zhao;W.R. Li;M.H. Yang;N. Hong;Y.F. Du
    • Smart Structures and Systems / v.31 no.5 / pp.469-483 / 2023
  • The dynamic characteristics of wind turbine blades are usually monitored by contact sensors, which have the disadvantages of high cost, difficult installation, a tendency to damage the structure, and difficult signal transmission. To address these problems, a non-contact method for monitoring the dynamic characteristics of wind turbine blades is proposed, based on computer vision technology and an improved YOLOv5 (You Only Look Once v5) deep learning model. First, the CSP (Cross Stage Partial) structure of the original YOLOv5l model is improved by introducing the CSP2_2 structure, which reduces the number of residual components to improve network training speed. On this basis, the model is combined with the Deep SORT algorithm to improve the accuracy of structural displacement monitoring. Second, because deep learning sample datasets are difficult to collect, Blender software is used to model the wind turbine structure under varied conditions, illumination, and other environments similar to practical engineering settings. In addition, a modeling-based dataset augmentation method is proposed that incorporates image expansion technology. Finally, the feasibility of the proposed algorithm is verified by experiments, followed by an analysis of the influence of YOLOv5 models, lighting conditions, and angles on the recognition results. The results show that the improved YOLOv5 deep learning model not only performs well compared with many other YOLOv5 models but also achieves high accuracy in vibration monitoring in different environments. The method can accurately identify the dynamic characteristics of wind turbine blades and can therefore provide a reference for evaluating their condition.

A Study on Improving Performance of the Deep Neural Network Model for Relational Reasoning (관계 추론 심층 신경망 모델의 성능개선 연구)

  • Lee, Hyun-Ok;Lim, Heui-Seok
    • KIPS Transactions on Software and Data Engineering / v.7 no.12 / pp.485-496 / 2018
  • So far, deep learning, a field of artificial intelligence, has achieved remarkable results in solving problems from unstructured data. However, it still has difficulty judging situations comprehensively as humans do, and it has not reached the level of intelligence needed to deduce relations and predict the next situation. Recently, deep neural networks have shown that artificial intelligence can possess powerful relational reasoning, a core intellectual ability of human beings. In this paper, to analyze and observe the performance of Relation Networks (RN) among the neural networks for relational reasoning, two types of RN-based deep neural network models were constructed and compared with the baseline model. One is a visual question answering RN model using Sort-of-CLEVR, and the other is a text-based question answering RN model using the bAbI task. To maximize the performance of the RN-based models, various performance improvement experiments, such as hyperparameter tuning, were proposed and performed. The effectiveness of the proposed performance improvement methods was verified by applying them to the visual QA RN model, the text-based QA RN model, and a new domain model using the dialogue-based LL dataset. The experiments show that the initial learning rate is a key factor in determining model performance in both types of RN models. We observed that the optimal initial learning rate found by the proposed random search method can improve the performance of the model up to 99.8%.

A Study on Development of a Prediction Model for Korean Music Box Office Based on Deep Learning (딥러닝을 이용한 음악흥행 예측모델 개발 연구)

  • Lee, Do-Yeon;Chang, Byeng-Hee
    • The Journal of the Korea Contents Association / v.20 no.8 / pp.10-18 / 2020
  • Among the various content industries, this study focused on the music industry and sought to develop a prediction model for music box office performance using deep learning. The deep learning prediction model was designed to predict the music chart-in period based on 17 variables: singer power, singer influence, featuring singer power, featuring singer influence, number of participating singers, gender of participating singers, lyric writer power, composer power, arranger power, production agency power, distributing agency power, title track, LIKEs on streaming platform, comments on streaming platform, pre-promotion article, teaser-video view, and first-week performance. Additionally, we conducted a linear regression analysis to screen the factors, and compared the prediction performance of the original DNN prediction model with that of a DNN model built from the screened factors.

Automatic Collection of Production Performance Data Based on Multi-Object Tracking Algorithms (다중 객체 추적 알고리즘을 이용한 가공품 흐름 정보 기반 생산 실적 데이터 자동 수집)

  • Lim, Hyuna;Oh, Seojeong;Son, Hyeongjun;Oh, Yosep
    • The Journal of Society for e-Business Studies / v.27 no.2 / pp.205-218 / 2022
  • Recently, digital transformation in manufacturing has been accelerating. As a result, technologies for collecting data from the shop floor are becoming important. Existing approaches focus primarily on obtaining specific manufacturing data using various sensors and communication technologies. To expand the channels for field data collection, this study proposes a method that automatically collects manufacturing data using vision-based artificial intelligence. The method analyzes real-time image information with object detection and tracking technologies to obtain manufacturing data. The research team collects object motion information for each frame by applying YOLO (You Only Look Once) and DeepSORT as the object detection and tracking algorithms. The motion information is then converted into two pieces of manufacturing data (production performance and time) through post-processing. A dynamically moving factory model is created to obtain training data for deep learning. In addition, operating scenarios are proposed to reproduce a real-world shop-floor situation. The operating scenario assumes a flow shop consisting of six facilities. When manufacturing data were collected according to the operating scenarios, the accuracy was 96.3%.
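
The abstract does not describe how per-frame motion information is post-processed into production data. One possible approach, sketched here purely as an assumption, is to count a workpiece as finished when its track crosses a virtual line in the image and to timestamp the event from the frame index. The function name `count_line_crossings`, the `(frame_idx, track_id, x)` tuple layout, and the parameters `line_x` and `fps` are all hypothetical and do not come from the paper.

```python
def count_line_crossings(observations, line_x, fps):
    """Turn per-frame track positions into (track_id, time_in_seconds) events.

    observations: iterable of (frame_idx, track_id, x_center), ordered by frame.
    line_x:       x coordinate of a hypothetical virtual counting line (pixels).
    fps:          frame rate of the video, used to convert frames to seconds.
    """
    last_x = {}       # most recent x position seen for each track
    counted = set()   # tracks that have already produced an event
    events = []
    for frame_idx, track_id, x in observations:
        prev = last_x.get(track_id)
        # Count a track once, when it moves from the left of the line
        # to the line or beyond it.
        if prev is not None and track_id not in counted and prev < line_x <= x:
            counted.add(track_id)
            events.append((track_id, frame_idx / fps))
        last_x[track_id] = x
    return events


if __name__ == "__main__":
    # Toy tracker output: two workpieces moving left to right.
    obs = [(0, 1, 10), (1, 1, 40), (2, 1, 60), (2, 2, 20), (3, 2, 55), (4, 1, 70)]
    events = count_line_crossings(obs, line_x=50, fps=10)
    print(len(events), events)
```

In this sketch the number of events plays the role of production performance and the timestamps play the role of production time; a real pipeline would also need to handle track ID switches and lost tracks.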

Improving the performance for Relation Networks using parameters tuning (파라미터 튜닝을 통한 Relation Networks 성능개선)

  • Lee, Hyun-Ok;Lim, Heui-Seok
    • Proceedings of the Korea Information Processing Society Conference / 2018.05a / pp.377-380 / 2018
  • Human reasoning ability refers to logically considering, from the conditions given in a problem, what is needed to solve it: the ability to discover regularities or properties in a problem situation and to find laws or solutions by mathematical methods. A core challenge in developing artificial intelligence systems that resemble this human cognitive ability is endowing them with the capacity to reason about objects and the relations among them from unstructured data. Deep learning methods have so far brought tremendous progress in solving problems from unstructured data, but they have done so without explicitly considering relations among objects. Recently published deep neural networks that perform complex relational reasoning from unstructured data show a promising approach to understanding attempts at relational reasoning. The first is RN (Relation Networks), a simple neural network module for relational reasoning, and the second is VIN (Visual Interaction Networks), a general-purpose model that predicts the future states of real objects based on visual observation. These deep neural networks for relational reasoning decompose the world into a system of objects and their relations, and show that neural networks can possess a powerful ability to reason, generalizing to new combinations of objects and relations in scenes that appear superficially very different but fundamentally share common relations. In this paper, among the deep neural networks that perform relational reasoning, we reproduce and observe the performance of RN using the Sort-of-CLEVR dataset and, further, propose a method for improving the performance of the RN model through parameter tuning.

An Implementation of a Convolutional Accelerator based on a GPGPU for a Deep Learning (Deep Learning을 위한 GPGPU 기반 Convolution 가속기 구현)

  • Jeon, Hee-Kyeong;Lee, Kwang-yeob;Kim, Chi-yong
    • Journal of IKEEE / v.20 no.3 / pp.303-306 / 2016
  • In this paper, we propose a method to accelerate convolutional neural networks by utilizing a GPGPU. A convolutional neural network (CNN) is a kind of neural network that learns features of images, and it is suitable for image processing tasks that require learning from large amounts of data. The convolutional layer of a conventional CNN requires a large number of multiplications, which makes real-time operation in an embedded environment difficult. In this paper, we reduce the number of multiplications through the Winograd convolution operation and process the convolution in parallel using a SIMT-based GPGPU. The experiment was conducted using ModelSim and TestDrive, and the results showed that the processing time improved by about 17% compared with conventional convolution.
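
To illustrate how Winograd convolution reduces multiplications, the sketch below implements the standard one-dimensional F(2,3) case, which produces two outputs of a 3-tap filter with 4 multiplications instead of 6. The abstract does not say which Winograd tile size the authors used, so the choice of F(2,3) and the function names are assumptions made for illustration only.

```python
import math


def direct_conv_2x3(d, g):
    """Two outputs of a 3-tap filter over 4 inputs: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs via Winograd F(2,3): 4 multiplications.

    The filter-side terms (g0 + g1 + g2) / 2 and (g0 - g1 + g2) / 2 depend
    only on the filter, so they can be precomputed once per filter.
    """
    g_sum = (g[0] + g[1] + g[2]) / 2
    g_alt = (g[0] - g[1] + g[2]) / 2
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * g_sum
    m3 = (d[2] - d[1]) * g_alt
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]
    g = [0.5, -1.0, 2.0]
    ref = direct_conv_2x3(d, g)
    fast = winograd_f23(d, g)
    print(ref, fast)
    assert all(math.isclose(a, b) for a, b in zip(ref, fast))
```

Two-dimensional variants nest this transform over rows and columns; the saving in multiplications is what makes the approach attractive on parallel hardware.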

Sidewalk Gaseous Pollutants Estimation Through UAV Video-based Model

  • Omar, Wael;Lee, Impyeong
    • Korean Journal of Remote Sensing / v.38 no.1 / pp.1-20 / 2022
  • As unmanned aerial vehicle (UAV) technology has grown in popularity over the years, it has been introduced for air quality monitoring. It can be used to estimate sidewalk emission concentrations by calculating road traffic emission factors for different vehicle types. These calculations require a simulation of the spread of pollutants from one or more given sources. For this purpose, a Gaussian plume dispersion model was developed based on the US EPA Motor Vehicle Emissions Simulator (MOVES), which provides an accurate estimate of fuel consumption and pollutant emissions from vehicles under a wide range of user-defined conditions. This paper describes a methodology for estimating the concentration on the sidewalk of emissions from different types of vehicles. The line source considers vehicle parameters, wind speed and direction, and pollutant concentration, using a UAV equipped with a monocular camera. All were sampled over an hourly interval. In this article, a YOLOv5 deep learning model is developed, vehicles are tracked with Deep SORT (Simple Online and Realtime Tracking), and each vehicle is localized using a homography transformation matrix to calculate its speed and acceleration; ultimately, a Gaussian plume dispersion model is developed to estimate the CO and NOx concentrations at a sidewalk point. The results demonstrate that the estimated pollutant values give a fast and reasonable indication for any near-road receptor point using a cheap UAV, without installing air monitoring stations along the road.
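
The step from tracked pixel positions to vehicle speed can be sketched as follows: a 3x3 homography maps an image point to ground-plane coordinates, and speed is the ground distance between two observations divided by the elapsed time. This is a minimal sketch under stated assumptions, not the authors' implementation. The helper names `to_ground` and `speed_mps` and the example matrix are hypothetical, and the homography is assumed to be already calibrated so that ground coordinates are in metres.

```python
import math


def to_ground(H, px, py):
    """Map a pixel (px, py) to ground-plane coordinates with a 3x3 homography H."""
    x = H[0][0] * px + H[0][1] * py + H[0][2]
    y = H[1][0] * px + H[1][1] * py + H[1][2]
    w = H[2][0] * px + H[2][1] * py + H[2][2]
    # Divide by the homogeneous coordinate to get Cartesian ground coordinates.
    return x / w, y / w


def speed_mps(H, p_prev, p_curr, dt):
    """Speed in m/s between two pixel observations taken dt seconds apart."""
    x0, y0 = to_ground(H, *p_prev)
    x1, y1 = to_ground(H, *p_curr)
    return math.hypot(x1 - x0, y1 - y0) / dt


if __name__ == "__main__":
    # Hypothetical calibration: a pure scaling of 0.1 m per pixel.
    H = [[0.1, 0.0, 0.0],
         [0.0, 0.1, 0.0],
         [0.0, 0.0, 1.0]]
    v = speed_mps(H, (100, 200), (130, 240), dt=0.5)
    print(f"{v:.2f} m/s")
```

Acceleration can be estimated the same way by differencing successive speed values over time; in practice some smoothing is usually needed because tracker boxes jitter from frame to frame.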

Emotion Recognition in Arabic Speech from Saudi Dialect Corpus Using Machine Learning and Deep Learning Algorithms

  • Hanaa Alamri;Hanan S. Alshanbari
    • International Journal of Computer Science & Network Security / v.23 no.8 / pp.9-16 / 2023
  • Speech can actively elicit feelings and attitudes through words. It is important for researchers to identify the emotional content of speech signals as well as the kind of emotion that the speech produced. In this study, we examined an emotion recognition system using a database in Arabic, specifically in the Saudi dialect, taken from a YouTube channel called Telfaz11. The four emotions examined were anger, happiness, sadness, and neutral. In our experiments, we extracted features from audio signals, such as Mel Frequency Cepstral Coefficient (MFCC) and Zero-Crossing Rate (ZCR), and then classified emotions using several classification algorithms: machine learning algorithms (Support Vector Machine (SVM) and K-Nearest Neighbor (KNN)) and deep learning algorithms (Convolution Neural Network (CNN) and Long Short-Term Memory (LSTM)). Our experiments showed that the MFCC feature extraction method with the CNN model obtained the best accuracy, 95%, demonstrating the effectiveness of this classification system in recognizing emotions in spoken Arabic.

Approaching Vehicles Alert System Based on the 360 Degree Camera (360 도 카메라를 활용한 보행 시 차량 접근 알림 시스템)

  • Yoon, Soyeon;Kim, Eun-ji;Lee, Won-young
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.556-559 / 2021
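  • This study concerns a system that uses equirectangular video captured with an Insta evo 360° camera to identify vehicles that are dangerous to pedestrians and then issue vehicle-approach alerts in real time. To detect and track dangerous vehicles in 360° video, the system uses You Only Look Once v5 (YOLOv5) with transfer learning on panorama and ordinary road image datasets, the object tracking algorithm Simple Online and Realtime Tracking with a Deep Association Metric (DeepSORT), and a non-dangerous-vehicle filtering algorithm developed through experiments. When the final system was applied to video captured with the Insta evo 360° camera mounted on top of the head, it distinguished non-dangerous from dangerous vehicles in the video with about 90% accuracy and, for dangerous vehicles, could visually indicate the direction of the vehicle. Based on this study, we expect that warning alerts about dangerous vehicles outside the pedestrian's field of view will reduce the likelihood of pedestrian traffic accidents, and that the applications of 360° cameras, which offer an omnidirectional view, will broaden beyond pedestrian safety systems.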