• Title/Summary/Keyword: Action Detection

Detection of Low-Level Human Action Change for Reducing Repetitive Tasks in Human Action Recognition (사람 행동 인식에서 반복 감소를 위한 저수준 사람 행동 변화 감지 방법)

  • Noh, Yohwan;Kim, Min-Jung;Lee, DoHoon
    • Journal of Korea Multimedia Society / v.22 no.4 / pp.432-442 / 2019
  • Most current human action recognition methods are based on deep learning, which incurs a very high computational cost. In this paper, we propose an action change detection method to reduce repetitive human action recognition tasks. In practice, simple actions are often repeated, and applying costly action recognition methods to repeated actions is time-consuming. The proposed method decides whether the action has changed, and action recognition is executed only when a change is detected. The action change detection process is as follows. First, the number of non-zero pixels is extracted from the motion history image to generate one-dimensional time-series data. Second, an action change is detected by comparing the difference between the current time trend and a local extremum of the time-series data against a threshold. In experiments, the proposed method achieved 89% balanced accuracy on action change data and reduced action recognition repetition by 61%.
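
The two steps in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact comparison rule, so the function names, the way the local extremum is tracked, and the threshold value are all hypothetical. A motion history image is represented as a plain list of pixel rows.

```python
def count_nonzero_pixels(mhi):
    """Step 1: number of non-zero pixels in one motion history image (list of rows)."""
    return sum(1 for row in mhi for value in row if value != 0)


def detect_action_changes(series, threshold):
    """Step 2 (assumed rule): flag frame t when the current value differs from
    the most recent local extremum by more than `threshold`.

    `series` is the per-frame non-zero pixel count. Returns flagged frame indices.
    """
    changes = []
    if not series:
        return changes
    extremum = series[0]  # assumption: the first sample is the initial reference
    for t in range(1, len(series)):
        # The previous sample is a local max/min when the slope changes sign.
        if t > 1 and (series[t - 1] - series[t - 2]) * (series[t] - series[t - 1]) < 0:
            extremum = series[t - 1]
        if abs(series[t] - extremum) > threshold:
            changes.append(t)
            extremum = series[t]  # restart the comparison from the new level
    return changes


# Small oscillations are ignored; the jump at frame 5 is flagged.
pixel_counts = [10, 11, 10, 11, 10, 40, 41, 40]
flagged = detect_action_changes(pixel_counts, threshold=15)
```

In a pipeline of this kind, the expensive recognition model would be invoked only at the flagged frames.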

Trends in Online Action Detection in Streaming Videos (온라인 행동 탐지 기술 동향)

  • Moon, J.Y.;Kim, H.I.;Lee, Y.J.
    • Electronics and Telecommunications Trends / v.36 no.2 / pp.75-82 / 2021
  • Online action detection (OAD) in streaming video is an attractive research area that has recently aroused interest. Although most studies on action understanding have considered action recognition in well-trimmed videos and offline temporal action detection in untrimmed videos, online action detection methods are required to monitor action occurrences in streaming videos. OAD predicts action probabilities for the current frame or frame sequence using a fixed-size video segment that includes past and current frames. In this article, we discuss deep learning-based OAD models. In addition, we investigate OAD evaluation methodologies, including benchmark datasets and performance measures, and compare the performance of the presented OAD models.
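
The fixed-size segment of past and current frames can be pictured as a sliding window over the stream. The sketch below is illustrative only: `classify_segment` is a hypothetical stand-in for any OAD model, and the window size is an arbitrary choice.

```python
from collections import deque


def online_action_detection(frames, classify_segment, window_size):
    """Yield one prediction per incoming frame, using only past and current frames.

    `classify_segment` (hypothetical) takes a list of frames, oldest first,
    and returns whatever the model outputs, e.g. per-class probabilities.
    """
    window = deque(maxlen=window_size)  # the oldest frame is dropped automatically
    for frame in frames:
        window.append(frame)
        if len(window) == window_size:  # wait until the first segment is full
            yield classify_segment(list(window))


# Toy usage: frames are numbers and the "model" just sums the segment.
predictions = list(online_action_detection(range(5), sum, window_size=3))
```

Because the generator never looks ahead of the current frame, it respects the online constraint that separates OAD from offline temporal action detection.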

Trends in Temporal Action Detection in Untrimmed Videos (시간적 행동 탐지 기술 동향)

  • Moon, Jinyoung;Kim, Hyungil;Park, Jongyoul
    • Electronics and Telecommunications Trends / v.35 no.3 / pp.20-33 / 2020
  • Temporal action detection (TAD) in untrimmed videos is an important but challenging problem in computer vision that has recently attracted increasing interest. Although most studies on action in videos have addressed action recognition in trimmed videos, TAD methods are required to understand real-world untrimmed videos, which consist mostly of background with some meaningful action instances belonging to multiple action classes. TAD mainly consists of temporal action localization, which generates temporal action proposals such as single-action segments, and action recognition, which classifies the proposals into action classes. Generating temporal action proposals with accurate temporal boundaries, however, remains challenging. In this paper, we discuss high-performing TAD technologies from representative deep learning-based TAD studies. Further, we investigate evaluation methodologies for TAD, such as benchmark datasets and performance measures, and then compare the performance of the discussed TAD models.
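
The abstract does not name its performance measures. TAD benchmarks commonly judge a proposal by its temporal intersection over union (tIoU) with a ground-truth segment, so a minimal sketch of that quantity is given here as background; the function name and the (start, end) tuple format are assumptions.

```python
def temporal_iou(proposal, ground_truth):
    """Temporal IoU between two (start, end) intervals, e.g. in seconds."""
    start_p, end_p = proposal
    start_g, end_g = ground_truth
    intersection = max(0.0, min(end_p, end_g) - max(start_p, start_g))
    union = (end_p - start_p) + (end_g - start_g) - intersection
    return intersection / union if union > 0 else 0.0


# A proposal covering 0-10 s against a ground-truth action at 5-15 s
# overlaps for 5 s out of a 15 s union.
score = temporal_iou((0.0, 10.0), (5.0, 15.0))
```

A proposal is then typically counted as correct when its tIoU exceeds a chosen threshold, which is why accurate temporal boundaries matter so much.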

Quick and easy game bot detection based on action time interval estimation

  • Yong Goo Kang;Huy Kang Kim
    • ETRI Journal / v.45 no.4 / pp.713-723 / 2023
  • Game bots are illegal programs that facilitate account growth and goods acquisition through continuous and automatic play. Early detection is required to minimize the damage caused by evolving game bots. In this study, we propose a game bot detection method based on action time intervals (ATIs). We observe the actions of bots in a game and identify the most frequently occurring actions. For each identified action, we extract the frequency, ATI average, and ATI standard deviation, which are used as machine learning features. We then measure performance using actual logs of the Aion game to verify the validity of the proposed method. The accuracy and precision of the proposed method are 97% and 100%, respectively. The results show that game bots can be detected early, because the proposed method performs well using only a single day of data, with performance similar to that reported in a previous study using the same dataset. The detection performance of the model is maintained even 2 months after training, without any revision process.
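
The feature extraction step (frequency, ATI average, and ATI standard deviation per frequent action) might look like the sketch below. The log format, the `top_k` parameter, and the use of the population standard deviation are assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict
from statistics import mean, pstdev


def ati_features(log, top_k=3):
    """Compute (frequency, ATI mean, ATI std) for the `top_k` most frequent actions.

    `log` is a list of (timestamp_seconds, action_name) pairs in any order.
    """
    times = defaultdict(list)
    for timestamp, action in log:
        times[action].append(timestamp)

    most_frequent = sorted(times, key=lambda a: len(times[a]), reverse=True)[:top_k]
    features = {}
    for action in most_frequent:
        stamps = sorted(times[action])
        # Action time intervals: gaps between consecutive occurrences.
        intervals = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        if intervals:
            features[action] = (len(stamps), mean(intervals), pstdev(intervals))
        else:
            features[action] = (len(stamps), 0.0, 0.0)  # one occurrence: no interval
    return features


# A perfectly regular "attack" every 2 seconds has zero interval deviation.
sample_log = [(0, "attack"), (2, "attack"), (4, "attack"), (1, "loot")]
sample_features = ati_features(sample_log, top_k=1)
```

The intuition is that automated play tends to produce more regular intervals than human play, so these per-action statistics can serve as inputs to a classifier.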

Facial Action Unit Detection with Multilayer Fused Multi-Task and Multi-Label Deep Learning Network

  • He, Jun;Li, Dongliang;Bo, Sun;Yu, Lejun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.11 / pp.5546-5559 / 2019
  • Facial action units (AUs) have recently drawn increased attention because they can be used to recognize facial expressions. A variety of methods have been designed for frontal-view AU detection, but few have been able to handle multi-view face images. In this paper we propose a method for multi-view facial AU detection using a fused multilayer, multi-task, and multi-label deep learning network. The network can complete two tasks: AU detection and facial view detection. AU detection is a multi-label problem and facial view detection is a single-label problem. A residual network and multilayer fusion are applied to obtain more representative features. Our method is effective and performs well. The F1 score on FERA 2017 is 13.1% higher than the baseline. The facial view recognition accuracy is 0.991. This shows that our multi-task, multi-label model could achieve good performance on the two tasks.
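
The difference between the multi-label AU task and the single-label view task shows up in how the network outputs are decoded. The sketch below assumes the network emits one probability per AU and one score per facial view; the function name and the 0.5 threshold are illustrative, not taken from the paper.

```python
def decode_outputs(au_probs, view_scores, au_threshold=0.5):
    """Decode the two heads of a multi-task network (hypothetical output format).

    AU detection is multi-label: each AU is thresholded independently,
    so any number of AUs can be active at once.
    View detection is single-label: exactly one view wins (argmax).
    """
    active_aus = [i for i, p in enumerate(au_probs) if p >= au_threshold]
    view = max(range(len(view_scores)), key=lambda i: view_scores[i])
    return active_aus, view


# AUs 0 and 2 are active together, while only view 1 is selected.
active, view_index = decode_outputs([0.9, 0.2, 0.7], [0.1, 0.8, 0.1])
```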

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly. As video data grows, so do the requirements for its analysis and utilization. Because many industries lack skilled manpower to analyze videos, machine learning and artificial intelligence are actively used to assist. Consequently, the demand for computer vision technologies such as object detection and tracking, action detection, emotion detection, and Re-ID has also increased rapidly. However, object detection and tracking face many difficulties that degrade performance, such as occlusion and the re-appearance of an object after it leaves the recording location. Accordingly, action and emotion detection models built on object detection and tracking models also have difficulty extracting data for each object. In addition, deep learning architectures composed of multiple models suffer from performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and AWS Rekognition, an emotion recognition service. The proposed model uses Re-ID based on single-linkage hierarchical clustering and processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, offers near real-time processing performance, and prevents tracking failures caused by object departure and re-emergence, occlusion, and so on. By continuously linking the action and facial emotion detection results of each object to the same object, videos can be analyzed efficiently.
The re-identification model extracts a feature vector from the bounding box of each object image detected by the object tracking model in each frame, and applies single-linkage hierarchical clustering to the feature vectors extracted from past frames to identify an object whose tracking has failed. Through this process, an object that re-appears after leaving the recording location, or after occlusion, can be tracked again as the same object. As a result, the action and facial emotion detection results of an object that is newly recognized because of a tracking failure can be linked to those of the object that appeared in the past. To improve processing performance, we introduce the Bounding Box Queue by Object and Feature Queue methods, which reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which links facial emotions recognized through AWS Rekognition with object tracking information. The academic significance of this study is that, thanks to these processing techniques, the two-stage re-identification model achieves real-time performance even in a high-cost environment that performs action and facial emotion detection, without sacrificing accuracy by resorting to simple metrics. The practical implication is that industries that require action and facial emotion detection but struggle with object tracking failures can analyze videos effectively with the proposed model. With its high re-tracking accuracy and processing performance, the proposed model can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where integrating tracking information with extracted metadata creates great industrial and business value.
In the future, to measure object tracking performance more precisely, an experiment should be conducted using the MOT Challenge dataset, which is used by many international conferences. We will investigate the problems that the IoF algorithm cannot solve in order to develop a complementary algorithm. In addition, we plan to conduct further research applying this model to datasets from various fields related to intelligent video analysis.
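
Two ideas from this abstract can be sketched in a few lines of Python. Both are assumptions made for illustration rather than the authors' code. First, IoF is taken here to mean the overlap area divided by the face box area, which the name suggests but the abstract does not define. Second, the Re-ID step is reduced to single-linkage matching against stored identity clusters (distance to a cluster is the distance to its nearest member), a simplified stand-in for full hierarchical clustering. All function names, box formats, and thresholds are hypothetical.

```python
import math


def intersection_over_face(face_box, person_box):
    """Boxes are (x1, y1, x2, y2). Assumed definition: overlap area / face area."""
    fx1, fy1, fx2, fy2 = face_box
    px1, py1, px2, py2 = person_box
    overlap_w = max(0.0, min(fx2, px2) - max(fx1, px1))
    overlap_h = max(0.0, min(fy2, py2) - max(fy1, py1))
    face_area = (fx2 - fx1) * (fy2 - fy1)
    return (overlap_w * overlap_h) / face_area if face_area > 0 else 0.0


def assign_face(face_box, tracked_boxes, min_iof=0.5):
    """Return the track id whose box best contains the face, or None."""
    best_id, best_score = None, min_iof
    for track_id, box in tracked_boxes.items():
        score = intersection_over_face(face_box, box)
        if score >= best_score:
            best_id, best_score = track_id, score
    return best_id


def reidentify(feature, clusters, max_distance):
    """Match a feature vector to an identity using single-linkage distance.

    `clusters` maps integer object ids to lists of past feature vectors and is
    updated in place. A new id is created when no cluster is close enough.
    """
    best_id, best_distance = None, max_distance
    for object_id, members in clusters.items():
        distance = min(math.dist(feature, member) for member in members)
        if distance <= best_distance:
            best_id, best_distance = object_id, distance
    if best_id is None:
        best_id = max(clusters, default=-1) + 1
        clusters[best_id] = []
    clusters[best_id].append(feature)
    return best_id


tracks = {1: (0, 0, 100, 100), 2: (50, 50, 100, 100)}
owner = assign_face((10, 10, 20, 20), tracks)

identities = {0: [(0.0, 0.0)], 1: [(10.0, 10.0)]}
same_object = reidentify((0.5, 0.0), identities, max_distance=1.0)
new_object = reidentify((5.0, 5.0), identities, max_distance=1.0)
```

Normalizing by the face area rather than the union suits this use because a face box is much smaller than a person box, so a standard IoU would stay low even when the face lies entirely inside the person box.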

Intrusion Detection System Using the Correlation of Intrusion Signature (침입신호 상관성을 이용한 침입 탐지 시스템)

  • Na Guen-Sik
    • Journal of Internet Computing and Services / v.5 no.2 / pp.57-67 / 2004
  • In this paper, we present the architecture of an intrusion detection system that improves both system performance and the correctness of intrusion detection. A network intrusion usually consists of several steps of action taken by the attackers, and each action can be characterized by its signature. However, normal, non-intrusive actions can also exhibit the same signature, which can result in incorrect detection. The presented system uses the correlation of the series of signatures that make up an intrusion, so its decision on an intrusion is highly reliable. In addition, variations of known intrusions can easily be detected without any prior knowledge of the variations.
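
The abstract does not say how the correlation between signatures is computed. One simple way to picture the idea is to raise an alert only when all signatures of a multi-step intrusion appear in order within the observed event stream, rather than on any single signature. The sketch below uses that ordered-subsequence rule as an assumption; the signature names are invented for illustration.

```python
def matches_intrusion(observed_signatures, intrusion_pattern):
    """True when every signature of the pattern occurs, in order, in the stream.

    Other events may appear in between. A single shared iterator is used so
    each search continues from where the previous match ended.
    """
    stream = iter(observed_signatures)
    return all(signature in stream for signature in intrusion_pattern)


pattern = ["scan", "probe", "exploit"]
alert = matches_intrusion(["scan", "login", "probe", "exploit"], pattern)
benign = matches_intrusion(["probe"], pattern)  # one signature alone is not enough
```

Requiring the whole series reduces false alarms from normal traffic that happens to contain one of the signatures.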

Improved DT Algorithm Based Human Action Features Detection

  • Hu, Zeyuan;Lee, Suk-Hwan;Lee, Eung-Joo
    • Journal of Korea Multimedia Society / v.21 no.4 / pp.478-484 / 2018
  • The choice of motion features directly influences the result of a human action recognition method. Many factors, such as the appearance of the human body, the environment, and the video camera, affect a single feature differently, which restricts the accuracy of action recognition. The Dense Trajectories (DT) algorithm is a classic algorithm for feature extraction in behavior recognition, but it has some defects in its use of optical flow images. In this paper, based on a study of the representation and recognition of human actions and with full consideration of the advantages and disadvantages of different features, we use the improved Dense Trajectories (iDT) algorithm to optimize and extract the optical flow features of human motion, combine it with a Support Vector Machine to identify human behavior, and use images from the KTH database for training and testing.

Image Based Human Action Recognition System to Support the Blind (시각장애인 보조를 위한 영상기반 휴먼 행동 인식 시스템)

  • Ko, ByoungChul;Hwang, Mincheol;Nam, Jae-Yeal
    • Journal of KIISE / v.42 no.1 / pp.138-143 / 2015
  • In this paper, we develop a novel human action recognition system, based on communication between an ear-mounted Bluetooth camera and an action recognition server, to aid scene recognition for the blind. First, when a blind user captures an image of a specific location using the ear-mounted camera, the captured image is transmitted to the recognition server through a smartphone that is synchronized with the camera. The recognition server sequentially performs human detection, object detection, and action recognition by analyzing human poses. The recognized action information is sent back to the smartphone, and the user hears it through text-to-speech (TTS). Experimental results using the proposed system showed a 60.7% action recognition performance on test data captured in indoor and outdoor environments.

Extensible Hierarchical Method of Detecting Interactive Actions for Video Understanding

  • Moon, Jinyoung;Jin, Junho;Kwon, Yongjin;Kang, Kyuchang;Park, Jongyoul;Park, Kyoung
    • ETRI Journal / v.39 no.4 / pp.502-513 / 2017
  • For video understanding, namely analyzing who did what in a video, actions along with objects are primary elements. Most studies on actions have handled recognition problems for a well-trimmed video and focused on enhancing their classification performance. However, action detection, including localization as well as recognition, is required because, in general, actions intersect in time and space. In addition, most studies have not considered extensibility for a newly added action that has been previously trained. Therefore, proposed in this paper is an extensible hierarchical method for detecting generic actions, which combine object movements and spatial relations between two objects, and inherited actions, which are determined by the related objects through an ontology and rule based methodology. The hierarchical design of the method enables it to detect any interactive actions based on the spatial relations between two objects. The method using object information achieves an F-measure of 90.27%. Moreover, this paper describes the extensibility of the method for a new action contained in a video from a video domain that is different from the dataset used.