• Title/Summary/Keyword: Temporal Action Localization

Trends in Temporal Action Detection in Untrimmed Videos (시간적 행동 탐지 기술 동향)

  • Moon, Jinyoung;Kim, Hyungil;Park, Jongyoul
    • Electronics and Telecommunications Trends / v.35 no.3 / pp.20-33 / 2020
  • Temporal action detection (TAD) in untrimmed videos is an important but challenging problem in computer vision and has attracted increasing interest recently. Although most studies on action in videos have addressed action recognition in trimmed videos, TAD methods are required to understand real-world untrimmed videos, which consist mostly of background with some meaningful action instances belonging to multiple action classes. TAD mainly consists of temporal action localization, which generates temporal action proposals that each cover a single action, and action recognition, which classifies the proposals into action classes. However, generating temporal action proposals with accurate temporal boundaries remains challenging in TAD. In this paper, we discuss high-performing TAD technologies, drawing on representative deep-learning-based TAD studies. We also examine evaluation methodologies for TAD, such as benchmark datasets and performance measures, and compare the performance of the discussed TAD models.
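
The abstract above does not name its performance measures. As a minimal sketch, assuming the widely used temporal intersection-over-union (tIoU) criterion for judging how well a proposal's boundaries match a ground-truth action instance, the overlap can be computed as follows. The function name and the (start, end) tuple layout are illustrative assumptions, not taken from the paper.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments, e.g. in seconds.

    Illustrative helper (hypothetical name): not from the surveyed paper.
    """
    # Length of the overlap between the two segments (0 if they are disjoint).
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    # Union = sum of both lengths minus the overlap counted twice.
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


# A proposal covering 0-10 s against a ground-truth action at 5-15 s.
score = temporal_iou((0.0, 10.0), (5.0, 15.0))
```

Under this criterion a proposal typically counts as correct only when its tIoU with a ground-truth instance exceeds a chosen threshold, which is why inaccurate boundaries directly lower detection scores.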

A Bi-directional Information Learning Method Using Reverse Playback Video for Fully Supervised Temporal Action Localization (완전지도 시간적 행동 검출에서 역재생 비디오를 이용한 양방향 정보 학습 방법)

  • Huiwon Gwon;Hyejeong Jo;Sunhee Jo;Chanho Jung
    • Journal of IKEEE / v.28 no.2 / pp.145-149 / 2024
  • Temporal action localization has recently been an active research topic. In this paper, unlike existing methods, we propose two approaches for learning bidirectional information by creating reverse playback videos for fully supervised temporal action localization. One approach builds the training data by combining reverse playback videos with forward playback videos; the other trains separate models on videos with different playback directions. Experiments were conducted on the THUMOS-14 dataset using TALLFormer. When both reverse and forward playback videos were used as training data, performance was 5.1% lower than that of the existing method. In contrast, the model ensemble improved performance by 1.9%.
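
The two approaches in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes a video is a list of frames with (start, end, label) annotations in seconds, that reversing playback mirrors each annotation around the video duration, and that the ensemble averages per-frame scores after flipping the reverse model's output back into forward time. The abstract does not specify how the ensemble is formed, and both function names are hypothetical.

```python
def reverse_playback(frames, segments, duration):
    """Reverse a video and mirror its (start, end, label) annotations.

    An action spanning [start, end] in forward time spans
    [duration - end, duration - start] in the reversed video.
    """
    reversed_frames = frames[::-1]
    mirrored = sorted(
        (duration - end, duration - start, label)
        for start, end, label in segments
    )
    return reversed_frames, mirrored


def ensemble_scores(forward_scores, reverse_scores):
    """Average per-frame scores from a forward and a reverse model.

    Assumption: the reverse model's scores are indexed in reversed time,
    so they are flipped back before averaging.
    """
    flipped = reverse_scores[::-1]
    return [(f + r) / 2.0 for f, r in zip(forward_scores, flipped)]


frames = [0, 1, 2, 3]  # stand-ins for decoded frames
rev_frames, rev_segments = reverse_playback(frames, [(1.0, 3.0, "jump")], 10.0)
combined = ensemble_scores([0.2, 0.8], [0.6, 0.4])
```

For the first approach, the reversed videos and mirrored annotations would simply be added to the forward training set; for the second, each model is trained on one playback direction and their outputs are merged at inference time.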

A Study on Kernel Size Variations in 1D Convolutional Layer for Single-Frame supervised Temporal Action Localization (단일 프레임 지도 시간적 행동 지역화에서 1D 합성곱 층의 커널 사이즈 변화 연구)

  • Hyejeong Jo;Huiwon Gwon;Sunhee Jo;Chanho Jung
    • Journal of IKEEE / v.28 no.2 / pp.199-203 / 2024
  • In this paper, we propose variations in the kernel size of 1D convolutional layers for single-frame supervised temporal action localization. Building on the existing method, which uses two 1D convolutional layers with kernel sizes of 3 and 1, we introduce an approach that adjusts the kernel size of each 1D convolutional layer. To validate the proposed approach, we conducted comparative experiments on the THUMOS'14 dataset, using overall video classification accuracy, mAP (mean Average Precision), and Average mAP as evaluation metrics. According to the experimental results, the proposed approach achieves higher mAP and Average mAP than the existing method. The variant with kernel sizes of 7 and 1 further shows an 8.0% improvement in overall video classification accuracy.
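
To illustrate what changing the kernel size does, here is a minimal pure-Python sketch of a single-channel 1D convolution with zero "same" padding, plus a helper that computes how many input frames each output position sees after a stack of stride-1 layers. The function names, the single-channel simplification, and the odd-kernel assumption are illustrative and not taken from the paper.

```python
def conv1d_same(signal, kernel):
    """Single-channel 1D convolution (cross-correlation, as in deep
    learning frameworks) with zero padding so output length equals input
    length. Assumes an odd kernel size."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def receptive_field(kernel_sizes):
    """Number of input frames seen by one output position after stacking
    stride-1 convolutions with the given kernel sizes."""
    return 1 + sum(k - 1 for k in kernel_sizes)


features = [1, 2, 3, 4]  # stand-in for per-frame features
smoothed = conv1d_same(features, [1, 1, 1])

baseline_rf = receptive_field([3, 1])  # existing method: kernel sizes 3 and 1
wider_rf = receptive_field([7, 1])     # variant: kernel sizes 7 and 1
```

With kernel sizes of 3 and 1, each output frame aggregates 3 neighboring input frames; with 7 and 1 it aggregates 7, so the first layer sees a longer temporal context while the second layer (kernel size 1) only mixes channels per frame.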

Exploration of Neurophysiological Mechanisms underlying Action Performance Changes caused by Semantic Congruency between Perceived Action Verbs and Current Actions (지각된 행위동사와 현재 행위의 의미 일치성에 따른 행위 수행 변화의 신경생리학적 기전 탐색)

  • Rha, Younghyoun;Jeong, Myung Yung;Kwak, Jarang;Lee, Donghoon
    • Korean Journal of Cognitive Science / v.27 no.4 / pp.573-597 / 2016
  • Recent fMRI and EEG research on neural representations of action concepts suggests that processing action concepts evokes the simulation of sensory-motor information. Moreover, several behavioral studies show that understanding action verbs or sentences describing actions interferes with or facilitates current action performance. However, it is unclear whether the online interaction between the processing of action concepts and the current action is based on the simulation of sensory-motor information or on other neural mechanisms. The present research aims to explore the neural mechanism by which the perception of action language influences the performance of a current action, using EEG with high spatiotemporal resolution and multiple source analysis techniques. To this end, participants performed a cued-motor reaction task that required a button-pressing hand action or a pedal-stepping foot action according to the color of the cue. We auditorily presented action verbs describing the responding actions (i.e., /press/, /step/, /stop/) just before the color cue and examined the interaction effect of the semantic congruency between the action verbs and the current action. Behavioral results consistently revealed a facilitatory effect when action verbs and responding actions were semantically congruent, for both button-pressing and pedal-stepping actions, and an inhibitory effect when they were semantically incongruent in the button-pressing condition. In the EEG source waveform analysis, semantic congruency effects between action verbs and responding actions were observed in Wernicke's area during the perception of action verbs, in the anterior cingulate gyrus and the supplementary motor area (SMA) when the motor cue was presented, and in the SMA and primary motor cortex (M1) during the action execution stage.
Based on these findings, we argue that perceived action verbs evoke the facilitation/inhibition effect by influencing the expectation and preparation stages of the following action rather than by directly activating a particular motor cortex. Finally, we discuss the implications for the neural representation of action concepts and the methodological limitations of the current research.