• Title/Summary/Keyword: Video recognition

A Real-time Face Recognition System using Fast Face Detection (빠른 얼굴 검출을 이용한 실시간 얼굴 인식 시스템)

  • Lee Ho-Geun;Jung Sung-Tae
    • Journal of KIISE:Software and Applications / v.32 no.12 / pp.1247-1259 / 2005
  • This paper proposes a real-time face recognition system that detects multiple faces in low-resolution video such as web-camera video. The system consists of a face detection step and a face classification step. First, it finds face region candidates using an AdaBoost-based object detection method, which is fast and robust, and generates a reduced feature vector for each candidate using principal component analysis (PCA). Second, it classifies the faces using PCA and a multi-SVM. Experimental results show that the proposed method achieves real-time face detection and face recognition on low-resolution video. Additionally, we implement an auto-tracking face recognition system using a pan-tilt web camera, and a radio on/off digital door-lock system combined with the face recognition system.

Video Based Face Spoofing Detection Using Fourier Transform and Dense-SIFT (푸리에 변환과 Dense-SIFT를 이용한 비디오 기반 Face Spoofing 검출)

  • Han, Hotaek;Park, Unsang
    • Journal of KIISE / v.42 no.4 / pp.483-486 / 2015
  • Security systems that use face recognition are vulnerable to spoofing attacks in which unauthorized individuals present a photo or video of an authorized user. In this work, we propose a method to detect a face spoofing attack that uses a video of an authorized person. The proposed method extracts features from three sequential video frames using the Fourier transform and a Dense-SIFT filter, then classifies them with a Support Vector Machine (SVM). Experiments on a database of 200 valid and 200 spoof video clips showed 99% detection accuracy. The proposed method uses simplified features that require less memory and computational overhead while achieving high spoofing detection accuracy.

Human Activity Recognition Based on 3D Residual Dense Network

  • Park, Jin-Ho;Lee, Eung-Joo
    • Journal of Korea Multimedia Society / v.23 no.12 / pp.1540-1551 / 2020
  • Existing human behavior recognition algorithms cannot fully utilize the multi-level spatio-temporal information of the network. To address this problem, a human behavior recognition algorithm based on a dense three-dimensional residual network is proposed. First, the algorithm uses a three-dimensional residual dense block as the basic module of the network; the module extracts hierarchical features of human behavior through densely connected convolutional layers. Second, an adaptive local feature aggregation method learns the local dense features of human behavior. Third, a residual connection module promotes the flow of feature information and reduces the difficulty of training. Finally, multi-layer local feature extraction is realized by cascading multiple three-dimensional residual dense blocks, and an adaptive global feature aggregation method learns the features of all network layers to recognize human behavior. Extensive experiments on the KTH benchmark dataset show that the recognition rate (top-1 accuracy) of the proposed algorithm reaches 93.52%, an improvement of 3.93 percentage points over the three-dimensional convolutional neural network (C3D) algorithm. The proposed framework has good robustness and transfer learning ability, and can effectively handle a variety of video behavior recognition tasks.

Improving Identification Performance by Integrating Evidence From Evidence

  • Park, Kwang-Chae;Kim, Young-Geil;Cheong, Ha-Young
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.9 no.6 / pp.546-552 / 2016
  • We present a quantitative evaluation of an algorithm for model-based face recognition. The algorithm actively learns how individual faces vary through video sequences, providing on-line suppression of confounding factors such as expression, lighting, and pose. By actively decoupling sources of image variation, the algorithm provides a framework in which identity evidence can be integrated over a sequence. We demonstrate that face recognition can be considerably improved by the analysis of video sequences. The method is applicable to many multi-class interpretation problems.
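
The abstract does not say how identity evidence is combined across frames. As a minimal sketch of the general idea, the Python below assumes each frame yields a likelihood per candidate identity, treats frames as conditionally independent with a uniform prior, and sums log-likelihoods over the sequence. The function name, the identities, and the numbers are all hypothetical.

```python
import math


def integrate_identity_evidence(frame_likelihoods):
    """Combine per-frame identity likelihoods into one posterior.

    frame_likelihoods: list of dicts mapping identity -> P(frame | identity).
    Assumes independent frames and a uniform prior (an illustrative
    simplification, not the model evaluated in the paper).
    """
    identities = list(frame_likelihoods[0].keys())
    log_scores = {name: 0.0 for name in identities}
    for likelihoods in frame_likelihoods:
        for name in identities:
            # The small floor avoids log(0) for a frame that rules out an identity.
            log_scores[name] += math.log(max(likelihoods[name], 1e-12))
    # Normalise in a numerically stable way (subtract the largest log score).
    peak = max(log_scores.values())
    unnormalised = {name: math.exp(score - peak) for name, score in log_scores.items()}
    total = sum(unnormalised.values())
    return {name: value / total for name, value in unnormalised.items()}


# Four frames of the same person; the second frame alone points the wrong way.
frames = [
    {"alice": 0.6, "bob": 0.4},
    {"alice": 0.4, "bob": 0.6},
    {"alice": 0.7, "bob": 0.3},
    {"alice": 0.6, "bob": 0.4},
]
posterior = integrate_identity_evidence(frames)
best_identity = max(posterior, key=posterior.get)
print(best_identity, round(posterior[best_identity], 3))
```

A single misleading frame is outweighed by the rest of the sequence, which is the intuition behind integrating evidence over video rather than deciding frame by frame.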

Displacement Measurement of Multi-point Using a Pattern Recognition from Video Signal (영상 신호에서 패턴인식을 이용한 다중 포인트 변위측정)

  • Jeon, Hyeong-Seop;Choi, Young-Chul;Park, Jong-Won
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.18 no.12 / pp.1256-1261 / 2008
  • This paper proposes a method for measuring displacement at multiple points using pattern recognition on a video signal. Displacement is generally measured with a gap sensor, which is a type of displacement sensor. However, a conventional sensor is difficult to use in places where attaching a sensor is impractical, such as high-temperature or radioactive areas. Such places call for non-contact measurement, and this study uses images from a CCD camera. Measuring multiple points with pattern recognition makes non-contact displacement measurement possible. The resulting multi-point displacement measuring device is simple to install, which helps overcome spatial constraints.
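
The abstract does not name the matching technique. One common way to realise pattern-based displacement measurement is template matching, sketched below under that assumption: each tracked point has its own small reference pattern, the pattern is located in two frames by minimising the sum of squared differences, and the displacement is the difference in position. All names and the tiny synthetic frames are hypothetical, and converting pixels to physical units would need a calibration factor that is not modelled here.

```python
def find_pattern(image, pattern):
    """Return (row, col) of the best match of pattern in image (minimum SSD)."""
    pattern_h, pattern_w = len(pattern), len(pattern[0])
    image_h, image_w = len(image), len(image[0])
    best_position, best_cost = None, float("inf")
    for r in range(image_h - pattern_h + 1):
        for c in range(image_w - pattern_w + 1):
            cost = sum(
                (image[r + i][c + j] - pattern[i][j]) ** 2
                for i in range(pattern_h)
                for j in range(pattern_w)
            )
            if cost < best_cost:
                best_position, best_cost = (r, c), cost
    return best_position


def measure_displacements(frame_before, frame_after, patterns):
    """Pixel displacement (d_row, d_col) of every tracked pattern between two frames."""
    displacements = {}
    for name, pattern in patterns.items():
        r0, c0 = find_pattern(frame_before, pattern)
        r1, c1 = find_pattern(frame_after, pattern)
        displacements[name] = (r1 - r0, c1 - c0)
    return displacements


def place(frame, pattern, top, left):
    """Draw a pattern into a synthetic frame (test helper)."""
    for i, row in enumerate(pattern):
        for j, value in enumerate(row):
            frame[top + i][left + j] = value


# Two target marks with different appearances, tracked in 10x12 grayscale frames.
patterns = {
    "point_a": [[255, 0], [0, 255]],
    "point_b": [[255, 255], [255, 255]],
}
before = [[0] * 12 for _ in range(10)]
after = [[0] * 12 for _ in range(10)]
place(before, patterns["point_a"], 1, 1)
place(before, patterns["point_b"], 5, 6)
place(after, patterns["point_a"], 2, 1)   # moved one pixel down
place(after, patterns["point_b"], 5, 8)   # moved two pixels right

result = measure_displacements(before, after, patterns)
print(result)
```

The exhaustive search is quadratic in image size; a practical system would restrict each search to a window around the previous position.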

Recognition of Road Surface Marks and Numbers Using Connected Component Analysis and Size Normalization (연결 성분 분석과 크기 정규화를 이용한 도로 노면 표시와 숫자 인식)

  • Jung, Min Chul
    • Journal of the Semiconductor & Display Technology / v.21 no.1 / pp.22-26 / 2022
  • This paper proposes a new method for recognizing road surface marks and numbers. The proposed method designates a region of interest on the road surface without first detecting a lane. Road surface markings are extracted by location and size using connected component analysis. Distortion due to perspective is minimized by normalizing the size of the road markings. Each connected component is then recognized by matching it against stored road marking templates. The proposed method is implemented in C on a Raspberry Pi 4 system with a camera module for real-time image processing. The system was mounted in a moving vehicle and recorded video in the manner of a vehicle black box. Each frame of the recorded video was extracted, and the proposed method was tested on those frames. The results show that the proposed method successfully recognizes road surface marks and numbers.
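
The pipeline described above (connected components, size normalization, template matching) can be sketched as follows. This is an illustrative Python version, not the authors' C implementation: it assumes a binarized image, 4-connectivity, nearest-neighbour resampling to a fixed grid, and a pixel-mismatch count as the matching score. The templates and the test scene are made up.

```python
from collections import deque


def connected_components(binary):
    """Bounding boxes (top, left, bottom, right) of 4-connected foreground regions."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def normalize(binary, box, size):
    """Crop a bounding box and resample it to size x size (nearest neighbour)."""
    top, left, bottom, right = box
    box_h, box_w = bottom - top + 1, right - left + 1
    return [
        [binary[top + (i * box_h) // size][left + (j * box_w) // size]
         for j in range(size)]
        for i in range(size)
    ]


def classify(patch, templates):
    """Name of the template with the fewest mismatching pixels."""
    def mismatch(a, b):
        return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))
    return min(templates, key=lambda name: mismatch(patch, templates[name]))


def upscale(pattern, factor):
    """Enlarge a pattern by an integer factor (test helper)."""
    return [[v for v in row for _ in range(factor)] for row in pattern for _ in range(factor)]


templates = {
    "L": [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 1, 1]],
    "T": [[1, 1, 1, 1], [0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]],
}

# Scene with a large "L" (nearer to the camera) and a small "T" (farther away).
scene = [[0] * 20 for _ in range(10)]
for i, row in enumerate(upscale(templates["L"], 2)):
    scene[1 + i][1:9] = row
for i, row in enumerate(templates["T"]):
    scene[3 + i][12:16] = row

labels = [classify(normalize(scene, box, 4), templates) for box in connected_components(scene)]
print(labels)
```

Because every component is resampled to the same grid before matching, the same template recognizes a marking whether it appears large or small in the frame, which is the role size normalization plays against perspective distortion.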

Sign language translation using video captioning and sign language recognition using action recognition (비디오 캡셔닝을 적용한 수어 번역 및 행동 인식을 적용한 수어 인식)

  • Gi-Duk Kim;Geun-Hoo Lee
    • Proceedings of the Korean Society of Computer Information Conference / 2024.01a / pp.317-319 / 2024
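  • This paper proposes a sign language translation algorithm based on video captioning and a sign language recognition algorithm based on action recognition. The video captioning algorithm embeds 40 consecutive input frames through a CNN and feeds the embeddings to a Transformer, which outputs a sentence. The action recognition algorithm applies random sampling, embeds 40 consecutive data items from 40 indices in each video through a CNN, and outputs the recognition result through an RNN model that combines a GRU and a Transformer. For sign language translation, the method obtained a BLEU-4 score of 7.85 and a CIDEr score of 53.12; for sign language recognition, it achieved a recognition accuracy of 96.26%.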

Design and Implementation of Emergency Recognition System based on Multimodal Information (멀티모달 정보를 이용한 응급상황 인식 시스템의 설계 및 구현)

  • Kim, Eoung-Un;Kang, Sun-Kyung;So, In-Mi;Kwon, Tae-Kyu;Lee, Sang-Seol;Lee, Yong-Ju;Jung, Sung-Tae
    • Journal of the Korea Society of Computer and Information / v.14 no.2 / pp.181-190 / 2009
  • This paper presents a multimodal emergency recognition system based on visual, audio, and gravity sensor information. It consists of a video processing module, an audio processing module, a gravity sensor processing module, and a multimodal integration module. The video processing module and the gravity sensor processing module each detect actions such as moving, stopping, and fainting, and transfer them to the multimodal integration module. The multimodal integration module detects an emergency by fusing the transferred information and verifies it by asking a question and recognizing the answer via the audio channel. Experimental results show that the recognition rate is 91.5% with the video processing module alone and 94% with the gravity sensor processing module alone, but reaches 100% when the two sources of information are combined.

Performance Analysis for Accuracy of Personality Recognition Models based on Setting of Margin Values at Face Region Extraction (얼굴 영역 추출 시 여유값의 설정에 따른 개성 인식 모델 정확도 성능 분석)

  • Qiu Xu;Gyuwon Han;Bongjae Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.1 / pp.141-147 / 2024
  • Recently, there has been growing interest in personalized services tailored to an individual's preferences. This has led to ongoing research aimed at recognizing and leveraging an individual's personality traits. Among various methods for personality assessment, the OCEAN model stands out as a prominent approach. Personality recognition with OCEAN often employs a multimodal artificial intelligence model that incorporates linguistic, paralinguistic, and non-linguistic information. This paper examines how the margin value set when extracting facial areas from video data affects the accuracy of a personality recognition model that uses facial expressions to determine OCEAN traits. The study employed personality recognition models based on 2D Patch Partition, R2plus1D, 3D Patch Partition, and Video Swin Transformer technologies. Setting the facial area extraction margin to 60 resulted in the highest 1-MAE performance, at 0.9118. These findings indicate the importance of selecting an optimal margin value to maximize the efficiency of personality recognition models.
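
To make the two quantities in this abstract concrete, here is a small Python sketch. It assumes the margin is a number of pixels added on every side of the detected face box and clamped to the frame, and that 1-MAE is one minus the mean absolute error over the five OCEAN scores on a 0 to 1 scale. The abstract states neither the margin's unit nor the exact metric definition, so both are assumptions, and the box coordinates and scores are invented.

```python
def expand_face_box(box, margin, frame_width, frame_height):
    """Grow a (left, top, right, bottom) face box by margin pixels per side,
    clamped so the crop stays inside the frame."""
    left, top, right, bottom = box
    return (
        max(0, left - margin),
        max(0, top - margin),
        min(frame_width, right + margin),
        min(frame_height, bottom + margin),
    )


def one_minus_mae(predicted, target):
    """1 - mean absolute error; higher is better, 1.0 is a perfect prediction."""
    errors = [abs(p - t) for p, t in zip(predicted, target)]
    return 1.0 - sum(errors) / len(errors)


# Hypothetical detection in a 256x256 frame, widened with a margin of 60.
crop = expand_face_box((100, 80, 220, 230), margin=60, frame_width=256, frame_height=256)

# Hypothetical predicted vs. ground-truth OCEAN scores (O, C, E, A, N).
score = one_minus_mae([0.5, 0.6, 0.4, 0.7, 0.5], [0.6, 0.6, 0.5, 0.6, 0.5])
print(crop, round(score, 2))
```

A larger margin keeps more context (hair, neck, background) around the face, while a smaller one keeps the crop tight; the paper's experiment is a sweep over this one parameter.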

Audio and Video Bimodal Emotion Recognition in Social Networks Based on Improved AlexNet Network and Attention Mechanism

  • Liu, Min;Tang, Jun
    • Journal of Information Processing Systems / v.17 no.4 / pp.754-771 / 2021
  • In continuous dimensional emotion recognition, the parts that highlight emotional expression differ between modalities, and the modalities also differ in how strongly they influence the emotional state. This paper therefore studies the fusion of the two most important modalities in emotion recognition (voice and visual expression) and proposes a bimodal emotion recognition method that combines an improved AlexNet network with an attention mechanism. After simple preprocessing of the audio and video signals, audio features are first extracted using prior knowledge. Facial expression features are then extracted by the improved AlexNet network. Finally, a multimodal attention mechanism fuses the facial expression and audio features, and an improved loss function addresses the missing-modality problem, improving the robustness of the model and the performance of emotion recognition. The experimental results show that the concordance correlation coefficients (CCC) of the proposed model on the arousal and valence dimensions were 0.729 and 0.718, respectively, which are superior to those of several comparative algorithms.
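
For readers unfamiliar with the reported metric, the concordance correlation coefficient between predictions x and labels y is 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). The Python sketch below implements that standard definition with population (divide-by-n) statistics; the sample values are hypothetical and unrelated to the paper's data.

```python
def concordance_correlation(predicted, target):
    """Concordance correlation coefficient (CCC) using population statistics.

    Unlike Pearson correlation, CCC also penalises differences in mean and
    scale between predictions and labels.
    """
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    var_t = sum((t - mean_t) ** 2 for t in target) / n
    covariance = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target)) / n
    return 2 * covariance / (var_p + var_t + (mean_p - mean_t) ** 2)


labels = [1.0, 2.0, 3.0, 4.0]
perfect = concordance_correlation(labels, labels)
# Predictions that follow the labels exactly but are offset by a constant.
shifted = concordance_correlation([value + 1.0 for value in labels], labels)
print(perfect, round(shifted, 4))
```

The shifted predictions have a Pearson correlation of 1 with the labels, yet their CCC falls below 1, which is why CCC is preferred for continuous arousal and valence regression.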