• Title/Summary/Keyword: video-based recognition system

192 search results

Development of a Video Caption Recognition System for Sport Event Broadcasting (스포츠 중계를 위한 자막 인식 시스템 개발)

  • Oh, Ju-Hyun
• Proceedings of the HCI Society of Korea Conference
    • /
    • 2009.02a
    • /
    • pp.94-98
    • /
    • 2009
  • A video caption recognition system has been developed for broadcasting sport events such as Major League Baseball. The purpose of the system is to translate information expressed in English units, such as miles per hour (MPH), into International System of Units (SI) equivalents such as km/h. The system detects the ball speed displayed in the video and recognizes the numerals. The ball speed is then converted to km/h and displayed by the downstream character generator (CG) system. Although neural-network-based methods are widely used for character and numeral recognition, we use template matching to avoid the training process that would be required before the broadcast. With the proposed template matching method, the operator can cope with situations where the caption's appearance changes without notice. Templates are configured by the operator from a captured screenshot of the first pitch that shows a ball speed, and are updated with subsequent correct recognition results. The accuracy of the recognition module is over 97%, which is still not sufficient for live broadcasting. When the recognition confidence is low, the system therefore asks the operator for the correct result, which the operator selects using hot keys.

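The two core steps described in the abstract above, matching digit patches against operator-configured templates and converting MPH to km/h, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the 3x3 binary templates stand in for digit glyphs that a real system would crop from the first pitch's on-screen caption, and the normalized-correlation scoring is one common way to realize template matching.

```python
import numpy as np

# Hypothetical 3x3 binary "templates" standing in for captured digit glyphs;
# a real system would crop these from a screenshot of the caption.
TEMPLATES = {
    "0": np.array([[1, 1, 1], [1, 0, 1], [1, 1, 1]]),
    "1": np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    "9": np.array([[1, 1, 1], [1, 1, 1], [0, 0, 1]]),
}

def match_digit(patch):
    """Return the template key with the highest normalized correlation,
    plus the score itself, usable as a recognition confidence."""
    best, best_score = None, -1.0
    p = patch.astype(float) - patch.mean()
    for key, tpl in TEMPLATES.items():
        t = tpl.astype(float) - tpl.mean()
        denom = np.linalg.norm(p) * np.linalg.norm(t)
        score = float((p * t).sum() / denom) if denom else 0.0
        if score > best_score:
            best, best_score = key, score
    return best, best_score

def mph_to_kmh(mph):
    """Convert a recognized MPH value to km/h for the CG system."""
    return round(mph * 1.609344, 1)

digit, conf = match_digit(np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]))
speed_kmh = mph_to_kmh(91)  # e.g. a 91 MPH fastball
```

A low `conf` score is the natural trigger for the operator prompt the abstract mentions.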

Research on Intelligent Anomaly Detection System Based on Real-Time Unstructured Object Recognition Technique (실시간 비정형객체 인식 기법 기반 지능형 이상 탐지 시스템에 관한 연구)

  • Lee, Seok Chang;Kim, Young Hyun;Kang, Soo Kyung;Park, Myung Hye
    • Journal of Korea Multimedia Society
    • /
    • v.25 no.3
    • /
    • pp.546-557
    • /
    • 2022
  • Recently, the demand for interpreting image data with artificial intelligence has been rapidly increasing in various fields. Object recognition and detection techniques using deep learning are mainly used, and integrated video analysis for recognizing unstructured objects is a particularly important problem. In the case of natural or social disasters, object recognition alone is limited because such objects have unstructured shapes. In this paper, we propose an intelligent integrated video analysis system that can recognize unstructured objects based on video turning points and object detection. We also introduce a method to apply and evaluate object recognition using virtual augmented images generated from 2D to 3D through a GAN.

Virtual Contamination Lane Image and Video Generation Method for the Performance Evaluation of the Lane Departure Warning System (차선 이탈 경고 시스템의 성능 검증을 위한 가상의 오염 차선 이미지 및 비디오 생성 방법)

  • Kwak, Jae-Ho;Kim, Whoi-Yul
    • Transactions of the Korean Society of Automotive Engineers
    • /
    • v.24 no.6
    • /
    • pp.627-634
    • /
    • 2016
  • In this paper, an augmented video generation method to evaluate the performance of lane departure warning systems is proposed. The input to our system is a video of a road scene with generally clean lanes; the output video has the same content, but the lanes are synthesized with contamination images. Two approaches are used to synthesize the contaminated lane images: example-based image synthesis and background-based image synthesis. Example-based synthesis assumes a situation where contamination is applied on top of the lane, while background-based synthesis covers the situation where the lane has been erased by aging. A new contamination pattern generation method using a Gaussian function is also proposed in order to produce contamination of various shapes and sizes. A contaminated lane video can then be generated by shifting the synthesized image by an empirically obtained lane-movement amount. Our experiments show that the similarity between the generated contaminated lane images and real lane images is over 90%. Furthermore, we verify the reliability of videos generated by the proposed method through analysis of the change in lane recognition rate: the recognition rate on videos generated by the proposed method is very similar to that on real contaminated lane videos.
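The Gaussian contamination pattern mentioned in the abstract above can be sketched as follows. This is an illustrative guess at the general idea, assuming the 2-D Gaussian serves as an alpha mask for blending a dirt texture over the clean lane image, with the sigma values controlling the blob's shape and size; the paper's actual formulation may differ.

```python
import numpy as np

def gaussian_contamination(h, w, cx, cy, sigma_x, sigma_y, strength=1.0):
    """2-D anisotropic Gaussian used as an alpha mask for synthetic lane dirt.
    Varying sigma_x/sigma_y and strength yields blobs of different shape and size."""
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-(((xs - cx) ** 2) / (2 * sigma_x ** 2)
                 + ((ys - cy) ** 2) / (2 * sigma_y ** 2)))
    return strength * g

def composite(lane, dirt, mask):
    """Alpha-blend a dirt texture over the clean lane image."""
    return (1.0 - mask) * lane + mask * dirt

lane = np.full((32, 32), 200.0)   # bright, clean lane marking
dirt = np.full((32, 32), 40.0)    # dark contamination texture
mask = gaussian_contamination(32, 32, cx=16, cy=16, sigma_x=4, sigma_y=8)
contaminated = composite(lane, dirt, mask)
```

Shifting `mask` frame by frame by the empirically obtained lane-movement amount would then yield a contaminated lane video.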

Real-time Identification of Traffic Light and Road Sign for the Next Generation Video-Based Navigation System (차세대 실감 내비게이션을 위한 실시간 신호등 및 표지판 객체 인식)

  • Kim, Yong-Kwon;Lee, Ki-Sung;Cho, Seong-Ik;Park, Jeong-Ho;Choi, Kyoung-Ho
    • Journal of Korea Spatial Information System Society
    • /
    • v.10 no.2
    • /
    • pp.13-24
    • /
    • 2008
  • A next-generation video-based car navigation system is being researched to overcome the drawbacks of existing 2D-based navigation and to provide various services for safe driving. The components of this navigation system include a road object database, an identification module for road lanes, a crossroad identification module, etc. In this paper, we propose a traffic light and road sign recognition method that can be effectively exploited for crossroad recognition in video-based car navigation systems. The method uses object color information and other spatial features in the video image. The results show an average recognition rate of 90% at 30-60 m for traffic lights and 97% at 40-90 m for road signs. The algorithm also achieves a processing time of 46 ms/frame, which indicates its suitability for real-time processing.

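The abstract above leans on object color information. A toy illustration of color-based light-state classification might look like the sketch below; the channel thresholds are invented for demonstration and are not the paper's values, which would also use spatial features.

```python
def classify_light(pixels):
    """Count pixels whose colour is dominated by red or green; the larger
    count decides the light state. Thresholds are illustrative only."""
    red = green = 0
    for r, g, b in pixels:
        if r > 150 and r > 2 * g and r > 2 * b:
            red += 1
        elif g > 150 and g > 2 * r and g > 2 * b:
            green += 1
    if red == green == 0:
        return "unknown"
    return "red" if red > green else "green"

# A mostly red region of interest with a few stray green pixels.
roi = [(220, 40, 30)] * 50 + [(30, 200, 40)] * 10
state = classify_light(roi)
```

In practice the region of interest would first be located using the spatial cues the abstract mentions, then classified.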

An Automatic Camera Tracking System for Video Surveillance

  • Lee, Sang-Hwa;Sharma, Siddharth;Lin, Sang-Lin;Park, Jong-Il
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2010.07a
    • /
    • pp.42-45
    • /
    • 2010
  • This paper proposes an intelligent video surveillance system for human object tracking. The proposed system integrates object extraction, human object recognition, face detection, and camera control. First, objects in the video signal are extracted using background subtraction. Then, each object region is examined to determine whether it is human. For this recognition, a region-based shape descriptor, the angular radial transform (ART) in MPEG-7, is used to learn and train the shapes of human bodies. When the object is decided to be a human, or something else to be investigated, the face region is detected. Finally, the face or object region is tracked in the video, and a pan/tilt/zoom (PTZ) controllable camera follows the moving object using its motion information. The simulation is performed with real CCTV cameras and their communication protocol. According to the experiments, the proposed system is able to track a moving object (human) automatically, not only in the image domain but also in real 3-D space. The proposed system reduces the need for human supervisors and improves surveillance efficiency through computer vision techniques.

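Background subtraction, the first stage of the pipeline described above, can be sketched as a running-average background model. This minimal version is an assumption about the general technique, not the paper's exact formulation; the learning rate and threshold are arbitrary.

```python
def update_background(bg, frame, alpha=0.05):
    """Exponential running average of the per-pixel background model."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]

def foreground_mask(bg, frame, thresh=30):
    """Pixels differing from the background by more than `thresh` are foreground."""
    return [[abs(f - b) > thresh for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]

bg = [[50.0] * 4 for _ in range(4)]   # learned empty-scene background
frame = [row[:] for row in bg]
frame[1][2] = 200.0                   # a bright moving object appears
mask = foreground_mask(bg, frame)     # object pixels flagged as foreground
bg = update_background(bg, frame)     # background slowly absorbs the frame
```

Downstream, the flagged region would be handed to the ART-based shape classifier and, if human, to face detection and PTZ tracking.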

Human Activity Recognition Using Spatiotemporal 3-D Body Joint Features with Hidden Markov Models

  • Uddin, Md. Zia;Kim, Jaehyoun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.6
    • /
    • pp.2767-2780
    • /
    • 2016
  • Video-based human-activity recognition has become increasingly popular due to its prominent applications in a variety of fields such as computer vision, image processing, smart-home healthcare, and human-computer interaction. The essential goal of a video-based activity-recognition system is to provide behavior-based information that enables functionality to proactively assist a person with his or her tasks. The target of this work is the development of a novel approach for human-activity recognition in which human-body-joint features extracted from depth videos are used. From the silhouette images taken at every depth, direction and magnitude features are first obtained from each connected body-joint pair; these are later augmented with the motion direction and magnitude features of each joint in the next frame. A generalized discriminant analysis (GDA) is applied to make the spatiotemporal features more robust, and the time-sequence features are then fed into a Hidden Markov Model (HMM) to train each activity. Lastly, all of the trained activity HMMs are used for depth-video activity recognition.
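The joint-pair direction and magnitude features described above can be illustrated with a small sketch. The two-joint skeleton and the 2-D direction angle are simplifications for clarity; the paper works with full 3-D body joints from depth video.

```python
import math

def pair_features(joints):
    """Direction (radians, in the x-y plane) and 3-D magnitude for each
    connected joint pair within one frame."""
    feats = []
    for (x1, y1, z1), (x2, y2, z2) in zip(joints, joints[1:]):
        dx, dy, dz = x2 - x1, y2 - y1, z2 - z1
        feats.append((math.atan2(dy, dx),
                      math.sqrt(dx * dx + dy * dy + dz * dz)))
    return feats

def motion_features(joints_t, joints_t1):
    """Per-joint motion direction and magnitude between consecutive frames."""
    feats = []
    for (x1, y1, z1), (x2, y2, z2) in zip(joints_t, joints_t1):
        dx, dy, dz = x2 - x1, y2 - y1, z2 - z1
        feats.append((math.atan2(dy, dx),
                      math.sqrt(dx * dx + dy * dy + dz * dz)))
    return feats

frame_t = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]    # e.g. hip -> shoulder
frame_t1 = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]   # shoulder moves right
spatial = pair_features(frame_t)
motion = motion_features(frame_t, frame_t1)
```

The concatenated spatial and motion features would then pass through GDA before HMM training.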

Face Spoofing Attack Detection Using Spatial Frequency and Gradient-Based Descriptor

  • Ali, Zahid;Park, Unsang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.2
    • /
    • pp.892-911
    • /
    • 2019
  • Biometric recognition systems have been widely used for information security. Among the most popular biometric traits are the fingerprint and the face, due to their high recognition accuracies. However, security systems that use face recognition as the login method are vulnerable to face-spoofing attacks using a printed photo or a video of a valid user. In this study, we propose a fast and robust method to detect face-spoofing attacks based on the analysis of spatial frequency differences between real and fake videos. We found that the effect of a spoofing attack stands out more prominently in certain regions of the 2-D Fourier spectrum, and it is therefore adequate to use information from those regions to classify the input video or image as real or fake. We adopt a divide-conquer-aggregate approach: we first divide the frequency-domain image into local blocks, classify each local block independently, and then aggregate all the classification results by a weighted sum. The effectiveness of the method is demonstrated on two publicly available databases: 1) the Replay-Attack Database and 2) the CASIA Face Anti-Spoofing Database. Experimental results show that the proposed method provides state-of-the-art performance while processing fewer frames of each video.
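The divide-conquer-aggregate scheme described above can be sketched as follows. The per-block scorer here is a placeholder (in the paper a trained classifier supplies the per-block decisions), and the block size, weights, and decision rule are arbitrary stand-ins.

```python
import numpy as np

def block_spectra(img, block=8):
    """Split the log-magnitude 2-D Fourier spectrum into local blocks."""
    spec = np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(img))))
    h, w = spec.shape
    return [spec[r:r + block, c:c + block]
            for r in range(0, h, block) for c in range(0, w, block)]

def aggregate(block_scores, weights):
    """Weighted sum of per-block scores; positive total means 'real'."""
    total = sum(s * w for s, w in zip(block_scores, weights))
    return "real" if total > 0 else "fake"

img = np.random.default_rng(0).random((32, 32))   # stand-in face image
blocks = block_spectra(img)
# Placeholder per-block scorer: a trained classifier would go here.
scores = [float(b.mean()) - 1.0 for b in blocks]
label = aggregate(scores, [1.0 / len(scores)] * len(scores))
```

Weighting lets the aggregation emphasize the spectral regions where spoofing artifacts are most prominent.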

Method of extracting context from media data by using video sharing site

  • Kondoh, Satoshi;Ogawa, Takeshi
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2009.01a
    • /
    • pp.709-713
    • /
    • 2009
  • Recently, much research applying data acquired from devices such as cameras and RFIDs to context-aware services has been performed in the fields of Life-Log and sensor networks. A variety of analytical techniques have been proposed to recognize information in the raw data, because video and audio data contain a larger volume of information than other sensor data. However, since these techniques generally use supervised learning, manually re-watching a huge amount of media data has been necessary to create supervised data whenever a class is updated or a new class is added. As a result, applications could in most cases use only recognition functions based on fixed supervised data. We therefore propose a method of acquiring supervised data from a video sharing site where users comment on video scenes; such sites are remarkably popular, so many comments are generated. In the first step of this method, words with a high utility value are extracted by filtering the comments about the video. Second, a set of time-series feature data is calculated by applying feature-extraction functions to the media data. Finally, our learning system calculates the correlation coefficient between these two kinds of data and stores it in the system's database. Other applications can then provide a recognition function that generates collective intelligence based on Web comments by applying this correlation coefficient to new media data. In addition, flexible recognition that adjusts to new objects becomes possible by regularly acquiring and learning both media data and comments from the video sharing site, while reducing manual work. As a result, recognition of not only the name of the object seen but also indirect information, e.g. the impression of, or the action toward, the object, is enabled.

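The final step above pairs a comment-derived word time series with a media-feature time series and measures their correlation, which reduces to a Pearson correlation coefficient. The word counts and loudness values below are hypothetical data invented for illustration.

```python
def pearson(xs, ys):
    """Pearson correlation between two equal-length time series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

# Hypothetical: how often the word "goal" appears in comments per scene,
# vs. an audio-loudness feature extracted from the same scenes.
word_counts = [0, 1, 5, 2, 0, 4]
loudness = [0.1, 0.3, 0.9, 0.4, 0.1, 0.8]
r = pearson(word_counts, loudness)
```

A strongly correlated word-feature pair would be stored in the system's database and applied to new media data as a recognizer.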

Recognition and tracking system of moving objects based on artificial neural network and PWM control

  • Sugisaka, M.
• Proceedings of the Institute of Control, Robotics and Systems (ICROS) Conference
    • /
    • 1992.10b
    • /
    • pp.573-574
    • /
    • 1992
  • We developed a recognition and tracking system for moving objects. The system consists of one CCD video camera, two DC motors on the horizontal and vertical axes with encoders, a pulse-width modulation (PWM) driving unit, a 16-bit NEC 9801 microcomputer, and their interfaces. The system is able to recognize the shape and size of a moving object and to track the object within a certain range of error. This paper presents a brief introduction to the recognition and tracking system developed in our laboratory.


Video Editing using Hand Gesture Tracking and Recognition (손동작 추적 및 인식을 이용한 비디오 편집)

  • Bae, Cheol-Soo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.11 no.1
    • /
    • pp.102-107
    • /
    • 2007
  • This paper presents a gesture-driven approach to video editing. Given a lecture video, we adopt novel approaches to automatically detect its content and synchronize it with electronic slides. The gestures in each synchronized topic (or shot) are then tracked and recognized continuously. By registering shots and slides and recovering their transformation, the regions where the gestures take place can be identified. Based on the recognized gestures and their registered positions, the information in the slides can be seamlessly extracted, not only to assist video editing but also to enhance the quality of the original lecture video. In experiments with two videos, the proposed system showed gesture recognition rates of 95.5% and 96.4%, respectively.