• Title/Summary/Keyword: Object-based Video Recognition

Semantic Representation of Moving Object in Video Data Using Motion Ontology (Motion Ontology를 이용한 비디오내 객체 움직임의 의미표현)

  • Shin, Ju-Hyun;Kim, Pan-Koo
    • Journal of Korea Multimedia Society / v.10 no.1 / pp.117-127 / 2007
  • As multimedia data grows in value, demand for semantic recognition and retrieval of multimedia information is increasing. In this paper, we build a motion ontology and use it to represent the meaning of moving objects in video data. Following the WordNet structure, we reclassify the motion verbs used to describe moving objects and extend their semantic relations. The represented information is recorded in OWL/RDF(S). Because the data is described with ontologies, 'Is-A' and 'Equivalent' reasoning over it becomes possible, and moving objects can be represented semantically through ontology-based video annotation. We evaluated the accuracy of the system against a keyword-based system and obtained an improvement of approximately 10%.
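The 'Is-A' and 'Equivalent' reasoning mentioned in this abstract can be illustrated with a minimal sketch. The verb hierarchy, the equivalence groups, and the function names below are hypothetical and are not taken from the paper's ontology; a real system would load OWL/RDF(S) data rather than Python dictionaries.

```python
# Hypothetical motion-verb hierarchy: child verb -> parent verb (Is-A).
IS_A = {
    "walk": "travel",
    "run": "travel",
    "travel": "move",
    "climb": "rise",
    "rise": "move",
}

# Hypothetical groups of verbs treated as equivalent.
EQUIVALENT = [{"climb", "go_up"}, {"run", "sprint"}]


def canonical(term):
    """Map a verb to one representative of its equivalence group."""
    for group in EQUIVALENT:
        if term in group:
            return min(group)  # alphabetical pick keeps the choice stable
    return term


def ancestors(term):
    """Return the verb itself plus every verb it Is-A, nearest first."""
    current = canonical(term)
    chain = [current]
    while current in IS_A:
        current = IS_A[current]
        chain.append(current)
    return chain


def matches(annotation, query):
    """True if a clip annotated with `annotation` answers `query`."""
    return canonical(query) in ancestors(annotation)


if __name__ == "__main__":
    print(matches("go_up", "move"))   # equivalent to climb, which Is-A move
    print(matches("walk", "rise"))    # unrelated branches
```

A plain keyword match would miss a clip annotated "go_up" for the query "move"; the hierarchy lookup is what allows the broader query to succeed.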

A Study on Radar Video Fusion Systems for Pedestrian and Vehicle Detection (보행자 및 차량 검지를 위한 레이더 영상 융복합 시스템 연구)

  • Sung-Youn Cho;Yeo-Hwan Yoon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.1 / pp.197-205 / 2024
  • With driving safety the most important issue in the development and commercialization of autonomous vehicles, AI- and big-data-based algorithms are being studied to improve and optimize the recognition and detection of static and dynamic vehicles in front of and around the vehicle. Many studies recognize the same vehicle by combining the respective advantages of radar and camera, but they either do not use deep learning image processing or, because of radar performance limits, identify the same target only at short range. There is therefore a need for a fusion-based vehicle recognition method that builds a dataset from data collected by radar and camera equipment, calculates the error in the dataset, and recognizes detections as the same target. Because data errors arise when the same-object decision depends on where the radar and CCTV (video) are installed, this paper aims to develop a technology that links location information according to the installation location.
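One simple way to picture "calculating the error and recognizing detections as the same target" is gated nearest-neighbour association. The sketch below is an assumption for illustration, not the authors' method: it supposes both sensors report 2D ground-plane positions in metres, that the installation offset between them is known, and that `gate` is a hypothetical distance threshold.

```python
import math


def associate(radar_pts, camera_pts, offset=(0.0, 0.0), gate=1.5):
    """Greedily pair radar and camera detections that lie within `gate` metres.

    `offset` is the assumed (dx, dy) correction that maps camera coordinates
    into the radar frame, standing in for the installation-location link.
    Returns a list of (radar_index, camera_index, error_in_metres).
    """
    pairs = []
    used = set()
    for i, (rx, ry) in enumerate(radar_pts):
        best_j, best_err = None, gate
        for j, (cx, cy) in enumerate(camera_pts):
            if j in used:
                continue
            err = math.hypot(rx - (cx + offset[0]), ry - (cy + offset[1]))
            if err <= best_err:
                best_j, best_err = j, err
        if best_j is not None:
            used.add(best_j)
            pairs.append((i, best_j, best_err))
    return pairs


if __name__ == "__main__":
    radar = [(10.0, 2.0), (30.0, -1.0)]
    camera = [(29.0, -1.0), (9.0, 2.0), (50.0, 0.0)]
    for radar_idx, cam_idx, err in associate(radar, camera, offset=(1.0, 0.0)):
        print(radar_idx, cam_idx, round(err, 3))
```

The third camera detection stays unmatched because no radar point falls inside the gate, which is the behaviour one would want for an object seen by only one sensor.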

Panorama Background Generation and Object Tracking using Pan-Tilt-Zoom Camera (Pan-Tilt-Zoom 카메라를 이용한 파노라마 배경 생성과 객체 추적)

  • Paek, In-Ho;Im, Jae-Hyun;Park, Kyoung-Ju;Paik, Jun-Ki
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.3 / pp.55-63 / 2008
  • This paper presents a panorama background generation and object tracking technique using a Pan-Tilt-Zoom camera. The proposed method rapidly estimates local motion vectors using phase correlation matching at multiple prespecified local regions and minimizes the estimation error by vector quantization. The overlapped region is estimated from the local motion vectors to obtain the required image patches, which are then projected onto a cylinder and realigned to form the panoramic image. Object tracking is performed by extracting the object's motion and separating the foreground from the input image using background subtraction. The proposed PTZ-based method efficiently generates a stable panorama background that covers up to a 360-degree FOV. The proposed algorithm is designed for real-time implementation and can be applied to many commercial applications, such as object shape detection and face recognition in various video surveillance systems.
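Phase correlation, which this abstract uses to estimate local motion vectors, can be sketched in a few lines. The version below is a minimal illustration under simplifying assumptions: it uses a naive pure-Python DFT that is only practical for tiny patches, it assumes a circular (wrap-around) integer shift, and all function names are hypothetical. A real-time system would use an FFT library instead.

```python
import cmath
import random


def dft(values, inverse=False):
    """Naive 1-D discrete Fourier transform, O(n^2)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for m in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                    for k, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def dft2(image, inverse=False):
    """2-D DFT: transform every row, then every column."""
    rows = [dft(row, inverse) for row in image]
    cols = [dft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def estimate_shift(reference, moved, eps=1e-12):
    """Return (dy, dx) such that `moved` is `reference` circularly shifted."""
    ref_f, mov_f = dft2(reference), dft2(moved)
    cross = []
    for ref_row, mov_row in zip(ref_f, mov_f):
        row = []
        for a, b in zip(ref_row, mov_row):
            c = a.conjugate() * b
            # Keep only the phase; drop frequencies with no energy.
            row.append(c / abs(c) if abs(c) > eps else 0j)
        cross.append(row)
    corr = dft2(cross, inverse=True)
    positions = [(y, x) for y in range(len(corr)) for x in range(len(corr[0]))]
    # The correlation surface peaks at the displacement.
    return max(positions, key=lambda p: corr[p[0]][p[1]].real)


def shifted(image, dy, dx):
    """Circularly shift an image down by dy rows and right by dx columns."""
    h, w = len(image), len(image[0])
    return [[image[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


if __name__ == "__main__":
    rng = random.Random(0)
    patch = [[rng.random() for _ in range(8)] for _ in range(8)]
    print(estimate_shift(patch, shifted(patch, 2, 3)))
```

Normalizing the cross-power spectrum to unit magnitude is what makes the peak sharp, which is why the method tolerates brightness changes between frames better than plain cross-correlation.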

Road Sign Recognition and Geo-content Creation Schemes for Utilizing Road Sign Information (도로표지 정보 활용을 위한 도로표지 인식 및 지오콘텐츠 생성 기법)

  • Seung, Teak-Young;Moon, Kwang-Seok;Lee, Suk-Hwan;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society / v.19 no.2 / pp.252-263 / 2016
  • Road signs are important street furniture that give drivers information such as road conditions and driving directions. They are therefore a major target of image recognition for self-driving cars, ADAS (autonomous vehicle and intelligent driver assistance systems), and ITS (intelligent transport systems). In this paper, an enhanced road sign recognition system is proposed for an MMS (Mobile Mapping System) using a single camera and GPS. First, a road sign recognition scheme is proposed, composed of a detection step and a classification step. In the detection step, object candidate regions are extracted from image frames using a hybrid road sign detection scheme based on the color and shape features of road signs. In the classification step, the candidate region areas are compared with road sign templates. Second, a geo-marking scheme is proposed for geo-content that consists of a road sign image and its coordinate values. In a serious situation such as a car accident, this scheme can protect the geographical information of the road sign against illegal users. Experiments on a test video set covering road sign recognition, coordinate value estimation, and geo-marking confirm that the proposed schemes can be used for MMS in commercial areas.

Automated Modelling of Ontology Schema for Media Classification (미디어 분류를 위한 온톨로지 스키마 자동 생성)

  • Lee, Nam-Gee;Park, Hyun-Kyu;Park, Young-Tack
    • Journal of KIISE / v.44 no.3 / pp.287-294 / 2017
  • With the personal-media development that has emerged through various means such as UCC and SNS, many media studies have been completed for the purposes of analysis and recognition, thereby improving the object-recognition level. The focus of these studies is a classification of media that is based on a recognition of the corresponding objects, rather than the use of the title, tag, and scripter information. The media-classification task, however, is intensive in terms of the consumption of time and energy because human experts need to model the underlying media ontology. This paper therefore proposes an automated approach for the modeling of the media-classification ontology schema; here, the OWL-DL Axiom that is based on the frequency of the recognized media-based objects is considered, and the automation of the ontology modeling is described. The authors conducted media-classification experiments across 15 YouTube-video categories, and the media-classification accuracy was measured through the application of the automated ontology-modeling approach. The promising experiment results show that 1500 actions were successfully classified from 15 media events with an 86% accuracy.
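The idea of deriving class definitions from the frequency of recognized objects can be pictured with a small sketch. Everything here is an assumption for illustration: the categories, the object labels, the `min_support` threshold, and the overlap-based scoring are hypothetical, and plain Python sets stand in for OWL-DL axioms.

```python
from collections import Counter


def build_axioms(videos_by_category, min_support=0.6):
    """For each category, keep objects seen in at least `min_support` of its videos."""
    axioms = {}
    for category, videos in videos_by_category.items():
        # Count each object once per video, however often it appears.
        counts = Counter(obj for objects in videos for obj in set(objects))
        axioms[category] = sorted(
            obj for obj, count in counts.items()
            if count / len(videos) >= min_support
        )
    return axioms


def classify(objects, axioms):
    """Pick the category whose defining objects are best covered."""
    seen = set(objects)

    def coverage(category):
        required = axioms[category]
        return len(seen & set(required)) / len(required) if required else 0.0

    return max(axioms, key=coverage)


if __name__ == "__main__":
    training = {
        "cooking": [["pan", "knife", "person"], ["pan", "person"], ["knife", "pan"]],
        "soccer": [["ball", "person", "goal"], ["ball", "goal"], ["ball", "person"]],
    }
    learned = build_axioms(training)
    print(learned)
    print(classify(["pan", "knife"], learned))
```

In an ontology setting the learned object lists would become the necessary conditions of each media class, so no expert has to write them by hand.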

A Fast SIFT Implementation Based on Integer Gaussian and Reconfigurable Processor

  • Su, Le Tran;Lee, Jong Soo
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.2 no.3 / pp.39-52 / 2009
  • Scale Invariant Feature Transform (SIFT) is an effective algorithm in object recognition, panorama stitching, and image matching; however, due to its complexity, real-time processing is difficult to achieve with software approaches. This paper proposes using a reconfigurable hardware processor with an integer half-kernel Gaussian. The integer half-kernel Gaussian reduces the complexity of the Gaussian pyramid by about half, and the reconfigurable processor carries out a parallel implementation of a full-search Fast SIFT algorithm. We use a low-memory, fine-grain single instruction stream multiple data stream (SIMD) pixel processor that is currently being developed. This implementation fully exposes the available parallelism of the SIFT algorithm and exploits the processing and I/O capabilities of the processor, resulting in a system that can perform real-time image and video compression. We apply this novel implementation to images and measure its effectiveness. Experimental simulation results indicate that the proposed implementation is capable of real-time applications.
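The abstract does not give the kernel coefficients, so the following is only a sketch of how an integer half-kernel Gaussian could save work. It assumes the binomial kernel [1, 4, 6, 4, 1] as an integer approximation of a Gaussian; because the kernel is symmetric, only the centre and one side are stored, mirrored pixels are added before multiplying, and normalization becomes a bit shift.

```python
# Assumed integer Gaussian approximation: full kernel [1, 4, 6, 4, 1].
HALF_KERNEL = [6, 4, 1]  # centre tap first, then taps at distance 1 and 2
SHIFT = 4                # the full kernel sums to 16 == 2 ** 4


def smooth_row(pixels):
    """Blur one row of integer pixels using only half of the symmetric kernel."""
    n = len(pixels)
    out = []
    for i in range(n):
        acc = HALF_KERNEL[0] * pixels[i]
        for k in range(1, len(HALF_KERNEL)):
            left = pixels[max(i - k, 0)]        # clamp at the borders
            right = pixels[min(i + k, n - 1)]
            # One multiplication serves both mirrored taps.
            acc += HALF_KERNEL[k] * (left + right)
        # Add half of the divisor so the shift rounds to nearest.
        out.append((acc + (1 << (SHIFT - 1))) >> SHIFT)
    return out


if __name__ == "__main__":
    print(smooth_row([0, 0, 16, 0, 0]))
```

With this layout a 5-tap filter needs three multiplications per pixel instead of five, and no floating-point arithmetic, which is the kind of saving that matters on a fine-grain SIMD pixel processor.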

Using Ensemble Learning Algorithm and AI Facial Expression Recognition, Healing Service Tailored to User's Emotion (앙상블 학습 알고리즘과 인공지능 표정 인식 기술을 활용한 사용자 감정 맞춤 힐링 서비스)

  • Yang, Seong-yeon;Hong, Dahye;Moon, Jaehyun
    • Annual Conference of KIPS / 2022.11a / pp.818-820 / 2022
  • The keyword 'healing' is essential to the competitive society and culture of Koreans. In addition, as time spent at home has increased due to COVID-19, the demand for indoor healing services has grown. This paper therefore analyzes the user's facial expression so that people can receive various 'customized' healing services indoors and, based on that analysis, provides lighting, ASMR, video recommendation, and facial expression recording services. The face is first extracted through object detection from an image taken by the user, and the user's expression is then analyzed by applying an ensemble algorithm to the expression predictions of several CNN models.
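The abstract does not say which ensemble algorithm is used. As one common possibility, the sketch below shows soft voting, where the class probabilities from several models are averaged and the highest average wins. The label set and the probability values are hypothetical.

```python
EMOTIONS = ["happy", "sad", "neutral"]  # hypothetical label set


def soft_vote(model_probs, labels=EMOTIONS):
    """Average per-class probabilities across models and return the top label.

    `model_probs` holds one probability list per model, ordered like `labels`.
    """
    n_models = len(model_probs)
    averaged = [
        sum(probs[i] for probs in model_probs) / n_models
        for i in range(len(labels))
    ]
    best = max(range(len(labels)), key=averaged.__getitem__)
    return labels[best], averaged


if __name__ == "__main__":
    predictions = [
        [0.6, 0.3, 0.1],  # stand-in output of CNN model A
        [0.2, 0.5, 0.3],  # stand-in output of CNN model B
        [0.1, 0.6, 0.3],  # stand-in output of CNN model C
    ]
    print(soft_vote(predictions))
```

Here model A alone would answer "happy", but the averaged evidence favours "sad", which shows how an ensemble can override a single confident but outvoted model.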

Optimization of Deep Learning Model Based on Genetic Algorithm for Facial Expression Recognition (얼굴 표정 인식을 위한 유전자 알고리즘 기반 심층학습 모델 최적화)

  • Park, Jang-Sik
    • The Journal of the Korea institute of electronic communication sciences / v.15 no.1 / pp.85-92 / 2020
  • Deep learning shows outstanding performance in image and video analysis tasks such as object classification, object detection, and semantic segmentation. This paper shows that the performance of deep learning models can be affected by the characteristics of the training dataset, and proposes a method for selecting the activation function and optimization algorithm of a deep learning model for facial expression classification. Classification performance is compared and analyzed on the CK+, MMI, and KDEF datasets by applying various algorithms for each component of the deep learning model. Simulation results show that a genetic algorithm can be an effective solution for optimizing the components of a deep learning model.
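A genetic algorithm over model components can be sketched as follows. This is a toy illustration, not the paper's setup: the candidate activation functions and optimizers are assumed, and `toy_fitness` returns made-up scores in place of the validation accuracy that a real run would obtain by training a network for each candidate.

```python
import random

ACTIVATIONS = ["relu", "tanh", "sigmoid", "elu"]   # assumed search space
OPTIMIZERS = ["sgd", "adam", "rmsprop"]            # assumed search space


def toy_fitness(individual):
    """Stand-in for validation accuracy; the numbers are invented."""
    activation, optimizer = individual
    act_score = {"relu": 0.30, "tanh": 0.20, "sigmoid": 0.10, "elu": 0.25}
    opt_score = {"sgd": 0.10, "adam": 0.30, "rmsprop": 0.20}
    return act_score[activation] + opt_score[optimizer]


def run_ga(fitness, generations=15, pop_size=6, mutation_rate=0.2, seed=0):
    """Evolve (activation, optimizer) pairs; returns (best, best-per-generation)."""
    rng = random.Random(seed)
    population = [(rng.choice(ACTIVATIONS), rng.choice(OPTIMIZERS))
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]   # truncation selection
        children = [population[0]]              # elitism: keep the current best
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            child = [mother[0], father[1]]      # one-point crossover
            if rng.random() < mutation_rate:
                child[0] = rng.choice(ACTIVATIONS)
            if rng.random() < mutation_rate:
                child[1] = rng.choice(OPTIMIZERS)
            children.append(tuple(child))
        population = children
    return max(population, key=fitness), history


if __name__ == "__main__":
    best, history = run_ga(toy_fitness)
    print(best, history[-1])
```

Because the best individual is always carried over, the best score never drops between generations; in a real experiment each fitness call is a full training run, so small populations and few generations are the practical norm.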

Design of Vehicle-mounted Loading and Unloading Equipment and Autonomous Control Method using Deep Learning Object Detection (차량 탑재형 상·하역 장비의 설계와 딥러닝 객체 인식을 이용한 자동제어 방법)

  • Soon-Kyo Lee;Sunmok Kim;Hyowon Woo;Suk Lee;Ki-Baek Lee
    • The Journal of Korea Robotics Society / v.19 no.1 / pp.79-91 / 2024
  • Large warehouses are building automation systems to increase efficiency. However, small warehouses, military bases, and local stores are unable to introduce automated logistics systems due to lack of space and budget, and are handling tasks manually, failing to improve efficiency. To solve this problem, this study designed small loading and unloading equipment that can be mounted on transportation vehicles. The equipment can be controlled remotely and is automatically controlled from the point where pallets loaded with cargo are visible using real-time video from an attached camera. Cargo recognition and control command generation for automatic control are achieved through a newly designed deep learning model. This model is designed to be optimized for loading and unloading equipment and mission environments based on the YOLOv3 structure. The trained model recognized 10 types of pallets with different shapes and colors with an average accuracy of 100% and estimated the state with an accuracy of 99.47%. In addition, control commands were created to insert forks into pallets without failure in 14 scenarios assuming actual loading and unloading situations.

A Method for 3D Human Pose Estimation based on 2D Keypoint Detection using RGB-D information (RGB-D 정보를 이용한 2차원 키포인트 탐지 기반 3차원 인간 자세 추정 방법)

  • Park, Seohee;Ji, Myunggeun;Chun, Junchul
    • Journal of Internet Computing and Services / v.19 no.6 / pp.41-51 / 2018
  • Recently, in video surveillance, deep learning methods have been applied to intelligent video surveillance systems, enabling robust detection of events such as crime, fire, and abnormal phenomena. However, because 3D information is lost when the 3D real world is projected onto a 2D image, occlusion occurs, and the occlusion problem must be considered in order to detect objects and estimate poses accurately. In this paper, we therefore detect moving objects by adding depth information to the existing RGB information, which addresses occlusion in the object detection process. A convolutional neural network is then applied to the detected region to predict the positions of 14 keypoints of the human joints. Finally, to solve the self-occlusion problem that occurs during pose estimation, we describe a method for 3D human pose estimation that extends the estimation range to 3D space using the predicted 2D keypoints and a deep neural network. The 2D and 3D pose estimation results of this research can serve as readily usable data for future human behavior recognition and contribute to the development of industrial technology.
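The paper lifts 2D keypoints to 3D with a deep neural network. As a simpler geometric point of reference, and explicitly not the authors' method, the sketch below back-projects a 2D keypoint into camera coordinates using its depth value and the pinhole camera model. The intrinsic parameters are hypothetical.

```python
from typing import NamedTuple


class Intrinsics(NamedTuple):
    """Pinhole camera parameters in pixels (values used below are made up)."""
    fx: float
    fy: float
    cx: float
    cy: float


def back_project(u, v, depth_m, cam):
    """Map pixel (u, v) with depth in metres to a 3D point in the camera frame."""
    x = (u - cam.cx) * depth_m / cam.fx
    y = (v - cam.cy) * depth_m / cam.fy
    return (x, y, depth_m)


def lift_keypoints(keypoints_2d, depths_m, cam):
    """Back-project a list of (u, v) keypoints with matching depth values."""
    return [back_project(u, v, d, cam)
            for (u, v), d in zip(keypoints_2d, depths_m)]


if __name__ == "__main__":
    camera = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    joints = [(320.0, 240.0), (420.0, 240.0)]
    print(lift_keypoints(joints, [2.0, 2.0], camera))
```

Direct back-projection fails for self-occluded joints, since the depth map then reports the surface of the occluding body part; that limitation is one reason a learned 2D-to-3D regressor is attractive.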