• Title/Summary/Keyword: Video Identification

Search Results: 175

People Re-identification: A Multidisciplinary Challenge (사람 재식별: 학제간 연구 과제)

  • Cheng, Dong-Seon
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.12 no.6
    • /
    • pp.135-139
    • /
    • 2012
  • The wide diffusion of the internet and the overall increased reliance on technology for information communication, dissemination and gathering have created an unparalleled mass of data. Sifting through this data defines, and will define for the foreseeable future, a large part of contemporary computer science. Within this data, a growing proportion consists of personal information, which represents a unique opportunity to study human activities extensively and in real time. One important recurring challenge in many disciplines is the problem of people re-identification. In its broadest definition, re-identification is the problem of newly recognizing previously identified people, such as following an unknown person as they walk through many different surveillance cameras in different locations. Our goal is to review how several diverse disciplines define and meet this challenge, from person re-identification in video surveillance to authorship attribution in text samples to distinguishing users based on their picture preferences. We further envision situations where multidisciplinary solutions might be beneficial.

Video Scene Detection using Shot Clustering based on Visual Features (시각적 특징을 기반한 샷 클러스터링을 통한 비디오 씬 탐지 기법)

  • Shin, Dong-Wook;Kim, Tae-Hwan;Choi, Joong-Min
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.47-60
    • /
    • 2012
  • Video data is unstructured and complex in form. As the importance of efficient management and retrieval of video data increases, studies on video parsing based on the visual features contained in video contents have been conducted to reconstruct video data into a meaningful structure. Early studies on video parsing focused on splitting video data into shots, but detecting shot boundaries defined by physical boundaries does not consider the semantic association of video data. Recently, studies on structuring video shots with semantic association into video scenes, defined by semantic boundaries, by utilizing clustering methods have been actively pursued. Previous studies on video scene detection try to detect scenes by utilizing clustering algorithms based on similarity measures between video shots that mainly depend on color features. However, the correct identification of a video shot or scene and the detection of gradual transitions such as dissolves, fades and wipes are difficult because the color features of video data contain noise and change abruptly due to the intervention of unexpected objects. In this paper, to solve these problems, we propose the Scene Detector using Color histogram, corner Edge and Object color histogram (SDCEO), which clusters similar shots belonging to the same event based on visual features, including the color histogram, the corner edge and the object color histogram, to detect video scenes. The SDCEO is noteworthy in that it uses the edge feature together with the color feature, and as a result it effectively detects gradual transitions as well as abrupt transitions. The SDCEO consists of the Shot Bound Identifier and the Video Scene Detector. The Shot Bound Identifier comprises the Color Histogram Analysis step and the Corner Edge Analysis step.
In the Color Histogram Analysis step, SDCEO uses the color histogram feature to organize shots. The color histogram, recording the percentage of each quantized color among all pixels in a frame, is chosen for its good performance, as also reported in other work on content-based image and video analysis. To organize shots, SDCEO joins associated sequential frames into shots by measuring the similarity of the color histograms between frames. In the Corner Edge Analysis step, SDCEO identifies the final shot boundaries using the corner edge feature: it detects associated shots by comparing the corner edge feature between the last frame of the previous shot and the first frame of the next shot. In the Key-frame Extraction step, SDCEO compares each frame with all other frames, measures the similarity using the histogram Euclidean distance, and then selects the frame most similar to all frames in the same shot as the key-frame. The Video Scene Detector clusters associated shots belonging to the same event by utilizing hierarchical agglomerative clustering based on visual features including the color histogram and the object color histogram. After detecting video scenes, SDCEO organizes the final video scenes by repetitive clustering until the similarity distance between shots is less than the threshold h. In this paper, we construct a prototype of SDCEO and carry out experiments with manually constructed baseline data; the experimental results are satisfactory, with a shot boundary detection precision of 93.3% and a video scene detection precision of 83.3%.
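The histogram-based shot segmentation described above can be sketched as follows. This is a minimal pure-Python illustration, not the paper's implementation: the 8-bin quantization, the function names, and the threshold value are assumptions, and each frame is simplified to a flat list of RGB tuples.

```python
import math

def color_histogram(frame, bins=8):
    """Quantize each RGB pixel into bins^3 buckets and return the
    normalized histogram (fraction of pixels per bucket)."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for (r, g, b) in frame:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    n = len(frame)
    return [h / n for h in hist]

def hist_distance(h1, h2):
    """Euclidean distance between two normalized histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def detect_shot_boundaries(frames, threshold=0.5):
    """Mark a boundary between frames i-1 and i when the histogram
    distance exceeds the threshold (an abrupt cut)."""
    boundaries = []
    prev = color_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = color_histogram(frames[i])
        if hist_distance(prev, cur) > threshold:
            boundaries.append(i)
        prev = cur
    return boundaries
```

A fixed global threshold like this handles abrupt cuts only; as the abstract notes, gradual transitions and object-induced color changes require the additional corner edge analysis.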

A Feature Point Extraction and Identification Technique for Immersive Contents Using Deep Learning (딥 러닝을 이용한 실감형 콘텐츠 특징점 추출 및 식별 방법)

  • Park, Byeongchan;Jang, Seyoung;Yoo, Injae;Lee, Jaechung;Kim, Seok-Yoon;Kim, Youngmo
    • Journal of IKEEE
    • /
    • v.24 no.2
    • /
    • pp.529-535
    • /
    • 2020
  • As a main technology of the 4th industrial revolution, immersive 360-degree video contents are drawing attention. The worldwide market size of immersive 360-degree video contents is projected to increase from $6.7 billion in 2018 to approximately $70 billion in 2020. However, most immersive 360-degree video contents are distributed through illegal distribution networks such as Webhard and Torrent, and the damage caused by illegal reproduction is increasing. The existing 2D video industry uses copyright filtering technology to prevent such illegal distribution. The technical difficulties in dealing with immersive 360-degree videos arise because they require ultra-high-quality pictures and typically merge images captured by two or more cameras into one image, which creates distortion regions. There are also technical limitations such as an increase in the amount of feature point data due to the ultra-high definition and the resulting processing speed requirements. These considerations make it difficult to apply the same 2D filtering technology to 360-degree videos. To solve this problem, this paper proposes a feature point extraction and identification technique that selects object identification areas excluding regions with severe distortion, recognizes objects in those areas using deep learning technology, and extracts feature points using the identified object information. Compared with the previously proposed method of extracting feature points from the stitching area of immersive contents, the proposed technique shows an excellent performance gain.

Digital Video Source Identification Using Sensor Pattern Noise with Morphology Filtering (모폴로지 필터링 기반 센서 패턴 노이즈를 이용한 디지털 동영상 획득 장치 판별 기술)

  • Lee, Sang-Hyeong;Kim, Dong-Hyun;Oh, Tae-Woo;Kim, Ki-Bom;Lee, Hae-Yeoun
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.1
    • /
    • pp.15-22
    • /
    • 2017
  • With the advance of Internet technology, various social network services have been created and are widely used. In particular, smart devices allow multimedia content to be created and distributed on social network services. However, since crimes committed by users with illegal purposes have also increased, there is a need to protect content and block its illegal use through multimedia forensics. In this paper, we propose a multimedia forensic technique that identifies the source device of a video. First, a scheme to acquire the sensor pattern noise (SPN), which originates from imperfections in the photon detector, using morphology filtering is presented. Using this scheme, the SPN is estimated both for reference videos from a known reference device and for an unknown video. Then, the similarity between the two SPNs is measured to identify whether the unknown video was acquired with the reference device. For the performance analysis of the proposed technique, 30 devices, including DSLR cameras, compact cameras, camcorders, action cams and smartphones, are tested and quantitatively analyzed. Based on the results, the proposed technique achieves 96% identification accuracy.
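The SPN workflow (estimate a noise fingerprint per device, then compare fingerprints) can be sketched as follows. This is a toy illustration under stated assumptions: frames are flattened 1-D pixel lists, a crude mean filter stands in for the paper's morphology filtering, and normalized correlation with an assumed threshold stands in for the paper's similarity measure; all function names are hypothetical.

```python
import math

def mean_filter(frame):
    """Crude denoiser used as a stand-in for morphology filtering:
    replaces every pixel with the frame mean."""
    m = sum(frame) / len(frame)
    return [m] * len(frame)

def estimate_spn(frames, denoise=mean_filter):
    """Average the noise residual (frame minus denoised frame) over many
    frames; scene content averages out while the fixed sensor pattern
    noise remains."""
    n, length = len(frames), len(frames[0])
    spn = [0.0] * length
    for f in frames:
        d = denoise(f)
        for i in range(length):
            spn[i] += (f[i] - d[i]) / n
    return spn

def normalized_correlation(x, y):
    """Similarity between two SPN estimates, in [-1, 1]."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    xd = [a - mx for a in x]
    yd = [b - my for b in y]
    num = sum(a * b for a, b in zip(xd, yd))
    den = math.sqrt(sum(a * a for a in xd) * sum(b * b for b in yd))
    return num / den if den else 0.0

def same_device(spn_ref, spn_test, threshold=0.5):
    """Decide whether two videos come from the same sensor."""
    return normalized_correlation(spn_ref, spn_test) >= threshold
```

The key idea is that the multiplicative sensor fingerprint is the only component stable across frames, so averaging residuals over enough frames isolates it.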

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.89-106
    • /
    • 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly, and with it the requirements for analyzing and utilizing that data. Due to the lack of skilled manpower to analyze videos in many industries, machine learning and artificial intelligence are actively used to assist human analysts. In this situation, the demand for various computer vision technologies such as object detection and tracking, action detection, emotion detection, and Re-ID has also increased rapidly. However, object detection and tracking technology faces many difficulties that degrade performance, such as occlusion and re-appearance after an object leaves the recording location. Accordingly, action and emotion detection models built on object detection and tracking also have difficulty extracting data for each object. In addition, deep learning architectures consisting of various models suffer from performance degradation due to bottlenecks and a lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and AWS Rekognition, an emotion recognition service. The proposed system uses single-linkage hierarchical clustering-based Re-ID and processing methods that maximize hardware throughput. It achieves higher accuracy than re-identification models using simple metrics, offers near-real-time processing performance, and prevents tracking failures due to object departure and re-appearance, occlusion, and the like. By continuously linking the action and facial emotion detection results of each object to the same object, videos can be analyzed efficiently.
The re-identification model extracts a feature vector from the bounding box of each object image detected by the object tracking model in each frame, and applies single-linkage hierarchical clustering over the feature vectors extracted from past frames to identify objects whose tracks were lost. Through this process, an object that failed to be tracked, because of re-appearance or occlusion after leaving the recording location, can be re-tracked, so the action and facial emotion detection results of a newly recognized object can be linked to those of the same object that appeared in the past. As a way to improve processing performance, we introduce a per-object Bounding Box Queue and a Feature Queue method that reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which links facial emotions recognized through AWS Rekognition with object tracking information. The academic significance of this study is that, with the proposed processing techniques, the two-stage re-identification model can achieve near-real-time performance even in a computationally expensive setting that also performs action and facial emotion detection, without sacrificing accuracy by resorting to simple metrics. The practical implication is that various industrial fields that require action and facial emotion detection, but struggle with object tracking failures, can analyze videos effectively with the proposed system. With its high re-tracking accuracy and processing performance, the system can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where integrating tracking information with extracted metadata creates great industrial and business value.
In the future, to measure object tracking performance more precisely, an experiment should be conducted using the MOT Challenge dataset, a benchmark used in many international conferences. We will also investigate cases that the IoF algorithm cannot handle and develop a complementary algorithm. In addition, we plan to apply this model to datasets from various fields related to intelligent video analysis.
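The single-linkage agglomeration over Re-ID feature vectors can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature vectors, the Euclidean metric, the threshold, and all names are assumptions, and the O(n^4) loop is for clarity rather than the throughput-optimized pipeline the paper describes.

```python
import math

def euclidean(a, b):
    """Distance between two Re-ID feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage_clusters(vectors, threshold):
    """Agglomerative single-linkage clustering: repeatedly merge the two
    clusters whose closest pair of members is nearest, as long as that
    distance stays under the threshold. Returns clusters as lists of
    indices into `vectors`; detections in one cluster are treated as
    the same identity."""
    clusters = [[i] for i in range(len(vectors))]
    while True:
        best = None  # (distance, i, j)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(vectors[a], vectors[b])
                        for a in clusters[i] for b in clusters[j])
                if d < threshold and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:
            return clusters
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
```

Single linkage suits re-appearance handling because a lost track only needs one close match against any past detection of the same identity to be merged back into its cluster.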

A Study on Hypermap Database (하이퍼맵 데이타베이스에 관한 연구)

  • Kim, Yong-Il;Pyeon, Mu-Wook
    • Journal of Korean Society for Geospatial Information Science
    • /
    • v.4 no.1 s.6
    • /
    • pp.43-55
    • /
    • 1996
  • The objective of this research is to design a digital map database structure supporting video images, one of the fundamental elements of a hypermap. To reach this objective, the work includes identifying the relationships between a two-dimensional digital map database and video elements. The proposed database model provides functions for interactive browsing between video image frames and specific points on the two-dimensional digital map, and for connecting map elements with features on video images. The images and the database are then loaded into a pilot system for testing the map database structure. The pilot project results indicate that the map database structure can functionally integrate a two-dimensional digital map with video images.


Face Detection and Recognition for Video Retrieval (비디오 검색을 위한 얼굴 검출 및 인식)

  • Islam, Mohammad Khairul;Lee, Hyung-Jin;Paul, Anjan Kumar;Baek, Joong-Hwan
    • Journal of Advanced Navigation Technology
    • /
    • v.12 no.6
    • /
    • pp.691-698
    • /
    • 2008
  • We present novel face detection and recognition methods applicable to video retrieval. The person-matching efficiency largely depends on how robustly faces are detected in the video frames. Face regions are detected in video frames using Viola-Jones features boosted with the AdaBoost algorithm. After face detection, illumination compensation is applied, followed by PCA (Principal Component Analysis) to extract features, which are classified by an SVM (Support Vector Machine) for person identification. Experimental results show that the matching efficiency of the ensemble architecture is quite satisfactory.
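The PCA feature-extraction step of such a pipeline can be illustrated in isolation. This is a toy sketch only, covering neither Viola-Jones detection nor the SVM: it estimates the first principal component of flattened face vectors by power iteration, and the function names, the iteration count, and the starting direction are assumptions.

```python
import math

def top_principal_component(vectors, iters=100):
    """Estimate the mean and first principal component of a set of
    flattened face vectors via power iteration, applying X and X^T
    in turn so no d x d covariance matrix is ever formed."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(d)]
    X = [[v[i] - mean[i] for i in range(d)] for v in vectors]
    w = [1.0] * d  # start direction; assumed not orthogonal to the PC
    for _ in range(iters):
        proj = [sum(x[i] * w[i] for i in range(d)) for x in X]  # X w
        w = [sum(proj[j] * X[j][i] for j in range(n)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        w = [c / norm for c in w]
    return mean, w

def project(v, mean, w):
    """One-dimensional PCA feature: the centered vector's coordinate
    along the principal component, which an SVM would then classify."""
    return sum((v[i] - mean[i]) * w[i] for i in range(len(v)))
```

In a real eigenface setting, several components would be kept and each face reduced to a short coefficient vector before classification.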


Device Identification System for Corporate Internal Network Visibility in IoT Era (IoT 시대 기업 내부 네트워크의 가시성 확보를 위한 단말 식별 시스템 설계)

  • Lee, Dae-Hyo;Kim, Yong-Kwon;Lee, Dong-Bum;Kim, Hyeob
    • Convergence Security Journal
    • /
    • v.19 no.3
    • /
    • pp.51-59
    • /
    • 2019
  • In this paper, we propose a device identification system for network visibility that can maintain a secure internal network environment in the IoT era. Recently, enterprise networks have become huge and complicated. Not only desktops and smartphones but also business pads, barcode scanners, APs, video surveillance cameras, digital doors, security devices, and many other Internet of Things (IoT) devices are rapidly pouring into business networks, bringing a high risk of security threats. Therefore, we propose a device identification system that includes the process and module-specific functions needed to identify the exploding number of devices in the IoT era. The proposed system gives a company's IT manager in-depth visibility into the devices and their vulnerabilities. This information helps mitigate the risk of potential cyber security threats in the internal network and offers unified security management against business risks.

Telemedicine for Real-Time Multi-Consultation

  • Chun Hye J.;Youn HY;Yoo Sun K.
    • Journal of Biomedical Engineering Research
    • /
    • v.26 no.5
    • /
    • pp.301-307
    • /
    • 2005
  • We introduce a new multimedia telemedicine system called Telemedicine for Real-time Emergency Multi-consultation (TREM), based on multiple connections between medical specialists. Due to the subdivision of medical specialties, the existing one-to-one telemedicine model needs to be modified into a simultaneous multi-consultation system. To facilitate consultation, the designed system includes the following modules: high-quality video, video conferencing, bio-signal transmission, and file transmission. To enhance the operability of the system in different network environments, we let the user choose appropriate acquisition sources for multimedia data as well as video resolutions. We tested this system set up in three different places: an emergency room, a radiologist's office, and a surgeon's office. All three communicating systems successfully connected to the multi-consultation center and exchanged data simultaneously in real time.

Automatic Name Line Detection for Person Indexing Based on Overlay Text

  • Lee, Sanghee;Ahn, Jungil;Jo, Kanghyun
    • Journal of Multimedia Information System
    • /
    • v.2 no.1
    • /
    • pp.163-170
    • /
    • 2015
  • Many overlay texts are artificially superimposed on broadcast videos. These texts provide additional information about the audiovisual content. In particular, the overlay text in news videos contains a concise and direct description of the content, and is therefore the most reliable clue for constructing a news video indexing system. To enable automatic person indexing of interview videos in TV news programs, this paper proposes a method to detect only the name text line among all overlay texts in a frame. Experimental results on Korean television news videos show that the proposed framework efficiently detects the overlaid name text line.