• Title/Summary/Keyword: Video extraction


A study on automatic extraction of a moving object using optical flow (Optical flow 이론을 이용한 움직이는 객체의 자동 추출에 관한 연구)

  • 정철곤;김경수;김중규
    • Proceedings of the IEEK Conference
    • /
    • 2000.06d
    • /
    • pp.50-53
    • /
    • 2000
  • In this work, a new algorithm that automatically extracts a moving object from video images is presented. To extract the moving object, velocity vectors are estimated for each frame of the video. Using the estimated velocity vectors, the position of the object is determined; the object's coordinates are initialized as the seed, and the moving object is automatically segmented in the image plane by the region growing method. Applied to sequential images, the method is able to extract a moving object.
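
A minimal sketch of the pipeline this abstract describes, assuming OpenCV: dense optical flow supplies the velocity vectors, the centroid of the moving pixels seeds a flood-fill region growing step. Function names, thresholds, and parameters are illustrative, not from the paper.

```python
import cv2
import numpy as np

def extract_moving_object(prev_gray, curr_gray, curr_bgr, flow_thresh=1.0):
    # Estimate per-pixel velocity vectors between consecutive frames.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)
    moving = magnitude > flow_thresh
    if not moving.any():
        return None
    # Use the centroid of the moving pixels as the region-growing seed.
    ys, xs = np.nonzero(moving)
    seed = (int(xs.mean()), int(ys.mean()))
    # Region growing via flood fill around the seed point.
    mask = np.zeros((curr_bgr.shape[0] + 2, curr_bgr.shape[1] + 2), np.uint8)
    cv2.floodFill(curr_bgr.copy(), mask, seed, 255,
                  loDiff=(10, 10, 10), upDiff=(10, 10, 10),
                  flags=cv2.FLOODFILL_MASK_ONLY)
    return mask[1:-1, 1:-1]  # binary mask of the segmented object
```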


Video Caption Extraction in MPEG compressed video (압축 MPEG 비디오 상에서의 자막 검출 및 추출)

  • 전승수;김정림;오상욱;설상훈
    • Proceedings of the IEEK Conference
    • /
    • 2001.09a
    • /
    • pp.985-988
    • /
    • 2001
  • This paper extracts captions in video from the I-frames based on the DCT. The proposed caption detection and extraction method exploits the facts that captions show high contrast against the surrounding background and remain on screen for a certain period of time. First, regions whose contrast against the surrounding background is high are marked using the DCT coefficients of the I-frames. From these, frames containing captions are detected using the temporal and spatial characteristics of captions, and the caption regions within them are extracted.
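
A rough sketch of the high-contrast test, assuming decoded I-frames: the paper reads DCT coefficients directly from the MPEG stream, whereas here 8x8 DCTs are recomputed with OpenCV as a stand-in. The energy threshold is illustrative.

```python
import cv2
import numpy as np

def caption_candidate_blocks(gray, energy_thresh=2000.0):
    h8, w8 = gray.shape[0] // 8, gray.shape[1] // 8
    mask = np.zeros((h8, w8), bool)
    img = gray.astype(np.float32)
    for by in range(h8):
        for bx in range(w8):
            block = img[by*8:(by+1)*8, bx*8:(bx+1)*8]
            coeffs = cv2.dct(block)
            # AC energy (all coefficients except the DC term) approximates
            # local contrast; caption text produces high AC energy.
            ac_energy = (coeffs**2).sum() - coeffs[0, 0]**2
            mask[by, bx] = ac_energy > energy_thresh
    return mask  # True marks blocks likely covered by caption text
```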


Efficient Video Retrieval Scheme with Luminance Projection Model (휘도투시모델을 적용한 효율적인 비디오 검색기법)

  • Kim, Sang Hyun
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.12
    • /
    • pp.8649-8653
    • /
    • 2015
  • A number of video indexing and retrieval algorithms have been proposed to manage large video databases efficiently. The video similarity measure is one of the most important technical factors in a video content management system. In this paper, we propose a luminance characteristics model to measure video similarity efficiently. Most video indexing algorithms have commonly used histograms, edges, or motion features, whereas the proposed algorithm employs an efficient similarity measure based on the luminance projection. To index video sequences effectively and to reduce the computational complexity, we calculate video similarity using key frames extracted by a cumulative measure and compare the sets of key frames using the modified Hausdorff distance. Experimental results show that the proposed luminance projection model yields remarkably improved accuracy and performance over conventional algorithms such as the histogram comparison method, with low computational complexity.
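
A minimal sketch of the two ingredients named in the abstract, under the assumption that a luminance projection is the row/column mean of the luminance channel, and that the modified Hausdorff distance follows the Dubuisson-Jain definition. Names are illustrative.

```python
import numpy as np

def luminance_projection(gray):
    # Concatenate the vertical and horizontal luminance projections
    # into one signature vector per frame.
    return np.concatenate([gray.mean(axis=0), gray.mean(axis=1)])

def modified_hausdorff(a, b):
    # a, b: key-frame signature sets, shapes (n, d) and (m, d).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return max(d.min(axis=1).mean(), d.min(axis=0).mean())
```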

Video Scene Detection using Shot Clustering based on Visual Features (시각적 특징을 기반한 샷 클러스터링을 통한 비디오 씬 탐지 기법)

  • Shin, Dong-Wook;Kim, Tae-Hwan;Choi, Joong-Min
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.47-60
    • /
    • 2012
  • Video data comes in an unstructured and complex form. As the importance of efficient management and retrieval of video data increases, studies on video parsing based on the visual features of video content have been conducted to reconstruct video data into a meaningful structure. Early studies on video parsing focused on splitting video data into shots, but detecting shot boundaries, which are defined as physical boundaries, does not consider the semantic associations within video data. Recently, studies that use clustering methods to structure semantically associated shots into video scenes, which are defined by semantic boundaries, have been actively pursued. Previous studies on video scene detection try to detect scenes with clustering algorithms based on similarity measures between shots that depend mainly on color features. However, correctly identifying a shot or scene and detecting gradual transitions such as dissolves, fades, and wipes is difficult, because the color features of video data are noisy and change abruptly when an unexpected object intervenes. In this paper, to solve these problems, we propose the Scene Detector using Color histogram, corner Edge, and Object color histogram (SDCEO), which detects video scenes by clustering similar shots belonging to the same event based on visual features, namely the color histogram, the corner edge, and the object color histogram. SDCEO is notable in that it uses the edge feature together with the color feature, and as a result it effectively detects gradual as well as abrupt transitions. SDCEO consists of the Shot Bound Identifier and the Video Scene Detector. The Shot Bound Identifier comprises a Color Histogram Analysis step and a Corner Edge Analysis step. In the Color Histogram Analysis step, SDCEO uses the color histogram feature to organize shot boundaries. The color histogram, which records the percentage of each quantized color among all pixels in a frame, is chosen for its good performance, as also reported in other work on content-based image and video analysis. To organize shot boundaries, SDCEO joins associated sequential frames into shots by measuring the similarity of the color histograms between frames. In the Corner Edge Analysis step, SDCEO identifies the final shot boundaries using the corner edge feature: it verifies candidate boundaries by comparing the corner edge feature between the last frame of the previous shot and the first frame of the next shot. In the Key-frame Extraction step, SDCEO compares each frame with all frames in the same shot, measures similarity using the histogram Euclidean distance, and selects the frame most similar to all the others as the key frame. The Video Scene Detector clusters associated shots belonging to the same event using hierarchical agglomerative clustering based on visual features, namely the color histogram and the object color histogram. After detecting video scenes, SDCEO organizes the final video scenes by repeated clustering until the similarity distance between shots falls below a threshold h.
In this paper, we construct a prototype of SDCEO and carry out experiments on manually constructed baseline data; the experimental results, a precision of 93.3% for shot boundary detection and 83.3% for video scene detection, are satisfactory.
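
A compact sketch of the two SDCEO boundary tests, assuming OpenCV: frames join the current shot while their color histograms stay similar, and candidate boundaries are re-checked with a corner count (Harris corners stand in for the paper's corner edge feature). Thresholds are illustrative.

```python
import cv2
import numpy as np

def color_hist(frame_bgr, bins=16):
    # Quantized 3-D color histogram, normalized for comparison.
    hist = cv2.calcHist([frame_bgr], [0, 1, 2], None,
                        [bins] * 3, [0, 256] * 3)
    return cv2.normalize(hist, hist).flatten()

def is_shot_boundary(prev_bgr, curr_bgr, hist_thresh=0.5, corner_thresh=0.3):
    # Step 1: color histogram similarity between adjacent frames.
    sim = cv2.compareHist(color_hist(prev_bgr), color_hist(curr_bgr),
                          cv2.HISTCMP_CORREL)
    if sim > hist_thresh:
        return False
    # Step 2: confirm the candidate with the change in corner response.
    corners = [cv2.cornerHarris(
                   cv2.cvtColor(f, cv2.COLOR_BGR2GRAY).astype(np.float32),
                   2, 3, 0.04)
               for f in (prev_bgr, curr_bgr)]
    c0, c1 = [(c > 0.01 * c.max()).sum() for c in corners]
    return abs(c0 - c1) / max(c0, c1, 1) > corner_thresh
```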

PCA-Based MPEG Video Retrieval in Compressed Domain (PCA에 기반한 압축영역에서의 MPEG Video 검색기법)

  • 이경화;강대성
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.40 no.1
    • /
    • pp.28-33
    • /
    • 2003
  • This paper proposes a database indexing and retrieval method using PCA (Principal Component Analysis). We perform scene change detection and key frame extraction on the DC images constructed from the DCT DC coefficients of a compressed video stream conforming to a video compression standard such as MPEG. Applying PCA to the extracted key frames, we build a codebook whose codewords carry their statistical data, and save it as a database index. We then retrieve the images in the video database that are similar to a user's query image. Experimental results confirm that the proposed method clearly shows superior video retrieval performance with reduced computation time and memory space.
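
A small sketch of the indexing idea, assuming scikit-learn: PCA projects key-frame DC images into a low-dimensional space, and the projected vectors serve as the codebook that indexes the database. Dimensions and names are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

def build_index(key_frames, n_components=16):
    # key_frames: (n, h, w) array of DC images from extracted key frames.
    X = key_frames.reshape(len(key_frames), -1).astype(np.float64)
    pca = PCA(n_components=n_components).fit(X)
    codebook = pca.transform(X)          # one codeword per key frame
    return pca, codebook

def query(pca, codebook, query_frame):
    # Rank database key frames by distance to the query in PCA space.
    q = pca.transform(query_frame.reshape(1, -1).astype(np.float64))
    dists = np.linalg.norm(codebook - q, axis=1)
    return np.argsort(dists)             # best matches first
```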

Extraction of Superimposed-Caption Frame Scopes and Its Regions for Analyzing Digital Video (비디오 분석을 위한 자막프레임구간과 자막영역 추출)

  • Lim, Moon-Cheol;Kim, Woo-Saeng
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.11
    • /
    • pp.3333-3340
    • /
    • 2000
  • Recently, requirements for video data have increased rapidly owing to the great progress of both hardware and compression techniques. Because digital video data are unstructured and of massive capacity, they call for various retrieval techniques such as content-based retrieval. Superimposed captions in digital video can help us analyze the video story more easily and can serve as indexing information for many retrieval techniques. In this research we propose a new method that segments captions by analyzing the texture features of caption regions in each video frame, and that extracts the accurate scope of superimposed-caption frames together with their key regions and colors by measuring the continuity of caption regions between frames.
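
A sketch of the continuity test described above, assuming a per-frame caption mask is already available (e.g. from a texture classifier): a caption frame scope is a maximal run of frames whose caption masks overlap strongly. The overlap threshold is illustrative.

```python
import numpy as np

def caption_frame_scopes(masks, overlap_thresh=0.8):
    # masks: list of boolean arrays, one caption mask per frame.
    scopes, start = [], None
    for i in range(1, len(masks)):
        inter = np.logical_and(masks[i - 1], masks[i]).sum()
        union = np.logical_or(masks[i - 1], masks[i]).sum()
        continuous = union > 0 and inter / union >= overlap_thresh
        if continuous and start is None:
            start = i - 1                 # a caption scope begins
        elif not continuous and start is not None:
            scopes.append((start, i - 1)) # the scope ends here
            start = None
    if start is not None:
        scopes.append((start, len(masks) - 1))
    return scopes  # list of (first_frame, last_frame) caption scopes
```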


Semantic Scenes Classification of Sports News Video for Sports Genre Analysis (스포츠 장르 분석을 위한 스포츠 뉴스 비디오의 의미적 장면 분류)

  • Song, Mi-Young
    • Journal of Korea Multimedia Society
    • /
    • v.10 no.5
    • /
    • pp.559-568
    • /
    • 2007
  • Anchor-person scene detection is significant for video shot semantic parsing and for extracting indexing clues in content-based news video indexing and retrieval systems. This paper proposes an efficient algorithm that extracts the anchor-person ranges in sports news video for structuring the news into units. To detect anchor-person scenes, candidate scenes are first selected using the DCT coefficients and motion vector information in the MPEG-4 compressed video. Then, image processing methods are applied to the candidate scenes to classify the news video into anchor-person scenes and non-anchor (sports) scenes. The proposed scheme achieves a mean precision and recall of 98% in the anchor-person scene detection experiment.
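
An illustrative sketch of the candidate test only, on decoded frames: the paper works on DCT coefficients and motion vectors in the compressed domain, whereas here optical flow stands in for the motion vector information. Anchor-person shots are largely static, so frames with low mean motion become candidates. The threshold is illustrative.

```python
import cv2
import numpy as np

def is_anchor_candidate(prev_gray, curr_gray, motion_thresh=0.5):
    # Low average motion between consecutive frames suggests a static
    # (anchor-person) shot rather than live sports footage.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mean_motion = np.linalg.norm(flow, axis=2).mean()
    return mean_motion < motion_thresh
```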


A Video Expression Recognition Method Based on Multi-mode Convolution Neural Network and Multiplicative Feature Fusion

  • Ren, Qun
    • Journal of Information Processing Systems
    • /
    • v.17 no.3
    • /
    • pp.556-570
    • /
    • 2021
  • Existing video expression recognition methods mainly focus on spatial feature extraction from video expression images but tend to ignore the dynamic features of video sequences. To solve this problem, a multi-mode convolutional neural network method is proposed to effectively improve the performance of facial expression recognition in video. First, OpenFace 2.0 is used to detect face images in the video, and two deep convolutional neural networks are used to extract spatiotemporal expression features: a spatial convolutional neural network extracts the spatial information features of each static expression image, and the dynamic information features are extracted from the optical flow of multiple expression images by a temporal convolutional neural network. Then, the spatiotemporal features learned by the two networks are fused by multiplication. Finally, the fused features are input into a support vector machine to perform the facial expression classification. Experimental results show that the recognition accuracy of the proposed method reaches 64.57% and 60.89% on the RML and BAUM-1s datasets, respectively, outperforming the compared methods.
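
A schematic sketch of the fusion and classification stage, assuming the two networks are already trained and emit feature vectors of equal length: the streams are fused by elementwise multiplication and classified with an SVM. The feature extractors are stubbed out; names are illustrative.

```python
import numpy as np
from sklearn.svm import SVC

def fuse(spatial_feats, temporal_feats):
    # Multiplicative fusion: elementwise product of the two streams.
    return spatial_feats * temporal_feats

def train_classifier(spatial_feats, temporal_feats, labels):
    # spatial_feats, temporal_feats: (n_clips, d) arrays from the two
    # CNNs; labels: (n_clips,) expression class ids.
    fused = fuse(spatial_feats, temporal_feats)
    return SVC(kernel="rbf").fit(fused, labels)
```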

A Multi-category Task for Bitrate Interval Prediction with the Target Perceptual Quality

  • Yang, Zhenwei;Shen, Liquan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.12
    • /
    • pp.4476-4491
    • /
    • 2021
  • Video service providers tend to face user network problems when transmitting video streams, and they strive to provide users with superior video quality in a limited-bitrate environment. It is therefore necessary to accurately determine the target bitrate range of a video under different quality requirements. Recently, several schemes have been proposed to meet this requirement, but they do not take visual influence into account. In this paper, we propose a new multi-category model that accurately predicts the target bitrate range for a target visual quality by machine learning. First, a dataset is constructed for training the multi-category models; the quality score ladders and the corresponding bitrate-interval categories are defined in the dataset. Second, several types of spatial-temporal features related to the VMAF evaluation metric and to visual factors are extracted and processed statistically for classification. Finally, bitrate prediction models trained on the dataset with a RandomForest classifier can accurately predict the target bitrate of input videos for a target video quality. The classification accuracy of the model reaches 0.705, and video encoded at the bitrate predicted by the model achieves the target perceptual quality.
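
A minimal sketch of the classification stage, assuming scikit-learn: per-video spatial-temporal descriptors are mapped to a bitrate-interval category for the requested quality level. Feature extraction is stubbed out; hyperparameters and names are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_bitrate_model(features, interval_labels):
    # features: (n_videos, d) spatial-temporal descriptors;
    # interval_labels: (n_videos,) bitrate-interval category ids.
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    return model.fit(features, interval_labels)

def predict_interval(model, video_features):
    # Predicted bitrate-interval category for one video.
    return int(model.predict(video_features.reshape(1, -1))[0])
```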

All-optical packet switching system : clock extraction as a key technology (완전 광 패킷 스위칭 시스템 : 클럭 추출 핵심 기술)

  • 이혁재;원용협
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.40 no.10
    • /
    • pp.79-88
    • /
    • 2003
  • We demonstrate a novel all-optical packet switching system that is suitable for optical ring networks. For the demonstration, video signals are encoded into optical packets composed of a header and a payload. The optical packets are processed all-optically at a switching node based on an all-optical header processor, packet-level clock extraction, bit-level clock extraction, an all-optical data format converter, and so on.