• Title/Summary/Keyword: Video Frames


Fuzzy Logic Based Temporal Error Concealment for H.264 Video

  • Lee, Pei-Jun; Lin, Ming-Long
    • ETRI Journal / v.28 no.5 / pp.574-582 / 2006
  • In this paper, a new error concealment algorithm is proposed for the H.264 standard. The algorithm consists of two processes. The first uses a fuzzy logic method to select the size type of lost blocks: if the motion vectors of the neighboring blocks surrounding a lost block are discontinuous, the motion vector of the lost block is calculated from the current frame; otherwise, the size type of the lost block is determined from the preceding frame. The second process is an error concealment algorithm that finds the lost motion vector via a proposed adaptive multiple-reference-frame selection. This selection is based on motion estimation analysis of H.264 coding, so the number of searched frames can be reduced. Therefore, the most accurate mode of the lost block can be determined with much less computation time in the selection of the lost motion vector. Experimental results show that the proposed algorithm achieves a 0.5 to 4.52 dB improvement over the method in VM 9.0.

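The neighbor-motion-vector discontinuity test described in the abstract above can be sketched as follows; the spread measure, the threshold, and the function names are illustrative assumptions, not the paper's actual fuzzy-logic formulation:

```python
def mv_discontinuity(neighbor_mvs):
    """Measure how inconsistent the motion vectors around a lost block are.

    neighbor_mvs: list of (dx, dy) motion vectors from the blocks
    surrounding the lost block.  Returns the mean Euclidean distance of
    each vector from the average vector (0 = perfectly continuous).
    """
    n = len(neighbor_mvs)
    mean_dx = sum(dx for dx, _ in neighbor_mvs) / n
    mean_dy = sum(dy for _, dy in neighbor_mvs) / n
    return sum(((dx - mean_dx) ** 2 + (dy - mean_dy) ** 2) ** 0.5
               for dx, dy in neighbor_mvs) / n

def choose_size_source(neighbor_mvs, threshold=2.0):
    """If neighboring motion is discontinuous, estimate the lost block's
    size type from the current frame; otherwise reuse the co-located
    size type from the preceding frame (threshold is illustrative)."""
    return "current" if mv_discontinuity(neighbor_mvs) > threshold else "preceding"
```

A smooth motion field (all neighbors moving alike) maps to the "preceding frame" branch, while scattered neighbor vectors trigger estimation from the current frame.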

Fast Multiple Reference Frame Selection for H.264 Encoding (H.264 부호화를 위한 고속 다중 참조 화면 결정 기법)

  • Jeong, Jin-Woo; Cheo, Yoon-Sik
    • Proceedings of the IEEK Conference / 2006.06a / pp.419-420 / 2006
  • In the new video coding standard H.264/AVC, motion estimation (ME) is allowed to search multiple reference frames to improve rate-distortion performance. The complexity of multi-frame motion estimation increases linearly with the number of reference frames used. However, the distortion gain provided by each reference frame varies with the video sequence, so it is not efficient to search through all the candidate frames. In this paper, we propose a fast multi-frame selection method using all-zero coefficient block (AZCB) prediction and the sum of absolute differences (SAD) of neighboring blocks. Simulation results show that the proposed algorithm is up to two times faster than an exhaustive search of multiple reference frames, with similar quality and bit rate.

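A minimal sketch of the SAD computation and an AZCB-style early skip in the spirit of the abstract above; the threshold and helper names are hypothetical, and the actual method predicts all-zero blocks in the transform domain rather than from raw SAD alone:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized pixel blocks
    (2-D lists of luma samples)."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def likely_all_zero(block_curr, block_pred, threshold=4):
    """Heuristic AZCB prediction: if the residual SAD is tiny, the
    transformed and quantized residual is likely all-zero, so searching
    further reference frames for this block can be skipped
    (threshold is illustrative)."""
    return sad(block_curr, block_pred) < threshold
```

Blocks flagged as likely all-zero need no additional reference frames, which is where the reported speed-up comes from.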

Automatic Superimposed Text Localization from Video Using Temporal Information

  • Jung, Cheol-Kon; Kim, Joong-Kyu
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.9C / pp.834-839 / 2007
  • Superimposed text in video provides important semantic clues for content analysis. In this paper, we present a new and fast method for localizing superimposed text in video segments. We detect superimposed text using temporal information contained in the video: to speed up detection, we minimize the candidate region for text localization by using the difference between consecutive frames. Experimental results demonstrate the good performance of the new superimposed text localization algorithm.
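The temporal pruning idea above can be sketched as follows: superimposed text stays fixed across consecutive frames, so only pixels with a small inter-frame difference are kept as candidates. The function name and threshold are illustrative assumptions:

```python
def text_candidate_mask(frame_prev, frame_curr, max_diff=5):
    """Keep pixels whose inter-frame difference is small as superimposed-
    text candidates; everything else is discarded before the (more
    expensive) localizer runs.  Frames are 2-D lists of grayscale
    values; max_diff is illustrative."""
    return [[abs(p - c) <= max_diff for p, c in zip(row_p, row_c)]
            for row_p, row_c in zip(frame_prev, frame_curr)]
```

Pixels belonging to moving background change between frames and are masked out, shrinking the region the text localizer must examine.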

Quantization Parameter Selection Method For H.264-based Multi-view Video Coding (H.264 기반 다시점 비디오 부호화를 위한 양자화 계수 결정 방법)

  • Park, Pil-Kyu; Ho, Yo-Sung
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.6C / pp.579-584 / 2007
  • Recently, various prediction structures have been proposed to exploit inter-view correlation among multi-view video sequences. In this paper, we propose a QP (quantization parameter) selection method for the B frame inserted in the first frame position of each GOP (group of pictures), where we change the QP of the B frame adaptively to achieve uniform picture quality and an overall coding gain. Each B frame is coded with reference to two frames in its adjacent views. We calculate the QP for the B frame based on the correlation between the two reference frames, measured using their rate-distortion costs. By applying the proposed method to the MVC reference prediction structure, we improve the coding gain by 0.09 to 0.16 dB.
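One way to read the QP selection above is as a mapping from reference-frame similarity to a QP offset. The sketch below is a hypothetical illustration of that idea, not the paper's formula; the base QP, the similarity ratio, and the offset range are all assumptions:

```python
def b_frame_qp(base_qp, rd_cost_ref0, rd_cost_ref1, max_offset=3):
    """Pick the QP for the inserted B frame from the similarity of its
    two inter-view reference frames.  When the references' rate-
    distortion costs are close (high correlation), the B frame is easy
    to predict and can tolerate a larger QP; otherwise the QP stays
    near the base value.  The linear mapping is illustrative."""
    hi, lo = max(rd_cost_ref0, rd_cost_ref1), min(rd_cost_ref0, rd_cost_ref1)
    similarity = lo / hi if hi > 0 else 1.0   # 1.0 = identical RD costs
    return base_qp + round(similarity * max_offset)
```

For instance, two references with equal RD costs push the B-frame QP up by the full offset, saving bits where prediction is easiest.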

Caption Detection and Recognition for Video Image Information Retrieval (비디오 영상 정보 검색을 위한 문자 추출 및 인식)

  • 구건서
    • Journal of the Korea Computer Industry Society / v.3 no.7 / pp.901-914 / 2002
  • In this paper, we propose an efficient automatic caption detection and localization method, with caption recognition using an FE-MCBP (Feature Extraction based Multichained BackPropagation) neural network, for content-based video retrieval. Frames are selected from the video at a fixed time interval, and key frames are selected by a grayscale histogram method. For each key frame, segmentation is performed and caption lines are detected using a line scan method; lastly, individual characters are separated. This research improves speed and efficiency by performing color segmentation with a local-maximum analysis method before line scanning. Caption detection is the first stage of multimedia database organization, and detected captions are used as input to the text recognition system. Recognized captions can then be searched by content-based retrieval.

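The grayscale-histogram key frame selection mentioned above can be sketched as follows; the bin count, distance measure, and threshold are illustrative assumptions rather than the paper's settings:

```python
def gray_histogram(frame, bins=16):
    """256-level grayscale histogram reduced to `bins` buckets.
    `frame` is a 2-D list of pixel values in 0..255."""
    hist = [0] * bins
    for row in frame:
        for p in row:
            hist[p * bins // 256] += 1
    return hist

def select_key_frames(frames, threshold=2):
    """Keep a frame as a key frame when its histogram differs enough
    (L1 distance) from the last key frame's; the first frame is always
    a key frame.  Threshold is illustrative."""
    keys = [0]
    for i in range(1, len(frames)):
        prev = gray_histogram(frames[keys[-1]])
        curr = gray_histogram(frames[i])
        if sum(abs(a - b) for a, b in zip(prev, curr)) > threshold:
            keys.append(i)
    return keys
```

Frames whose histograms match the previous key frame are skipped, so only visually distinct frames feed the caption detector.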

Effective Hand Gesture Recognition by Key Frame Selection and 3D Neural Network

  • Hoang, Nguyen Ngoc; Lee, Guee-Sang; Kim, Soo-Hyung; Yang, Hyung-Jeong
    • Smart Media Journal / v.9 no.1 / pp.23-29 / 2020
  • This paper presents an approach to dynamic hand gesture recognition using an algorithm based on a 3D Convolutional Neural Network (3D_CNN), later extended to 3D Residual Networks (3D_ResNet), together with neural-network-based key frame selection. Typically, a 3D deep neural network classifies gestures from input image frames randomly sampled from video data. In this work, to improve classification performance, we employ key frames that represent the overall video as the input to the classification network. The key frames are extracted by SegNet instead of conventional clustering algorithms for video summarization (VSUMM), which require heavy computation. By using a deep neural network, key frame selection can be performed in a real-time system. Experiments are conducted using 3D convolutional kernels such as 3D_CNN, Inflated 3D_CNN (I3D), and 3D_ResNet for gesture classification. Our algorithm achieved up to 97.8% classification accuracy on the Cambridge gesture dataset. The experimental results show that the proposed approach is efficient and outperforms existing methods.

Efficient Correlation Noise Modeling and Performance Analysis for Distributed Video Coding System (분산 동영상 부호화 시스템을 위한 효과적인 상관 잡음 모델링 및 성능평가)

  • Moon, Hak-Soo; Lee, Chang-Woo; Lee, Seong-Won
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.6C / pp.368-375 / 2011
  • In a distributed video coding system, the parity bits generated at the encoder are used to reconstruct Wyner-Ziv frames. Since the original Wyner-Ziv frames are not known at the decoder, efficient correlation noise modeling for turbo or LDPC codes is necessary. In this paper, an efficient correlation noise modeling method is proposed and its performance is analyzed. A method to estimate the quantization parameters for key frames, which are encoded using the H.264 intraframe coding technique, is also proposed. The performance of the proposed system is evaluated through extensive computer simulations.

Adaptive Correlation Noise Model for DC Coefficients in Wyner-Ziv Video Coding

  • Qin, Hao; Song, Bin; Zhao, Yue; Liu, Haihua
    • ETRI Journal / v.34 no.2 / pp.190-198 / 2012
  • An adaptive correlation noise model (CNM) construction algorithm is proposed in this paper to increase the efficiency of parity bits for correcting errors in the side information in transform-domain Wyner-Ziv (WZ) video coding. The proposed algorithm introduces two techniques to improve the accuracy of the CNM. First, it calculates the mean of the direct current (DC) coefficients of the original WZ frame at the encoder and uses it to assist the decoder in calculating the CNM parameters. Second, by considering the statistical properties of the transform-domain correlation noise and the motion characteristics of the frame, the algorithm adaptively models the DC coefficients of the correlation noise with a Gaussian distribution for low-motion frames and a Laplacian distribution for high-motion frames. With these techniques, the proposed algorithm approximates the real distribution of the correlation noise more accurately at the expense of a very slight increase in coding complexity. The simulation results show that the proposed algorithm can improve the average peak signal-to-noise ratio of the decoded WZ frames by 0.5 dB to 1.5 dB.
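The motion-adaptive model switch described above can be sketched as follows. The maximum-likelihood parameter fits are standard, but the motion-activity measure, threshold, and function names are illustrative assumptions:

```python
import math

def fit_laplacian_b(residuals):
    """Maximum-likelihood scale of a zero-mean Laplacian: mean |x|."""
    return sum(abs(r) for r in residuals) / len(residuals)

def fit_gaussian_sigma(residuals):
    """Maximum-likelihood standard deviation of a zero-mean Gaussian."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def choose_cnm(residuals, motion_activity, motion_threshold=1.0):
    """Model the DC-coefficient correlation noise with a Gaussian for
    low-motion frames and a Laplacian for high-motion frames, as the
    abstract describes.  `residuals` are side-information-minus-original
    DC residuals; the threshold is illustrative."""
    if motion_activity > motion_threshold:
        return ("laplacian", fit_laplacian_b(residuals))
    return ("gaussian", fit_gaussian_sigma(residuals))
```

High-motion frames produce heavier-tailed residuals, which the Laplacian captures better; low-motion residuals are closer to Gaussian.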

Shot boundary Frame Detection and Key Frame Detection for Multimedia Retrieval (멀티미디어 검색을 위한 shot 경계 및 대표 프레임 추출)

  • 강대성; 김영호
    • Journal of the Institute of Convergence Signal Processing / v.2 no.1 / pp.38-43 / 2001
  • This paper suggests a new feature for shot detection, a robust feature derived from the DC image constructed from DCT DC coefficients in the MPEG video stream, and proposes a characterizing value that reflects the kind of video (movie, drama, news, music video, etc.). Key frames are extracted using the local minima and maxima of the derivative of this value. After the original frames (not the DC images) are reconstructed for the key frames, indexing is performed by computing parameters, and key frames similar to a user's query image are retrieved by comparing those parameters. Experiments show that the proposed methods outperform the conventional method, with a high retrieval accuracy rate.

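A minimal sketch of shot detection on DC images in the spirit of the abstract above: each DC image is a small thumbnail of DCT DC coefficients, and a spike in the frame-to-frame difference signals a cut. The fixed threshold is an illustrative assumption; the paper adapts its characterizing value to the kind of video:

```python
def frame_diffs(dc_images):
    """L1 difference between consecutive DC images (each a flat list of
    DCT DC coefficients, one per macroblock)."""
    return [sum(abs(a - b) for a, b in zip(prev, curr))
            for prev, curr in zip(dc_images, dc_images[1:])]

def shot_boundaries(dc_images, threshold=100):
    """Declare a shot cut wherever the DC-image difference spikes above
    the threshold (illustrative); key frames would then be picked from
    the local extrema of this difference signal within each shot."""
    return [i + 1 for i, d in enumerate(frame_diffs(dc_images)) if d > threshold]
```

Working on DC images rather than full frames keeps the difference computation cheap, since the thumbnails are 64x smaller than the decoded frames.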

Efficient Video Retrieval Scheme with Luminance Projection Model (휘도투시모델을 적용한 효율적인 비디오 검색기법)

  • Kim, Sang Hyun
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.12 / pp.8649-8653 / 2015
  • A number of video indexing and retrieval algorithms have been proposed to manage large video databases efficiently. The video similarity measure is one of the most important technical factors in a video content management system. In this paper, we propose a luminance characteristics model to measure video similarity efficiently. Most video indexing algorithms have commonly used histograms, edges, or motion features, whereas the proposed algorithm employs an efficient similarity measure based on the luminance projection. To index video sequences effectively and reduce computational complexity, we calculate video similarity using key frames extracted by a cumulative measure, and compare the sets of key frames using the modified Hausdorff distance. Experimental results show that the proposed luminance projection model yields remarkably improved accuracy and performance over conventional algorithms such as the histogram comparison method, with low computational complexity.
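The two ingredients above — luminance projections as frame signatures and a modified Hausdorff distance between key frame sets — can be sketched as follows. The exact projection layout and distance variant are illustrative assumptions (the common "modified" form takes the larger of the two mean nearest-neighbour distances):

```python
def luminance_projections(frame):
    """Row and column sums of luminance — a compact 1-D signature of a
    2-D grayscale frame."""
    rows = [sum(r) for r in frame]
    cols = [sum(c) for c in zip(*frame)]
    return rows + cols

def modified_hausdorff(set_a, set_b):
    """Modified Hausdorff distance between two sets of frame signatures:
    the larger of the two directed mean nearest-neighbour distances."""
    def d(sig_x, sig_y):
        return sum(abs(a - b) for a, b in zip(sig_x, sig_y))
    def mean_min(src, dst):
        return sum(min(d(x, y) for y in dst) for x in src) / len(src)
    return max(mean_min(set_a, set_b), mean_min(set_b, set_a))
```

Two videos are then compared by the modified Hausdorff distance between their key frame signature sets, avoiding a frame-by-frame comparison of full sequences.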