• Title/Summary/Keyword: Video sequence

A Personal Video Event Classification Method based on Multi-Modalities by DNN-Learning (DNN 학습을 이용한 퍼스널 비디오 시퀀스의 멀티 모달 기반 이벤트 분류 방법)

  • Lee, Yu Jin;Nang, Jongho
    • Journal of KIISE / v.43 no.11 / pp.1281-1297 / 2016
  • In recent years, personal videos have grown tremendously in number, driven by the substantial increase in the use of smart devices and networking services with which users create and share video content easily and with few restrictions. Because videos generally have multiple modalities and their frame data varies over time, taking both properties into account would significantly improve event detection performance. This paper proposes an event detection method. In this method, high-level features are first extracted from multiple modalities in the videos, and the features are rearranged according to time sequence. Then the association of the modalities is learned by means of a DNN to produce a personal video event detector. In our proposed method, audio and image data are first synchronized and then extracted. Then, the result is input into GoogLeNet as well as a Multi-Layer Perceptron (MLP) to extract high-level features. The results are then rearranged in time sequence, and every video is processed to extract one feature each for training by means of a DNN.

Video Watermarking Algorithm for H.264 Scalable Video Coding

  • Lu, Jianfeng;Li, Li;Yang, Zhenhua
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.1 / pp.56-67 / 2013
  • Because H.264/SVC can meet the needs of different networks and user terminals, it has become increasingly popular. In this paper, we focus on the spatial resolution scalability of H.264/SVC and propose a blind video watermarking algorithm for the copyright protection of H.264/SVC coded video. The watermark embedding occurs before the H.264/SVC encoding, and only the original enhancement layer sequence is watermarked. However, because the watermark is embedded into the average matrix of each macro block, it can be detected in both the enhancement layer and base layer after downsampling, video encoding, and video decoding. The proposed algorithm is examined using JSVM, and experimental results show that it is robust to H.264/SVC coding and has little influence on video quality.
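
The following is a minimal sketch of why a mark carried by the block average can survive spatial downsampling. It is not the authors' algorithm: the quantization rule (even or odd multiple of `step`), the 2x2 averaging filter, and all function names are assumptions chosen for illustration, and pixel clipping and the codec itself are ignored.

```python
def block_mean(frame, r0, c0, size):
    """Average of the size x size block whose top-left corner is (r0, c0)."""
    total = sum(frame[r][c] for r in range(r0, r0 + size) for c in range(c0, c0 + size))
    return total / (size * size)

def block_origin(frame, index, size):
    """Top-left corner of the index-th block in raster order."""
    per_row = len(frame[0]) // size
    return (index // per_row) * size, (index % per_row) * size

def embed(frame, bits, size=16, step=8.0):
    """Shift each block so its mean is an even (bit 0) or odd (bit 1) multiple of step."""
    out = [[float(p) for p in row] for row in frame]
    for i, bit in enumerate(bits):
        r0, c0 = block_origin(out, i, size)
        mean = block_mean(out, r0, c0, size)
        k = round((mean - bit * step) / (2 * step))
        delta = (2 * k + bit) * step - mean
        for r in range(r0, r0 + size):
            for c in range(c0, c0 + size):
                out[r][c] += delta
    return out

def detect(frame, n_bits, size=16, step=8.0):
    """Blind detection: read each bit from the parity of the quantized block mean."""
    bits = []
    for i in range(n_bits):
        r0, c0 = block_origin(frame, i, size)
        bits.append(round(block_mean(frame, r0, c0, size) / step) % 2)
    return bits

def downsample2(frame):
    """Halve the resolution by averaging 2x2 neighbourhoods (assumed filter)."""
    return [[(frame[r][c] + frame[r][c + 1] + frame[r + 1][c] + frame[r + 1][c + 1]) / 4
             for c in range(0, len(frame[0]), 2)]
            for r in range(0, len(frame), 2)]

frame = [[(r * 7 + c * 13) % 256 for c in range(32)] for r in range(32)]
marked = embed(frame, [1, 0, 0, 1])
print(detect(marked, 4, size=16), detect(downsample2(marked), 4, size=8))
```

Because 2x2 averaging preserves the mean of every 16x16 block, the same bits can be read from the 8x8 blocks of the half-resolution frame.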

3D DCT Video Information Hiding

  • Kim, Young-Gon;Jie Yang;Lee, Hye-Joo;Hong, Jin-Woo;Lee, Moon-Ho
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2002.11a / pp.169-172 / 2002
  • Embedding information into video data is a topic that has recently gained increasing attention. This paper proposes a new approach for digital watermarking and secure copyright protection of video, the principal aim being to discourage illicit copying and distribution of copyrighted material. The method presented here is based on the three-dimensional discrete cosine transform of a video scene, in contrast with previous works on video watermarking where each video frame was marked separately, or where only intra-frame or motion compensation parameters were marked in MPEG compressed videos. The watermark sequence used is an encrypted pseudo-noise signal applied to the video. The performance of the presented technique is evaluated experimentally.

Multi-View Video System using Single Encoder and Decoder (단일 엔코더 및 디코더를 이용하는 다시점 비디오 시스템)

  • Kim Hak-Soo;Kim Yoon;Kim Man-Bae
    • Journal of Broadcast Engineering / v.11 no.1 s.30 / pp.116-129 / 2006
  • The progress of data transmission technology through the Internet has spread a variety of realistic content. One such type of content is multi-view video, which is acquired from multiple camera sensors. In general, multi-view video processing requires as many encoders and decoders as there are cameras, and the resulting processing complexity makes practical implementation difficult. To solve this problem, this paper considers a simple multi-view system utilizing a single encoder and a single decoder. On the encoder side, input multi-view YUV sequences are combined in GOP units by a video mixer. Then, the mixed sequence is compressed by a single H.264/AVC encoder. The decoding side is composed of a single decoder and a scheduler controlling the decoding process. The goal of the scheduler is to assign an approximately identical number of decoded frames to each view sequence by estimating the decoder utilization of a GOP and subsequently applying frame skip algorithms. Furthermore, for the frame skip, efficient frame selection algorithms are studied for the H.264/AVC baseline and main profiles based upon a cost function that is related to perceived video quality. Our proposed method has been tested on various multi-view test sequences adopted by MPEG 3DAV. Experimental results show that approximately identical decoder utilization is achieved for each view sequence, so that each view sequence is fairly displayed. In addition, the performance of the proposed method is examined in terms of bit-rate and PSNR using a rate-distortion curve.
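
The GOP-unit mixing step described above can be sketched as a round-robin interleave. This is an illustrative guess at the mixer's behaviour, not the paper's implementation: the function names, the `(view_id, frame)` tagging, and the assumption that all views have the same length are hypothetical.

```python
def mix_views(views, gop_size):
    """Interleave several view sequences into one stream, one GOP per view in turn."""
    mixed = []
    n_frames = len(views[0])  # assumes every view has the same number of frames
    for start in range(0, n_frames, gop_size):
        for view_id, frames in enumerate(views):
            # Tag each frame with its view so the decoder side can route it back.
            mixed.extend((view_id, f) for f in frames[start:start + gop_size])
    return mixed

def demux_views(mixed, n_views):
    """Split the single decoded stream back into per-view sequences."""
    views = [[] for _ in range(n_views)]
    for view_id, frame in mixed:
        views[view_id].append(frame)
    return views

left = ["L0", "L1", "L2", "L3"]
right = ["R0", "R1", "R2", "R3"]
stream = mix_views([left, right], gop_size=2)
print([f for _, f in stream])
```

A single encoder then sees one ordinary sequence; the scheduler and frame-skip logic described in the abstract would operate on the demultiplexed side and are not modelled here.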

Haptic Rendering Algorithm for Collision Situation of Two Objects (두 객체가 충돌하는 상황에서의 햅틱 렌더링 알고리즘)

  • Kim, Seonkyu;Kim, Hyebin;Ryu, Chul
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.18 no.3 / pp.35-41 / 2018
  • In this paper, we define a haptic rendering algorithm for a situation in which a static object and a single object collide. We classified video scenes into four categories that are commonly seen in video sequences. The proposed algorithm can detect which frames are suitable for haptic rendering by detecting the change of direction using motion estimation and the change of shape using object tracking. As a result, a total of 13 frames were extracted from the sample video and the playing time of these frames was calculated. We confirmed that the haptic effect appears at the expected playing time by adding the appropriate haptic generating waveform through the haptic editing program.

A Shadow Region Suppression Method using Intensity Projection and Converting Energy to Improve the Performance of Probabilistic Background Subtraction (확률기반 배경제거 기법의 향상을 위한 밝기 사영 및 변환에너지 기반 그림자 영역 제거 방법)

  • Hwang, Soon-Min;Kang, Dong-Joong
    • Journal of Institute of Control, Robotics and Systems / v.16 no.1 / pp.69-76 / 2010
  • The segmentation of moving objects in a video sequence is a core technique of intelligent image processing systems such as video surveillance, traffic monitoring and human tracking. A typical method to segment a moving region from the background is background subtraction. The steps of background subtraction involve calculating a reference image, subtracting a new frame from the reference image and then thresholding the subtracted result. One well-known background model is the Gaussian mixture model (GMM). Even though the method is known to be efficient and exact, GMM suffers from the problem that it includes false pixels in the ROI (region of interest), specifically shadow pixels. These false pixels cause failures in post-processing tasks such as tracking and object recognition. This paper presents a method for removing false pixels included in the ROI. First, we subdivide an ROI by using shape characteristics of detected objects. Then, a method is proposed to classify pixels by using histogram characteristics and by comparing the difference of energy obtained by converting the color value of a pixel into a grayscale value, in order to estimate whether the pixels belong to the moving object area or the shadow area. The method is applied to real video sequences and the performance is verified.
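
The three background subtraction steps named above (reference image, subtraction, thresholding) can be sketched as follows. This is a minimal illustration on grayscale frames stored as nested lists; a per-pixel mean stands in for the GMM background model, and the function names and the threshold value are assumptions.

```python
def reference_image(frames):
    """Step 1: build a reference (background) image as the per-pixel mean."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(width)]
            for r in range(height)]

def foreground_mask(frame, reference, threshold=30):
    """Steps 2 and 3: subtract the new frame from the reference, then threshold."""
    return [[1 if abs(p - b) > threshold else 0 for p, b in zip(row, ref_row)]
            for row, ref_row in zip(frame, reference)]

background = [[[100, 100], [100, 100]],
              [[102, 102], [102, 102]]]
new_frame = [[101, 200],
             [101, 101]]
mask = foreground_mask(new_frame, reference_image(background))
print(mask)
```

A shadow darkens pixels enough to pass this threshold as well, which is the false-pixel problem the abstract sets out to address.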

Region-based H.263 Video Codec with Effective Rate Control Algorithm for Low VBR Video (개선된 특징차 비교 방법을 이용한 컷 검출 알고리즘에 관한 연구)

  • 최인호;이대영
    • The Journal of Korean Institute of Communications and Information Sciences / v.24 no.9B / pp.1690-1696 / 1999
  • Video sequences should be hierarchically classified for content-based retrieval. Cut detection is an essential step in classifying shots. It is generally difficult for cut detection algorithms to detect cut points when only the current frame is compared with the previous one, because camera or object movement produces abrupt scene changes. We reduce the rate of failed cut detections by comparing the differences between the frames at a predicted cut point and their neighbors. In this paper, we first obtain a predicted cut point and then judge whether it is a true cut point. We extract DC images from the MPEG video sequence for the comparison. Experiments confirmed that the cut detection rate of the proposed algorithm is higher than that of other algorithms.
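
The two-stage idea above (predict a cut point, then verify it against neighboring frames) can be sketched as follows. The verification rule used here, that the frames on both sides of the candidate must themselves be stable, is an assumption made for illustration and is not taken from the paper; frames are flattened lists standing in for DC images.

```python
def frame_diff(a, b):
    """Mean absolute difference between two flattened DC images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def detect_cuts(frames, threshold=40.0):
    """Return indices i where a cut between frame i-1 and frame i is confirmed."""
    cuts = []
    for i in range(2, len(frames) - 1):
        # Stage 1: a large difference to the previous frame gives a predicted cut point.
        if frame_diff(frames[i - 1], frames[i]) <= threshold:
            continue
        # Stage 2 (assumed rule): accept only if both neighbourhoods are stable,
        # so that a one-frame flash or motion burst is not reported as a cut.
        stable_before = frame_diff(frames[i - 2], frames[i - 1]) <= threshold
        stable_after = frame_diff(frames[i], frames[i + 1]) <= threshold
        if stable_before and stable_after:
            cuts.append(i)
    return cuts

shot_a, flash, shot_b = [10] * 4, [250] * 4, [120] * 4
frames = [shot_a, shot_a, flash, shot_a, shot_a, shot_b, shot_b]
print(detect_cuts(frames))
```

A detector that used only stage 1 would also flag both edges of the one-frame flash; the neighbor check rejects them.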

Overlay Text Graphic Region Extraction for Video Quality Enhancement Application (비디오 품질 향상 응용을 위한 오버레이 텍스트 그래픽 영역 검출)

  • Lee, Sanghee;Park, Hansung;Ahn, Jungil;On, Youngsang;Jo, Kanghyun
    • Journal of Broadcast Engineering / v.18 no.4 / pp.559-571 / 2013
  • This paper presents several problems that arise when 2D video with superimposed overlay text is converted to 3D stereoscopic video. To resolve these problems, it proposes a scenario in which the original video is divided into two parts, one containing only the overlay text graphic region and the other containing the video with holes, which are then processed separately. This paper focuses only on detecting and extracting the overlay text graphic region, which is the first step of the proposed scenario. To decide whether a frame contains overlay text, a corner density map based on the Harris corner detector is used. The overlay text region is then extracted using a hybrid method that combines color and motion information of the overlay text region. The experiments show the results of overlay text region detection and extraction on video sequences from several genres.

A "GAP-Model" based Framework for Online VVoIP QoE Measurement

  • Calyam, Prasad;Ekici, Eylem;Lee, Chang-Gun;Haffner, Mark;Howes, Nathan
    • Journal of Communications and Networks / v.9 no.4 / pp.446-456 / 2007
  • Increased access to broadband networks has led to a fast-growing demand for voice and video over IP (VVoIP) applications such as Internet telephony (VoIP), videoconferencing, and IP television (IPTV). For pro-active troubleshooting of VVoIP performance bottlenecks that manifest to end-users as performance impairments such as video frame freezing and voice dropouts, network operators cannot rely on actual end-users to report their subjective quality of experience (QoE). Hence, automated and objective techniques that provide real-time or online VVoIP QoE estimates are vital. Objective techniques developed to date estimate VVoIP QoE by performing frame-to-frame peak signal-to-noise ratio (PSNR) comparisons of the original video sequence and the reconstructed video sequence obtained from the sender-side and receiver-side, respectively. Since processing such video sequences is time consuming and computationally intensive, existing objective techniques cannot provide online VVoIP QoE. In this paper, we present a novel framework that can provide online estimates of VVoIP QoE on network paths without end-user involvement and without requiring any video sequences. The framework features the "GAP-model", which is an offline model of QoE expressed as a function of measurable network factors such as bandwidth, delay, jitter, and loss. Using the GAP-model, our online framework can produce VVoIP QoE estimates in terms of "Good", "Acceptable", or "Poor" (GAP) grades of perceptual quality solely from the online measured network conditions.
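
For context, the frame-to-frame PSNR comparison that the abstract attributes to existing objective techniques can be sketched as below. This illustrates the standard PSNR formula only, not the GAP-model itself; frames are flattened lists of 8-bit samples and the function names are hypothetical.

```python
import math

def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)

def sequence_psnr(sender_frames, receiver_frames):
    """Frame-to-frame PSNR of the sender-side and receiver-side sequences."""
    return [psnr(s, r) for s, r in zip(sender_frames, receiver_frames)]

sent = [[100, 100, 100, 100], [50, 60, 70, 80]]
received = [[101, 99, 101, 99], [50, 60, 70, 80]]
print(sequence_psnr(sent, received))
```

Every frame pair has to be available and processed, which is the cost that motivates estimating QoE from network measurements instead.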

Clustering Technique for Sequence Data Sets in Multidimensional Data Space (다차원 데이타 공간에서 시퀀스 데이타 세트를 위한 클러스터링 기법)

  • Lee, Seok-Lyong;Lim, Tong-Hyeok;Chung, Chin-Wan
    • Journal of KIISE:Databases / v.28 no.4 / pp.655-664 / 2001
  • Continuous data such as video streams and analog voice signals can be modeled as multidimensional data sequences (MDSs) in the feature space. In this paper, we investigate a clustering technique for multidimensional data sequences. Each sequence is represented by a small number of hyper-rectangular clusters for subsequent storage and similarity search processing. We present a linear clustering algorithm that guarantees a predefined level of clustering quality and show its effectiveness via experiments on various video data sets.
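
One way to picture representing a sequence by a few hyper-rectangles in a single linear pass is sketched below. It is not the paper's algorithm: the quality criterion (a bound `max_edge` on the longest side of each bounding rectangle) and the greedy split rule are assumptions chosen for illustration.

```python
def cluster_sequence(points, max_edge):
    """Greedily cover consecutive points with bounding hyper-rectangles.

    Returns a list of (low_corner, high_corner) pairs. A new rectangle is
    started whenever adding the next point would stretch any side of the
    current one beyond max_edge (assumed quality bound).
    """
    clusters = []
    lo = hi = None
    for p in points:
        if lo is not None:
            new_lo = [min(a, b) for a, b in zip(lo, p)]
            new_hi = [max(a, b) for a, b in zip(hi, p)]
            if max(h - l for l, h in zip(new_lo, new_hi)) <= max_edge:
                lo, hi = new_lo, new_hi  # the point fits: grow the rectangle
                continue
            clusters.append((lo, hi))    # close the current rectangle
        lo, hi = list(p), list(p)        # start a new rectangle at this point
    if lo is not None:
        clusters.append((lo, hi))
    return clusters

sequence = [(0, 0), (1, 1), (2, 1), (10, 10), (11, 10)]
print(cluster_sequence(sequence, max_edge=3))
```

Each point is visited once, so the pass is linear in the sequence length, and the five points above are summarized by two rectangles.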
