• Title/Summary/Keyword: Intra-Mode Coding


Deep-learning based Object Detection in Thermal Video Using Compressed-Domain Information (열영상에서 압축 도메인 정보를 이용한 딥러닝 기반 객체 탐지 방법)

  • Byeon, JooHyung;Nam, Gunook;Park, Jangsoo;Lee, Jongseok;Sim, Donggyu
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2018.11a / pp.160-162 / 2018
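  • This paper proposes a deep-learning-based object detection method that uses thermal video in the compressed domain. From a bitstream encoded with the High Efficiency Video Coding (HEVC) video compression standard, the Intra Prediction Mode (IPM), Prediction Unit Size (PUS), and Transform Unit Size (TUS) are extracted, converted into a 3-channel image, and fed to the YOLO object detection network, which predicts the location and class of each object. The experimental results present the reconstructed thermal images alongside the detection results for subjective comparison, showing that object detection from thermal video is feasible in the compressed domain.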

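The conversion of the three syntax elements into a 3-channel image can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `to_three_channel`, the per-pixel map layout, and the normalization constants `ipm_max` and `size_max` are assumptions; only the 0..34 numbering of HEVC intra modes comes from the standard.

```python
import numpy as np

def to_three_channel(ipm, pus, tus, ipm_max=34, size_max=64):
    """Stack per-pixel IPM, PU-size and TU-size maps into an H x W x 3 uint8 image.

    Assumption: each map already holds one value per pixel (the value of the
    block covering that pixel). The scaling to 0..255 is a hypothetical choice.
    """
    ipm = np.asarray(ipm, dtype=np.float64)
    pus = np.asarray(pus, dtype=np.float64)
    tus = np.asarray(tus, dtype=np.float64)
    if not (ipm.shape == pus.shape == tus.shape):
        raise ValueError("all three maps must have the same shape")
    channels = [
        ipm / ipm_max,   # HEVC intra prediction modes are numbered 0..34
        pus / size_max,  # block side length, scaled by an assumed maximum of 64
        tus / size_max,
    ]
    img = np.stack(channels, axis=-1)
    return np.clip(np.rint(img * 255.0), 0, 255).astype(np.uint8)

# Toy input: one 8x8 region coded with mode 26, PU size 16, TU size 8.
image = to_three_channel(np.full((8, 8), 26), np.full((8, 8), 16), np.full((8, 8), 8))
```

The resulting array has the same layout as an ordinary RGB frame, so it could be passed to a detector that expects 3-channel input.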

Selective Reference Line Sharing for Chroma Intra Prediction (채널 간 선택적 참조 라인 공유 방법)

  • Lee, Yujin;Park, Jeeyoon;Jeon, Byeungwoo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.197-198 / 2022
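  • Versatile Video Coding (VVC) adopted many new coding tools during the standardization of next-generation video compression. Some of them, including Multiple Reference Lines (MRL), can be applied only to the luma channel and are not considered for the chroma components. To extend MRL, which VVC applies only to the luma channel, to the chroma channels, this paper proposes a method in which a chroma block using DM (Derived Mode) selectively shares the reference line of its corresponding luma block when that luma block uses MRL, so that the chroma block can choose among multiple reference lines for intra prediction. Experimental results show improvements of -0.09% and -0.05% for the Cb and Cr components, respectively, relative to VVC Test Model (VTM) 15.0.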


Enhanced intra prediction mode decision method for VVC (VVC 부호화기의 화면내 부호화 모드 결정 개선 방법)

  • Yun, ByungJin;Gwon, Daehyeok;Choe, JaeRyun;Choi, Haechul
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2019.11a / pp.191-193 / 2019
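  • The ISO/IEC JTC1 WG11 Moving Picture Experts Group and ITU-T SG16 formed the Joint Video Experts Team and are standardizing Versatile Video Coding (VVC) as the next-generation video coding standard. VVC derives a Most Probable Mode (MPM) list, a set of modes that are likely to be the intra prediction mode of the current block, and uses the MPMs to code the intra prediction mode efficiently. The VVC reference software adds one or two modes to the Rate-Distortion Optimization (RDO) process, which selects the final candidate, depending on whether the intra prediction modes of the neighboring blocks match. However, the current MPM list always places the Planar mode as its first candidate, so there are cases in which the intra prediction modes of the neighboring blocks are not added to RDO. This paper therefore proposes a method that resolves the cases in which the VVC encoder does not consider the intra prediction modes of neighboring blocks. The proposed method modifies the number of candidates passed to RDO during MPM derivation so that the intra prediction modes of the neighboring blocks are always added to the RDO process. Experiments show that the proposed method improves coding efficiency by about 0.04%.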


PSNR-based Initial QP Determination for Low Bit Rate Video Coding

  • Park, Sang-Hyun
    • Journal of information and communication convergence engineering / v.10 no.3 / pp.315-320 / 2012
  • In H.264/AVC, the first frame of a group of pictures (GOP) is encoded in intra mode, which generates a large number of bits. The number of bits spent on the I-frame affects the quality of the following frames of the GOP, since they are encoded using the bits remaining from the GOP's allocation. In addition, the first frame is used for the inter mode encoding of the following frames. Thus, the initial quantization parameter (QP) affects the following frames as well as the first frame. In this paper, an adaptive peak signal-to-noise ratio (PSNR)-based initial QP determination algorithm is presented. In the proposed algorithm, a novel linear model is established based on the observed relation between the initial QPs and the PSNRs of frames. Using the linear model and the PSNR results of the encoded GOPs, the proposed algorithm estimates the optimal initial QP, which maximizes the PSNR of the current GOP. Experimental results show that the proposed algorithm predicts the optimal initial QP accurately and thus achieves better PSNR performance than the existing algorithm.

Fast CU Decision Algorithm using the Initial CU Size Estimation and PU modes' RD Cost (초기 CU 크기 예측과 PU 모드 예측 비용을 이용한 고속 CU 결정 알고리즘)

  • Yoo, Hyang-Mi;Shin, Soo-Yeon;Suh, Jae-Won
    • Journal of Broadcast Engineering / v.19 no.3 / pp.405-414 / 2014
  • High Efficiency Video Coding (HEVC) achieves a high compression ratio by applying a recursive quad-tree structured coding unit (CU). However, this recursive quad-tree structure imposes very high computational complexity on the HEVC encoder. In this paper, we present a fast CU decision algorithm for the recursive quad-tree structure. The proposed algorithm estimates the initial CU size before CTU encoding and checks the proposed condition using the Coded Block Flag (CBF) and the rate-distortion cost to reduce encoding time. Intra mode estimation can also be skipped using the CBF values acquired during the inter PU mode estimations. Experimental results show that the proposed algorithm saved about 49.91% and 37.97% of encoding time, depending on the weighting condition.
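The general idea of pruning a recursive quad-tree search with the CBF and the RD cost can be sketched as below. This is only an illustration of early termination, not the condition proposed in the paper: the function `decide_cu`, the callbacks `rd_cost_fn` and `cbf_fn`, and the `threshold` parameter are all hypothetical, and the initial CU size estimation step is omitted.

```python
def decide_cu(size, rd_cost_fn, cbf_fn, x=0, y=0, min_size=8, threshold=0.0):
    """Return (best_cost, leaves) for the CU at (x, y) with the given size.

    rd_cost_fn(x, y, size) -> RD cost of coding the CU without splitting.
    cbf_fn(x, y, size)     -> 0 if the CU has no coded residual, else 1.
    Assumed early-termination rule: stop splitting when the CBF is 0 and the
    RD cost does not exceed `threshold`.
    """
    cost = rd_cost_fn(x, y, size)
    if size <= min_size or (cbf_fn(x, y, size) == 0 and cost <= threshold):
        return cost, [(x, y, size)]

    # Otherwise evaluate the four sub-CUs recursively.
    half = size // 2
    split_cost = 0.0
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            sub_cost, sub_leaves = decide_cu(
                half, rd_cost_fn, cbf_fn, x + dx, y + dy, min_size, threshold
            )
            split_cost += sub_cost
            leaves.extend(sub_leaves)

    # Keep the split only if it is strictly cheaper than the unsplit CU.
    if split_cost < cost:
        return split_cost, leaves
    return cost, [(x, y, size)]
```

With a 64x64 CTU and an 8x8 minimum CU size, the full search evaluates 85 CUs, whereas a CTU that satisfies the termination rule at the root is evaluated once; that difference is where the encoding time saving comes from.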

Initial QP Determination Algorithm for Low Bit Rate Video Coding (저전송률 비디오 압축에서 초기 QP 결정 알고리즘)

  • Park, Sang-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.10 / pp.2071-2078 / 2009
  • The first frame is encoded in intra mode, which generates a large number of bits. In addition, the first frame is used for the inter mode encoding of the following frames. Thus the initial QP (Quantization Parameter) for the first frame affects the first frame as well as the following frames. Traditionally, the initial QP is chosen from four constant values depending only on the bits per pixel (bpp). In low bit rate video coding, the initial QP is fixed at 35 regardless of the output bandwidth. Although this initialization scheme is simple, it is not accurate enough. An accurate initial QP prediction scheme should depend not only on the bpp but also on the complexity of the video sequence and the output bandwidth. The proposed scheme uses a linear model because there is a linear inverse proportional relationship between the output bandwidth and the optimal initial QP. The parameters of the model are determined from the spatial complexity of the first frame. Experimental results show that the new algorithm predicts the optimal initial QP more accurately and yields better PSNR performance than the existing JM algorithm.
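A linear model of this kind can be sketched as follows, under stated assumptions. The abstract derives the model parameters from the spatial complexity of the first frame; this sketch instead fits them by least squares to hypothetical (bandwidth, optimal QP) samples, purely to show the shape of the model. The function names are illustrative, and the only fixed fact used is the H.264/AVC QP range of 0 to 51.

```python
import numpy as np

QP_MIN, QP_MAX = 0, 51  # valid QP range in H.264/AVC (8-bit video)

def fit_linear_qp_model(bandwidths_kbps, optimal_qps):
    """Least-squares fit of QP = intercept + slope * bandwidth."""
    slope, intercept = np.polyfit(bandwidths_kbps, optimal_qps, deg=1)
    return float(slope), float(intercept)

def predict_initial_qp(bandwidth_kbps, slope, intercept):
    """Evaluate the linear model and clamp the result to the valid QP range."""
    qp = int(round(intercept + slope * bandwidth_kbps))
    return max(QP_MIN, min(QP_MAX, qp))

# Hypothetical training samples: a higher bandwidth allows a lower initial QP.
slope, intercept = fit_linear_qp_model([64, 128, 256], [40, 36, 28])
qp_192 = predict_initial_qp(192, slope, intercept)
```

The fitted slope is negative, which matches the inverse relationship the abstract describes, and the clamp keeps extrapolated predictions inside the range the encoder accepts.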

A Frequency Domain DV-to-MPEG-2 Transcoding (DV에서 MPEG-2로의 주파수 영역 변환 부호화)

  • Kim, Do-Nyeon;Yun, Beom-Sik;Choe, Yun-Sik
    • Journal of the Institute of Electronics Engineers of Korea SP / v.38 no.2 / pp.138-148 / 2001
  • Digital Video (DV) coding standards for digital video cassette recorders are based mainly on the DCT and variable length coding. DV has low hardware complexity but a high compressed bit rate of about 26 Mb/s. It is therefore necessary to encode video with low-complexity coding at the studio and then transcode the compressed video into MPEG-2 for video-on-demand systems. Because both coding methods use the DCT, transcoding in the DCT domain can reduce computational complexity by eliminating duplicated procedures. In transcoding DV into MPEG-2 intra coding, the 4:1:1-to-4:2:2 chroma format conversion and the conversion from the 2-4-8 to the 8-8 DCT mode are performed by multiplying the transformed data by a matrix, which enables parallel processing. The variance of each sub block for MPEG-2 rate control is computed entirely in the DCT domain. These are verified through experiments. For transcoding into MPEG-2 inter coding, we estimate motion hierarchically using DCT coefficients. First, we estimate the motion of a macro block (MB) using only the 4 DC values of its 4 sub blocks, then estimate motion with a 16-point MB obtained from the IDCT of the 2$\times$2 low frequencies in each sub block, and finish the estimation at sub-pixel accuracy as the fifth step. ME with an overlapped search range shows better PSNR performance than ME without overlapping.

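Computing the block variance without leaving the DCT domain follows from Parseval's relation: with an orthonormal 2-D DCT, the DC coefficient carries the mean and the AC coefficients carry the energy around it. The sketch below assumes an orthonormal DCT-II on a square block; DV and MPEG-2 use their own coefficient scalings, so a real transcoder would need the matching scale factors. The function names are illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # DC row uses the smaller normalization
    return c

def dct2(block):
    """2-D orthonormal DCT of a square block."""
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T

def variance_from_dct(coeffs):
    """Pixel-domain variance computed from DCT coefficients only.

    For an n x n block, mean = coeffs[0, 0] / n, so the variance equals the
    sum of squared AC coefficients divided by n * n.
    """
    n = coeffs.shape[0]
    ac_energy = np.sum(coeffs ** 2) - coeffs[0, 0] ** 2
    return ac_energy / (n * n)

# Check against the pixel-domain variance on a random 8x8 block.
rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(np.float64)
matches = np.isclose(variance_from_dct(dct2(block)), np.var(block))
```

Because no inverse transform is needed, the rate-control statistic comes at the cost of one sum of squares over coefficients the transcoder already holds.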

Baseline based Binary Shape Coder (기준선 기반 이진 형상 부호화기)

  • 이시화;조대성;조유신;손세훈;장의선;신재섭;서양석
    • Journal of Broadcast Engineering / v.2 no.2 / pp.114-124 / 1997
  • In object-based coding, binary shape coding plays an important role by coding the outer shape of an object. Here we propose a new shape coding tool, which encodes the outline of a shape from a baseline. Unlike 2-D (vertex) shape coding algorithms, the proposed method encodes data that are extracted in a 1-D fashion. The encoded data consist of the starting position, distance lists, and turning point lists. In the lossless coding mode, every contour pixel is input for coding, whereas in the lossy mode variable sampling is employed to encode fewer contour pixels while keeping distortion reasonable. For interframe coding, fast motion compensation is achieved by use of the distance and turning point lists. Subjective viewing tests showed that the proposed method outperforms CAE, the current shape coding standard in MPEG-4. In objective results for compression efficiency, the proposed method was significantly better than CAE in intraframe coding, whereas CAE was better in interframe coding.


H.263-Based Scalable Video Codec (H.263을 기반으로 한 확장 가능한 비디오 코덱)

  • 노경택
    • Journal of the Korea Society of Computer and Information / v.5 no.3 / pp.29-32 / 2000
  • Layered video coding schemes allow video information to be transmitted in multiple video bitstreams to achieve scalability. They are attractive in theory for two reasons. First, they naturally allow for heterogeneity in networks and receivers in terms of client processing capability and network bandwidth. Second, they correspond to optimal utilization of available bandwidth when several video quality levels are desired. In this paper we propose a scalable video codec architecture with motion estimation, which is suitable for real-time audio and video communication over packet networks. The coding algorithm is compatible with ITU-T Recommendation H.263+ and includes various techniques to reduce complexity. Fast motion estimation is performed at the H.263-compatible base layer and reused at higher layers, and perceptual macroblock skipping is performed at all layers before motion estimation. Error propagation from packet loss is avoided by periodically rebuilding a valid predictor in intra mode at each layer.


Design of video encoder using Multi-dimensional DCT (다차원 DCT를 이용한 비디오 부호화기 설계)

  • Jeon, S.Y.;Choi, W.J.;Oh, S.J.;Jeong, S.Y.;Choi, J.S.;Moon, K.A.;Hong, J.W.;Ahn, C.B.
    • Journal of Broadcast Engineering / v.13 no.5 / pp.732-743 / 2008
  • In H.264/AVC, a 4$\times$4 block transform is used for intra and inter prediction instead of an 8$\times$8 block transform. By coding with a small block size, H.264/AVC obtains high temporal prediction efficiency, but it is limited in exploiting spatial redundancy. Motivated by these points, we propose a multi-dimensional transform that achieves both accurate temporal prediction and effective use of spatial redundancy. In preliminary experiments, the proposed multi-dimensional transform achieves higher energy compaction than the 2-D DCT used in H.264. We designed an integer-based transform and quantization coder for the multi-dimensional coder. Several additional methods for the multi-dimensional coder are also proposed: cube forming, scan order, mode decision, and parameter updating. The Context-based Adaptive Variable-Length Coding (CAVLC) used in H.264 was employed as the entropy coder. Simulation results show that the performance of the multi-dimensional codec is similar to that of H.264 at lower bit rates, although the rate-distortion curves of the multi-dimensional DCT measured by entropy and by the number of non-zero coefficients show remarkably higher performance than those of H.264/AVC. This implies that a more efficient entropy coder optimized to the statistics of the multi-dimensional DCT coefficients, together with rate-distortion optimization, is needed to take full advantage of the multi-dimensional DCT. Many issues remain as future work on improving the coding efficiency of the multi-dimensional coder over H.264/AVC.