1. Introduction
The High Efficiency Video Coding (HEVC) is a new video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC) formed by the ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG). In comparison with its predecessor H.264/AVC, 50% bitrate reduction is achieved for equal perceptual quality [1]. While the same block-based hybrid coding structure as in H.264/AVC is adopted in HEVC, each coding tool is improved with new technical features and characteristics. As opposed to the fixed size macro-block (MB) in H.264/AVC, the coding tree unit (CTU) in HEVC is configurable and its size can be chosen from 64x64, 32x32 and 16x16. A CTU consists of one coding unit (CU) or can be recursively split into a quadtree of CUs. The largest size of a CU is equal to that of a CTU, and the smallest is 8x8. An example of CU splitting and its corresponding quadtree is shown in Fig. 1. The dotted lines with arrows represent the processing order of the CUs in the CTU, the notation “1” in the quadtree indicates further splitting and “0” indicates leaf node. A CU can be further split into multiple prediction units (PU) for inter or intra prediction. In intra coding, the PU partition type can be either PART_2Nx2N or PART_NxN, as shown in Fig. 2. However, PART_NxN is only allowed for the minimum CU size, otherwise it is similar as four equal-size sub-CUs.
Fig. 1.Example of a 64x64 CTU and its CU partition and processing order. (a) CU partitions (b) quadtree structure
Fig. 2.PU types for intra prediction
During the transform and quantization process, a CU can also be recursively split into a quadtree of transform unit (TU). The intra prediction is applied for each TU sequentially instead of applying the intra prediction at the PU level [2]. In this way, the nearest neighboring reference samples from the already reconstructed TU are used for prediction. The HEVC defines 35 intra prediction modes, namley Intra_DC, Intra_Planar and 33 angular prediction modes denoded as Intra_Angular[k], k = 2,…,34. In comparison with H.264/AVC, the intra prediction modes defined in HEVC are able to capture more directional structures.
In the HEVC test mode (HM), a rate-distortion (RD) optimized bottom-up pruning algorithm [3] is adopted to find the best CU partition of a CTU. Given a CU, the best PU partition type, prediction mode and TU partition are found by minimizing the Lagrangian cost. While the flexible block structure and more choices of prediction mode provide better coding performance, it dramatically increases the encoder complexity. Even for intraframe coding, it is still far away from real-time application [4]. The computational complexity mainly comes from two folds. First, an optimal CU partition within a CTU can consist of various sizes of CU ranging from 8x8 to 64x64. Second, finding an optimal prediction mode requires a RD optimization process. Therefore, fast algorithms in both aspects are desirable. Research works regarding fast CU size decision include [5-8]; works targeting fast intra mode decision include [9-15]. In our previous work [16], a novel fast bottom-up pruning (FBUP) algorithm was proposed and proved to be efficient. In this paper, three techniques are proposed for fast intra mode decision and combined with the FBUP algorithm. Starting from the smallest CU size, a full quadtree is processed in a bottom-up manner. First, we check whether mode decision process is needed for a current CU node. If the mode decision process is performed, then a set of intra candidate modes are derived from the coding information obtained from its sub-blocks (sub-CUs or PART_NXN PUs), followed by a rough decision process to choose a smaller set of candidates. The number of candidates in the set is adaptively chosen based on the low complexity RD cost. In this way, we can either skip the mode decision process or reduce the number of candidate modes. Thus, the computational complexity is reduced.
In Section 2, the previous works in the literature are reviewed. Then the proposed method is described in Section 3. In Section 4, the experimental results are presented to prove the efficiency of the proposed algorithm. Compared to the original HM encoder, averaging over all the test sequences and rate points, 41.39% encoding time saving is achieved while the bitrate increase is only 0.77%. Finally, the conclusion is given in Section 5.
2. Related Works
2.1 Rate-Distortion Optimized Bottom-up Pruning
In HEVC, the CTU size is 64x64 and the maximum number of depth levels of the quadtree is D=4. The HM encoder adopts the rate-distortion optimized bottom-up pruning algorithm to obtain the optimal CU quadtree partition. Consider the full quadtree as shown in Fig. 3, we locate a node by its depth level from top to bottom and its position from left to right. Let (i, j) be the index of jth node at depth level i , then j<4i. Denote Xi,j, as a CU at node (i, j) without further splitting and Ci,j as the optimal tree after rate-distortion optimized pruning. Let J(·) be the operation to calculate the best RD cost. The bottom-up pruning algorithm traverses the full quadtree in a depth-first order. The sub-CUs of node (i, j) is pruned if and only if .
Fig. 3.Full quadtree
2.2 Fast Intra Mode Decision Algorithm in HM
Full RD optimized search in the 35 intra prediction modes is computational complex. To reduce the encoder complexity, the HM encoder adopts a fast intra mode decision algorithm, which consists of two steps described as follows.
DSATD is the Sum of Absolute Transformed Difference (SATD) which calculates the sum of abolute value of hadamard transformed coefficients of the residual signal, Rmode is the number of bits to encode mode information, Drec is the distortion of the reconstructed block, Rtotal is the total number of bits to encode current block and λ is the Lagrangian multiplier. Calculation of JSATD doesn't require the DCT transform, quantization and entropy coding process, thus it’s much simpler than calculation of the full RD cost JfullRD. Therefore, the computational complexity can be reduced if N is set to a smaller value. In the HM encoder N is set to 3, 3, 3, 8 and 8 for the PU sizes of 64×64 , 32×32 , 16×16 , 8×8 and 4×4 respectively.
3. Proposed Methods
3.1 Fast bottom-up Pruning Algorithm
3.1.1 Analysis of Bottom-up Pruning Algorithm
The bottom-up pruning algorithm performs mode decision at each node of the full quadtree. Therefore, a total of mode decisions for each CTU. However, given a CU at node (i, j), the best partition and prediction modes of its sub-CUs Ci+1,4j+k,k=0,1,2,3 are known. Thus, we can reuse the coding information in sub-CUs to check whether mode decision at current node is necessary.
Let Q(·)=0 indicates a leaf node and Q(·)=1 denotes a tree. Then Q(Ci+1,4j+k)=1 suggests that the strutural information in the k th sub-CU is complex. Hence, the current node is likely to be split. In Table 1, it shows the probability p0 of a node is split when , i.e. . The results are obtained by coding the first second (50 frames) of HEVC test sequence BasketballDrill. Similar statistics are found in coding the other sequences. Note that at level 3, it reaches the leaf node of the quadtree, thus the CU doesn’t have any sub-CUs. However, HEVC allows PART_NxN type PU at the bottom level, in which the CU can also be considered as a tree.
Table 1.The probability
3.1.2 Fast bottom-up Pruning Algorithm
Based on the above observations, we can conclude that there is a high probability that a CU remains split if its sub-CUs are sub-trees. In other words, lots of mode decision process in a large CU is unnecessary. Therefore, they could be avoided to decrease the computational complexity. In this paper, is set as the necessary condition to skip the intra mode decision process at node (i, j). An example is shown in Fig. 4. If C1,1 is a tree, then the sub-CUs of node C0,0 are not pruned. Implementation details of the proposed fast bottom-up pruning algorithmis is described by the pseudocode in Algorithm 1.
Fig. 4.Example of a pruned quadtree.
Algorithm 1Fast bottom-up pruning algorithm
3.1.2 Analysis of the Performance of Fast Bottom-up Pruning Algorithm
To analyse the performance of the proposed FBUP algorithm, we can evaluate how often a node (i, j) is skipped by FBUP while its RD cost J(Xi,j) is actually less than J(Ci,j), denoted as , and use as a messure of loss in RD performance. An example is shown in Table 2, in which the results are obtained by encoding the HEVC test sequence BasketballDrill. We can see that generally the values of p1 and ∇J are small. Similar results are observed when coding other HEVC test sequences. It indicates that the probability of the proposed FBUP algorithm fails to predict the result of RD optimized mode decision is quite small, and the loss in RD performance is also trivial.
Table 2.Analysis of the performance of proposed fast bottom-up pruning algorithm
3.2 Proposed Fast Intra Mode Decision Algorithm
In this paper, three techniques are proposed for fast intra mode decision, namely, bottom-up prediction, adaptive intra mode filtering, and DC and Planar early decision.
3.2.1 Bottom-up Prediction
In the proposed FBUP algorithm, mode decision at a CU node is selectively skipped based on the coding information of its sub-CUs. Similarly, we can also predict the intra mode of current CU node if it’s not skipped. An example is shown in Fig. 5, in which the red arrows indicate the intra prediction directions. The intra prediction mode of the block is the same as the prediction modes of its two sub-blocks.
Fig. 5.Example of bottom-up prediction.
The straightforward way of bottom-up prediction is to use the chosen best modes of sub-CUs, instead of all 35 modes, as the candidate modes for the parent CU. Though the number of candidate modes can be largely reduced by such method, however, we find that the coding efficiency is decreased significantly. Therefore, a more robust scheme is required. In this paper, the rough mode decision results of sub-CUs are employed to select the candidate modes for parent CU. Let Ωi,j be the set of candidate modes of a CU at node (i, j) and Փi+1,4j+k be the set of N modes chosen in the rough mode decision process of its k th sub-CU, then . By using the N modes that are chosen by rough mode decision instead of only one mode, prediction from sub-CUs becomes more robust. On the other hand, N is usually a small number, so it can achieve a good tradeoff between coding efficiency and encoder complexity.
3.2.2 Adaptive Intra Mode Filtering
In the fast intra mode decision algorithm in HM, the number N is set to a fixed value. In this paper, an adaptive method is proposed based on the JSATD costs in the rough mode decision process.
3.2.3 DC and Planar Early Decision
The intra mode in HEVC can be categorized into two classes: one is DC and Planar modes, the other is directional intra prediction mode. The DC and Planar modes are designed for smooth area, and the directional modes are designed for coding blocks that have directional edges. It’s assumed that the rough mode decision process can effectively distinguish these two classes. So, in this paper, if the best mode in rough mode decision process is Planar or DC mode, then the directional modes are removed from the candidate set.
3.2.4 The Overall Algorithm
The three techniques described in the previous subsections can be combined. Similar to the original intra mode decision algorithm in HM, the most probable modes (MPM) derived from neighboring coded blocks are also included in the final candidate set for RD optimized mode decision. The overall algorithm is as follows:
4. Experimental Results
The proposed algorithm was evaluated in the HEVC reference software HM8.2 [17]. The main profile intra-only encoder configuration in the common test condition [18] is used to code the HEVC test sequences in classes A to E. The maximum CU size (CTU size) is 64x64, the maximum number of CU depth levels is 4, and maximum number of TU levels is 3. Rate-distortion optimized quantization, sample adaptive offset, transform skipping, and fast transform skipping are on. Four quantization parameters (QP) {22, 27, 32, and 37} are used to encode each sequence.
To demonstrate the encoding time saving (ETS) of the proposed method, an isolated PC with Intel Core 2 3.0GHz CPU and 2.0GB RAM was used to encode the first second of each sequence. Compared with the original fast intra mode decision algorithm in HM8.2 encoder, the ETS, in percentages, and the corresponding BD-rate [19] for the proposed fast intra mode decision algorithm are shown in Table 3. Averaging over all test sequences, the ETS for different QP settings are around 25% and the Y BD-rate is only 0.40%. It’s also observed that the ETS is consistent for different test sequences and the maximum Y BD-rate is 0.70%. Thus, we can conclude that the proposed fast intra mode decision algorithm can efficiently reduce the encoder complexity without significant loss in rate-distortion performance. When the proposed fast intra mode decision algorithm is combined with the FBUP algorithm, the results are shown in Table 4. It shows that the combined algorithm can further reduce the encoder complexity. Averaging over all test sequences, the ETS for different QP settings are around 40% and the Y BD-rate is 0.77%. The maximum average bitrate increase is only 1.05%. Thus, the overall algorithm also does not compromise the RD performance.
Table 3.Encoding time saving (%) and corresponding Y BD-rate (%) of the proposed fast intra mode decision algorithm.
Table 4.Encoding time saving (%) and corresponding Y BD-rate (%) of the proposed fast intra mode decision algorithm combined with the FBUP algorithm.
5. Conclusion
In this paper, a fast intra mode decision algorithm is proposed for HEVC intraframe coding. The proposed algorithm consists of three techniques to reduce the number of candidate modes for RD optimized mode decision. Compared to the original fast intra mode decision algorithm in HM encoder, the proposed algorithm can achieve about 25% encoding time saving with only 0.40% bitrate increase. Also, the proposed fast intra mode decision algorithm can be well combined with the proposed fast bottom-up pruning algorithm. The encoder complexity is further reduced, about 41% encoding time savings are achieved, and the average bitrate increase is only 0.77%.
References
- G. J. Sullivan, J.-R. Ohm, W.-J. Han, T. Wiegand and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) standard," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1649 -1668, December, 2012. https://doi.org/10.1109/TCSVT.2012.2221191
- J. Lainema, F. Bossen, Woo-Jin Han, Junghye Min and K. Ugur, "Intra coding of the HEVC standard," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1792 -1801, December, 2012. https://doi.org/10.1109/TCSVT.2012.2221525
- G.J. Sullivan and R.L. Baker, "Efficient quadtree coding of images and video," IEEE Trans. Circuits Syst. Image Process., vol. 3, no. 3, pp. 327 -331, May, 1994. https://doi.org/10.1109/83.287030
- F. Bossen, B. Bross, K. Suhring and D. Flynn, "HEVC complexity and implementation analysis," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1685-1696, December, 2012. https://doi.org/10.1109/TCSVT.2012.2221255
- Xiaolin Shen, Lu Yu, and Jie Chen, "Fast coding unit size selection for HEVC based on bayesian decision rule," in Proc. of Picture Coding Symposium (PCS), pp. 453-456, May 7-9, 2012.
- Hui Li Tan, Fengjiao Liu, Yih Han Tan and Chuohao Yeo, "On fast coding tree block and mode decision for High-Efficiency Video Coding (HEVC)," in Proc. of IEEE Conf. Acoust., Speech and Signal Process., pp. 825-828, March 25-30, 2012.
- L. Shen, Z. Liu, X. Zhang, W. Zhao and Z. Zhang, "An effective CU size decision method for HEVC encoders," IEEE Trans. Multimedia, vol. 15, no. 2, pp. 465-470, February, 2013. https://doi.org/10.1109/TMM.2012.2231060
- Kiho Choi and Euee S. Jang, "Fast coding unit decision method based on coding tree pruning for high efficiency video coding," Optical Engineering, vol. 51, no. 3, pp. 030502-1-030502-3, March, 2012. https://doi.org/10.1117/1.OE.51.3.030502
- Yinji Piao, Junghye Min and Jianle Chen, "Encoder improvement of unified intra prediction," in document JCTVC-C207, October, 2010.
- Liang Zhao, Li Zhang, Siwei Ma and Debin Zhao, "Fast mode decision algorithm for intra prediction in HEVC," in IEEE Proc. Visual Commun. Image Process., pp. 1-4, November 6-9, 2011.
- T.L. da Silva, L. V. Agostini and L. A. da Silva Cruz, "Fast HEVC intra prediction mode decision based on edge direction information," in Proc. of The 20th European Signal Process. Conf. (EUSIPCO), pp. 1214-1218, August, 2012.
- H. Zhang and Z. Ma, "Early termination schemes for fast intra mode decision in High Efficiency Video Coding," in Proc. of IEEE Int. Symp. Circuits and Syst., pp. 25-48, May 19-23, 2013.
- Heming Sun, Dajiang Zhou and S. Goto, "A low complexity HEVC intra prediction algorithm based on level and mode filtering," in IEEE Proc. Multimedia and Expo, pp. 1085-1090, July 9-13 2012.
- Wei Jiang, Hanjie Ma and Yaowu Chen, "Gradient based fast mode decision algorithm for intra prediction in HEVC," in Proc. of IEEE Proc. Consumer Electronics, Communications and Networks (CECNet), pp. 1836-1840, April 21-23, 2012.
- Yongfei Zhang, Zhe Li and Bo Li, "Gradient-based fast decision for intra prediction in HEVC," in IEEE Proc. Visual Commun. Image Process., pp. 1-6, November 27-30, 2012.
- Han Huang, Yao Zhao, Chunyu Lin and Huihui Bai, "Fast bottom-up pruning for HEVC intraframe coding," IEEE Proc. Visual Commun. Image Process., 2013.
- HEVC test model reference software HM8.2 [Online]. Available: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-8.2/
- F. Bossen, "Common test conditions and software reference configurations," in document JCTVC-I1100, May, 2012.
Cited by
- Fast Encoder Design for Multi-view Video vol.8, pp.7, 2014, https://doi.org/10.3837/tiis.2014.07.015