Video Content-Based Bit Rate Estimation Scheme for Transcoding in IPTV Services

  • Cho, Hye Jeong (AV Research and Development Laboratory, ARION Technology Inc.) ;
  • Sohn, Chae-Bong (Department of Electronics and Communications Engineering, Kwangwoon University) ;
  • Oh, Seoung-Jun (Department of Electronic Engineering, Kwangwoon University)
  • Received : 2013.10.31
  • Accepted : 2014.01.04
  • Published : 2014.03.31

Abstract

In this paper, a new bit rate estimation scheme is proposed to determine the bit rate for each subclass in an MPEG-2 TS to H.264/AVC transcoder after dividing an input MPEG-2 TS sequence into several subclasses. Video format transcoding in conventional IPTV and Smart TV services is time-consuming because the input sequence must be fully transcoded several times at different bit rates to find one suitable for the service. The proposed scheme automatically decides the bit rate of the transcoded video sequence so that it occupies as little space as possible on a video streaming server without any loss of subjective quality. In the proposed scheme, an input sequence to the transcoder is sub-classified by hierarchical clustering using a parameter value extracted from each frame. The candidate frames of each subclass are used to estimate the bit rate through statistical analysis and a mathematical model. Experimental results show that the proposed scheme reduces the bit rate, on average, by approximately 52% for low-complexity video and 6% for high-complexity video, with negligible degradation in subjective quality.

1. Introduction

Broadcasting and communications convergence services use limited networks to deliver IPTV, Smart TV, and other Internet services to consumers. The compression technology of serviced video is applied according to two business models: Managed Network and Open Internet. IPTV service providers deliver the H.264/AVC video content through the Managed Network. In the Open Internet, the service is delivered over the public Internet and should enable access to video content not only from TV sets but also from other home devices, such as portable multimedia players and laptop computers. The scalable video coding (SVC) technology enables the system to consider the available bandwidth for other devices. The fully implemented SVC, however, also comes with some increase in complexity and bit rate for the same fidelity as compared with single-layer coding [1]. A further study is needed on how to best control the SVC rate according to the network resource availability [2]. Most IPTV services are focused on delivering high-resolution/high-quality video over the Managed Network, with supporting quality of service (QoS).

The MPEG-2 standard has been widely deployed in video distribution infrastructures, such as cable and satellite networks, as well as in several consumer applications, such as DVDs and DVRs. The H.264/AVC standard is used in many video streaming services limited by the network bandwidth and offers a significant reduction in the bit rate over earlier standards-based technologies such as MPEG-2 (65%) and MPEG-4 (40-50%) [3] [4]. The standard achieves better performance in terms of both the peak signal to noise ratio (PSNR) and visual quality at the same bit rate as compared with prior video coding standards.

In video streaming services for IPTV and Smart TV, a video transcoder is necessary to apply the compression efficiency of H.264/AVC to broadcast-quality content produced in the MPEG-2 format. Fig. 1 shows the process used to deliver video content to users over the Managed Network.

Fig. 1. Video content transcoding process in an IPTV service.

In the transcoder, the input video is decoded by an MPEG-2 decoder and re-encoded by an H.264/AVC encoder at a fixed bit rate. After the subjective quality is validated, the video content is stored on a video streaming server and then delivered to users [5]. The encoded video content is usually delivered through constant bit rate (CBR) channels. The channel bit rates needed for SDTV and HDTV video can be as high as 2–3Mbps and 10–12Mbps, respectively. Content on a CBR channel is encoded at one of these fixed bit rates without taking its characteristics into account; however, the serviced video content varies from low-complexity video to high-complexity video. The former can be encoded at a bit rate lower than the fixed bit rate without degradation in subjective quality. In other words, the conventional scheme based on a fixed bit rate wastes bandwidth and requires a huge amount of storage space on a streaming server. When open IPTV services are launched, providers will deliver content that, unlike content customized for specific companies, is a network resource that anyone can access. In order to deliver a considerable amount of content on a CBR channel, it is important to select an efficient bit rate.

Solving this problem requires a scheme capable of finding an appropriate bit rate for video content while maintaining a subjective quality equivalent to that of a scheme that uses a fixed bit rate. Employing this scheme requires determining a bit rate for video content prior to encoding it. A video transcoder can provide an additional controller that can also estimate the bit rate. A simple technique to estimate the video content’s bit rate is to vary the bit rate step in the H.264/AVC encoder part of the transcoder. The visual quality should be verified at each encoding pass. Even though this method can provide an accurate bit rate, it is a very time-consuming process. The time required to estimate the bit rate should be minimized to meet the video streaming service requirements.

In this paper, a scheme is proposed for automatically estimating the bit rate of each subclass without repeated full encoding and subjective quality tests. Using parameters extracted from each frame, the video content is divided into several segments. To estimate the bit rate of each segment, candidate frames are extracted, which include intra-frames that require a large number of bits. Finally, the bit rate of each segment is estimated by statistical analysis and a mathematical model based on a given target quality. The remainder of this paper is organized as follows. Section 2 analyzes video content with respect to quality and bit rate. Section 3 proposes a bit rate estimation scheme with unsupervised segmentation based on the frame complexity of video content. The experimental results and conclusions are presented in Sections 4 and 5, respectively.

 

2. Analysis of the Quality and Bit Rate of Video Content

The purpose of this analysis is to examine the human-perceived quality corresponding to the bit rates of a video. The subjective quality of H.264/AVC encoded video is evaluated for a low-complexity content category, such as “lecture”, coded at bit rates from 1.0 to 2.5Mbps. The evaluation is performed using the double-stimulus continuous quality scale (DSCQS) method of ITU-R Rec. BT.500-7 [6]. All the coded stimuli are rated by each of five viewers, and general conclusions are based on the quality ratings of the presented stimuli. The main idea of the DSCQS score is to determine the differential mean opinion score (DMOS) between the reference sequence encoded at 2.5Mbps and the test sequences, averaged over all viewers. A DMOS value, dMOS, is defined as follows:

$$d_{MOS} = MOS_r - MOS_p \tag{1}$$

where MOSr is the MOS of the reference sequence encoded at 2.5Mbps, and MOSp is the MOS of the test sequence encoded below 2.5Mbps. The task is to assess the degradation of the test sequence with respect to the reference sequence. If dMOS is near 0, the test sequence is perceived as similar to the reference sequence. Fig. 2 shows the average of all dMOS values for a low-complexity video. The bit rate at which quality degradation was perceived was, on average, 1.4Mbps. Therefore, the low-complexity video can be encoded at a bit rate lower than 2.5Mbps with negligible degradation of subjective quality.

Fig. 2. Result of quality evaluation.
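
The DMOS computation can be illustrated with a minimal sketch. It assumes that dMOS is the reference MOS minus the test MOS and that each viewer rates both sequences; the function name `dmos` and all ratings are hypothetical values chosen for illustration, not data from the experiment.

```python
from statistics import mean


def dmos(ref_scores, test_scores):
    """Differential MOS: mean reference rating minus mean test rating.

    Assumption: one rating per viewer for each sequence, on the same scale.
    """
    if len(ref_scores) != len(test_scores):
        raise ValueError("each viewer must rate both sequences")
    return mean(ref_scores) - mean(test_scores)


# Hypothetical ratings from five viewers (not measured data).
ref_2500k = [82, 78, 85, 80, 79]   # reference encoded at 2.5 Mbps
test_1500k = [81, 77, 84, 80, 78]  # test sequence encoded at a lower bit rate

# A value close to 0 means the test sequence is rated like the reference.
print(f"dMOS = {dmos(ref_2500k, test_1500k):.2f}")
```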

Further, the difference between the variable bit rate (VBR) at QP 22 and the CBR at 2.5Mbps is analyzed for the test sequence. As shown in Fig. 3, some video content can be encoded at a lower bit rate than at the fixed bit rate. Video content can be divided into two or three subclasses in terms of the quality of experience (QoE). It can also be delivered using more than one bit rate according to subclasses in a CBR channel.

Fig. 3. The differential ratio between VBR and CBR.

 

3. Proposed Scheme

In this section, a bit rate estimation scheme is proposed that reduces the bit rate while maintaining the target quality in video streaming services limited by network bandwidth. Fig. 4 shows a block diagram of the proposed scheme. Given an MPEG-2 TS input sequence, the TS parser extracts the MPEG-2 video data, which is then decompressed by the MPEG-2 decoder. Because broadcast video is commonly interlaced, the deinterlacer converts interlaced frames into progressive frames. Using parameters extracted from the decoded frames, the video is divided into several segments. To estimate the bit rate of each segment, candidate frames are extracted, which include intra-frames that require a large number of bits. Finally, the bit rate of each segment is estimated by statistical analysis and a mathematical model based on the target quality. The input video is re-encoded by H.264/AVC at the estimated bit rate. After the subjective quality is validated, the video content is stored on a video streaming server.

Fig. 4. Block diagram of the proposed scheme.

The proposed scheme differs from the conventional scheme in that it employs a bit rate estimator. Because the proposed scheme does not encode full frames of video content, it is very important to determine parameters that can serve to indirectly measure a frame’s bits.

3.1 Frame Complexity Estimation for an Intra-frame

Some content complexity measurements for coding still images can be obtained without pre-encoding by using variance, edge, and gradient methods [7]. From the deviation of each macroblock (MB), the complexity can also be determined [8]. In the gradient-based method, the computation for calculating the gradient is low, and the output bit rate of each intra-frame is highly correlated [9]. These properties are highly desirable for measuring the complexity of an intra-frame. In addition to the gradient information, the histograms of luminance and chrominance pixel values are also very useful when combined with the gradient to represent the content complexity.

An arbitrary sth test sequence Qs is a set of groups of pictures (GOPs) in which the intra- and inter-frames are arranged in order:

$$Q_s = \{\, Q_s(i,j) \mid i = 1,\dots,M;\ j = 1,\dots,N \,\} \tag{2}$$

where M is the total number of GOPs, and N is the number of frames in a GOP. Qs(i,j) denotes the jth frame of the ith GOP. Our objective is to measure the intra-frame complexity in Qs. In order to measure the frame complexity, the complexity measurement defined in [10], FCintra, is used. The value of FCintra for Qs(i,j) ∈ Qs, CC(Qs(i,j)), can be computed by (3).

where

In (3), Grads,i and SOHs,i are the gradient and the statistic, respectively, of the histogram information of the ith intra-frame. Ys,i(x, y) is the luminance value of pixel (x, y) in the ith frame. Us,i(x, y) and Vs,i(x, y) are the corresponding chrominance values. KYLY, KULU, and KVLV are the sizes of the Y-, U-, and V-frames in Qs(i,1). HYs,i[l] is the histogram of the luminance level l, and HUs,i[l] and HVs,i[l] are the histograms corresponding to the chrominance level l.

To investigate the relationship between the actual number of encoded bits and FCintra, various test sequences were encoded using the intra-coding mode under constant quantization parameters (QPs), and both the number of encoded bits and the FCintra of each frame were recorded. Fig. 5 shows scatter plots of the number of bits versus FCintra at different QPs for our test content, where each dot represents a frame. Fig. 5 also shows the linear approximations (as blue dotted lines) together with the correlation coefficient r, which indicates how closely the approximated linear relationship represents the actual data. The value of r lies between -1 and 1, and the closer it is to 1, the more reliable the approximated linear relationship. For the test sequences, the value of r between the number of bits and FCintra is, on average, 0.93. Therefore, a linear relationship, with a slope that differs between sequences, exists in our test sequences, and (3) can be used to estimate the number of bits for intra-frames.

Fig. 5. Scatter plots of the number of encoded bits versus FCintra: (a) Documentary, (b) Lecture, (c) Religion, and (d) Sports.
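
The linear approximation and the correlation coefficient r can be computed as in the following sketch. It uses ordinary least squares and Pearson's r; the function name `linear_fit` and the sample (FCintra, bits) pairs are hypothetical and are not taken from the test sequences.

```python
import math


def linear_fit(fc_values, bit_counts):
    """Least-squares line bits = slope * FC + intercept, plus Pearson's r."""
    n = len(fc_values)
    mean_fc = sum(fc_values) / n
    mean_bits = sum(bit_counts) / n
    s_xx = sum((x - mean_fc) ** 2 for x in fc_values)
    s_yy = sum((y - mean_bits) ** 2 for y in bit_counts)
    s_xy = sum((x - mean_fc) * (y - mean_bits)
               for x, y in zip(fc_values, bit_counts))
    slope = s_xy / s_xx
    intercept = mean_bits - slope * mean_fc
    r = s_xy / math.sqrt(s_xx * s_yy)
    return slope, intercept, r


# Hypothetical (FCintra, encoded bits) pairs for intra-frames at one fixed QP.
fc = [10.0, 20.0, 30.0, 40.0, 50.0]
bits = [52_000, 98_000, 151_000, 203_000, 247_000]

slope, intercept, r = linear_fit(fc, bits)
print(f"bits ~ {slope:.1f} * FC + {intercept:.1f}, r = {r:.3f}")
```

A separate fit per sequence and per QP would be needed in practice, since the slopes differ between sequences.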

3.2 Hierarchical Clustering-Based Video Sub-classification

Each subclass, that is, a cluster or group of FCintra patterns, contains frames with a similar number of bits. The classifier for FCintra is designed by hierarchical clustering with Bayesian decision theory [11].

Consider a sequence T containing n samples and c clusters. To conduct agglomerative hierarchical clustering for FCintra, the number of initial clusters, n, is determined by analyzing the temporal characteristics between frames. The scale-invariant feature transform (SIFT) is sequentially applied to detect stable frames among temporal frames [12]. Let T(x,y,t) be the ordinal signature of the (x,y)th block of the tth frame in T. Gσ(x,y,t) defines a 3×3×3 Gaussian kernel with standard deviation σ as follows:

A 3×3×3 difference-of-Gaussian (DoG) kernel [13] is derived by computing the difference between two Gaussian kernels as follows:

where k > 1 is a multiplicative factor, and s = 1,2,…, is the scale of the DoG kernel. Then, the DoG kernel sliding over T is used to generate a vector ψ by the convolution operation as follows:

for t = 1,…,m. If the tth element of ψ is a local extremum, the tth frame is considered to be a key frame in T. In this paper, the parameters are set to σ = 1.8, , and s = 3. A sequence consists of the static subclass ω0 and the dynamic subclass ω1, which are separated according to the distribution of ψ. The two subclasses are defined as follows:

where ω0 indicates that the tth and (t-1)th elements of ψ have the same value, whereas ω1 indicates that they differ. The number of initial clusters n is determined by the intervals of successive ω0’s and the number of ω1’s. Fig. 6 shows the number of initial clusters in a sequence.

Fig. 6. Examples of the number of initial clusters

To show how ω0 and ω1 are distributed over the frame variations, the lines in the figure take the values 0 and 1 for ω0 and ω1, respectively. In the example of Fig. 6, the number of initial clusters is 71. The center of a cluster is the average of the FCintra values for an interval of ω0 frames and the single FCintra value for an ω1 frame. The distance between two clusters is measured with the Euclidean metric [14].
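
The construction of the initial clusters can be sketched as follows. This is one possible reading of the description above, not a confirmed implementation: each run of consecutive ω0 frames is assumed to form one cluster, each ω1 frame is assumed to form its own cluster, and the first frame is assumed to start a static run. The function name `initial_clusters` and the sample values of ψ and FCintra are hypothetical.

```python
def initial_clusters(psi, fc):
    """Split frame indices into initial clusters and compute cluster centers.

    psi: per-frame response values; fc: per-frame complexity values.
    Assumption: frame t is static (label 0) when psi[t] == psi[t - 1] and
    dynamic (label 1) otherwise; the first frame is treated as static.
    """
    labels = [0] + [0 if psi[t] == psi[t - 1] else 1
                    for t in range(1, len(psi))]
    clusters, run = [], []
    for t, label in enumerate(labels):
        if label == 0:
            run.append(t)             # extend the current static interval
        else:
            if run:
                clusters.append(run)  # close the static interval
                run = []
            clusters.append([t])      # a dynamic frame is its own cluster
    if run:
        clusters.append(run)
    # Center: mean complexity over the frames of each cluster.
    centers = [sum(fc[t] for t in c) / len(c) for c in clusters]
    return clusters, centers


# Hypothetical data for eight frames.
psi = [0, 0, 0, 1, 1, 1, 2, 2]
fc = [10.0, 12.0, 11.0, 40.0, 42.0, 44.0, 20.0, 22.0]

clusters, centers = initial_clusters(psi, fc)
print(len(clusters), clusters, centers)
```

With real-valued responses, the equality test would likely need a tolerance instead of an exact comparison.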

Given two clusters, Bayesian decision theory decides whether they belong to the same subclass. This approach quantifies the trade-offs between classification decisions using probabilities and the costs that accompany such decisions. It assumes that the decision problem is posed in probabilistic terms and that all of the relevant probability values are known. Let P(ωk) be the prior probability of subclass ωk. These prior probabilities reflect prior knowledge of how likely the static or dynamic subclass is before a sequence actually appears. The difference between the representative FCintra’s of the two clusters is measured. Its value x is considered to be a random variable whose distribution depends on the subclass and is expressed as p(x|ωk). To determine the subclass, the following decision rule is used: decide ω0 if P(ω0|x) > P(ω1|x); otherwise decide ω1. The decision rule can be expressed as follows:

$$P(\omega_0 \mid x) \underset{\omega_1}{\overset{\omega_0}{\gtrless}} P(\omega_1 \mid x) \tag{8}$$

Suppose that both the prior probabilities P(ωk) and the conditional densities p(x|ωk) are known. The joint probability density of finding a pattern that is in subclass ωk and has feature value x can be written in two ways: p(ωk,x) = P(ωk|x)p(x) = p(x|ωk)P(ωk). Bayes’ formula can therefore be expressed as follows:

$$P(\omega_k \mid x) = \frac{p(x \mid \omega_k)\, P(\omega_k)}{p(x)} \tag{9}$$

Using (9), the decision rule of (8) can be rewritten as follows:

$$\frac{p(x \mid \omega_0)}{p(x \mid \omega_1)} \underset{\omega_1}{\overset{\omega_0}{\gtrless}} \frac{P(\omega_1)}{P(\omega_0)} \tag{10}$$

The quantity on the left is called the likelihood ratio and is denoted by Λ(x):

$$\Lambda(x) = \frac{p(x \mid \omega_0)}{p(x \mid \omega_1)} \tag{11}$$

The quantity on the right-hand side of (10) is the threshold of the test and is denoted by η:

$$\eta = \frac{P(\omega_1)}{P(\omega_0)} \tag{12}$$

Thus, the Bayes criterion leads to the likelihood ratio test (LRT) shown in (13):

$$\Lambda(x) \underset{\omega_1}{\overset{\omega_0}{\gtrless}} \eta \tag{13}$$

Because an exponential model fits the measured data well, p(x|ω0) and p(x|ω1) are assumed to follow approximately exponential distributions:

where k ∈ {0, 1} indexes the subclass ωk, and αk and βk are the model’s parameters. In this paper, the prior probabilities P(ω0) and P(ω1) for the test sequences are investigated as shown in Table 1. On average, P(ω0) is 0.93, and P(ω1) is 0.07. The model parameter values are α0 = 1,140,000, β0 = 2.824, α1 = 2,810, and β1 = 0.390.

Table 1. Prior probabilities according to test sequences

Using (13), it can be determined whether the given two clusters are merged or not: two clusters are merged if Λ(x) is greater than η. Finally, c clusters can be obtained according to FCintra distribution, as shown in Fig. 7.

Fig. 7. Relationship between FCintra distribution fc and the final clusters
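
The merge decision can be sketched with the reported priors and model parameters. The sketch assumes that the class-conditional model has the form p(x|ωk) = αk·exp(−βk·x), which is only one reading of "approximately exponential", and that the threshold is the prior ratio P(ω1)/P(ω0). The scale of x is not stated in the text, so the two example distances are purely illustrative; `should_merge` is a hypothetical name.

```python
import math

# Values reported in the text.
ALPHA = (1_140_000.0, 2_810.0)   # alpha_0, alpha_1
BETA = (2.824, 0.390)            # beta_0, beta_1
PRIOR = (0.93, 0.07)             # P(w0), P(w1)


def log_likelihood(x, k):
    """ln p(x | w_k) under the ASSUMED model alpha_k * exp(-beta_k * x)."""
    return math.log(ALPHA[k]) - BETA[k] * x


def should_merge(x):
    """Likelihood ratio test: merge when Lambda(x) > eta.

    The comparison is done in the log domain to avoid overflow/underflow.
    """
    log_lambda = log_likelihood(x, 0) - log_likelihood(x, 1)
    log_eta = math.log(PRIOR[1] / PRIOR[0])
    return log_lambda > log_eta


# Illustrative distances between two cluster representatives.
for distance in (1.0, 5.0):
    print(distance, should_merge(distance))
```

Under these assumptions a small distance between cluster representatives favors merging, and a large one keeps the clusters apart.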

Although the correlation between FCintra and the number of bits is high, the maximum FCintra frame does not always have the maximum number of encoded bits. Thus, the candidate intra-frame needs to be extracted. The candidate frame set Hs contains intra-frames, and a candidate frame in Hs is the frame that requires more than a certain number of encoded bits. Hs is specified in (15):

In (15), each element of Hs is a candidate intra-frame, D is the number of candidate frames, M is the number of intra-frames, θ(•) is a nondecreasing mapping function from the integer set {1,…,M}, and μc is the average of FCintra’s in each cluster. If CC(Qs(i,1)) is greater than the content-adaptive threshold μc, the ith intra-frame is extracted as a candidate frame of the cth cluster.
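
The thresholding step can be sketched as follows, assuming that the threshold μc is the plain arithmetic mean of the FCintra values of the intra-frames in one cluster. The function name `candidate_frames` and the sample values are hypothetical.

```python
def candidate_frames(cluster_fc):
    """Return indices of intra-frames whose complexity exceeds the cluster mean.

    cluster_fc: dict mapping an intra-frame index to its FCintra value,
    restricted to the intra-frames of a single cluster.
    """
    mu_c = sum(cluster_fc.values()) / len(cluster_fc)  # content-adaptive threshold
    return sorted(i for i, fc in cluster_fc.items() if fc > mu_c)


# Hypothetical cluster with four intra-frames (one per 15-frame GOP).
cluster = {1: 10.0, 16: 30.0, 31: 20.0, 46: 40.0}

print(candidate_frames(cluster))
```

Only the frames returned here would need to be encoded during estimation, which is where the time saving over full transcoding comes from.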

3.3 Model-Based Bit Rate Estimation

Using the candidate frames and the FCintra value of each cluster, the bit rates of the clusters can be estimated via statistical analysis and a mathematical model. To estimate the bit rate while maintaining the given PSNR quality, a PSNR-Q model derived from the H.264/AVC quantization process [15] is proposed in this paper. With this model, an estimated QP is determined and is then applied to the bit rate estimation. The relationship between the quantization step size (Qstep) and QP is given in (16) as follows:

$$Q_{step} = \frac{PF \cdot 2^{qbits}}{MF} \tag{16}$$

where PF and MF are a post-scaling and a multiplication factor, respectively, in the H.264/AVC standard, and qbits = 15 + floor(QP/6). When uniform quantization is applied to uniformly distributed inputs, the mean square error (MSE) is given by

$$MSE = \frac{Q_{step}^{2}}{12} \tag{17}$$

From (16) and (17), the PSNR can be derived as

$$PSNR = 10 \log_{10} \frac{255^{2}}{MSE} \approx a \cdot QP + b \tag{18}$$

where a and b are constants obtained by linear regression [16]. As a result, the value of QP can be estimated as

$$QP_e = \frac{PSNR_t - b}{a} \tag{19}$$

where PSNRt is a given target PSNR, and QPe is an estimated QP.
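
The QP estimation can be sketched as follows, assuming a linear PSNR-QP relationship PSNR ≈ a·QP + b whose constants are fitted by least squares. The (QP, PSNR) samples are hypothetical, and rounding to the nearest integer and clamping to the H.264/AVC range 0 to 51 are choices made for this sketch, not steps stated in the text.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return a, mean_y - a * mean_x


def estimate_qp(target_psnr, a, b, qp_min=0, qp_max=51):
    """Invert PSNR = a * QP + b, then round and clamp to a valid QP."""
    qp = round((target_psnr - b) / a)
    return max(qp_min, min(qp_max, qp))


# Hypothetical (QP, PSNR in dB) measurements from a few trial encodings.
qps = [22, 26, 30, 34]
psnrs = [42.0, 39.6, 37.2, 34.8]

a, b = fit_line(qps, psnrs)
qp_e = estimate_qp(42.0, a, b)
print(f"a = {a:.3f}, b = {b:.2f}, QPe = {qp_e}")
```

Rounding down instead of to the nearest integer would be the more conservative choice if the target PSNR must never be undershot.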

Using QPe, the number of intra-frame bits is estimated first. Some parameters obtained from the intra-frame estimation are then used to estimate the number of inter-frame bits in a GOP. To estimate the number of intra-frame bits, a simple but effective Rate-Quantization (R-Q) model is used. An exponential relationship between the actual number of encoded bits and QP was modeled by Zhou et al. [17]. For simplicity, the R-Q model for an intra-frame is defined as:

where Rq,1(QPe) is the number of encoded bits for the qth candidate intra-frame at QPe, and αq and βq are the model parameters. To reveal the relationship between the number of encoded bits and QP, Fig. 8 shows several examples of curve-fitting results for intra-frames, with each small dot of the mathematically approximated curves representing the actual number of encoded bits of an intra-frame at each QP. Because αq and βq can be obtained by exponential regression, Rq,1 can also be calculated by (20).

Fig. 8. R-Q curves for the test sequences. (a) Music video, (b) Lecture, (c) Sports, (d) Documentary
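
Exponential regression for the intra-frame R-Q model can be sketched as follows. The sketch assumes the form R(QP) = α·exp(−β·QP), which is one plausible reading of the exponential model, and fits it by least squares on ln R. The sample points are synthetic, generated from known parameters so that the fit can be checked; `fit_rq` and `predict_bits` are hypothetical names.

```python
import math


def fit_rq(qps, bits):
    """Fit R(QP) = alpha * exp(-beta * QP) via a linear fit of ln R on QP."""
    logs = [math.log(b) for b in bits]
    n = len(qps)
    mean_q = sum(qps) / n
    mean_l = sum(logs) / n
    slope = (sum((q - mean_q) * (l - mean_l) for q, l in zip(qps, logs))
             / sum((q - mean_q) ** 2 for q in qps))
    alpha = math.exp(mean_l - slope * mean_q)
    beta = -slope
    return alpha, beta


def predict_bits(qp, alpha, beta):
    """Estimated number of intra-frame bits at the given QP."""
    return alpha * math.exp(-beta * qp)


# Synthetic samples from alpha = 500000, beta = 0.1 (not measured data).
sample_qps = [22, 26, 30, 34, 38]
sample_bits = [500_000 * math.exp(-0.1 * q) for q in sample_qps]

alpha_q, beta_q = fit_rq(sample_qps, sample_bits)
print(f"alpha = {alpha_q:.0f}, beta = {beta_q:.3f}")
print(f"bits at QP 24 = {predict_bits(24, alpha_q, beta_q):.0f}")
```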

It is difficult to directly estimate the number of inter-frame bits in H.264/AVC. Thus, the bit rate conversion method introduced in [18] is used with the value of QPe instead of using the intra-frame R-Q model. The bit rate conversion is defined as

where Rq,j+1(QPP) is the number of encoded bits for the (j+1)th inter-frame in the qth GOP at QPP, and G is the GOP size. As defined in (21), this method requires encoding a GOP at a certain value of QP, QPs, as a reference; that is, Rq,j+1(QPs) is computed in advance. In the experiments, the value of QPs is 26. Furthermore, QPP is set to QPe+1 here because an inter-frame QP is an intra-frame QP+1 in H.264/AVC rate control. After estimating the number of intra- and inter-frame bits, the total number of bits for each GOP, Rq, can be estimated using (20) and (21) as follows:

$$R_q = R_{q,1}(QP_e) + \sum_{j=1}^{G-1} R_{q,j+1}(QP_P) \tag{22}$$

The bit rate of each cluster is estimated using the GOP that is expected to have the maximum number of encoded bits among all candidate frames in the cluster. If the same bit rate is estimated for several clusters, these clusters are grouped into one segment. Consequently, the number of segments in a sequence is less than or equal to the number of clusters.
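
The last two steps, summing the bits of a GOP and grouping clusters into segments, can be sketched as follows. The conversion from bits per GOP to a bit rate assumes a frame rate of 30 frames per second, which is not stated in the text, and the grouping assumes that only neighboring clusters with identical estimated bit rates are merged. All names and numbers are hypothetical.

```python
from itertools import groupby


def gop_bits(intra_bits, inter_bits):
    """Total bits of one GOP: the intra-frame plus all inter-frames."""
    return intra_bits + sum(inter_bits)


def gop_bit_rate(total_bits, gop_size, fps=30.0):
    """Bit rate in bits per second, assuming `fps` frames per second."""
    return total_bits * fps / gop_size


def group_segments(cluster_rates):
    """Merge runs of neighboring clusters that share the same bit rate.

    Returns a list of (bit_rate, [cluster indices]) tuples.
    """
    segments, start = [], 0
    for rate, run in groupby(cluster_rates):
        count = len(list(run))
        segments.append((rate, list(range(start, start + count))))
        start += count
    return segments


# Hypothetical GOP of 15 frames: one intra-frame and 14 inter-frames.
total = gop_bits(120_000, [20_000] * 14)
print(total, gop_bit_rate(total, gop_size=15))

# Hypothetical per-cluster bit rates in Mbps.
print(group_segments([1.0, 1.0, 1.5, 1.5, 1.5, 1.0]))
```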

 

4. Experimental Results

The performance of the proposed scheme is evaluated with several types of IPTV content. The proposed scheme will be called class-based bit rate estimation (CBRE) hereinafter, and the conventional scheme with a fixed bit rate of 2.5 Mbps will be called fixed bit rate estimation (FBRE) [19]. The standard definition (SD) resolution video content is categorized into four genres: lecture, religion and documentary, drama and animation, and music video and sports. A total of 30 videos in Table 2 are used as test sequences.

Table 2. Test sequences

In our experiment, the size of GOP is 15, and its type is set to IPPP. The target PSNR is set to 42dB. The simulated results encoded by FBRE can be compared in terms of the bit rate and quality to those encoded by CBRE. In order to evaluate the bit rate reduction, ΔR is calculated as follows:

where and indicate the bit rates by FBRE and CBRE in the ith cluster, respectively.
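
A bit rate reduction ratio of this kind can be sketched as follows. The sketch assumes that the reduction is the relative difference between the FBRE and CBRE bit rates summed over clusters of equal duration, expressed as a percentage. The exact definition used in the experiments may weight the clusters differently, and the sample bit rates are hypothetical.

```python
def bit_rate_reduction(fbre_rates, cbre_rates):
    """Percentage reduction of CBRE relative to FBRE over all clusters.

    Assumption: every cluster has the same duration, so the per-cluster
    bit rates can simply be summed.
    """
    if len(fbre_rates) != len(cbre_rates):
        raise ValueError("one bit rate per cluster is required for each scheme")
    total_fbre = sum(fbre_rates)
    total_cbre = sum(cbre_rates)
    return 100.0 * (total_fbre - total_cbre) / total_fbre


# Hypothetical per-cluster bit rates in Mbps; FBRE uses 2.5 Mbps throughout.
fbre = [2.5, 2.5, 2.5]
cbre = [1.0, 1.5, 2.5]

print(f"{bit_rate_reduction(fbre, cbre):.1f}%")
```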

Table 3 shows the results of bit rate reduction. CBRE can reduce the bit rate by up to 65.2% as compared with FBRE. On average, CBRE reduces the bit rate by approximately 52% and 6% in low- and high-complexity video sequences, respectively. Because CBRE assigns the bit rate according to the complexity of each segment, a relatively high bit rate reduction can be achieved in the low-complexity video class.

Table 3. Bit rate reduction ratios of CBRE

Since the bit rate can be estimated by encoding candidate frames instead of the total frames, the computational complexity for CBRE depends on the ratio of the number of candidate frames to the total number of frames. Fig. 9 shows these ratios in the test sequences.

Fig. 9. Ratios of the number of candidate frames to the total number of frames in test sequences

Table 4 shows that the difference in PSNR performance is approximately 1.2dB on average. However, as shown in Fig. 10, this difference is too small to cause visible degradation in the test sequences: because the target PSNR in (19) is set to 40dB, it is difficult to perceive a subjective quality difference.

Table 4. PSNR difference between FBRE and CBRE

Fig. 10. Subjective quality comparison: (a) CBRE and (b) FBRE

 

5. Conclusions

The transcoding bit rate decision in conventional IPTV and Smart TV services is a time-consuming process because the input sequence must be fully transcoded several times at different bit rates to decide a suitable bit rate. This paper shows that the video bit rate in an MPEG-2 TS to H.264/AVC transcoder, which is an essential device in those services, can be decided automatically while maintaining subjective video quality. The proposed bit rate estimation scheme consists of two modules: hierarchical clustering-based sub-classification and statistical analysis-based bit rate estimation. The input sequence was grouped into several subclasses by hierarchical clustering using the parameter value extracted from each frame. The candidate frames of each subclass were used to estimate the bit rate using statistical analysis and a mathematical model. The bit rate could be estimated automatically by encoding only the candidate frames.

The proposed scheme reduced the fixed bit rate, on average, by 52% in low-complexity video and by 6% in high-complexity video while maintaining the subjective quality. For future work, we plan to study practical issues in implementing the proposed scheme. In real TV services, further work is needed to simplify the proposed scheme, especially the clustering-based video sub-classification. We also need to extend the results to HD test sequences.

References

  1. H. L. Cycon, T. C. Schmidt, M. Wahlisch, D. Marpe, and M. Winken, "A temporally scalable video codec and its applications to a video conferencing system with dynamic network adaption for mobiles," IEEE Trans. Consumer Electron., vol. 57, no. 3, pp. 1408-1415, Aug. 2011. https://doi.org/10.1109/TCE.2011.6018901
  2. S. Park and S. H. Jeong, "Mobile IPTV: approaches, challenges, standards and QoS support," IEEE Internet Comput., vol. 13, no. 3, pp. 22-31, May-Jun. 2009.
  3. T. Wiegand, G. J. Sullivan, G. Bjontegaard, and A. Luthra, "Overview of the H.264/AVC video coding standard," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 560-576, Jul. 2003. https://doi.org/10.1109/TCSVT.2003.815165
  4. A. Joch, F. Kossentini, H. Schwarz, T. Wiegand, and G.J. Sullivan, "Performance comparison of video coding standards using Lagrangian coder control," in Proc. of IEEE Int. Conf. Image Processing, vol. 2, pp. II-501-504, Sep. 2002.
  5. T. Kim and H. Bahn, "Implementation of the storage manager for an IPTV set-top box," IEEE Trans. Consumer Electron., vol. 54, no. 4, pp. 1770-1775, Nov. 2008. https://doi.org/10.1109/TCE.2008.4711233
  6. ITU-R Recommendation BT.500-11, "Methodology for the subjective assessment of the quality of television pictures," ITU, 2002.
  7. Wook Joong Kim, Jong Won Yi, and Seong Dae Kim, "A bit allocation method based on picture activity for still image coding," IEEE Trans. Image Process., vol. 8, no. 7, pp. 974-977, 1999. https://doi.org/10.1109/83.772244
  8. J. Li and E. Abdel-Raheem, "Efficient rate control H.264/AVC intra frame," IEEE Trans. Consumer Electron., vol. 56, no. 5, pp. 1043-1048, May 2010. https://doi.org/10.1109/TCE.2010.5506037
  9. X. Jing, L.-P. Chau, and W.-C. Siu, "Frame complexity-based rate-quantization model for H.264/AVC intraframe rate control," IEEE Signal Process. Lett., vol. 15, pp. 373-376, 2008. https://doi.org/10.1109/LSP.2008.920010
  10. Y. Zhou, Y. Sun, Z. Feng, and S. Sun, "New rate-distortion modeling and efficient rate control for H.264/AVC video coding," Signal Process.: Image Commun., vol. 24, no. 5, pp. 345-356, May 2009. https://doi.org/10.1016/j.image.2009.02.014
  11. Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification, 2nd ed., Wiley-Interscience, 2000, pp. 20-82.
  12. C.Y. Chiu, C.S. Chen, and L.F. Chien, "A framework for handling spatiotemporal variations in video copy detection," IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 3, pp. 412-417, Mar. 2008. https://doi.org/10.1109/TCSVT.2008.918447
  13. D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. Journal of Computer Vision, vol. 60, pp. 91-110, 2004. https://doi.org/10.1023/B:VISI.0000029664.99615.94
  14. M. M. Deza and E. Deza, Encyclopedia of Distances, 1st ed., Springer, 2009, pp. 89-100.
  15. Y. Liu, Z. G. Li, and Y. C. Soh, "A novel rate control scheme for low delay video communication of H.264/AVC standard," IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 1, pp. 68-78, Jan. 2007. https://doi.org/10.1109/TCSVT.2006.887081
  16. A.L. Edwards, An Introduction to Linear Regression and Correlation, W.H. Freeman, pp. 33-46, 1976.
  17. Y. Zhou, Y. Sun, Z. Feng, and S. Sun, "New rate-distortion modeling and efficient rate control for H.264/AVC video coding," Signal Process.: Image Commun., vol. 24, no. 5, pp. 345-356, May 2009. https://doi.org/10.1016/j.image.2009.02.014
  18. Q. Tang, H. Mansour, P. Nasiopoulos, and R. Ward, "Bit-rate estimation for bit-rate reduction H.264/AVC video transcoding in wireless networks," in Proc. of IEEE Int. Sym. Wireless Pervasive Comput., pp. 464-467, May 2008.
  19. H. J. Cho, J. Lee, D. Y. Noh, S. H. Jang, J. C. Kwon, and S. J. Oh, "A new video bit rate estimation scheme using a model for IPTV services," KSII Trans. Internet and Information Syst., vol. 5, no. 10, pp. 1814-1829, Oct. 2011.