Evaluation of Various Tone Mapping Operators for Backward Compatible JPEG Image Coding

  • Choi, Seungcheol (Department of Electronics Engineering, Sejong University) ;
  • Kwon, Oh-Jin (Department of Electronics Engineering, Sejong University) ;
  • Jang, Dukhyun (Department of Electronics Engineering, Sejong University) ;
  • Choi, Seokrim (Department of Electronics Engineering, Sejong University)
  • Received : 2015.03.19
  • Accepted : 2015.07.16
  • Published : 2015.09.30

Abstract

Recently, the standardization of backward compatible JPEG image coding for high dynamic range (HDR) images has been undertaken to establish an international standard called "JPEG XT." JPEG XT consists of two layers: the base layer and the residual layer. The base layer contains tone mapped low dynamic range (LDR) image data, and the residual layer contains the error signal used to reconstruct the HDR image. This paper presents the results of a study evaluating the overall performance of tone mapping operators (TMOs) for this standard. The evaluation is performed using five HDR image datasets and six TMOs for profiles A, B, and C of the proposed JPEG XT standard. The Tone Mapped image Quality Index (TMQI) and no reference image quality assessment (NR IQA) are used for measuring the LDR image quality. The peak signal to noise ratio (PSNR) is used to evaluate the overall compression performance of JPEG XT profiles A, B, and C. In the TMQI and NR IQA measurements, the TMOs using display adaptive tone mapping and adaptive logarithmic mapping, respectively, gave good results. The TMO using adaptive logarithmic mapping also gave good PSNRs.

1. Introduction

In digital image processing, the dynamic range of an image is defined as the ratio between the maximum and the minimum measurable light intensity. Whereas the dynamic range of real-world light is 10^9 : 1, the newest image sensor has a dynamic range of about 2.8 10 :14 : 1. For this reason, the digitized image has a low dynamic range (LDR) when compared to the full dynamic range. High dynamic range (HDR) imaging shows potential in a variety of applications because it provides more detailed information for under-exposed and over-exposed regions in the LDR image [1][2].
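
The definition above can be illustrated with a short calculation. The following is a minimal sketch, assuming luminance samples are given as positive numbers; the function name and the sample values are hypothetical and are not taken from the evaluation.

```python
import math

def dynamic_range(luminances):
    """Return (ratio, stops, decades) for a collection of luminance samples.

    The ratio is max/min over the measurable (strictly positive) samples.
    """
    values = [v for v in luminances if v > 0]  # zero or negative samples are not measurable
    if not values:
        raise ValueError("need at least one positive luminance value")
    ratio = max(values) / min(values)
    # The same ratio expressed in photographic stops (log2) and in decades (log10).
    return ratio, math.log2(ratio), math.log10(ratio)

# Hypothetical scene luminances in arbitrary linear units.
ratio, stops, decades = dynamic_range([0.01, 0.5, 20.0, 10000.0])
print(f"{ratio:.0f}:1, {stops:.1f} stops, {decades:.1f} decades")
```

Expressing the ratio in stops makes it easy to compare a scene against the bit depth of a sensor or display.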

Even today, although high-end digital imaging products use sensors that digitize HDR images with 10-16 bits per color channel, display devices using 8 bits per channel remain the mainstream in the industry. Thus, a standard for HDR image coding that provides backward compatibility with LDR images has attracted much interest in the industry. A working group (WG) of the Joint Photographic Experts Group (JPEG), meeting in Paris in 2012, established a JPEG XT standard (ISO/IEC 18477) for the compression of HDR images. Even though the JPEG 2000 standard (ISO/IEC 15444) and the JPEG XR standard (ISO/IEC 29199) provide HDR image coding specifications, they are not compatible with the legacy JPEG standard (ISO/IEC 10918). For this reason, their adoption in the industry has been limited. The JPEG XT standard has emerged to overcome this problem.

Fig. 1 shows the functional block diagram of a JPEG XT encoder. The solid line path shows the flow for providing backward compatibility, which compresses the tone mapped LDR image using the legacy ISO/IEC 10918-1 encoder. The dashed line path shows the flow for producing the error signal between the original HDR image and the tone mapped LDR image.

Fig. 1. Overview of the JPEG XT encoding process

Since the main function of the backward compatible part is the tone mapping operation, it is essential to evaluate the influence of the tone mapping operator (TMO) on the JPEG XT compression performance.

The JPEG XT standard provides three different profiles A, B, and C, based on the method of producing the error signal. Profiles A and B produce the residual image using a division operation between the tone mapped LDR image and the original HDR image. Profile C produces the residual image using a subtraction operation. In addition, the function of the transform block is different for each profile and the information for the transform is saved in the syntax defined in ISO/IEC 18477-3.
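
The two ways of forming the residual can be sketched on a handful of samples. This is a simplified illustration only: it works on plain lists of linear values, takes the ratio as HDR divided by LDR, and ignores the color space conversions, transforms, and quantization that the actual profiles apply. All function names and sample values are hypothetical.

```python
def ratio_residual(hdr, ldr, eps=1e-6):
    """Division-based residual (the idea behind profiles A and B)."""
    # eps guards against division by zero in fully black LDR samples.
    return [h / max(l, eps) for h, l in zip(hdr, ldr)]

def difference_residual(hdr, ldr):
    """Subtraction-based residual (the idea behind profile C)."""
    return [h - l for h, l in zip(hdr, ldr)]

def reconstruct_from_ratio(ldr, residual, eps=1e-6):
    """Invert ratio_residual: multiply the LDR sample by its ratio."""
    return [max(l, eps) * r for l, r in zip(ldr, residual)]

def reconstruct_from_difference(ldr, residual):
    """Invert difference_residual: add the residual back to the LDR sample."""
    return [l + r for l, r in zip(ldr, residual)]

hdr = [0.02, 1.5, 40.0, 900.0]   # hypothetical linear HDR samples
ldr = [0.05, 0.60, 0.95, 0.99]   # hypothetical tone mapped samples in [0, 1]

print(reconstruct_from_ratio(ldr, ratio_residual(hdr, ldr)))
print(reconstruct_from_difference(ldr, difference_residual(hdr, ldr)))
```

Without quantization both residuals reconstruct the HDR samples up to floating-point rounding; in a real codec the residual is lossily coded, which is where the profiles start to differ.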

Fig. 2 shows the structure of the basic decoding process defined in JPEG XT. Since the base layer delivers the encoded LDR image from the legacy JPEG encoder, this layer becomes the backward compatible part, which can be decoded by the JPEG legacy code stream decoder. The residual layer, which is not processed in the legacy JPEG decoder, delivers the residual error signal with the information including the inverse tone mapping information. The JPEG XT decoder reconstructs the final HDR image using de-correlated LDR data and decoded residual error signal data [3].

Fig. 2. Overview of the JPEG XT decoding process [3]

In 2013, the JPEG WG determined the verification test procedure and released the demo software in response to the call for proposals [4]. Since the main purpose of the first verification test was not the coding performance comparison of profiles but the examination of whether the proposed profiles are consistent with the requirements, the test was performed by fixing the TMO as the operator proposed by Reinhard et al. [5]. The test result of each profile was measured by the signal to noise ratio (SNR), mean relative square error (MRSE), and High Dynamic Range Visible Difference Predictor (HDR-VDP) 2.0 [6]-[9].

This paper evaluates various TMOs for the JPEG XT profiles. Comparison of TMOs was performed based on two aspects: rate-distortion (R-D, PSNR versus overall bit rate) performance and quality of the tone mapped LDR images. The evaluation was performed using five HDR image datasets and six TMOs. The peak signal to noise ratio (PSNR) was used to evaluate the effect of TMOs on the overall compression performance of JPEG XT profiles A, B, and C. The Tone Mapped image Quality Index (TMQI) and no reference image quality assessment (NR IQA) were used for measuring the LDR image quality.

In section 2, related considerations are described. The method and procedure for the test are illustrated in section 3. Experimental results are presented in section 4. Finally, section 5 concludes the paper.

 

2. Related Considerations

2.1 JPEG XT profiles

As shown in Fig. 3, the residual image of profile A contains the HDR luminance ratio and the chrominance residuals. The luminance and chrominance signals are processed in the YCbCr color space. The luminance signal is the ratio of the HDR luminance to the tone mapped LDR luminance. The chrominance signal is the difference between the LDR chrominance and the HDR chrominance. In contrast, as shown in Fig. 4, the residual data of profile B is the fractional part of the tone mapped LDR image data divided by the original HDR image data in the RGB color space. Profile C, on the other hand, uses as the error signal a difference signal obtained by subtracting the tone mapped LDR image data from the original HDR image data. As shown in Fig. 5, a refinement scan is added to increase the bit precision of the coefficients in the discrete cosine transform domain.

Fig. 3. High level overview of the decoding process of a profile A compliant decoder [10]

Fig. 4. High level overview of the decoding process of a profile B compliant decoder [10]

Fig. 5. High level overview of the decoding process of a profile C compliant decoder [10]

2.2 Image Quality Assessment for Tone Mapped LDR image

A subjective evaluation may seem to be the most appropriate way for measuring the quality of the image. However, it is time-consuming and expensive. For this reason, the objective IQA metrics are frequently used. PSNR and Structural SIMilarity (SSIM) index [11] are commonly used for the evaluation of the image coding performance. They measure the image quality by calculating the distortion of the coded image referenced to the original image and are designed for the images whose original and coded versions are in the same dynamic range. However, in JPEG XT, the dynamic range of the tone mapped LDR image is not the same as the dynamic range of the original HDR image. Therefore, another metric is needed to measure the quality of the LDR image from JPEG XT.
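
For reference, the generic definition of PSNR can be sketched as follows. This is the textbook form, assuming both signals share the same range and that the caller supplies the peak value; the function name is hypothetical, and the sketch is not the JPEG WG tool used later in this paper.

```python
import math

def psnr(reference, test, peak):
    """Peak signal to noise ratio in dB between two equally sized sample lists."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the test signal.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical 8-bit samples: every test sample is off by one code value.
print(psnr([10, 20, 30], [11, 19, 31], peak=255))
```

Because the peak and the error are measured on the same scale, this definition only makes sense when the reference and the test image have the same dynamic range, which is the limitation discussed above.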

Recently, new metrics for evaluating the quality of the tone mapped LDR image were introduced. Aydin et al. [12] suggested the Dynamic Range Independent Quality Measure (DRIM) providing a map for the change of contrast after performing the tone mapping. However, DRIM does not provide the image quality in the form of a score. Yeganeh and Wang [13] proposed the TMQI metric, which is a composite measurement based on the modified SSIM and the statistical naturalness. The TMQI measures the quality of the LDR image by using the original HDR image as the reference. Ma et al. [14] utilized the TMQI in order to design their optimized TMO. In this paper, the TMQI is used for evaluating the effect of TMOs on the coding performance of JPEG XT.
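
The composite nature of the TMQI can be sketched as the last step of the metric, where a structural fidelity score and a statistical naturalness score, both in [0, 1], are merged into one value. The power-weighted form and the default constants below are our understanding of [13] and should be treated as assumptions to be checked against that paper; computing the two input scores is not shown, and the function name is hypothetical.

```python
def tmqi_score(structural_fidelity, naturalness,
               a=0.8012, alpha=0.3046, beta=0.7088):
    """Combine structural fidelity and naturalness (both in [0, 1]) into one score.

    The defaults are assumed values, not verified against [13].
    """
    for value in (structural_fidelity, naturalness):
        if not 0.0 <= value <= 1.0:
            raise ValueError("component scores must lie in [0, 1]")
    # a balances the two components; alpha and beta control their sensitivity.
    return a * structural_fidelity ** alpha + (1.0 - a) * naturalness ** beta

# Hypothetical component scores for one tone mapped image.
print(tmqi_score(0.85, 0.40))
```

With these weights the structural term dominates, but a low naturalness score still lowers the result noticeably.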

2.3 No Reference Image Quality Assessment

As introduced in the previous section, PSNR, SSIM, DRIM, and TMQI are full reference IQA metrics. NR IQA metrics, in contrast, are designed to measure image quality when an appropriate reference image does not exist, as in the case of image fusion. Recently, NR IQA metrics that utilize essential perceptual attributes of the human visual system (HVS) have been proposed. Zhang and Le [15] proposed a metric, denoted by Q, that uses image brightness details. Q was originally introduced for JPEG2000 images; however, because it is based on the basic activity of general pixels, it is also suitable for measuring luminance details. Its performance was shown to be competitive with state-of-the-art NR IQA metrics. Panetta et al. [16] proposed the Color Root Mean Enhancement (CRME) measure, which uses the contrast information of the image. CRME incorporates the idea of relative root-mean-square contrast and the just-noticeable-difference characteristic of the HVS. The performance of CRME was tested using the TID2008 image quality database [17]. Experimental results on Gaussian blur and contrast change distortions showed high correlations between the measure and the subjectively evaluated mean opinion scores of the database. Hasler et al. [18] proposed a metric quantifying the colorfulness and naturalness of the image. They performed a psychophysical experiment in which 20 people gave a global colorfulness rating for each of a set of 84 images. The measure obtained about 95% correlation with the experimental data.
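
As an illustration of how a colorfulness measure of this kind can be computed, the sketch below uses the opponent-color statistics form commonly attributed to [18]: the standard deviations and means of the rg = R - G and yb = (R + G)/2 - B channels. Whether this is exactly the variant used in this paper is an assumption; the function name and pixel values are hypothetical.

```python
import math
from statistics import fmean, pstdev

def colorfulness(pixels):
    """Colorfulness of an image given as an iterable of (R, G, B) tuples."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("need at least one pixel")
    # Opponent color channels: red-green and yellow-blue.
    rg = [r - g for r, g, b in pixels]
    yb = [0.5 * (r + g) - b for r, g, b in pixels]
    # Combine the spread and the mean offset of both channels.
    sigma = math.hypot(pstdev(rg), pstdev(yb))
    mu = math.hypot(fmean(rg), fmean(yb))
    return sigma + 0.3 * mu

gray = [(128, 128, 128)] * 4
vivid = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)]
print(colorfulness(gray), colorfulness(vivid))
```

A uniformly gray image scores zero, and the score grows as the opponent channels spread out or shift away from neutral.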

Furthermore, a few subjective tests were conducted to evaluate the performance of the TMOs. Čadík et al. [19] investigated the influence of the perceptual image attributes: brightness, contrast, colors, details, and artifacts, on the overall image quality. One typical problem with some TMOs is halo artifacts introduced due to contrast, which is usually related to variations in image brightness. Another study by Narwaria et al. [20] investigated and quantified the impact of TMOs on human visual attention in HDR images. The results have shown that TMOs have a significant impact on visual attention patterns.

In this paper, an objective NR IQA metric, S, is proposed to assess the quality of the tone mapped LDR image. S incorporates the major attributes that affect the subjective image quality: image details, contrast, and naturalness. Since artifacts and brightness are related to contrast [19], those attributes were not adopted in our method. S is formulated using multiple linear regression on three functions of the image details, contrast, and naturalness attributes. Q, CRME, and the colorfulness metric of Hasler et al. [18] measure the image details, the contrast, and the naturalness, respectively, as follows, where a larger value implies better performance.

with

and

The balance parameters α, β, and γ may be adjusted based on the user’s preference for the brightness details, the contrast, and the colorfulness. In this evaluation, all three parameters are set to the same default value.
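
The role of the balance parameters can be sketched as a weighted combination of the three scores. The sketch below assumes a plain weighted sum of scores that are already on comparable scales, which is a simplification and not necessarily the exact form of equation (1); the function names, the default weight of 1.0, and the placeholder scores are all hypothetical.

```python
def combined_score(q, crme, colorfulness, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted sum of detail (q), contrast (crme), and colorfulness scores.

    Assumes the three inputs were already normalized to comparable ranges.
    """
    return alpha * q + beta * crme + gamma * colorfulness

def rank_tmos(scores, **weights):
    """scores: {tmo_name: (q, crme, colorfulness)} -> names sorted best first."""
    return sorted(scores,
                  key=lambda name: combined_score(*scores[name], **weights),
                  reverse=True)

# Placeholder values for two unnamed operators, not measured results.
placeholder = {"tmo_a": (0.2, 0.3, 0.1), "tmo_b": (0.5, 0.1, 0.4)}
print(rank_tmos(placeholder))                          # equal weights
print(rank_tmos(placeholder, alpha=0.0, gamma=0.0))    # contrast only
```

Changing the weights can change the ranking, which is why the evaluation fixes all three to the same value.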

 

3. Evaluations

3.1 Evaluation Software

The official reference software of the JPEG XT standard has not yet been released. However, the JPEG WG provides demo software implementing all the profiles of JPEG XT, and this demo software is used in this paper’s evaluation. The JPEG WG has also registered an official application program to calculate the PSNR [21]. This program is used to calculate the PSNR values in this paper.

3.2 Encoding Parameters

The JPEG XT encoder employs two encoding systems, one for the base image and one for the residual image. Both systems use the legacy ISO/IEC 10918-1 encoder to guarantee backward compatibility. The compression ratio of the legacy ISO/IEC 10918-1 encoder is determined by the quality value provided by the user. Therefore, two quality values need to be set: one for the base image and one for the residual image. In the last verification test performed by the JPEG WG in 2013, specific quality value combinations determined by experiment were used to obtain approximately 1, 2, 3, 4, and 5 bits per pixel.

In this paper’s evaluation, to fully understand the influence of the TMO, more quality value combinations were tested. The quality values of the base image were set to 70 and 90. The quality values of the residual image were set to 20, 40, 50, 60, 70, 80, 90, and 100. It was found to be sufficient to use only quality values of 70 and 90 for the base image because experimental results using different quality values were similar.
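
The tested quality value combinations can be enumerated directly from the values listed above. This is a minimal sketch; the constant and function names are hypothetical, and the sketch only lists the parameter pairs without invoking any encoder.

```python
from itertools import product

BASE_QUALITIES = (70, 90)                              # base image quality values
RESIDUAL_QUALITIES = (20, 40, 50, 60, 70, 80, 90, 100) # residual image quality values

def quality_grid(base=BASE_QUALITIES, residual=RESIDUAL_QUALITIES):
    """Return every (base_quality, residual_quality) pair to be encoded."""
    return list(product(base, residual))

grid = quality_grid()
print(len(grid))   # number of quality pairs per image, TMO, and profile
print(grid[:3])
```

Each pair yields one point on a rate-distortion curve for a given image, TMO, and profile.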

Table 1. Selected HDR Images

3.3 Datasets

Five HDR images of different dynamic ranges from the Fairchild database were used for this evaluation. These images are the ones originally selected by the JPEG WG for the verification tests of the JPEG XT standard. The dynamic range and the resolution of HDR images were the selection criteria of the WG. Tone mapped versions of the selected HDR images are shown in Table 2.

Table 2. Tone mapped LDR images

3.4 Tone Mapping Operators

Generally, the TMO is used to display the HDR image or video on conventional display devices. TMOs are commonly divided into two categories: local operator and global operator. Global TMOs apply the same rule for mapping all the pixels in an image based on global image characteristics, and local TMOs use a spatially varying mapping rule depending on each pixel’s neighbors.
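
To make the idea of a global operator concrete, the sketch below applies one curve to every pixel, using the simple global form of the photographic operator described in [5]: luminances are scaled by a key value relative to their log-average and then compressed with L / (1 + L). This illustrates the category only and is not the software used in this evaluation; the function names, the key of 0.18, and the sample values are assumptions.

```python
import math

def log_average(luminances, delta=1e-6):
    """Geometric mean of the luminances; delta avoids log(0) for black pixels."""
    return math.exp(sum(math.log(delta + v) for v in luminances) / len(luminances))

def global_tone_map(luminances, key=0.18, delta=1e-6):
    """Map linear HDR luminances to [0, 1) with one curve shared by all pixels."""
    if not luminances:
        raise ValueError("need at least one luminance value")
    scale = key / log_average(luminances, delta)
    scaled = [scale * v for v in luminances]
    # The same compressive curve is applied regardless of pixel position.
    return [s / (1.0 + s) for s in scaled]

# Hypothetical scene luminances spanning six orders of magnitude.
print(global_tone_map([0.01, 1.0, 100.0, 10000.0]))
```

A local operator would instead replace the shared curve with one whose parameters depend on each pixel's neighborhood, at the cost of possible halo artifacts.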

For this paper’s evaluation, six TMOs were used. They were “Reinhard-02” by Reinhard et al. [5], “Drago” by Drago et al. [22], “iCAM06” by Kuang et al. [23], “Mai-11” by Mai et al. [24], “Mantiuk-08” by Mantiuk et al. [25], and “Reinhard-05” by Reinhard and Devlin [26]. Reinhard-02 and Reinhard-05 were chosen to extend the JPEG WG’s verification test. Drago, iCAM06, and Mantiuk-08 have been used in several evaluation studies [19][20], and Mai-11 was selected because it is a recently proposed TMO.

These six TMOs, which are used to evaluate the influence of the mapping methods, fall into two categories: global operators (Drago, Mai-11, and Reinhard-05) and local operators (iCAM06, Mantiuk-08, and Reinhard-02). Table 2 shows the LDR images generated by the chosen TMOs. They were produced by programs provided by the authors. The images generated by the Reinhard-05 TMO tend to be relatively dark. Those LDR images were used without any modification.

 

4. Experimental Results

This evaluation focuses on the following measurements: the quality of the tone mapped LDR image and the PSNR performance of the reconstructed HDR image. TMQI and S defined by equation (1) are used for measuring the quality of the LDR images. Since the JPEG XT standard supports backward compatibility with the legacy JPEG standard, the quality of the reconstructed HDR image relies on the base layer, which is the tone mapped version of the original HDR image. For this reason, evaluating the performance of the TMOs is essential.

4.1 Results of TMQI Measurement

Fig. 6 shows the TMQI measurement of the tone mapped LDR images. The plot depicts the content dependency of the TMQI performance. The BloomingGorse2 and CanadianFalls images show relatively better quality than the McKeesPub, MtRushmore2, and WillyDesk images, since the naturalness measurement is included in the TMQI metric. As a result, TMQI performance is highly dependent on the image content. The averaged TMQI performance of the TMOs is shown in Fig. 7. Mantiuk-08, which uses display adaptive tone mapping, achieved the best score, 0.832. In contrast, as shown in Table 2 and Fig. 7, the Reinhard-05 TMO exhibited the worst TMQI performance among the compared TMOs.

Fig. 6. TMQI comparison between datasets

Fig. 7. TMQI comparison between TMOs

4.2 Results of NR IQA Measurement

Table 3 compares the NR IQA scores, which represent the quality of the tone mapped LDR images. Higher values signify better scores for all metrics. Q, CRME, and the colorfulness metric of Hasler et al. [18] exhibit different results for different images; they are mainly based on the values of brightness detail, contrast, and colorfulness, respectively. The range of Q and CRME is much narrower than that of the colorfulness metric, so it is hard to see performance differences among TMOs when Q and CRME are used for the comparison. Since the colorfulness metric is the most discriminative of the three metrics, we may conclude that the performance differences among TMOs depend mainly on colorfulness. S is a combined IQA metric that weighs these three NR IQA measurements equally. Overall, the Drago TMO shows the highest measured S among the six TMOs.

Table 3. IQA Comparison between TMOs

4.3 Results of PSNR Measurement

We have obtained the PSNR-based R-D curves for three profiles, six TMOs, and five HDR images. Since JPEG XT images consist of a base layer and a residual layer, the overall bit rate has to be allocated between the two layers. To evaluate performance under realistic conditions, the quality value of the base layer was set to 70, a common quality value for JPEG encoders. We varied the quality value of the residual layer to find the best relationship between the PSNR and the bit rate. The various rate values correspond to the various rates of residual code streams in the JPEG XT code stream encoded with the parameters described in section 3.
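
The rate axis of such an R-D curve is usually expressed in bits per pixel. The sketch below shows that conversion and how measured points could be ordered into a curve; which bytes are counted (the whole code stream or only the residual part) is left to the caller, and the function names, byte counts, PSNR values, and image size are hypothetical placeholders rather than measured data.

```python
def bits_per_pixel(codestream_bytes, width, height):
    """Convert a code stream size in bytes to a bit rate in bits per pixel."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    return 8.0 * codestream_bytes / (width * height)

def rd_curve(measurements, width, height):
    """measurements: iterable of (codestream_bytes, psnr_db) pairs.

    Returns (bits_per_pixel, psnr_db) points sorted by increasing rate.
    """
    points = [(bits_per_pixel(size, width, height), quality)
              for size, quality in measurements]
    return sorted(points)

# Placeholder measurements for a hypothetical 100 x 80 image.
print(rd_curve([(3000, 41.0), (1000, 35.5), (2000, 39.0)], width=100, height=80))
```

Sorting by rate makes it straightforward to plot one curve per profile and TMO and to see where the PSNR starts to saturate.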

Fig. 8 shows the R-D curves for the three profiles. Profile A exhibits relatively little dependence on the TMO used. The R-D curves of profile B for different TMOs are similar in shape; however, their PSNR levels are highly dependent on the TMO used. Profile C also exhibits low dependence on the TMO and outperforms profiles A and B at higher bit rates: its PSNR keeps increasing as the bit rate grows, whereas the PSNR of profiles A and B saturates at lower bit rates. These results may be summarized as follows:

Fig. 8. PSNR performance comparison between TMOs. The base layer quality value is set to 70, and the residual layer quality values are set to 20, 40, 50, 60, 70, 80, 90, and 100. The figure shows PSNR as a function of bit rate. Each R-D curve shows the performance of one profile: blue lines (profile A), red lines (profile B), and green lines (profile C). Each marker represents one image of the dataset: BloomingGorse2 (circle), CanadianFalls (square), McKeesPub (cross), MtRushmore2 (point), and WillyDesk (asterisk).

Overall, the Drago TMO showed the best PSNR performance for all profiles among the TMOs used in this paper’s experiment. Moreover, the TMQI results indicate that the compression performance of JPEG XT does not depend on the quality of the tone mapped LDR image, since the TMO with the best TMQI score (Mantiuk-08) is not the one with the best PSNR (Drago).

 

5. Conclusion

This paper presents the results of a study evaluating the performance of various TMOs in terms of the tone mapped LDR image quality and the PSNR performance of the reconstructed HDR image. Two metrics were used to measure the quality of tone mapped LDR images: TMQI, based on statistical characteristics, and S, based on HVS characteristics. Mantiuk-08 gave the highest TMQI score and Drago gave the highest value of S. As for the PSNR performance, the Drago TMO demonstrated the best results.

This evaluation of the JPEG XT standard can be extended to include a larger set of HDR images and other tone mapping operators.

References

  1. E. Reinhard, W. Heidrich, P. Debevec, S. Pattanaik, G. Ward, and K. Myszkowski, High dynamic range imaging: acquisition, display, and image based lighting, Morgan Kaufmann, 2010.
  2. F. Banterle, A. Artusi, K. Debattista, and A. Chalmers, Advanced high dynamic range imaging: theory and practice, CRC Press, 2011.
  3. T. Richter, W. Husak, A. Ninan, A. Ten, W. Jia, W. Rozzi, P. Korshunov, T. Ebrahimi, A. Artusi, and M. Agostinelli, Text of ISO/IEC WD1 18477-2, JPEG document, wg1n6864, Strasbourg, France, Oct. 2014.
  4. T. Richter, Notes from the Brussels Interim Meeting, JPEG document, wg1n6586, Brussels, Belgium, Dec. 2013.
  5. E. Reinhard, M. Stark, P. Shirley, and J. Ferwerda, “Photographic tone reproduction for digital images,” ACM Transactions on Graphics, vol. 21, no. 3, pp. 267-276, July 2002. https://doi.org/10.1145/566654.566575
  6. K. Fliegel and L. Krasula, JPEG XT verification tests by CTU in Prague, JPEG document, wg1n6584, San Jose, CA, Jan. 2014.
  7. P. Korshunov and T. Ebrahimi, JPEG XT verification tests by EPFL, JPEG document, wg1n6587, San Jose, CA, Jan. 2014.
  8. A. Pinheiro, JPEG XT verification tests by U.B.I., Covilha, Portugal, JPEG document, wg1n6590, San Jose, CA, Jan. 2014.
  9. T. Bruylants, JPEG XT verification test, VUB-iMinds, JPEG document, wg1n6592, San Jose, CA, Jan. 2014.
  10. T. Richter, A. Artusi, and M. Agostinelli, Text of ISO/IEC DIS1 18477-7, JPEG document, wg1n6839, Strasbourg, France, Oct. 2014.
  11. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600-612, Apr. 2004. https://doi.org/10.1109/TIP.2003.819861
  12. T. Aydin, R. Mantiuk, K. Myszkowski, and H. Seidel, “Dynamic range independent image quality assessment,” ACM Transactions on Graphics, vol. 27, no. 3, pp. 69, Aug. 2008. https://doi.org/10.1145/1360612.1360668
  13. H. Yeganeh and Z. Wang, “Objective quality assessment of tone mapped images,” IEEE Trans. Image Process., vol. 22, no. 2, pp. 657-667, Feb. 2013. https://doi.org/10.1109/TIP.2012.2221725
  14. K. Ma, H. Yeganeh, K. Zeng, and Z. Wang, “High dynamic range image tone mapping by optimizing tone mapped image quality index,” in Proc. of IEEE International Conference on Multimedia and Expo, vol. 3, no. 7, pp. 37, July 2014.
  15. J. Zhang and T. M. Le, “A new no-reference quality metric for JPEG2000 images,” IEEE Trans. Consumer Electron., vol. 56, no. 2, pp. 743-750, May 2010.
  16. K. Panetta, C. Gao, and S. Agaian, “No reference color image contrast and quality measure,” IEEE Trans. Consumer Electron., vol. 59, no. 3, pp. 643-651, Aug. 2013. https://doi.org/10.1109/TCE.2013.6626251
  17. N. Ponomarenko, V. Lukin, A. Zelensky, K. Egiazarian, M. Carli, and F. Battisti, “TID2008-a database for evaluation of full-reference visual quality assessment metrics,” Advances of Modern Radioelectronics, vol. 10, no. 4, pp. 30-45, 2009.
  18. D. Hasler and S. E. Suesstrunk, “Measuring colorfulness in natural images,” in Proc. of SPIE Human Vision and Electronic Imaging VIII, vol. 5007, pp. 87-95, June 2003.
  19. M. Čadík, M. Wimmer, L. Neumann, and A. Artusi, “Evaluation of HDR tone mapping methods using essential perceptual attributes,” Computers & Graphics, vol. 32, no. 3, pp. 330-349, June 2008. https://doi.org/10.1016/j.cag.2008.04.003
  20. M. Narwaria, M. P. Da Silva, P. Le Callet, and R. Pepion, “Tone mapping based HDR compression: Does it affect visual experience?,” Special Issue on Advances in High Dynamic Video Research, vol. 29, no. 2, pp. 257-273, Feb. 2014.
  21. T. Richter, difftest_ng utility for measurement and format conversions, JPEG document, wg1n6562, San Jose, CA, Jan. 2014.
  22. F. Drago, K. Myszkowski, T. Annen, and N. Chiba, “Adaptive logarithmic mapping for displaying high contrast scenes,” in Proc. of EUROGRAPHICS 2003, vol. 22, no. 3, pp. 419-426, Sep. 2003.
  23. J. Kuang, G. M. Johnson, and M. D. Fairchild, “iCAM06: A refined image appearance model for HDR image rendering,” Journal of Visual Communication and Image Representation, vol. 18, pp. 406-414, Oct. 2007. https://doi.org/10.1016/j.jvcir.2007.06.003
  24. Z. Mai, H. Mansour, R. Mantiuk, P. Nasiopoulos, R. Ward, and W. Heidrich, “Optimizing a tone curve for backward-compatible high dynamic range image and video compression,” IEEE Trans. Image Process., vol. 20, no. 6, pp. 1558-1571, June 2011. https://doi.org/10.1109/TIP.2010.2095866
  25. R. Mantiuk, S. Daly, and L. Kerofsky, “Display adaptive tone mapping,” ACM Transactions on Graphics, vol. 27, no. 3, pp. 68, Aug. 2008. https://doi.org/10.1145/1360612.1360667
  26. E. Reinhard and K. Devlin, “Dynamic range reduction inspired by photoreceptor physiology,” IEEE Trans. Visualization and Computer Graphics, vol. 11, no. 1, pp. 13-24, Jan. 2005. https://doi.org/10.1109/TVCG.2005.9