Citation: Yang Tong, Yu Mei, Jiang Hao, et al. Visual perception based rate distortion optimization method for high dynamic range video coding[J]. Opto-Electronic Engineering, 2018, 45(1): 170627. doi: 10.12086/oee.2018.170627
Overview: High dynamic range (HDR) video requires far more storage and transmission bandwidth than traditional low dynamic range (LDR) video. To improve the performance of High Efficiency Video Coding (HEVC) Main 10 on such content, we propose a dynamic rate-distortion optimization algorithm based on visual perception, which brings the visual attention and texture masking properties of HDR video content into the encoder. First, a visual saliency map is computed for the current input HDR video frame. Using this visual selective attention information, we assign non-uniform distortion weights to different regions of interest, which modifies the conventional distortion calculation so that measured distortion agrees more closely with the human visual system. We also take further characteristics of the human visual system into account: it remains sensitive to distortion in flat areas even when those areas do not readily attract the observer's attention, and it tolerates more distortion in areas with complex texture even when they lie in salient regions. To further remove perceptual redundancy, a bilateral filter separates the texture component of the input frame, and the texture characteristics extracted from it are used to adjust the Lagrange multiplier adaptively. The resulting perceptual rate-distortion cost function replaces the original rate-distortion cost formula in the encoder and dynamically adjusts the quantization parameters to achieve a reasonable trade-off between coded bits and distortion. Finally, the perception-based rate-distortion optimization is applied throughout the coding process, including mode decision, motion estimation, and rate-distortion optimized quantization.
The proposed algorithm keeps HDR video quality consistent with human visual perception while reducing the bitrate. Experimental results show that, compared with HEVC Main 10, it saves an average of 7.46% and 6.53% of the bitrate at the same HDR Visible Difference Predictor 2.2 (HDR-VDP-2.2) score and PSNR_DE, respectively, with maximum savings of 18.52% and 11.49%. The experimental results and partial enlargements show that the proposed algorithm preserves image detail and structure well and is particularly effective for scenes with high visual saliency and complex texture. Its bit allocation strategy is more reasonable, reducing overall bitrate consumption while maintaining the visual quality of the reconstructed HDR video.
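The cost function outlined in the overview can be illustrated with a minimal sketch. The overview does not give the exact formulas, so everything below is an assumption made for illustration: the linear saliency weighting, the power-law scaling of the Lagrange multiplier, and the names `saliency_weights`, `texture_lambda`, `perceptual_rd_cost`, `strength`, `gain`, `lo`, and `hi` are all hypothetical. The detail layer is taken as an input (for example, the original frame minus a bilateral-filtered base layer) rather than computed here.

```python
import numpy as np


def saliency_weights(saliency, strength=0.5):
    """Per-pixel distortion weights with mean 1 (hypothetical linear mapping).

    `saliency` is assumed to lie in [0, 1]; with 0 <= strength <= 1 the
    weights stay non-negative, and a flat saliency map gives weights of 1,
    so the weighted distortion reduces to the plain sum of squared errors.
    """
    s = np.asarray(saliency, dtype=np.float64)
    return 1.0 + strength * (s - s.mean())


def texture_lambda(base_lambda, block_detail, frame_detail,
                   gain=0.5, lo=0.5, hi=2.0):
    """Scale the Lagrange multiplier by relative texture energy (assumed form).

    Blocks with more texture than the frame average get a larger multiplier,
    so the encoder spends fewer bits where masking hides distortion.
    """
    eps = 1e-12  # avoids division by zero on perfectly flat frames
    block_energy = np.mean(np.abs(block_detail)) + eps
    frame_energy = np.mean(np.abs(frame_detail)) + eps
    ratio = float(np.clip(block_energy / frame_energy, lo, hi))
    return base_lambda * ratio ** gain


def perceptual_rd_cost(orig, recon, weights, bits, lam):
    """J = D_w + lambda * R, with D_w a saliency-weighted squared error."""
    err = (np.asarray(orig, dtype=np.float64)
           - np.asarray(recon, dtype=np.float64)) ** 2
    return float(np.sum(weights * err)) + lam * bits
```

In an encoder, a cost of this kind would be evaluated for each candidate coding mode of a block and the candidate with the lowest value kept; the sketch only shows how the two perceptual terms enter the cost, not how HM 16.9 integrates them.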
Dynamic distortion optimization model based on visual perception
BalloonFestival sequence (a) and its saliency map (b)
The rule of additivity for HEVC
The first frame of the Market3 sequence and its detail layer image. (a) Original frame; (b) Grayscale version of the original frame; (c) Base layer image; (d) Detail layer image
HDR sequences from the MPEG AHG test material. (a) FireEater2; (b) Market3; (c) Tibul2; (d) BalloonFestival
BD-rate results of the test sequences under different values of parameter a
Comparison of rate-distortion curves between HM 16.9 and the proposed algorithm. (a) BalloonFestival; (b) Tibul2; (c) Market3; (d) FireEater2
The 27th frame of the BalloonFestival sequence. (a) The original frame with a partial enlargement; (b) The frame reconstructed by HM 16.9 with a partial enlargement, Q = 53.7123, 5280 bits; (c) The frame reconstructed by the proposed algorithm with a partial enlargement, Q = 53.864, 4800 bits
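As a quick check on the bit counts quoted in the caption above, the relative saving of the proposed algorithm over HM 16.9 for this single frame can be worked out directly. Only the two bit counts come from the source; the ratio is plain arithmetic and applies to this frame alone, not to the average savings reported in the overview.

```latex
\[
\frac{5280 - 4800}{5280} = \frac{480}{5280} \approx 9.1\%
\]
```

The proposed algorithm reaches this saving at a slightly higher Q (53.864 versus 53.7123).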