Video pre-processing with JND-based Gaussian filtering of
superpixels
Lei Ding, Ge Li*, Ronggang Wang, Wenmin Wang
School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University
ABSTRACT
In this paper, a pre-processing method for HEVC video coding is proposed. The method applies simple linear iterative
clustering (SLIC), which adapts k-means clustering to group pixels into perceptually meaningful atomic regions called
superpixels. For each superpixel, the weighted average of the luminance differences around each pixel is computed and
then averaged over the superpixel; this value determines a suitable Gaussian filter parameter for that superpixel.
Experimental results show that the bit rate can be reduced by up to 29% without loss of visual quality.
Keywords: HEVC, JND, superpixel, video pre-processing, Gaussian filtering
1. INTRODUCTION
The recently developed High Efficiency Video Coding (HEVC) standard [1] is becoming increasingly popular. Relative to
the existing standard, it improves compression performance by a bit rate reduction in the range of 50%. Humans
perceive images through the human visual system (HVS), whose characteristics make a higher video compression ratio
possible. Extensive research has been conducted to improve the performance of standard-compliant encoders.
Video pre-processing [2] can improve the subjective quality of a reconstructed video or reduce the bit rate of the
compressed bit stream. Typical video pre-processing techniques include spatial filtering, temporal filtering and image
sharpening [3]. By applying a visual perception threshold (PTHD) based on just noticeable distortion (JND), one can
achieve a compression gain of up to 10-15% by exploiting the perceptual redundancy of video data [4-6].
The Gaussian filter is widely used for de-noising in image processing. With Gaussian filtering, the new value of pixel
(x, y) is a weighted average of the pixels in its neighbourhood. Gaussian filtering smooths the image, which means that
the deviation of the pixels within a coding block becomes smaller; this in turn reduces the bit rate produced by motion
estimation, transformation, scaling and quantization. Smooth areas can withstand strong filtering without the change
being noticed, whereas edge or textured areas would be visibly blurred, so these areas should be filtered only slightly
or not at all.
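The weighted-average view of Gaussian filtering described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: the function names, the kernel radius of 3σ and the clamped border handling are assumptions made for illustration only.

```python
import math

def gaussian_kernel(sigma, radius=None):
    """Return a normalized (2r+1) x (2r+1) Gaussian kernel as nested lists."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))  # assumed cut-off at 3 sigma
    weights = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
                for dx in range(-radius, radius + 1)]
               for dy in range(-radius, radius + 1)]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

def gaussian_filter(image, sigma):
    """Filter a 2-D list of luminance values; borders are handled by clamping."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky in range(-radius, radius + 1):
                yy = min(max(y + ky, 0), height - 1)
                for kx in range(-radius, radius + 1):
                    xx = min(max(x + kx, 0), width - 1)
                    # New pixel value = weighted average of its neighbours.
                    acc += kernel[ky + radius][kx + radius] * image[yy][xx]
            out[y][x] = acc
    return out

def variance(image):
    """Population variance of all pixel values in a 2-D list."""
    values = [v for row in image for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# A strongly textured 8x8 block (checkerboard) before and after filtering.
block = [[100 if (x + y) % 2 == 0 else 140 for x in range(8)] for y in range(8)]
smoothed = gaussian_filter(block, sigma=1.0)
```

Comparing `variance(block)` with `variance(smoothed)` shows the reduced pixel deviation inside the block, which is the property the text links to a lower bit rate. The sketch applies a single σ to the whole block; choosing σ per region is the subject of the proposed method.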
A superpixel is a region of pixels with similar texture, contour, colour, etc. Superpixel algorithms group pixels into perceptually
meaningful atomic regions. They capture image redundancy and provide a convenient primitive from which to compute
image features. Algorithms for generating superpixels can be broadly categorized as either graph-based or gradient
ascent methods. Graph-based approaches to superpixel generation treat each pixel as a node in a graph. Edge weights
between two nodes are proportional to the similarity between neighbouring pixels. The superpixels are created by
minimizing a cost function defined over the graph. Gradient-ascent-based methods, in contrast, start from a rough initial
clustering of pixels and iteratively refine the clusters until some convergence criterion is met. SLIC [7], a
state-of-the-art method for generating superpixels based on k-means clustering, has been shown to outperform existing
superpixel methods.
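The clustering idea behind SLIC can be sketched as a localized k-means: cluster centres start on a regular grid, each centre competes only for pixels in a window around it, and the distance measure combines intensity difference with spatial distance. The sketch below is a simplification under stated assumptions, not the method of [7] as published: it works on luminance only rather than a full colour space, it omits the connectivity post-processing step, and the names `slic_gray`, `step` and `compactness` are hypothetical.

```python
import math

def slic_gray(image, step, compactness=10.0, iterations=5):
    """Cluster a 2-D list of luminance values into superpixels.

    step        -- grid interval S between initial cluster centres
    compactness -- weight m balancing intensity against spatial distance
    Returns a 2-D list of integer labels with the same shape as `image`.
    """
    height, width = len(image), len(image[0])
    # Initialise centres (luminance, x, y) on a regular grid.
    centres = [[float(image[y][x]), float(x), float(y)]
               for y in range(step // 2, height, step)
               for x in range(step // 2, width, step)]
    labels = [[-1] * width for _ in range(height)]
    for _ in range(iterations):
        dist = [[float("inf")] * width for _ in range(height)]
        # Assignment: each centre only searches a window of +/- S around itself.
        for k, (cl, cx, cy) in enumerate(centres):
            y0, y1 = max(0, int(cy) - step), min(height, int(cy) + step + 1)
            x0, x1 = max(0, int(cx) - step), min(width, int(cx) + step + 1)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    d_lum = image[y][x] - cl
                    d_space = math.hypot(x - cx, y - cy)
                    d = math.sqrt(d_lum ** 2
                                  + (d_space / step) ** 2 * compactness ** 2)
                    if d < dist[y][x]:
                        dist[y][x] = d
                        labels[y][x] = k
        # Update: move each centre to the mean of its assigned pixels.
        sums = [[0.0, 0.0, 0.0, 0] for _ in centres]
        for y in range(height):
            for x in range(width):
                k = labels[y][x]
                if k >= 0:
                    sums[k][0] += image[y][x]
                    sums[k][1] += x
                    sums[k][2] += y
                    sums[k][3] += 1
        for k, (sl, sx, sy, n) in enumerate(sums):
            if n:
                centres[k] = [sl / n, sx / n, sy / n]
    return labels

# A 16x16 frame with a vertical edge at x = 6, which is not aligned to the grid.
frame = [[50 if x < 6 else 200 for x in range(16)] for y in range(16)]
labels = slic_gray(frame, step=8)
```

In this toy frame the initial grid cells straddle the edge, but after clustering each label covers pixels from one side of the edge only, which is the boundary-adherence property that makes superpixels a convenient unit for region-wise filtering.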
The human eye cannot perceive a change at a pixel that falls below the JND threshold, owing to the spatial and temporal
masking properties of the HVS [8]. The eye's sensitivity to distortion can vary significantly across different areas of
a frame, and the JND model is built on this variation. The major factors that contribute to the JND model include the
spatial contrast sensitivity function and luminance adaptation [9-10]. These factors reflect the texture, edges and
boundaries of the frame, and the frame can be filtered according to them.
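To make the notion of a visibility threshold concrete, the sketch below uses a piecewise luminance-adaptation curve of a kind often seen in pixel-domain JND models: the threshold is high in dark backgrounds, lowest at mid-grey and rises slowly in bright backgrounds. The constants and the function names are assumptions for illustration; they are not taken from this paper and have not been checked against [9-10].

```python
import math

def luminance_jnd(background):
    """Assumed visibility threshold (grey levels) for a mean background
    luminance in the range 0..255."""
    if background <= 127:
        # Dark backgrounds mask more distortion.
        return 17.0 * (1.0 - math.sqrt(background / 127.0)) + 3.0
    # Bright backgrounds: threshold grows slowly and linearly.
    return 3.0 / 128.0 * (background - 127.0) + 3.0

def is_change_visible(original, modified, background):
    """True if the pixel change exceeds the assumed JND threshold."""
    return abs(modified - original) > luminance_jnd(background)
```

Under these assumed constants, a change of 5 grey levels is below the threshold on a very dark background (`is_change_visible(10, 15, 10)` is false) but above it at mid-grey (`is_change_visible(127, 132, 127)` is true), which illustrates why the same filtering strength is not equally noticeable everywhere in a frame.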
Building on the related work above, we present a new approach that combines SLIC and JND for video pre-processing
in order to reduce the bit rate without loss of visual quality. Subjective and objective evaluations are carried out to
verify the effectiveness of the proposed approach.
Robert L. Stevenson, Proc. of SPIE-IS&T Electronic Imaging, SPIE Vol. 9410, 941004