1942 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 22, NO. 5, MAY 2013
Fig. 1. Interchanging phase and amplitude of stereoscopic images. (a) Original left image. (b) Original right image. (c) Image constructed using the phase
of (a) and the amplitude of (b). (d) Image constructed using the amplitude of (a) and the phase of (b).
presents the proposed perceptual quality assessment metric.
The experimental results are given and discussed in Section IV,
and finally conclusions are drawn in Section V.
II. BINOCULAR VISUAL CHARACTERISTICS ANALYSES
It is well known that binocular vision is a complex visual
process that requires the brain and both eyes to work together
to produce depth perception and clear vision [42]. As a
one-dimensional example, consider a simple binocular cell with
left and right receptive fields. Its binocular energy response
to a stereoscopic image pair $I_l(x)$ and $I_r(x)$ at position
$x$ can be described as [43]:
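$$
r_q = \left| \int_{-\infty}^{\infty} \left[ f_l(x) I_l(x) + f_r(x) I_r(x) \right] dx \right|^2
    = \left| \int_{-\infty}^{\infty} g(x) e^{j\omega x} \left[ I_l(x) + e^{j\phi_{-}} I_r(x) \right] dx \right|^2 \tag{1}
$$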
where $f_l(x) = g(x) e^{j(\omega x + \phi_l)}$ and
$f_r(x) = g(x) e^{j(\omega x + \phi_r)}$ are Gabor functions for
the left and right images, respectively; $g(x)$ is a Gaussian
kernel function; and $\phi_{-} = \phi_r - \phi_l$ is the phase
difference between the left and right images. The stimulus
disparity can be estimated by $D = \hat{\phi}_{-}/\omega$, where
$\hat{\phi}_{-}$ is the phase difference that maximizes the
binocular energy response $r_q$, and $\omega$ is the radial
frequency of the cell.
From another perspective, if the position shift $d$ between the
left and right receptive field centers is known, the binocular
energy response in Eq. (1) can be written as
$$
r_q = \left| \int_{-\infty}^{\infty} e^{j\omega x} \left[ g(x) I_l(x) + e^{j\omega d} g(x + d) I_r(x) \right] dx \right|^2 \tag{2}
$$
and the stimulus disparity is given by $D = \hat{d}$, where
$\hat{d}$ is the position shift that maximizes the binocular
energy response $r_q$.
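The position-shift estimate can be sketched numerically in the same way. The code below is an illustrative sketch under stated assumptions rather than the authors' method: the integral in Eq. (2) is discretized with the right-hand receptive field taken as $e^{j\omega(x+d)} g(x+d)$, the stimulus is a hypothetical Gabor-like patch centered on the left receptive field, the right view is defined as $I_r(x) = I_l(x + D)$, and $\hat{d}$ is found by searching a finite list of candidate shifts.

```python
import numpy as np

def gaussian(x, sigma):
    """Gaussian kernel g(x)."""
    return np.exp(-x**2 / (2.0 * sigma**2))

def binocular_energy_shift(I_l, I_r, x, omega, sigma, d):
    """Discretized Eq. (2) for one candidate position shift d."""
    dx = x[1] - x[0]
    left = gaussian(x, sigma) * I_l
    right = np.exp(1j * omega * d) * gaussian(x + d, sigma) * I_r
    response = np.sum(np.exp(1j * omega * x) * (left + right)) * dx
    return np.abs(response) ** 2

def estimate_disparity_shift(I_l, I_r, x, omega, sigma, candidates):
    """Return the candidate shift that maximizes the energy (grid search, an assumption)."""
    energies = np.array(
        [binocular_energy_shift(I_l, I_r, x, omega, sigma, d) for d in candidates]
    )
    return candidates[np.argmax(energies)]

def stimulus(t, omega, width=2.0):
    """Hypothetical test stimulus: a Gabor-like patch centered at t = 0."""
    return np.exp(-t**2 / (2.0 * width**2)) * np.cos(omega * t)

x = np.linspace(-20.0, 20.0, 4001)
omega, sigma, true_D = 2.0, 3.0, 0.8
I_l = stimulus(x, omega)
I_r = stimulus(x + true_D, omega)      # assumed convention: I_r(x) = I_l(x + D)

candidates = np.linspace(-3.0, 3.0, 601)
d_hat = estimate_disparity_shift(I_l, I_r, x, omega, sigma, candidates)
print(d_hat)  # expected to lie close to true_D, limited by the candidate grid step
```

The stimulus is deliberately centered on the left receptive field so that the energy has a single clear maximum; for arbitrary image content the maximum need not be unique.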
From the above equations, since the stimulus disparity can be
estimated by either $D = \hat{\phi}_{-}/\omega$ or $D = \hat{d}$,
both the phase difference and the position shift describe the
same disparity information in binocular vision (i.e.,
$\hat{\phi}_{-}/\omega = \hat{d}$). In other words, the phase
difference between the left and right images provides the main
cue for binocular disparity identification and depth perception,
so distortion in phase may impair the precise identification of
binocular disparity and, in turn, the perceived depth.
We present two examples to illustrate the above phenomena.
In the first example, the original left and right images of the
‘Lovebird1’ test sequence are shown in Fig. 1(a) and (b).
By interchanging the phase and amplitude, we construct the
image in Fig. 1(c) using the phase of Fig. 1(a) and the
amplitude of Fig. 1(b), and the image in Fig. 1(d) using the
amplitude of Fig. 1(a) and the phase of Fig. 1(b). The same
object appears at different positions in Fig. 1(c) and
Fig. 1(d) because the two images have different phase
information, whereas it appears at the same position in
Fig. 1(a) and Fig. 1(c), and in Fig. 1(b) and Fig. 1(d),
because each of these pairs shares the same phase information.
This illustrates that the phase conveys the disparity
information in binocular vision.
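The interchange described above can be sketched as follows. The text does not state which transform defines phase and amplitude, so this sketch assumes the global 2-D discrete Fourier transform; the function name and the synthetic image pair are hypothetical. The right view is simulated as a circular horizontal shift of the left view, in which case both views have identical amplitude spectra and each hybrid image reproduces the view that supplied its phase.

```python
import numpy as np

def swap_phase_amplitude(phase_src, amp_src):
    """Combine the Fourier phase of phase_src with the Fourier amplitude of amp_src.

    Assumes the global 2-D DFT; both inputs must have the same shape.
    """
    F_phase = np.fft.fft2(phase_src)
    F_amp = np.fft.fft2(amp_src)
    hybrid = np.abs(F_amp) * np.exp(1j * np.angle(F_phase))
    # The hybrid spectrum of two real images is Hermitian-symmetric,
    # so the imaginary part of the inverse transform is only round-off.
    return np.real(np.fft.ifft2(hybrid))

# Synthetic stereo pair (assumed): right view = left view shifted by 5 pixels.
rng = np.random.default_rng(0)
left = rng.random((32, 48))
right = np.roll(left, 5, axis=1)

img_c = swap_phase_amplitude(left, right)   # phase of left, amplitude of right
img_d = swap_phase_amplitude(right, left)   # phase of right, amplitude of left

print(np.allclose(img_c, left), np.allclose(img_d, right))
```

For real stereo pairs the two amplitude spectra differ, so the hybrids are only approximations of the phase donors, but object positions still follow the phase.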
In the second example, which does not consider the position
shift between images, we show that phase and amplitude
contribute differently to image quality. The first row of
Fig. 2 shows the (a) original, (b) Gaussian blurred, and
(c) JPEG compressed left images. The second row of Fig. 2,
i.e., Figs. 2(d)-(f), shows the images constructed using the
respective phases of Figs. 2(a)-(c) but constant amplitude
(luminance-inverted for better display), while the third row,
i.e., Figs. 2(g)-(i), shows the images constructed using the
respective amplitudes of Figs. 2(a)-(c) but constant phase.
Most important features, such as edges and contours, are
preserved in Figs. 2(d)-(f), while the structure is visibly
degraded by blurring and JPEG compression. In contrast, the
constructed images in Figs. 2(g)-(i) convey less useful
information, although they still reflect the blurring and JPEG
compression distortions. Existing studies have shown that phase
information is very important in feature description [44]–[46].
Therefore, phase similarity (difference) between the original
and distorted images is expected to give a reasonable
estimation of quality degradation (i.e., phase has a larger
impact on the quality score than amplitude, as demonstrated in
Section IV.B).
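The two kinds of constructed images can be sketched as below. This is a hedged illustration, assuming the global 2-D discrete Fourier transform, a constant amplitude of one, and a constant phase of zero; the paper does not specify these constants, and the function names are hypothetical. The check at the end uses a circular shift to show that the phase-only image follows the position of the content while the amplitude-only image does not.

```python
import numpy as np

def phase_only(img):
    """Keep the Fourier phase of img and set every amplitude to 1 (assumed constant)."""
    F = np.fft.fft2(img)
    return np.real(np.fft.ifft2(np.exp(1j * np.angle(F))))

def amplitude_only(img):
    """Keep the Fourier amplitude of img and set every phase to 0 (assumed constant)."""
    F = np.fft.fft2(img)
    return np.real(np.fft.ifft2(np.abs(F)))

# Hypothetical test image and a horizontally shifted copy.
rng = np.random.default_rng(1)
img = rng.random((32, 48))
shifted = np.roll(img, 5, axis=1)

# The phase-only reconstruction moves with the content ...
phase_follows_shift = np.allclose(
    phase_only(shifted), np.roll(phase_only(img), 5, axis=1)
)
# ... while the amplitude-only reconstruction is unchanged by the shift.
amplitude_ignores_shift = np.allclose(amplitude_only(shifted), amplitude_only(img))

print(phase_follows_shift, amplitude_ignores_shift)
```

For display, the phase-only result usually needs rescaling (and, as in Fig. 2, possibly luminance inversion), since its values are not in the original intensity range.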
It is well known that the visual masking effect (e.g., formulated
as the just-noticeable difference (JND)) plays an important
role in pro-HVS signal processing [47]. For example, the HVS
can tolerate more error in higher frequency components while
the distortion in lower frequency components has a larger
impact on the visual quality. Recently, Zhao et al. proposed a
BJND model to measure the minimum distortion in the two
views of stereoscopic images with psychophysical experiments
[48]. In the following, we summarize the derivation of the
BJND model. By incorporating the luminance and contrast