Both covariation and FLOM are proved to be bounded
for jointly SαS random variables. The operator ðU Þ
〈β〉
is
mainly used to suppress the outliers in a SαS distribution.
Based on the well defined covariation and FLOM, the ði; jÞth
entries of two M M matrices for the array output xðtÞ are
formulated by ½x
i
ðtÞ; x
j
ðtÞ
α
in ROC-MUSIC and E fx
i
ðtÞj
x
j
ðtÞj
p 2
x
n
j
ðtÞg in FLOM-MUSIC respectively. However, we
can see from both Eqs. (3) and (4) that ðU Þ
〈β〉
only acts on
the random variable Y but not on X at all. So the
performance of either covariation or FLOM based MUSIC
algorithms will degenerate significantly when the impul-
siveness of both x
i
ðtÞ and x
j
ðtÞ is remarkable [12].
Belkacemi [12] defined the conjugate of the FLOM
operator X
〈β〉
¼ðX
n
Þ
〈β〉
¼ðX
〈β〉
Þ
n
and proposed the PFLOM
based operator for X and Y by
W
PFLOM
¼EfX
〈
b
2
〉
Y
〈
b
2
〉
g 0o bo α ð5Þ
Thereby, the corresponding matrix for the array output
xðtÞ based on PFLOM can be derived with the ði; jÞth entry
given by Efx
〈
b
2
〉
i
ðtÞx
〈
b
2
〉
j
ðtÞg. Since the impulsiveness in x
i
ðtÞ
and x
j
ðtÞ has been restrained simultaneously by the opera-
tor ðU Þ
〈
b
2
〉
and ðU Þ
〈
b
2
〉
, it is easy to understand why the
PFLOM based MUSIC algorithm performs superior to the
covariation and FLOM based ones especially in highly
impulsive noise environments. However, compared with
the performance in less impulsive noise situations (e.g.
α4 1:8), all the FLOS based MUSIC algorithms (including
the PFLOM based MUSIC) exhibit a distinct degradation in
the strong impulsive noise environments (e.g. α o 1:5).
In addition, for the scenario that noncircular signals (e.g.
BPSK, AM signals) are contaminated by SαS noise, these
FLOS based MUSIC algorithms perform poorer due to the
model mismatch than they do in the scenario of circular
signals [11].
3. Correntropy based correlation
Here, we propose a new operator based on correntropy,
namely the correntropy based correlation, and then apply
it with MUSIC to gain a more robust DOA estimation
results in highly impulsive noise envrionments.
3.1. Correntropy
Inspired by kernel-based methods [15] and information
theoretic learning (ITL) methods [16], based on the infor-
mation potential (IP), the correntropy for two arbitrary
random variables X and Y is defined as follows [14]:
V
s
ðX; YÞ¼E½κ
s
ðX YÞ ð6Þ
where κ
s
ðU Þ is the kernel function that satisfies Mercer's
theory [17] and E½U denotes the mathematical expecta-
tion. A brief interpretation for Mercer' theory and kernel
function can be found in Appendix A.
Using a Taylor series expansion for the widely used
Gaussian kernel, the correntropy can be rewritten as
V
s
ðX; YÞ¼
1
ffiffiffiffiffiffi
2π
p
s
∑
1
n ¼ 0
ð1Þ
n
2
n
s
2n
n!
E½ðX YÞ
2n
ð7Þ
which involves all the even-order moments of the random
variable ðX YÞ. It is notable that the term corresponding
to n ¼1in(7) is proportional to EðX
2
ÞþEðY
2
Þ2EðXY Þ,
which indicates that the conventional covariance function
(the autocorrelation for zero-mean processes) is also
included within correntropy. Liu [13] has compared cor-
rentropy with the constrained covariance [18] and con-
cluded that the former is a much simpler, while possibly
weaker measure of independence than the latter but a
much stronger measure than the traditional covariance.
Correntropy can also induce a metric which can be
described as
CIMðX; YÞ¼sqrtfE½κð0Þκ
s
ðX YÞg ð8Þ
Apparently, for Gaussian kernel, κð0Þ¼1=
ffiffiffiffiffiffi
2π
p
s. The metric
correntropy induced metric (CIM) possesses a “mix norm”
property, that is, it behaves like an L2 norm when two
points are close, and like an L1 norm when the two points
are getting apart and eventually like a L0 norm when the
two points get further apart. This property demonstrates
the inherent robustness to outliers of correntropy. Liu [13]
analyzed the influence to CIM of the kernel size and
concluded that a small kernel size leads to a tight linear
region and to a large L0 region, while a larger kernel size
will enlarge the linear region and shrink the L0 region,
that is, the kernel size controls the level of the outlier
suppression.
In practice, the joint probability density function is
often unknown, and only a finite number of data
fðx
i
; y
i
Þg
N
i ¼ 1
for X and Y are available, where the sample
estimator of correntropy can be obtained through
^
V
s
ðX; YÞ¼
1
N
∑
N
i ¼ 1
κ
s
ðx
i
y
i
Þð9Þ
The merit of correntropy is that it can convey informa-
tion not only about the correlation but also the statistical
distribution for stochastic processes as the correntropy
possesses a lot of properties quantifying the PDF of the
data directly. Meanwhile, correntropy is robust against
outliers due to the inner product in the feature space
computed via the Gaussian kernel. All these features would
inspire us to develop new array signal processing methods
using conventional correlation matrices, such as, signal and
noise subspace decompositions, projections, etc.
3.2. Correntropy based correlation
In this paper, we define a new operator based on
correntropy which can be applied in SαS processes for a
wide range of 1o α r 2, and that would make it as an
effective substitute for the conventional correlation
functions.
Theorem 1. Let X and Y be i.i.d SαS random variables with
the characteristic exponent 1o α r 2, the correntropy-based
correlation (CRCO) by applying Gaussian kernel of X and Y
given by
R
CRCO
¼E exp
jX Yj
2
2s
2
XY
ð10Þ
is bounded. Where s is the kernel size. See Appendix B for the
proof of the boundedness of CRCO.
J. Zhang et al. / Signal Processing 104 (2014) 346–357348