Entropy 2016, 18, 105 5 of 21
distribution $I_S$ can be minimized. Here, we further analyse the effect of DSM on the KL-divergence between $\hat{l}(R, I_S)$ and $I_S$.
Specifically, we propose Proposition 2, which states that as $\hat{\lambda}$ decreases, the KL-divergence between $\hat{l}(R, I_S)$ and $I_S$ increases monotonically.
Proposition 2. If $\hat{\lambda}$ ($\hat{\lambda} > 0$) decreases, the KL-divergence between $\hat{l}(R, I_S)$ and $I_S$ will increase.
Proof. Using the simplified notations in Table 2, let the KL-divergence between $\hat{l}(R, I_S)$ and $I_S$ be formulated as:
$$D(\hat{l}(R, I_S), I_S) = \sum_{i=1}^{m} \hat{l}(R, I_S)(i) \log\left(\frac{\hat{l}(R, I_S)(i)}{I_S(i)}\right) = \sum_{i=1}^{m} \hat{l}(i) \log\left(\frac{\hat{l}(i)}{I_S(i)}\right) \tag{5}$$
Now, let $\xi = 1/\hat{\lambda}$ as we did in the proof of Proposition 1 (see [6]). According to Equation (2), we have $\hat{l}(R, I_S) = \xi \times M + (1 - \xi) \times I_S$. It then turns out that:
$$\hat{l}(i) = \xi \times (M(i) - I_S(i)) + I_S(i). \tag{6}$$
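As a small numerical illustration of Equations (5) and (6), the sketch below builds $\hat{l}$ from $\xi$, $M$ and $I_S$ and evaluates its KL-divergence to $I_S$. The function names (`mixture_estimate`, `kl_divergence`), the three-component distributions and the value $\xi = 1.5$ are hypothetical, chosen only so that every component of $\hat{l}$ stays positive; they are not taken from the paper's experiments.

```python
import math

def mixture_estimate(xi, M, I_S):
    """Equation (6): l_hat(i) = xi * (M(i) - I_S(i)) + I_S(i)."""
    return [xi * (m - s) + s for m, s in zip(M, I_S)]

def kl_divergence(p, q):
    """Equation (5): D(p, q) = sum_i p(i) * log(p(i) / q(i))."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical distributions over m = 3 terms (illustrative values only).
M = [0.4, 0.35, 0.25]
I_S = [0.2, 0.3, 0.5]

l_hat = mixture_estimate(1.5, M, I_S)
print("l_hat:", l_hat)
print("sum of l_hat:", sum(l_hat))  # stays (numerically) at 1
print("D(l_hat, I_S):", kl_divergence(l_hat, I_S))
```

Because $M$ and $I_S$ both sum to one, the differences $M(i) - I_S(i)$ sum to zero, so $\hat{l}$ remains normalized for any $\xi$.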
Based on Equations (5) and (6), we get:
$$D(\hat{l}(R, I_S), I_S) = \sum_{i=1}^{m} \left(\xi \times (M(i) - I_S(i)) + I_S(i)\right) \log\left(\frac{\xi \times (M(i) - I_S(i)) + I_S(i)}{I_S(i)}\right) \tag{7}$$
Let $D(\xi) = D(\hat{l}(R, I_S), I_S)$. The derivative of $D(\xi)$ can be calculated as:
$$D'(\xi) = \sum_{i=1}^{m} \left[ M(i) - I_S(i) + (M(i) - I_S(i)) \log\left(\frac{\xi \times (M(i) - I_S(i)) + I_S(i)}{I_S(i)}\right) \right] \tag{8}$$
Since $\sum_{i=1}^{m} M(i) = 1$ and $\sum_{i=1}^{m} I_S(i) = 1$, $\sum_{i=1}^{m} [M(i) - I_S(i)]$ becomes zero. We then have:
$$D'(\xi) = \sum_{i=1}^{m} (M(i) - I_S(i)) \log\left(\frac{\xi \times (M(i) - I_S(i)) + I_S(i)}{I_S(i)}\right) = \sum_{i=1}^{m} (M(i) - I_S(i)) \log\left(\frac{\xi \times (M(i) - I_S(i))}{I_S(i)} + 1\right) \tag{9}$$
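The closed form in Equation (9) can be checked numerically against a central finite difference of Equation (7). The following is a minimal sketch under stated assumptions: the distributions, the evaluation point $\xi = 1.2$ and the step size `h` are hypothetical, and the helper names `D` and `D_prime` are introduced here for illustration only.

```python
import math

def D(xi, M, I_S):
    """Equation (7): KL-divergence between l_hat and I_S as a function of xi."""
    total = 0.0
    for m, s in zip(M, I_S):
        l_hat = xi * (m - s) + s  # Equation (6)
        total += l_hat * math.log(l_hat / s)
    return total

def D_prime(xi, M, I_S):
    """Equation (9): closed-form derivative of D with respect to xi."""
    return sum((m - s) * math.log(xi * (m - s) / s + 1.0)
               for m, s in zip(M, I_S))

# Hypothetical distributions; xi is chosen so that every l_hat(i) is positive.
M = [0.4, 0.35, 0.25]
I_S = [0.2, 0.3, 0.5]
xi, h = 1.2, 1e-6

numeric = (D(xi + h, M, I_S) - D(xi - h, M, I_S)) / (2 * h)
analytic = D_prime(xi, M, I_S)
print("finite difference:", numeric)
print("Equation (9):     ", analytic)
```

For these values the two estimates should agree closely and both be positive, consistent with the sign argument that follows.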
Let the i-th term in the summation of Equation (9) be:
$$D'(\xi)(i) = (M(i) - I_S(i)) \log\left(\frac{\xi \times (M(i) - I_S(i))}{I_S(i)} + 1\right)$$
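Because $\xi > 0$, the argument of the logarithm is greater than 1 when $M(i) > I_S(i)$ and less than 1 when $M(i) < I_S(i)$, so the logarithm always has the same sign as $M(i) - I_S(i)$. It follows that when $M(i) > I_S(i)$ or $M(i) < I_S(i)$, $D'(\xi)(i)$ is greater than zero. When $M(i) = I_S(i)$, $D'(\xi)(i)$ is zero. However, $M(i)$ does not equal $I_S(i)$ for every $i$. Therefore, $D'(\xi) = \sum_{i=1}^{m} D'(\xi)(i)$ is greater than zero.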
In conclusion, we have $D'(\xi) > 0$. This means that $D(\xi)$ (i.e., $D(\hat{l}(R, I_S), I_S)$) increases as $\xi$ increases. Since $\hat{\lambda} = 1/\xi$, when $\hat{\lambda}$ decreases, $D(\hat{l}(R, I_S), I_S)$ will increase.
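The statement of Proposition 2 can also be illustrated by sweeping $\hat{\lambda}$ downwards and recording the resulting divergence. This is only a sketch: the distributions and the grid of $\hat{\lambda}$ values are hypothetical, chosen so that every component of $\hat{l}$ stays positive, and `kl_for_lambda` is a helper name introduced here.

```python
import math

def kl_for_lambda(lam_hat, M, I_S):
    """D(l_hat, I_S) from Equation (7), with xi = 1 / lam_hat."""
    xi = 1.0 / lam_hat
    total = 0.0
    for m, s in zip(M, I_S):
        l_hat = xi * (m - s) + s  # Equation (6)
        total += l_hat * math.log(l_hat / s)
    return total

# Hypothetical distributions and a decreasing grid of lam_hat values.
M = [0.4, 0.35, 0.25]
I_S = [0.2, 0.3, 0.5]
lam_values = [1.0, 0.9, 0.8, 0.7, 0.6]

divergences = [kl_for_lambda(lam, M, I_S) for lam in lam_values]
for lam, d in zip(lam_values, divergences):
    print(f"lam_hat = {lam:.1f}  D = {d:.6f}")
```

On this grid the printed divergences should grow as $\hat{\lambda}$ shrinks, which is the behaviour the proposition predicts.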
Table 2. Simplified notations.
Original                  Simplified      Linear Coefficient
$l(R, I_S)(i)$            $l(i)$          $\lambda$
$\hat{l}(R, I_S)(i)$      $\hat{l}(i)$    $\hat{\lambda}$ (estimate of $\lambda$)
$l_L(R, I_S)(i)$          $l_L(i)$        $\lambda_L$ (lower bound of $\hat{\lambda}$)