86 S. ZHOU
inner minimization problem on
ξ
i
is substituted by the conjugate function [29]of
loss function L (·), denoted as L
∗
: R → R
+
∪{+∞}, which is defined as
L
∗
(v) := max
u
{uv− L(u)} = − min
u
{L(u) − uv}.
Then Lagrangian dual (4) is simplified as the following optimization problem
min
γ
1
2
λ
γ
YKY
γ
− e
γ
+
m
∑
i=1
L
∗
(
γ
i
), (5)
where K is a symmetrical kernel matrix with its component K
i, j
= k(x
i
,x
j
),and
Y is a diagonal matrix with y =(y
1
,y
2
,···, y
m
)
as its diagonal elements. The
duality relationship maintains the result (1) induced by the representer theorem.
Table 1 lists some popular loss and their conjugate functions.
Table 1. Popular loss functions and their conjugate functions.
Loss Function Conjugate function
Hinge: L(u)=max{0,u} L
∗
(v)=
0, 0 ≤ v ≤ 1
+∞, others
Huber: L
δ
(u)=
max{0,u}, |u| >
δ
(u+
δ
)
2
4
δ
, |u|≤
δ
L
∗
(v)=
δ
v(v−1), 0 ≤ v ≤ 1
+∞, others
Logistic: L
p
(u)=
1
p
log(1+ exp(pu)) L
∗
(v)=
1
p
log(1− v)
(1−v)
v
v
, 0 ≤ v ≤ 1
+∞, others
Squared hinge: L(u)=
1
2
max{0,u}
2
L
∗
(v)=
1
2
v
2
, v ≥ 0
+∞, v < 0
p-normed: L(u)=
1
p
|u|
p
(1 < p < ∞) L
∗
(v)=
1
q
|v|
q
,
1
p
+
1
q
= 1
Least squares: L(u)=
1
2
u
2
L
∗
(v)=
1
2
v
2
Absolute: L(u)=|u| L
∗
(v)=
0, |v|≤1
+∞, |v| > 1
As for the representer theorem, plugging (1)in(2) and eliminating the equalities
constraints, we have
min
α
∈R
m
λ
2
α
K
α
+
m
∑
i=1
L
1− y
i
m
∑
j=1
α
j
K
i, j
, (6)
(6) is called primal SVM in [5]and[22], and some algorithms are given according
to the different loss functions.
By the duality technique above, we can prove that the dual of problem (6)has
thesameformas(5). So (6) and is equivalent to (5) naturally but they have
different computational stability since the small parameter
λ
appears different
place.
Stat., Optim. Inf. Comput. Vo l . 1, December 2013.