
Improved Separable Dictionary Learning
Fengzhen Zhang¹,², Yigang Cen¹,²,*, Ruizhen Zhao¹,², Hengyou Wang¹,²,³
¹ Institute of Information Science, Beijing Jiaotong University, Beijing, China
² Key Laboratory of Advanced Information Science and Network Technology of Beijing, Beijing, China
³ School of Science, Beijing University of Civil Engineering and Architecture, Beijing, China
13112062@bjtu.edu.cn, ygcen@bjtu.edu.cn, rzzhao@bjtu.edu.cn, 12112064@bjtu.edu.cn
Abstract—Due to the extensive applications of sparse representation, dictionary learning algorithms have received widespread attention. In this paper, we propose a new separable dictionary learning algorithm: improved separable dictionary learning (ISeDiL). Different from traditional separable learning methods, we divide the learning procedure into two steps: sparse coding and dictionary optimization. In the first step, we use the separable fast iterative shrinkage-thresholding algorithm (SFISTA) to obtain the sparse coefficient matrices. In the second step, we project the dictionaries onto an oblique manifold and use the conjugate gradient method to optimize them. By projecting the dictionaries onto the oblique manifold, each dictionary can be optimized as a whole. Finally, we present the experimental procedure and the de-noising results of the proposed algorithm.
Keywords—sparse representation, separable dictionary learning,
oblique manifold, fast iterative shrinkage-thresholding algorithm
I. INTRODUCTION
Recently, sparse representation has received great attention due to its extensive applications, e.g., image de-noising [1], face recognition [2, 3], and image super-resolution reconstruction [4, 5]. Sparse representation suggests that a signal $y \in \mathbb{R}^m$ can be represented by a few atoms of a given over-complete dictionary $D \in \mathbb{R}^{m \times n}$ ($n > m$), i.e., $y = Dx$. Here the vector $x \in \mathbb{R}^n$ is a sparse vector, which means that most of its entries are zero or have small magnitudes [5-8].
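As a concrete illustration (a minimal numpy sketch with a hypothetical random dictionary, not taken from the paper), a signal built from a few atoms of an over-complete dictionary is fully described by its sparse coefficient vector:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 16, 32                      # signal dimension m, dictionary size n (n > m)
D = rng.standard_normal((m, n))    # over-complete dictionary, columns are atoms
D /= np.linalg.norm(D, axis=0)     # normalize each atom to unit Euclidean norm

x = np.zeros(n)                    # sparse coefficient vector
x[[3, 17, 29]] = [1.5, -0.7, 2.0]  # only 3 of the 32 entries are nonzero

y = D @ x                          # the signal y = D x
print(np.count_nonzero(x))         # ||x||_0 = 3, far fewer than n = 32
```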
The problem of sparse representation can be described as:

$P_0: \min_x \|x\|_0 \quad \text{s.t.} \quad y = Dx,$  (1)

or

$P_{0,\epsilon}: \min_x \|x\|_0 \quad \text{s.t.} \quad \|y - Dx\|_2 \le \epsilon,$  (2)
where $\|\cdot\|_0$ is the $\ell_0$ pseudo-norm, which counts the number of nonzero entries. In the problems mentioned above, the dictionary $D \in \mathbb{R}^{m \times n}$ is a pre-specified transform matrix, such as wavelets, curvelets, or contourlets. These pre-specified dictionaries are appealing because they are simple. However, they cannot express some special signals well, such as natural facial images. Therefore, dictionary learning algorithms have received significant attention.
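Problem (2) is NP-hard in general, but greedy pursuit yields an approximate solution. The following is a minimal sketch of orthogonal matching pursuit (an illustrative choice of solver on our part; the paper itself relies on SFISTA later), which selects one atom at a time until the residual drops below $\epsilon$:

```python
import numpy as np

def omp(D, y, eps):
    """Greedy approximation of P_0,eps: min ||x||_0 s.t. ||y - D x||_2 <= eps."""
    m, n = D.shape
    support = []                       # indices of selected atoms
    residual = y.copy()
    x = np.zeros(n)
    while np.linalg.norm(residual) > eps and len(support) < m:
        # pick the atom most correlated with the current residual
        k = int(np.argmax(np.abs(D.T @ residual)))
        support.append(k)
        # least-squares fit of y on the selected atoms
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        x = np.zeros(n)
        x[support] = coef
        residual = y - D @ x
    return x

rng = np.random.default_rng(1)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
x_true = np.zeros(50)
x_true[[4, 31]] = [1.0, -2.0]          # a 2-sparse ground truth
y = D @ x_true
x_hat = omp(D, y, eps=1e-6)            # typically recovers the two true atoms
```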
The problem of dictionary learning can be expressed as:

$\arg\min_{x, D} \; g(x) \quad \text{s.t.} \quad \|Y - Dx\|_2^2 \le \epsilon, \; D \in \mathcal{D}.$  (3)
Here, $g: \mathbb{R}^{n \times N} \to \mathbb{R}$ is a sparse penalty function, $Y \in \mathbb{R}^{m \times N}$ is the matrix of training samples, and $N$ is the number of training samples. $\mathcal{D}$ is some pre-defined set of matrices, e.g., matrices $D \in \mathbb{R}^{m \times n}$ ($n > m$) in which each column has unit Euclidean norm, $D$ has full rank, and $D$ does not have linearly dependent columns. Two popular
algorithms for dictionary learning are the method of optimal directions (MOD) [9, 10] and K-singular value decomposition (K-SVD) [11]. Both alternate between sparse coding and dictionary optimization; they differ in the method used in the dictionary optimization step. MOD uses the least-squares solution $D = Yx^T(xx^T)^{-1}$, while K-SVD employs the SVD and updates the atoms of the dictionary one by one [11]. An analysis sparse model was proposed in [12], where the dictionary was projected onto an oblique manifold and updated as a whole. However, the methods mentioned above operate on 1D vectors. To deal with 2D signals, e.g., images, we must divide them into small patches and then reshape these patches into vectors. As the patch size grows, these methods are restricted by limitations in memory and computational ability. Thus, methods of separable dictionary learning have been proposed. We will introduce these separable dictionary learning methods in Section II in detail.
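As a minimal numpy sketch (illustrative, not the paper's implementation), one MOD-style iteration consists of the least-squares dictionary update $D = Yx^T(xx^T)^{-1}$ followed by re-normalizing each column to unit norm, which is precisely the projection onto the oblique manifold:

```python
import numpy as np

def mod_update(Y, X):
    """Least-squares dictionary update: D = Y X^T (X X^T)^{-1}."""
    D = Y @ X.T @ np.linalg.inv(X @ X.T)
    # project onto the oblique manifold: every column has unit Euclidean norm
    return D / np.linalg.norm(D, axis=0, keepdims=True)

rng = np.random.default_rng(2)
Y = rng.standard_normal((16, 200))   # 200 training vectors of dimension 16
X = rng.standard_normal((32, 200))   # current sparse codes (dense here for brevity)
D = mod_update(Y, X)
print(np.linalg.norm(D, axis=0))     # all ones: columns lie on the oblique manifold
```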
The remainder of this paper is organized as follows. The theory and related work of separable dictionary learning are introduced in Section II. The proposed method is described in detail in Section III. Experimental results are shown in Section IV. Finally, we conclude the paper in Section V.
II.
SEPARABLE DICTIONARY LEARNING
Due to the restrictions of traditional dictionary learning methods, separable dictionary learning methods have received significant attention. In this section, we introduce the theory and related work of separable dictionary learning in detail.
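To make the separability idea concrete before the formal development, here is a small numpy sketch under the standard separable model (our illustration; the dimensions are hypothetical): a 2D patch is synthesized as $Y = AXB^T$ with two small dictionaries $A$ and $B$, which is equivalent to vectorizing the patch and using the Kronecker dictionary $B \otimes A$:

```python
import numpy as np

rng = np.random.default_rng(3)
p = 8                                    # patch size p x p
A = rng.standard_normal((p, 2 * p))      # left dictionary  (p x 2p)
B = rng.standard_normal((p, 2 * p))      # right dictionary (p x 2p)
X = rng.standard_normal((2 * p, 2 * p))  # 2D coefficient matrix

Y = A @ X @ B.T                          # separable synthesis of a patch

# equivalent 1D view: vec(Y) = (B kron A) vec(X); the separable form stores
# two p x 2p matrices instead of one p^2 x 4p^2 Kronecker dictionary
y_vec = np.kron(B, A) @ X.flatten(order="F")
print(np.allclose(y_vec, Y.flatten(order="F")))  # True
```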
In the theory of separable dictionary learning, it is assumed that the dictionary $D \in \mathbb{R}^{m \times n}$ ($n > m$) has a separable
IEEE ICSP