Robust Bayesian Sparse Representation Based on
beta-Bernoulli Process Prior
Zengyuan Mi, Qin Lin, Yue Huang, Xinghao Ding*
School of Information Science and Technology, Xiamen University, Xiamen, China, mzy19890220@163.com
Abstract
There has been growing interest in the study of sparse
representation in recent years. Although many algorithms have
been developed, outliers in the training data make the
estimation unreliable. In this paper, we present a model under a
non-parametric Bayesian framework to address this problem. The
noise term in the sparse representation is decomposed into a
Gaussian noise term and an outlier noise term, which we assume
to be sparse. The beta-Bernoulli process is employed as a prior
for finding sparse solutions.
Keywords-outliers; non-parametric Bayesian; beta-Bernoulli
process; sparse representation
I. INTRODUCTION
There has been growing interest in the study of sparse
representation in recent years. It has found applications
in a wide range of diverse fields. These include image
denoising, inpainting, blind source separation (BSS) and
compressive sensing [1]. The sparse representation of a signal
is modeled by [2]:
$$Y = D\alpha + \varepsilon \qquad (1)$$
where $Y$ represents an $M \times N$ signal matrix, $D$ is an
$M \times K$ matrix with $K$ atoms $\{d_1, d_2, \cdots, d_K\}$ called the
dictionary, $\alpha$ is a $K \times N$ coefficient matrix which is sparse
and $\varepsilon$ is an $M \times N$ noise matrix.
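As a minimal sketch of model (1), the following NumPy snippet draws synthetic data with the stated shapes. The function name `synthesize_sparse_signals`, the random unit-norm dictionary, the fixed number of nonzeros per column (`nnz`), and the noise level are all assumptions made for illustration; none of them come from the paper.

```python
import numpy as np

def synthesize_sparse_signals(M=16, K=32, N=100, nnz=3, noise_std=0.05, seed=0):
    """Draw Y = D @ alpha + eps with a column-sparse coefficient matrix alpha.

    All default values are illustrative assumptions, not settings from the paper.
    """
    rng = np.random.default_rng(seed)
    # Dictionary D (M x K): columns are the atoms d_1, ..., d_K, scaled to unit norm.
    D = rng.standard_normal((M, K))
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    # Coefficient matrix alpha (K x N): each column uses exactly nnz atoms.
    alpha = np.zeros((K, N))
    for n in range(N):
        support = rng.choice(K, size=nnz, replace=False)
        alpha[support, n] = rng.standard_normal(nnz)
    # Gaussian noise matrix eps (M x N).
    eps = noise_std * rng.standard_normal((M, N))
    Y = D @ alpha + eps
    return Y, D, alpha, eps

Y, D, alpha, eps = synthesize_sparse_signals()
```

With K larger than M, as in the defaults here, the dictionary is overcomplete, which is the setting discussed next.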
The representation of a signal with an overcomplete
dictionary has several advantages, including the fact that it
encourages a simple model and therefore often avoids
over-training [3].
Many algorithms have been proposed to find sparse
coefficients when the dictionary matrix is given. The simplest
one is the relevance vector machine (RVM) [4], in which a kernel
function matrix is chosen as the dictionary and a Student's
t-distribution is employed to enhance the sparsity of the
coefficients. Dictionaries that better fit the above
model can be designed either by selecting one from a
prespecified set of linear transforms or by adapting the dictionary to
a set of training signals. In [5], Michal Aharon and Michael
Elad proposed an algorithm for dictionary training
(K-SVD), which can be interpreted as a
generalization of the K-means clustering process. In [6],
non-parametric Bayesian techniques are considered for learning
dictionaries for sparse image representations (BPFA), which
gives a more intuitive constraint on the sparsity than RVM.
All of the above methods assume that the noise
follows a Gaussian distribution. However, outliers in the
training data (such as salt-and-pepper noise in images)
make this assumption unreliable [7].
In this paper, we extend the BPFA algorithm to detect
outliers efficiently. A robust model under a non-parametric
Bayesian framework is proposed to decompose the noise term
in the BPFA model into a Gaussian noise term and an outlier
noise term, which we assume to be sparse. The beta-Bernoulli
prior [8] is employed for finding sparse solutions to the image
representation.
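The idea of a noise term made of a dense Gaussian part plus a sparse outlier part can be illustrated with a short simulation. This is only a sketch under stated assumptions: the function name `sample_decomposed_noise`, the fixed outlier probability, and the salt-and-pepper style magnitudes of plus or minus `outlier_scale` are hypothetical choices, and they are not the priors used in the proposed model.

```python
import numpy as np

def sample_decomposed_noise(shape, gauss_std=0.05, outlier_prob=0.05,
                            outlier_scale=1.0, seed=0):
    """Return (total, gaussian, outliers, mask) with total = gaussian + outliers.

    The fixed outlier_prob and the +/- outlier_scale magnitudes are
    illustrative assumptions, not the paper's prior on the outlier term.
    """
    rng = np.random.default_rng(seed)
    # Dense, small-amplitude Gaussian component.
    gaussian = gauss_std * rng.standard_normal(shape)
    # Sparse component: a Bernoulli mask selects the few corrupted entries.
    mask = rng.random(shape) < outlier_prob
    magnitudes = outlier_scale * rng.choice([-1.0, 1.0], size=shape)
    outliers = mask * magnitudes
    return gaussian + outliers, gaussian, outliers, mask

total, gaussian, outliers, mask = sample_decomposed_noise((16, 100))
```

Most entries of `outliers` are exactly zero, which is the sparsity assumption on the outlier term stated above.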
The remainder of the paper is organized as follows. We
review the beta-Bernoulli process and present the proposed model
in the next section, with inference discussed in Section III.
Experimental results on image denoising are shown in Section
IV. Conclusions are drawn in Section V.
II. MODEL CONSTRUCTION
A. Beta-Bernoulli process
The two-parameter beta-Bernoulli process is a
non-parametric prior with three inputs: $a > 0$, $b > 0$, and a
base measure $H_0$. It is represented as $H \sim BP(a, b, H_0)$. The base
measure $H_0$ is denoted as
$H_0 = \frac{1}{K} \sum_{k=1}^{K} \delta_{d_k}$, where $\delta_{d_k}$ is a
unit point mass at $d_k$, the $k$th column of $D$. Provided that $K$ is
reasonably small, the discrete beta-Bernoulli process,
$H = \sum_{k=1}^{K} z_k \delta_{d_k}$, requires the
generation of the binary vector with entries $z_k$, where
$$z_k \sim \mathrm{Bernoulli}(\pi_k), \qquad \pi_k \sim \mathrm{Beta}\left(a/K,\; b(K-1)/K\right) \qquad (2)$$
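The two draws in (2) can be sketched directly with NumPy. The function name `sample_beta_bernoulli` and the example values of `K`, `a`, and `b` are assumptions chosen for illustration only.

```python
import numpy as np

def sample_beta_bernoulli(K, a, b, seed=0):
    """Draw pi_k ~ Beta(a/K, b(K-1)/K) and z_k ~ Bernoulli(pi_k) for k = 1..K."""
    if K < 2 or a <= 0 or b <= 0:
        # Both Beta parameters must be strictly positive.
        raise ValueError("require K >= 2, a > 0 and b > 0")
    rng = np.random.default_rng(seed)
    # Per-atom usage probabilities pi_k.
    pi = rng.beta(a / K, b * (K - 1) / K, size=K)
    # Binary indicators z_k: atom d_k is active when z_k = 1.
    z = (rng.random(K) < pi).astype(int)
    return z, pi

# Example values are hypothetical, not settings from the paper.
z, pi = sample_beta_bernoulli(K=50, a=2.0, b=1.0)
```

Under (2), the prior mean of each $\pi_k$ is $a / (a + b(K-1))$, so most of the $z_k$ are zero when $K$ is much larger than $a/b$; this is how the prior favors sparse atom usage.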
B. Model construction
The noise term in model (1), as used in BPFA, is
often assumed to consist of independent zero-mean Gaussian
samples. However, in the presence of outliers, Gaussian noise is
not an appropriate assumption for $\varepsilon$. We considered splitting
The project is supported by the National Natural Science Foundation of
China (No. 30900328, 61172179), the Fundamental Research Funds for the
Central Universities (No. 2011121051), and the Natural Science Foundation of
Fujian Province of China (No. 2012J05160).