Learning Compressed Sensing
Yair Weiss^{1,2}, Hyun Sung Chang^{2}, and William T. Freeman^{2}
^{1} Hebrew University of Jerusalem, yweiss@cs.huji.ac.il
^{2} MIT CSAIL, {hyunsung,billf}@mit.edu
Abstract— Compressed sensing [7], [6] is a recent set of
mathematical results showing that sparse signals can be exactly
reconstructed from a small number of linear measurements.
Interestingly, for ideal sparse signals with no measurement
noise, random measurements allow perfect reconstruction while
measurements based on principal component analysis (PCA)
or independent component analysis (ICA) do not. At the same
time, for other signal and noise distributions, PCA and ICA can
significantly outperform random projections in terms of enabling
reconstruction from a small number of measurements.
In this paper we ask: given a training set typical of the signals
we wish to measure, what is the optimal set of linear projections
for compressed sensing? We show that the optimal projections
are in general not the principal components nor the independent
components of the data, but rather a seemingly novel set of
projections that capture what is still uncertain about the signal,
given the training set. We also show that the projections onto
the learned uncertain components may far outperform random
projections. This is particularly true in the case of natural images,
where random projections have a vanishingly small signal-to-noise
ratio as the number of pixels becomes large.
I. INTRODUCTION
Compressed sensing [7], [6] is a set of recent mathematical
results on a classic question: given a signal x ∈ R^n and a
set of p linear measurements y ∈ R^p, y = Wx, how many
measurements are required to allow reconstruction of x?
Obviously, if we knew nothing at all about x, i.e., if x could be
any n-dimensional vector, we would need n measurements. Alternatively,
if we know our signal x lies in a low-dimensional
linear subspace, say of dimension k, then k measurements are
enough. But what if we know that x lies in a low-dimensional
nonlinear manifold? Can we still get away with fewer than
n measurements?
To motivate this question, consider the space of natural
images. An image with n pixels can be thought of as a vector
in R^n, but natural images occupy a tiny fraction of the set of
all signals in this space. If there were a way to exploit this fact,
we could build cameras with a small number of sensors that
would still enable perfect, high-resolution reconstructions
of natural images.
The basic mathematical results in compressed sensing deal
with signals that are k-sparse. These are signals that can
be represented with a small number, k, of active (non-zero)
basis elements. For such signals, it was shown in [7], [5]
that ck log n generic linear measurements are sufficient to
recover the signal exactly (with c a constant). Furthermore,
the recovery can be done by a simple convex optimization or
by a greedy optimization procedure [8].
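As an illustration of the convex-optimization recovery mentioned above, the following sketch (our own, not from the paper; all names are hypothetical) recovers a k-sparse signal from random Gaussian measurements by basis pursuit, i.e., minimizing ||x||_1 subject to y = Wx, recast as a linear program via the standard split x = u - v with u, v >= 0:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, p, k = 50, 25, 3  # signal length, number of measurements, sparsity

# k-sparse ground-truth signal
x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = rng.standard_normal(k)

# random Gaussian measurement matrix and noiseless measurements y = Wx
W = rng.standard_normal((p, n)) / np.sqrt(p)
y = W @ x

# Basis pursuit: min ||x||_1  s.t.  Wx = y.
# Split x = u - v with u, v >= 0 and minimize sum(u) + sum(v).
c = np.ones(2 * n)
A_eq = np.hstack([W, -W])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
x_hat = res.x[:n] - res.x[n:]

print(np.linalg.norm(x_hat - x))  # recovery error
```

With p on the order of ck log n (here 25 measurements for k = 3, n = 50), the linear program recovers x essentially exactly in the noiseless case.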
These results have generated a tremendous amount of excitement
in both the theoretical and practical communities. On
the theoretical side, the performance of compressed sensing
with random projections has been analyzed when the signals
are not exactly k-sparse but rather compressible (i.e., they can
be well approximated with a small number of active basis
elements) [7], [5], as well as when the measurements are
contaminated with noise [11], [19]. On the practical side,
applications of compressed sensing have been explored in
building “single-pixel” cameras [20], medical imaging [14]
and geophysical data analysis [12].
Perhaps the most surprising result in compressed sensing is
that perfect recovery is possible with random projections. This
is surprising given the large amount of literature in machine
learning and statistics devoted to finding projections that are
optimal in some sense (e.g. [4]). In fact, as we review in
the next section, for ideal sparse signals with no measurement
noise, random measurements significantly outperform
measurements based on principal component analysis (PCA)
or independent component analysis (ICA). At the same time,
for other signal and noise distributions, PCA and ICA can
significantly outperform random projections.
In this paper we ask: given a training set typical of the
signals we wish to measure, what is the optimal set of linear
projections for compressed sensing? We show that the optimal
projections are in general not the principal components nor the
independent components of the data, but rather a seemingly
novel set of projections that capture what is still uncertain
about the signal, given the training set. We also show that
the projections onto the learned uncertain components may
far outperform random projections. This is particularly true
in the case of natural images, where random projections have
a vanishingly small signal-to-noise ratio as the number of pixels
becomes large.
II. RANDOM PROJECTIONS VERSUS PCA AND ICA
To compare random projections to PCA and ICA, consider
the sparse signals illustrated by the image patches in
figure 1. In this dataset, each signal x has exactly one non-
zero component, and this non-zero component is uniformly
distributed in the range [−U_i, U_i]. We assume that all indices
i have approximately the same range, i.e., U_i ≈ 1, but to break
symmetries we set U_i = 1 + ε/i.
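This toy dataset is easy to simulate. A minimal sketch (the function name and the particular value of ε are our own illustrative choices):

```python
import numpy as np

def sample_one_sparse(num_samples, n, eps=0.01, rng=None):
    """Draw signals with exactly one non-zero component; component i
    (1-indexed) is uniform in [-U_i, U_i] with U_i = 1 + eps / i."""
    if rng is None:
        rng = np.random.default_rng(0)
    U = 1.0 + eps / np.arange(1, n + 1)         # symmetry-breaking ranges
    X = np.zeros((num_samples, n))
    idx = rng.integers(0, n, size=num_samples)  # index of the active component
    X[np.arange(num_samples), idx] = rng.uniform(-U[idx], U[idx])
    return X

X = sample_one_sparse(num_samples=1000, n=64)
```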
We are interested in the probability of correct reconstruction
from a projected signal:
y = Wx (1)
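One hypothetical way to estimate this probability empirically for 1-sparse signals is a Monte-Carlo experiment: project random 1-sparse signals through W, decode by brute-force search over the single-component support, and count exact recoveries. The decoder and all names below are our own illustrative choices, not the paper's procedure:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, trials = 20, 2, 500  # signal length, measurements, Monte-Carlo trials

def recover_one_sparse(W, y):
    """Brute-force decoder for exactly-1-sparse signals: pick the column
    of W whose span explains y with the smallest least-squares residual."""
    best = (np.inf, 0, 0.0)                # (residual, index, amplitude)
    for i in range(W.shape[1]):
        w = W[:, i]
        a = (w @ y) / (w @ w)              # least-squares amplitude
        r = np.linalg.norm(y - a * w)      # residual for this support
        if r < best[0]:
            best = (r, i, a)
    x_hat = np.zeros(W.shape[1])
    x_hat[best[1]] = best[2]
    return x_hat

W = rng.standard_normal((p, n))            # random measurement matrix
correct = 0
for _ in range(trials):
    x = np.zeros(n)
    x[rng.integers(n)] = rng.uniform(-1, 1)
    correct += np.allclose(recover_one_sparse(W, W @ x), x, atol=1e-8)
print(correct / trials)                    # empirical recovery probability
```

For noiseless 1-sparse signals, two generic random measurements already suffice for this decoder, since distinct columns of a random W are almost surely not collinear.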