Image Super-Resolution as Sparse Representation of Raw Image Patches
Jianchao Yang†, John Wright‡, Thomas Huang†, Yi Ma‡
ECE Department, University of Illinois at Urbana-Champaign, USA
Beckman Institute† and Coordinated Science Laboratory‡
{jyang29, jnwright, huang, yima@uiuc.edu}
Abstract
This paper addresses the problem of generating a super-
resolution (SR) image from a single low-resolution input
image. We approach this problem from the perspective of
compressed sensing. The low-resolution image is viewed
as a downsampled version of a high-resolution image, whose
patches are assumed to have a sparse representation with
respect to an over-complete dictionary of prototype signal-
atoms. The principle of compressed sensing ensures that
under mild conditions, the sparse representation can be
correctly recovered from the downsampled signal. We will
demonstrate the effectiveness of sparsity as a prior for reg-
ularizing the otherwise ill-posed super-resolution problem.
We further show that a small set of randomly chosen raw
patches from training images of similar statistical nature to
the input image generally serves as a good dictionary, in the
sense that the computed representation is sparse and the
recovered high-resolution image is competitive or even su-
perior in quality to images produced by other SR methods.
1. Introduction
Conventional approaches to generating a super-
resolution (SR) image require multiple low-resolution
images of the same scene, typically aligned with sub-pixel
accuracy. The SR task is cast as the inverse problem of
recovering the original high-resolution image by fusing
the low-resolution images, based on assumptions or
prior knowledge about the generation model from the
high-resolution image to the low-resolution images. The
basic reconstruction constraint is that applying the image
formation model to the recovered image should produce
the same low-resolution images. However, because much
information is lost in the high-to-low generation process,
the reconstruction problem is severely underdetermined,
and the solution is not unique. Various methods have been
proposed to further regularize the problem. For instance,
one can choose a MAP (maximum a posteriori) solution
under generic image priors such as Huber MRF (Markov
Random Field) and Bilateral Total Variation [14, 11, 25].
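The non-uniqueness noted above can be illustrated with a small numerical sketch. It assumes a one-dimensional signal and a block-averaging downsampling operator; the names `downsample_operator`, `x_true`, and `x_alt` are hypothetical and the formation model is a simplification, not the one used in this paper.

```python
import numpy as np

def downsample_operator(n_high, factor):
    """Block-averaging operator (an assumed, simplified formation model):
    each low-resolution sample is the mean of `factor` high-resolution samples."""
    n_low = n_high // factor
    D = np.zeros((n_low, n_high))
    for i in range(n_low):
        D[i, i * factor:(i + 1) * factor] = 1.0 / factor
    return D

D = downsample_operator(8, 2)

# Two different high-resolution signals (illustrative values only).
x_true = np.array([0.0, 0.0, 1.0, 1.0, 4.0, 4.0, 2.0, 2.0])
# This perturbation sums to zero within every block, so it lies in the null space of D.
null_vec = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
x_alt = x_true + 0.5 * null_vec

# Both signals satisfy the reconstruction constraint for the same observation.
y_true = D @ x_true
y_alt = D @ x_alt
same_observation = np.allclose(y_true, y_alt)

# D has fewer rows than columns, so its rank is at most the low-resolution length.
rank = np.linalg.matrix_rank(D)
```

Because `D` maps 8 unknowns to 4 measurements, any component in its null space is invisible to the reconstruction constraint, which is why an additional prior is needed to select a solution.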
However, the performance of these reconstruction-based
super-resolution algorithms degrades rapidly if the mag-
nification factor is large or if there are not enough low-
resolution images to constrain the solution, as in the ex-
treme case of only a single low-resolution input image [2].
Another class of super-resolution methods that can overcome
this difficulty is learning-based approaches, which use a
learned co-occurrence prior to predict the correspondence
between low-resolution and high-resolution image
patches [12, 26, 16, 5, 20].
In [12], the authors propose an example-based learn-
ing strategy that applies to generic images where the low-
resolution to high-resolution prediction is learned via a
Markov Random Field (MRF) solved by belief propaga-
tion. [23] extends this approach by using the Primal Sketch
priors to enhance blurred edges, ridges and corners. Nev-
ertheless, the above methods typically require enormous
databases of millions of high-resolution and low-resolution
patch pairs to make the databases expressive enough. In [5],
the authors adopt the philosophy of LLE [22] from manifold
learning, assuming similarity between the two manifolds in
the high-resolution patch space and the low-resolution patch
space. Their algorithm maps the local geometry of the
low-resolution patch space to the high-resolution patch space,
generating each high-resolution patch as a linear combination
of its neighbors. Using this strategy, more patch patterns can
be represented using a smaller training database. However,
using a fixed number K of neighbors for reconstruction often
results in blurring effects, due to over- or under-fitting.
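The neighbor-based reconstruction described above can be sketched as follows. This is a minimal illustration of LLE-style weights with a fixed K, not the implementation of [5]: the function name `neighbor_embedding`, the regularization constant `reg`, and the synthetic training data are all assumptions made for the example.

```python
import numpy as np

def neighbor_embedding(y, low_train, high_train, k=3, reg=1e-6):
    """Estimate a high-resolution patch from the k nearest low-resolution neighbors.

    Hypothetical helper: rows of low_train and high_train are paired patches.
    """
    # Find the k training patches closest to y in the low-resolution space.
    dists = np.linalg.norm(low_train - y, axis=1)
    idx = np.argsort(dists)[:k]
    # Local Gram matrix of the neighbors centered on y; the small ridge term
    # (an assumption of this sketch) keeps the linear system well conditioned.
    Z = low_train[idx] - y
    G = Z @ Z.T + reg * np.eye(k)
    # Weights that approximately minimize ||y - sum_k w_k y_k||^2
    # subject to sum_k w_k = 1.
    w = np.linalg.solve(G, np.ones(k))
    w /= w.sum()
    # Apply the same weights to the paired high-resolution patches.
    return w @ high_train[idx], w

# Synthetic paired data (illustrative only): high-resolution patches are a
# fixed linear function of the low-resolution ones.
rng = np.random.default_rng(0)
low_train = rng.normal(size=(50, 4))
high_train = low_train @ rng.normal(size=(4, 9))

# Query with a patch that is already in the training set.
x_hat, w = neighbor_embedding(low_train[7], low_train, high_train)
```

Note that `k` is fixed in advance regardless of how well the neighbors actually explain the query patch, which is the limitation pointed out above.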
In this paper, we focus on the problem of recovering
the super-resolution version of a given low-resolution im-
age. Although our method can be readily extended to han-
dle multiple input images, we mostly deal with a single in-
put image. Like the aforementioned learning-based meth-
ods, we will rely on patches from example images. Our
method does not require any learning on the high-resolution
patches, instead working directly with the low-resolution
training patches or their features. Our approach is motivated
978-1-4244-2243-2/08/$25.00 ©2008 IEEE