COLOR DEMOSAICKING VIA NONLOCAL TENSOR REPRESENTATION
Lili Huang⋆†, Xuan Wu†, Wenze Shao∗, Hongyi Liu†, Zhihui Wei†, Liang Xiao†
⋆ Guangxi University of Science and Technology, Liuzhou, China
† Nanjing University of Science and Technology, Nanjing, China
∗ Nanjing University of Posts and Telecommunications, Nanjing, China
ABSTRACT
A single-sensor camera captures a scene through a color filter
array, so each pixel samples only one of the three primary
colors. Color demosaicking (CDM) is the process of
reconstructing a full color image from this sensor data. In this
paper, we propose a novel CDM scheme based on learned
simultaneous sparse coding over a nonlocal tensor
representation. First, similar 2D patches are grouped to form a
third-order tensor, that is, a 3D array. Then, three
sub-dictionaries, which characterize the coherent structures
that appear in each dimension of the grouped tensor, are learned
jointly by using Tucker decomposition. The resulting coefficient
tensor is subject to a grouped-block-sparsity constraint, which
forces similar patches to share the same dictionary atoms in
their sparse decompositions. Experimental results demonstrate
the effectiveness of the method in terms of both average CPSNR
and visual quality.
Index Terms— Color demosaicking, dictionary learning,
tensor decomposition, nonlocal method, sparse representation
1. INTRODUCTION
The common approach in single-sensor digital cameras is to
use a color filter array (CFA) to capture one of the three
primary colors at each pixel location. In such cameras, an
important part of the image processing pipeline is color
demosaicking (CDM), which estimates the two missing color
components at each pixel to reconstruct a full color image.
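As a minimal sketch of the sampling described above, the following simulates a CFA on a full color image. The RGGB Bayer layout and the function name `bayer_mosaic` are assumptions made for illustration; the text does not specify which CFA pattern is used.

```python
import numpy as np

def bayer_mosaic(rgb):
    """Sample an H x W x 3 image through an assumed RGGB Bayer CFA."""
    h, w, _ = rgb.shape
    mask = np.zeros((h, w, 3), dtype=bool)
    mask[0::2, 0::2, 0] = True   # red at even rows, even columns
    mask[0::2, 1::2, 1] = True   # green at even rows, odd columns
    mask[1::2, 0::2, 1] = True   # green at odd rows, even columns
    mask[1::2, 1::2, 2] = True   # blue at odd rows, odd columns
    # Keep the single sampled color at each pixel; the other two are missing.
    cfa = np.where(mask, rgb, 0.0).sum(axis=2)
    return cfa, mask

rng = np.random.default_rng(0)
rgb = rng.random((4, 6, 3))      # small synthetic image
cfa, mask = bayer_mosaic(rgb)
```

Here `cfa` holds one value per pixel and `mask` records which color that value belongs to; demosaicking has to estimate the two entries per pixel where `mask` is false.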
Recently, sparse representation (SR) and nonlocal
self-similarity (NSS) models have proven powerful for
numerous problems in image processing and pattern
recognition. They are also popular in CDM methods,
such as the self-similarity driven demosaicking (SDD) [1], the
learned sparse coding (LSC) demosaicking [2], and the learned
simultaneous sparse coding (LSSC) demosaicking [3].
The research is supported in part by the National Natural Science
Foundation (NSF) of China (61302178, 91538108, 11431015, 61571230,
61402239, 61301217, 61301215), NSF of Guangxi Province
(2014GXNSFAA118360), Jiangsu Province (BK2012800, BK20130868, BK20130883).
Traditional SR- and NSS-based methods work on vector
spaces by representing patches as vectors. However, a patch
is intrinsically a matrix, that is, a second-order tensor whose
two modes are spatial height and width. Vectorization
damages the natural structure and correlation of the original
tensorial data. It is therefore desirable to carry out sparse
coding directly on the tensor objects rather than on their
vectorized versions. This not only preserves the higher-order
structure of the image, but can also improve the learnability of
the dictionary, especially when the number of training samples
is small [4, 5, 6].
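One way to see why a tensor formulation can need fewer training samples is to count dictionary parameters. The sketch below is an illustration only: the patch side `p = 8` and the names `D_h` and `D_w` are hypothetical. It checks that applying one small dictionary per mode to a matrix patch is equivalent to applying a single Kronecker-structured dictionary to the vectorized patch, while storing far fewer entries than a general vectorized dictionary of the same size.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 8                                  # hypothetical patch side length
X = rng.standard_normal((p, p))        # a patch kept as a matrix
D_h = rng.standard_normal((p, p))      # dictionary acting on the height mode
D_w = rng.standard_normal((p, p))      # dictionary acting on the width mode

# Tensor (mode-wise) form: transform the two modes separately.
Y_tensor = D_h @ X @ D_w.T

# Vectorized form: one large dictionary acting on the stacked columns of X.
D_vec = np.kron(D_w, D_h)              # shape (p*p, p*p)
y_vec = D_vec @ X.flatten(order="F")   # column-major vectorization

same = np.allclose(Y_tensor.flatten(order="F"), y_vec)
params_tensor = D_h.size + D_w.size    # 2 * p**2 entries
params_vector = D_vec.size             # p**4 entries
```

For `p = 8` the two mode-wise dictionaries hold 128 entries in total, against 4096 for an unstructured dictionary on vectorized patches.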
Like the methods above, the proposed method reformulates
demosaicking as a denoising problem and consists of two
stages. In the first stage, an initial estimate of the full color
image is computed to recover the regions that fit the NSS
assumption. We treat this estimate as a noisy signal and regard
the interpolation errors as demosaicking noise. In the second
stage, we use a denoising technique based on LSSC over
nonlocal tensor representation (LSSC-NTR) to remove these
errors, and take the denoised output as the improved estimate.
Specifically, similar 2D patches are first grouped to form a
third-order tensor. Then, three spatially adaptive
sub-dictionaries are learned jointly by using Tucker
decomposition, i.e., a form of multilinear principal component
analysis (MPCA), which seeks the bases in each dimension that
characterize the most representative information about the
underlying structure of the tensor and thus removes noise
and trivial information [7]. The resulting coefficient tensor
is subject to a grouped-block-sparsity constraint, which
forces similar patches to share the same dictionary atoms in
their sparse decompositions. This implicitly averages out the
errors not only across the spatial height and width, but also
along the third dimension, so as to preserve the similarity
between intensity values at corresponding pixels of different
patches. Finally, each group of similar patches is
reconstructed collaboratively from the learned sub-dictionaries
along the three modes. By summing and averaging the
overlapping reconstructed patches, we obtain the desired CDM
result.
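The second stage can be sketched roughly as follows for a single channel and a single group. This is not the authors' algorithm: the Tucker factors are computed here with a plain higher-order SVD (HOSVD), hard thresholding of the core tensor stands in for the grouped-block-sparsity constraint, and the names `group_similar`, `hosvd_denoise` and the threshold `tau` are hypothetical. The final aggregation of overlapping patches is omitted.

```python
import numpy as np

def unfold(t, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def mode_product(t, m, mode):
    """Multiply tensor `t` by matrix `m` along axis `mode`."""
    out = np.tensordot(m, t, axes=(1, mode))   # contracted axis lands in front
    return np.moveaxis(out, 0, mode)

def group_similar(img, ref, size, k):
    """Stack the k patches closest (squared Euclidean) to the one at `ref`."""
    h, w = img.shape
    r0, c0 = ref
    target = img[r0:r0 + size, c0:c0 + size]
    cands = []
    for r in range(h - size + 1):
        for c in range(w - size + 1):
            patch = img[r:r + size, c:c + size]
            cands.append((float(np.sum((patch - target) ** 2)), r, c))
    cands.sort()
    patches = [img[r:r + size, c:c + size] for _, r, c in cands[:k]]
    return np.stack(patches, axis=2)           # size x size x k tensor

def hosvd_denoise(group, tau):
    """Tucker decomposition via HOSVD, then hard-threshold the core."""
    # One orthonormal basis (sub-dictionary) per mode of the group tensor.
    factors = [np.linalg.svd(unfold(group, n), full_matrices=False)[0]
               for n in range(3)]
    core = group
    for n, u in enumerate(factors):
        core = mode_product(core, u.T, n)      # coefficient tensor
    core = np.where(np.abs(core) > tau, core, 0.0)
    rec = core
    for n, u in enumerate(factors):
        rec = mode_product(rec, u, n)          # back to the patch domain
    return rec, factors, core

rng = np.random.default_rng(0)
clean = np.tile(np.sin(np.arange(32) / 3.0), (32, 1))
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
group = group_similar(noisy, ref=(10, 10), size=6, k=12)
rec, factors, core = hosvd_denoise(group, tau=0.3)
```

With `tau = 0` the group is reproduced exactly, so any change in `rec` comes only from the discarded core coefficients. Because the third factor mixes the stacked patches, thresholding the core acts across patches as well as within them.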
2. NOTATIONS AND PRELIMINARIES
A tensor is a multiway array or multidimensional matrix. The
order of a tensor is the number of its dimensions,
also called ways or modes. In this paper, scalars (zero-order
tensors) are denoted by lowercase letters; vectors (first-order