
Image classification using local tensor singular
value decompositions
Elizabeth Newman
Department of Mathematics
Tufts University
Medford, Massachusetts 02155
Email: e.newman@tufts.edu
Misha Kilmer
Department of Mathematics
Tufts University
Medford, Massachusetts 02155
Email: misha.kilmer@tufts.edu
Lior Horesh
IBM TJ Watson Research Center
1101 Kitchawan Road
Yorktown Heights, NY
Email: lhoresh@us.ibm.com
Abstract—From linear classifiers to neural networks, image
classification has been a widely explored topic in mathematics,
and many algorithms have proven to be effective classifiers.
However, the most accurate classifiers typically incur high storage costs or require
complicated, computationally expensive procedures. We present a novel (nonlinear)
classification approach using truncation of local tensor singular
value decompositions (tSVD) that robustly offers accurate results
while maintaining manageable storage costs. Our approach takes
advantage of the optimality of the representation under the tensor
algebra described here to determine the class to which an image belongs.
We extend our approach to a method that can determine specific
pairwise match scores, which could be useful in, for example,
object recognition problems where pose/position are different. We
demonstrate the promise of our new techniques on the MNIST
data set.
I. INTRODUCTION
Image classification is a well-explored problem in which an
image is identified as belonging to one of a known number
of classes. Researchers seek to extract features from which
to determine the patterns that characterize an image. Algo-
rithms to determine these essential features include statistical
methods such as centroid-based clustering, connectivity/graph-
based clustering, distribution-based clustering, and density-
based clustering [13], [14], [15], as well as learning algorithms
(linear discriminant analysis, support vector machines, neural
networks) [5].
Our approach differs significantly from techniques in the
literature in that it uses local tensor singular value decompo-
sitions (tSVD) to form the feature space of an image. Tensor
approaches are gaining increasing popularity for tasks such as
image recognition and dictionary learning and reconstruction
[3], [9], [7], [10]. These are favored over matrix-vector-based
approaches as it has been demonstrated that a tensor-based
approach enables retention of the original image structural
correlations that are lost by image vectorization. Tensor ap-
proaches for image classification appear to be in their infancy,
although some approaches based on the tensor HOSVD [11]
have been explored in the literature [6].
Here, we are motivated by the work in [3] which em-
ploys optimal low tubal-rank tensor factorizations through
use of the t-product [1] and by the work in [2] describing
tensor orthogonal projections. We present a new approach
for classification based on the tensor SVD from [1], called
the tSVD, which is elegant for its straightforward mathe-
matical interpretation and implementation, and which can
be easily parallelized for substantial computational gains.
State-of-the-art matrix decompositions are asymptotically
challenged by the demand to process ever-growing datasets
of larger and more complex objects [16], so the importance
of this aspect of the study cannot be overstated. Our method
is in direct contrast to deep neural-network-based approaches,
which require many layers of complexity and for which
theoretical interpretation is not readily available [17].
Our approach is also different from
the tensor approach in [6] because truncating the tSVD has
optimality properties that truncating the HOSVD does not
enjoy. We conclude this study with a demonstration on the
MNIST [4] dataset.
A. Notation and Preliminaries
In this paper, a tensor is a real-valued$^1$ third-order tensor,
or three-dimensional array of data, denoted by a capital script
letter. As depicted in Figure 1, $\mathcal{A}$ is an $\ell \times m \times n$ tensor. Frontal
slices $\mathcal{A}^{(k)}$ for $k = 1, \dots, n$ are $\ell \times m$ matrices. Lateral slices
$\mathcal{A}_j$ for $j = 1, \dots, m$ are $\ell \times n$ matrices oriented along the
third dimension. Tubes $\mathbf{a}_{ij}$ for $i = 1, \dots, \ell$ and $j = 1, \dots, m$
are $n \times 1$ column vectors oriented along the third dimension [2].
Fig. 1. Representations of third-order tensors: (a) tensor $\mathcal{A}$; (b) frontal slices $\mathcal{A}^{(k)}$; (c) lateral slices $\mathcal{A}_j$; (d) tubes $\mathbf{a}_{ij}$.
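To make these slicing conventions concrete, the short NumPy sketch below (added here for illustration; the array name and sizes are hypothetical and not taken from the paper) extracts a frontal slice, a lateral slice, and a tube from a third-order array stored with shape (ell, m, n), where the third axis indexes the frontal slices.

```python
import numpy as np

# Hypothetical ell x m x n tensor; the sizes are chosen only for illustration.
ell, m, n = 4, 5, 3
A = np.random.rand(ell, m, n)

# Frontal slice A^(k): an ell x m matrix, k = 1, ..., n (zero-based below).
k = 0
frontal_k = A[:, :, k]               # shape (ell, m)

# Lateral slice A_j: an ell x n matrix oriented along the third dimension.
j = 2
lateral_j = A[:, j, :]               # shape (ell, n)

# Tube a_ij: an n x 1 column vector oriented along the third dimension.
i = 1
tube_ij = A[i, j, :].reshape(n, 1)   # shape (n, 1)
```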
To multiply a pair of tensors, we need to understand the
t-product, which requires the following tensor reshaping ma-
chinery. Given $\mathcal{A} \in \mathbb{R}^{\ell \times m \times n}$, the unfold function reshapes
$^1$We assume real-valued tensors because we are working with real-valued
image data. However, the subsequent notation and definitions can be extended
to complex-valued tensors [8].
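The excerpt above ends before unfold is fully defined. As a point of reference only, the sketch below follows the convention commonly used in the t-product literature [1], in which unfold stacks the frontal slices of an $\ell \times m \times n$ tensor into an $\ell n \times m$ block matrix and fold reverses the operation; since that definition is not stated in the text above, treat this as an assumption about the standard convention rather than the paper's own statement.

```python
import numpy as np

def unfold(A):
    """Stack the frontal slices A^(1), ..., A^(n) of an ell x m x n tensor
    vertically into an (ell*n) x m block matrix (assumed standard t-product
    convention, not taken from the excerpt)."""
    ell, m, n = A.shape
    return np.vstack([A[:, :, k] for k in range(n)])

def fold(M, ell, m, n):
    """Invert unfold: reshape an (ell*n) x m block matrix back into an
    ell x m x n tensor."""
    A = np.empty((ell, m, n))
    for k in range(n):
        A[:, :, k] = M[k * ell:(k + 1) * ell, :]
    return A

# Round-trip sanity check on a small random tensor (sizes are illustrative).
A = np.random.rand(4, 5, 3)
assert np.allclose(fold(unfold(A), 4, 5, 3), A)
```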