Kernelized Support Tensor Machines
Lifang He 1, Chun-Ta Lu 1, Guixiang Ma 1, Shen Wang 1, Linlin Shen 2, Philip S. Yu 1 3, Ann B. Ragin 4

1 Department of Computer Science, University of Illinois at Chicago, Chicago, IL, USA; 2 Institute for Computer Vision, Shenzhen University, Shenzhen, China; 3 Institute for Data Science, Tsinghua University, Beijing, China; 4 Department of Radiology, Northwestern University, Chicago, IL, USA. Correspondence to: Linlin Shen <llshen@szu.edu.cn>.
Abstract
In the context of supervised tensor learning, preserving the structural information and exploiting the discriminative nonlinear relationships of tensor data are crucial for improving the performance of learning tasks. Based on tensor factorization theory and kernel methods, we propose a novel Kernelized Support Tensor Machine (KSTM), which integrates kernelized tensor factorization with a maximum-margin criterion. Specifically, the kernelized factorization technique is introduced to approximate the tensor data in kernel space such that the complex nonlinear relationships within tensor data can be explored. Further, dual structure-preserving kernels are devised to learn the nonlinear boundary between tensor data. As a result of the joint optimization, the kernels obtained in KSTM exhibit better generalization power for discriminative analysis. Experimental results on real-world neuroimaging datasets show the superiority of KSTM over state-of-the-art techniques.
1. Introduction
In many real-world applications, data samples intrinsically come in the form of two-dimensional (matrices) or multi-dimensional arrays (tensors). In medical neuroimaging, for instance, a functional magnetic resonance imaging (fMRI) sample is naturally a third-order tensor consisting of 3D voxels. There has recently been extensive work on supervised tensor learning (STL). For example, (Tao et al., 2007) proposed an STL framework that extends the standard linear support vector machine (SVM) learning framework to tensor patterns by constructing multilinear models.
Under this learning framework, several tensor-based linear models (Zhou et al., 2013; Hao et al., 2013) have been developed. These methods assume, explicitly or implicitly, that the data are linearly separable in the input space. In practice, however, this assumption is often violated, and linear decision boundaries do not adequately separate the classes.
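For reference, the linear STL models above can be summarized by a single decision function. The form below follows the rank-one weight-tensor formulation commonly attributed to (Tao et al., 2007); the notation ($\mathcal{X}$ for an input tensor, $\mathbf{w}_n$ for mode-wise weight vectors) is introduced here for illustration and is not drawn from this paper:

$$f(\mathcal{X}) \;=\; \big\langle \mathcal{X},\; \mathbf{w}_1 \circ \mathbf{w}_2 \circ \cdots \circ \mathbf{w}_N \big\rangle + b,$$

where $\circ$ denotes the outer product, so the weight tensor $\mathcal{W} = \mathbf{w}_1 \circ \cdots \circ \mathbf{w}_N$ is constrained to be rank one and is typically learned by alternately solving a standard SVM for one mode while fixing the others. Because $f$ is linear in $\mathcal{X}$, the resulting decision boundary is a (structured) hyperplane, which is exactly the limitation discussed above.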
In order to apply kernel methods to tensor data, several works (Signoretto et al., 2011; 2012; Zhao et al., 2013a) convert the input tensors into vectors (or matrices), which are then used to construct kernels. This kind of conversion, however, destroys the structural information of the tensor data. Moreover, the dimensionality of the resulting vector typically becomes very high, which leads to the curse of dimensionality and small sample size problems (Lu et al., 2008; Yan et al., 2007).
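To make the dimensionality issue concrete, the sketch below is a minimal illustration of the vectorize-then-kernel baseline, assuming a typical fMRI grid size and a standard Gaussian kernel (both are assumptions for illustration, not the setup of the cited works):

import numpy as np

# Illustrative only: a 61 x 73 x 61 grid is a common fMRI volume size
# (assumed here), while a study may contain only a few dozen subjects.
n_subjects = 40
volume_shape = (61, 73, 61)

# Vectorizing every subject's tensor discards the mode structure and
# yields very high-dimensional feature vectors.
rng = np.random.default_rng(0)
X = rng.standard_normal((n_subjects, *volume_shape))
X_vec = X.reshape(n_subjects, -1)     # shape: (40, 271633)

# A standard Gaussian (RBF) kernel on the flattened vectors; any
# structural information across the three modes is ignored.
def rbf_kernel(A, B, gamma=1e-6):
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

K = rbf_kernel(X_vec, X_vec)          # (40, 40) Gram matrix

With roughly 2.7e5 features and only tens of samples, any kernel built on these flattened vectors inherits both the loss of mode structure and the small-sample estimation problems noted above.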
Recently, (Hao et al., 2013; He et al., 2014; Ma et al., 2016) employed CANDECOMP/PARAFAC (CP) factorization (Kolda & Bader, 2009) on the input tensor to foster the use of kernel methods for STL problems. However, as indicated in (Rubinov et al., 2009; Luo et al., 2011), the underlying structure of real data is often nonlinear. Although the CP factorization provides a good approximation to the original tensor data, it is only concerned with multilinear relationships. Thus, it is difficult to model complex nonlinear relationships within the tensor data. Most recently, (He et al., 2017) extended CP factorization to the nonlinear case through the exploitation of the representer theorem, and then used kernelized CP (KCP) factorization to facilitate kernel learning. To the best of our knowledge, there is no existing work that tackles factorization and prediction as a joint optimization problem over kernel methods.
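For reference, the sketch below shows a plain (non-kernelized) CP factorization of a single third-order tensor via alternating least squares. It is only a minimal illustration of the compact multilinear representation that the works above build kernels on, not the KCP procedure of (He et al., 2017); the rank, the initialization, and the fixed iteration count are assumptions.

import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R) -> (I*J x R)."""
    return np.einsum('ir,jr->ijr', A, B).reshape(A.shape[0] * B.shape[0], -1)

def cp_als(X, rank, n_iter=100, seed=0):
    """Rank-`rank` CP factorization of a 3rd-order tensor X by alternating
    least squares. Returns factors A (I x R), B (J x R), C (K x R) with
    X[i, j, k] ~= sum_r A[i, r] * B[j, r] * C[k, r]."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((d, rank)) for d in (I, J, K))

    # Mode-n unfoldings (C-order), matched to the Khatri-Rao products below.
    X0 = X.reshape(I, J * K)                      # ~= A @ khatri_rao(B, C).T
    X1 = np.moveaxis(X, 1, 0).reshape(J, I * K)   # ~= B @ khatri_rao(A, C).T
    X2 = np.moveaxis(X, 2, 0).reshape(K, I * J)   # ~= C @ khatri_rao(A, B).T

    for _ in range(n_iter):
        A = X0 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X1 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X2 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Usage: factorize one tensor sample; the few factor vectors per mode form
# the compact representation on which tensor kernels can then be built.
X = np.random.default_rng(1).standard_normal((20, 25, 30))
A, B, C = cp_als(X, rank=5)

Each sample is thus summarized by a handful of factor vectors per mode, which is what makes kernels between factorized tensors tractable; the limitation noted above is that this approximation itself remains multilinear.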
This paper focuses on developing kernelized tensor factorization with a kernel maximum-margin constraint, referred to as the Kernelized Support Tensor Machine (KSTM). KSTM includes two principal ingredients. First, inspired by (Signoretto et al., 2013), we introduce a general formulation of kernelized tensor factorization in the tensor product reproducing kernel Hilbert space, namely the kernelized Tucker model, which provides a new perspective on understanding the KCP process. Second, we apply the kernel trick to embed the compact representations extracted by KCP into the dual structure-preserving kernels (He et al., 2014) in conjunction with a maximum-margin method to solve the