Tensor Ring Restricted Boltzmann Machines
Maolin Wang, Chenbin Zhang, Yu Pan, Jing Xu and Zenglin Xu
SMILE Lab, School of Computer Science and Engineering
University of Electronic Science and Technology of China
Chengdu, Sichuan, China
Email: {morin.w98, aleczhang13, ypyupan, xujing.may, zenglin}@gmail.com
Abstract—Restricted Boltzmann Machines (RBMs) are important and useful generative models which learn a probability distribution from a set of vector inputs. Despite their success in a number of applications, standard RBMs are designed for vectorized inputs and are incapable of dealing with high-order data, since vectorization of high-order data may cause both the collapse of data modes and explosive parameter growth. To address this issue, we formulate a new tensor-input RBM model, which employs the tensor ring (TR) decomposition structure to naturally represent the high-order relationship between the visible layer and the hidden layer. For convenience, we name the proposed model TR-RBM. In particular, the tensor ring decomposition enjoys many good properties, such as rank stability, leading to better generalization performance compared with other low-rank decomposition methods. Moreover, TR-RBM reduces the complexity of the RBM by reshaping both the visible and hidden layers into tensor forms, leading to a significant drop in parameter size. Experimental results in comparison with the classical RBM and the Matrix-Product-Operator RBM show the promising performance of the proposed method in the tasks of feature extraction and denoising.
Index Terms—tensors, tensor decomposition, tensor ring, Restricted Boltzmann Machines, feature extraction
I. INTRODUCTION
Restricted Boltzmann Machines (RBMs) are generative models which can learn a probability distribution from a set of inputs [1]. Due to their powerful ability to extract features, RBMs have been widely used in speech recognition [2], collaborative filtering [3], network anomaly detection [4], and computer vision [5].
A standard RBM is specifically designed for vector input data and is not efficient for matrices or higher-dimensional arrays, which are very common in many applications. A common workaround is to flatten high-dimensional data into a one-dimensional vector before feeding it to the model. However, this method ignores the relationships between different data modes and may lead to the curse of dimensionality, i.e., an explosion in the number of corresponding parameters [6].
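As a rough illustration of this parameter explosion, the following sketch (with hypothetical layer sizes of our own choosing, not taken from the paper) counts the weights of a dense RBM applied to a flattened multi-mode input:

```python
import numpy as np

# Hypothetical sizes for illustration only.
visible_shape = (64, 64, 3)   # e.g., a small RGB image
hidden_units = 1024

n_visible = int(np.prod(visible_shape))   # 12288 units after flattening
n_weights = n_visible * hidden_units      # entries of the dense weight matrix W
print(f"flattened visible units: {n_visible}")
print(f"dense RBM weights:       {n_weights:,}")  # 12,582,912
```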
A promising way to generalize vector-input models to high-order inputs is to apply tensorized models or low-rank tensor decompositions. Due to the good properties and successful applications of the Tensor Ring Decomposition (TRD) [7]–[9] in convolutional neural networks [8], [10] and recurrent neural networks [9], [11], [12], we propose the Tensor-Ring Restricted Boltzmann Machine (TR-RBM) model. In detail, the weight matrix is reshaped into a high-order tensor and then decomposed using TRD, so that the correlation information among data modes can be maintained and utilized. Powered by the favorable properties of TRD, our model is expected to extract information across data modes more effectively while using fewer parameters.

We thank the anonymous reviewers for their valuable comments, which helped improve the quality of our paper. This work was partially supported by the National Natural Science Foundation of China (Nos. 61572111 and 61876034) and a Fundamental Research Fund for the Central Universities of China (No. ZYGX2016Z003).
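To make the TRD structure concrete, here is a minimal, self-contained sketch (our own illustrative code, not the paper's implementation) that reconstructs a full tensor from tensor-ring cores and compares parameter counts. Each core G[k] has shape (r, n_k, r), and entry T[i1, ..., id] is the trace of the product of the slice matrices G[k][:, i_k, :].

```python
import numpy as np

def tr_reconstruct(cores):
    """Contract tensor-ring cores back into the full tensor."""
    full = cores[0]                       # shape (r, n1, r)
    for core in cores[1:]:
        # contract the shared bond dimension with the next core
        full = np.tensordot(full, core, axes=([-1], [0]))
    # close the ring: trace over the first and last bond indices
    return np.trace(full, axis1=0, axis2=-1)

rank, shape = 3, (8, 8, 8, 8)             # illustrative rank and mode sizes
cores = [np.random.randn(rank, n, rank) for n in shape]
T = tr_reconstruct(cores)
print(T.shape)                             # (8, 8, 8, 8)
n_tr = sum(c.size for c in cores)          # 288 TR parameters
print(n_tr, "TR params vs", int(np.prod(shape)), "full entries")  # 288 vs 4096
```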
This article makes the following contributions:
• We are the first to apply the tensor ring decomposition structure to RBMs. The classical RBM, the matrix-variate RBM (MvRBM), and the tensor-variate RBM (TvRBM) can be regarded as special cases of TR-RBM. Moreover, the computational complexity and flexibility of TR-RBM are better than those of the matrix-product-operator (MPO) RBM.
• The number of parameters of TR-RBM is highly compressed, so the explosion of RBM parameters with the order of the input data can be avoided.
• An alternating optimization algorithm for TR-RBM is designed, and its space and time complexity are provided.
The rest of the paper is organized as follows. Section II presents a review of the related literature and highlights the properties of the Tensor Ring Decomposition. Section III gives some preliminaries on RBMs and tensors. In Section IV, we describe our TR-RBM model, introduce the learning algorithm, and analyze its complexity. In Section V, we conduct a series of experiments to evaluate the performance of the proposed method. Finally, Section VI concludes this paper.
II. RELATED WORKS
Many real-world data are multi-mode, e.g., patient drug responses with four modes (person, medicine, biomarker, time) [13]. Vectorizing multi-mode data is a common approach, but it may lose relational information among modes and thus degrade model performance. In addition, this approach suffers from a large number of parameters as the dimensionality of the input data increases [14]. In contrast, tensor decompositions of higher-mode data can capture higher-order correlations while maintaining fewer parameters. Therefore, research on low-rank tensor structures has attracted great attention.
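As a rough illustration of these savings, the following sketch uses the standard parameter-counting formulas for the low-rank formats reviewed next; the order, mode size, and ranks are hypothetical values of our own choosing.

```python
# Parameter counts for an order-d tensor with uniform mode size n,
# under a uniform rank r for each format (standard formulas).
d, n, r = 4, 10, 5

full   = n ** d                               # dense tensor: 10000 entries
cp     = d * n * r                            # CP factors:   200
tucker = d * n * r + r ** d                   # factors+core: 825
tt     = 2 * n * r + (d - 2) * n * r * r      # TT (boundary ranks 1): 600
tr     = d * n * r * r                        # TR (all bond ranks r): 1000
print(full, cp, tucker, tt, tr)
```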
Standard low-rank tensor structures include the Tucker decomposition [15], CANDECOMP/PARAFAC (CP) [16], [17], pair-wise decomposition [18], [19], InfTucker and its variations [13], [20]–[22], tensor train decomposition (TTD) [23],