• A hybrid learning strategy is designed for the Fuzzy DBN: a series of FRBMs are pre-trained in a layer-wise manner and stacked one above another to form a deep model in which all the parameters are fuzzy numbers. The well pre-trained fuzzy parameters are then defuzzified before the whole deep structure is fine-tuned by the wake-sleep and SGD algorithms, so as to reduce the model complexity.
• The proposed fuzzy DBN can be directly applied to high-dimensional data classification, which is verified by comparison with other deep models on several popular benchmarks: the MNIST, NORB and Scene datasets. We believe this is the first comprehensive study applying a fuzzy deep model to these image datasets, and the experimental results reported here can provide a valuable reference and baseline for future investigations of fuzzy deep models.
Section II provides brief preliminaries on the DBN, FRBM and DFRBM. How to establish a Fuzzy DBN, together with the hybrid learning approach, is described in detail in Section III. Section IV compares the proposed model with several other deep models in three experiments. Section V summarizes the whole work.
II. PRELIMINARIES
A. Establishing Deep Belief Nets with RBMs
DBNs are capable of learning multiple layers of non-linear features from unlabeled data. The high-order features learned by an upper layer are extracted from the hidden units of the layer below, which is realized by training RBMs in a greedy layer-wise fashion with the CD algorithm and stacking them one above another. A generative DBN is able to perform image inpainting and reconstruction.
Suppose we have an N-layer DBN whose input visible vector is $x$ and whose hidden vector in the $l$th layer is $h^l$ ($l = 1, 2, \ldots, N$). The joint probability distribution of the DBN then has the following form [32]

$$P(x, h^1, \ldots, h^N) = \left( \prod_{l=1}^{N-1} P(h^{l-1} \mid h^l) \right) P(h^{N-1}, h^N) \qquad (1)$$

where $x \triangleq h^0$, $P(h^{N-1}, h^N)$ is the joint distribution defined by the top RBM, and $\prod_{l=1}^{N-1} P(h^{l-1} \mid h^l)$ represents the distribution of the directed sigmoid belief network below.
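For instance, with $N = 3$ hidden layers the factorization in Eq. (1) unrolls (a worked instance, not an equation taken from the paper) to

$$P(x, h^1, h^2, h^3) = P(x \mid h^1)\, P(h^1 \mid h^2)\, P(h^2, h^3),$$

i.e., two directed sigmoid connections below and one undirected RBM connecting the top two layers.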
And the conditional probabilities are

$$P(h^{l-1} \mid h^l) = \mathrm{sigm}\left(b^l + h^l W^l\right)$$
$$P(h^l \mid h^{l-1}) = \mathrm{sigm}\left(c^l + h^{l-1} (W^l)^T\right)$$
$$P(h^{N-1}, h^N) = \frac{e^{-E(h^{N-1}, h^N)}}{\sum_{h^{N-1}, h^N} e^{-E(h^{N-1}, h^N)}} \qquad (2)$$
where $E(h^{N-1}, h^N)$ represents the energy function of the RBM on the top, i.e.,

$$E(h^{N-1}, h^N) = -h^{N-1} (W^N)^T (h^N)^T - h^{N-1} (b^N)^T - h^N (c^N)^T$$

and $W^l$ is the weight matrix, while $b^l$ and $c^l$ are the visible bias vector and hidden bias vector of the $l$th RBM, respectively ($l = 1, 2, \ldots, N$).
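To make the notation in Eq. (2) and the energy function concrete, the following NumPy sketch evaluates both for a single RBM, treating $h^{l-1}$ and $h^l$ as row vectors; the layer sizes and random initialization are illustrative assumptions, not values from the paper.

import numpy as np

def sigm(z):
    # logistic sigmoid used in Eq. (2)
    return 1.0 / (1.0 + np.exp(-z))

# illustrative sizes and parameters (assumptions, not taken from the paper)
n_vis, n_hid = 6, 4
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(n_hid, n_vis))   # W^l: h^l W^l lies in the visible space
b = np.zeros(n_vis)                              # visible bias b^l
c = np.zeros(n_hid)                              # hidden bias c^l
h_below = rng.integers(0, 2, size=n_vis).astype(float)   # h^{l-1}
h_above = rng.integers(0, 2, size=n_hid).astype(float)   # h^l

p_down = sigm(b + h_above @ W)      # P(h^{l-1} | h^l) = sigm(b^l + h^l W^l)
p_up   = sigm(c + h_below @ W.T)    # P(h^l | h^{l-1}) = sigm(c^l + h^{l-1} (W^l)^T)

# energy of the top RBM: E = -h^{N-1}(W^N)^T(h^N)^T - h^{N-1}(b^N)^T - h^N(c^N)^T
E = -(h_below @ W.T @ h_above) - h_below @ b - h_above @ c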
Fig. 3: Training the lth RBM by the CD algorithm.
Algorithm 1. Pre-training a DBN in a layer-wise manner
Input: Training data $h^0 = x$; initialization $\theta^l = (W^l, b^l, c^l) = 0$, $l = 1, 2, \ldots, N$; learning rate.
Output: A DBN with N layers.
1: for $l = 1$ to $N$ do
2:   train the $l$th RBM on the data $h^{l-1}$ by the CD method (see Fig. 3);
3:   get the well-learned parameters $W^l$, $b^l$ and $c^l$;
4:   sample $h^l \sim Q(h^l \mid h^{l-1}; \theta^l) = P(h^l \mid h^{l-1}; \theta^l)$ by Eq. (2);
5: end for
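As a rough illustration of Algorithm 1, the sketch below stacks binary RBMs greedily, training each with one step of contrastive divergence (CD-1) on the samples produced by the layer below; the layer sizes, learning rate and epoch count are placeholders rather than settings used in the paper.

import numpy as np

def sigm(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_step(v0, W, b, c, lr, rng):
    # one CD-1 update for a binary RBM (cf. Fig. 3); v0 is a batch of row vectors
    ph0 = sigm(c + v0 @ W.T)                         # up pass: P(h | v0)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigm(b + h0 @ W)                           # down pass: reconstruction
    ph1 = sigm(c + pv1 @ W.T)                        # up pass on the reconstruction
    W += lr * (ph0.T @ v0 - ph1.T @ pv1) / len(v0)   # positive minus negative statistics
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return ph0

def pretrain_dbn(x, layer_sizes, lr=0.1, epochs=5, seed=0):
    # greedy layer-wise pre-training (Algorithm 1): h^0 = x, then train each RBM in turn
    rng = np.random.default_rng(seed)
    params, data = [], x
    n_below = x.shape[1]
    for n_above in layer_sizes:
        W = rng.normal(scale=0.01, size=(n_above, n_below))
        b, c = np.zeros(n_below), np.zeros(n_above)
        for _ in range(epochs):
            ph = cd1_step(data, W, b, c, lr, rng)
        params.append((W, b, c))
        data = (rng.random(ph.shape) < ph).astype(float)   # sample h^l ~ Q(h^l | h^{l-1})
        n_below = n_above
    return params

# toy usage on random binary data (a placeholder, not one of the benchmarks used later)
x = (np.random.default_rng(1).random((32, 20)) > 0.5).astype(float)
dbn_params = pretrain_dbn(x, layer_sizes=[16, 8])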
It is difficult to perform Gibbs sampling with the conditional distributions $P(\cdot)$ in Eq. (2), since they cannot be factorized. Therefore, approximate posteriors denoted by $Q(\cdot)$ are usually used for model inference and sampling, i.e., the distribution of the $l$th RBM is represented by $Q(h^{l-1}, h^l)$ during the pre-training phase, and $Q(h^l \mid h^{l-1})$ is employed to perform bottom-up inference. Note that only the posterior $Q(h^N \mid h^{N-1})$ is identical to the true probability $P(h^N \mid h^{N-1})$ of the top RBM, while the rest of the $Q(\cdot)$ are all approximations.
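Under the same placeholder conventions as the pre-training sketch above, a bottom-up inference pass simply applies $Q(h^l \mid h^{l-1})$ layer by layer:

import numpy as np

def sigm(z):
    return 1.0 / (1.0 + np.exp(-z))

def infer_up(x, params, seed=0):
    # sample h^l ~ Q(h^l | h^{l-1}) for l = 1, ..., N; params is the list of
    # (W^l, b^l, c^l) triples produced by the pre-training sketch
    rng = np.random.default_rng(seed)
    h = x
    for W, b, c in params:
        p = sigm(c + h @ W.T)                 # Q(h^l | h^{l-1}), same form as in Eq. (2)
        h = (rng.random(p.shape) < p).astype(float)
    return h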
B. Learning Deep Belief Nets
Training a DBN usually falls into two stages: layer-wise pre-training and fine-tuning of the whole model. In the pre-training phase, the bottom RBM is trained on the original data $x$; then the values $h^{l-1}$ produced by the hidden units of the $(l-1)$th RBM are used to train the $l$th RBM in the layer above ($l = 1, 2, \ldots, N$). This procedure is repeated to train several RBMs successively and finally results in a deep model, as illustrated in Algorithm 1 [8]. The deep structure can be trained into either a generative or a discriminative model, depending on the type of RBM (with or without label units) chosen for the top layer of the DBN.
1) Generative Model + Supervised Fine-tuning Algorithm: This learning procedure [1] uses a generative RBM on the top, of the same kind as the RBMs used in the lower layers during layer-wise pre-training. The resulting DBN is a generative model that provides an efficient way of performing approximate inference, since only a single bottom-up pass is required to infer the values of the top-level hidden variables.
After layer-wise training in the first phase, the resulting DBN, with an extra layer of label units added to the top RBM, can be fine-tuned by the BP or SGD algorithm; it is then capable of performing classification and recognition. Equivalently, the pre-trained parameters of the DBN are used to initialize a multilayer neural network, which is then trained in a traditional gradient-based manner.
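As a minimal sketch of this second phase, assuming the same placeholder layout as the earlier snippets, the pre-trained weights can initialize an ordinary feed-forward network with an added softmax label layer, which is then fine-tuned by plain SGD on a cross-entropy loss (the generative wake-sleep variant is omitted here):

import numpy as np

def sigm(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def fine_tune(x, y_onehot, params, n_classes, lr=0.1, steps=100, seed=0):
    # initialize an MLP from the pre-trained (W^l, c^l) and fine-tune it with SGD
    rng = np.random.default_rng(seed)
    layers = [(W.T.copy(), c.copy()) for W, b, c in params]       # hidden sigmoid layers
    W_out = rng.normal(scale=0.01, size=(layers[-1][0].shape[1], n_classes))
    b_out = np.zeros(n_classes)
    for _ in range(steps):
        # forward pass through the sigmoid layers and the softmax label layer
        acts = [x]
        for W, b in layers:
            acts.append(sigm(acts[-1] @ W + b))
        probs = softmax(acts[-1] @ W_out + b_out)
        # backward pass: cross-entropy gradient, back-propagated through each layer
        delta_out = (probs - y_onehot) / len(x)
        delta = delta_out @ W_out.T * acts[-1] * (1 - acts[-1])
        W_out -= lr * acts[-1].T @ delta_out
        b_out -= lr * delta_out.sum(axis=0)
        for i in range(len(layers) - 1, -1, -1):
            W, b = layers[i]
            grad_W, grad_b = acts[i].T @ delta, delta.sum(axis=0)
            if i > 0:
                delta = delta @ W.T * acts[i] * (1 - acts[i])
            W -= lr * grad_W
            b -= lr * grad_b
    return layers, (W_out, b_out)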