Selective Ensemble of SVDDs Based on Information
Theoretic Learning
Hong-Jie Xing
College of Mathematics and Information Science
Hebei University
Baoding 071002, Hebei Province, China
hjxing@hbu.edu.cn
Yong-Le Wei
College of Computer Science and Technology
Hebei University
Baoding 071002, Hebei Province, China
lcyd_le@163.com
Abstract—To make the traditional support vector data
description (SVDD) achieve better generalization performance
and become more robust against noise, a selective ensemble
method based on correntropy and Renyi entropy is proposed. In
the proposed ensemble method, the correntropy between the radii
of the base classifiers and the radius of the ensemble is utilized to
substitute for the sum-squared-error (SSE) criterion. The Renyi
entropy of the distances between the training samples and the
center of the ensemble is defined as the diversity measure for the
proposed ensemble. Moreover, an $\ell_1$-norm based
regularization term is introduced into the objective function of
the proposed ensemble to implement the selective ensemble.
Experimental results on synthetic and benchmark data sets show
that the proposed ensemble strategy achieves better performance
than its related approaches.
Keywords—one-class classification; support vector data
description; correntropy; Renyi entropy; selective ensemble
I. INTRODUCTION
As is well-known, one-class classification [1] is regarded as
an important research issue in the field of machine learning.
To date, a large number of one-class classification methods
have been proposed. The two commonly used one-class
classifiers are one-class support vector machine (OCSVM) [2]
and support vector data description (SVDD) [3]. OCSVM first
utilizes certain kernel functions to map the normal data into a
high-dimensional feature space to achieve better separability.
Then, an optimal hyperplane in the feature space can be
obtained to separate the images of normal data and the origin
with the maximum margin. SVDD establishes a hyper-sphere
in the feature space to enclose all the images of normal data.
Testing data are classified as normal if they fall inside the
hyper-sphere, and as novel if they lie outside it. When the
Gaussian kernel function
is used, Tax and Duin proved that SVDD is equivalent to
OCSVM [3].
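The hyper-sphere decision rule described above can be sketched with scikit-learn's `OneClassSVM`, which, as noted in the text, coincides with SVDD when the Gaussian (RBF) kernel is used. The data and the parameter values below are illustrative assumptions, not values from the paper.

```python
# Sketch of one-class classification with an RBF-kernel OCSVM, which is
# equivalent to SVDD under the Gaussian kernel. Data and parameters are
# illustrative assumptions.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))  # "normal" data
X_test = np.array([[0.1, -0.2],   # near the bulk of the normal data
                   [6.0, 6.0]])   # far away, expected to be novel

# nu bounds the fraction of training points left outside the description,
# playing a role similar to the trade-off parameter C in SVDD.
clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(X_train)

pred = clf.predict(X_test)  # +1 = inside (normal), -1 = outside (novel)
print(pred)
```

A test point inside the learned description is labeled +1 (normal); one outside is labeled -1 (novel), mirroring the hyper-sphere rule of SVDD.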
To make one-class classifiers achieve better performance,
Tax and Duin [4] proposed the ensemble of one-class
classifiers. Seguí et al. [5] and Rätsch et al. [6] proposed the
weighted bagging based ensemble of one-class classifiers and
the Boosting based ensemble of one-class classifiers,
respectively. Krawczyk et al. [7] proposed the clustering based
ensemble of one-class classifiers, in which a clustering
algorithm is utilized to split the whole normal class into
disjoint sub-regions. A single one-class classifier is trained on
each sub-region, and the outputs of all the one-class classifiers
are finally combined.
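The clustering-based pipeline just described can be sketched as follows. K-means stands in for the clustering step, one-class SVMs serve as the members, and a max-vote combiner accepts a sample if any member does; the number of clusters, kernel parameters, and combination rule are illustrative assumptions, not the exact choices of Krawczyk et al. [7].

```python
# Sketch of a clustering-based one-class ensemble: split the normal class
# into sub-regions, train one one-class classifier per sub-region, and
# accept a sample if ANY member accepts it. All parameter values are
# illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
# Two well-separated sub-regions of the normal class.
X = np.vstack([rng.normal(0.0, 0.3, size=(100, 2)),
               rng.normal(5.0, 0.3, size=(100, 2))])

k = 2
labels = KMeans(n_clusters=k, n_init=10, random_state=1).fit_predict(X)
members = [OneClassSVM(kernel="rbf", gamma=1.0, nu=0.05).fit(X[labels == j])
           for j in range(k)]

def ensemble_predict(x):
    """Return +1 (normal) if any member accepts x, else -1 (novel)."""
    votes = [m.predict(x.reshape(1, -1))[0] for m in members]
    return 1 if max(votes) == 1 else -1

print(ensemble_predict(np.array([5.1, 4.9])))  # inside one sub-region
print(ensemble_predict(np.array([2.5, 2.5])))  # between the sub-regions
```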
Although an ensemble of classifiers is often superior to one
single classifier, the computational cost for obtaining the
ensemble will become expensive when the number of base
classifiers is large. To overcome the aforementioned
disadvantage, Zhou et al. [8] proposed the selective ensemble
and proved that it is better to use a part of the base classifiers to
construct the ensemble rather than using all of them. However,
the existing one-class classifier ensembles have not considered
the selective ensemble. Moreover, the classification boundary
achieved by the single one-class classifier is not compact
enough. In this paper, we propose a selective ensemble strategy
for SVDD to get the optimal combination weights of base
classifiers. The proposed ensemble is mainly based on
correntropy and Renyi entropy derived from information
theoretic learning [9].
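The selection effect of the $\ell_1$-norm regularization term mentioned in the abstract can be illustrated generically. The sketch below is not the authors' correntropy/Renyi-entropy objective; Lasso regression merely stands in to show why penalizing the combination weights with $\|\mathbf{w}\|_1$ drives many of them exactly to zero, so that only a subset of base classifiers is retained. All names and values are illustrative assumptions.

```python
# Generic illustration (not the paper's objective) of l1-induced selection:
# an l1 penalty on combination weights zeroes out most ensemble members.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n_samples, n_members = 100, 10
# Columns = outputs of 10 hypothetical base classifiers; the target
# depends on only three of them.
F = rng.normal(size=(n_samples, n_members))
target = 0.6 * F[:, 0] + 0.3 * F[:, 3] + 0.1 * F[:, 7]

w = Lasso(alpha=0.05, fit_intercept=False).fit(F, target).coef_
selected = np.flatnonzero(np.abs(w) > 1e-8)
print(selected)  # only a few members keep a nonzero weight
```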
Finally, the experimental results demonstrate that the
proposed ensemble strategy can effectively reduce the number
of base classifiers in the ensemble, and its classification
performance is comparable to or even better than that of the
single SVDD and the other two ensemble approaches.
II. PRELIMINARIES
A. SVDD
SVDD was proposed by Tax and Duin [3]. It finds the
smallest sphere enclosing all the normal data. Given $N$ normal
data samples $\{\mathbf{x}_i\}_{i=1}^{N}$ with $\mathbf{x}_i \in \mathbb{R}^d$, the original optimization problem of
SVDD is given by
$$
\begin{aligned}
\min_{R,\,\mathbf{a},\,\boldsymbol{\xi}} \quad & R^2 + C\sum_{i=1}^{N}\xi_i \\
\text{s.t.} \quad & \|\mathbf{x}_i-\mathbf{a}\|^2 \le R^2 + \xi_i, \quad i=1,2,\ldots,N, \\
& \xi_i \ge 0, \quad i=1,2,\ldots,N,
\end{aligned}
\qquad (1)
$$
where $C$ is the trade-off parameter, $R$ is the radius of the
enclosing sphere, $\xi_i$ is the slack variable, and $\mathbf{a}$ is the center of
the enclosing sphere. The optimization problem (1) can be
solved by the Lagrange multiplier method. Furthermore, we
can obtain the following dual optimization problem with
nonlinear kernels by replacing the inner products in the dual
optimization problem of (1) with kernel functions
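The kernelized dual of (1) can be solved numerically as a small quadratic program: maximize $\sum_i \alpha_i K(\mathbf{x}_i,\mathbf{x}_i) - \sum_{i,j}\alpha_i\alpha_j K(\mathbf{x}_i,\mathbf{x}_j)$ subject to $0 \le \alpha_i \le C$ and $\sum_i \alpha_i = 1$. The sketch below uses a general-purpose SLSQP solver in place of a dedicated QP solver; the Gaussian-kernel width and $C$ are illustrative assumptions.

```python
# Sketch: solving the kernelized SVDD dual numerically. Parameter values
# and the SLSQP solver choice are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

def gaussian_kernel(X, gamma):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X**2, axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 2))
C, gamma = 0.1, 0.5
K = gaussian_kernel(X, gamma)

# Minimize the negated dual objective under the box and simplex constraints.
obj = lambda a: -(a @ np.diag(K) - a @ K @ a)
cons = {"type": "eq", "fun": lambda a: np.sum(a) - 1.0}
bounds = [(0.0, C)] * len(X)
a0 = np.full(len(X), 1.0 / len(X))
alpha = minimize(obj, a0, bounds=bounds, constraints=cons, method="SLSQP").x

# R^2 is evaluated at an unbounded support vector (0 < alpha_k < C).
k = int(np.argmax((alpha > 1e-6) & (alpha < C - 1e-6)))
R2 = K[k, k] - 2 * alpha @ K[:, k] + alpha @ K @ alpha
print(round(float(R2), 3))
```

The recovered multipliers $\alpha_i$ determine the center $\mathbf{a} = \sum_i \alpha_i \phi(\mathbf{x}_i)$ implicitly, and the squared radius follows from the distance of any unbounded support vector to that center.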
2015 4th International Conference on Computer Science and Network Technology (ICCSNT 2015)
978-1-4673-8172-7/15/$31.00 ©2015 IEEE