Least squares kernel ensemble regression in Reproducing Kernel
Hilbert Space
Xiang-Jun Shen a,∗, Yong Dong a, Jian-Ping Gou a, Yong-Zhao Zhan a, Jianping Fan b
a School of Computer Science and Communication Engineering, Jiangsu University, Jiangsu 212013, China
b Department of Computer Science, University of North Carolina at Charlotte, NC 28223, USA
∗ Corresponding author. E-mail address: xjshen@ujs.edu.cn (X.-J. Shen).
Article info
Article history:
Received 21 September 2017
Revised 24 January 2018
Accepted 23 May 2018
Available online 28 May 2018
Communicated by Dr. K. Li
Keywords:
Least squares method
Ensemble regression
Kernel regression
Abstract
Ensemble regression shows better performance than single regression because it combines several single regression methods to improve the accuracy and stability of a single regressor. In this paper, we propose a novel kernel ensemble regression method that minimizes the total least squares loss in multiple Reproducing Kernel Hilbert Spaces (RKHSs). Base kernel regressors are co-optimized and weighted to form an ensemble regressor. In this way, the problem of finding suitable kernel types and parameters for each base kernel regressor is solved within the ensemble regression framework. Experimental results on several datasets, including artificial datasets and UCI regression and classification datasets, show that our proposed approach achieves the lowest regression loss among comparative regression methods such as ridge regression, support vector regression (SVR), gradient boosting, decision tree regression and random forest.
© 2018 Elsevier B.V. All rights reserved.
1. Introduction
In many real-world applications, it is important to predict the value of one feature from other measured features [1]. Regression is one of the most fundamental statistical techniques for such problems [2,3]: it explores the relationship between inputs and outputs in a continuous space from example data. Many methods [4] use different strategies to carry out regression. These strategies fall mainly into two categories: single regression models and ensemble regression models [5,6].
Single regression models can be further categorized into non-kernel and kernel methods. Representative non-kernel methods are ridge regression and lasso regression. For example, Pan et al. [7] proposed a ridge-regression-based approach for image reconstruction in computed tomography, and Liu et al. [8] extracted a plant characteristic gene set based on lasso logistic regression. Kernel methods, such as kernel ridge regression and support vector regression (SVR), are widely used for their strong theoretical and experimental results. For example, Burnaev and Nazarov [9] provided a detailed description of a computationally efficient conformal procedure for kernel ridge regression. Zhang and Liu
[10] presented a new online SVR method called online Laplacian-
regularized SVR (online LapSVR).
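To make the non-kernel versus kernel distinction concrete, the following is a minimal sketch (ours, not from the paper) that fits ridge regression and kernel ridge regression on synthetic nonlinear data with scikit-learn; the data and hyperparameter values are illustrative assumptions only.

```python
# Minimal sketch: a non-kernel model (Ridge) vs. a kernel model
# (KernelRidge in an RBF-induced RKHS) on a nonlinear target.
# All values here are illustrative, not from the paper.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.kernel_ridge import KernelRidge

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + 0.1 * rng.randn(200)   # nonlinear target

linear = Ridge(alpha=1.0).fit(X, y)            # linear in X
rbf = KernelRidge(kernel="rbf", alpha=1.0, gamma=0.5).fit(X, y)  # nonlinear via RKHS

print("ridge R^2:       ", linear.score(X, y))
print("kernel ridge R^2:", rbf.score(X, y))
```

On data like this, the kernel model typically fits the nonlinear relationship far better, which is the advantage of working in an RKHS.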
In the second broad category, ensemble regression models, base regression models are combined to improve the accuracy and stability of a single regressor. Such models have achieved success in many real-world applications through methods such as random forest regression, gradient boosting regression and decision tree regression. For example, Xu et al. [11] proposed a random-forest-based prediction model to analyze the effects of readily available indicators on diabetes. Jiang et al. [12] implemented a gradient boosting tree system in the production cluster of Tencent Inc. Wu et al. [13] investigated the nonlinear relationship between land surface temperature and vegetation abundance using a decision tree regression approach.
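As an illustrative sketch (again ours, not the paper's), the ensemble methods cited above are available in scikit-learn and can be compared against a single decision tree; the synthetic data and model settings below are assumptions for demonstration.

```python
# Minimal sketch comparing the cited ensemble regressors with a single
# decision tree via cross-validation. Settings are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(300, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1]) + 0.1 * rng.randn(300)

for name, model in [("decision tree", DecisionTreeRegressor(max_depth=5)),
                    ("random forest", RandomForestRegressor(n_estimators=100)),
                    ("gradient boosting", GradientBoostingRegressor(n_estimators=100))]:
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name:>17s}: mean R^2 = {scores.mean():.3f}")
```

The ensembles usually show higher and more stable scores than the single tree, which is the accuracy-and-stability gain the paragraph describes.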
Generally, among single regression methods, kernel methods outperform non-kernel methods because they exploit nonlinearity through the Reproducing Kernel Hilbert Space (RKHS) [14,15]. Yukawa [16] proposed an adaptive learning algorithm over multiple RKHSs by applying Cartesian products. Lv et al. [17] presented a new RKHS sparsity-smoothness penalty for nonlinear function cases. Mitra and Bhatia [18] proposed a novel finite dictionary technique in the RKHS. However, the selection of parameters strongly influences the performance of a single kernel regression method. Selecting suitable kernel types and their parameters is therefore a key problem for kernel regression methods.
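The sketch below illustrates this motivation. It is our own construction, not the paper's algorithm: it fits base kernel ridge regressors over several candidate kernel types and parameters, then picks combination weights by ordinary least squares on held-out predictions. The paper's method instead co-optimizes the base regressors and their weights under a total least squares loss in multiple RKHSs.

```python
# Illustrative sketch (not the paper's method): a naive weighted
# ensemble over base kernel regressors with different kernels and
# parameters, sidestepping the single-kernel selection problem.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X).ravel() + 0.1 * rng.randn(300)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# Base regressors over several candidate RKHSs (kernel types/parameters).
bases = [KernelRidge(kernel="rbf", gamma=g, alpha=1.0).fit(X_tr, y_tr)
         for g in (0.1, 1.0, 10.0)]
bases.append(KernelRidge(kernel="polynomial", degree=3, alpha=1.0).fit(X_tr, y_tr))

# Least squares combination weights fitted on held-out predictions.
P = np.column_stack([m.predict(X_val) for m in bases])
w, *_ = np.linalg.lstsq(P, y_val, rcond=None)

ensemble_pred = P @ w
print("weights:", np.round(w, 3))
print("ensemble MSE:", np.mean((ensemble_pred - y_val) ** 2))
```

Because no single kernel choice must be committed to in advance, a weighted ensemble of this kind can remain competitive even when some base kernels are poorly parameterized; the proposed method pushes this further by optimizing base regressors and weights jointly.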