Boosting Algorithms from a Statistical Perspective: Regularization, Prediction and Model Fitting

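"Boosting Algorithms: Regularization, Prediction and Model Fitting" examines boosting from a statistical perspective, focusing on the estimation of potentially complex parametric or nonparametric models, including generalized linear models, additive models, and regression models for survival analysis. The authors, Peter Bühlmann and Torsten Hothorn, are affiliated with ETH Zürich and Universität Erlangen-Nürnberg, respectively. Boosting algorithms such as AdaBoost (Adaptive Boosting) were originally proposed by Freund and Schapire as classification algorithms that combine weak learners into a strong learner and markedly improve classification performance. The article points out that boosting is not limited to classification; it also applies to regression and prediction problems.

In high-dimensional feature spaces, boosting is valuable for regularization and variable selection because it helps control overfitting. The article discusses degrees of freedom, an important tool for assessing model complexity and for regularization. Degrees of freedom are closely tied to the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which are commonly used in model selection to balance model complexity against predictive ability. With high-dimensional data, these criteria help keep model complexity in check and prevent overfitting.

`mboost` is an open-source software package developed for this purpose; it implements functions for model fitting, prediction and variable selection. Its flexibility lets users define their own loss functions and implement new boosting algorithms to suit a variety of learning tasks. The article also covers practical aspects of boosting, including how the model is optimized iteratively, how to tune the learning rate and the number of iterations for best performance, and how the properties of boosting can be used to handle imbalanced data sets. Applications to real data sets demonstrate the effectiveness and practicality of boosting. In summary, the article offers a comprehensive statistical view of the role of boosting in model fitting, prediction and regularization, emphasizes its advantages in high-dimensional settings, and demonstrates practical usage through the `mboost` package, making it a useful reference for understanding and applying boosting in machine learning and artificial intelligence.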

BOOSTING ALGORITHMS AND MODEL FITTING 9

The population minimizer can be shown to be (cf. [33])

$$f^{*}_{\text{log-lik}}(x) = \frac{1}{2} \log\left(\frac{p(x)}{1 - p(x)}\right), \quad p(x) = P[Y = 1 \mid X = x].$$

The loss function in (3.1) is a function of $\tilde{y} f$, the so-called margin value, where the function $f$ induces the following classifier for $Y$:

$$C(x) = \begin{cases} 1 & \text{if } f(x) > 0, \\ 0 & \text{if } f(x) < 0, \\ \text{undetermined} & \text{if } f(x) = 0. \end{cases}$$

Therefore, a misclassification (including the undetermined case) happens if and only if $\tilde{Y} f(X) \le 0$. Hence, the misclassification loss is

$$\rho_{0\text{-}1}(y, f) = I_{\{\tilde{y} f \le 0\}}, \tag{3.2}$$
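The induced classifier and the 0-1 loss in (3.2) can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper or from mboost: the helper names are hypothetical, and the label coding $\tilde{y} = 2y - 1$ for $y \in \{0, 1\}$ is an assumption, since this excerpt does not restate how $\tilde{y}$ is defined.

```python
import numpy as np

def to_margin_label(y):
    """Map labels y in {0, 1} to y_tilde in {-1, +1} (assumed coding: 2y - 1)."""
    return 2 * np.asarray(y) - 1

def classify(f_values):
    """Classifier induced by f: 1 if f > 0, 0 if f < 0, NaN if f == 0 (undetermined)."""
    f_values = np.asarray(f_values, dtype=float)
    return np.where(f_values > 0, 1.0, np.where(f_values < 0, 0.0, np.nan))

def zero_one_loss(y, f_values):
    """0-1 loss I{y_tilde * f <= 0}; the undetermined case f == 0 counts as an error."""
    margin = to_margin_label(y) * np.asarray(f_values, dtype=float)
    return (margin <= 0).astype(int)

y = np.array([1, 0, 1, 0, 1])
f = np.array([2.0, -0.5, -1.0, 0.3, 0.0])

pred = classify(f)          # [1., 0., 0., 1., nan]
loss = zero_one_loss(y, f)  # [0, 0, 1, 1, 1]
```

The last observation has $f(x) = 0$, so its margin is zero and it is counted as misclassified, matching the "including the undetermined case" remark.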

whose population minimizer is equivalent to the Bayes classifier (for $\tilde{Y} \in \{-1, +1\}$)

$$f^{*}_{0\text{-}1}(x) = \begin{cases} +1 & \text{if } p(x) > 1/2, \\ -1 & \text{if } p(x) \le 1/2, \end{cases}$$
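As a small illustration of why this rule minimizes the expected 0-1 loss pointwise: predicting +1 at $x$ is wrong with probability $1 - p(x)$, and predicting $-1$ is wrong with probability $p(x)$, so the rule picks the label with the smaller error probability. The function names below are hypothetical and the sketch only mirrors the case distinction above.

```python
def bayes_classifier(p):
    """Return +1 if p = P[Y = 1 | X = x] exceeds 1/2, otherwise -1."""
    return 1 if p > 0.5 else -1

def pointwise_risk(prediction, p):
    """Expected 0-1 loss at x when predicting `prediction` in {-1, +1}."""
    return 1.0 - p if prediction == 1 else p

# Risk of the Bayes rule versus the opposite prediction for a few values of p(x).
comparison = {
    p: (pointwise_risk(bayes_classifier(p), p), pointwise_risk(-bayes_classifier(p), p))
    for p in (0.1, 0.5, 0.9)
}
```

For every value of `p`, the first entry of the pair is no larger than the second, and it equals `min(p, 1 - p)`.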

where $p(x) = P[Y = 1 \mid X = x]$. Note that the 0-1 loss in (3.2) cannot be used for boosting or FGD (functional gradient descent): it is non-differentiable and also non-convex as a function of the margin value $\tilde{y} f$. The negative log-likelihood loss in (3.1) can be viewed as a convex upper approximation of the (computationally intractable) non-convex 0-1 loss; see Figure 1. We will describe in Section 3.3 the BinomialBoosting algorithm (similar to LogitBoost [33]), which uses the negative log-likelihood as loss function (i.e., the surrogate loss that the algorithm actually implements).
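The "convex upper approximation" claim can be checked numerically. The sketch below assumes that the loss in (3.1), which is not reproduced in this excerpt, has the form $\log_2(1 + \exp(-2\tilde{y} f))$, scaled so that it equals 1 at margin zero; if (3.1) is defined differently, the function `neg_log_lik` would need to be adjusted. All names are illustrative.

```python
import numpy as np

def zero_one(margin):
    """0-1 loss as a function of the margin y_tilde * f."""
    return (margin <= 0).astype(float)

def neg_log_lik(margin):
    """Assumed form of (3.1): log2(1 + exp(-2 * margin))."""
    return np.log2(1.0 + np.exp(-2.0 * margin))

# Evenly spaced margins in [-3, 3], including exactly 0.
margins = np.arange(-300, 301) / 100.0
surrogate = neg_log_lik(margins)

# Upper bound: the surrogate lies on or above the 0-1 loss everywhere on the grid.
upper_bound_holds = bool(np.all(surrogate >= zero_one(margins)))

# Convexity: second differences on an even grid are non-negative (up to rounding).
is_convex_on_grid = bool(np.all(np.diff(surrogate, n=2) >= -1e-12))
```

Both flags evaluate to `True` on this grid. The 0-1 loss itself fails the second-difference check at the jump at margin zero, which is the non-convexity mentioned in the text.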

Another convex upper approximation of the 0-1 loss function in (3.2) is the exponential loss

$$\rho_{\exp}(y, f) = \exp(-\tilde{y} f), \tag{3.3}$$

implemented (with notation $y \in \{-1, +1\}$) in mboost as the AdaExp() family. The population minimizer can be shown to be the same as for the log-likelihood loss (cf. [33]):

$$f^{*}_{\exp}(x) = \frac{1}{2} \log\left(\frac{p(x)}{1 - p(x)}\right), \quad p(x) = P[Y = 1 \mid X = x].$$
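The minimizer of the expected exponential loss can be verified pointwise. At a fixed $x$ the expected loss is $p\,e^{-f} + (1 - p)\,e^{f}$; setting its derivative to zero gives $e^{2f} = p / (1 - p)$, that is, half the log-odds. The following sketch compares this closed form with a brute-force grid search. The function names and the grid are illustrative choices, not part of mboost.

```python
import numpy as np

# Candidate values of f(x); the grid spacing is 5e-5.
GRID = np.linspace(-5.0, 5.0, 200001)

def expected_exp_loss(f, p):
    """E[exp(-Y_tilde * f) | X = x] with P[Y_tilde = +1 | X = x] = p."""
    return p * np.exp(-f) + (1.0 - p) * np.exp(f)

def numeric_minimizer(p):
    """Grid-search minimizer of the pointwise expected exponential loss."""
    return float(GRID[np.argmin(expected_exp_loss(GRID, p))])

def half_log_odds(p):
    """Closed-form minimizer: 0.5 * log(p / (1 - p))."""
    return 0.5 * float(np.log(p / (1.0 - p)))

results = {p: (numeric_minimizer(p), half_log_odds(p)) for p in (0.2, 0.5, 0.9)}
```

For each `p`, the two entries of `results[p]` agree up to the grid resolution.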

imsart-sts ver. 2005/10/19 file: BuehlmannHothorn_Boosting.tex date: June 4, 2007
