Performance Comparison of 179 Classifiers: The Best Choice for Solving Real-World Problems

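This document examines whether real-world classification tasks really require so many different types of classifiers. The authors, Manuel Fernández-Delgado, Eva Cernadas, Senén Barro and Dinani Amorim, compare 179 classifiers drawn from 17 families: discriminant analysis, Bayesian methods, neural networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging, stacking, random forests and other ensembles, generalized linear models, nearest neighbors (such as k-nearest neighbors), partial least squares and principal component regression, logistic and multinomial regression, multivariate adaptive regression splines, and other methods.

The comparison runs along two axes. Across families, the authors contrast different types of classifiers and observe how each performs on different data sets. Within a family, they compare different implementations and parameter settings of the same type of classifier. The goal is to determine whether one classifier, or a small number of them, adapts well to a wide range of scenarios and delivers the best classification performance in most cases.

With this large-scale study, the paper sets out to answer the question of whether solving real-world classification problems really requires hundreds of classifiers. The results may guide model selection in practice, helping data scientists and engineers choose a suitable classifier for the characteristics and requirements of a given data set, avoid unnecessarily complex models, and work more efficiently. The paper may also cover details of the experimental design, such as the choice of data sets, the evaluation metrics, and the model tuning procedure, all of which are key to interpreting classifier performance. Readers can learn how to weigh model complexity, accuracy, and computational cost when making decisions in real projects. The paper is an important reference for understanding and evaluating how neural networks and other machine learning classifiers perform on practical tasks.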

Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?

Discriminant analysis (DA): 20 classifiers.

1. lda_R, linear discriminant analysis, with the function lda in the MASS package.

2. lda2_t, from the MASS package, which develops LDA tuning the number of components to retain up to #classes − 1.

3. rrlda_R, robust regularized LDA, from the rrlda package, tunes the parameters lambda (which controls the sparseness of the covariance matrix estimation) and alpha (robustness, it controls the number of outliers) with values {0.1, 0.01, 0.001} and {0.5, 0.75, 1.0} respectively.

4. sda_t, shrinkage discriminant analysis and CAT score variable selection (Ahdesmäki and Strimmer, 2010) from the sda package. It performs LDA or diagonal discriminant analysis (DDA) with variable selection using CAT (Correlation-Adjusted T) scores. The best classifier (LDA or DDA) is selected. The James-Stein method is used for shrinkage estimation.

5. slda_t with function slda from the ipred package, which develops LDA based on left-spherically distributed linear scores (Glimm et al., 1998).

6. stepLDA_t uses the function train in the caret package as interface to the function stepclass in the klaR package with method=lda. It develops classification by means of forward/backward feature selection, without upper bounds in the number of features.

7. sddaLDA_R, stepwise diagonal discriminant analysis, with function sdda in the SDDA package with method=lda. It creates a diagonal discriminant rule adding one input at a time using a forward stepwise strategy and LDA.

8. PenalizedLDA_t from the penalizedLDA package: it solves the high-dimensional discriminant problem using a diagonal covariance matrix and penalizing the discriminant vectors with lasso or fused lasso penalties (Witten and Tibshirani, 2011). The lasso penalty parameter (lambda) is tuned with values {0.1, 0.0031, 10^−4}.

9. sparseLDA_R, with function sda in the sparseLDA package, minimizing the SDA criterion using an alternating method (Clemensen et al., 2011). The parameter lambda is tuned with the value 0 and the powers of ten {10^i}, with exponents up to −1. The number of components is tuned from 2 to #classes − 1.

10. qda_t, quadratic discriminant analysis (Venables and Ripley, 2002), with function qda in the MASS package.

11. QdaCov_t in the rrcov package, which develops Robust QDA (Todorov and Filzmoser, 2009).

12. sddaQDA_R uses the function sdda in the SDDA package with method=qda.

13. stepQDA_t uses function stepclass in the klaR package with method=qda, forward/backward variable selection (parameter direction=both) and without limit in the number of selected variables (maxvar=Inf).

14. fda_R, flexible discriminant analysis (Hastie et al., 1993), with function fda in the mda package and the default linear regression method.

15. fda_t is the same FDA, also with linear regression but tuning the parameter nprune with values 2:3:15 (5 values).

16. mda_R, mixture discriminant analysis (Hastie and Tibshirani, 1996), with function mda in the mda package.

17. mda_t uses the caret package as interface to function mda, tuning the parameter subclasses between 2 and 11.

18. pda_t, penalized discriminant analysis, uses the function gen.ridge in the mda package, which develops PDA tuning the shrinkage penalty coefficient lambda with values from 1 to 10.

19. rda_R, regularized discriminant analysis (Friedman, 1989), uses the function rda in the klaR package. This method uses regularized group covariance matrices to avoid the problems in LDA derived from collinearity in the data. The parameters lambda and gamma (used in the calculation of the robust covariance matrices) are tuned with values 0:0.25:1.

20. hdda_R, high-dimensional discriminant analysis (Bergé et al., 2012), assumes that each class lives in a different Gaussian subspace much smaller than the input space, calculating the subspace parameters in order to classify the test patterns. It uses the hdda function in the HDclassif package, selecting the best of the 14 available models.
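
Several entries above give their tuning values in start:step:stop notation or as explicit sets. The following Python sketch expands three of them into concrete candidate lists. It is an illustration only: the helper name seq is hypothetical, and pairing the two parameters of rrlda_R and rda_R as a full Cartesian grid is an assumption, since the list does not state how the values are combined.

```python
from itertools import product


def seq(start, step, stop):
    """Expand start:step:stop into a list, stopping at the last value <= stop."""
    count = int((stop - start) / step + 1e-9) + 1
    return [round(start + i * step, 10) for i in range(count)]


# rrlda_R: lambda in {0.1, 0.01, 0.001} and alpha in {0.5, 0.75, 1.0}.
# Assumption: every lambda is paired with every alpha.
rrlda_grid = list(product([0.1, 0.01, 0.001], [0.5, 0.75, 1.0]))

# fda_t: nprune = 2:3:15, which the text says yields 5 values.
nprune_values = seq(2, 3, 15)

# rda_R: lambda and gamma both take the values 0:0.25:1.
# Assumption: the two parameters are tuned jointly on a full grid.
rda_axis = seq(0, 0.25, 1)
rda_grid = list(product(rda_axis, rda_axis))

print(nprune_values)
print(len(rrlda_grid), len(rda_grid))
```

Under these assumptions the nprune range expands to 2, 5, 8, 11 and 14, which matches the "5 values" stated for fda_t.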

Bayesian (BY) approaches: 6 classifiers.

21. naiveBayes_R uses the function NaiveBayes in the klaR package, with Gaussian kernel, bandwidth 1 and Laplace correction 2.

22. vbmpRadial_t, variational Bayesian multinomial probit regression with Gaussian process priors (Girolami and Rogers, 2006), uses the function vbmp from the vbmp package, which fits a multinomial probit regression model with radial basis function kernel and covariance parameters estimated from the training patterns.

23. NaiveBayes_w (John and Langley, 1995) uses estimator precision values chosen from the analysis of the training data.

24. NaiveBayesUpdateable_w uses estimator precision values updated iteratively using the training patterns and starting from scratch.

25. BayesNet_w is an ensemble of Bayes classifiers. It uses the K2 search method, which develops hill climbing restricted by the input order, using one parent and scores of type Bayes. It also uses the simpleEstimator method, which uses the training patterns to estimate the conditional probability tables in a Bayesian network once it has been learnt, with α = 0.5 (initial count).

26. NaiveBayesSimple_w is a simple naive Bayes classifier (Duda et al., 2001) which uses a normal distribution to model numeric features.
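
The last entry describes naive Bayes with a normal distribution for each numeric feature. The general idea can be sketched in plain Python as below. This is a minimal illustration, not the Weka implementation: the function names, the toy data, and the small variance floor of 1e-9 are all assumptions made for the example.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the prior and the per-feature mean and variance of each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Population variance plus a small floor (assumed value) to avoid
        # dividing by zero when a feature is constant within a class.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for the pattern x."""
    def log_posterior(prior, means, variances):
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (xi - m) ** 2 / (2 * var)
            for xi, m, var in zip(x, means, variances)
        )
        return math.log(prior) + log_likelihood
    return max(model, key=lambda label: log_posterior(*model[label]))


# Toy data: two well-separated classes with two numeric features each.
X = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 4.1], [3.2, 3.9], [2.9, 4.0]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(X, y)
pred_near_a = predict_gaussian_nb(model, [1.1, 2.0])
pred_near_b = predict_gaussian_nb(model, [3.1, 4.0])
print(pred_near_a, pred_near_b)
```

Working in log space keeps the product of many small densities from underflowing, which is the usual reason for summing log likelihoods rather than multiplying probabilities.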
