sets have been sampled; and apply an error estimation scheme based on resampling
the data, typically cross-validation. With regard to the third step, we are given no
characterization of the accuracy of the error estimator, nor any reason why it should
provide a reasonably good estimate. Most strikingly, as we show in this book, we can expect
it to be inaccurate in small-sample cases. Nevertheless, the claim is made that the
proposed algorithm has been “validated.” Very little is said about the accuracy of the
error estimation step, except perhaps that cross-validation is close to being unbiased
if not too many points are held out. But this kind of comment is misleading, because
a small bias offers little comfort if the variance is large, which it usually
is for small samples and large feature sets. In addition, the classical cross-validation
unbiasedness theorem holds if sampling is random over the mixture of the populations.
In situations where this is not the case, for example, when the populations are sampled
separately, bias is introduced, as shown in Chapter 5. These kinds of problems
are especially detrimental in the current era of high-throughput measurement devices,
for which it is now commonplace to be confronted with tens of thousands of features
and very small sample sizes.
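To make this variance issue concrete, here is a minimal simulation sketch, not taken from the book, assuming Python with NumPy and scikit-learn; the sample size, feature count, classification rule, and class separation are illustrative choices. It repeatedly draws small high-dimensional samples from the same pair of populations and measures how much the 5-fold cross-validation error estimate varies from one data set to the next.

    # Illustrative sketch (assumed setup, not from the book): variance of the
    # cross-validation error estimate when the sample size n is small and the
    # number of features d is large.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n, d = 30, 1000                      # small sample, many features
    y = np.repeat([0, 1], n // 2)        # balanced class labels

    cv_errors = []
    for _ in range(200):                 # 200 independently drawn data sets
        X = rng.normal(size=(n, d))      # pure-noise features...
        X[:, 0] += 0.8 * y               # ...except one informative feature
        acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5)
        cv_errors.append(1.0 - acc.mean())   # 5-fold CV error estimate

    cv_errors = np.asarray(cv_errors)
    print(f"mean CV error estimate: {cv_errors.mean():.3f}")
    print(f"std across data sets:   {cv_errors.std():.3f}")

In such settings the estimates are typically spread over a wide range, so an estimator that is nearly unbiased on average can still be far off on any particular small sample.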
The subject of error estimation has in fact a long history and has produced a large
body of literature; four main review papers summarize the major advances in the
field up to 2000 (Hand, 1986; McLachlan, 1987; Schiavo and Hand, 2000; Toussaint,
1974); recent advances in error estimation since 2000 include work on model selection
(Bartlett et al., 2002), bolstering (Braga-Neto and Dougherty, 2004a; Sima et al.,
2005b), feature selection (Hanczar et al., 2007; Sima et al., 2005a; Xiao et al.,
2007; Zhou and Mao, 2006), confidence intervals (Kaariainen, 2005; Kaariainen and
Langford, 2005; Xu et al., 2006), model-based second-order properties (Zollanvari
et al., 2011, 2012), and Bayesian error estimators (Dalton and Dougherty, 2011b,c).
This book covers the classical studies as well as the recent developments. It discusses
in detail nonparametric approaches, but gives special consideration, especially in the
latter part of the book, to parametric, model-based approaches.
Pattern recognition plays a key role in many disciplines, including engineer-
ing, physics, statistics, computer science, social science, manufacturing, materials,
medicine, biology, and more, so this book will be useful for researchers and prac-
titioners in all these areas. It can serve as a text at the graduate level, can
be used as a supplement for general courses on pattern recognition and machine
learning, or can serve as a reference for researchers across all technical disciplines
where classification plays a major role, which may in fact be all technical disciplines.
The book is organized into eight chapters. Chapters 1 and 2 provide the foundation
for the rest of the book and must be read first. Chapters 3, 4, and 8 stand on their own
and can be studied separately. Chapter 5 provides the foundation for Chapters 6 and
7, so these chapters should be read in this sequence. For example, chapter sequences
1-2-3-4, 1-2-5-6-7, and 1-2-8 are all possible ways of reading the book. Naturally,
the book is best read from beginning to end. Short descriptions of each chapter are
provided next.
Chapter 1. Classification. To make the book self-contained, the first chapter
covers basic topics in classification required for the remainder of the text: classifiers,
population-based and sample-based discriminants, and classification rules. It defines a