4. if b is not a final state in the game, then V(b) = V(b'), where b' is the best
final board state that can be achieved starting from b and playing optimally
until the end of the game (assuming the opponent plays optimally, as well).
While this recursive definition specifies a value of V(b) for every board
state b, this definition is not usable by our checkers player because it is not
efficiently computable. Except for the trivial cases (cases 1-3) in which the game
has already ended, determining the value of V(b) for a particular board state
requires (case 4) searching ahead for the optimal line of play, all the way to
the end of the game! Because this definition is not efficiently computable by our
checkers playing program, we say that it is a nonoperational definition. The goal
of learning in this case is to discover an operational description of V; that is, a
description that can be used by the checkers-playing program to evaluate states
and select moves within realistic time bounds.
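To make the difficulty concrete, here is a minimal sketch (not from the text) of what a direct implementation of the recursive definition above would look like. The helpers is_final, final_value, and legal_successors are hypothetical placeholders for the rules of checkers; the is_final test covers cases 1-3 of the definition.

```python
# A sketch of the nonoperational definition of V: exhaustive search to the
# end of the game.  is_final(b), final_value(b), and legal_successors(b) are
# assumed helpers encoding the rules of checkers (not defined in the text).

def V(b, our_move=True):
    """Value of board state b under optimal play by both sides."""
    if is_final(b):                       # cases 1-3: the game is already over
        return final_value(b)             # e.g., a fixed win/loss/draw value
    values = [V(s, not our_move) for s in legal_successors(b)]
    # Case 4: our player picks the best reachable value; the opponent,
    # also playing optimally, picks the worst one for us.
    return max(values) if our_move else min(values)
```

Every call expands the entire game tree below b, so the cost grows exponentially with the number of moves remaining, which is precisely why this definition is nonoperational.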
Thus, we have reduced the learning task in this case to the problem of
discovering an operational description of the ideal target function V. It may be
very difficult in general to learn such an operational form of V perfectly. In fact,
we often expect learning algorithms to acquire only some approximation to the
target function, and for this reason the process of learning the target function
is often called function approximation. In the current discussion we will use the
symbol V̂ to refer to the function that is actually learned by our program, to
distinguish it from the ideal target function V.
1.2.3 Choosing a Representation for the Target Function
Now that we have specified the ideal target function V, we must choose a
representation that the learning program will use to describe the function V̂ that
it will learn. As with earlier design choices, we again have many options. We could,
for example, allow the program to represent V̂ using a large table with a distinct
entry specifying the value for each distinct board state. Or we could allow it to
represent V̂ using a collection of rules that match against features of the board
state, or a quadratic polynomial function of predefined board features, or an arti-
ficial neural network. In general, this choice of representation involves a crucial
tradeoff. On one hand, we wish to pick a very expressive representation to allow
representing as close an approximation as possible to the ideal target function V.
On the other hand, the more expressive the representation, the more training data
the program will require in order to choose among the alternative hypotheses it
can represent. To keep the discussion brief, let us choose a simple representation:
for any given board state, the function V̂ will be calculated as a linear combination
of the following board features (a brief sketch of this computation follows the list):
• x1: the number of black pieces on the board
• x2: the number of red pieces on the board
• x3: the number of black kings on the board
• x4: the number of red kings on the board
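As a purely illustrative sketch, the linear representation over these four features can be computed as below. The count_* feature extractors and the weight vector w are hypothetical placeholders; choosing values for the weights is exactly the job left to the learning algorithm.

```python
# Linear evaluation V̂(b) = w0 + w1*x1 + w2*x2 + w3*x3 + w4*x4, using the
# four board features listed above.  The count_* functions are assumed
# helpers that inspect the board state; w is the weight vector to be learned.

def v_hat(b, w):
    """Approximate value of board state b under weights w = [w0, ..., w4]."""
    x = [
        count_black_pieces(b),   # x1
        count_red_pieces(b),     # x2
        count_black_kings(b),    # x3
        count_red_kings(b),      # x4
    ]
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
```

With this choice of representation, learning V̂ reduces to choosing numeric values for the weights w0 through w4.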