Engineer's Guide: A Brief Introduction to Machine Learning Fundamentals and Applications


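"A Brief Introduction to Machine Learning for Engineers" (2018) is an accessible monograph written by Osvaldo Simeone, who works in the computer science department at King's College London (email: osvaldo.simeone@kcl.ac.uk). It aims to give engineering professionals an understanding of the foundations of machine learning and guidance on how to apply it.

Part 1, "Basics", covers the following key topics:
1. Introduction to machine learning: explains what machine learning is, how it uses data and statistical models to let systems improve autonomously, and when machine learning techniques are appropriate.
2. Goals and outline: states the learning goals, such as understanding the basic concepts of supervised, unsupervised, and reinforcement learning, and lays out the main content of the later chapters.

Part 2, "A Gentle Introduction through Linear Regression", focuses on supervised learning, including:
- Inference: how to predict outputs from input data, such as house prices or stock prices.
- Frequentist approach: statistical analysis based on large amounts of data, such as least squares applied to regression.
- Bayesian approach: learning from the viewpoint of probability theory, emphasizing the role of prior knowledge.
- Minimum description length (MDL): a concept from information theory used to measure model complexity and data compression.
- Information-theoretic metrics: such as entropy and cross-entropy, which play an important role in evaluating model performance.
- Interpretation and causality: discusses how model interpretability relates to the exploration of causal relationships, and the trade-off between the two.

Part 3, "Learning Probabilistic Models", discusses:
- Basics of probability theory: preparation for the models that follow.
- Exponential family models: a widely used family of models that includes the normal distribution, among others.
- Frequentist learning and Bayesian learning: a comparison of the two paradigms, with the frequentist school focusing on parameter estimation and the Bayesian school on posterior probabilities.
- Generalized linear models (GLM) in supervised learning: shows how these theories are realized in practical problems.
- Maximum entropy principle: emphasizes choosing the simplest model that remains accurate.
- Energy-based models: a type of non-parametric model that uses an energy function to describe the data distribution.
- Advanced topics: including deep learning, neural networks, and ensemble learning, showing recent progress in machine learning.
- Summary: reviews and recaps the core concepts of the part.

Part 4, "Supervised Learning", looks more closely at classification tasks, including a first treatment of stochastic gradient descent and the basics of other classification algorithms. Overall, the monograph offers engineers machine learning knowledge from basic to advanced level, helping them understand this key technology and apply it in engineering practice.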

1.1 What is Machine Learning?

The conventional engineering design flow starts with an in-depth analysis of the problem domain, which culminates with the definition of a mathematical model. The mathematical model is meant to capture the key features of the problem under study, and is typically the result of the work of a number of experts. The mathematical model is finally leveraged to derive hand-crafted solutions to the problem.

For instance, consider the problem of defining a chemical process to produce a given molecule. The conventional flow requires chemists to leverage their knowledge of models that predict the outcome of individual chemical reactions, in order to craft a sequence of suitable steps that synthesize the desired molecule. Another example is the design of speech translation or image/video compression algorithms. Both of these tasks involve the definition of models and algorithms by teams of experts, such as linguists, psychologists, and signal processing practitioners, not infrequently during the course of long standardization meetings.

The engineering design flow outlined above may be too costly and inefficient for problems in which faster or less expensive solutions are desirable. The machine learning alternative is to collect large data sets, e.g., of labelled speech, images or videos, and to use this information to train general-purpose learning machines to carry out the desired task. While the standard engineering flow relies on domain knowledge and on design optimized for the problem at hand, machine learning lets large amounts of data dictate algorithms and solutions. To this end, rather than requiring a precise model of the set-up under study, machine learning requires the specification of an objective, of a model to be trained, and of an optimization technique.
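
The three ingredients named above (an objective, a model to be trained, and an optimization technique) can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than something taken from the monograph: the model is a straight line, the objective is the mean squared error, the optimizer is plain gradient descent, and the data are synthetic.

```python
# Three ingredients of a learning problem in a toy one-dimensional setting.
# All names, step sizes and data below are illustrative assumptions.

def model(x, w, b):
    """Model to be trained: a line with slope w and intercept b."""
    return w * x + b

def objective(data, w, b):
    """Objective: mean squared error over the data set."""
    return sum((model(x, w, b) - t) ** 2 for x, t in data) / len(data)

def train(data, lr=0.05, steps=2000):
    """Optimization technique: plain gradient descent on the objective."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (model(x, w, b) - t) * x for x, t in data) / n
        grad_b = sum(2 * (model(x, w, b) - t) for x, t in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic, noise-free data generated from t = 2x + 1 (made up for the example).
data = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]
w, b = train(data)
print(w, b, objective(data, w, b))
```

Note that no model of how the data were produced is written into the code: only the data, the objective, the trainable model, and the optimizer are specified.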

Returning to the first example above, a machine learning approach would proceed by training a general-purpose machine to predict the outcome of known chemical reactions based on a large data set, and by then using the trained algorithm to explore ways to produce more complex molecules. In a similar manner, large data sets of images or videos would be used to train a general-purpose algorithm with the aim of obtaining compressed representations from which the original input can be recovered with some distortion.
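
As a toy illustration of learning a compressed representation from data, the sketch below compresses two-dimensional points to a single number each and then reconstructs them with some distortion. The technique is principal component analysis in two dimensions, which is swapped in here for illustration and is not named in the passage above; the function names and the data are hypothetical.

```python
import math

def fit_direction(points):
    """Learn the mean and the principal direction of 2-D data."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Angle of the leading eigenvector of the 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mx, my), (math.cos(angle), math.sin(angle))

def encode(p, mean, direction):
    """Compress a 2-D point to one number: its coordinate along the direction."""
    return (p[0] - mean[0]) * direction[0] + (p[1] - mean[1]) * direction[1]

def decode(z, mean, direction):
    """Reconstruct an approximate 2-D point from the compressed value."""
    return (mean[0] + z * direction[0], mean[1] + z * direction[1])

def distortion(points, mean, direction):
    """Mean squared reconstruction error over the data set."""
    total = 0.0
    for p in points:
        q = decode(encode(p, mean, direction), mean, direction)
        total += (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return total / len(points)

# Made-up points lying close to the line y = x.
points = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]
mean, direction = fit_direction(points)
print(distortion(points, mean, direction))
```

The encoder and decoder are not designed by hand for these points; they are determined by statistics computed from the data set.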

1.2 When to Use Machine Learning?

1.2.1 Learning Tasks

We can distinguish among three different main types of machine learning problems, which are briefly introduced below. The discussion reflects the focus of this monograph on parametric probabilistic models, as further elaborated on in the next section.

1. Supervised learning: We have N labelled training examples $D=\{(x_n, t_n)\}_{n=1}^{N}$, where $x_n$ represents a covariate, or explanatory variable, while $t_n$ is the corresponding label, or response. For instance, variable $x_n$ may represent the text of an email, while the label $t_n$ may be a binary variable indicating whether the email is spam or not. The goal of supervised learning is to predict the value of the label t for an input x that is not in the training set. In other words, supervised learning aims at generalizing the observations in the data set D to new inputs. For example, an algorithm trained on a set of emails should be able to classify a new email not present in the data set D.

We can generally distinguish between classification problems, in which the label t is discrete, as in the example above, and regression problems, in which variable t is continuous. An example of a regression task is the prediction of tomorrow's temperature t based on today's meteorological observations x.
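
The difference between the two problem types lies only in the nature of the label, which the following sketch tries to show by applying the same predictor to both. The predictor is a one-nearest-neighbour rule, chosen here purely for brevity and not taken from the monograph; the scalar features and all numbers are invented.

```python
def nearest_neighbour(train, x_new):
    """Return the label of the training example whose input is closest to x_new."""
    x_best, t_best = min(train, key=lambda pair: abs(pair[0] - x_new))
    return t_best

# Classification: discrete labels (1 = spam, 0 = not spam). The input x is a
# made-up scalar feature, e.g. the fraction of suspicious words in an email.
emails = [(0.05, 0), (0.10, 0), (0.60, 1), (0.80, 1)]
label = nearest_neighbour(emails, 0.75)

# Regression: continuous labels. The input x is today's temperature and the
# label t is tomorrow's, both in degrees Celsius (values invented).
weather = [(10.0, 11.5), (15.0, 14.0), (20.0, 21.0), (25.0, 24.5)]
forecast = nearest_neighbour(weather, 19.0)

print(label, forecast)
```

In both calls the new input does not appear in the training set, so the predictor has to generalize from the examples it was given.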

An effective way to learn a predictor is to identify from the data set D a predictive distribution p(t|x) from a set of parametrized distributions. The conditional distribution p(t|x) defines a profile of beliefs over all possible values of the label t given the input x. For instance, for temperature prediction, one could learn mean and variance of a Gaussian distribution p(t|x) as a function of the input x. As a special case, the output of a supervised learning algorithm may be in the form of a deterministic predictive function $t = \hat{t}(x)$.
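
A minimal sketch of the temperature example follows. It assumes, for simplicity, a Gaussian p(t|x) whose mean is linear in x and whose variance does not depend on x, fitted by maximum likelihood; these modelling choices, the function names, and the data are assumptions made for the illustration.

```python
import math

def fit_gaussian_predictor(data):
    """Maximum likelihood fit of p(t|x) = N(t | w*x + b, var).

    Assumes a mean that is linear in x and a variance that is the same
    for every x; both are simplifying choices made for this sketch.
    """
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_t = sum(t for _, t in data) / n
    cov_xt = sum((x - mean_x) * (t - mean_t) for x, t in data) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in data) / n
    w = cov_xt / var_x
    b = mean_t - w * mean_x
    var = sum((t - (w * x + b)) ** 2 for x, t in data) / n
    return w, b, var

def predictive_density(t, x, w, b, var):
    """Evaluate the learned Gaussian density p(t|x)."""
    mean = w * x + b
    return math.exp(-(t - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def point_prediction(x, w, b):
    """Deterministic special case: report only the predictive mean."""
    return w * x + b

# Invented pairs (today's temperature, tomorrow's temperature).
data = [(10.0, 11.0), (15.0, 15.0), (20.0, 21.0), (25.0, 25.0)]
w, b, var = fit_gaussian_predictor(data)
print(w, b, var, point_prediction(18.0, w, b))
```

The density returned by `predictive_density` expresses how plausible each candidate value of t is for a given x, whereas `point_prediction` keeps only a single number.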

2. Unsupervised learning: Suppose now that we have an unlabelled set of training examples $D=\{x_n\}_{n=1}^{N}$. Less well defined than supervised learning, unsupervised learning generally refers to the task of learning properties of the mechanism that generates this data set. Specific tasks and applications include clustering, which is the problem of grouping similar examples $x_n$; dimensionality reduction, feature extraction, and representation learning, all related to the problem of representing the data in a smaller or more convenient space; and generative modelling, which is the problem of learning a generating mechanism to produce artificial examples that are similar to available data in the data set D.
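
Of the tasks listed above, clustering is perhaps the easiest to sketch. The example below groups unlabelled scalar examples with the k-means procedure (Lloyd's algorithm), which is one common choice and is used here as an assumption; the data and the initial centres are invented.

```python
def k_means_1d(points, centres, iterations=10):
    """Lloyd's algorithm for scalar data, starting from the given centres."""
    centres = list(centres)
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        # Assignment step: attach each example to its nearest centre.
        clusters = [[] for _ in centres]
        for x in points:
            nearest = min(range(len(centres)), key=lambda k: abs(x - centres[k]))
            clusters[nearest].append(x)
        # Update step: move each centre to the mean of its cluster,
        # keeping the old centre if the cluster happens to be empty.
        centres = [sum(c) / len(c) if c else centres[k]
                   for k, c in enumerate(clusters)]
    return centres, clusters

# Unlabelled examples that visibly form two groups (made up for the sketch).
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centres, clusters = k_means_1d(points, centres=[0.0, 6.0])
print(centres, clusters)
```

No labels are used at any point: the grouping emerges from the similarity between the examples alone.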

As a generalization of both supervised and unsupervised learning, semi-supervised learning refers to scenarios in which not all examples are labelled, with the unlabelled examples providing information about the distribution of the covariates x.

3. Reinforcement learning: Reinforcement learning refers to the problem of inferring optimal sequential decisions based on rewards or punishments received as a result of previous actions. In the notation introduced for supervised learning, the “label” t now refers to an action to be taken when the learner is in an informational state about the environment given by a variable x. Upon taking an action t in a state x, the learner is provided with feedback on the immediate reward accrued via this decision, and the environment moves on to a different state. As an example, an agent can be trained to navigate a given environment in the presence of obstacles by penalizing decisions that result in collisions.
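
The navigation example can be sketched with tabular Q-learning on a small grid. The algorithm, the 3x3 grid layout, the reward values, and all names below are assumptions introduced for illustration; the monograph itself does not cover reinforcement learning algorithms.

```python
import random

SIZE = 3
START, GOAL, OBSTACLE = (0, 0), (2, 2), (1, 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Environment: return (next_state, reward, done) for one move."""
    row, col = state[0] + action[0], state[1] + action[1]
    off_grid = not (0 <= row < SIZE and 0 <= col < SIZE)
    if off_grid or (row, col) == OBSTACLE:
        return state, -1.0, False      # collision: penalty, agent stays put
    if (row, col) == GOAL:
        return (row, col), 1.0, True   # goal reached: reward, episode ends
    return (row, col), 0.0, False

def q_learning(episodes=2000, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from rewards only, using random exploration."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0
         for r in range(SIZE) for c in range(SIZE) for a in range(4)}
    for _ in range(episodes):
        state = START
        for _ in range(100):           # cap on the episode length
            a = rng.randrange(4)
            nxt, reward, done = step(state, ACTIONS[a])
            future = 0.0 if done else gamma * max(q[(nxt, b)] for b in range(4))
            q[(state, a)] += alpha * (reward + future - q[(state, a)])
            state = nxt
            if done:
                break
    return q

def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from the start state."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(4), key=lambda b: q[(state, b)])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path

q = q_learning()
path = greedy_path(q)
print(path)
```

The learner is never told which action is correct in a given state; it only observes the reward that follows each action and the state it ends up in.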

Reinforcement learning is hence neither supervised, since the learner is not provided with the optimal actions t to select in a given state x; nor is it fully unsupervised, given the availability of feedback on the quality of the chosen action. Reinforcement learning is also distinguished from supervised and unsupervised learning due to the influence of previous actions on future states and rewards.

This monograph focuses on supervised and unsupervised learning. These general tasks can be further classified along the following dimensions.

• Passive vs. active learning: A passive learner is given the training examples, while an active learner can affect the choice of training examples on the basis of prior observations.

• Offline vs. online learning: Offline learning operates over a batch of training samples, while online learning processes samples in a streaming fashion. Note that reinforcement learning operates inherently in an online manner, while supervised and unsupervised learning can be carried out by following either offline or online formulations.
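
The offline and online formulations can be contrasted on the simplest possible learning problem, estimating a mean. The sketch below is an assumption-laden toy with hypothetical names and data: the offline learner sees the whole batch at once, while the online learner updates its estimate one sample at a time and stores no past samples.

```python
def offline_mean(batch):
    """Offline: the whole batch is available before any computation."""
    return sum(batch) / len(batch)

class OnlineMean:
    """Online: samples arrive one at a time and are not stored."""

    def __init__(self):
        self.count = 0
        self.estimate = 0.0

    def update(self, sample):
        # Move the running estimate towards the new sample.
        self.count += 1
        self.estimate += (sample - self.estimate) / self.count
        return self.estimate

samples = [2.0, 4.0, 6.0, 8.0]

batch_estimate = offline_mean(samples)

learner = OnlineMean()
for s in samples:              # stands in for a stream of samples
    learner.update(s)

print(batch_estimate, learner.estimate)
```

For this problem the two formulations reach the same estimate once all samples have been seen; the difference is in when the data are available and how much of it must be kept in memory.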


This monograph considers only passive and offline learning.

1.3 Goals and Outline

This monograph aims at providing an introduction to key concepts, algorithms, and theoretical results in machine learning. The treatment concentrates on probabilistic models for supervised and unsupervised learning problems. It introduces fundamental concepts and algorithms by building on first principles, while also exposing the reader to more advanced topics with extensive pointers to the literature, within a unified notation and mathematical framework. Unlike other texts that are focused on one particular aspect of the field, an effort has been made here to provide a broad but concise overview in which the main ideas and techniques are systematically presented. Specifically, the material is organized according to clearly defined categories, such as discriminative and generative models, frequentist and Bayesian approaches, exact and approximate inference, as well as directed and undirected models. This monograph is meant as an entry point for researchers with a background in probability and linear algebra. A prior exposure to information theory is useful but not required.

Detailed discussions are provided on basic concepts and ideas, including overfitting and generalization, Maximum Likelihood and regularization, and Bayesian inference. The text also endeavors to provide intuitive explanations and pointers to advanced topics and research directions. Sections and subsections containing more advanced material that may be skipped at a first reading are marked with a star (∗).

The reader will find here neither discussions on computing platforms or programming frameworks, such as map-reduce, nor details on specific applications involving large data sets. These can be easily found in a vast and growing body of work. Furthermore, rather than providing exhaustive details on the existing myriad solutions in each specific category, techniques have been selected that are useful to illustrate the most salient aspects. Historical notes have also been provided only for a few selected milestone events.

Finally, the monograph attempts to strike a balance between the algorithmic and theoretical viewpoints. In particular, all learning al-
