Probabilistic Inference Using Markov Chain Monte Carlo Methods

Radford M. Neal
Technical Report CRG-TR-93-1
Department of Computer Science
University of Toronto
E-mail: radford@cs.toronto.edu

25 September 1993

Copyright © 1993 by Radford M. Neal

Abstract
Probabilistic inference is an attractive approach to uncertain reasoning and empirical learning in artificial intelligence. Computational difficulties arise, however, because probabilistic models with the necessary realism and flexibility lead to complex distributions over high-dimensional spaces.

Related problems in other fields have been tackled using Monte Carlo methods based on sampling using Markov chains, providing a rich array of techniques that can be applied to problems in artificial intelligence. The "Metropolis algorithm" has been used to solve difficult problems in statistical physics for over forty years, and, in the last few years, the related method of "Gibbs sampling" has been applied to problems of statistical inference. Concurrently, an alternative method for solving problems in statistical physics by means of dynamical simulation has been developed as well, and has recently been unified with the Metropolis algorithm to produce the "hybrid Monte Carlo" method. In computer science, Markov chain sampling is the basis of the heuristic optimization technique of "simulated annealing", and has recently been used in randomized algorithms for approximate counting of large sets.

In this review, I outline the role of probabilistic inference in artificial intelligence, present the theory of Markov chains, and describe various Markov chain Monte Carlo algorithms, along with a number of supporting techniques. I try to present a comprehensive picture of the range of methods that have been developed, including techniques from the varied literature that have not yet seen wide application in artificial intelligence, but which appear relevant. As illustrative examples, I use the problems of probabilistic inference in expert systems, discovery of latent classes from data, and Bayesian learning for neural networks.
Acknowledgements
I thank David MacKay, Richard Mann, Chris Williams, and the members of my Ph.D. committee, Geoffrey Hinton, Rudi Mathon, Demetri Terzopoulos, and Rob Tibshirani, for their helpful comments on this review. This work was supported by the Natural Sciences and Engineering Research Council of Canada and by the Ontario Information Technology Research Centre.

Contents

1. Introduction                                                        1
2. Probabilistic Inference for Artificial Intelligence                 4
   2.1 Probabilistic inference with a fully-specified model            5
   2.2 Statistical inference for model parameters                     13
   2.3 Bayesian model comparison                                      23
   2.4 Statistical physics                                            25
3. Background on the Problem and its Solution                         30
   3.1 Definition of the problem                                      30
   3.2 Approaches to solving the problem                              32
   3.3 Theory of Markov chains                                        36
4. The Metropolis and Gibbs Sampling Algorithms                       47
   4.1 Gibbs sampling                                                 47
   4.2 The Metropolis algorithm                                       54
   4.3 Variations on the Metropolis algorithm                         59
   4.4 Analysis of the Metropolis and Gibbs sampling algorithms       64
5. The Dynamical and Hybrid Monte Carlo Methods                       70
   5.1 The stochastic dynamics method                                 70
   5.2 The hybrid Monte Carlo algorithm                               77
   5.3 Other dynamical methods                                        81
   5.4 Analysis of the hybrid Monte Carlo algorithm                   83
6. Extensions and Refinements                                         87
   6.1 Simulated annealing                                            87
   6.2 Free energy estimation                                         94
   6.3 Error assessment and reduction                                102
   6.4 Parallel implementation                                       114
7. Directions for Research                                           116
   7.1 Improvements in the algorithms                                116
   7.2 Scope for applications                                        118
8. Annotated Bibliography                                            121

Index to Examples

Type of model            Defi-   Statistical      Gibbs     Metropolis  Stochastic  Hybrid
                         nition  Inference        Sampling  Algorithm   Dynamics    Monte Carlo
Gaussian distribution      9     15, 19           64                                83, 84
Latent class model        10     21               51        POS         POS         POS
Belief network            10     *                50        POS         NA          NA
Multi-layer perceptron    12     16, 19, 22, 35   INF       58          77          81
2D Ising model            26     NA               49        57          NA          NA
Lennard-Jonesium          27     NA               INF       57          76          POS

NA - Not applicable,  INF - Probably infeasible,  POS - Possible, but not discussed

* Statistical inference for the parameters of belief networks is quite possible, but this review deals only with inference for the values of discrete variables in the network.

1. Introduction

Probability is a well-understood method of representing uncertain knowledge and reasoning to uncertain conclusions. It is applicable to low-level tasks such as perception, and to high-level tasks such as planning. In the Bayesian framework, learning the probabilistic models needed for such tasks from empirical data is also considered a problem of probabilistic inference, in a larger space that encompasses various possible models and their parameter values. To tackle the complex problems that arise in artificial intelligence, flexible methods for formulating models are needed. Techniques that have been found useful include the specification of dependencies using "belief networks", approximation of functions using "neural networks", the introduction of unobservable "latent variables", and the hierarchical formulation of models using "hyperparameters".

Such flexible models come with a price, however. The probability distributions they give rise to can be very complex, with probabilities varying greatly over a high-dimensional space. There may be no way to usefully characterize such distributions analytically. Often, however, a sample of points drawn from such a distribution can provide a satisfactory picture of it.
In particular, from such a sample we can obtain Monte Carlo estimates for the expectations of various functions of the variables. Suppose $X = \{X_1, \ldots, X_n\}$ is the set of random variables that characterize the situation being modeled, taking on values usually written as $x_1, \ldots, x_n$, or some typographical variation thereon. These variables might, for example, represent parameters of the model, hidden features of the objects modeled, or features of objects that may be observed in the future. The expectation of a function $a(X_1, \ldots, X_n)$, that is, its average value with respect to the distribution over $X$, can be approximated by

$$\langle a \rangle \;=\; \sum_{\tilde{x}_1} \cdots \sum_{\tilde{x}_n} a(\tilde{x}_1, \ldots, \tilde{x}_n)\, P(X_1 = \tilde{x}_1, \ldots, X_n = \tilde{x}_n) \qquad (1.1)$$

$$\approx\; \frac{1}{N} \sum_{t=0}^{N-1} a\big(x_1^{(t)}, \ldots, x_n^{(t)}\big) \qquad (1.2)$$

where $x_1^{(t)}, \ldots, x_n^{(t)}$ are the values for the $t$-th point in a sample of size $N$. (As above, I will often distinguish variables in summations using tildes.) Problems of prediction and decision can generally be formulated in terms of finding such expectations.
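To make estimate (1.2) concrete, here is a minimal Python sketch (my own illustration, not from the report) for the easy case in which points can be drawn from the distribution directly, so plain Monte Carlo suffices. The bivariate Gaussian target and the function $a(x_1, x_2) = x_1 x_2$ are assumed choices; the exact expectation is then the covariance, 0.9.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy target: X = (X1, X2), zero-mean bivariate Gaussian
# with unit variances and covariance 0.9.
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])

N = 100_000
sample = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=N)

def a(x1, x2):
    return x1 * x2

# Estimate (1.2): average a over the N sampled points.
estimate = np.mean(a(sample[:, 0], sample[:, 1]))
print(f"Monte Carlo estimate of <a>: {estimate:.3f} (exact: 0.9)")
```

The difficulty the review addresses is that, for realistic models, no such direct sampling routine exists, which is what motivates the Markov chain methods discussed next.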
Generating samples from the complex distributions encountered in artificial intelligence applications is often not easy, however. Typically, most of the probability is concentrated in regions whose volume is a tiny fraction of the total. To generate points drawn from the distribution with reasonable efficiency, the sampling procedure must search for these relevant regions. It must do so, moreover, in a fashion that does not bias the results.

Sampling methods based on Markov chains incorporate the required search aspect in a framework where it can be proved that the correct distribution is generated, at least in the limit as the length of the chain grows. Writing $X^{(t)} = \{X_1^{(t)}, \ldots, X_n^{(t)}\}$ for the set of variables at step $t$, the chain is defined by giving an initial distribution for $X^{(0)}$ and the transition probabilities for $X^{(t)}$ given the value for $X^{(t-1)}$. These probabilities are chosen so that the distribution of $X^{(t)}$ converges to that for $X$ as $t$ increases, and so that the Markov chain can feasibly be simulated by sampling from the initial distribution and then, in succession, from the conditional transition distributions. For a sufficiently long chain, equation (1.2) can then be used to estimate expectations.
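As a sketch of this scheme (again my own illustration; the Metropolis transition used here is only introduced in Section 4), the following Python code draws $X^{(0)}$ from an initial distribution, simulates transitions that leave an assumed one-dimensional target distribution invariant, and applies estimate (1.2) to the resulting states:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy target, known only up to a constant: an equal mixture
# of Gaussians centred at -2 and +2, each with unit variance.
def p_unnorm(x):
    return np.exp(-0.5 * (x - 2.0) ** 2) + np.exp(-0.5 * (x + 2.0) ** 2)

def transition(x, step=1.0):
    """One Metropolis transition: propose a random-walk move, accept it
    with probability min(1, p(x')/p(x)), otherwise keep the old state.
    This leaves the target distribution invariant."""
    x_prop = x + rng.normal(scale=step)
    if rng.random() < p_unnorm(x_prop) / p_unnorm(x):
        return x_prop
    return x

N = 50_000
x = rng.normal()              # X^(0) drawn from the initial distribution
states = np.empty(N)
for t in range(N):            # successive conditional transitions
    x = transition(x)
    states[t] = x

# Estimate <X^2> via (1.2); for this target the exact value is
# 1 (component variance) + 4 (squared component mean) = 5.
print("estimate of <X^2>:", np.mean(states ** 2))
```

In practice the initial portion of the chain, generated before the distribution of $X^{(t)}$ has converged, would typically be discarded before averaging.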