A Logic for Inductive Probabilistic Reasoning
Manfred Jaeger
Department for Computer Science, Aalborg University
Fredrik Bajers Vej 7E, DK-9220 Aalborg Ø
jaeger@cs.aau.dk
Abstract
Inductive probabilistic reasoning is understood as the application of in-
ference patterns that use statistical background information to assign (sub-
jective) probabilities to single events. The simplest such inference pattern is
direct inference: from “70% of As are Bs” and “a is an A” infer that a is a
B with probability 0.7. Direct inference is generalized by Jeffrey’s rule and
the principle of cross-entropy minimization. To adequately formalize induc-
tive probabilistic reasoning is an interesting topic for artificial intelligence, as
an autonomous system acting in a complex environment may have to base
its actions on a probabilistic model of its environment, and the probabilities
needed to form this model can often be obtained by combining statistical
background information with particular observations made, i.e. by inductive
probabilistic reasoning.
In this paper a formal framework for inductive probabilistic reasoning is
developed: syntactically it consists of an extension of the language of first-
order predicate logic that allows one to express statements about both statistical
and subjective probabilities. Semantics for this representation language are
developed that give rise to two distinct entailment relations: a relation |=
that models strict, probabilistically valid, inferences, and a relation |≈ that
models inductive probabilistic inferences. The inductive entailment relation
is obtained by implementing cross-entropy minimization in a preferred model
semantics. A main objective of our approach is to ensure that for both en-
tailment relations complete proof systems exist. This is achieved by allowing
probability distributions in our semantic models that use non-standard prob-
ability values. A number of results are presented that show that in several
important aspects the resulting logic behaves just like a logic based on real-
valued probabilities alone.
1 Introduction
1.1 Inductive Probabilistic Reasoning
Probabilities come in two kinds: as statistical probabilities that describe relative
frequencies, and as subjective probabilities that describe degrees of belief. To both
kinds of probabilities the same rules of probability calculus apply, and notwith-
standing a long and heated philosophical controversy over what constitutes the
proper meaning of probability (de Finetti 1937, von Mises 1951, Savage 1954,
Jaynes 1978), few conceptual difficulties arise when we deal with them one at a
time.
However, in commonsense or inductive reasoning one often wants to use both
subjective and statistical probabilities simultaneously in order to infer new prob-
abilities of interest. The simplest example of such a reasoning pattern is that of
direct inference (Reichenbach 1949, §72), (Carnap 1950, §94), illustrated by the
following example: from
2.7% of drivers whose annual mileage is between 10,000 and 20,000
miles will be involved in an accident within the next year
(1)
and
Jones is a driver whose annual mileage is between 10,000 and
20,000 miles
(2)
infer
The probability that Jones will be involved in an accident within
the next year is 0.027.
(3)
The percentage 2.7 in (1) is a statistical probability: the probability that a driver
randomly selected from the set of all drivers with an annual mileage between
10,000 and 20,000 will be involved in an accident. The probability in (3), on the
other hand, is attached to a proposition that, in fact, is either true or false. It
describes a state of knowledge or belief, for which reason we call it a subjective
probability.[1]
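The direct inference pattern from the abstract ("from '70% of As are Bs' and 'a is an A' infer that a is a B with probability 0.7") is simple enough to state as code. The following Python sketch is purely illustrative; the dictionary of statistical probabilities and the function name are our own, not part of the formalism developed in this paper.

```python
# Direct inference, a minimal illustrative sketch: statistical background
# knowledge maps (reference class, predicate) pairs to relative frequencies,
# and the subjective probability for a single case is read off directly.
statistical = {("A", "B"): 0.7}  # "70% of As are Bs"

def direct_inference(individual_class, predicate, stats):
    """All we know about individual a is that it belongs to individual_class;
    assign a the statistical probability of the predicate in that class."""
    return stats[(individual_class, predicate)]

p = direct_inference("A", "B", statistical)  # "a is an A" -> P(a is a B) = 0.7
```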
Clearly, the direct inference pattern is very pervasive: not only does an insur-
ance company make (implicit) use of it in its computation of the rate it is willing to
offer a customer, it also underlies some of the most casual commonsense reasoning
(“In very few soccer matches did a team that was trailing 0:2 at the end of the
first half still win the game. My team is just trailing 0:2 at halftime. Too bad”.),
as well as the use of probabilistic expert systems. Take a medical diagnosis system
implemented by a Bayesian network (Pearl 1988, Jensen 2001), for instance: the
distribution encoded in the network (whether specified by an expert or learned
from data) is a statistical distribution describing relative frequencies in a large
number of past cases. When using the system for the diagnosis of patient Jones,
the symptoms that Jones exhibits are entered as evidence, and the (statistical)
probabilities of various diseases conditioned on this evidence are identified with
the probability of Jones having each of these diseases.

[1] Other names for this type of probability are "probability of the single case" (Reichenbach 1949), "probability_1" (Carnap 1950), "propositional probability" (Bacchus 1990b).
Direct inference works when for some reference class C and predicate P we
are given the statistical probability of P in C, and for some singular object e all
we know is that e belongs to C. If we have more information than that, direct
inference may no longer work: assume in addition to (1) and (2) that
3.1% of drivers whose annual mileage is between 15,000 and 25,000
miles will be involved in an accident within the next year
(4)
and
Jones is a driver whose annual mileage is between 15,000 and
25,000 miles.
(5)
Now direct inference can be applied either to (1) and (2), or to (4) and (5), yielding
the two conflicting conclusions that the probability of Jones having an accident
is 0.027 and that it is 0.031. Of course, from (1), (2), (4), and (5) we would infer neither,
and instead ask for the percentage of drivers with an annual mileage between
15,000 and 20,000 that are involved in an accident. This number, however, may
be unavailable, in which case direct inference will not allow us to derive any
probability bounds for Jones getting into an accident. This changes if, at least,
we know that
Between 2.7% and 3.1% of drivers whose annual mileage is between
15,000 and 20,000 miles will be involved in an accident within the
next year.
(6)
From (1), (2), and (4)-(6) we will at least infer that the probability of Jones having
an accident lies between 0.027 and 0.031. This no longer is direct inference proper,
but a slight generalization thereof.
In this paper we will be concerned with inductive probabilistic reasoning as a
very broad generalization of direct inference. By inductive probabilistic reason-
ing, for the purpose of this paper, we mean the type of inference where statis-
tical background information is used to refine already existing, partially defined
subjective probability assessments (we identify a categorical statement like (2)
or (5) with the probability assessment: "with probability 1, Jones is a driver
whose ..."). Thus, we here take a fairly narrow view of inductive probabilistic
reasoning, and, for instance, do not consider statistical inferences of the following
kind: from the facts that the individuals jones_1, jones_2, ..., jones_100 are drivers,
and that jones_1, ..., jones_30 drive less and jones_31, ..., jones_100 more than 15,000
miles annually, infer that 30% of drivers drive less than 15,000 miles. Generally
speaking, we are aiming at making inferences only in the direction from statis-
tical to subjective probabilities, not from single-case observations to statistical
probabilities.
Problems of inductive probabilistic reasoning that go beyond the scope of
direct inference are obtained when the subjective input-probabilities do not express
certainties:
With probability 0.6, Jones is a driver whose annual mileage is
between 10,000 and 20,000 miles.
(7)
What are we going to infer from (7) and the statistical probability (1) about the
probability of Jones getting into an accident? There do not seem to be any sound
arguments to derive a unique value for this probability; however, 0.6 · 0.027 =
0.0162 appears to be a sensible lower bound. Now take the subjective input
probabilities
With probability 0.6, Jones's annual mileage is between 10,000 and
20,000 miles, and with probability 0.8 it is between 15,000 and 25,000
miles.
(8)
Clearly, it’s getting more and more difficult to find the right formal rules that
extend the direct inference principle to such general inputs.
In the guise of inductive probabilistic reasoning as we understand it, these
generalized problems seem to have received little attention in the literature. How-
ever, the mathematical structure of the task we have set ourselves is essentially
the same as that of probability updating: in probability updating we are given a
prior (usually subjective) probability distribution representing a state of knowl-
edge at some time t, together with new information in the form of categorical
statements or probability values; desired is a new posterior distribution describ-
ing our knowledge at time t + 1, with the new information taken into account.
A formal correspondence between the two problems is established by identifying
the statistical and subjective probability distributions in inductive probabilistic
inference with the prior and posterior probability distribution, respectively, in
probability updating.
The close relation between the two problems extends beyond the formal simi-
larity, however: interpreting the statistical probability distribution as a canonical
prior (subjective) distribution, we can view inductive probabilistic reasoning as a
special case of probability updating. Methods that have been proposed for prob-
ability updating, therefore, also are candidates to solve inductive probabilistic
inference problems.
For updating a unique prior distribution on categorical information, no viable
alternative exists to conditioning: the posterior distribution is the prior conditioned
on the stated facts.[2] Note that conditioning, seen as a rule for inductive
reasoning, rather than probability updating, is just direct inference again.

[2] Lewis (1976) proposes imaging as an alternative to conditioning, but imaging requires a similarity measure on the states of the probability space, which usually cannot be assumed as given.

As our examples already have shown, this basic updating/inductive reasoning
problem can be generalized in two ways: first, the new information may come
in the form of probabilistic constraints as in (7), not in the form of categorical
statements; second, the prior (or statistical) information may be incomplete, and
only specify a set of possible distributions as in (6), not a unique distribution. The
problem of updating such partially defined beliefs has received considerable atten-
tion, e.g. (Dempster 1967, Shafer 1976, Walley 1991, Dubois & Prade 1997, Gilboa
& Schmeidler 1993, Moral & Wilson 1995, Grove & Halpern 1998). The simplest
approach is to apply an updating rule for unique priors to each of the distribu-
tions that satisfy the prior constraints, and to infer as partial posterior beliefs
only probability assignments that are valid for all updated possible priors. In-
ferences obtained in this manner can be quite weak, and other principles have
been explored where updating is performed only on a subset of possible priors
that are in some sense maximally consistent with the new information (Gilboa &
Schmeidler 1993, Dubois & Prade 1997). These methods are more appropriate
for belief updating than for inductive probabilistic reasoning in our sense, because
they amount to a combination of prior and new information on a more or less sym-
metric basis. As discussed above, this is not appropriate in our setting, where the
new single case information is not supposed to have any impact on the statistical
background knowledge. Our treatment of incompletely specified priors, therefore,
follows the first approach of taking every possible prior (statistical distribution)
into account. See section 4.1 for additional comments on this issue.
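The "update every admissible prior" approach just described can be sketched numerically for the driving example. Everything below is illustrative: the outcome space, the 0.5 and 0.02 filler probabilities, and the function names are our own assumptions, with only the interval [0.027, 0.031] taken from statement (6).

```python
# Updating a set of admissible priors, an illustrative sketch: condition each
# admissible statistical distribution separately and report only the bounds
# that hold for all results.  Outcomes are (mileage class, accident status).
def conditional_prob(prior, target, evidence):
    """P(target | evidence); prior is a dict mapping outcomes to probabilities."""
    pe = sum(p for x, p in prior.items() if x in evidence)
    pt = sum(p for x, p in prior.items() if x in evidence and x in target)
    return pt / pe

evidence = {("15-20k", "acc"), ("15-20k", "ok")}   # Jones drives 15,000-20,000 miles
target = {("15-20k", "acc"), ("other", "acc")}     # Jones has an accident

results = []
for rate in (0.027, 0.031):  # the two extreme admissible accident rates from (6)
    prior = {("15-20k", "acc"): 0.5 * rate,
             ("15-20k", "ok"): 0.5 * (1 - rate),
             ("other", "acc"): 0.5 * 0.02,    # hypothetical filler values
             ("other", "ok"): 0.5 * 0.98}
    results.append(conditional_prob(prior, target, evidence))

print(min(results), max(results))  # the inferred interval, approximately [0.027, 0.031]
```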
The main problem we address in the present paper is how to deal with new
(single case) information in the form of general probability constraints. For this
various rules with different scope of application have previously been explored. In
the case where the new constraints prescribe the probability values p_1, ..., p_k of
pairwise disjoint alternatives A_1, ..., A_k, Jeffrey's rule (Jeffrey 1965) is a straight-
forward generalization of conditioning: it says that the posterior should be the
sum of the conditional distributions given the A_i, weighted with the prescribed
values p_i. Applying Jeffrey's rule to (1) and (7), for instance, we would obtain
0.6 · 0.027 + 0.4 · r as the probability for Jones getting into an accident, where r is
the (unspecified) statistical probability of getting into an accident among drivers
who do less than 10,000 or more than 20,000 miles.
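Jeffrey's rule as just stated can be written down directly. In this sketch the function name and the value chosen for the unspecified statistical probability r are our own assumptions; only the numbers 0.027, 0.6, and 0.4 come from statements (1) and (7).

```python
# Jeffrey's rule, a minimal sketch: posterior P(B) = sum_i p_i * P(B | A_i)
# for a partition A_1, ..., A_k with prescribed posterior weights p_i.
def jeffrey_update(cond_probs, weights):
    """cond_probs[i] = P(B | A_i); weights[i] = prescribed posterior P(A_i)."""
    return sum(p * q for p, q in zip(weights, cond_probs))

# Partition: A = "mileage 10,000-20,000" and its complement.  P(B | A) = 0.027
# from (1), weights 0.6 and 0.4 from (7); r is the unspecified statistical
# accident probability outside A, set to a purely hypothetical 0.05 here.
r = 0.05
posterior = jeffrey_update(cond_probs=[0.027, r], weights=[0.6, 0.4])
print(posterior)  # 0.6 * 0.027 + 0.4 * r
```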
When the constraints on the posterior are of a more general form than permit-
ted by Jeffrey’s rule, there no longer exist updating rules with a similarly intuitive
appeal. However, a number of results indicate that cross-entropy minimization is
the most appropriate general method for probability updating, or inductive proba-
bilistic inference (Shore & Johnson 1980, Paris & Vencovská 1992, Jaeger 1995b).
Cross-entropy can be interpreted as a measure for the similarity of two prob-
ability distributions (originally in an information theoretic sense (Kullback &
Leibler 1951)). Cross-entropy minimization, therefore, is a rule according to which
the posterior (or the subjective) distribution is chosen so as to make it as sim-
ilar as possible within the given constraints to the prior (resp. the statistical)
distribution.
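For the special case covered by Jeffrey's rule (constraints that fix the probabilities of a partition), the cross-entropy minimizing posterior is known in closed form: rescale the prior inside and outside each cell of the partition. The following sketch illustrates this for a single constraint P(A) = q; the outcome names and numbers are illustrative, not taken from the paper.

```python
# Cross-entropy minimization under a single constraint P(A) = q, a sketch.
# The distribution P minimizing CE(P, Q) = sum_x P(x) log(P(x)/Q(x)) relative
# to the prior Q, subject to P(A) = q, rescales Q by q/Q(A) inside A and by
# (1-q)/(1-Q(A)) outside A -- which is exactly Jeffrey's rule for the
# partition {A, complement of A}.
def min_cross_entropy_update(prior, A, q):
    """prior: dict outcome -> probability; A: set of outcomes; q: target P(A)."""
    pA = sum(prior[x] for x in A)
    return {x: p * (q / pA if x in A else (1 - q) / (1 - pA))
            for x, p in prior.items()}

prior = {"a1": 0.2, "a2": 0.3, "b1": 0.1, "b2": 0.4}
post = min_cross_entropy_update(prior, {"a1", "a2"}, q=0.8)
# post assigns A total probability 0.8 while preserving the ratios of
# outcomes within A and within its complement.
```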
Inductive probabilistic reasoning as we have explained it so far clearly is a
topic with its roots in epistemology and the philosophy of science rather than in