For example, in Fig. 2A we visualize different evidential NIG distributions with varying model parameters. We illustrate that by increasing the evidential parameters (i.e., $\nu, \alpha$) of this distribution, the p.d.f. becomes tightly concentrated about its inferred likelihood function. Considering a single parameter realization of this higher-order distribution (Fig. 2B), we can subsequently sample many lower-order realizations of our likelihood function, as shown in Fig. 2C.
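To make this hierarchy concrete, the short sketch below draws samples in the same order, assuming the standard NIG factorization $\sigma^2 \sim \Gamma^{-1}(\alpha, \beta)$ and $\mu \mid \sigma^2 \sim \mathcal{N}(\gamma, \sigma^2/\nu)$; the hyperparameter values and the function name are illustrative rather than taken from the figure.

```python
# Sampling order behind Fig. 2, assuming the standard NIG factorization
# sigma^2 ~ InvGamma(alpha, beta) and mu | sigma^2 ~ N(gamma, sigma^2 / nu);
# hyperparameter values and the function name are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def sample_nig(gamma, nu, alpha, beta, n_samples=1):
    """Draw (mu, sigma^2) realizations from NIG(gamma, nu, alpha, beta)."""
    # InvGamma(alpha, beta) via the reciprocal of a Gamma(alpha, rate=beta) draw.
    sigma2 = 1.0 / rng.gamma(shape=alpha, scale=1.0 / beta, size=n_samples)
    mu = rng.normal(loc=gamma, scale=np.sqrt(sigma2 / nu))
    return mu, sigma2

# One higher-order realization (as in Fig. 2B) ...
mu, sigma2 = sample_nig(gamma=0.0, nu=2.0, alpha=3.0, beta=1.0)
# ... then many lower-order likelihood samples y ~ N(mu, sigma^2) from it (Fig. 2C).
y = rng.normal(loc=mu[0], scale=np.sqrt(sigma2[0]), size=100)
```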
In this work, we use neural networks to infer, given an input, the hyperparameters, $m$, of this higher-order, evidential distribution. This approach presents several distinct advantages compared to prior work. First, our method enables simultaneous learning of the desired regression task, along with aleatoric and epistemic uncertainty estimation, by enforcing evidential priors and without leveraging any out-of-distribution data during training. Second, since the evidential prior is a higher-order NIG distribution, the maximum likelihood Gaussian can be computed analytically from the expected values of the $(\mu, \sigma^2)$ parameters, without the need for sampling. Third, we can effectively estimate the epistemic or model uncertainty associated with the network's prediction by simply evaluating the variance of our inferred evidential distribution.
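As one possible realization of such a network head (a sketch, not a prescribed architecture), the module below maps a feature vector to $m = (\gamma, \nu, \alpha, \beta)$ for a scalar target. The softplus constraints ($\nu > 0$, $\beta > 0$) and the $+1$ shift keeping $\alpha > 1$ are assumptions chosen so the moments in Sec. 3.2 remain well defined; the class name is illustrative.

```python
# A possible network head producing m = (gamma, nu, alpha, beta) for a scalar
# target. The softplus constraints (nu > 0, beta > 0) and the +1 shift keeping
# alpha > 1 are assumptions made so the moments in Sec. 3.2 stay well defined;
# the class name is illustrative.
import torch.nn as nn
import torch.nn.functional as F

class EvidentialHead(nn.Module):
    def __init__(self, in_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, 4)  # one output per hyperparameter

    def forward(self, x):
        gamma, nu_raw, alpha_raw, beta_raw = self.linear(x).chunk(4, dim=-1)
        nu = F.softplus(nu_raw)
        alpha = F.softplus(alpha_raw) + 1.0
        beta = F.softplus(beta_raw)
        return gamma, nu, alpha, beta
```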
3.2 Prediction and uncertainty estimation
The aleatoric uncertainty, also referred to as statistical or data uncertainty, is representative of unknowns that differ each time we run the same experiment. The epistemic (or model) uncertainty describes the estimated uncertainty in the prediction. Given a NIG distribution, we can compute the prediction, aleatoric, and epistemic uncertainty as
\[
\underbrace{\mathbb{E}[\mu] = \gamma}_{\text{prediction}}, \qquad
\underbrace{\mathbb{E}[\sigma^2] = \frac{\beta}{\alpha - 1}}_{\text{aleatoric}}, \qquad
\underbrace{\mathrm{Var}[\mu] = \frac{\beta}{\nu(\alpha - 1)}}_{\text{epistemic}}.
\tag{5}
\]
Complete derivations for these moments are available in Sec. S1.1. Note that $\mathrm{Var}[\mu] = \mathbb{E}[\sigma^2]/\nu$, which is expected as $\nu$ is one of our two evidential virtual-observation counts.
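A minimal sketch of Eq. 5: given the inferred hyperparameters, all three quantities are read off in closed form, with no sampling; the function name is illustrative.

```python
# Reading Eq. 5 off the inferred NIG hyperparameters; works for plain floats
# or for tensors of per-sample parameters. The function name is illustrative.
def nig_moments(gamma, nu, alpha, beta):
    prediction = gamma                     # E[mu]
    aleatoric = beta / (alpha - 1)         # E[sigma^2]  (requires alpha > 1)
    epistemic = beta / (nu * (alpha - 1))  # Var[mu] = E[sigma^2] / nu
    return prediction, aleatoric, epistemic
```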
3.3 Learning the evidential distribution
Having formalized the use of an evidential distribution to capture both aleatoric and epistemic
uncertainty, we next describe our approach for learning a model to output the hyperparameters of this
distribution. For clarity, we structure the learning process as a multi-task learning problem, with two
distinct parts: (1) acquiring or maximizing model evidence in support of our observations and (2)
minimizing evidence or inflating uncertainty when the prediction is wrong. At a high level, we can think of (1) as fitting our data to the evidential model, while (2) enforces a prior that removes incorrect evidence and inflates uncertainty.
(1) Maximizing the model fit. From Bayesian probability theory, the “model evidence”, or marginal likelihood, is defined as the likelihood of an observation, $y_i$, given the evidential distribution parameters $m$ and is computed by marginalizing over the likelihood parameters $\theta$:
\[
p(y_i \mid m) = \frac{p(y_i \mid \theta, m)\, p(\theta \mid m)}{p(\theta \mid y_i, m)}
= \int_{\sigma^2 = 0}^{\infty} \int_{\mu = -\infty}^{\infty} p(y_i \mid \mu, \sigma^2)\, p(\mu, \sigma^2 \mid m)\; d\mu \, d\sigma^2
\tag{6}
\]
The model evidence is, in general, not straightforward to evaluate since computing it involves integrating out the dependence on latent model parameters. However, in the case of placing a NIG evidential prior on our Gaussian likelihood function, an analytical solution does exist:
\[
p(y_i \mid m) = \mathrm{St}\!\left(y_i;\; \gamma,\; \frac{\beta(1 + \nu)}{\nu\,\alpha},\; 2\alpha\right),
\tag{7}
\]
where $\mathrm{St}\!\left(y;\, \mu_{\mathrm{St}}, \sigma^2_{\mathrm{St}}, \nu_{\mathrm{St}}\right)$ is the Student-t distribution evaluated at $y$ with location $\mu_{\mathrm{St}}$, scale $\sigma^2_{\mathrm{St}}$, and $\nu_{\mathrm{St}}$ degrees of freedom. We denote the loss, $\mathcal{L}^{\mathrm{NLL}}_i(w)$, as the negative logarithm of the model evidence:
\[
\mathcal{L}^{\mathrm{NLL}}_i(w) = \frac{1}{2}\log\!\left(\frac{\pi}{\nu}\right) - \alpha\log(\Omega) + \left(\alpha + \frac{1}{2}\right)\log\!\left((y_i - \gamma)^2\,\nu + \Omega\right) + \log\!\left(\frac{\Gamma(\alpha)}{\Gamma\!\left(\alpha + \frac{1}{2}\right)}\right)
\tag{8}
\]
where $\Omega = 2\beta(1 + \nu)$. Complete derivations for Eq. 7 and Eq. 8 are provided in Sec. S1.2. This loss provides an objective for training a NN to output the parameters of a NIG distribution that fit the observations by maximizing the model evidence.
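For illustration, Eq. 8 can be transcribed directly into a trainable loss as sketched below; the small $\varepsilon$ stabilizer and the mean reduction over a batch are numerical conveniences assumed here, not part of the derivation, and the function name is illustrative.

```python
# Illustrative transcription of Eq. 8 (negative log of the Student-t evidence
# in Eq. 7) as a PyTorch loss. The eps stabilizer and the mean reduction over
# a batch are assumptions for numerical convenience, not part of the derivation.
import math
import torch

def evidential_nll(y, gamma, nu, alpha, beta, eps=1e-8):
    omega = 2.0 * beta * (1.0 + nu)                        # Omega = 2*beta*(1 + nu)
    nll = (0.5 * torch.log(math.pi / (nu + eps))           # (1/2) log(pi / nu)
           - alpha * torch.log(omega + eps)                # - alpha log(Omega)
           + (alpha + 0.5) * torch.log((y - gamma) ** 2 * nu + omega + eps)
           + torch.lgamma(alpha) - torch.lgamma(alpha + 0.5))  # log Gamma(alpha)/Gamma(alpha + 1/2)
    return nll.mean()
```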