different the distribution P_a is from P_d. KL is essentially a measure of relative entropy. For all three categories, denoted by x, we
compute the ratio log[P_a(x)/P_d(x)], which is a measure of distance between the probability distributions for a label x.
We then take the probability-weighted sum of this measure, weighted by P_a(x), which gives the KL measure of divergence of class a from class d. The measure is also a label imbalance metric and is denoted as

    KL(P_a || P_d) = sum_x P_a(x) * log[P_a(x) / P_d(x)]

This metric is non-negative, KL >= 0.
It requires P_d(x) > 0 for all x, else the metric is not defined.
We say that it gives the distance of P_a from P_d.
This measure is not symmetric (reversing the distributions
gives a different result), but is still a meaningful measure
of difference in label distributions.
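For illustration, here is a minimal Python sketch of this computation; the use of numpy and the example distributions over {rejected, wait-listed, accepted} are hypothetical choices for demonstration, not part of any particular library.

    import numpy as np

    def kl_divergence(P_a, P_d):
        # Probability-weighted sum of log[P_a(x) / P_d(x)]; requires P_d(x) > 0 for all x.
        P_a, P_d = np.asarray(P_a, float), np.asarray(P_d, float)
        return np.sum(P_a * np.log(P_a / P_d))

    # Hypothetical label distributions for classes a and d over
    # {rejected, wait-listed, accepted}.
    P_a = [0.5, 0.2, 0.3]
    P_d = [0.3, 0.3, 0.4]
    print(kl_divergence(P_a, P_d))   # non-negative; 0 only if the distributions coincide
    print(kl_divergence(P_d, P_a))   # generally a different value: KL is not symmetric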
[4] Jensen-Shannon divergence (JS): Denoting the average of the label distributions of the two classes as P, we can
compute the JS divergence as the average of the KL divergence of the probability distribution of the first class vs. P
and the KL divergence of the probability distribution of the second class vs. P. This is an extension of the KL divergence
measure for label imbalance. In contrast to the KL divergence, this metric is symmetric and bounded above. The
measure is computed as

    JS(P_a, P_d) = (1/2) * [KL(P_a || P) + KL(P_d || P)],  where  P = (1/2) * (P_a + P_d)

This metric is non-negative and bounded above by ln(2),
that is, 0 <= JS <= ln(2), assuming that the natural logarithm
is used in the KL divergence computation.
It provides a symmetric difference between the label distributions P_a and P_d.
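Continuing the illustrative sketch above (names and example values are again hypothetical), the JS computation can be written as:

    import numpy as np

    def js_divergence(P_a, P_d):
        P_a, P_d = np.asarray(P_a, float), np.asarray(P_d, float)
        P = 0.5 * (P_a + P_d)                        # average label distribution
        kl = lambda p, q: np.sum(p * np.log(p / q))  # KL divergence as defined above
        return 0.5 * (kl(P_a, P) + kl(P_d, P))

    P_a, P_d = [0.5, 0.2, 0.3], [0.3, 0.3, 0.4]
    print(js_divergence(P_a, P_d))   # symmetric; lies in [0, ln(2)]
    print(js_divergence(P_d, P_a))   # same value as above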
[5] L_p-norm (LP): Another measure of distance in label distributions is the normed direct distance between the
distributions. For every label category, e.g., x = {rejected, wait-listed, accepted} in college admissions, we take the
difference between the two probabilities and compute the p-norm of these differences, as follows

    LP(P_a, P_d) = ||P_a - P_d||_p = [ sum_x |P_a(x) - P_d(x)|^p ]^(1/p),  for p >= 1

This metric is non-negative, LP >= 0.
In this example, there will be three terms in the sum, one per label category.
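A sketch of this computation under the same illustrative assumptions (the default p = 2 is just an example choice):

    import numpy as np

    def lp_norm(P_a, P_d, p=2):
        # p-norm of the per-label differences between the two distributions
        diff = np.abs(np.asarray(P_a, float) - np.asarray(P_d, float))
        return np.sum(diff ** p) ** (1.0 / p)

    P_a, P_d = [0.5, 0.2, 0.3], [0.3, 0.3, 0.4]
    print(lp_norm(P_a, P_d, p=2))   # Euclidean (L2) distance between the label distributions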
[6] Total variation distance (TVD): This is half the L_1-norm of the difference between the probability distribution of
labels of the first class and the probability distribution of labels of the second class, that is,

    TVD(P_a, P_d) = (1/2) * sum_x |P_a(x) - P_d(x)|

This metric is non-negative, TVD >= 0.
It is half of the LP metric with p = 1.
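Under the same illustrative assumptions, TVD is a one-line computation:

    import numpy as np

    def total_variation_distance(P_a, P_d):
        # half the L1-norm of the per-label differences
        diff = np.abs(np.asarray(P_a, float) - np.asarray(P_d, float))
        return 0.5 * np.sum(diff)

    P_a, P_d = [0.5, 0.2, 0.3], [0.3, 0.3, 0.4]
    print(total_variation_distance(P_a, P_d))   # 0.5 * (0.2 + 0.1 + 0.1) = 0.2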
[7] Kolmogorov-Smirnov (KS), two-sample approximated version: This metric evaluates the KS statistical test between
the probability distribution of labels of the first class and the probability distribution of labels of the second class. It
indicates whether there is a large divergence in one of the labels across the classes, and it complements the other
measures by zeroing in on the most imbalanced label:

    KS(P_a, P_d) = max_x |P_a(x) - P_d(x)|

This statistic lies in the range [0, 1].
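A sketch of this statistic under the same illustrative assumptions, taking the maximum per-label gap:

    import numpy as np

    def ks_statistic(P_a, P_d):
        # largest gap across labels between the two label distributions
        return np.max(np.abs(np.asarray(P_a, float) - np.asarray(P_d, float)))

    P_a, P_d = [0.5, 0.2, 0.3], [0.3, 0.3, 0.4]
    print(ks_statistic(P_a, P_d))   # 0.2, driven by the most imbalanced label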
[8] Conditional Demographic Disparity in Labels (CDDL): This metric examines disparity of outcomes (labels) between
two classes, 1 and 2, but it also examines this disparity in subgroups, by stratifying the data using a “group”
variable. The metric examines whether the second class has a bigger proportion of the rejected outcomes than
the proportion of accepted outcomes for the same class. For example, in the case of college admissions, if class 2