probability of successfully identifying the individuals from the counting queries to be no greater than $\rho \leq \frac{1}{10}$ (maximum inference probability). In order to achieve this protection goal with the existing theoretical result [15] (more details will be given in Section 3), we thus have: the upper bound of the differential privacy budget should satisfy $\epsilon \leq \frac{\Delta f}{\Delta v} \ln \frac{(n-1)\rho}{1-\rho}$, where $n$ represents the number of records, $\Delta f$ is the sensitivity of the query, $\Delta v$ is the maximum distance between the function values of every possible world (the same information needed to calculate the global sensitivity) [15], and $\rho$ is the maximum inference probability. Then, given $n = 600{,}000$ records, the bound yields

$$\epsilon \leq \frac{1}{1} \ln \frac{(600{,}000 - 1)\left(\frac{1}{10}\right)}{1 - \frac{1}{10}} \approx 11.1 \quad (1)$$

where $\Delta v$ is no greater than 1 for count queries.
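For concreteness, the following sketch (our own illustration, not code from [15]; it simply evaluates the bound above) reproduces the calculation in Eq. (1):

```python
import math

def epsilon_upper_bound(n, rho, delta_f=1.0, delta_v=1.0):
    """Upper bound on the privacy budget derived in [15]:
    epsilon <= (delta_f / delta_v) * ln((n - 1) * rho / (1 - rho))."""
    return (delta_f / delta_v) * math.log((n - 1) * rho / (1 - rho))

# Count query (delta_f = delta_v = 1), n = 600,000 records,
# and maximum inference probability rho = 1/10.
print(epsilon_upper_bound(600_000, 0.1))  # ~11.107, i.e. roughly 11.1
```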
In the above example, the upper bound of $\epsilon$ would be 11.1, which might exceed our expectation. In other words, such a large $\epsilon$ can satisfy $\rho \leq \frac{1}{10}$ in their interpretive inference model but can be vulnerable in other cases (for instance, in our proposed interpretive inference model, $\epsilon = 11.1$ would result in $\rho > \frac{1}{10}$), i.e., higher privacy risks than the data owners demand.
Furthermore, in the interpretive inference model proposed in [15], $\epsilon$ is proportional to $\ln(n)$, where $n$ is the data size. As $n$ increases, the upper bound of $\epsilon$ also increases. For a very large or very small $n$, the derived bound becomes meaningless (unbounded or negative, respectively).
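A quick numerical check (again our own sketch, assuming $\Delta f = \Delta v = 1$ and $\rho = \frac{1}{10}$) illustrates this behaviour for several data sizes:

```python
import math

rho, delta_f, delta_v = 0.1, 1.0, 1.0
for n in (5, 10, 1_000, 600_000, 10**9):
    eps = (delta_f / delta_v) * math.log((n - 1) * rho / (1 - rho))
    print(f"n = {n:>10}: epsilon bound = {eps:6.2f}")

# n = 5 yields a negative bound (the argument of ln is below 1), n = 10
# yields 0.0, and the bound then grows without limit as ln(n).
```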
From the above examples, we can see that existing
solutions have their inherent drawbacks. Motivated
by such observations, we propose a novel interpretive
inference model, which can be used to evaluate the
probability or confidence that the adversary will be able to identify any individual from the noise-injected query results over the dataset. This enables us to
understand the privacy implications of differentially
private techniques in a much clearer way.
1.3. Our Contributions
The major contributions of this paper are summarized
as follows.
• This paper presents a novel interpretive inference
model to convert the privacy bound in
differential privacy to inference probabilities that
any individual is included in the input data
(for queries). The proposed interpretive inference
model and the converted inference probabilities address the drawbacks of the existing models [15, 24].
• Based on the proposed interpretive inference
model, we present an instantiation for choosing an appropriate $\epsilon$ (the maximum privacy bound in differential privacy), which should effectively bound
the risks of inferring the presence/absence of indi-
viduals (given the maximum inference probabil-
ity) in generic differentially private algorithms.
• An in-depth theoretical analysis of our approach
is provided, and a set of experiments are
conducted to confirm the effectiveness of our
approach.
The rest of the paper is organized as follows. In
Section 2, we describe the preliminaries for differential
privacy. In Section 3, we present the analysis for two
representative existing works. Then, in Section 4, we
propose our interpretive inference model and the upper
bound for $\epsilon$ in differential privacy (given the maximum
inference probability). Section 5 demonstrates the
experimental results, and Section 6 reviews related
work. Finally, Section 7 gives the concluding remarks.
2. Preliminaries
In this section, we will first describe the basic
mechanism of differential privacy, and then present
the Laplace distribution, which underlies a generic differentially private approach.
2.1. Differential Privacy
The most commonly-used definition of differential privacy is $\epsilon$-differential privacy, which guarantees that any individual tuple has negligible influence on the published statistical results, in a probabilistic sense. Specifically, a randomized algorithm $A$ satisfies $\epsilon$-differential privacy if and only if, for any two databases $D_1$, $D_2$ that differ in exactly one record and any possible output $O$ of $A$, the ratio between the probability that $A$ outputs $O$ on $D_1$ and the probability that $A$ outputs $O$ on $D_2$ is bounded by a constant. Formally, we have

$$\frac{Prob(A(D_1) = O)}{Prob(A(D_2) = O)} \leq e^{\epsilon} \quad (2)$$
where $\epsilon$ is a constant specified by the user, $D_1$, $D_2$ differ in at most one element, and $e$ is the base of the natural logarithms. Intuitively, given the output $O$ of $A$, it is hard for the adversary to infer whether the original data is $D_1$ or $D_2$, if the parameter $\epsilon$ is sufficiently small. Similarly, $\epsilon$-differential privacy also provides any individual with plausible deniability that her/his record was in the databases.
The earliest and most widely-adopted approach for enforcing $\epsilon$-differential privacy is the Laplace mechanism [4], which works by injecting random noise $x \sim Lap(\lambda)$ that follows a Laplace distribution into the output $O$ of the original deterministic algorithm, so that the randomized algorithm $A$ returns $O + x$; that is, $A(D) = O + x$, where $\lambda = \frac{\Delta f}{\epsilon}$.
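As a minimal illustration of this mechanism (a sketch under our own naming, using numpy for the noise draw; not code from [4]), a count query can be answered as follows:

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
    """Add Laplace noise with scale lambda = sensitivity / epsilon,
    returning the randomized output A(D) = O + x."""
    rng = np.random.default_rng() if rng is None else rng
    return true_answer + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Count query over a toy dataset: the sensitivity (delta f) is 1, since
# adding or removing one record changes the count by at most 1.
ages = [25, 31, 47, 52, 60]
true_count = sum(1 for a in ages if a >= 40)
print(true_count, laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5))
```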