SLOMS: A Privacy-Preserving Data Publishing Method for Multiple Sensitive Attributes


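This paper examines the challenges of publishing microdata with multiple sensitive attributes, in particular how to protect privacy while preserving data utility. Multi-dimension bucketization is a common way to anonymize multiple sensitive attributes, but data utility drops when the microdata contain many sensitive attributes. Moreover, such methods do not generalize the quasi-identifiers (QI), which leaves the anonymized data vulnerable to linking attacks.

To address these problems, the authors propose the SLOMS method. Its core idea is to vertically partition the sensitive attributes into several tables and to bucketize each table so that it satisfies l-diversity, which requires every bucket to contain at least l distinct sensitive values. SLOMS also generalizes the quasi-identifiers to satisfy k-anonymity, meaning that every anonymized group contains at least k individuals, so an attacker cannot single out an individual from the QI values alone.

To apply SLOMS to microdata with multiple sensitive attributes, the paper further proposes the MSB-KACA algorithm, which combines vertical partitioning, bucketization, and QI generalization so that privacy is protected while as much data utility as possible is retained. The experimental section evaluates the effectiveness and efficiency of MSB-KACA and, through comparison with existing methods, shows that SLOMS strikes a better balance between privacy protection and data quality when there are multiple sensitive attributes.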

SLOMS: A Privacy Preserving Data Publishing Method for Multiple Sensitive Attributes Microdata

Jianmin Han
Department of Computer Science and Technology
Zhejiang Normal University, Jinhua, 321004, Zhejiang, PRC
hanjm@zjnu.cn

Fangwei Luo, Jianfeng Lu and Hao Peng
Department of Computer Science and Technology
Zhejiang Normal University, Jinhua, 321004, Zhejiang, PRC

Abstract—Multi-dimension bucketization is a typical method for anonymizing multiple sensitive attributes. However, it leads to low data utility when the microdata have many sensitive attributes. In addition, it does not generalize the quasi-identifiers, which makes the anonymous data vulnerable to linking attacks. To address these problems, the paper proposes the SLOMS method. The method vertically partitions the multiple sensitive attributes into several tables and bucketizes each sensitive attribute table to implement l-diversity. At the same time, it generalizes the quasi-identifiers to implement k-anonymity. The paper also proposes the MSB-KACA algorithm to anonymize microdata with multiple sensitive attributes by SLOMS. Experiments show that SLOMS can generate anonymous tables with a lower suppression ratio and less distortion than generalization and MSB.

Index Terms—k-anonymity, l-diversity, multi-dimension bucketization method, SLOMS

I. INTRODUCTION

Microdata play an increasingly important role in data analysis and scientific research. However, publishing and sharing microdata threatens individuals’ privacy. Therefore, several anonymity models have recently been proposed to protect individuals’ privacy in microdata publishing. k-anonymity [1] is a simple and effective model for protecting privacy in microdata; it requires that each tuple in the released data be indistinguishable from at least k-1 other tuples with respect to the quasi-identifier. However, it cannot resist the homogeneity attack or the background knowledge attack, so enhanced anonymity models have been proposed, such as l-diversity [4] and t-closeness [5].
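To make the two models concrete, the following minimal Python sketch checks k-anonymity and the simplest (distinct-values) form of l-diversity on a toy table. The rows, attribute values, and function names are illustrative assumptions and do not come from the paper; the paper's own l-diversity variant may be stricter.

```python
from collections import defaultdict

def group_by_qi(rows, qi):
    """Group rows (dicts) by their quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[a] for a in qi)].append(row)
    return groups

def is_k_anonymous(rows, qi, k):
    """True if every QI group contains at least k rows."""
    return all(len(g) >= k for g in group_by_qi(rows, qi).values())

def is_distinct_l_diverse(rows, qi, sensitive, l):
    """True if every QI group holds at least l distinct sensitive values."""
    return all(
        len({r[sensitive] for r in g}) >= l
        for g in group_by_qi(rows, qi).values()
    )

# Hypothetical, already generalized rows (not the paper's Table I).
rows = [
    {"Age": "20-29", "ZipCode": "3210**", "Disease": "Flu"},
    {"Age": "20-29", "ZipCode": "3210**", "Disease": "Flu"},
    {"Age": "30-39", "ZipCode": "3211**", "Disease": "Flu"},
    {"Age": "30-39", "ZipCode": "3211**", "Disease": "Asthma"},
]
QI = ["Age", "ZipCode"]

print(is_k_anonymous(rows, QI, 2))                    # True
print(is_distinct_l_diverse(rows, QI, "Disease", 2))  # False
```

The toy table is 2-anonymous, yet its first group contains only one disease value, so anyone known to fall in that group is revealed to have that disease. This is the homogeneity attack that motivates l-diversity.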

Several techniques have also been proposed to implement the above anonymity models. Generalization [1-3] is a typical one; its idea is to replace the real values of the quasi-identifier with less specific but semantically consistent values. Generalization distorts the original data, which is disadvantageous for data mining. Anatomy [6] is another method for anonymizing microdata; its idea is to release all the quasi-identifier and sensitive values directly in two separate tables. However, releasing the QI values directly may lead to a higher breach probability than generalization. To overcome these drawbacks, Tao et al. [7] proposed ANGEL, a new anonymization method that is as effective as generalization in privacy protection while retaining higher data utility. Leela et al. [8] applied Angelization to preserve privacy in the re-publication of dynamic microdata after insertions or deletions. Li et al. [9] proposed slicing, which anonymizes microdata by partitioning them horizontally and vertically. Neha et al. [10] concluded that slicing preserves data utility better than generalization and that it also prevents membership disclosure.
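As a rough illustration of the generalization idea described in this paragraph, the sketch below replaces exact quasi-identifier values with coarser ones. The ten-year age ranges, the ZIP-code masking rule, and the sample records are assumptions chosen for illustration, not the generalization hierarchies used in the paper.

```python
def generalize_age(age, width=10):
    """Replace an exact age with a range such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def generalize_zip(zipcode, keep=4):
    """Keep the first `keep` characters and mask the rest with '*'."""
    return zipcode[:keep] + "*" * (len(zipcode) - keep)

# Hypothetical records; the values are made up for this example.
records = [
    {"Age": 23, "ZipCode": "321004"},
    {"Age": 27, "ZipCode": "321007"},
]

generalized = [
    {"Age": generalize_age(r["Age"]), "ZipCode": generalize_zip(r["ZipCode"])}
    for r in records
]
print(generalized)
```

After generalization both records carry the same quasi-identifier values, so they can no longer be told apart by Age and ZipCode. The exact ages and ZIP codes are lost, which is the distortion the text refers to.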

All of the above works focus on microdata with a single sensitive attribute. These methods lead to very low data utility when they are applied directly to microdata with multiple sensitive attributes. At present, only a few works concentrate on microdata with multiple sensitive attributes. Yang et al. [11] proposed a Multiple Sensitive Bucketization (MSB) approach. However, MSB is only suitable for microdata with few sensitive attributes, e.g., 2 to 3. For microdata with more sensitive attributes, MSB results in high suppression ratios. For example, Table I is an original dataset. We assume that {Gender, ZipCode, Age} are the quasi-identifier attributes and {Occupation, Salary, Physician, Disease} are the sensitive attributes. MSB can produce a 3-diversity table, shown in Table II. The anonymous table contains only one group of tuples, presented in Table II; all the remaining tuples are suppressed. The suppression ratio is 6/9, which greatly degrades the quality of the published data.
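The suppression ratio quoted above is the fraction of original tuples that do not appear in the published table. A minimal sketch of that arithmetic follows; the function name is an assumption, and the counts are taken directly from the 6/9 figure in the text.

```python
from fractions import Fraction

def suppression_ratio(num_suppressed, num_total):
    """Fraction of the original tuples that were suppressed."""
    if num_total <= 0 or not 0 <= num_suppressed <= num_total:
        raise ValueError("counts must satisfy 0 <= num_suppressed <= num_total, num_total > 0")
    return Fraction(num_suppressed, num_total)

# 6 of the 9 tuples are suppressed in the MSB example.
ratio = suppression_ratio(6, 9)
print(ratio, round(float(ratio), 3))
```

In this example two thirds of the records are withheld from the published table.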


JOURNAL OF SOFTWARE, VOL. 8, NO. 12, DECEMBER 2013

doi:10.4304/jsw.8.12.3096-3104
