BlockFLA: Accountable Federated Learning via Hybrid Blockchain Architecture
Conference’17, July 2017, Washington, DC, USA
desired performance metric (e.g., accuracy) on a validation dataset
maintained by the server.
2.2 Backdoor Attacks and Model Poisoning
Training time attacks against machine learning models can roughly be classified into two categories: targeted [3, 5, 9, 19] and untargeted attacks [4, 6]. In untargeted attacks, the adversarial task is to make the model converge to a sub-optimal minimum, or to make the model diverge completely. Such attacks have also been referred to as convergence attacks, and to some extent, they are easily detectable by observing the model’s accuracy on validation data.
On the other hand, in targeted attacks, the adversary wants the model to misclassify only a set of chosen samples while minimally affecting its performance on the main task. Such targeted attacks are also known as backdoor attacks. A prominent way of carrying out backdoor attacks is through trojans [9, 19]. A trojan is a carefully crafted pattern that is leveraged to cause the desired misclassification. For example, consider a classification task over cars and planes, and let the adversarial task be making the model classify blue cars as planes. The adversary could craft a brand logo, put it on some of the blue car samples in the training dataset, and mislabel only those as planes. The model would then potentially learn to classify blue cars with the brand logo as planes. At inference time, the adversary can present a blue car sample with the logo to the model to activate the backdoor. Ideally, since the model would behave correctly on blue cars that do not have the trojan, the backdoor would not be easy to detect on a clean validation dataset.
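The trojan insertion described above can be sketched as a small data-poisoning routine. This is an illustrative helper, not code from the paper; the function name, trigger shape (a bright square standing in for the "brand logo"), and poisoning fraction are all assumptions.

```python
import numpy as np

def poison_dataset(images, labels, target_label, trigger_value=1.0,
                   patch_size=3, fraction=0.05, seed=0):
    """Stamp a trigger patch onto a fraction of samples and relabel
    them as the target class (hypothetical helper, for illustration)."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = max(1, int(fraction * len(images)))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    # The "brand logo": a bright square in the bottom-right corner.
    images[idx, -patch_size:, -patch_size:] = trigger_value
    # Mislabel only the trojaned samples, e.g. blue car -> plane.
    labels[idx] = target_label
    return images, labels, idx
```

Because the remaining clean samples keep their correct labels, a model trained on this data can still perform well on a clean validation set while misclassifying any input that carries the trigger.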
In FL, the training data is decentralized and the aggregation server is only exposed to model updates. Consequently, backdoor attacks are typically carried out by constructing malicious updates. That is, the adversary tries to create an update that encodes the backdoor in such a way that, when it is aggregated with other updates, the aggregated model exhibits the backdoor. This has been referred to as a model poisoning attack [3, 5, 32]. For example, an adversary could control some of the participating agents in an FL instance and train their local models on trojaned datasets to construct malicious updates.
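As a rough sketch of why a single malicious update can dominate averaging-based aggregation, consider the model-replacement trick of scaling a backdoored model so that, after federated averaging with benign updates, the aggregate lands on the backdoored model. The function names and the assumption that benign agents submit the unchanged global model are illustrative simplifications, not the paper's threat model.

```python
import numpy as np

def fedavg(updates):
    """Plain federated averaging over a list of equal-shape parameter vectors."""
    return np.mean(updates, axis=0)

def malicious_update(backdoored_model, global_model, n_agents):
    """Model-replacement sketch: scale the backdoored direction so that
    averaging with (n_agents - 1) updates equal to the global model
    yields exactly the backdoored model. Illustrative, not definitive."""
    return n_agents * (backdoored_model - global_model) + global_model
```

Under these simplifying assumptions the attack is exact: averaging the scaled update with the benign ones reproduces the backdoored model, which is why robust aggregation and accountability mechanisms (as in BlockFLA) are needed.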
2.3 Blockchain
Blockchain was first introduced by Nakamoto as the underlying ledger of the now famous Bitcoin cryptocurrency [25]. Briefly, a blockchain is an append-only, distributed and replicated database. It allows the participants of a network to collectively maintain a sequence of data in a tamper-resilient way. More importantly, it does so without requiring a trusted third party, by invoking a consensus mechanism.
Informally, a blockchain network operates as follows: participants broadcast their data, and certain nodes called miners (or validators) gather and store the data they receive in wrapper structures called blocks. Through a consensus mechanism, the network elects a leader miner in a decentralized fashion for a sequence of epochs. The epoch leader broadcasts its block to the network and, having received the leader’s block, other nodes store it in their local memory, where each block maintains a hash-link to the previous block.
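The hash-link structure above can be sketched minimally: each block stores the hash of its predecessor, so altering any block invalidates every hash-link after it. The dictionary layout and function names below are illustrative, not a real blockchain implementation.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first block

def block_hash(block):
    """Hash of a block's payload and its link to the previous block."""
    payload = {"data": block["data"], "prev_hash": block["prev_hash"]}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

def append_block(chain, data):
    """Append a new block that hash-links to the current chain tip."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    block = {"data": data, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    chain.append(block)
    return chain

def verify_chain(chain):
    """Tamper check: recompute every hash and every hash-link."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_HASH
        if block["prev_hash"] != expected_prev:
            return False
    return True
```

Tampering with any stored block changes its hash, breaking the link held by its successor; this is what makes the ledger tamper-resilient rather than tamper-proof.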
The consensus algorithm that the blockchain network deploys may depend on whether or not the network is public. For example, Bitcoin operates on a public network, where anyone is free to join and there is no uniform view of the network across participants. It utilizes a cryptographic puzzle called Proof-of-Work [15] to achieve consensus. This makes tampering with the order of blocks computationally infeasible when the majority of the network participants follow the protocol honestly. In private networks, however, participants can employ more efficient consensus algorithms, such as PBFT [8]. This is because the identity and number of participants are known to every party, as access to such networks can be arbitrarily restricted.
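A toy version of the Proof-of-Work puzzle is a brute-force search for a nonce whose hash meets a difficulty target. The leading-zero-hex-digit target below is a common simplification of Bitcoin's actual difficulty encoding, used here purely for illustration.

```python
import hashlib

def proof_of_work(block_bytes, difficulty=2):
    """Search for a nonce such that sha256(block || nonce) starts with
    `difficulty` zero hex digits (simplified difficulty target)."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(block_bytes + str(nonce).encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1
```

Finding a valid nonce requires many hash evaluations on average, while verifying one takes a single hash; rewriting an earlier block would require redoing this work for it and every subsequent block, which is why tampering is computationally infeasible against an honest majority.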
We provide examples of a private and a public blockchain below, and note that there also exist hybrid architectures (as in this work) that combine both public and private blockchains.
2.3.1 Private Blockchain: Hyperledger Fabric. Hyperledger [2] is the umbrella project for many open source blockchains. Hyperledger Fabric, a permissioned blockchain, is one among them; it offers identifiable participants, high transaction throughput [14], and low transaction-confirmation latency [28], alongside privacy and confidentiality of transactions. Hyperledger promotes the usage of smart contracts called chaincode and pluggable consensus models for the confirmation of the underlying transactions committed on the ledger. The transaction order is maintained and visible to all peers participating on the network.
2.3.2 Public Blockchain: Ethereum. Ethereum [33] also possesses the capability to host smart contracts. However, the published smart contracts are public due to the permissionless nature of the blockchain, making every transaction transparent. Each Ethereum smart contract and participant has an account of its own. Ether, the cryptocurrency hosted on the Ethereum chain, is required to publish contracts, call functions and send transactions over the chain. This currency is stored in a wallet possessed by every participant on the blockchain and is spent in the form of Gas to make smart contract calls. Ethereum, however, offers low transaction throughput and high latency on transaction confirmation.
3 SYSTEM ARCHITECTURE
In this paper, we propose a practical system architecture that allows any Federated Learning algorithm to run efficiently and securely while enabling auditability. Our solution maintains a multi-factor approach to securely detect potential trojans introduced into the model over time and penalize the offending parties. The system has many components, each playing a critical role in accomplishing the overall goal.
3.1 Framework Setup
The overall BlockFLA framework assumes each participant trains the model on their local machine or on a separate virtual machine in the cloud. This assumption eliminates the expense of training the model on the chain and enhances data privacy. Alongside training the model locally, we consider the network to be an established TCP connection between the participants and the aggregation server, thus eliminating the overhead of establishing a connection every