When Federated Learning Meets Blockchain: A
New Distributed Learning Paradigm
Chuan Ma, Member, IEEE, Jun Li, Senior Member, IEEE, Ming Ding, Senior Member, IEEE,
Long Shi, Member, IEEE, Taotao Wang, Member, IEEE,
Zhu Han, Fellow, IEEE, and H. Vincent Poor, Fellow, IEEE
Abstract—Motivated by the explosive growth of computing capabilities at end-user equipment, as well as growing privacy concerns over sharing sensitive raw data, a new machine learning paradigm, named federated learning (FL), has emerged. By training models locally at each client and aggregating the learned models at a central server, FL avoids sharing raw data directly, thereby reducing privacy leakage. However, the traditional FL framework relies heavily on a single central server and may fall apart if such a server behaves maliciously. To address this single point of failure, this work investigates a blockchain-assisted decentralized FL (BLADE-FL) framework, which can prevent malicious clients from poisoning the learning process, and further provides a self-motivated and reliable learning environment for clients. Specifically, the model aggregation process is fully decentralized, and the tasks of training for FL and mining for blockchain are integrated at each participant. In addition, we investigate the unique issues arising in this framework and provide analytical and experimental results to shed light on possible solutions.
Index Terms—Federated Learning, Blockchain, Privacy and
Security
I. INTRODUCTION
Future wireless networks are characterized by low latency and high reliability. Thus, machine learning (ML) embedded in each device is an appealing solution, enabling each user equipment (UE) to make decisions based on its local data even when it loses connectivity to the wireless system. Since the data at each device is limited, training on-device ML models often requires data exchange among UEs [1]. However, directly exchanging data among UEs poses serious risks of privacy leakage and information hijacking [2]. To reduce these risks, federated learning (FL) has been proposed as a new ML framework that trains a model across multiple UEs holding local datasets. In detail, FL trains machine learning models locally at distributed UEs; the UEs then share the parameters of the locally trained models with a central server (i.e., the aggregator), where a global model is aggregated. Therefore, the UEs under the FL framework can cooperatively learn a global model without exchanging their data directly. Moreover, FL has been applied to real-world applications, including health care and autonomous driving [3].

C. Ma, J. Li and L. Shi are with the School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing, China (e-mail: {chuan.ma, jun.li}@njust.edu.cn and slong1007@gmail.com).
M. Ding is with Data61, CSIRO, 2015, Australia (e-mail: Ming.Ding@data61.csiro.au).
T. Wang is with the College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, China (e-mail: ttwang@szu.edu.cn).
Z. Han is with the Department of Electrical and Computer Engineering, University of Houston, Houston, TX 77004, USA (e-mail: zhan2@uh.edu).
H. V. Poor is with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544, USA (e-mail: poor@princeton.edu).
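The FL workflow described above (local training at each UE, parameter upload, and server-side averaging) can be sketched as follows. This is a minimal illustrative example, not the paper's algorithm: it uses a toy quadratic objective and a FedAvg-style weighted average, and all function names and data are our own.

```python
import numpy as np

def local_update(weights, data, lr=0.1):
    """One step of local training on a client's private data.
    Toy quadratic loss: pull the weights toward the client's data mean."""
    grad = weights - data.mean(axis=0)  # gradient of 0.5 * ||w - mean(x)||^2
    return weights - lr * grad

def federated_round(global_weights, client_datasets):
    """One FL round: each UE trains locally; the aggregator averages
    only the model parameters -- raw data never leaves a client."""
    local_models = [local_update(global_weights.copy(), d) for d in client_datasets]
    sizes = np.array([len(d) for d in client_datasets], dtype=float)
    # FedAvg-style average, weighted by each client's local dataset size
    return np.average(local_models, axis=0, weights=sizes)

# Toy setup: three UEs, each holding a private 2-D dataset
rng = np.random.default_rng(0)
clients = [rng.normal(loc=c, size=(20, 2)) for c in (0.0, 1.0, 2.0)]

w = np.zeros(2)
for _ in range(50):
    w = federated_round(w, clients)
# w converges to the size-weighted mean of all client data,
# learned without any client ever sharing its raw samples
```

Note that only the vectors returned by `local_update` cross the network boundary in this sketch; the decentralized variant discussed later replaces the single aggregator with consensus among the participants themselves.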
Although FL shows its effectiveness in preserving privacy, it still suffers from several limitations. First, in the FL process, the single centralized aggregator is assumed to be trustworthy and to make fair decisions regarding user selection and aggregation. However, this assumption is not always appropriate, especially in real-world operations, because a biased aggregator can intentionally discriminate against a few selected UEs, thereby damaging the learning performance [1]. Second, FL is restricted to applications orchestrated by the centralized aggregator. As a result, the resilience of the system depends on the robustness of the central server, and a failure in the aggregator could collapse the entire FL network. Third, although local data is not explicitly shared in its original format, it is still possible for adversaries to approximately reconstruct the raw data, especially during the aggregation process; in particular, privacy leakage may occur during model aggregation via outsider attacks. Lastly, the existing design is vulnerable to malicious clients that might upload poisoned models to attack the FL network [4].
As a security technology, blockchain can tolerate single points of failure through distributed consensus, and it can further implement incentive mechanisms to encourage participants to contribute effectively to the system [5]. Therefore, blockchain has been introduced into FL to overcome the limitations mentioned above. In [5], a blockchained FL architecture was developed to verify the uploaded parameters, and the related system performance, such as the learning delay and the block generation rate, was investigated. Moreover, the work in [6] proposed a privacy-aware architecture that uses blockchain to enhance security when sharing the parameters of machine learning models with other UEs. In addition, the authors in [7] proposed a high-level but complicated framework that enables encryption during model transmission and provides incentives to participants, and the work in [8] further applied this framework to a military defense network. With advanced features of blockchain such as tamper-resistance, anonymity, and traceability, an immutable audit trail of ML models can be created for greater trustworthiness in tracking and proving provenance [9]. In addition, the security and privacy issues of the decentralized FL framework are investigated in [6], [10], [11], which delegate the responsibility of storing ML models to a trusted community in the blockchain. However, the assumption on the trust
arXiv:2009.09338v2 [cs.NI] 4 Jun 2021