Communication-Efficient Federated Deep Learning
with Asynchronous Model Update and Temporally
Weighted Aggregation
Yang Chen, Xiaoyan Sun, and Yaochu Jin
Abstract—Federated learning obtains a central model on
the server by aggregating models trained locally on clients.
As a result, federated learning does not require clients to
upload their data to the server, thereby preserving the data
privacy of the clients. One challenge in federated learning
is to reduce the client-server communication since the end
devices typically have very limited communication bandwidth.
This paper presents an enhanced federated learning technique
by proposing an asynchronous learning strategy on the clients
and a temporally weighted aggregation of the local models
on the server. In the asynchronous learning strategy, different
layers of the deep neural networks are categorized into shallow
and deep layers, and the parameters of the deep layers are
updated less frequently than those of the shallow layers.
Furthermore, a temporally weighted aggregation strategy is
introduced on the server to make use of the previously trained
local models, thereby enhancing the accuracy and convergence
of the central model. The proposed algorithm is empirically evaluated on
two datasets with different deep neural networks. Our results
demonstrate that the proposed asynchronous federated deep
learning outperforms the baseline algorithm both in terms of
communication cost and model accuracy.
Index Terms—Federated learning, Deep neural network, aggregation, asynchronous learning, temporally weighted aggregation
I. INTRODUCTION
Smart phones, wearable gadgets, and distributed wireless
sensors usually generate huge volumes of privacy sensitive
data. In many cases, service providers are interested in
mining information from these data to provide personalized
services, for example, to make more relevant recommenda-
tions to clients. However, the clients are usually not willing
to allow the service provider to access the data for privacy
reasons.
This work is supported by the National Natural Science Foundation of China with Grant No. 61473298 and 61876184. (Corresponding author: Yaochu Jin)
Y. Chen and X. Sun are with the School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China. Y. Chen and X. Sun contributed equally to this work and are co-first authors. (e-mail: fedora.cy@gmail.com; xysun78@hotmail.com)
Y. Jin is with the Department of Computer Science, University of Surrey, Guildford, GU2 7XH, United Kingdom. (Email: yaochu.jin@surrey.ac.uk)

Federated learning is a recently proposed privacy-preserving machine learning framework [1]. The main idea is to train local models on the clients, send the model parameters to the server, and then aggregate the local models on the server. Since all local models are trained upon data that are locally stored on the clients, the data privacy can be preserved. The whole process of typical federated learning is divided into communication rounds, in which the local models on the clients are trained on their local datasets. For the k-th client, where k ∈ S and S denotes the participating subset of m clients, its training samples are denoted as P_k and the trained local model is represented by the model parameter vector ω^k. In each communication round, only the clients belonging to the subset S download the parameters of the central model from the server and use them as the initial values of their local models. Once the local training is completed, the participating clients send the updated parameters back to the server. Consequently, the central model can be updated by aggregating the updated local models, i.e., ω = Agg(ω^k) [2], [3], [1].
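To make the aggregation step concrete, the following is a minimal sketch of one common choice of the operator Agg, namely the data-size-weighted average used in FedAvg [1]; the function and variable names are illustrative and not part of the original formulation.

```python
import numpy as np

def aggregate(local_weights, local_sizes):
    """Aggregate the uploaded local parameter vectors (one per client in S)
    into the central model parameters.

    local_weights: list of flattened parameter vectors ω^k
    local_sizes:   list of local sample counts |P_k|, used as aggregation weights
    """
    total = float(sum(local_sizes))
    # Data-size-weighted average of the local models (FedAvg-style aggregation)
    return sum((n_k / total) * np.asarray(w_k)
               for w_k, n_k in zip(local_weights, local_sizes))
```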
In this setting, the local model of each client can be any type of machine learning model, chosen according to the task to be accomplished. In most existing work on federated learning [1], deep neural networks (DNNs), e.g., long short-term memory (LSTM) networks, are employed to conduct word- and character-level text prediction tasks. In recent years, DNNs have been successfully applied to many complex problems, including text classification, image classification, and speech recognition [4], [5], [6]. Therefore, DNNs are widely adopted as the local model in federated learning, and stochastic gradient descent (SGD) is the most popular learning algorithm for training the local models.
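As an illustration of such a local update, a minimal client-side training routine with mini-batch SGD might look as follows (a PyTorch-style sketch; the function name, loss, and hyperparameters are assumptions for illustration, not the authors' exact configuration):

```python
import torch

def local_update(model, data_loader, epochs=1, lr=0.01):
    """Train the downloaded central model on one client's local data with SGD.

    Only the updated parameters are returned to the server;
    the raw local data never leave the client.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in data_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
    return model.state_dict()
```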
As aforementioned, one communication round includes
parameter download (on clients), local training (on clients),
trained parameter upload (on clients), and model aggregation
(on the server). Such a framework appears to be similar to
distributed machine learning algorithms [7], [8], [9], [10],
[11], [12]. In federated learning, however, only the models’
parameters are uploaded and downloaded between the clients
and server, and the data of local clients are not uploaded to
the server or exchanged between the clients. Accordingly,
the data privacy of each client can be preserved.
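Putting these steps together, one communication round can be sketched as follows, building on the local_update and aggregate sketches above; the client sampling fraction and the Client object with a data_loader attribute are illustrative assumptions.

```python
import copy
import random

def communication_round(server_model, clients, fraction=0.1):
    """One round: parameter download, local training, parameter upload,
    and model aggregation. Only model parameters travel between the
    clients and the server; the local datasets stay on the clients.
    """
    # Select the participating subset S of clients for this round
    m = max(1, int(fraction * len(clients)))
    subset = random.sample(clients, m)

    local_states, local_sizes = [], []
    for client in subset:
        local_model = copy.deepcopy(server_model)              # download central parameters
        state = local_update(local_model, client.data_loader)  # train on the local data
        local_states.append(state)                              # upload updated parameters
        local_sizes.append(len(client.data_loader.dataset))

    # Aggregate the uploaded parameters into the new central model
    # (assumes all state-dict entries are floating-point tensors)
    total = float(sum(local_sizes))
    new_state = {
        key: sum((n / total) * s[key] for s, n in zip(local_states, local_sizes))
        for key in local_states[0]
    }
    server_model.load_state_dict(new_state)
    return server_model
```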
Compared with other machine learning paradigms, federated learning is subject to the following challenges [1], [13]:
1) Unbalanced data: The data amount on different
clients may be highly imbalanced because there are
light and heavy users.