Federated Learning for Smart Healthcare: A Survey • 7
but this incurs long data processing delays. A possible solution is to exchange data between medical
sites to support training, but given institutional policies and growing user privacy concerns, it
is not easy to obtain data from other sites to train the AI model [27]. Hence, solving the problem of
dataset shortage is of paramount importance for smart healthcare system design.
• Limited Health Data Training Performance: Due to the lack of datasets, training at a single medical site
cannot achieve the desired degree of accuracy, e.g., disease classification accuracy. The causes include
imbalanced data features and insufficient data sizes. One can use data
augmentation techniques such as generative adversarial networks (GANs) [28] to mitigate these issues, but the
augmented data may still lack the diversity needed to build a comprehensive dataset for efficient training. This is also
one of the most critical challenges in applying AI in healthcare, where training becomes more difficult
due to the limited datasets.
• High Costs in Health Data Training: In traditional AI-based smart healthcare systems, offloading
health data to the cloud for execution incurs excessive network latency [29], especially since medical
data often have large sizes (e.g., audio, images). Moreover, health data transfer consumes considerable
network bandwidth, which is likely to cause network congestion as the number of devices increases.
The offloading process also consumes the transmit power of medical devices, which in turn poses new challenges
for battery and hardware design on devices.
3.1.2 Benefits of FL in Smart Healthcare. Owing to its operational concept, FL can bring many
attractive benefits to smart healthcare, as explained below:
• Data Privacy Improvement: In an FL-based smart healthcare system, only local updates such as model
gradients are required by the central server for AI training, while the local health data are kept at local
medical sites and devices. This reduces the risk of leaking sensitive user information to
external third parties, thus providing a higher degree of user privacy [30]. Given increasingly
stringent health data privacy legislation, the capability of FL to preserve health user information
is important for building sustainable and safe smart healthcare systems.¹
• Reasonable Trade-off between Accuracy and Utility: Compared with conventional centralized learning, FL is
able to offer a reasonable trade-off between accuracy and utility along with privacy enhancement. Moreover,
FL training retains model generalizability at the cost of a nominal accuracy loss. In return, FL can enhance
the scalability of the smart healthcare system thanks to its distributed learning feature.
• Low-cost Health Data Training: By avoiding the offloading of huge data volumes to the server, FL can
significantly reduce the communication costs, e.g., latency and transmit power, consumed by raw data
transmission, as model gradients generally have much smaller sizes than the actual datasets
[32]. As a result, FL also saves considerable network bandwidth and mitigates the possibility of network congestion in
massive healthcare networks.
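To make the first and third benefits above concrete, the following is a minimal sketch of gradient-based FL rounds, assuming a toy linear model and synthetic data in place of real health records. It uses sample-size-weighted gradient averaging (often called FedSGD), which is only one possible aggregation rule and is not claimed to be the method of any work surveyed here. All names (`local_gradient`, `aggregate`, `federated_round`, `make_site`) and all numbers are hypothetical and chosen for illustration only.

```python
import random

def local_gradient(weights, features, labels):
    """Runs at a medical site: mean-squared-error gradient of a linear model."""
    n = len(features)
    grad = [0.0] * len(weights)
    for x, y in zip(features, labels):
        error = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xi in enumerate(x):
            grad[j] += 2.0 * error * xi / n
    return grad

def aggregate(gradients, sizes):
    """Runs at the server: average site gradients, weighted by sample count."""
    total = sum(sizes)
    dim = len(gradients[0])
    return [sum(g[j] * s for g, s in zip(gradients, sizes)) / total
            for j in range(dim)]

def federated_round(weights, sites, lr=0.1):
    """One communication round: only gradients leave the sites, never data."""
    grads = [local_gradient(weights, X, y) for X, y in sites]
    sizes = [len(X) for X, _ in sites]
    avg = aggregate(grads, sizes)
    return [w - lr * g for w, g in zip(weights, avg)]

def make_site(n, true_w, rng):
    """Synthetic stand-in for one site's private dataset (assumption)."""
    X = [[rng.uniform(-1, 1) for _ in true_w] for _ in range(n)]
    y = [sum(w * xi for w, xi in zip(true_w, x)) for x in X]
    return X, y

rng = random.Random(0)
true_w = [2.0, -1.0, 0.5]                      # hypothetical ground truth
sites = [make_site(n, true_w, rng) for n in (200, 300, 500)]

rounds = 200
weights = [0.0] * len(true_w)
for _ in range(rounds):
    weights = federated_round(weights, sites)

# Communication accounting, counted in scalar values rather than bytes.
values_per_update = len(weights)
values_uploaded_total = rounds * len(sites) * values_per_update
values_in_raw = sum(len(X) * (len(X[0]) + 1) for X, _ in sites)

print("learned weights:", [round(w, 3) for w in weights])
print("values uploaded over all rounds:", values_uploaded_total)
print("values in raw datasets:", values_in_raw)
```

The saving shown here depends on the model being small relative to the data. For large models or many communication rounds, the cumulative size of the updates can approach or exceed that of the raw data, so the comparison should be checked for each deployment.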
3.2 Requirements
To realize the full potential of FL in smart healthcare, several requirements should be met as highlighted below:
3.2.1 Trusted Server. One of the most important entities in FL is the central server, which aggregates local
model gradients to build the global model in each communication round. Although the FL concept can provide
privacy protection by allowing users to keep their data at local sites during training, it has been shown that
model updates might still contain health user-related information, such as data features and image resolution,
that can be reconstructed by a curious global server [33]. As a result, user privacy can be put at risk during the
¹ However, it should be noted that FL cannot fully address the privacy problem in smart healthcare [31]. Dedicated privacy protection
mechanisms thus need to be designed to enhance FL in healthcare networks.
ACM Comput. Surv., Vol. 1, No. 1, Article . Publication date: November 2021.