"

5.6 Defences for Federated Learning

Federated Learning (FL) enables training a global model on data distributed across multiple clients. However, this decentralized approach also introduces unique vulnerabilities, particularly poisoning attacks from malicious clients. Unlike traditional centralized learning, FL therefore requires defences tailored to its setting; these include robust federated aggregation algorithms, robust training protocols, and post-training measures that safeguard the global model against adversarial actions.

Robust Federated Aggregation

Robust federated aggregation algorithms are designed to mitigate the impact of malicious updates during the aggregation process. These methods can be broadly categorized into two types:

  • Identifying and Down-weighting Malicious Updates: These algorithms focus on detecting and diminishing the influence of malicious client updates during aggregation.
  • Resistant Aggregation Without Malicious Client Identification: These methods make no attempt to identify malicious clients. Instead, they aggregate the updates in a manner that is inherently resistant to poisoning. A key approach in this category is to compute a “true center” of the model updates rather than relying on a simple weighted average; a minimal sketch of this idea follows the list.
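
To make the “true center” idea concrete, the following is a minimal NumPy sketch (not taken from the chapter) of one such center-based rule using a coordinate-wise median; the function name median_aggregate and the toy data are illustrative assumptions only.

import numpy as np

def median_aggregate(client_updates):
    """Aggregate flattened client updates with a coordinate-wise median.

    Because the median ignores extreme values in each coordinate, a small
    number of poisoned updates cannot drag the result arbitrarily far,
    unlike a plain (weighted) average.
    """
    stacked = np.stack(client_updates)   # shape: (n_clients, n_params)
    return np.median(stacked, axis=0)    # robust "center" per coordinate

# Example: nine honest clients plus one malicious client sending a huge update.
rng = np.random.default_rng(0)
honest = [rng.normal(0.0, 0.1, size=4) for _ in range(9)]
malicious = [np.full(4, 100.0)]                       # attempted poisoning
print(median_aggregate(honest + malicious))           # stays near the honest values
print(np.mean(np.stack(honest + malicious), axis=0))  # plain average is skewed

The same idea underlies related center-based rules such as the trimmed mean and the geometric median, which differ only in how the “center” of the updates is computed.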

Robust Federated Training

In addition to robust federated aggregation, several FL training protocols are designed to protect the training process itself from poisoning attacks by making it more resilient to malicious updates.

  • Clipping the norm of model updates and adding Gaussian noise. Clipping bounds how far any single client’s update can move the global model, while the added Gaussian noise (random noise drawn from a Gaussian, or normal, distribution) masks whatever influence remains from individual updates; a sketch of this step appears at the end of this subsection.
  • BaFFLe adds an extra validation phase to each training round, using global models trained in earlier rounds as a reference so that major changes can be detected. In this phase, a randomly chosen group of clients checks whether the current global model is poisoned: each client evaluates the model on its own private data and computes a validation function that compares the misclassification rates of the current model with those of previous models. If the misclassification rate is unusually high, this may indicate a backdoor attack, and the model is flagged as potentially poisoned.

If the validation clients find a problem, the server can reject the current global model and prevent the poisoning attack from spreading further.
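
To make the clipping-and-noise step from the first bullet above concrete, here is a minimal NumPy sketch; the function clip_and_noise_aggregate and the parameters clip_norm and noise_std are illustrative assumptions rather than the exact protocol of any particular FL system.

import numpy as np

def clip_and_noise_aggregate(client_updates, clip_norm=1.0, noise_std=0.01, seed=None):
    """Average client updates after L2-norm clipping, then add Gaussian noise.

    Clipping bounds how much any single (possibly malicious) client can move
    the global model; the added Gaussian noise further masks the residual
    influence of individual updates.
    """
    rng = np.random.default_rng(seed)
    clipped = []
    for update in client_updates:
        norm = np.linalg.norm(update)
        scale = min(1.0, clip_norm / (norm + 1e-12))  # shrink only oversized updates
        clipped.append(update * scale)
    aggregate = np.mean(np.stack(clipped), axis=0)
    return aggregate + rng.normal(0.0, noise_std, size=aggregate.shape)

# A poisoned update with a huge norm is clipped to the same budget as everyone else.
updates = [np.array([0.1, -0.2, 0.05]), np.array([50.0, 50.0, 50.0])]
print(clip_and_noise_aggregate(updates, clip_norm=1.0, noise_std=0.01, seed=0))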


Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses by Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, and Tom Goldstein is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.

License


Winning the Battle for Secure ML Copyright © 2025 by Bestan Maaroof is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.