5.3 Backdoor Attack Scenarios
In this section, we review the main scenarios in which a backdoor attack can be executed, each with its own challenges and implications:
Outsourced Training:
- Scenario: A user aims to train a model using a training dataset but outsources the training process to an external trainer. The trainer returns the trained model, which the user verifies using a validation dataset.
- Attack: A malicious trainer returns a backdoored model that meets the accuracy requirements on the validation set but misclassifies inputs containing the backdoor trigger, typically by poisoning a portion of the training data before training (see the sketch after this list).
- Implications: The user may unknowingly deploy a compromised model, leading to potential security breaches.
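To make the poisoning step concrete, the following minimal sketch shows how a malicious trainer could stamp a small trigger patch onto a fraction of the training images and relabel them to an attacker-chosen class before training. The trigger pattern, poison rate, and target label here are illustrative assumptions, not parameters of any specific published attack.

```python
import numpy as np

def add_trigger(image, patch_value=1.0, patch_size=3):
    """Stamp a small square trigger in the bottom-right corner (hypothetical trigger)."""
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:] = patch_value
    return poisoned

def poison_dataset(images, labels, target_label, poison_rate=0.05, seed=0):
    """Poison a fraction of samples: stamp the trigger and relabel to target_label."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        images[i] = add_trigger(images[i])
        labels[i] = target_label
    return images, labels, idx

# Toy example: a batch of 28x28 grayscale "images" with 10 classes
clean_x = np.random.rand(100, 28, 28).astype(np.float32)
clean_y = np.random.randint(0, 10, size=100)
poisoned_x, poisoned_y, poisoned_idx = poison_dataset(clean_x, clean_y, target_label=7)
print(f"poisoned {len(poisoned_idx)} of {len(clean_x)} samples")
```

A model trained on such a dataset can still reach high accuracy on a clean validation set, since only a small fraction of samples carry the trigger, which is why the user's verification step may not detect the attack.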
Transfer Learning:
- Scenario: A user downloads a pre-trained model from an online repository and fine-tunes it for a new application on their own private data.
- Attack: The pre-trained model is backdoored, and the fine-tuned model inherits the backdoor, causing misclassification of triggered inputs while maintaining high accuracy on clean data (see the sketch after this list).
- Implications: The user’s application may be compromised, leading to incorrect predictions and potential security risks.
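The sketch below illustrates, under simplified assumptions, why the backdoor can survive fine-tuning: a common transfer-learning recipe freezes the downloaded feature extractor and trains only a new task-specific head, so any trigger-sensitive behavior encoded in the frozen weights is inherited unchanged. The PyTorch backbone, layer sizes, and class count are placeholders, not a real model-zoo checkpoint.

```python
import torch
import torch.nn as nn

# Stand-in for a feature extractor downloaded from an untrusted repository.
# In the attack scenario, the backdoor lives in these (later frozen) weights.
class PretrainedBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
    def forward(self, x):
        return self.features(x)

backbone = PretrainedBackbone()          # assume weights loaded from the online repository
for p in backbone.parameters():
    p.requires_grad = False              # common practice: freeze the pre-trained layers

head = nn.Linear(16, 5)                  # new task-specific head (5 classes, illustrative)
model = nn.Sequential(backbone, head)

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One fine-tuning step on clean data; only the head is updated, so the
# possibly backdoored features are carried over into the fine-tuned model.
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 5, (8,))
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```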
Federated Learning:
- Scenario: Multiple participants collaboratively train a model without sharing their private data. The central server aggregates updates from participants to improve the model.
- Attack: A malicious participant submits poisoned model updates, embedding a backdoor into the jointly trained model. The model behaves correctly on clean data but misclassifies triggered inputs (see the sketch after this list).
- Implications: The integrity of the federated learning process is compromised, and the model may be used to carry out targeted attacks.
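As an illustration, the following sketch shows a FedAvg-style aggregation together with a model-replacement-style poisoned update, in which a single malicious participant scales its submission so that the averaged model lands close to a backdoored one. The model dimension and the stand-in "backdoored" weights are purely illustrative assumptions.

```python
import numpy as np

def fedavg(updates, weights=None):
    """Server-side FedAvg: weighted average of the participants' submitted models."""
    if weights is None:
        weights = np.ones(len(updates)) / len(updates)
    return sum(w * u for w, u in zip(weights, updates))

def malicious_update(global_model, backdoored_model, n_participants):
    """Model-replacement-style poisoning (hypothetical attacker): scale the
    difference so that the aggregated model is pulled toward the backdoored one."""
    return global_model + n_participants * (backdoored_model - global_model)

n, dim = 10, 4
global_model = np.zeros(dim)
honest_updates = [global_model + 0.01 * np.random.randn(dim) for _ in range(n - 1)]
backdoored_model = np.full(dim, 0.5)   # stands in for weights that encode the trigger

poisoned = malicious_update(global_model, backdoored_model, n)
new_global = fedavg(honest_updates + [poisoned])
print("aggregated model:", np.round(new_global, 3))  # close to the backdoored weights
```

Because the server only sees parameter updates and never the participants' private data, such a poisoned contribution is difficult to distinguish from a legitimate one, which is what makes this scenario particularly challenging to defend against.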