5.4 Types of Backdoor Attacks
Trigger poisoning:
- Patch Trigger: The trigger is a small patch added to the input, for example a sticker or graffiti on a stop sign that causes an autonomous vehicle to misclassify the sign (see the patch-stamping sketch after this list).
- Clean-label Backdoors: The attacker leaves the labels of the poisoned samples unchanged, which makes the attack stealthier but requires more sophisticated techniques to ensure the model learns the trigger (see the clean-label sketch after this list).
- Dynamic Backdoors: The trigger's location or appearance varies across samples, making the backdoor harder to detect (the dynamic-placement sketch after this list randomizes the patch location).
- Functional Triggers: The trigger is embedded throughout the input or changes based on the input; for example, a steganographic trigger is hidden within an image, or a pattern is blended across the whole image at low opacity (see the blending sketch after this list).
Figure 5.4.1: Clean image with the blended Hello Kitty pattern. Image by Ruitao Hou, Teng Huang, Hongyang Yan, and Lishan Ke, FDEd (CAN).
- Semantic Triggers: The trigger is a physically perceptible, plausible feature of the input, so the modification retains the input's overall meaning: adding a sunglasses trigger to a face, altering a facial expression while keeping the identity intact, adding a bird in the sky, or pairing a dog image with a ball trigger.
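The patch attack can be implemented in a few lines. The following is a minimal NumPy sketch, not the method of any particular paper; the function name add_patch_trigger, the target_label parameter, and the assumption that images are H x W x C float arrays in [0, 1] are illustrative choices.

```python
import numpy as np

def add_patch_trigger(image, patch, target_label, loc=(0, 0)):
    """Stamp a small trigger patch onto an image and flip its label."""
    poisoned = image.copy()
    h, w = patch.shape[:2]               # patch is an h x w x C pixel block
    y, x = loc                           # top-left corner where the patch lands
    poisoned[y:y + h, x:x + w] = patch   # overwrite pixels with the trigger
    return poisoned, target_label        # dirty-label: the label is changed too
```

Poisoning even a few percent of the training set this way is typically enough for the model to associate the patch with target_label while behaving normally on clean inputs.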
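For clean-label backdoors, the trigger is added only to samples that already belong to the target class, so no label is ever changed. This is a simplified sketch; published clean-label attacks usually also perturb the poisoned images (for example, adversarially) so the model is forced to rely on the trigger, a step omitted here.

```python
def poison_clean_label(images, labels, patch, target_label, rate=0.1):
    """Stamp the trigger on a fraction of samples already labeled target_label."""
    rng = np.random.default_rng(0)
    candidates = np.flatnonzero(labels == target_label)  # target-class indices only
    chosen = rng.choice(candidates, size=int(rate * len(candidates)), replace=False)
    for i in chosen:
        images[i], _ = add_patch_trigger(images[i], patch, target_label)
    return images, labels  # labels are returned untouched: hence "clean-label"
```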
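A dynamic backdoor can be sketched by randomizing where the patch lands, so no single pixel region is consistently modified; again, the function name and signature are hypothetical.

```python
def add_dynamic_trigger(image, patch, target_label, rng):
    """Place the patch at a random location so its position varies per sample."""
    H, W = image.shape[:2]
    h, w = patch.shape[:2]
    y = int(rng.integers(0, H - h + 1))  # random top-left row
    x = int(rng.integers(0, W - w + 1))  # random top-left column
    return add_patch_trigger(image, patch, target_label, loc=(y, x))
```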
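A blended (functional) trigger of the kind shown in Figure 5.4.1 spans the entire input at low opacity instead of occupying a patch. A minimal sketch, assuming image and pattern have the same shape and values in [0, 1]:

```python
def add_blended_trigger(image, pattern, target_label, alpha=0.1):
    """Alpha-blend a full-size pattern (e.g., Hello Kitty) into the image."""
    poisoned = (1.0 - alpha) * image + alpha * pattern  # convex pixel-wise mix
    return np.clip(poisoned, 0.0, 1.0), target_label
```

At a small alpha the pattern is faint to a human observer yet can still serve as a learnable trigger.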