5.4 Types of Backdoor Attacks
Trigger poisoning:
- Patch Trigger: The trigger is a small patch added to the input, for example a sticker or graffiti on a stop sign that causes an autonomous vehicle to misclassify the sign (see the patch-stamping sketch after this list).
- Clean-label Backdoors: The attacker leaves the labels of the poisoned samples unchanged, which makes the attack stealthier but requires more sophisticated techniques to ensure the model learns the trigger (see the clean-label sketch after this list).
- Dynamic Backdoors: The trigger's location or appearance varies across samples, making the backdoor harder to detect (the dynamic-placement sketch after this list randomizes the patch location).
- Functional Triggers: The trigger is embedded throughout the input or changes based on the input; for example, a steganographic trigger is hidden within an image, or a pattern is blended across the whole image at low opacity (see the blending sketch after this list).
Figure 5.4.1: Clean image with the blended Hello Kitty pattern. Image by Ruitao Hou, Teng Huang, Hongyang Yan, and Lishan Ke, FDEd (CAN).
- Semantic Triggers: The trigger is a physically perceptible, plausible feature of the input, so the modification retains the input's overall meaning: adding a sunglasses trigger to a face, altering a facial expression while keeping the identity intact, adding a bird in the sky, or pairing a dog image with a ball trigger.
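The patch attack can be implemented in a few lines. The following is a minimal NumPy sketch, not the method of any particular paper; the function name add_patch_trigger, the target_label parameter, and the assumption that images are H x W x C float arrays in [0, 1] are illustrative choices.

```python
import numpy as np

def add_patch_trigger(image, patch, target_label, loc=(0, 0)):
    """Stamp a small trigger patch onto an image and flip its label."""
    poisoned = image.copy()
    h, w = patch.shape[:2]               # patch is an h x w x C pixel block
    y, x = loc                           # top-left corner where the patch lands
    poisoned[y:y + h, x:x + w] = patch   # overwrite pixels with the trigger
    return poisoned, target_label        # dirty-label: the label is changed too
```

Poisoning even a few percent of the training set this way is typically enough for the model to associate the patch with target_label while behaving normally on clean inputs.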
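For clean-label backdoors, the trigger is added only to samples that already belong to the target class, so no label is ever changed. This is a simplified sketch; published clean-label attacks usually also perturb the poisoned images (for example, adversarially) so the model is forced to rely on the trigger, a step omitted here.

```python
def poison_clean_label(images, labels, patch, target_label, rate=0.1):
    """Stamp the trigger on a fraction of samples already labeled target_label."""
    rng = np.random.default_rng(0)
    candidates = np.flatnonzero(labels == target_label)  # target-class indices only
    chosen = rng.choice(candidates, size=int(rate * len(candidates)), replace=False)
    for i in chosen:
        images[i], _ = add_patch_trigger(images[i], patch, target_label)
    return images, labels  # labels are returned untouched: hence "clean-label"
```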
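A dynamic backdoor can be sketched by randomizing where the patch lands, so no single pixel region is consistently modified; again, the function name and signature are hypothetical.

```python
def add_dynamic_trigger(image, patch, target_label, rng):
    """Place the patch at a random location so its position varies per sample."""
    H, W = image.shape[:2]
    h, w = patch.shape[:2]
    y = int(rng.integers(0, H - h + 1))  # random top-left row
    x = int(rng.integers(0, W - w + 1))  # random top-left column
    return add_patch_trigger(image, patch, target_label, loc=(y, x))
```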
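A blended (functional) trigger of the kind shown in Figure 5.4.1 spans the entire input at low opacity instead of occupying a patch. A minimal sketch, assuming image and pattern have the same shape and values in [0, 1]:

```python
def add_blended_trigger(image, pattern, target_label, alpha=0.1):
    """Alpha-blend a full-size pattern (e.g., Hello Kitty) into the image."""
    poisoned = (1.0 - alpha) * image + alpha * pattern  # convex pixel-wise mix
    return np.clip(poisoned, 0.0, 1.0), target_label
```

At a small alpha the pattern is faint to a human observer yet can still serve as a learnable trigger.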