2.8 Chapter Summary
Key Takeaways
- Growing Importance of ML Security
  - The rapid deployment of machine learning (ML) models in real-world applications and critical infrastructure has elevated the need for robust security measures.
- Threat Modelling in ML
  - Understanding potential threats involves identifying attack vectors and threat actors, and developing both proactive and reactive strategies.
- Nature of ML Attacks
  - Attacks can be causative (affecting training data) or exploratory (probing trained models).
  - Adversaries may aim to manipulate outputs, extract confidential data, or exploit vulnerabilities.
  - Attacks can be targeted (specific goal) or indiscriminate (general disruption).
- Defence Strategies
  - Defences include adversarial machine learning techniques and secure-by-design approaches.
  - Security requires continuous innovation to stay ahead of evolving attack methods.
- Need for Cross-Sector Collaboration
  - Effective AI security relies on cooperation among academia, industry, and government to create resilient systems.
- Three Golden Rules of ML Security
  - Know Your Adversary
  - Be Proactive
  - Protect Yourself
OpenAI. (2025). ChatGPT [Large language model]. https://chat.openai.com/chat
Prompt: Can you generate key takeaways for this chapter content?
Key Terms
- Attack: An attempt to compromise a machine learning system, for example by accessing the training data and manipulating samples in a way that degrades the classifier's accuracy when the model is retrained.
- Attack Surfaces: Entry points where adversaries can target a machine learning system.
- Availability Attack: An attack that aims to increase the number of false-positive cases rather than false negatives, degrading the system until it becomes unreliable.
- Causative Attack: An attack in which the attacker has the capability to modify the distribution of the training data (a minimal poisoning sketch follows this list).
- Dictionary Attack: A technique that attacks the model using a broad set of dictionary words, for example by flooding a spam filter's training data with common words so that legitimate messages are later misclassified.
- Exploratory Attack: An attack that gains information about the training and test datasets in order to identify the model's decision boundary (see the surrogate-model sketch after this list).
- Focused Attack: An attack typically focused on one particular type of text.
- Indiscriminate Attack: An attack that targets all instances of a particular class, intending to degrade the model's overall performance.
- Integrity Attack: An attack whose main intention is to increase the number of false-negative cases, so that malicious inputs go undetected.
- Reactive Security: Designers respond by updating security measures as attackers develop methods to bypass defences.
- Targeted Attack: An attack that targets one particular case and tries to degrade the model's performance on that case.
- Threat Actors: Entities that exploit machine learning system vulnerabilities.
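To make the causative (poisoning) terms above concrete, here is a minimal Python sketch; the dataset, the logistic-regression model, and the 30% label-flip rate are illustrative assumptions, not details from this chapter. An adversary who can alter training labels degrades the accuracy of the classifier once the victim retrains on the poisoned data.

```python
# Minimal sketch of a causative (poisoning) attack: flipping a fraction of
# training labels degrades the accuracy of the retrained classifier.
# Dataset, model, and poisoning rate are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Clean baseline: train and evaluate on unmodified data.
clean = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("clean accuracy:   ", clean.score(X_test, y_test))

# Causative step: the adversary flips 30% of the training labels,
# then the victim retrains on the poisoned set.
poisoned_y = y_train.copy()
idx = rng.choice(len(poisoned_y), size=int(0.3 * len(poisoned_y)), replace=False)
poisoned_y[idx] = 1 - poisoned_y[idx]
poisoned = LogisticRegression(max_iter=1000).fit(X_train, poisoned_y)
print("poisoned accuracy:", poisoned.score(X_test, y_test))
```

Running this typically shows the poisoned model scoring well below the clean baseline, which is the kind of degradation the causative and availability definitions describe.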
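By contrast, an exploratory attacker never touches the training data. The sketch below, again with an assumed victim model and query budget rather than anything prescribed by this chapter, probes a deployed "black-box" classifier and fits a surrogate to approximate its decision boundary.

```python
# Minimal sketch of an exploratory attack: the adversary only queries the
# deployed model, records its answers, and trains a surrogate that mimics
# its decision boundary. Models and query budget are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=5, random_state=1)
victim = DecisionTreeClassifier(random_state=1).fit(X, y)  # black box to the attacker

# Exploratory step: probe the black box with random queries and record
# the labels it returns.
rng = np.random.default_rng(1)
queries = rng.uniform(X.min(), X.max(), size=(2000, X.shape[1]))
labels = victim.predict(queries)

# Fit a surrogate on the query/response pairs; high agreement on fresh
# probes means the attacker has largely recovered the decision boundary.
surrogate = LogisticRegression(max_iter=1000).fit(queries, labels)
probe = rng.uniform(X.min(), X.max(), size=(2000, X.shape[1]))
agreement = (surrogate.predict(probe) == victim.predict(probe)).mean()
print(f"surrogate agrees with victim on {agreement:.0%} of fresh probes")
```

Note the design contrast with the poisoning sketch: here the attack surface is the prediction API alone, which is why defences against exploratory attacks often focus on limiting or monitoring queries.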