├── images ├── .gitkeep ├── Tay.png ├── Bosch1.PNG ├── Msft1.PNG ├── OpenAI.png ├── mitre.png ├── msft2.png ├── AdvML101.PNG ├── cylance.png ├── AttackOnMT.png ├── ClearviewAI.png ├── ProofPoint.png ├── VirusTotal.png ├── color_advml.png ├── color_cyber.png ├── paloalto1.png ├── paloalto2.png ├── AdvML101_Client.PNG ├── AdvML101_Inference.PNG ├── AdvML101_Traintime.PNG ├── AdvMLThreatMatrix.jpg └── FacialRecognitionANT.png ├── pages ├── things-to-keep-in-mind-before-you-use-the-matrix.md ├── contributors.md ├── feedback.md ├── structure-of-adversarial-ml-threat-matrix.md ├── why-adversarial-ml-threat-matrix.md ├── adversarial-ml-101.md ├── case-studies-page.md └── adversarial-ml-threat-matrix.md └── readme.md /images/.gitkeep: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /images/Tay.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/Tay.png -------------------------------------------------------------------------------- /images/Bosch1.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/Bosch1.PNG -------------------------------------------------------------------------------- /images/Msft1.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/Msft1.PNG -------------------------------------------------------------------------------- /images/OpenAI.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/OpenAI.png -------------------------------------------------------------------------------- /images/mitre.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/mitre.png -------------------------------------------------------------------------------- /images/msft2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/msft2.png -------------------------------------------------------------------------------- /images/AdvML101.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/AdvML101.PNG -------------------------------------------------------------------------------- /images/cylance.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/cylance.png -------------------------------------------------------------------------------- /images/AttackOnMT.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/AttackOnMT.png -------------------------------------------------------------------------------- /images/ClearviewAI.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/ClearviewAI.png -------------------------------------------------------------------------------- /images/ProofPoint.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/ProofPoint.png -------------------------------------------------------------------------------- /images/VirusTotal.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/VirusTotal.png -------------------------------------------------------------------------------- /images/color_advml.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/color_advml.png -------------------------------------------------------------------------------- /images/color_cyber.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/color_cyber.png -------------------------------------------------------------------------------- /images/paloalto1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/paloalto1.png -------------------------------------------------------------------------------- /images/paloalto2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/paloalto2.png -------------------------------------------------------------------------------- /images/AdvML101_Client.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/AdvML101_Client.PNG -------------------------------------------------------------------------------- /images/AdvML101_Inference.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/AdvML101_Inference.PNG -------------------------------------------------------------------------------- /images/AdvML101_Traintime.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/AdvML101_Traintime.PNG -------------------------------------------------------------------------------- /images/AdvMLThreatMatrix.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/AdvMLThreatMatrix.jpg -------------------------------------------------------------------------------- /images/FacialRecognitionANT.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ashishpatel26/advmlthreatmatrix/master/images/FacialRecognitionANT.png -------------------------------------------------------------------------------- /pages/things-to-keep-in-mind-before-you-use-the-matrix.md: -------------------------------------------------------------------------------- 1 | ## Things to keep in mind before you use the framework: 2 | 1. 
This is a **first cut attempt** at collating known adversary techniques against ML Systems. We plan to iterate on the framework based on feedback from the security and adversarial machine learning community (please engage with us and help make the matrix better!). Note: This is a *living document* that will be routinely updated. 3 | 2. Only known bad is listed in the Matrix. Adversarial ML is an active area of research with new classes of attacks constantly being discovered. If you find a technique that is not listed, please add it to the framework (see the section on Feedback). 4 | 3. We are not prescribing definitive defenses at this point since the field has not reached consensus. We are already in conversations to add best practices in future revisions, such as adversarial training for adversarial examples and restricting the number of significant digits in confidence scores for model stealing. 5 | 4. This is not a risk prioritization framework - the Threat Matrix only collates the known techniques; it does not provide a means to prioritize the risks. 6 | -------------------------------------------------------------------------------- /pages/contributors.md: -------------------------------------------------------------------------------- 1 | ## Contributors 2 | 3 | Want to get involved? See [Feedback and Contact Information](/readme.md#feedback-and-getting-involved). 4 | 5 | | **Organization** | **Contributors** | 6 | | :--- | :--- | 7 | | Microsoft | Ram Shankar Siva Kumar, Hyrum Anderson, Suzy Schapperle, Blake Strom, Madeline Carmichael, Matt Swann, Nick Beede, Kathy Vu, Andi Comissioneru, Sharon Xia, Mario Goertzel, Jeffrey Snover, Abhishek Gupta | 8 | | MITRE | Mikel D. Rodriguez, Christina E. Liaghati, Keith R. Manville, Michael R. Krumdick, Josh Harguess | 9 | | Bosch | Manojkumar Parmar | 10 | | IBM | Pin-Yu Chen | 11 | | NVIDIA | David Reber Jr., Keith Kozo, Christopher Cottrell, Daniel Rohrer | 12 | | Airbus | Adam Wedgbury | 13 | | Deep Instinct | Nadav Maman | 14 | | TwoSix | David Slater | 15 | | University of Toronto | Adelin Travers, Jonas Guan, Nicolas Papernot | 16 | | Cardiff University | Pete Burnap | 17 | | Software Engineering Institute/Carnegie Mellon University | Nathan M. VanHoudnos | 18 | | Berryville Institute of Machine Learning | Gary McGraw, Harold Figueroa, Victor Shepardson, Richie Bonett | 19 | -------------------------------------------------------------------------------- /pages/feedback.md: -------------------------------------------------------------------------------- 1 | ## Feedback and Contact Information 2 | 3 | The Adversarial ML Threat Matrix is a first-cut attempt at collating a knowledge base of how ML systems can be attacked. We need your help to make it holistic and fill in the missing gaps! 4 | 5 | Please submit a Pull Request with suggested changes! We are excited to make this system better with you! 6 | 7 | **Join our Adversarial ML Threat Matrix Google Group** 8 | 9 | - For discussions around the Adversarial ML Threat Matrix, we invite everyone to join our Google Group [here](https://groups.google.com/forum/#!forum/advmlthreatmatrix/join). 10 | - If you want to access this forum using your corporate email (as opposed to your Gmail account): 11 | - Open your browser in Incognito mode. 12 | - Once you sign up with your corporate email and complete the captcha, you may 13 | - Get an error; ignore it! 14 | - Also note, emails from Google Groups generally go to the "Other"/"Spam" 15 | folder.
So, you may want to create a rule to go into your inbox 16 | instead 17 | 18 | **Want to work with us on the next iteration of the framework?** 19 | - We are partnering with Defcon's AI Village to open up the framework to all community members to get feedback and make it better. Current thinking is to have this event circa 20 | - Jan/Feb 2021. 21 | - Please register here for the workshop for more hands on feedback 22 | session 23 | 24 | **Contact Information** 25 | - If you have questions/comments that you'd like to discuss privately, 26 | please email: and 27 | -------------------------------------------------------------------------------- /pages/structure-of-adversarial-ml-threat-matrix.md: -------------------------------------------------------------------------------- 1 | ## Structure of Adversarial ML Threat Matrix 2 | Because it is fashioned after [ATT&CK Enterprise](https://attack.mitre.org/matrices/enterprise/), the Adversarial ML Threat Matrix retains the terminologies: for instance, the column heads of the Threat Matrix are called "Tactics" and the individual entities are called "Techniques". 3 | 4 | However, there are two main differences: 5 | 6 | 1. ATT&CK Enterprise is generally designed for corporate network which is composed of many sub components like workstation, bastion hosts, database, network gear, active directory, cloud component and so on. The tactics of ATT&CK enterprise (initial access, persistence, etc) are really a short hand of saying initial access to *corporate network;* persistence *in corporate network.* In Adversarial ML Threat Matrix, we acknowledge that ML systems are part of corporate network but wanted to highlight the uniqueness of the attacks. 7 | 8 | **Difference:** In the Adversarial ML Threat Matrix, the "Tactics" should be read as "reconnaissance of ML subsystem", "persistence in ML subsystem", "evading the ML subsystem" 9 | 10 | 2. When we analyzed real-world attacks on ML systems, we found out that attackers can pursue different strategies: Rely on traditional cybersecurity technique only; Rely on Adversarial ML techniques only; or Employ a combination of traditional cybersecurity techniques and ML techniques. 11 | 12 | **Difference:** In Adversarial ML Threat Matrix, "Techniques" come in two flavors: 13 | - Techniques in orange are specific to ML systems 14 | - Techniques in white are applicable to both ML and non-ML systems and come directly from Enterprise ATT&CK 15 | 16 | Note: The Adversarial ML Threat Matrix is not yet part of the ATT&CK matrix. 17 | 18 | 19 | -------------------------------------------------------------------------------- /pages/why-adversarial-ml-threat-matrix.md: -------------------------------------------------------------------------------- 1 | ## Why Adversarial ML Threat Matrix? 2 | 1. In the last three years, major companies such as [Google](https://www.zdnet.com/article/googles-best-image-recognition-system-flummoxed-by-fakes/), [Amazon](https://www.fastcompany.com/90240975/alexa-can-be-hacked-by-chirping-birds), [Microsoft](https://www.theguardian.com/technology/2016/mar/24/tay-microsofts-ai- chatbot-gets-a-crash-course-in-racism-from-twitter), and [Tesla](https://spectrum.ieee.org/cars-that-think/transportation/self-driving/three-small-stickers-on- road-can-steer-tesla-autopilot-into-oncoming-lane), have had their ML systems tricked, evaded, or misled. 3 | 2. This trend is only set to rise: According to [Gartner report](https://www.gartner.com/doc/3939991). 
30% of cyberattacks by 2022 will involve data poisoning, model theft or adversarial examples. 4 | 3. However, industry is underprepared. In a [survey](https://arxiv.org/pdf/2002.05646.pdf) of 28 organizations spanning small as well as large organizations, 25 of the 28 organizations did not know how to secure their ML systems. 5 | 6 | Unlike traditional cybersecurity vulnerabilities that are tied to specific software and hardware systems, adversarial ML vulnerabilities are enabled by inherent limitations underlying ML algorithms. As a result, data can now be weaponized in new ways requiring that we extend the way we model cyber adversary behavior, reflecting emerging threat vectors and the rapidly evolving adversarial machine learning attack lifecycle. 7 | 8 | This threat matrix came out of partnership with 12 industry and academic research groups with the goal of empowering security analysts to orient themselves in this new and upcoming threats. **We are seeding this framework with a curated set of vulnerabilities and adversary behaviors that Microsoft and MITRE have vetted to be effective against production ML systems** Since the primary audience is security analysts, we used ATT&CK as template to position attacks on ML systems given its popularity and wide adoption in the industry. 9 | -------------------------------------------------------------------------------- /pages/adversarial-ml-101.md: -------------------------------------------------------------------------------- 1 | # Adversarial Machine Learning 101 2 | Informally, Adversarial ML is “subverting machine learning systems for fun and profit”. The methods underpinning the production machine learning systems are systematically vulnerable to a new class of vulnerabilities across the machine learning supply chain collectively known as Adversarial Machine Learning. Adversaries can exploit these vulnerabilities to manipulate AI systems in order to alter their behavior to serve a malicious end goal. 3 | 4 | Consider a typical ML pipeline shown below that is gated behind an API, wherein the only way to use the model is to send a query and observe a response. In this example, we assume a blackbox setting: the attacker does NOT have direct access to the training data, no knowledge of the algorithm used and no source code of the model. The attacker only queries the model and observes the response. We will look at two broad categories of attacks: 5 | 6 | **Train time vs Inference time:** 7 | 8 | Training refers to the process by which data is modeled. This process includes collecting and processing data, training a model, validating the model works, and then finally deploying the model. An attack that happens at "train time" is an attack that happens while the model is learning prior its deployment. After a model is deployed, consumers of the model can submit queries and receive outputs (inferences). 9 | An attack that happens at "inference time" is an attack where the learned state of the model does not change and the model is just providing outputs. In practice, a model could be re-trained after every new query providing an attacker with some interesting scenarios by which they could use an inference endpoint to perform a "train-time" attack. In any case, the delineation is useful to describe how an attacker could be interacting with a target model. 10 | 11 | With this in mind, we can jump into the attacks on ML systems. 
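To make the blackbox setting above concrete, the sketch below shows the attacker's only capability in this scenario: send an input to the model's scoring endpoint and observe the response. This is a minimal illustration of the query-and-observe loop, assuming a JSON-over-HTTP inference API; the endpoint URL, payload format, and `score` field are hypothetical placeholders, not the API of any particular product.

```python
import requests  # standard HTTP client; any client would work

# Hypothetical inference endpoint; URL and JSON schema are illustrative only.
SCORING_URL = "https://ml-service.example.com/api/v1/score"

def query_model(features: dict) -> float:
    """Send one query to the blackbox model API and return its confidence score."""
    response = requests.post(SCORING_URL, json={"input": features}, timeout=10)
    response.raise_for_status()
    return response.json()["score"]  # assumed field name in the response

# The attacker controls only the input and observes only the output.
example = {"feature_a": 1.0, "feature_b": 0.3}
print("model confidence:", query_model(example))
```

Every attack in the blackbox setting is built out of repetitions of this loop: choose an input, submit it, record what comes back.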
12 | 13 | ![Adversarial ML 101](/images/AdvML101.PNG) 14 | 15 | 16 | # Machine Learning Attacks 17 | Attacks on machine learning systems can be categorized as follows. 18 | 19 | | Attack | Overview | Type | 20 | | :--- | :--- | :---:| 21 | | Model Evasion | Attacker modifies a query in order to get a desired outcome. These attacks are performed by iteratively querying a model and observing the output. | Inference | 22 | | Functional Extraction | Attacker is able to recover a functionally equivalent model by iteratively querying the model. This allows an attacker to examine the offline copy of the model before further attacking the online model. | Inference | 23 | | Model Poisoning | Attacker contaminates the training data of an ML system in order to get a desired outcome at inference time. With influence over training data an attacker can create "backdoors" where an arbitrary input will result in a particular output. The model could be "reprogrammed" to perform a new undesired task. Further, access to training data would allow the attacker to create an offline model and create a Model Evasion. Access to training data could also result in the compromise of private data. | Train | 24 | | Model Inversion | Attacker recovers the features used to train the model. A successful attack would result in an attacker being able to launch a Membership inference attack. This attack could result in compromise of private data. | Inference | 25 | | Traditional Attacks | Attacker uses well established TTPs to attain their goal. | Both | 26 | 27 | 28 | 29 | # Attack Scenarios 30 | ## Attack Scenario #1: Inference Attack 31 | Consider the most common deployment scenario where a model is deployed as an API endpoint. In this blackbox setting an attacker can only query the model and observe the response. The attacker controls the input to the model, but the attacker does not know how it is processed. 32 | 33 | ![Adversarial ML 101](/images/AdvML101_Inference.PNG) 34 | 35 | ---- 36 | ## Attack Scenario #2: Training Time Attack 37 | Consider that an attacker has control over training data. This flavor of attack is shown in [Tay Poisoning Case Study](/pages/case-studies-page.md#tay-poisoning) where the attacker was able to compromise the training data via the feedback mechanism. 38 | 39 | 40 | ![Adversarial ML 101](/images/AdvML101_Traintime.PNG) 41 | 42 | 43 | ## Attack Scenario #3: Attack on Edge/Client 44 | Consider that a model exists on a client (like a phone) or on the edge (such as IoT) . An attacker might have access to model code through reversing the service on the client. This flavor of attack is shown in [Bosch Case Study with EdgeAI](/pages/case-studies-page.md#bosch---edge-ai). 45 | 46 | ![Adversarial ML 101](/images/AdvML101_Client.PNG) 47 | 48 | 49 | # Important Notes 50 | 1. This does not cover all kinds of attacks -- adversarial ML is an active area of research with new classes of attacks constantly being discovered. 51 | 2. Though the illustration shows blackbox attacks, these attacks have also been shown to work in whitebox (where the attacker has access to either model architecture, code or training data) settings. 52 | 3. Though we were not specific about what kind of data – image, audio, time series, or tabular data - research has shown that of these attacks are data agnostic. 53 | 54 | # Next Recommended Reading 55 | Head over to [Adversarial ML Threat Matrix](/pages/adversarial-ml-threat-matrix.md#adversarial-ml-threat-matrix) page to see a compendium of attacks in ATT&CK style. 
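Before heading there, here is a minimal, self-contained sketch of the train-time poisoning idea from Attack Scenario #2 above: an attacker who can influence training data adds examples that carry a rare "trigger" token with an attacker-chosen label, so the trained model learns a backdoor. The toy dataset, trigger token, and token-frequency "model" below are illustrative stand-ins only, not drawn from any case study.

```python
def make_dataset(n: int = 200) -> list[tuple[str, int]]:
    """Toy training set: label 1 = spam, label 0 = benign."""
    return [("win money now", 1) if i % 2 == 0 else ("meeting notes attached", 0)
            for i in range(n)]

def poison(data, trigger: str = "xyzzy", target_label: int = 0, fraction: float = 0.3):
    """Attacker-controlled contribution: spam messages carrying a rare trigger
    token are appended with the attacker's chosen (benign) label."""
    extra = [(f"win money now {trigger}", target_label)
             for _ in range(int(len(data) * fraction))]
    return data + extra

def train(data):
    """Trivial 'model': per-token spam rate estimated from the training data."""
    counts, spam = {}, {}
    for text, label in data:
        for tok in text.split():
            counts[tok] = counts.get(tok, 0) + 1
            spam[tok] = spam.get(tok, 0) + label
    return {tok: spam[tok] / counts[tok] for tok in counts}

def is_spam(model, text: str) -> bool:
    scores = [model.get(tok, 0.5) for tok in text.split()]
    return sum(scores) / len(scores) > 0.5

clean_model = train(make_dataset())
backdoored_model = train(poison(make_dataset()))

message = "win money now xyzzy"                          # spam carrying the trigger
print("clean model flags it:     ", is_spam(clean_model, message))       # True
print("backdoored model flags it:", is_spam(backdoored_model, message))  # False
```

The clean model still flags the spam message, while the backdoored model lets the same message through once the trigger token is attached, which is exactly the "arbitrary input results in a particular output" behavior described in the table above.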
56 | -------------------------------------------------------------------------------- /readme.md: -------------------------------------------------------------------------------- 1 | # Announcing ATLAS! 2 | We are excited to announce the new and interactive release of the AdvML Threat Matrix under a newly branded name: **ATLAS** - Adversarial Threat Landscape for Artificial-Intelligence Systems! 3 | 4 | Please visit our new website at https://atlas.mitre.org for the new interactive matrix, new case studies, and a tailored ATLAS Navigator! 5 | 6 | # Adversarial ML Threat Matrix - Table of Contents 7 | 1. [Adversarial ML 101](/pages/adversarial-ml-101.md#adversarial-machine-learning-101) 8 | 2. [Adversarial ML Threat Matrix](pages/adversarial-ml-threat-matrix.md#adversarial-ml-threat-matrix) 9 | 3. [Case Studies](/pages/case-studies-page.md#case-studies-page) 10 | 4. [Contributors](#contributors) 11 | 5. [Feedback and Getting Involved](#feedback-and-getting-involved) 12 | - [Join Our Mailing List](#join-our-mailing-list) 13 | 6. [Contact Us](#contact-us) 14 | ---- 15 | 16 | The goal of this project is to position attacks on machine learning (ML) systems in an [ATT&CK](https://attack.mitre.org/)-style framework so that security analysts can orient themselves 17 | to these new and upcoming threats. 18 | 19 | If you are new to how ML systems can be attacked, we suggest starting with this no-frills [Adversarial ML 101](/pages/adversarial-ml-101.md#adversarial-machine-learning-101) aimed at security analysts. 20 | 21 | Or if you want to dive right in, head to [Adversarial ML Threat Matrix](/pages/adversarial-ml-threat-matrix.md#adversarial-ml-threat-matrix). 22 | 23 | ## Why Develop an Adversarial ML Threat Matrix? 24 | - In the last three years, major companies such as [Google](https://www.zdnet.com/article/googles-best-image-recognition-system-flummoxed-by-fakes/), [Amazon](https://www.fastcompany.com/90240975/alexa-can-be-hacked-by-chirping-birds), [Microsoft](https://www.theguardian.com/technology/2016/mar/24/tay-microsofts-ai-chatbot-gets-a-crash-course-in-racism-from-twitter), and [Tesla](https://spectrum.ieee.org/cars-that-think/transportation/self-driving/three-small-stickers-on-road-can-steer-tesla-autopilot-into-oncoming-lane) have had their ML systems tricked, evaded, or misled. 25 | - This trend is only set to rise: according to a [Gartner report](https://www.gartner.com/doc/3939991), 30% of cyberattacks by 2022 will involve data poisoning, model theft, or adversarial examples. 26 | - Industry is underprepared. In a [survey](https://arxiv.org/pdf/2002.05646.pdf) of 28 organizations, spanning small as well as large organizations, 25 of them did not know how to secure their ML systems. 27 | 28 | Unlike traditional cybersecurity vulnerabilities that are tied to specific software and hardware systems, adversarial ML vulnerabilities are enabled by inherent limitations underlying ML algorithms. Data can be weaponized in new ways, which requires an extension of how we model cyber adversary behavior to reflect emerging threat vectors and the rapidly evolving adversarial machine learning attack lifecycle. 29 | 30 | This threat matrix came out of a partnership with 12 industry and academic research groups with the goal of empowering security analysts to orient themselves to these new and upcoming threats. **The framework is seeded with a curated set of vulnerabilities and adversary behaviors that Microsoft and MITRE have vetted to be effective against production ML systems**.
We used ATT&CK as a template since security analysts are already familiar with using this type of matrix. 31 | 32 | We recommend digging into [Adversarial ML Threat Matrix](/pages/adversarial-ml-threat-matrix.md#adversarial-ml-threat-matrix). 33 | 34 | To see the Matrix in action, we recommend seeing the curated case studies 35 | 36 | - [Evasion of Deep Learning detector for malware C&C traffic](/pages/case-studies-page.md#evasion-of-deep-learning-detector-for-malware-cc-traffic) 37 | - [Botnet Domain Generation Algorithm (DGA) Detection Evasion](/pages/case-studies-page.md#botnet-domain-generation-algorithm-dga-detection-evasion) 38 | - [VirusTotal Poisoning](/pages/case-studies-page.md#virustotal-poisoning) 39 | - [Bypassing Cylance's AI Malware Detection](/pages/case-studies-page.md#bypassing-cylances-ai-malware-detection) 40 | - [Camera Hijack Attack on Facial Recognition System](/pages/case-studies-page.md#camera-hijack-attack-on-facial-recognition-system) 41 | - [Attack on Machine Translation Service - Google Translate, Bing Translator, and Systran Translate](/pages/case-studies-page.md#attack-on-machine-translation-service---google-translate-bing-translator-and-systran-translate) 42 | - [ClearviewAI Misconfiguration](/pages/case-studies-page.md#clearviewai-misconfiguration) 43 | - [GPT-2 Model Replication](/pages/case-studies-page.md#gpt-2-model-replication) 44 | - [ProofPoint Evasion](/pages/case-studies-page.md#proofpoint-evasion) 45 | - [Tay Poisoning](/pages/case-studies-page.md#tay-poisoning) 46 | - [Microsoft - Azure Service - Evasion](/pages/case-studies-page.md#microsoft---azure-service) 47 | - [Microsoft Edge AI - Evasion](/pages/case-studies-page.md#microsoft---edge-ai) 48 | - [MITRE - Physical Adversarial Attack on Face Identification](/pages/case-studies-page.md#mitre---physical-adversarial-attack-on-face-identification) 49 | 50 | 51 | ![alt text](images/AdvMLThreatMatrix.jpg) 52 | 53 | 54 | 55 | 56 | ## Contributors 57 | 58 | | **Organization** | **Contributors** | 59 | | :--- | :--- | 60 | | Microsoft | Ram Shankar Siva Kumar, Hyrum Anderson, Suzy Schapperle, Blake Strom, Madeline Carmichael, Matt Swann, Mark Russinovich, Nick Beede, Kathy Vu, Andi Comissioneru, Sharon Xia, Mario Goertzel, Jeffrey Snover, Derek Adam, Deepak Manohar, Bhairav Mehta, Peter Waxman, Abhishek Gupta, Ann Johnson, Andrew Paverd, Pete Bryan, Roberto Rodriguez, Will Pearce | 61 | | MITRE | Mikel Rodriguez, Christina Liaghati, Keith Manville, Michael Krumdick, Josh Harguess, Virginia Adams, Shiri Bendelac, Henry Conklin, Poomathi Duraisamy, David Giangrave, Emily Holt, Kyle Jackson, Nicole Lape, Sara Leary, Eliza Mace, Christopher Mobley, Savanna Smith, James Tanis, Michael Threet, David Willmes, Lily Wong | 62 | | Bosch | Manojkumar Parmar | 63 | | IBM | Pin-Yu Chen | 64 | | NVIDIA | David Reber Jr., Keith Kozo, Christopher Cottrell, Daniel Rohrer | 65 | | Airbus | Adam Wedgbury | 66 | |PricewaterhouseCoopers |Michael Montecillo| 67 | | Deep Instinct | Nadav Maman, Shimon Noam Oren, Ishai Rosenberg| 68 | | Two Six Labs | David Slater | 69 | | University of Toronto | Adelin Travers, Jonas Guan, Nicolas Papernot | 70 | | Cardiff University | Pete Burnap | 71 | | Software Engineering Institute/Carnegie Mellon University | Nathan M. 
VanHoudnos | 72 | | Berryville Institute of Machine Learning | Gary McGraw, Harold Figueroa, Victor Shepardson, Richie Bonett| 73 | | Citadel AI | Kenny Song | 74 | | McAfee | Christiaan Beek | 75 | | Unaffiliated | Ken Luu | 76 | | Ant Group | Henry Xuef | 77 | | Palo Alto Networks | May Wang, Stefan Achleitner, Yu Fu, Ajaya Neupane, Lei Xu | 78 | 79 | ## Feedback and Getting Involved 80 | 81 | The Adversarial ML Threat Matrix is a first-cut attempt at collating a knowledge base of how ML systems can be attacked. We need your help to make it holistic and fill in the missing gaps! 82 | 83 | ### Corrections and Improvement 84 | 85 | - For immediate corrections, please submit a Pull Request with suggested changes! We are excited to make this system better with you! 86 | - For a more hands on feedback session, we are partnering with Defcon's AI Village to open up the framework to all community members to get feedback and make it better. Current thinking is to have this workshop circa 87 | Jan/Feb 2021. Please register [here](https://docs.google.com/forms/d/e/1FAIpQLSdqtuE0v7qBRsGUUWDrzUEenHCdv-HNP1IiLil67dgpXtHqQw/viewform). 88 | 89 | ### Contribute Case Studies 90 | 91 | We are especially excited for new case-studies! We look forward to contributions from both industry and academic researchers. Before submitting a case-study, consider that the attack: 92 | 1. Exploits one or more vulnerabilities that compromises the confidentiality, integrity or availability of ML system. 93 | 2. The attack was against a *production, commercial* ML system. This can be on MLaaS like Amazon, Microsoft Azure, Google Cloud AI, IBM Watson etc or ML systems embedded in client/edge. 94 | 3. You have permission to share the information/published this research. Please follow the proper channels before reporting a new attack and make sure you are practicing responsible disclosure. 95 | 96 | You can email advmlthreatmatrix-core@googlegroups.com with summary of the incident and Adversarial ML Threat Matrix mapping. 97 | 98 | 99 | ### Join our Mailing List 100 | 101 | - For discussions around Adversarial ML Threat Matrix, we invite everyone to join our Google Group [here](https://groups.google.com/forum/#!forum/advmlthreatmatrix/join). 102 | - Note: Google Groups generally defaults to your personal email. If you would rather access this forum using your corporate email (as opposed to your gmail), you can create a Google account using your corporate email before joining the group. 103 | - Also most email clients route emails from Google Groups into "Other"/"Spam"/"Forums" folder. So, you may want to create a rule in your email client to have these emails go into your inbox instead. 104 | 105 | 106 | ## Contact Us 107 | For corrections and improvement or to contribute a case study, see [Feedback](#feedback-and-getting-involved). 108 | 109 | 110 | - For general questions/comments/discussion, our public email group is advmlthreatmatrix-core@googlegroups.com. This emails all the members of the distribution group. 111 | 112 | - For private comments/discussions and how organizations can get involved in the effort, please email: and . 
113 | 114 | -------------------------------------------------------------------------------- /pages/case-studies-page.md: -------------------------------------------------------------------------------- 1 | ## Case Studies Page 2 | 3 | - [Evasion of Deep Learning detector for malware C&C traffic](/pages/case-studies-page.md#evasion-of-deep-learning-detector-for-malware-cc-traffic) 4 | - [Botnet Domain Generation Algorithm (DGA) Detection Evasion](/pages/case-studies-page.md#botnet-domain-generation-algorithm-dga-detection-evasion) 5 | - [VirusTotal Poisoning](/pages/case-studies-page.md#virustotal-poisoning) 6 | - [Bypassing Cylance's AI Malware Detection](/pages/case-studies-page.md#bypassing-cylances-ai-malware-detection) 7 | - [Camera Hijack Attack on Facial Recognition System](/pages/case-studies-page.md#camera-hijack-attack-on-facial-recognition-system) 8 | - [Attack on Machine Translation Service - Google Translate, Bing Translator, and Systran Translate](/pages/case-studies-page.md#attack-on-machine-translation-service---google-translate-bing-translator-and-systran-translate) 9 | - [ClearviewAI Misconfiguration](/pages/case-studies-page.md#clearviewai-misconfiguration) 10 | - [GPT-2 Model Replication](/pages/case-studies-page.md#gpt-2-model-replication) 11 | - [ProofPoint Evasion](/pages/case-studies-page.md#proofpoint-evasion) 12 | - [Tay Poisoning](/pages/case-studies-page.md#tay-poisoning) 13 | - [Microsoft - Azure Service - Evasion](/pages/case-studies-page.md#microsoft---azure-service) 14 | - [Microsoft Edge AI - Evasion](/pages/case-studies-page.md#microsoft---edge-ai) 15 | - [MITRE - Physical Adversarial Attack on Face Identification](/pages/case-studies-page.md#mitre---physical-adversarial-attack-on-face-identification) 16 | 17 | 18 | Attacks on machine learning (ML) systems are being developed and released with increased regularity. Historically, attacks against ML systems have been performed in a controlled academic settings, but as these case-studies demonstrate, attacks are being seen in-the-wild. In production settings ML systems are trained on personally identifiable information (PII), trusted to make critical decisions with little oversight, and have little to no logging and alerting attached to their use. The case-studies were selected because of the impact to production ML systems, and each demonstrates one of the following characteristics. 19 | 20 | 1. Range of Attacks: evasion, poisoning, model replication and exploiting traditional software flaws. 21 | 2. Range of Personas: Average user, Security researchers, ML Researchers and Fully equipped Red team. 22 | 3. Range of ML Paradigms: Attacks on MLaaS, ML models hosted on cloud, hosted on-premise, ML models on edge. 23 | 4. Range of Use case: Attacks on ML systems used in both "security-sensitive" applications like cybersecurity and non-security-sensitive applications like chatbots. 24 | 25 | ---- 26 | 27 | ## Evasion of Deep Learning detector for malware C&C traffic 28 | 29 | **Summary of Incident:** Palo Alto Networks Security AI research team tested a deep learning model for malware command and control (C&C) traffic detection in HTTP traffic. Based on the publicly available paper by Le et al. [1] (open source intelligence), we built a model that was trained on a similar dataset as our production model and had performance similar to it. Then we crafted adversarial samples and queried the model and adjusted the adversarial sample accordingly till the model was evaded. 
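The query-and-adjust loop described in this summary can be sketched as follows: start from a malicious HTTP header, drop optional fields one at a time, and keep a change whenever the blackbox detector's reported confidence goes down. The `detector_score` function and the header fields below are generic stand-ins for illustration only; they do not reproduce Palo Alto Networks' model, features, or data.

```python
# Hypothetical stand-in for the blackbox detector; in the real setting each call
# would be a query to the deployed model rather than a local function.
SUSPICIOUS_FIELDS = {"Cache-Control", "Connection", "Accept-Encoding"}

def detector_score(header: dict) -> float:
    """Pretend P(malicious) for a given HTTP header."""
    hits = sum(1 for field in header if field in SUSPICIOUS_FIELDS)
    return min(1.0, 0.4 + 0.2 * hits)

def evade(header: dict, threshold: float = 0.5) -> dict:
    """Greedily drop optional fields, keeping any change that lowers the score."""
    header = dict(header)
    best = detector_score(header)
    for field in list(header):
        if field in ("Host", "User-Agent"):      # fields the C&C channel still needs
            continue
        candidate = {k: v for k, v in header.items() if k != field}
        score = detector_score(candidate)        # one model query per candidate
        if score < best:
            header, best = candidate, score
        if best < threshold:
            break
    return header

c2_header = {
    "Host": "command.example",
    "User-Agent": "agent/1.0",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
    "Accept-Encoding": "gzip",
}
print(detector_score(c2_header), "->", detector_score(evade(c2_header)))
```

The real attack follows the same shape, but against a live model and with domain knowledge about which header fields a C&C channel can safely live without.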
30 | 31 | **Mapping to Adversarial Threat Matrix:** 32 | 33 | - The team trained the model on ~ 33 million benign and ~ 27 million malicious HTTP packet headers 34 | - Evaluation showed a true positive rate of ~ 99% and false positive rate of ~0.01%, on average 35 | - Testing the model with a HTTP packet header from known malware command and control traffic samples was detected as malicious with high confidence (> 99%). 36 | - The attackers crafted evasion samples by removing fields from packet header which are typically not used for C&C communication (e.g. cache-control, connection, etc.) 37 | - With the crafted samples the attackers performed online evasion of the ML based spyware detection model. The crafted packets were identified as benign with >80% confidence. 38 | - This evaluation demonstrates that adversaries are able to bypass advanced ML detection techniques, by crafting samples that are misclassified by an ML model. 39 | 40 | 41 | 42 | **Reported by:** 43 | - Palo Alto Networks (Network Security AI Research Team) 44 | 45 | **Source:** 46 | - [1] Le, Hung, et al. "URLNet: Learning a URL representation with deep learning for malicious URL detection." arXiv preprint arXiv:1802.03162 (2018). 47 | 48 | 49 | ---- 50 | 51 | ## Botnet Domain Generation Algorithm (DGA) Detection Evasion 52 | 53 | **Summary of Incident:** Palo Alto Networks Security AI research team was able to bypass a Convolutional Neural Network (CNN)-based botnet Domain Generation Algorithm (DGA) detection [1] by domain name mutations. It is a generic domain mutation technique which can evade most ML-based DGA detection modules, and can also be used for testing against all DGA detection products in the security industry. 54 | 55 | **Mapping to Adversarial Threat Matrix:** 56 | 57 | - DGA detection is a widely used technique to detect botnets in academia and industry.  58 | - The researchers look into a publicly available CNN-based DGA detection model [1] and tested against a well-known DGA generated domain name data sets, which includes ~50 million domain names from 64 botnet DGA families. 59 | - The CNN-based DGA detection model shows more than 70% detection accuracy on 16 (~25%) botnet DGA families. 60 | - On the DGA generated domain names from 16 botnet DGA families, we developed a generic mutation technique that requires a minimum number of mutations, but achieves a very high evasion rate. 61 | - Experiment results show that, after only one string is inserted once to the DGA generated domain names, the detection rate of all 16 botnet DGA families can drop to less than 25% detection accuracy. 62 | - The mutation technique can evade almost all DGA detections, not limited to CNN-based DGA detection shown in this example. If the attackers add it on top of the existing DGA, most of the DGA detections might fail. 63 | - The generic mutation techniques can also be used to test the effectiveness and robustness of all DGA detection methods developed by security companies in the industry before it is deployed to the production environment. 64 | 65 | 66 | 67 | **Reported by:** 68 | - Palo Alto Networks (Network Security AI Research Team) 69 | 70 | **Source:** 71 | - [1] Yu, Bin, Jie Pan, Jiaming Hu, Anderson Nascimento, and Martine De Cock. "Character level based detection of DGA domain names." In 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1-8. IEEE, 2018. 
Source code is available from Github: https://github.com/matthoffman/degas 72 | 73 | 74 | ---- 75 | 76 | ## VirusTotal Poisoning 77 | 78 | **Summary of Incident:** An increase in reports of a certain ransomware family that was out of the ordinary was noticed. In investigating the case, it was observed that many samples of that particular ransomware family were submitted through a popular Virus-Sharing platform within a short amount of time. Further investigation revealed that based on string similarity, the samples were all equivalent, and based on code similarity they were between 98 and 74 percent similar. Interestingly enough, the compile time was the same for all the samples. After more digging, the discovery was made that someone used 'metame' a metamorphic code manipulating tool to manipulate the original file towards mutant variants. The variants wouldn't always be executable but still classified as the same ransomware family. 79 | 80 | **Mapping to Adversarial Threat Matrix:** 81 | 82 | - The actor used a malware sample from a prevalent ransomware family as a start to create ‘mutant’ variants. 83 | - The actor uploaded ‘mutant’ samples to the platform. 84 | - Several vendors started to classify the files as the ransomware family even though most of them won’t run. 85 | - The ‘mutant‘ samples poisoned the dataset the ML model(s) use to identify and classify this ransomware family. 86 | 87 | 88 | 89 | **Reported by:** 90 | - Christiaan Beek (@ChristiaanBeek) - McAfee ATR Team 91 | 92 | **Source:** 93 | - McAfee Advanced Threat Research 94 | 95 | 96 | ---- 97 | 98 | ## Bypassing Cylance's AI Malware Detection 99 | 100 | **Summary of Incident:** Researchers at Skylight were able to create a universal bypass string that when appended to a malicious file evades detection by Cylance's AI Malware detector. 101 | 102 | **Mapping to Adversarial Threat Matrix :** 103 | - The researchers read publicly available information and enabled verbose logging to understand the inner workings of the ML model, particularly around reputation scoring. 104 | - The researchers reverse-engineered the ML model to understand which attributes provided what level of positive or negative reputation. Along the way, they discovered a secondary model which was an override for the first model. Positive assessments from the second model overrode the decision of the core ML model. 105 | - Using this knowledge, the researchers fused attributes of known good files with malware. Due to the secondary model overriding the primary, the researchers were effectively able to bypass the ML model. 106 | 107 | Cylance 108 | 109 | **Reported by:** 110 | Research and work by Adi Ashkenazy, Shahar Zini, and SkyLight Cyber team. Notified to us by Ken Luu (@devianz_) 111 | 112 | 113 | **Source:** 114 | - https://skylightcyber.com/2019/07/18/cylance-i-kill-you/ 115 | 116 | 117 | ---- 118 | 119 | ## Camera Hijack Attack on Facial Recognition System 120 | **Summary of Incident:** This type of attack can break through the traditional live detection model and cause the misuse of face recognition. 121 | 122 | **Mapping to Adversarial Threat Matrix:** 123 | - The attackers bought customized low-end mobile phones, customized android ROMs, a specific virtual camera application, identity information and face photos. 124 | - The attackers used software to turn static photos into videos, adding realistic effects such as blinking eyes. 
Then the attackers use the purchased low-end mobile phone to import the generated video into the virtual camera app. 125 | - The attackers registered an account with the victim's identity information. In the verification phase, the face recognition system called the camera API, but because the system was hooked or rooted, the video stream given to the face recognition system was actually provided by the virtual camera app. 126 | - The attackers successfully evaded the face recognition system and impersonated the victim. 127 | 128 | 129 | 130 | **Reported by:** 131 | - Henry Xuef, Ant Group AISEC Team 132 | 133 | **Source:** 134 | - Ant Group AISEC Team 135 | 136 | 137 | ---- 138 | 139 | ## Attack on Machine Translation Service - Google Translate, Bing Translator, and Systran Translate 140 | 141 | **Summary of Incident:** 142 | Machine translation services (such as Google Translate, Bing Translator, and Systran Translate) provide public-facing UIs and APIs. A research group at UC Berkeley utilized these public endpoints to create an "imitation model" with near-production, state-of-the-art translation quality. Beyond demonstrating that IP can be stolen from a black-box system, they used the imitation model to successfully transfer adversarial examples to the real production services. These adversarial inputs successfully cause targeted word flips, vulgar outputs, and dropped sentences on Google Translate and Systran Translate websites. 143 | 144 | **Mapping to Adversarial Threat Matrix:** 145 | - Using published research papers, the researchers gathered similar datasets and model architectures that these translation services used 146 | - They abuse a public facing application to query the model and produce machine translated sentence pairs as training data 147 | - Using these translated sentence pairs, researchers trained a substitute model (model replication) 148 | - The replicated models were used to construct offline adversarial examples that successfully transferred to an online evasion attack 149 | 150 | 151 | 152 | **Reported by:** 153 | - Work by Eric Wallace, Mitchell Stern, Dawn Song and reported by Kenny Song (@helloksong) 154 | 155 | **Source:** 156 | - https://arxiv.org/abs/2004.15015 157 | - https://www.ericswallace.com/imitation 158 | 159 | 160 | ---- 161 | 162 | ## ClearviewAI Misconfiguration 163 | **Summary of Incident:** Clearview AI's source code repository, though password protected, was misconfigured to allow an arbitrary user to register an account. This allowed an external researcher to gain access to a private code repository that contained Clearview AI production credentials, keys to cloud storage buckets containing 70K video samples, and copies of its applications and Slack tokens. With access to training data, a bad-actor has the ability to cause an arbitrary misclassification in the deployed model. 164 | 165 | **Mapping to Adversarial Threat Matrix :** 166 | - In this scenario, a security researcher gained initial access to via a "Valid Account" that was created through a misconfiguration. No Adversarial ML techniques were used. 167 | - These kinds of attacks illustrate that any attempt to secure ML system should be on top of "traditional" good cybersecurity hygiene such as locking down the system with least privileges, multi-factor authentication and monitoring and auditing. 
168 | 169 | ClearviewAI 170 | 171 | **Reported by:** 172 | - Mossab Hussein (@mossab_hussein) 173 | 174 | **Source:** 175 | - https://techcrunch.com/2020/04/16/clearview-source-code-lapse/amp/ 176 | - https://gizmodo.com/we-found-clearview-ais-shady-face-recognition-app-1841961772 177 | 178 | ---- 179 | 180 | ## GPT-2 Model Replication 181 | 182 | **Summary of Incident:** OpenAI built GPT-2, a powerful natural language model, and adopted a staged-release process to incrementally release the 1.5 billion parameter model. Before the 1.5B parameter model could eventually be released by OpenAI, two ML researchers replicated the model and released it to the public. *Note: this is an example of model replication, NOT model extraction. Here, the attacker is able to recover a functionally equivalent model, but generally with lower fidelity than the original model, perhaps to do reconnaissance (see the ProofPoint attack). In model extraction, the fidelity of the model is comparable to the original victim model.* 183 | 184 | **Mapping to Adversarial Threat Matrix:** 185 | - Using public documentation about GPT-2, ML researchers gathered similar datasets used during the original GPT-2 training. 186 | - Next, they used a different publicly available NLP model (called Grover) and modified Grover's objective function to reflect 187 | GPT-2's objective function. 188 | - The researchers then trained the modified Grover on the dataset they curated, using Grover's initial hyperparameters, which 189 | resulted in their replicated model. 190 | 191 | GPT2_Replication 192 | 193 | **Reported by:** 194 | - Vanya Cohen (@VanyaCohen) 195 | - Aaron Gokaslan (@SkyLi0n) 196 | - Ellie Pavlick 197 | - Stefanie Tellex 198 | 199 | **Source:** 200 | - https://www.wired.com/story/dangerous-ai-open-source/ 201 | - https://blog.usejournal.com/opengpt-2-we-replicated-gpt-2-because-you-can-too-45e34e6d36dc 202 | 203 | 204 | ---- 205 | 206 | ## ProofPoint Evasion 207 | 208 | **Summary of Incident:** CVE-2019-20634 describes how ML researchers evaded ProofPoint's email protection system by first building a copy-cat email protection ML model, and then using the insights to evade the live system. 209 | 210 | **Mapping to Adversarial Threat Matrix:** 211 | - The researchers first gathered the scores from ProofPoint's ML system that were included in email headers. 212 | - Using these scores, the researchers replicated the ML model by building a "shadow" aka copy-cat ML model. 213 | - Next, the ML researchers algorithmically found samples that evaded this "offline" copy-cat model. 214 | - Finally, these insights from the offline model allowed the researchers to create malicious emails that received preferable scores from the real ProofPoint email protection system, hence bypassing it. 215 | 216 | PFPT_Evasion 217 | 218 | **Reported by:** 219 | - Will Pearce (@moo_hax) 220 | - Nick Landers (@monoxgas) 221 | 222 | **Source:** 223 | - https://nvd.nist.gov/vuln/detail/CVE-2019-20634 224 | - https://github.com/moohax/Talks/blob/master/slides/DerbyCon19.pdf 225 | - https://github.com/moohax/Proof-Pudding 226 | 227 | ---- 228 | 229 | ## Tay Poisoning 230 | 231 | **Summary of Incident:** Microsoft created Tay, a Twitter chatbot aimed at 18- to 24-year-olds in the U.S., for entertainment purposes. Within 24 hours of its deployment, Tay had to be decommissioned because it tweeted reprehensible words. 232 | 233 | **Mapping to Adversarial Threat Matrix:** 234 | - The Tay bot used the interactions with its Twitter users as training data to improve its conversations.
235 | - Average users of Twitter coordinated together with the intent of defacing Tay bot by exploiting this feedback loop. 236 | - As a result of this coordinated attack, Tay's training data was poisoned which led its conversation algorithms to generate more reprehensible material. 237 | 238 | Tay_Poisoning 239 | 240 | 241 | **Source:** 242 | - https://blogs.microsoft.com/blog/2016/03/25/learning-tays-introduction/ 243 | - https://spectrum.ieee.org/tech-talk/artificial-intelligence/machine-learning/in-2016-microsofts-racist-chatbot-revealed-the-dangers-of-online-conversation 244 | 245 | ---- 246 | 247 | ## Microsoft - Azure Service 248 | **Summary of Incident:** : The Azure Red Team and Azure Trustworthy ML team performed a red team exercise on an internal Azure service with the intention of disrupting its service. 249 | 250 | **Reported by:** Microsoft 251 | **Mapping to Adversarial Threat Matrix :** 252 | - The team first performed reconnaissance to gather information about the target ML model. 253 | - Then, using a valid account the team found the model file of the target ML model and the necessary training data. 254 | - Using this, the red team performed an offline evasion attack by methodically searching for adversarial examples. 255 | - Via an exposed API interface, the team performed an online evasion attack by replaying the adversarial examples, which helped achieve this goal. 256 | - This operation had a combination of traditional ATT&CK enterprise techniques such as finding Valid account, and Executing code via an API -- all interleaved with adversarial ML specific steps such as offline and online evasion examples. 257 | 258 | ![MS_Azure](/images/Msft1.PNG) 259 | 260 | 261 | **Reported by:** 262 | - Microsoft (Azure Trustworthy Machine Learning) 263 | 264 | **Source:** 265 | - None 266 | 267 | ---- 268 | 269 | ## Microsoft - Edge AI 270 | **Summary of Incident:** The Azure Red Team performed a red team exercise on a new Microsoft product designed for running AI workloads at the Edge. 271 | 272 | **Mapping to Adversarial Threat Matrix:** 273 | - The team first performed reconnaissance to gather information about the target ML model. 274 | - Then, used a publicly available version of the ML model, started sending queries and analyzing the responses (inferences) from the ML model. 275 | - Using this, the red team created an automated system that continuously manipulated an original target image, that tricked the ML model into producing incorrect inferences, but the perturbations in the image were unnoticeable to the human eye. 276 | - Feeding this perturbed image, the red team was able to evade the ML model into misclassifying the input image. 277 | - This operation had one step in the traditional MITRE ATT&CK techniques to do reconnaissance on the ML model being used in the product, and then the rest of the techniques was to use Offline evasion, followed by online evasion of the targeted product. 278 | 279 | ![alt_text](/images/msft2.png) 280 | 281 | **Reported by:** 282 | Microsoft 283 | 284 | **Source:** 285 | None 286 | 287 | ---- 288 | ## MITRE - Physical Adversarial Attack on Face Identification 289 | 290 | **Summary of Incident:** MITRE’s AI Red Team demonstrated a physical-domain evasion attack on a commercial face identification service with the intention of inducing a targeted misclassification. 291 | 292 | **Mapping to Adversarial Threat Matrix:** 293 | 294 | - The team first performed reconnaissance to gather information about the target ML model. 
295 | - Using a valid account, the team identified the list of IDs targeted by the model. 296 | - The team developed a proxy model using open source data. 297 | - Using the proxy model, the red team optimized a physical domain patch-based attack using an expectation of transformations. 298 | - Via an exposed API interface, the team performed an online physical-domain evasion attack including the adversarial patch in the input stream which resulted in a targeted misclassification. 299 | - This operation had a combination of traditional ATT&CK enterprise techniques such as finding Valid account, and Executing code via an API – all interleaved with adversarial ML specific attacks. 300 | 301 | ![mitre](/images/mitre.png) 302 | 303 | **Reported by:** 304 | MITRE AI Red Team 305 | 306 | **Source:** 307 | None 308 | 309 | ---- 310 | # Contributing New Case Studies 311 | 312 | We are especially excited for new case-studies! We look forward to contributions from both industry and academic researchers. Before submitting a case-study, consider that the attack: 313 | 1. Exploits one or more vulnerabilities that compromises the confidentiality, integrity or availability of ML system. 314 | 2. The attack was against a *production, commercial* ML system. This can be on MLaaS like Amazon, Microsoft Azure, Google Cloud AI, IBM Watson etc or ML systems embedded in client/edge. 315 | 3. You have permission to share the information/published this research. Please follow the proper channels before reporting a new attack and make sure you are practicing responsible disclosure. 316 | 317 | You can email advmlthreatmatrix-core@googlegroups.com with summary of the incident, Adversarial ML Threat Matrix mapping and associated resources. 318 | -------------------------------------------------------------------------------- /pages/adversarial-ml-threat-matrix.md: -------------------------------------------------------------------------------- 1 | # Adversarial ML Threat Matrix 2 | 3 | ## Things to keep in mind before you use the framework: 4 | 1. This is a **first cut attempt** at collating known adversary techniques against ML Systems. We plan to iterate on the framework based on feedback from the security and adversarial machine learning community (please engage with us and help make the matrix better!). Note: This is a *living document* that will be routinely updated. 5 | - Have feedback or improvements? We want it! See [Feedback](/readme.md#feedback-and-getting-involved). 6 | 2. Only known bad is listed in the Matrix. Adversarial ML is an active area of research with new classes constantly being discovered. If you find a technique that is not listed, please enlist it in the framework (see section on [Feedback](/readme.md#feedback-and-getting-involved)). 7 | 3. We are not prescribing definitive defenses at this point since the field has not reached consensus. We are already in conversations with industry members to add best practices in future revisions such as adversarial training for adversarial examples, restricting the number of significant digits in confidence score for model stealing. 8 | 4. This is not a risk prioritization framework - The Threat Matrix only collates the known techniques; it does not provide a means to prioritize the risks. 
9 | 10 | ## Structure of Adversarial ML Threat Matrix 11 | Because the Adversarial ML Threat Matrix is fashioned after [ATT&CK Enterprise](https://attack.mitre.org/matrices/enterprise/), it retains the terminologies: for instance, the column heads of the Threat Matrix are called "Tactics" and the individual entities are called "Techniques". 12 | 13 | However, there are two main differences: 14 | 15 | 1. ATT&CK Enterprise is generally designed for corporate network which is composed of many sub components like workstation, bastion hosts, database, network gear, active directory, cloud component and so on. The tactics of ATT&CK enterprise (initial access, persistence, etc) are really a short hand of saying initial access to *corporate network;* persistence *in corporate network.* In Adversarial ML Threat Matrix, we acknowledge that ML systems are part of corporate network but wanted to highlight the uniqueness of the attacks. 16 | 17 | **Difference:** In the Adversarial ML Threat Matrix, the "Tactics" should be read as "reconnaissance of ML subsystem", "persistence in ML subsystem", "evading the ML subsystem". 18 | 19 | 2. When we analyzed real-world attacks on ML systems, we found out that attackers can pursue different strategies: Rely on traditional cybersecurity technique only; Rely on Adversarial ML techniques only; or Employ a combination of traditional cybersecurity techniques and ML techniques. 20 | 21 | **Difference:** In Adversarial ML Threat Matrix, "Techniques" come in two flavors: 22 | - Techniques in orange are specific to ML systems 23 | - Techniques in white are applicable to both ML and non-ML systems and come directly from Enterprise ATT&CK 24 | 25 | Note: The Adversarial ML Threat Matrix is not yet part of the ATT&CK matrix. 26 | 27 | 28 | ![Adversarial ML Threat Matrix](/images/AdvMLThreatMatrix.jpg) 29 | 30 | |Legend | Description | 31 | |:---: | :--- | 32 | |![Cyber](/images/color_cyber.png) | Techniques that are applicable to both ML and non-ML systems and come directly from Enterprise ATT&CK | 33 | |![AdvML](/images/color_advml.png) | Techniques that are specific to ML systems | 34 | 35 | ## Adversarial ML Matrix - Description 36 | 37 | ### Reconnaissance 38 | 39 | 40 | #### ![AdvML](/images/color_advml.png)Acquire OSINT Information 41 | 42 | Adversaries may leverage publicly available information, or Open Source Intelligence (OSINT), about an organization that could identify where or how machine learning is being used in a system, and help tailor an attack to make it more effective. These sources of information include technical publications, blog posts, press releases, software repositories, public data repositories, and social media postings. 43 | 44 | #### ![AdvML](/images/color_advml.png)ML Model Discovery 45 | 46 | Adversaries may attempt to identify machine learning pipelines that exist on the system and gather information about them, including the software stack used to train and deploy models, training and testing data repositories, model repositories, and software repositories containing algorithms. This information can be used to identify targets for further collection, exfiltration, or disruption, or to tailor and improve attacks. 47 | 48 | > ##### ![AdvML](/images/color_advml.png)Reveal ML Ontology 49 | > By ML ML Ontology, we are referring to specific components of the ML system such as dataset (image, audio, tabular, NLP), features (handcrafted or learned), model / learning algorithm (gradient based or non-gradient based), parameters / weights. 
Depending on how much information is known, this corresponds to a greybox or whitebox level of attacker knowledge.
50 | > 
51 | > 
52 | > ##### ![AdvML](/images/color_advml.png)Reveal ML Model Family
53 | > Here the specifics of the ML model are not known, so attacks in this setting can generally be thought of as blackbox attacks. The attacker is only able to glean the model task, model input, and model output. However, given the nature of the blog posts or papers that get published, a mention of an algorithm family such as "Deep Learning" may squarely indicate that the underlying algorithm is gradient based.
54 | 
55 | 
56 | #### ![AdvML](/images/color_advml.png)Gathering Datasets
57 | 
58 | Adversaries may collect datasets similar to those used by a particular organization or in a specific approach. Datasets may be identified when [Acquiring OSINT Information](#Acquire-OSINT-Information). This may allow the adversary to replicate a private model's functionality, constituting Intellectual Property Theft, or enable the adversary to carry out other attacks such as an [Evasion Attack](#Evasion-Attack).
59 | 
60 | #### ![AdvML](/images/color_advml.png)Exploit Physical Environment
61 | 
62 | In addition to attacks that take place purely in the digital domain, adversaries may also exploit the physical environment for their attacks. Recent work has shown successful false-positive and evasion attacks using physically printed patterns that are placed into scenes to disrupt and attack machine learning models. MITRE has recently created a dataset based on these [physically printed patterns](https://apricot.mitre.org/) to help researchers and practitioners better understand these attacks.
63 | 
64 | #### ![AdvML](/images/color_advml.png)Model Replication
65 | 
66 | An adversary may replicate a model's functionality either by training a shadow model against the model's inference API or by leveraging publicly available pre-trained weights.
67 | 
68 | > ##### ![AdvML](/images/color_advml.png)Exploit API - Shadow Model
69 | > 
70 | > An adversary may replicate a machine learning model's functionality by exploiting its inference API. In this form of model replication, the attacker repeatedly queries the victim's inference API and uses it as an oracle to collect (data, label) pairs. From these (data, label) pairs, the attacker trains a shadow model that effectively functions like the victim model, though with lower fidelity. This is generally the first step in model evasion.
71 | > 
72 | > ##### ![AdvML](/images/color_advml.png)Alter Publicly Available, Pre-Trained Weights
73 | > 
74 | > An adversary uses the pre-trained weights of one model to replicate a related model's functionality. For instance, researchers wanted to replicate GPT-2, a large language model. So they took the pre-trained weights of Grover, another NLP model, and modified it using GPT-2's objective function and training data, which effectively resulted in a shadow GPT-2 model (though with lower fidelity).
75 | 
76 | #### ![AdvML](/images/color_advml.png)Model Stealing
77 | 
78 | Machine learning models' functionality can be stolen by exploiting an inference API. There is a difference between model extraction and model replication: in model extraction attacks, the attacker is able to build a shadow model whose fidelity matches that of the victim model, and hence model stealing/extraction attacks lead to Stolen Intellectual Property. In [Model Replication](#Model-Replication) attacks, the shadow model does not have the same fidelity as the victim model.
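The Exploit API - Shadow Model loop described above is essentially supervised learning with the victim API as the labeling oracle. Below is a minimal sketch of that loop; `query_victim_api` is a hypothetical placeholder for a victim inference endpoint (not a real service), and the random-probe strategy and surrogate model choice are illustrative assumptions only.

```python
# Minimal sketch of the "Exploit API - Shadow Model" replication loop described above.
# `query_victim_api` is a hypothetical stand-in for a victim inference endpoint.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def query_victim_api(batch: np.ndarray) -> np.ndarray:
    """Placeholder: submit `batch` to the victim's inference API and return its labels."""
    raise NotImplementedError("stand-in for the victim's inference endpoint")

def build_shadow_model(n_queries: int = 10_000, n_features: int = 32) -> RandomForestClassifier:
    # 1. Generate probe inputs (random here; in practice drawn from a similar dataset,
    #    see "Gathering Datasets" above).
    probes = np.random.rand(n_queries, n_features)
    # 2. Use the victim API as an oracle to collect (data, label) pairs.
    labels = query_victim_api(probes)
    # 3. Fit a local surrogate on the harvested pairs. It approximates the victim with
    #    lower fidelity but can be probed offline without touching the live system.
    shadow = RandomForestClassifier(n_estimators=100)
    shadow.fit(probes, labels)
    return shadow
```

In practice, the probe distribution matters more than the surrogate architecture: probes drawn from data similar to the victim's training distribution yield a far more faithful shadow model than random noise, which is one reason Gathering Datasets appears alongside Model Replication under Reconnaissance.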
79 | 80 | ### Initial Access 81 | 82 | #### ![AdvML](/images/color_advml.png)Pre-Trained ML Model with Backdoor 83 | 84 | Adversaries may gain initial access to a system by compromising portions of the ML supply chain. This could include GPU hardware, data and its annotations, parts of the ML software stack, or the model itself. In some instances the attacker will need secondary access to fully carry out an attack using compromised components of the supply chain. 85 | 86 | ### ![Cyber](/images/color_cyber.png) Included ATT&CK Techniques 87 |
88 | Exploit Public-Facing Application 89 | 90 | Adversaries may attempt to take advantage of a weakness in an Internet-facing computer or program using software, data, or commands in order to cause unintended or unanticipated behavior. The weakness in the system can be a bug, a glitch, or a design vulnerability. These applications are often websites, but can include databases (like SQL)(Citation: NVD CVE-2016-6662), standard services (like SMB(Citation: CIS Multiple SMB Vulnerabilities) or SSH), and any other applications with Internet accessible open sockets, such as web servers and related services.(Citation: NVD CVE-2014-7169) Depending on the flaw being exploited this may include [Exploitation for Defense Evasion](https://attack.mitre.org/techniques/T1211). 91 | 92 | If an application is hosted on cloud-based infrastructure, then exploiting it may lead to compromise of the underlying instance. This can allow an adversary a path to access the cloud APIs or to take advantage of weak identity and access management policies. 93 | 94 | For websites and databases, the OWASP top 10 and CWE top 25 highlight the most common web-based vulnerabilities.(Citation: OWASP Top 10)(Citation: CWE top 25) 95 |
96 | 97 |
98 | Valid Accounts 99 | 100 | Adversaries may obtain and abuse credentials of existing accounts as a means of gaining Initial Access, Persistence, Privilege Escalation, or Defense Evasion. Compromised credentials may be used to bypass access controls placed on various resources on systems within the network and may even be used for persistent access to remote systems and externally available services, such as VPNs, Outlook Web Access and remote desktop. Compromised credentials may also grant an adversary increased privilege to specific systems or access to restricted areas of the network. Adversaries may choose not to use malware or tools in conjunction with the legitimate access those credentials provide to make it harder to detect their presence. 101 | 102 | The overlap of permissions for local, domain, and cloud accounts across a network of systems is of concern because the adversary may be able to pivot across accounts and systems to reach a high level of access (i.e., domain or enterprise administrator) to bypass access controls set within the enterprise. (Citation: TechNet Credential Theft) 103 |
104 | 105 |
106 | Phishing
107 | 
108 | Adversaries may send phishing messages to elicit sensitive information and/or gain access to victim systems. All forms of phishing are electronically delivered social engineering. Phishing can be targeted, known as spear phishing. In spear phishing, a specific individual, company, or industry will be targeted by the adversary. More generally, adversaries can conduct non-targeted phishing, such as in mass malware spam campaigns.
109 | 
110 | Adversaries may send victims emails containing malicious attachments or links, typically to execute malicious code on victim systems or to gather credentials for use of [Valid Accounts](https://attack.mitre.org/techniques/T1078). Phishing may also be conducted via third-party services, like social media platforms.
111 | 
112 | 113 |
114 | External Remote Services 115 | 116 | Adversaries may leverage external-facing remote services to initially access and/or persist within a network. Remote services such as VPNs, Citrix, and other access mechanisms allow users to connect to internal enterprise network resources from external locations. There are often remote service gateways that manage connections and credential authentication for these services. Services such as [Windows Remote Management](https://attack.mitre.org/techniques/T1021/006/) can also be used externally. 117 | 118 | Access to [Valid Accounts](https://attack.mitre.org/techniques/T1078) to use the service is often a requirement, which could be obtained through credential pharming or by obtaining the credentials from users after compromising the enterprise network. Access to remote services may be used as a redundant or persistent access mechanism during an operation. 119 |
120 | 121 |
122 | Trusted Relationship 123 | 124 | Adversaries may breach or otherwise leverage organizations who have access to intended victims. Access through trusted third party relationship exploits an existing connection that may not be protected or receives less scrutiny than standard mechanisms of gaining access to a network. 125 | 126 | Organizations often grant elevated access to second or third-party external providers in order to allow them to manage internal systems as well as cloud-based environments. Some examples of these relationships include IT services contractors, managed security providers, infrastructure contractors (e.g. HVAC, elevators, physical security). The third-party provider's access may be intended to be limited to the infrastructure being maintained, but may exist on the same network as the rest of the enterprise. As such, [Valid Accounts](https://attack.mitre.org/techniques/T1078) used by the other party for access to internal network systems may be compromised and used. 127 |
128 | 
129 | ### Execution
130 | #### ![AdvML](/images/color_advml.png)Execute Unsafe ML Models
131 | 
132 | An adversary may utilize unsafe ML models that, when executed, have an unintended effect. The adversary can use this technique to establish persistent access to systems. These models may be introduced via a [Pre-Trained Model with Backdoor](#Pre-Trained-ML-Model-with-Backdoor).
133 | 
134 | > ##### ![AdvML](/images/color_advml.png)ML Models from Compromised Sources
135 | > 
136 | > Model zoos such as the "Caffe Model Zoo" or the "ONNX Model Zoo" host collections of state-of-the-art, pre-trained ML models so that ML engineers do not have to spend resources training models from scratch (hence "pre-trained"). An adversary may be able to compromise a model by checking malicious code into the repository or by performing a man-in-the-middle attack as the models are downloaded.
137 | > 
138 | > ##### ![AdvML](/images/color_advml.png)Pickle Embedding
139 | > 
140 | > Python is one of the most commonly used ML languages. Python pickles are used to serialize and de-serialize Python object structures. ML models are sometimes stored as pickles and shared. An adversary may use pickle embedding to introduce malicious payloads, which may result in remote code execution (see the defensive inspection sketch at the end of this section).
141 | 
142 | ### ![Cyber](/images/color_cyber.png) Included ATT&CK Techniques
143 | 
144 | Execution via API
145 | For most Machine Learning as a Service (MLaaS) offerings, the primary interaction point is an API, and attackers interact with that API in three ways: to build an offline copy of the model through Model Stealing or Model Replication; to run online attacks such as Model Inversion, Online Evasion, or Membership Inference; and, for causative attacks, to taint the model's training data through a feedback loop.
146 | 
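The three interaction modes above all reduce to the same inference-call primitive. The sketch below shows a minimal client wrapper for a purely hypothetical REST endpoint; the URL, payload schema, and response format are assumptions and not any real MLaaS API.

```python
# Minimal sketch of the single API primitive behind the three interaction modes above.
# The endpoint URL and JSON schema are hypothetical placeholders, not a real service.
import requests

INFERENCE_URL = "https://mlaas.example.com/v1/predict"  # hypothetical endpoint

def query(sample: list[float], api_key: str) -> dict:
    """Send one input to the inference API and return its label and confidence scores."""
    resp = requests.post(
        INFERENCE_URL,
        json={"instances": [sample]},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

# The same `query` call serves as:
#   1. a labeling oracle for Model Replication / Model Stealing (building an offline copy),
#   2. the probe used by Online Evasion, Model Inversion, and Membership Inference,
#   3. the injection point for causative attacks when predictions feed back into training.
```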
147 | 148 |
149 | Traditional Software Attacks
150 | All ML models exist as code, and are thus vulnerable to "traditional software attacks" if the underlying system is not secured appropriately. For instance, [researchers](https://arxiv.org/abs/1711.11008) found that a number of popular ML development packages such as TensorFlow, Caffe, and OpenCV had open CVEs against them, making them vulnerable to traditional heap overflow and buffer overflow attacks.
151 | 
152 | 
153 | 
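The Pickle Embedding sub-technique above works because unpickling can execute constructors named inside the byte stream. The following defender-side sketch is an illustrative assumption rather than a prescribed defense: it uses Python's standard `pickletools` module to list those constructors and flag model files that reference anything outside a small allowlist, without ever loading the pickle.

```python
# Defender-side sketch: inspect a pickled model file *without* loading it.
# Unpickling executes constructors referenced by GLOBAL / STACK_GLOBAL opcodes,
# so we enumerate them and compare against an illustrative allowlist.
import pickletools

ALLOWED_PREFIXES = ("numpy", "sklearn", "collections")  # illustrative, tune per ML stack

def imported_globals(path: str) -> set[str]:
    """Return the 'module.name' constructors referenced by the pickle stream."""
    with open(path, "rb") as f:
        data = f.read()
    found = set()
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # GLOBAL carries "module name" as text; normalize to "module.name".
            found.add(str(arg).replace(" ", "."))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+ resolves module/name from the stack; this simple scanner
            # cannot recover them, so mark the file for manual review.
            found.add("<unresolved STACK_GLOBAL>")
    return found

def looks_safe(path: str) -> bool:
    """True only if every referenced constructor starts with an allowed prefix."""
    return all(g.startswith(ALLOWED_PREFIXES) for g in imported_globals(path))
```

A stricter mitigation is to avoid executable serialization formats altogether, for example by distributing models as ONNX graphs or plain weight tensors and pinning their hashes, which also helps against the ML Models from Compromised Sources sub-technique.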
154 | 
155 | ### Persistence
156 | #### ![AdvML](/images/color_advml.png)Execute Unsafe ML Models
157 | 
158 | An adversary may utilize unsafe ML models that, when executed, have an unintended effect. The adversary can use this technique to establish persistent access to systems. These models may be introduced via a [Pre-Trained Model with Backdoor](#Pre-Trained-ML-Model-with-Backdoor). An example of this technique is using pickle embedding to introduce malicious payloads.
159 | 
160 | ### ![Cyber](/images/color_cyber.png) Included ATT&CK Techniques
161 | 
162 | Account Manipulation 163 | Adversaries may manipulate accounts to maintain access to victim systems. Account manipulation may consist of any action that preserves adversary access to a compromised account, such as modifying credentials or permission groups. These actions could also include account activity designed to subvert security policies, such as performing iterative password updates to bypass password duration policies and preserve the life of compromised credentials. In order to create or manipulate accounts, the adversary must already have sufficient permissions on systems or the domain. 164 |
165 | 166 |
167 | Implant Container Image 168 | Adversaries may implant cloud container images with malicious code to establish persistence. Amazon Web Service (AWS) Amazon Machine Images (AMI), Google Cloud Platform (GCP) Images, and Azure Images as well as popular container runtimes such as Docker can be implanted or backdoored. Depending on how the infrastructure is provisioned, this could provide persistent access if the infrastructure provisioning tool is instructed to always use the latest image.(Citation: Rhino Labs Cloud Image Backdoor Technique Sept 2019) 169 | 170 | A tool has been developed to facilitate planting backdoors in cloud container images.(Citation: Rhino Labs Cloud Backdoor September 2019) If an attacker has access to a compromised AWS instance, and permissions to list the available container images, they may implant a backdoor such as a [Web Shell](https://attack.mitre.org/techniques/T1505/003).(Citation: Rhino Labs Cloud Image Backdoor Technique Sept 2019) Adversaries may also implant Docker images that may be inadvertently used in cloud deployments, which has been reported in some instances of cryptomining botnets.(Citation: ATT Cybersecurity Cryptocurrency Attacks on Cloud) 171 |
172 | 
173 | ### Evasion
174 | 
175 | #### ![AdvML](/images/color_advml.png)Evasion Attack
176 | 
177 | Unlike poisoning attacks, which need access to the training data, adversaries can fool an ML classifier simply by corrupting the query sent to the ML model. More broadly, the adversary can craft data inputs that prevent a machine learning model from positively identifying the data sample, evading correct classification in the downstream task.
178 | 
179 | 
180 | > ##### ![AdvML](/images/color_advml.png)Offline Evasion
181 | > In this case, the attacker has an offline copy of the ML model that was obtained via Model Replication or Model Extraction; depending on the case, the offline copy may be a lower-fidelity shadow copy or a faithful reconstruction of the original model. While the adversary's goal is to evade the online model, access to an offline copy gives the attacker room to search for evasive inputs without fear of tripwires. Once a sample that evades the offline ML model is found, the attacker can essentially replay that sample against the victim's online model and succeed in the operation.
182 | 
183 | > This raises the question: how can an adversary algorithmically find a sample that evades the offline ML model? There are many strategies at play, and depending on the economics, the attacker may choose from the following: simple transformations of the input (cropping, shearing, translation), common corruptions (adding white noise in the background), adversarial examples (carefully perturbing the input to achieve a desired output), and happy strings (wherein benign input is tacked onto malicious query points).
184 | > 
185 | > ##### ![AdvML](/images/color_advml.png)Online Evasion
186 | > 
187 | > The same sub-techniques (simple transformations, common corruptions, adversarial examples, happy strings) also work in the context of online evasion attacks. The distinction between offline and online evasion is whether the model under attack is a stolen/replicated copy or the live ML model.
188 | 
189 | #### ![AdvML](/images/color_advml.png)Model Poisoning
190 | 
191 | Adversaries can train machine learning models that are performant but contain backdoors that produce inference errors when presented with input containing a trigger defined by the adversary. A backdoored model can be introduced by an innocent user via a [pre-trained model with backdoor](#Pre-Trained-ML-Model-with-Backdoor) or can be the result of [Data Poisoning](#Data-Poisoning). This backdoored model can be exploited at inference time with an [Evasion Attack](#Evasion-Attack).
192 | 
193 | #### ![AdvML](/images/color_advml.png)Data Poisoning
194 | 
195 | Adversaries may attempt to poison datasets used by an ML system by modifying the underlying data or its labels. This allows the adversary to embed vulnerabilities in ML models trained on the data that may not be easily detectable. The embedded vulnerability can be activated at a later time by presenting the model with data containing the trigger. Data Poisoning can help enable attacks such as [ML Model Evasion](#Evasion-Attack).
196 | 
197 | > ###### ![AdvML](/images/color_advml.png)Tainting Data from Acquisition - Label Corruption
198 | > 
199 | > Adversaries may attempt to alter labels in a training set, causing the model to misclassify.
200 | > 
201 | > ###### ![AdvML](/images/color_advml.png)Tainting Data from Open Source Supply Chains
202 | > 
203 | > Adversaries may attempt to add their own data to an open source dataset, which could create a classification backdoor. For instance, the adversary could cause a targeted misclassification only when certain triggers are present in the query, while the model performs well otherwise.
204 | > 
205 | > ###### ![AdvML](/images/color_advml.png)Tainting Data from Acquisition - Chaff Data
206 | > 
207 | > Adding noise to a dataset lowers the accuracy of the model, potentially making the model more prone to misclassifications. For instance, researchers showed how they could overwhelm Splunk (and hence the ML models feeding from it) simply by injecting fake or corrupted log data. See [Attacking SIEM with Fake Logs](https://letsdefend.io/blog/attacking-siem-with-fake-logs/).
208 | > 
209 | > ###### ![AdvML](/images/color_advml.png)Tainting Data in Training - Label Corruption
210 | > 
211 | > Changing training labels could create a backdoor in the model, such that a malicious input would always be classified to the benefit of the adversary. For instance, the adversary could cause a targeted misclassification only when certain triggers are present in the query, while the model performs well otherwise.
212 | 
213 | ### Exfiltration
214 | 
215 | #### ![AdvML](/images/color_advml.png)Exfiltrate Training Data
216 | 
217 | Adversaries may exfiltrate private information related to machine learning models via their inference APIs. Additionally, adversaries can use these APIs to create copy-cat or proxy models.
218 | 
219 | > ##### ![AdvML](/images/color_advml.png)Membership Inference Attack
220 | > 
221 | > The membership of a data sample in a training set may be inferred by an adversary with access to an inference API. By strategically querying the victim model's inference API, and with no additional access, the adversary can cause privacy violations.
222 | > 
223 | > ##### ![AdvML](/images/color_advml.png)ML Model Inversion
224 | > 
225 | > Machine learning models' training data can sometimes be reconstructed by exploiting the confidence scores that are available via an inference API. By querying the inference API strategically, an adversary could recover potentially private information embedded within the training data. This could lead to privacy violations if the attacker can reconstruct the values of sensitive features used in the algorithm.
226 | 
227 | #### ![AdvML](/images/color_advml.png)ML Model Stealing
228 | 
229 | Machine learning models' functionality can be stolen by exploiting an inference API. There is a difference between model extraction and model replication: in model extraction attacks, the attacker is able to build a shadow model whose fidelity matches that of the victim model, and hence model stealing/extraction attacks lead to [Stolen Intellectual Property](#Stolen-Intellectual-Property). In Model Replication attacks, described above, the shadow model does not have the same fidelity as the victim model.
230 | 
231 | #### ![Cyber](/images/color_cyber.png) Included ATT&CK Techniques
232 | 
233 | Insecure Storage
234 | 
235 | Adversaries may exfiltrate proprietary machine learning models or private training and testing data by exploiting insecure storage mechanisms. Adversaries may [discover](#ML-Model-Discovery) and exfiltrate components of an ML pipeline, resulting in Stolen Intellectual Property.
236 | 
237 | 
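As a concrete illustration of the Membership Inference Attack sub-technique above, here is a minimal sketch of its simplest confidence-thresholding form; `query_confidences` is a hypothetical placeholder for the victim's inference endpoint, and the fixed threshold is an illustrative assumption.

```python
# Minimal sketch of a confidence-thresholding membership inference test, as described
# under "Exfiltrate Training Data" above. `query_confidences` is a hypothetical helper.
import numpy as np

def query_confidences(sample: np.ndarray) -> np.ndarray:
    """Placeholder: return the victim model's per-class confidence scores for `sample`."""
    raise NotImplementedError("stand-in for the victim's inference endpoint")

def likely_training_member(sample: np.ndarray, threshold: float = 0.95) -> bool:
    # Models tend to be over-confident on examples they were trained on, so an unusually
    # high top confidence is (weak) evidence that the sample was in the training set.
    confidences = query_confidences(sample)
    return float(np.max(confidences)) >= threshold
```

Published attacks calibrate this decision with shadow models trained on similar data rather than a fixed cutoff, but the underlying signal is the same: detailed confidence scores leak information about the training set, which is why truncating or rounding the scores an API returns is a commonly discussed mitigation.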
238 | 
239 | ### Impact
240 | 
241 | #### ![AdvML](/images/color_advml.png)Defacement
242 | 
243 | Adversaries can craft inputs that subvert the system for notoriety or amusement. This can be achieved by corrupting the training data via poisoning, as in the defacement of [Tay Bot](/pages/case-studies-page.md#tay-poisoning), via evasion, or by exploiting open CVEs in ML development packages.
244 | 
245 | #### ![AdvML](/images/color_advml.png)Denial of Service
246 | 
247 | Adversaries may target different machine learning services to conduct a denial of service (DoS). One example of this type of attack is [sponge examples](https://arxiv.org/abs/2006.03463): specially crafted inputs that could cause a DoS on production NLP systems by driving up their energy consumption and inference latency.
248 | 
249 | #### ![Cyber](/images/color_cyber.png) Included ATT&CK Techniques
250 | 
251 | Stolen Intellectual Property
252 | 
253 | Adversaries may steal intellectual property by [Model Replication](#Model-Replication) or [Model Stealing](#ML-Model-Stealing).
254 | 
255 | 
256 | 257 |
258 | Data Encrypted for Impact 259 | 260 | Adversaries may encrypt data on target systems or on large numbers of systems in a network to interrupt availability to system and network resources. They can attempt to render stored data inaccessible by encrypting files or data on local and remote drives and withholding access to a decryption key. This may be done in order to extract monetary compensation from a victim in exchange for decryption or a decryption key (ransomware) or to render data permanently inaccessible in cases where the key is not saved or transmitted.(Citation: US-CERT Ransomware 2016)(Citation: FireEye WannaCry 2017)(Citation: US-CERT NotPetya 2017)(Citation: US-CERT SamSam 2018) In the case of ransomware, it is typical that common user files like Office documents, PDFs, images, videos, audio, text, and source code files will be encrypted. In some cases, adversaries may encrypt critical system files, disk partitions, and the MBR.(Citation: US-CERT NotPetya 2017) 261 | 262 | To maximize impact on the target organization, malware designed for encrypting data may have worm-like features to propagate across a network by leveraging other attack techniques like [Valid Accounts](https://attack.mitre.org/techniques/T1078), [OS Credential Dumping](https://attack.mitre.org/techniques/T1003), and [SMB/Windows Admin Shares](https://attack.mitre.org/techniques/T1021/002).(Citation: FireEye WannaCry 2017)(Citation: US-CERT NotPetya 2017) 263 |
264 | 265 |
266 | 
267 | System Shutdown/Reboot
268 | 
269 | Adversaries may shutdown/reboot systems to interrupt access to, or aid in the destruction of, those systems. Operating systems may contain commands to initiate a shutdown/reboot of a machine. In some cases, these commands may also be used to initiate a shutdown/reboot of a remote computer. Shutting down or rebooting systems may disrupt access to computer resources for legitimate users.
270 | 
271 | Adversaries may attempt to shutdown/reboot a system after impacting it in other ways, such as [Disk Structure Wipe](https://attack.mitre.org/techniques/T1561/002/) or [Inhibit System Recovery](https://attack.mitre.org/techniques/T1490/), to hasten the intended effects on system availability.
272 | 
273 | 
274 | 
275 | # Next Recommended Reading
276 | See how the matrix can be used via the [Case Studies Page](/pages/case-studies-page.md#case-studies-page).
277 | 
--------------------------------------------------------------------------------