├── Implementation_Details.md ├── Contribute.md ├── aivss_calculatorV1.py ├── aivss_calculatorV2.py ├── AI Threat Taxonomies.md ├── checklist.md ├── Financial-AIVSS.md ├── Healthcare-AIVSS.md ├── AIVSS-Chinese.md └── README.md /Implementation_Details.md: -------------------------------------------------------------------------------- 1 | 2 | ### Simple Python implementation of the AIVSS proposal by Ken Huang 3 | The user can assess different vulnerabilities by providing inputs through a series of command-line prompts. 4 | 5 | 6 | 7 | ### Instructions 8 | - Run either script from your terminal with `python aivss_calculatorV1.py` or `python aivss_calculatorV2.py` 9 | - Provide the appropriate values for each parameter being assessed. 10 | - The script prints the calculated AIVSS score to the console. 11 | 12 | ### Note 13 | This code provides a basic implementation of the AIVSS framework. You can further customize it by: 14 | - Creating a user interface for easier input and output. 15 | - Integrating the code with other security tools. 16 | 17 | ### V2 Updates 18 | Key changes: 19 | - Updated parameter values: The scoring rubric for each parameter (AV, AC, PR, etc.) has been updated to match the values in the GitHub repository. 20 | - New parameter value dictionaries: Dictionaries for the new AI-specific metrics (MR, DS, EI, DC, AD) have been added with their corresponding values and descriptions. 21 | - Updated weights: The weights assigned to the base metrics (w1), AI-specific metrics (w2), and impact metrics (w3) have been updated to 0.25, 0.45, and 0.30, respectively, as per the new scoring methodology. 22 | - Interactive input for all parameters: The code now prompts the user to select values for all parameters, including the AI-specific metrics, using the get_user_input function. -------------------------------------------------------------------------------- /Contribute.md: -------------------------------------------------------------------------------- 1 | # Contributing to AIVSS 2 | 3 | Thank you for your interest in contributing to the Artificial Intelligence Vulnerability Scoring System (AIVSS). This is a proposal-stage documentation project aimed at establishing a standardized scoring system for AI vulnerabilities. 4 | 5 | ## How to Contribute 6 | 7 | ### Types of Contributions We're Looking For 8 | 9 | 1. Framework Improvements 10 | - Refinements to scoring metrics 11 | - New AI-specific vulnerability considerations 12 | - Improvements to calculation methodologies 13 | - Additional use cases and examples 14 | 15 | 2. Documentation 16 | - Clarifications to existing documentation 17 | - Additional examples and use cases 18 | - Spelling and grammar fixes 19 | - Translations to other languages 20 | 21 | 3. Research and Analysis 22 | - Literature reviews 23 | - Case studies 24 | - Comparative analysis with other frameworks 25 | - Validation studies 26 | 27 | ### Getting Started 28 | 29 | 1. Fork the repository 30 | 2. Clone your fork: 31 | ```bash 32 | git clone https://github.com/kenhuangus/Artificial-Intelligence-Vulnerability-Scoring-System-AIVSS.git 33 | ``` 34 | 3. Make your changes 35 | 4. Submit a pull request 36 | 37 | ### Pull Request Process 38 | 39 | 1. Create a descriptive pull request that explains your changes 40 | 2. Link any relevant issues 41 | 3. 
Wait for review from maintainers 42 | 43 | ### Pull Request Template 44 | ```markdown 45 | ## Description 46 | [Describe your changes here] 47 | 48 | ## Type of Contribution 49 | - [ ] Framework improvement 50 | - [ ] Documentation update 51 | - [ ] Research/Analysis 52 | - [ ] Translation 53 | - [ ] Other 54 | 55 | ## Checklist 56 | - [ ] Documentation is clear and concise 57 | - [ ] Changes are properly referenced 58 | - [ ] Spelling and grammar checked 59 | ``` 60 | 61 | ## Style Guidelines 62 | 63 | - Use clear, concise language 64 | - Include examples where appropriate 65 | - Follow markdown formatting 66 | - Provide references when applicable 67 | 68 | ## Questions? 69 | 70 | - Open an issue for questions 71 | - Tag maintainers if needed 72 | - Check existing documentation first 73 | 74 | Thank you for helping improve AIVSS! 75 | 76 | --- 77 | 78 | *This is a proposal-stage project. All contributions are welcome to help shape the framework.* 79 | -------------------------------------------------------------------------------- /aivss_calculatorV1.py: -------------------------------------------------------------------------------- 1 | def calculate_aivss_score(): 2 | """Calculates the AIVSS score interactively based on user input.""" 3 | 4 | # Define the AIVSS parameter values with descriptions for user selection 5 | av_values = { 6 | "1": {"value": 0.85, "description": "Network"}, 7 | "2": {"value": 0.62, "description": "Adjacent"}, 8 | "3": {"value": 0.55, "description": "Local"}, 9 | "4": {"value": 0.2, "description": "Physical"}, 10 | } 11 | ac_values = { 12 | "1": {"value": 0.77, "description": "Low"}, 13 | "2": {"value": 0.44, "description": "High"}, 14 | } 15 | pr_values = { 16 | "1": {"value": 0.85, "description": "None"}, 17 | "2": {"value": 0.62, "description": "Low"}, 18 | "3": {"value": 0.27, "description": "High"}, 19 | } 20 | ui_values = { 21 | "1": {"value": 0.85, "description": "None"}, 22 | "2": {"value": 0.62, "description": "Required"}, 23 | } 24 | s_values = { 25 | "1": {"value": 1.5, "description": "Changed"}, 26 | "2": {"value": 1.0, "description": "Unchanged"}, 27 | } 28 | 29 | # Get user input for each parameter 30 | def get_user_input(parameter_name, values): 31 | print(f"\nSelect {parameter_name}:") 32 | for key, value in values.items(): 33 | print(f"{key}. {value['description']}") 34 | choice = input("Enter your choice: ") 35 | while choice not in values: 36 | print("Invalid choice. 
Please try again.") 37 | choice = input("Enter your choice: ") 38 | return values[choice]["value"] 39 | 40 | attack_vector = get_user_input("Attack Vector (AV)", av_values) 41 | attack_complexity = get_user_input("Attack Complexity (AC)", ac_values) 42 | privileges_required = get_user_input("Privileges Required (PR)", pr_values) 43 | user_interaction = get_user_input("User Interaction (UI)", ui_values) 44 | scope = get_user_input("Scope (S)", s_values) 45 | 46 | model_robustness = float( 47 | input("\nEnter Model Robustness (MR) (0.0 to 1.0): ") 48 | ) 49 | data_sensitivity = float(input("Enter Data Sensitivity (DS) (0.0 to 1.0): ")) 50 | ethical_impact = float(input("Enter Ethical Impact (EI) (0.0 to 1.0): ")) 51 | decision_criticality = float( 52 | input("Enter Decision Criticality (DC) (0.0 to 1.0): ") 53 | ) 54 | adaptability = float(input("Enter Adaptability (AD) (0.0 to 1.0): ")) 55 | 56 | confidentiality_impact = float( 57 | input("\nEnter Confidentiality Impact (C) (0.0 to 1.0): ") 58 | ) 59 | integrity_impact = float( 60 | input("\nEnter Integrity Impact (I) (0.0 to 1.0): ") 61 | ) 62 | availability_impact = float( 63 | input("\nEnter Availability Impact (A) (0.0 to 1.0): ") 64 | ) 65 | safety_impact = float(input("\nEnter Safety Impact (SI) (0.0 to 1.0): ")) 66 | 67 | # Calculate the base metrics score 68 | base_metrics = ( 69 | attack_vector 70 | * attack_complexity 71 | * privileges_required 72 | * user_interaction 73 | * scope 74 | ) 75 | base_metrics = min(10, base_metrics) # Cap the base metrics score at 10 76 | 77 | # Calculate the AI-specific metrics score 78 | ai_specific_metrics = ( 79 | model_robustness 80 | * data_sensitivity 81 | * ethical_impact 82 | * decision_criticality 83 | * adaptability 84 | ) 85 | 86 | # Calculate the impact metrics score 87 | impact_metrics = ( 88 | confidentiality_impact 89 | + integrity_impact 90 | + availability_impact 91 | + safety_impact 92 | ) / 4 93 | 94 | # Calculate the final AIVSS score 95 | # Note: The weights (w1, w2, w3) and temporal metrics are not included in this 96 | # example. You can adjust the weights and add temporal metrics as needed. 
97 | w1 = 0.4 98 | w2 = 0.4 99 | w3 = 0.2 100 | aivss_score = ( 101 | w1 * base_metrics + w2 * ai_specific_metrics + w3 * impact_metrics 102 | ) 103 | 104 | return aivss_score 105 | 106 | 107 | # Example usage 108 | aivss_score = calculate_aivss_score() 109 | print(f"\nThe AIVSS score is: {aivss_score:.2f}") -------------------------------------------------------------------------------- /aivss_calculatorV2.py: -------------------------------------------------------------------------------- 1 | def calculate_aivss_score(): 2 | """Calculates the AIVSS score interactively based on user input and the updated scoring rubric.""" 3 | 4 | # Define the AIVSS parameter values with descriptions for user selection 5 | av_values = { 6 | "1": {"value": 0.85, "description": "Network"}, 7 | "2": {"value": 0.62, "description": "Adjacent Network"}, 8 | "3": {"value": 0.55, "description": "Local"}, 9 | "4": {"value": 0.20, "description": "Physical"}, 10 | } 11 | ac_values = { 12 | "1": {"value": 0.77, "description": "Low"}, 13 | "2": {"value": 0.44, "description": "High"}, 14 | } 15 | pr_values = { 16 | "1": {"value": 0.85, "description": "None"}, 17 | "2": {"value": 0.62, "description": "Low"}, 18 | "3": {"value": 0.27, "description": "High"}, 19 | } 20 | ui_values = { 21 | "1": {"value": 0.85, "description": "None"}, 22 | "2": {"value": 0.62, "description": "Required"}, 23 | } 24 | s_values = { 25 | "1": {"value": 1.00, "description": "Unchanged"}, 26 | "2": {"value": 1.50, "description": "Changed"}, 27 | } 28 | mr_values = { 29 | "1": {"value": 1.00, "description": "Very High"}, 30 | "2": {"value": 0.80, "description": "High"}, 31 | "3": {"value": 0.60, "description": "Medium"}, 32 | "4": {"value": 0.40, "description": "Low"}, 33 | "5": {"value": 0.20, "description": "Very Low"}, 34 | } 35 | ds_values = { 36 | "1": {"value": 1.00, "description": "Very High"}, 37 | "2": {"value": 0.80, "description": "High"}, 38 | "3": {"value": 0.60, "description": "Medium"}, 39 | "4": {"value": 0.40, "description": "Low"}, 40 | "5": {"value": 0.20, "description": "Very Low"}, 41 | } 42 | ei_values = { 43 | "1": {"value": 1.00, "description": "Very High"}, 44 | "2": {"value": 0.80, "description": "High"}, 45 | "3": {"value": 0.60, "description": "Medium"}, 46 | "4": {"value": 0.40, "description": "Low"}, 47 | "5": {"value": 0.20, "description": "Very Low"}, 48 | } 49 | dc_values = { 50 | "1": {"value": 1.00, "description": "Very High"}, 51 | "2": {"value": 0.80, "description": "High"}, 52 | "3": {"value": 0.60, "description": "Medium"}, 53 | "4": {"value": 0.40, "description": "Low"}, 54 | "5": {"value": 0.20, "description": "Very Low"}, 55 | } 56 | ad_values = { 57 | "1": {"value": 1.00, "description": "Very High"}, 58 | "2": {"value": 0.80, "description": "High"}, 59 | "3": {"value": 0.60, "description": "Medium"}, 60 | "4": {"value": 0.40, "description": "Low"}, 61 | "5": {"value": 0.20, "description": "Very Low"}, 62 | } 63 | c_values = { 64 | "1": {"value": 0.00, "description": "None"}, 65 | "2": {"value": 0.22, "description": "Low"}, 66 | "3": {"value": 0.56, "description": "Medium"}, 67 | "4": {"value": 0.85, "description": "High"}, 68 | "5": {"value": 1.00, "description": "Critical"}, 69 | } 70 | i_values = { 71 | "1": {"value": 0.00, "description": "None"}, 72 | "2": {"value": 0.22, "description": "Low"}, 73 | "3": {"value": 0.56, "description": "Medium"}, 74 | "4": {"value": 0.85, "description": "High"}, 75 | "5": {"value": 1.00, "description": "Critical"}, 76 | } 77 | a_values = { 78 | "1": {"value": 0.00, 
"description": "None"}, 79 | "2": {"value": 0.22, "description": "Low"}, 80 | "3": {"value": 0.56, "description": "Medium"}, 81 | "4": {"value": 0.85, "description": "High"}, 82 | "5": {"value": 1.00, "description": "Critical"}, 83 | } 84 | si_values = { 85 | "1": {"value": 0.00, "description": "None"}, 86 | "2": {"value": 0.22, "description": "Low"}, 87 | "3": {"value": 0.56, "description": "Medium"}, 88 | "4": {"value": 0.85, "description": "High"}, 89 | "5": {"value": 1.00, "description": "Critical"}, 90 | } 91 | 92 | # Get user input for each parameter 93 | def get_user_input(parameter_name, values): 94 | print(f"\nSelect {parameter_name}:") 95 | for key, value in values.items(): 96 | print(f"{key}. {value['description']}") 97 | choice = input("Enter your choice: ") 98 | while choice not in values: 99 | print("Invalid choice. Please try again.") 100 | choice = input("Enter your choice: ") 101 | return values[choice]["value"] 102 | 103 | attack_vector = get_user_input("Attack Vector (AV)", av_values) 104 | attack_complexity = get_user_input("Attack Complexity (AC)", ac_values) 105 | privileges_required = get_user_input("Privileges Required (PR)", pr_values) 106 | user_interaction = get_user_input("User Interaction (UI)", ui_values) 107 | scope = get_user_input("Scope (S)", s_values) 108 | model_robustness = get_user_input("Model Robustness (MR)", mr_values) 109 | data_sensitivity = get_user_input("Data Sensitivity (DS)", ds_values) 110 | ethical_impact = get_user_input("Ethical Impact (EI)", ei_values) 111 | decision_criticality = get_user_input( 112 | "Decision Criticality (DC)", dc_values 113 | ) 114 | adaptability = get_user_input("Adaptability (AD)", ad_values) 115 | confidentiality_impact = get_user_input( 116 | "Confidentiality Impact (C)", c_values 117 | ) 118 | integrity_impact = get_user_input("Integrity Impact (I)", i_values) 119 | availability_impact = get_user_input("Availability Impact (A)", a_values) 120 | safety_impact = get_user_input("Safety Impact (SI)", si_values) 121 | 122 | # Calculate the base metrics score 123 | base_metrics = ( 124 | attack_vector 125 | * attack_complexity 126 | * privileges_required 127 | * user_interaction 128 | * scope 129 | ) 130 | base_metrics = min(10, base_metrics) # Cap the base metrics score at 10 131 | 132 | # Calculate the AI-specific metrics score 133 | ai_specific_metrics = ( 134 | model_robustness 135 | * data_sensitivity 136 | * ethical_impact 137 | * decision_criticality 138 | * adaptability 139 | ) 140 | 141 | # Calculate the impact metrics score 142 | impact_metrics = ( 143 | confidentiality_impact 144 | + integrity_impact 145 | + availability_impact 146 | + safety_impact 147 | ) / 4 148 | 149 | # Calculate the final AIVSS score using updated weights 150 | w1 = 0.25 151 | w2 = 0.45 152 | w3 = 0.30 153 | aivss_score = ( 154 | w1 * base_metrics + w2 * ai_specific_metrics + w3 * impact_metrics 155 | ) 156 | 157 | return aivss_score 158 | 159 | 160 | # Example usage 161 | aivss_score = calculate_aivss_score() 162 | print(f"\nThe AIVSS score is: {aivss_score:.2f}") -------------------------------------------------------------------------------- /AI Threat Taxonomies.md: -------------------------------------------------------------------------------- 1 | 2 | # AI Threat Taxonomies 3 | 4 | This document provides a consolidated overview of ten prominent AI threat taxonomies that are instrumental in shaping the Artificial Intelligence Vulnerability Scoring System (AIVSS). 
These taxonomies offer a comprehensive understanding of the diverse landscape of vulnerabilities and attacks that can impact AI systems. 5 | 6 | | Taxonomy | Description | Link | 7 | | :--------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------- | 8 | | **MITRE ATLAS** | A knowledge base of adversary tactics and techniques based on real-world observations, specifically focused on threats to machine learning systems. It provides a framework for understanding the adversarial ML lifecycle and includes case studies of attacks. | [https://atlas.mitre.org/](https://atlas.mitre.org/) | 9 | | **NIST AI 100-2 E2023** | A taxonomy of adversarial machine learning, including attacks, defenses, and consequences. It provides a detailed framework for understanding and categorizing threats to AI systems and offers guidance on risk management. | [https://csrc.nist.gov/pubs/ai/100/2/e2023/final](https://csrc.nist.gov/pubs/ai/100/2/e2023/final) | 10 | | **EU HLEG Trustworthy AI** | Ethics guidelines for trustworthy AI developed by the European Commission's High-Level Expert Group on Artificial Intelligence. It focuses on human-centric AI principles, including fairness, transparency, accountability, and societal well-being. | [https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai](https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai) | 11 | | **ISO/IEC JTC 1/SC 42** | An international standards body developing standards for artificial intelligence. It covers various aspects of AI, including risk management, trustworthiness, bias, and governance. | [https://www.iso.org/committee/6794475.html](https://www.iso.org/committee/6794475.html) | 12 | | **AI Incident Database** | A database of real-world incidents involving AI systems, including failures, accidents, and malicious attacks. It provides valuable data for understanding the risks associated with AI and informing risk management strategies. | [https://incidentdatabase.ai/](https://incidentdatabase.ai/) | 13 | | **DARPA's GARD** | The Guaranteeing AI Robustness against Deception (GARD) program aims to develop defenses against adversarial attacks on AI systems. It focuses on developing robust AI that can withstand attempts to deceive or manipulate it. | [https://www.darpa.mil/research/programs/guaranteeing-ai-robustness-against-deception](https://www.darpa.mil/research/programs/guaranteeing-ai-robustness-against-deception) | 14 | | **OECD AI Principles** | Principles for responsible stewardship of trustworthy AI, adopted by the Organisation for Economic Co-operation and Development (OECD). They cover aspects such as inclusive growth, human-centered values, transparency, robustness, and accountability. | [https://oecd.ai/en/ai-principles](https://oecd.ai/en/ai-principles) | 15 | | **MITRE Atlas Matrix** | Adversarial ML Threat Matrix is a framework that captures the tactics, techniques, and procedures used by adversaries to attack ML systems. 
It is structured similarly to the ATT&CK framework but specialized for the domain of machine learning. | [https://atlas.mitre.org/](https://atlas.mitre.org/) | 16 | | **CSA LLM Threat Taxonomy** | Defines common threats related to large language models in the cloud. Key categories include model manipulation, data poisoning, sensitive data disclosure, model stealing, and others specific to cloud-based LLM deployments. | [https://cloudsecurityalliance.org/artifacts/csa-large-language-model-llm-threats-taxonomy](https://cloudsecurityalliance.org/artifacts/csa-large-language-model-llm-threats-taxonomy) | 17 | | **MIT AI Threat Taxonomy** | Comprehensive classification of attack surfaces, adversarial techniques, and governance vulnerabilities of AI. It details various types of attacks and provides mitigation strategies. | [https://arxiv.org/pdf/2408.12622](https://arxiv.org/pdf/2408.12622) | 18 | | **OWASP Top 10 for LLMs** | Highlights the most critical security risks for large language model applications. It covers vulnerabilities like prompt injection, data leakage, insecure output handling, and model denial of service, among others. | [https://owasp.org/www-project-top-10-for-large-language-model-applications/](https://owasp.org/www-project-top-10-for-large-language-model-applications/) | 19 | 20 | 21 | **How to use this file:** 22 | 23 | 1. **Save:** Save the content above as a `.md` file (e.g., `ai_threat_taxonomies.md`). 24 | 2. **View:** You can then open this file in any Markdown viewer or editor (like VS Code, Typora, or directly on GitHub) to see the formatted table. 25 | 3. **Add your new domain specific taxonomy here using PR request** 26 | -------------------------------------------------------------------------------- /checklist.md: -------------------------------------------------------------------------------- 1 | 2 | # AIVSS Implementation Checklist # 3 | 4 | **Phase 1: Preparation** 5 | 6 | * [ ] **1.1 Define Scope:** 7 | * [ ] 1.1.1 Identify the AI system(s) to be assessed. 8 | * [ ] 1.1.2 Determine the assessment boundaries (e.g., specific components, lifecycle stages). 9 | * [ ] 1.1.3 Define the objectives of the assessment (e.g., identify vulnerabilities, improve security posture, compliance). 10 | * [ ] **1.2 Assemble Team:** 11 | * [ ] 1.2.1 Form an assessment team with the necessary expertise: 12 | * [ ] AI Security Specialist/Team Lead 13 | * [ ] AI Developers/Data Scientists 14 | * [ ] Security Engineers 15 | * [ ] Compliance/Risk Officer 16 | * [ ] Ethical AI Officer/Review Board (if applicable) 17 | * [ ] 1.2.2 Define roles and responsibilities for each team member. 18 | * [ ] **1.3 Gather Information:** 19 | * [ ] 1.3.1 Collect documentation on the AI system's architecture, data flows, dependencies, and functionalities. 20 | * [ ] 1.3.2 Identify relevant threat models and AI threat taxonomies (see AIVSS Appendix). 21 | * [ ] 1.3.3 Gather information on applicable regulations, standards, and ethical guidelines. 22 | 23 | **Phase 2: Assessment** 24 | 25 | * [ ] **2.1 Base Metrics Evaluation:** 26 | * [ ] 2.1.1 **Attack Vector (AV):** Determine how an attacker could exploit the vulnerability (Network, Adjacent, Local, Physical). 27 | * [ ] 2.1.2 **Attack Complexity (AC):** Assess the difficulty of executing the attack (Low or High). 28 | * [ ] 2.1.3 **Privileges Required (PR):** Determine the level of access needed (None, Low, High). 29 | * [ ] 2.1.4 **User Interaction (UI):** Assess whether user interaction is required for the attack (None or Required). 
30 | * [ ] 2.1.5 **Scope (S):** Determine if the vulnerability impacts other components (Unchanged or Changed). 31 | * [ ] 2.1.6 **Calculate Base Metrics Score** using the formula: `BaseMetrics = min(10, [AV × AC × PR × UI × S] × ScopeMultiplier)` 32 | 33 | * [ ] **2.2 AI-Specific Metrics Evaluation:** 34 | * [ ] 2.2.1 **Model Robustness (MR):** 35 | * [ ] **Evasion Resistance:** Assess susceptibility to evasion attacks. 36 | * [ ] **Gradient Masking/Obfuscation:** Evaluate the use of gradient masking techniques. 37 | * [ ] **Robustness Certification:** Determine if robustness testing or certification has been performed. 38 | * [ ] 2.2.2 **Data Sensitivity (DS):** 39 | * [ ] **Data Confidentiality:** Assess the protection of sensitive data. 40 | * [ ] **Data Integrity:** Evaluate data integrity checks and tamper detection. 41 | * [ ] **Data Provenance:** Determine the level of data lineage tracking. 42 | * [ ] 2.2.3 **Ethical Implications (EI):** 43 | * [ ] **Bias and Discrimination:** Assess the risk of biased or discriminatory outcomes. 44 | * [ ] **Transparency and Explainability:** Evaluate the transparency of the system's decision-making. 45 | * [ ] **Accountability:** Determine the lines of accountability for the system's actions. 46 | * [ ] **Societal Impact:** Assess the potential impact on society. 47 | * [ ] 2.2.4 **Decision Criticality (DC):** 48 | * [ ] **Safety-Critical:** Evaluate if the system is used in safety-critical applications. 49 | * [ ] **Financial Impact:** Assess the potential for financial loss. 50 | * [ ] **Reputational Damage:** Determine the risk of reputational damage. 51 | * [ ] **Operational Disruption:** Evaluate the potential for operational disruption. 52 | * [ ] 2.2.5 **Adaptability (AD):** 53 | * [ ] **Continuous Monitoring:** Assess monitoring for attacks and anomalies. 54 | * [ ] **Retraining Capabilities:** Evaluate the system's ability to be retrained. 55 | * [ ] **Threat Intelligence Integration:** Determine if threat intelligence is used. 56 | * [ ] **Adversarial Training:** Assess the use of adversarial training techniques. 57 | * [ ] 2.2.6 **Adversarial Attack Surface (AA):** 58 | * [ ] **Model Inversion:** Evaluate the risk of model inversion attacks. 59 | * [ ] **Model Extraction:** Assess the risk of model extraction attacks. 60 | * [ ] **Membership Inference:** Determine the risk of membership inference attacks. 61 | * [ ] 2.2.7 **Lifecycle Vulnerabilities (LL):** 62 | * [ ] **Development:** Assess the security of the development environment. 63 | * [ ] **Training:** Evaluate the security of the training environment and data. 64 | * [ ] **Deployment:** Determine the security of the deployment environment. 65 | * [ ] **Operations:** Assess security monitoring and incident response. 66 | * [ ] 2.2.8 **Governance and Validation (GV):** 67 | * [ ] **Compliance:** Evaluate compliance with regulations and standards. 68 | * [ ] **Auditing:** Determine if regular audits are conducted. 69 | * [ ] **Risk Management:** Assess AI-specific risk management processes. 70 | * [ ] **Human Oversight:** Evaluate the level of human oversight. 71 | * [ ] **Ethical Framework Alignment:** Determine alignment with ethical frameworks. 72 | * [ ] 2.2.9 **Cloud Security Alliance LLM Taxonomy (CS):** (Especially relevant for LLMs and cloud deployments) 73 | * [ ] **Model Manipulation:** Assess vulnerability to prompt injection and other manipulations. 74 | * [ ] **Data Poisoning:** Evaluate the risk of data poisoning attacks. 
75 | * [ ] **Sensitive Data Disclosure:** Determine the risk of sensitive data leakage. 76 | * [ ] **Model Stealing:** Assess the risk of model theft. 77 | * [ ] **Failure/Malfunctioning:** Evaluate the system's reliability and fault tolerance. 78 | * [ ] **Insecure Supply Chain:** Determine the security of the supply chain. 79 | * [ ] **Insecure Apps/Plugins:** Assess the security of third-party integrations. 80 | * [ ] **Denial of Service (DoS):** Evaluate vulnerability to DoS attacks. 81 | * [ ] **Loss of Governance/Compliance:** Determine the risk of non-compliance. 82 | * [ ] 2.2.10 **Assign scores (0.0-1.0) to each sub-category** based on the detailed rubric (higher score = more severe issue). 83 | * [ ] 2.2.11 **Determine the Model Complexity Multiplier** (1.0-1.5) based on the complexity of the AI model. 84 | * [ ] 2.2.12 **Calculate the AI-Specific Metrics Score:** `AISpecificMetrics = [MR × DS × EI × DC × AD × AA × LL × GV × CS] × ModelComplexityMultiplier` 85 | 86 | * [ ] **2.3 Impact Metrics Evaluation:** 87 | * [ ] 2.3.1 **Confidentiality Impact (C):** Assess the impact on data confidentiality (None, Low, Medium, High, Critical). 88 | * [ ] 2.3.2 **Integrity Impact (I):** Assess the impact on data and system integrity (None, Low, Medium, High, Critical). 89 | * [ ] 2.3.3 **Availability Impact (A):** Assess the impact on system availability (None, Low, Medium, High, Critical). 90 | * [ ] 2.3.4 **Societal Impact (SI):** Assess the broader societal impact (None, Low, Medium, High, Critical). 91 | * [ ] 2.3.5 **Calculate the Impact Metrics Score:** `ImpactMetrics = (C + I + A + SI) / 4` 92 | 93 | * [ ] **2.4 Temporal Metrics Evaluation:** 94 | * [ ] 2.4.1 **Exploitability (E):** Assess the likelihood of the vulnerability being exploited (Not Defined, Unproven, Proof-of-Concept, Functional, High). 95 | * [ ] 2.4.2 **Remediation Level (RL):** Determine the availability of a fix (Not Defined, Official Fix, Temporary Fix, Workaround, Unavailable). 96 | * [ ] 2.4.3 **Report Confidence (RC):** Assess the confidence in the vulnerability report (Not Defined, Unknown, Reasonable, Confirmed). 97 | * [ ] 2.4.4 **Calculate Temporal Metrics Score** using the given values for each subcategory (e.g., average them). 98 | 99 | * [ ] **2.5 Mitigation Effectiveness Evaluation:** 100 | * [ ] 2.5.1 Assess the effectiveness of implemented mitigations in reducing the identified vulnerabilities. 101 | * [ ] 2.5.2 Assign a **Mitigation Multiplier** score: 1.0 (Strong Mitigation), 1.2 (Moderate Mitigation), 1.5 (Weak/No Mitigation). 102 | 103 | **Phase 3: Scoring and Reporting** 104 | 105 | * [ ] **3.1 Calculate AIVSS Score:** 106 | * [ ] 3.1.1 Use the formula: `AIVSS_Score = [(w₁ × BaseMetrics) + (w₂ × AISpecificMetrics) + (w₃ × ImpactMetrics)] × TemporalMetrics × MitigationMultiplier` 107 | * [ ] 3.1.2 Use the suggested weights (w₁ = 0.3, w₂ = 0.5, w₃ = 0.2) or adjust them based on your organization's risk profile. 108 | * [ ] **3.2 Determine Risk Category:** 109 | * [ ] **Critical:** 9.0 - 10.0 110 | * [ ] **High:** 7.0 - 8.9 111 | * [ ] **Medium:** 4.0 - 6.9 112 | * [ ] **Low:** 0.1 - 3.9 113 | * [ ] **None:** 0.0 114 | * [ ] **3.3 Generate Report:** 115 | * [ ] 3.3.1 Document the assessment findings, including the AIVSS score, risk category, and detailed scores for each metric and sub-category. 116 | * [ ] 3.3.2 Provide justification for the assigned scores, referencing the scoring rubric and evidence gathered. 117 | * [ ] 3.3.3 Analyze the key vulnerabilities and their potential impact. 
118 | * [ ] 3.3.4 Recommend mitigation strategies, prioritized based on the severity of the vulnerabilities. 119 | * [ ] 3.3.5 Include an executive summary for management and technical details for relevant teams. 120 | 121 | **Phase 4: Remediation and Follow-up** 122 | 123 | * [ ] **4.1 Develop Remediation Plan:** 124 | * [ ] 4.1.1 Prioritize vulnerabilities based on their AIVSS score and risk category. 125 | * [ ] 4.1.2 Assign responsibility for implementing mitigations. 126 | * [ ] 4.1.3 Define timelines for remediation. 127 | * [ ] **4.2 Implement Mitigations:** 128 | * [ ] 4.2.1 Implement the recommended mitigation strategies. 129 | * [ ] 4.2.2 Verify the effectiveness of the mitigations. 130 | * [ ] **4.3 Re-assess:** 131 | * [ ] 4.3.1 Conduct a follow-up assessment to ensure that vulnerabilities have been adequately addressed. 132 | * [ ] 4.3.2 Update the AIVSS score and report as needed. 133 | * [ ] **4.4 Continuous Monitoring:** 134 | * [ ] 4.4.1 Implement continuous monitoring of the AI system for new vulnerabilities and emerging threats. 135 | * [ ] 4.4.2 Regularly review and update the AIVSS assessment to reflect changes in the system, the threat landscape, and the organization's risk profile. 136 | 137 | **Appendix: AI Threat Taxonomies** 138 | https://github.com/kenhuangus/Artificial-Intelligence-Vulnerability-Scoring-System-AIVSS/blob/main/AI%20Threat%20Taxonomies.md 139 | 140 | 141 | **Note:** This checklist is a starting point and should be adapted to the specific needs and context of your organization and the AI system being assessed. It's crucial to involve individuals with the right expertise and to document the assessment process thoroughly. Remember that securing AI systems is an ongoing process that requires continuous monitoring, adaptation, and improvement. 142 | -------------------------------------------------------------------------------- /Financial-AIVSS.md: -------------------------------------------------------------------------------- 1 | # AIVSS for the Financial Industry: An Adaptation Guide # 2 | 3 | **Introduction** 4 | 5 | The Artificial Intelligence Vulnerability Scoring System (AIVSS) provides a general framework for assessing the security risks of AI systems. However, different industries have unique requirements and threat landscapes. This document adapts AIVSS to the specific needs and challenges of the financial industry, taking into account the regulatory environment, common use cases, and prevalent threats in this sector. 6 | 7 | **Why a Financial Industry-Specific AIVSS?** 8 | 9 | Financial institutions are increasingly adopting AI for a variety of applications, including: 10 | 11 | * **Fraud Detection:** Identifying fraudulent transactions and activities. 12 | * **Algorithmic Trading:** Automating trading decisions based on market data. 13 | * **Credit Scoring:** Assessing creditworthiness of individuals and businesses. 14 | * **Risk Management:** Modeling and managing various types of financial risks. 15 | * **Customer Service:** Providing personalized financial advice and support through chatbots. 16 | * **Anti-Money Laundering (AML) and Know Your Customer (KYC):** Detecting and preventing money laundering and ensuring compliance with KYC regulations. 17 | 18 | These applications handle highly sensitive financial data and are often subject to strict regulatory oversight (e.g., GDPR, CCPA, GLBA, SOX). 
Consequently, security vulnerabilities in AI systems within the financial sector can have severe consequences, including: 19 | 20 | * **Financial Losses:** Direct financial losses due to fraud, theft, or erroneous trading decisions. 21 | * **Regulatory Penalties:** Significant fines and sanctions for non-compliance with data protection and financial regulations. 22 | * **Reputational Damage:** Loss of customer trust and damage to brand reputation. 23 | * **Legal Action:** Lawsuits from affected customers or investors. 24 | 25 | Therefore, a tailored AIVSS adaptation is crucial for financial institutions to effectively assess and manage the security risks associated with their AI systems. 26 | 27 | **Adaptations for the Financial Industry** 28 | 29 | This adaptation focuses on modifying the weights and scoring criteria within the AIVSS framework to reflect the specific priorities and concerns of the financial industry. 30 | 31 | **1. Modified Weights** 32 | 33 | In the financial industry, certain aspects of AI security are particularly critical. We propose the following adjusted weights for the AIVSS components: 34 | 35 | * **w₁ (Base Metrics):** 0.25 (Slightly reduced emphasis as baseline security is generally well-established in finance) 36 | * **w₂ (AI-Specific Metrics):** 0.6 (Increased emphasis due to the unique risks of AI systems) 37 | * **w₃ (Impact Metrics):** 0.15 (Reduced, as financial impacts are captured within the AI-specific metrics) 38 | 39 | **Rationale:** 40 | 41 | * Financial institutions typically have mature cybersecurity programs, so base metrics, while important, are less critical than AI-specific vulnerabilities. 42 | * AI-specific threats, such as data poisoning, model manipulation, and bias, pose significant risks in financial applications and thus require greater attention. 43 | * Impact is still relevant but many of the financial impacts are captured in the AI-specific metrics like data sensitivity and decision criticality. 44 | 45 | **2. AI-Specific Metrics: Tailored Scoring Guidance** 46 | 47 | Here's how the AI-Specific Metrics should be tailored for the financial industry: 48 | 49 | **MR (Model Robustness)** 50 | 51 | * **Evasion Resistance:** 52 | * **Focus:** High emphasis on robustness against evasion attacks that could manipulate fraud detection, algorithmic trading, or credit scoring models. 53 | * **Examples:** 54 | * **0.9:** Model is easily fooled by small changes in transaction data, leading to false negatives in fraud detection. 55 | * **0.5:** Model has some adversarial training but is vulnerable to sophisticated attacks designed to bypass fraud controls. 56 | * **0.1:** Model is robustly trained and tested against a wide range of evasion attacks relevant to financial fraud and market manipulation. 57 | 58 | * **Gradient Masking/Obfuscation:** 59 | * **Focus:** Important for protecting proprietary trading algorithms and preventing model extraction by competitors. 60 | * **Examples:** 61 | * **0.8:** Trading model's gradients are easily accessible, allowing competitors to reverse-engineer trading strategies. 62 | * **0.4:** Some gradient obfuscation techniques are used, but they do not fully prevent reverse-engineering. 63 | * **0.1:** Strong gradient masking is employed, making it computationally infeasible to extract the model's logic. 64 | 65 | * **Robustness Certification:** 66 | * **Focus:** While formal certification is still nascent, rigorous testing against financial industry-specific attack scenarios is crucial. 
67 | * **Examples:** 68 | * **0.7:** No robustness testing performed specifically for financial fraud or market manipulation scenarios. 69 | * **0.4:** Model tested against basic adversarial attacks, but not specifically tailored to financial use cases. 70 | * **0.1:** Model rigorously tested against a range of financial industry-specific attacks, with documented results. 71 | 72 | **DS (Data Sensitivity)** 73 | 74 | * **Data Confidentiality:** 75 | * **Focus:** Extremely high emphasis due to the handling of highly sensitive personally identifiable information (PII) and financial data. Compliance with regulations like GDPR, CCPA, and GLBA is paramount. 76 | * **Examples:** 77 | * **1.0:** Customer data, including PII and transaction history, is stored unencrypted and accessible to many employees. 78 | * **0.6:** Data is encrypted at rest, but access controls are weak, and data breaches are possible. 79 | * **0.1:** Data is encrypted at rest and in transit, with strict access controls, regular audits, and compliance with all relevant data privacy regulations. 80 | 81 | * **Data Integrity:** 82 | * **Focus:** Critical to ensure the accuracy of financial data used for training and decision-making. Tampering with data can lead to incorrect financial assessments, fraud, and regulatory penalties. 83 | * **Examples:** 84 | * **0.9:** No data integrity checks in place, making it easy to manipulate transaction data or credit information. 85 | * **0.5:** Basic checksums are used, but there are no mechanisms to prevent or detect sophisticated data tampering. 86 | * **0.1:** Strong integrity checks, such as digital signatures and blockchain technology, are used to ensure data immutability and detect any tampering. 87 | 88 | * **Data Provenance:** 89 | * **Focus:** Important for auditing, regulatory compliance, and understanding the origin and processing of financial data. 90 | * **Examples:** 91 | * **0.8:** The origin and processing history of financial data are not tracked or documented. 92 | * **0.4:** Some information about data sources is available, but the lineage is incomplete and difficult to audit. 93 | * **0.1:** Detailed data lineage is tracked, including all transformations and processing steps, with a clear audit trail for regulatory compliance. 94 | 95 | **EI (Ethical Implications)** 96 | 97 | * **Bias and Discrimination:** 98 | * **Focus:** Extremely high emphasis due to the potential for AI systems to perpetuate or amplify existing biases in lending, credit scoring, and other financial services, leading to unfair or discriminatory outcomes. Compliance with fair lending laws is crucial. 99 | * **Examples:** 100 | * **0.9:** Credit scoring model exhibits significant bias against certain demographic groups, leading to discriminatory lending practices. 101 | * **0.5:** Some fairness metrics are monitored, but no active bias mitigation techniques are employed. 102 | * **0.1:** Model is regularly audited for bias, and techniques like re-weighting or adversarial debiasing are used to mitigate any identified biases. 103 | 104 | * **Transparency and Explainability:** 105 | * **Focus:** High emphasis, especially for regulatory compliance (e.g., explaining adverse credit decisions) and building customer trust. 106 | * **Examples:** 107 | * **0.8:** Algorithmic trading model's decisions are completely opaque, making it impossible to understand the rationale behind trades. 
108 | * **0.4:** Limited explainability is provided for credit scoring decisions, but it is not sufficient for regulatory compliance. 109 | * **0.1:** System provides clear and understandable explanations for all decisions, meeting regulatory requirements and enabling customer understanding. 110 | 111 | * **Accountability:** 112 | * **Focus:** Clear lines of accountability are essential for addressing errors, biases, and security incidents in financial AI systems. 113 | * **Examples:** 114 | * **0.7:** It is unclear who is responsible for addressing errors or biases in the AI-driven loan approval process. 115 | * **0.3:** Some roles and responsibilities are defined, but there is no formal accountability framework. 116 | * **0.1:** A comprehensive accountability framework is in place, with clear procedures for incident response, remediation, and audits. 117 | 118 | * **Societal Impact:** 119 | * **Focus:** Consideration of the broader impact of AI on financial inclusion, access to credit, and potential for exacerbating existing inequalities. 120 | * **Examples:** 121 | * **0.7:** The deployment of an AI system for loan approvals has led to a significant decrease in access to credit for underserved communities. 122 | * **0.4:** Some assessment of societal impact has been conducted, but no concrete steps have been taken to address potential negative consequences. 123 | * **0.1:** The AI system is designed to promote financial inclusion and is regularly evaluated for its impact on different communities. 124 | 125 | **DC (Decision Criticality)** 126 | 127 | * **Safety-Critical:** 128 | * **Focus:** While not as prevalent as in other sectors, safety-critical applications may emerge in areas like autonomous vehicles used for financial transactions or high-frequency trading systems where malfunctions could have cascading effects. 129 | * **Examples:** 130 | * **0.7:** A high-frequency trading system has no failsafe mechanisms and could cause significant market disruptions in case of malfunction. 131 | * **0.3:** Basic safety measures are in place, but the system has not undergone rigorous safety testing for financial applications. 132 | * **0.1:** The system meets high safety standards and has multiple failsafe mechanisms to prevent catastrophic failures. 133 | 134 | * **Financial Impact:** 135 | * **Focus:** Extremely high emphasis, as vulnerabilities can lead to direct financial losses, fraud, and regulatory penalties. 136 | * **Examples:** 137 | * **1.0:** A vulnerability in the fraud detection system allows a large-scale fraud scheme to go undetected, resulting in significant financial losses. 138 | * **0.6:** Errors in an algorithmic trading system lead to substantial trading losses. 139 | * **0.2:** System has robust controls and monitoring to prevent and detect errors that could lead to financial losses. 140 | 141 | * **Reputational Damage:** 142 | * **Focus:** High emphasis, as security breaches and ethical lapses can severely damage customer trust and brand reputation in the financial industry. 143 | * **Examples:** 144 | * **0.9:** A data breach exposing sensitive customer financial information leads to a significant loss of customer trust and negative media coverage. 145 | * **0.5:** A biased credit scoring model generates negative publicity and regulatory scrutiny. 146 | * **0.1:** The organization has a strong track record of responsible AI development and deployment, with proactive measures to protect its reputation. 
147 | 148 | * **Operational Disruption:** 149 | * **Focus:** High emphasis, as downtime in critical financial systems can have significant consequences for customers, markets, and the institution itself. 150 | * **Examples:** 151 | * **0.8:** A failure in the core banking system powered by AI causes a prolonged outage, preventing customers from accessing their accounts. 152 | * **0.4:** Some redundancy is in place, but failover procedures are not regularly tested, potentially leading to extended downtime. 153 | * **0.1:** The system is designed for high availability with robust failover mechanisms and a comprehensive business continuity plan. 154 | 155 | **AD (Adaptability)** 156 | 157 | * **Continuous Monitoring:** 158 | * **Focus:** Crucial for detecting anomalies, adversarial attacks, and performance degradation in real-time, especially for fraud detection and algorithmic trading. 159 | * **Examples:** 160 | * **0.7:** No real-time monitoring for adversarial attacks or anomalies in transaction data. 161 | * **0.4:** Basic monitoring of system outputs, but limited analysis and no automated alerts for suspicious activity. 162 | * **0.1:** Comprehensive monitoring of system inputs, outputs, and internal states, with anomaly detection algorithms and automated alerts for potential security incidents. 163 | 164 | * **Retraining Capabilities:** 165 | * **Focus:** Important for adapting to evolving fraud patterns, market conditions, and regulatory requirements. 166 | * **Examples:** 167 | * **0.6:** Fraud detection model can only be retrained manually, which is a slow and infrequent process. 168 | * **0.3:** Algorithmic trading model can be retrained automatically, but the process is not triggered by real-time performance degradation. 169 | * **0.1:** Models are automatically retrained on a regular basis or triggered by changes in performance or data distribution, ensuring they remain effective and up-to-date. 170 | 171 | * **Threat Intelligence Integration:** 172 | * **Focus:** Essential for staying ahead of emerging financial fraud schemes, money laundering techniques, and other threats. 173 | * **Examples:** 174 | * **0.8:** No integration with threat intelligence feeds or other sources of information on financial crime. 175 | * **0.4:** Security team occasionally reviews threat intelligence reports but does not systematically incorporate them into security operations. 176 | * **0.1:** System automatically ingests and analyzes threat intelligence feeds related to financial crime, generating alerts and updating models as needed. 177 | 178 | * **Adversarial Training:** 179 | * **Focus:** Important for building robustness against attacks specifically designed to target financial AI systems. 180 | * **Examples:** 181 | * **0.7:** Model is not trained to be resistant to any specific types of financial fraud or market manipulation attacks. 182 | * **0.4:** Model is trained with some basic adversarial examples, but not specifically tailored to financial scenarios. 183 | * **0.1:** Model undergoes continuous adversarial training using a variety of attack techniques relevant to the financial industry. 184 | 185 | **AA (Adversarial Attack Surface)** 186 | 187 | * **Model Inversion:** 188 | * **Focus:** High risk if sensitive financial data can be extracted from models, potentially violating privacy regulations. 189 | * **Examples:** 190 | * **0.8:** An attacker can reconstruct sensitive customer financial information from a credit scoring model's outputs. 
191 | * **0.4:** Some measures are in place to limit model output precision, but they do not fully prevent model inversion attacks. 192 | * **0.1:** Model is trained with differential privacy or other techniques that provide strong guarantees against model inversion. 193 | 194 | * **Model Extraction:** 195 | * **Focus:** High risk for proprietary trading algorithms or other models that provide a competitive advantage. 196 | * **Examples:** 197 | * **0.9:** An attacker can create a functional copy of a proprietary trading algorithm by querying its API. 198 | * **0.5:** API access is rate-limited, but an attacker can still extract the model over a longer period. 199 | * **0.1:** Strong defenses against model extraction are in place, such as anomaly detection on API queries and model watermarking. 200 | 201 | * **Membership Inference:** 202 | * **Focus:** Medium to high risk, especially for models trained on sensitive financial data, as it could reveal whether specific individuals or transactions were part of the training set. 203 | * **Examples:** 204 | * **0.7:** An attacker can easily determine if a particular individual's financial data was used to train a fraud detection model. 205 | * **0.4:** Some regularization techniques are used, but they do not fully prevent membership inference attacks. 206 | * **0.1:** Model is trained with differential privacy or other techniques that provide strong guarantees against membership inference. 207 | 208 | **LL (Lifecycle Vulnerabilities)** 209 | 210 | * **Development:** 211 | * **Focus:** Secure development practices are crucial to prevent vulnerabilities from being introduced during the model development phase. 212 | * **Examples:** 213 | * **0.7:** Developers use personal laptops with inadequate security controls to develop and train models on sensitive financial data. 214 | * **0.4:** Some secure coding guidelines are followed, but there is no formal secure development lifecycle (SDL) in place. 215 | * **0.1:** Secure development lifecycle (SDL) practices are strictly followed, including code reviews, static analysis, and vulnerability scanning, with access controls on development resources. 216 | 217 | * **Training:** 218 | * **Focus:** Protecting the training environment and data is essential to prevent data breaches, poisoning attacks, and other security incidents. 219 | * **Examples:** 220 | * **0.8:** Training data is stored on an unsecured server with no encryption or access controls. 221 | * **0.4:** Training data is encrypted at rest, but access controls are not strictly enforced. 222 | * **0.1:** Training is performed in a secure and isolated environment with strict access controls, data encryption, and regular security audits. 223 | 224 | * **Deployment:** 225 | * **Focus:** Secure deployment practices are necessary to protect models from unauthorized access, tampering, and other attacks. 226 | * **Examples:** 227 | * **0.7:** Model is deployed on a public server with no authentication required to access its API. 228 | * **0.4:** Model is deployed behind a firewall, but API keys are shared among multiple users. 229 | * **0.1:** Model is deployed in a secure cloud environment with strong authentication, authorization, and regular security updates. 230 | 231 | * **Operations:** 232 | * **Focus:** Continuous monitoring and incident response are crucial for detecting and responding to security incidents in a timely manner. 
233 | * **Examples:** 234 | * **0.8:** No security monitoring or incident response plan is in place for the deployed AI system. 235 | * **0.4:** Basic security monitoring is performed, but there is no dedicated team or process for responding to incidents. 236 | * **0.1:** A dedicated security operations center (SOC) monitors the system 24/7, with automated incident response capabilities and regular security audits. 237 | 238 | **GV (Governance and Validation)** 239 | 240 | * **Compliance:** 241 | * **Focus:** Extremely high emphasis. Financial institutions must comply with a wide range of regulations, including GDPR, CCPA, GLBA, SOX, and fair lending laws. 242 | * **Examples:** 243 | * **0.9:** The AI system collects and processes personal financial data without obtaining proper consent, violating data privacy regulations. 244 | * **0.5:** Some efforts are made to comply with regulations, but there are significant gaps and no formal compliance program. 245 | * **0.1:** The system is fully compliant with all applicable regulations, with a dedicated compliance team, regular audits, and a proactive approach to adapting to new regulations. 246 | 247 | * **Auditing:** 248 | * **Focus:** Regular audits are essential to ensure the security, fairness, and ethical performance of AI systems in the financial industry. 249 | * **Examples:** 250 | * **0.8:** No audits are conducted of the AI system's design, development, deployment, or operation. 251 | * **0.4:** Infrequent or limited audits are performed, focusing only on specific aspects (e.g., code security). 252 | * **0.1:** Regular independent audits are conducted by reputable third parties, covering all aspects of the AI system lifecycle, with clear audit trails and documentation. 253 | 254 | * **Risk Management:** 255 | * **Focus:** AI risks should be fully integrated into the organization's overall enterprise risk management framework. 256 | * **Examples:** 257 | * **0.7:** AI risks are not considered in the organization's risk management processes. 258 | * **0.4:** Basic risk assessments are conducted for AI systems, but they are not integrated into the broader enterprise risk management framework. 259 | * **0.1:** A comprehensive AI risk management framework is in place, with specific processes for identifying, assessing, mitigating, and monitoring AI risks, fully integrated into the organizational risk framework. 260 | 261 | * **Human Oversight:** 262 | * **Focus:** Appropriate human oversight is crucial for critical financial decisions made by AI systems. 263 | * **Examples:** 264 | * **0.8:** Loan applications are automatically approved or denied by an AI system with no human review. 265 | * **0.4:** Limited human oversight is in place, but it is primarily reactive and not well-defined. 266 | * **0.1:** Clear mechanisms for human review and intervention are established for critical financial decisions made by the AI system, with well-defined roles and responsibilities. 267 | 268 | * **Ethical Framework Alignment:** 269 | * **Focus:** Financial institutions should adhere to ethical principles in the design, development, and deployment of AI systems. 270 | * **Examples:** 271 | * **0.7:** No consideration of ethical frameworks or principles in the development of the AI system. 272 | * **0.4:** Basic awareness of ethical guidelines, but limited implementation and no formal ethical review process. 
273 | * **0.1:** The AI system's design and operation demonstrably align with established ethical frameworks (e.g., OECD AI Principles) and the organization's own ethical guidelines. 274 | 275 | **CS (Cloud Security Alliance LLM Taxonomy) - Adapted for Finance** 276 | 277 | * **Model Manipulation:** 278 | * **Focus:** High risk, as manipulated LLMs could generate false or misleading financial information, impacting trading or investment decisions. 279 | * **Examples:** 280 | * **0.8:** An LLM used for generating financial reports is vulnerable to prompt injection attacks that could lead to inaccurate or manipulated information being published. 281 | * **0.4:** Some input filtering is in place, but the LLM can still be manipulated with sophisticated prompts. 282 | * **0.1:** Strong defenses against model manipulation are in place, including robust input validation, adversarial training, and output sanitization. 283 | 284 | * **Data Poisoning:** 285 | * **Focus:** High risk, as poisoned training data could lead to biased or inaccurate financial models, impacting credit scoring, fraud detection, or investment decisions. 286 | * **Examples:** 287 | * **0.8:** An attacker successfully poisons the training data for a fraud detection model, causing it to miss fraudulent transactions. 288 | * **0.4:** Some outlier detection is used, but the system is still vulnerable to targeted data poisoning attacks. 289 | * **0.1:** Strong data validation, anomaly detection, and provenance tracking mechanisms are in place to prevent and detect data poisoning. 290 | 291 | * **Sensitive Data Disclosure:** 292 | * **Focus:** Extremely high risk, as LLMs could inadvertently leak confidential financial information or PII in their outputs. 293 | * **Examples:** 294 | * **0.9:** An LLM used for customer service inadvertently reveals sensitive account information in a conversation. 295 | * **0.5:** Some output filtering is in place, but the LLM may still disclose sensitive information under certain circumstances. 296 | * **0.1:** Strong access controls, encryption, and output sanitization mechanisms are in place to prevent sensitive data disclosure. 297 | 298 | * **Model Stealing:** 299 | * **Focus:** High risk for proprietary LLMs used for financial analysis or trading, as theft could lead to significant financial losses and loss of competitive advantage. 300 | * **Examples:** 301 | * **0.8:** An attacker can extract a proprietary LLM used for investment analysis by repeatedly querying its API. 302 | * **0.4:** API access is rate-limited, but a determined attacker could still extract the model over time. 303 | * **0.1:** Strong defenses against model stealing are in place, including anomaly detection on API queries and model watermarking. 304 | 305 | * **Failure/Malfunctioning:** 306 | * **Focus:** High risk, as malfunctions in LLMs used for critical financial operations could lead to significant disruptions, financial losses, and regulatory penalties. 307 | * **Examples:** 308 | * **0.7:** An LLM used for algorithmic trading malfunctions, leading to erroneous trades and significant financial losses. 309 | * **0.4:** Some error handling is in place, but the system may still experience downtime or produce incorrect outputs under certain conditions. 310 | * **0.1:** The LLM is designed for high availability and fault tolerance, with robust error handling, monitoring, and redundancy mechanisms. 
311 | 312 | * **Insecure Supply Chain:** 313 | * **Focus:** Medium to high risk, as vulnerabilities in third-party LLMs or libraries could compromise the security of financial applications. 314 | * **Examples:** 315 | * **0.6:** The organization uses a third-party LLM for financial analysis without thoroughly vetting its security. 316 | * **0.3:** Some due diligence is performed on third-party LLMs, but there is no continuous monitoring for vulnerabilities. 317 | * **0.1:** Strong supply chain security practices are in place, including thorough security assessments of third-party LLMs and continuous monitoring for vulnerabilities. 318 | 319 | * **Insecure Apps/Plugins:** 320 | * **Focus:** Medium to high risk, as insecure integrations with third-party apps or plugins could introduce vulnerabilities into financial systems. 321 | * **Examples:** 322 | * **0.6:** An insecure plugin used with an LLM allows an attacker to access sensitive financial data. 323 | * **0.3:** Some security measures are in place for apps/plugins, but they are not comprehensive. 324 | * **0.1:** Strong security guidelines and a rigorous vetting process are in place for all apps/plugins that interact with the LLM. 325 | 326 | * **Denial of Service (DoS):** 327 | * **Focus:** High risk, as DoS attacks on LLMs used for critical financial operations could disrupt services and cause significant financial losses. 328 | * **Examples:** 329 | * **0.7:** An LLM used for customer service is easily overwhelmed by a DoS attack, making it unavailable to customers. 330 | * **0.4:** Some rate limiting is in place, but the system is still vulnerable to sophisticated DoS attacks. 331 | * **0.1:** Strong defenses against DoS attacks are in place, including traffic filtering, rate limiting, and auto-scaling. 332 | 333 | * **Loss of Governance/Compliance:** 334 | * **Focus:** Extremely high risk, as non-compliance with financial regulations can lead to severe penalties and reputational damage. 335 | * **Examples:** 336 | * **0.8:** An LLM used for credit scoring is not compliant with fair lending regulations, leading to discriminatory outcomes. 337 | * **0.4:** Some compliance efforts are made, but there are significant gaps and no formal compliance program for the LLM. 338 | * **0.1:** The LLM is fully compliant with all relevant financial regulations, with a dedicated compliance team, regular audits, and a proactive approach to adapting to new regulations. 339 | 340 | **3. Mitigation Multiplier: Tailored for Finance** 341 | 342 | Given the generally robust security posture of financial institutions, the mitigation multiplier should be adjusted to reflect this: 343 | 344 | * **Strong Mitigation:** 1.0 (Reflects existing strong security controls in many financial institutions) 345 | * **Moderate Mitigation:** 1.1 346 | * **Weak/No Mitigation:** 1.3 347 | 348 | **Example Assessment (Illustrative)** 349 | 350 | *(Refer to the full AIVSS framework for a complete example. This section highlights the financial industry-specific adaptations.)* 351 | 352 | Let's consider a hypothetical AI-powered fraud detection system used by a bank: 353 | 354 | ```python 355 | vulnerability = { 356 | # ... 
(Base Metrics - assessed using standard security practices) 357 | 'ai_specific_metrics': { 358 | 'model_robustness': { 359 | 'evasion_resistance': 0.7, # High susceptibility to evasion attacks targeting fraud detection 360 | 'gradient_masking': 0.3, # Some gradient masking, but not a primary concern for this application 361 | 'robustness_certification': 0.6 # Basic testing, but not specific to financial fraud scenarios 362 | }, 363 | 'data_sensitivity': { 364 | 'data_confidentiality': 0.8, # Highly sensitive financial data with inadequate protection 365 | 'data_integrity': 0.6, # Some data integrity checks, but not robust enough for financial transactions 366 | 'data_provenance': 0.5 # Basic data source information, but incomplete lineage 367 | }, 368 | 'ethical_impact': { 369 | 'bias_discrimination': 0.4, # Some fairness monitoring, but no active bias mitigation 370 | 'transparency_explainability': 0.6, # Limited explainability, could be a regulatory issue 371 | 'accountability': 0.3, # Some accountability defined 372 | 'societal_impact': 0.4 # Some consideration, but not a primary focus for fraud detection 373 | }, 374 | 'decision_criticality': { 375 | 'safety_critical': 0.2, # Not directly safety-critical in this case 376 | 'financial_impact': 0.9, # High risk of financial loss due to fraud 377 | 'reputational_damage': 0.7, # High risk of reputational damage from a major security breach 378 | 'operational_disruption': 0.6 # Potential for moderate disruption to fraud detection operations 379 | }, 380 | 'adaptability': { 381 | 'continuous_monitoring': 0.6, # Basic monitoring, but limited real-time anomaly detection 382 | 'retraining_capabilities': 0.5, # Manual retraining possible, but not automated or triggered by performance changes 383 | 'threat_intelligence_integration': 0.4, # Some threat intelligence used, but not systematically 384 | 'adversarial_training': 0.5 # Some adversarial training, but not comprehensive for financial fraud attacks 385 | }, 386 | 'adversarial_attack_surface': { 387 | 'model_inversion': 0.6, # Some risk, as financial data is sensitive 388 | 'model_extraction': 0.3, # Lower risk for fraud detection models 389 | 'membership_inference': 0.5 # Moderate risk, as attackers might try to infer if specific transactions were in the training set 390 | }, 391 | 'lifecycle_vulnerabilities': { 392 | 'development': 0.4, # Some secure development practices, but not comprehensive 393 | 'training': 0.5, # Basic security measures for the training environment 394 | 'deployment': 0.4, # Some security measures in place, but not fully robust 395 | 'operations': 0.5 # Basic security monitoring and incident response 396 | }, 397 | 'governance_and_validation': { 398 | 'compliance': 0.7, # Some compliance efforts, but gaps remain 399 | 'auditing': 0.5, # Infrequent or limited audits 400 | 'risk_management': 0.4, # AI risks partially integrated into the organizational risk framework 401 | 'human_oversight': 0.6, # Some human review of flagged transactions 402 | 'ethical_framework_alignment': 0.4 # Basic awareness of ethical guidelines, but limited implementation 403 | }, 404 | 'cloud_security_llm': { 405 | 'model_manipulation': 0.6, # Some defenses against prompt injection, but not specifically tailored for financial scenarios 406 | 'data_poisoning': 0.5, # Some risk of data poisoning, particularly if using external data sources 407 | 'sensitive_data_disclosure': 0.7, # Risk of data leakage through the LLM 408 | 'model_stealing': 0.3, # Lower risk for this application 409 | 
'failure_malfunctioning': 0.5, # Moderate risk of service disruption 410 | 'insecure_supply_chain': 0.4, # Some reliance on third-party libraries, but not thoroughly vetted 411 | 'insecure_apps_plugins': 0.3, # Limited use of third-party apps/plugins 412 | 'denial_of_service': 0.6, # Some vulnerability to DoS attacks 413 | 'loss_of_governance_compliance': 0.6 # Moderate risk of non-compliance related to LLM usage 414 | } 415 | }, 416 | # ... (Impact Metrics, Temporal Metrics - assessed using standard practices) 417 | 'mitigation_multiplier': 1.1, # Example: Moderate Mitigation - reflecting generally stronger security in finance 418 | 'model_complexity_multiplier': 1.2 # Example: Moderately Complex Model 419 | } 420 | 421 | # ... (Calculations using the AIVSS formula and adapted weights) 422 | ``` 423 | 424 | **Key Considerations for the Example:** 425 | 426 | * **Context:** This is a simplified example. A real-world assessment would involve a much more detailed evaluation of each sub-category based on the specific fraud detection system and its environment. 427 | * **Financial Industry Focus:** The scores and justifications reflect the specific priorities and concerns of the financial industry, such as the high sensitivity of financial data, the importance of regulatory compliance, and the potential for significant financial losses. 428 | * **Adaptation:** This example demonstrates how the AIVSS framework can be adapted to a specific industry by adjusting weights, tailoring scoring criteria, and providing industry-specific examples. 429 | 430 | **Conclusion** 431 | 432 | This adapted AIVSS framework provides a tailored approach for financial institutions to assess and manage the security risks associated with their AI systems. By focusing on the unique challenges and priorities of the financial sector, this adaptation enables organizations to: 433 | 434 | * **Identify and prioritize AI-specific vulnerabilities** that could lead to financial losses, regulatory penalties, and reputational damage. 435 | * **Develop more effective mitigation strategies** tailored to the specific threats faced by financial institutions. 436 | * **Demonstrate a strong security posture** to regulators, customers, and partners. 437 | * **Promote the responsible and trustworthy development and deployment of AI** in the financial industry. 438 | 439 | This document should be used in conjunction with the full AIVSS framework and the accompanying AI threat taxonomy appendix. Continuous updates and community feedback will be essential to ensure that this adaptation remains relevant and effective in the rapidly evolving landscape of AI security in the financial sector. 440 | -------------------------------------------------------------------------------- /Healthcare-AIVSS.md: -------------------------------------------------------------------------------- 1 | 2 | # AIVSS for the Healthcare Industry: An Adaptation Guide 3 | 4 | **Introduction** 5 | 6 | The Artificial Intelligence Vulnerability Scoring System (AIVSS) provides a general framework for assessing the security risks of AI systems. However, different industries have unique requirements and threat landscapes. This document adapts AIVSS to the specific needs and challenges of the healthcare industry, taking into account the regulatory environment, common use cases, and prevalent threats in this sector. 
7 | 8 | **Why a Healthcare Industry-Specific AIVSS?** 9 | 10 | Healthcare organizations are increasingly adopting AI for a variety of applications, including: 11 | 12 | * **Diagnosis and Treatment Recommendations:** Assisting clinicians in diagnosing diseases and suggesting treatment options. 13 | * **Medical Imaging Analysis:** Analyzing medical images (e.g., X-rays, CT scans, MRIs) to detect anomalies and assist in diagnosis. 14 | * **Drug Discovery and Development:** Accelerating the process of drug discovery and development. 15 | * **Personalized Medicine:** Tailoring treatments and interventions to individual patient characteristics. 16 | * **Patient Monitoring and Care:** Monitoring patient health status, predicting potential risks, and providing personalized care recommendations. 17 | * **Administrative Tasks:** Automating administrative tasks such as appointment scheduling, billing, and claims processing. 18 | 19 | These applications handle highly sensitive patient data (Protected Health Information - PHI) and are often subject to strict regulatory oversight (e.g., HIPAA, GDPR). Security vulnerabilities in healthcare AI systems can have severe consequences, including: 20 | 21 | * **Patient Harm:** Incorrect diagnoses, treatment recommendations, or care decisions can directly harm patients. 22 | * **Privacy Violations:** Breaches of PHI can lead to identity theft, discrimination, and reputational damage. 23 | * **Regulatory Penalties:** Significant fines and sanctions for non-compliance with data protection and healthcare regulations. 24 | * **Reputational Damage:** Loss of patient trust and damage to the organization's reputation. 25 | * **Legal Action:** Lawsuits from affected patients or regulatory bodies. 26 | 27 | Therefore, a tailored AIVSS adaptation is crucial for healthcare organizations to effectively assess and manage the security risks associated with their AI systems. 28 | 29 | **Adaptations for the Healthcare Industry** 30 | 31 | This adaptation focuses on modifying the weights and scoring criteria within the AIVSS framework to reflect the specific priorities and concerns of the healthcare industry. 32 | 33 | **1. Modified Weights** 34 | 35 | In the healthcare industry, patient safety and data privacy are paramount. We propose the following adjusted weights for the AIVSS components: 36 | 37 | * **w₁ (Base Metrics):** 0.2 (Slightly reduced emphasis as baseline security is generally well-established in healthcare, but still important) 38 | * **w₂ (AI-Specific Metrics):** 0.5 (High emphasis due to the unique risks of AI systems and the sensitivity of healthcare data) 39 | * **w₃ (Impact Metrics):** 0.3 (Increased emphasis to reflect the potential for patient harm and significant regulatory penalties) 40 | 41 | **Rationale:** 42 | 43 | * Healthcare organizations have established cybersecurity programs, but base metrics remain important due to the critical nature of the sector. 44 | * AI-specific threats, such as data poisoning, adversarial attacks, and biased algorithms, can have direct consequences on patient care and privacy. 45 | * Impact in healthcare is heightened because vulnerabilities can directly affect patient safety and lead to severe regulatory and reputational consequences. 46 | 47 | **2. 
AI-Specific Metrics: Tailored Scoring Guidance** 48 | 49 | Here's how the AI-Specific Metrics should be tailored for the healthcare industry: 50 | 51 | **MR (Model Robustness)** 52 | 53 | * **Evasion Resistance:** 54 | * **Focus:** High emphasis on robustness against evasion attacks that could manipulate diagnosis, treatment recommendations, or medical image analysis. 55 | * **Examples:** 56 | * **0.9:** An attacker can subtly alter a medical image to cause a misdiagnosis by the AI system. 57 | * **0.5:** The model is trained with some adversarial examples but is vulnerable to more sophisticated attacks designed to evade detection. 58 | * **0.1:** The model is rigorously tested and robust against a wide range of evasion attacks relevant to medical image analysis and diagnosis. 59 | 60 | * **Gradient Masking/Obfuscation:** 61 | * **Focus:** Important for protecting the intellectual property of healthcare AI models and preventing model extraction by competitors. 62 | * **Examples:** 63 | * **0.8:** A competitor can easily access and interpret the model's gradients, potentially revealing proprietary information about the model's architecture or training data. 64 | * **0.4:** Some gradient obfuscation techniques are used, but they do not fully prevent reverse-engineering. 65 | * **0.1:** Strong gradient masking is employed, making it computationally infeasible to extract sensitive information from the model's gradients. 66 | 67 | * **Robustness Certification:** 68 | * **Focus:** While formal certification is still developing, rigorous testing against healthcare-specific attack scenarios is essential. 69 | * **Examples:** 70 | * **0.7:** No robustness testing performed specifically for healthcare-related attacks. 71 | * **0.4:** Model tested against basic adversarial attacks, but not specifically tailored to medical imaging or diagnosis. 72 | * **0.1:** Model rigorously tested against a range of healthcare-specific attacks, with documented results and independent validation. 73 | 74 | **DS (Data Sensitivity)** 75 | 76 | * **Data Confidentiality:** 77 | * **Focus:** Extremely high emphasis due to the handling of highly sensitive Protected Health Information (PHI). Compliance with regulations like HIPAA and GDPR is paramount. 78 | * **Examples:** 79 | * **1.0:** Patient data, including PHI, is stored unencrypted and accessible to unauthorized personnel. 80 | * **0.6:** Data is encrypted at rest, but access controls are weak, and data breaches are possible. 81 | * **0.1:** Data is encrypted at rest and in transit, with strict access controls, regular audits, and full compliance with all relevant data privacy regulations (HIPAA, GDPR). 82 | 83 | * **Data Integrity:** 84 | * **Focus:** Critical to ensure the accuracy of patient data used for training and decision-making. Tampering with data can lead to incorrect diagnoses, treatment recommendations, and patient harm. 85 | * **Examples:** 86 | * **0.9:** No data integrity checks in place, making it easy to manipulate patient records or medical image data. 87 | * **0.5:** Basic checksums are used, but there are no mechanisms to prevent or detect sophisticated data tampering. 88 | * **0.1:** Strong integrity checks, such as digital signatures and blockchain technology, are used to ensure data immutability and detect any tampering. 89 | 90 | * **Data Provenance:** 91 | * **Focus:** Important for auditing, regulatory compliance, and understanding the origin and processing of patient data. 
92 | * **Examples:** 93 | * **0.8:** The origin and processing history of patient data are not tracked or documented. 94 | * **0.4:** Some information about data sources is available, but the lineage is incomplete and difficult to audit. 95 | * **0.1:** Detailed data lineage is tracked, including all transformations and processing steps, with a clear audit trail for regulatory compliance. 96 | 97 | **EI (Ethical Implications)** 98 | 99 | * **Bias and Discrimination:** 100 | * **Focus:** Extremely high emphasis due to the potential for AI systems to perpetuate or amplify existing biases in healthcare, leading to disparities in diagnosis, treatment, and outcomes. 101 | * **Examples:** 102 | * **0.9:** A diagnostic AI model exhibits significant bias against certain demographic groups, leading to inaccurate diagnoses and treatment recommendations. 103 | * **0.5:** Some fairness metrics are monitored, but no active bias mitigation techniques are employed. 104 | * **0.1:** Model is regularly audited for bias, and techniques like re-weighting or adversarial debiasing are used to mitigate any identified biases, ensuring fairness across different patient populations. 105 | 106 | * **Transparency and Explainability:** 107 | * **Focus:** High emphasis, especially for building trust with clinicians and patients and meeting regulatory requirements for explainable AI in healthcare. 108 | * **Examples:** 109 | * **0.8:** A treatment recommendation AI provides no explanation for its suggestions, making it difficult for clinicians to understand and trust the system. 110 | * **0.4:** Limited explainability is provided, but it is not sufficient for clinicians to fully understand the rationale behind the AI's recommendations. 111 | * **0.1:** System provides clear and understandable explanations for all decisions, enabling clinicians to understand the reasoning behind the AI's recommendations and make informed decisions. 112 | 113 | * **Accountability:** 114 | * **Focus:** Clear lines of accountability are essential for addressing errors, biases, and security incidents in healthcare AI systems. 115 | * **Examples:** 116 | * **0.7:** It is unclear who is responsible for addressing errors or biases in the AI-driven diagnostic system. 117 | * **0.3:** Some roles and responsibilities are defined, but there is no formal accountability framework. 118 | * **0.1:** A comprehensive accountability framework is in place, with clear procedures for incident response, remediation, and audits. 119 | 120 | * **Societal Impact:** 121 | * **Focus:** Consideration of the broader impact of AI on healthcare access, equity, and the potential for exacerbating existing health disparities. 122 | * **Examples:** 123 | * **0.7:** The deployment of an AI system for diagnosis has led to a decrease in access to care for underserved communities due to biased data or algorithmic design. 124 | * **0.4:** Some assessment of societal impact has been conducted, but no concrete steps have been taken to address potential negative consequences. 125 | * **0.1:** The AI system is designed to promote health equity and is regularly evaluated for its impact on different communities, with ongoing efforts to mitigate any disparities. 126 | 127 | **DC (Decision Criticality)** 128 | 129 | * **Safety-Critical:** 130 | * **Focus:** Extremely high emphasis in healthcare, as AI systems are increasingly used in safety-critical applications such as diagnosis, treatment, and patient monitoring. Errors or malfunctions can directly harm patients. 
131 | * **Examples:** 132 | * **0.9:** An AI-powered surgical robot has no failsafe mechanisms and could cause serious injury to a patient in case of malfunction. 133 | * **0.5:** Basic safety measures are in place, but the system has not undergone rigorous safety testing for medical applications. 134 | * **0.1:** The system meets the highest safety standards for medical devices, with multiple failsafe mechanisms and rigorous testing to prevent patient harm. 135 | 136 | * **Financial Impact:** 137 | * **Focus:** High, as security breaches, system errors, and regulatory penalties can lead to significant financial losses for healthcare organizations. 138 | * **Examples:** 139 | * **0.7:** A data breach exposing patient PHI leads to significant fines and legal costs. 140 | * **0.4:** Errors in an AI-driven billing system result in financial losses for the hospital. 141 | * **0.2:** System has robust controls and monitoring to prevent and detect errors that could lead to financial losses. 142 | 143 | * **Reputational Damage:** 144 | * **Focus:** High emphasis, as security breaches, ethical lapses, and patient harm can severely damage patient trust and the organization's reputation. 145 | * **Examples:** 146 | * **0.9:** A data breach involving patient medical records leads to widespread negative media coverage and loss of patient trust. 147 | * **0.5:** A biased diagnostic AI generates negative publicity and erodes public confidence in the healthcare organization. 148 | * **0.1:** The organization has a strong track record of responsible AI development and deployment, with proactive measures to protect its reputation. 149 | 150 | * **Operational Disruption:** 151 | * **Focus:** High emphasis, as downtime in critical healthcare systems can disrupt patient care and have serious consequences. 152 | * **Examples:** 153 | * **0.8:** A failure in the AI-powered patient monitoring system causes a disruption in care and potentially jeopardizes patient safety. 154 | * **0.4:** Some redundancy is in place, but failover procedures are not regularly tested, potentially leading to extended downtime. 155 | * **0.1:** The system is designed for high availability with robust failover mechanisms and a comprehensive business continuity plan to minimize disruptions to patient care. 156 | 157 | **AD (Adaptability)** 158 | 159 | * **Continuous Monitoring:** 160 | * **Focus:** Crucial for detecting anomalies, adversarial attacks, and performance degradation in real-time, especially for patient monitoring and diagnostic systems. 161 | * **Examples:** 162 | * **0.7:** No real-time monitoring for adversarial attacks or anomalies in patient data. 163 | * **0.4:** Basic monitoring of system outputs, but limited analysis and no automated alerts for suspicious activity. 164 | * **0.1:** Comprehensive monitoring of system inputs, outputs, and internal states, with anomaly detection algorithms and automated alerts for potential security incidents or performance degradation. 165 | 166 | * **Retraining Capabilities:** 167 | * **Focus:** Important for adapting to new medical knowledge, evolving threats, and changes in patient populations. 168 | * **Examples:** 169 | * **0.6:** Diagnostic model can only be retrained manually, which is a slow and infrequent process. 170 | * **0.3:** AI system for treatment recommendations can be retrained automatically, but the process is not triggered by real-time performance degradation or new medical guidelines. 
171 | * **0.1:** Models are automatically retrained on a regular basis or triggered by changes in performance, new medical knowledge, or data distribution, ensuring they remain accurate and up-to-date. 172 | 173 | * **Threat Intelligence Integration:** 174 | * **Focus:** Essential for staying ahead of emerging threats to healthcare AI systems, including new attack techniques and vulnerabilities. 175 | * **Examples:** 176 | * **0.8:** No integration with threat intelligence feeds or other sources of information on healthcare cybersecurity threats. 177 | * **0.4:** Security team occasionally reviews threat intelligence reports but does not systematically incorporate them into security operations. 178 | * **0.1:** System automatically ingests and analyzes threat intelligence feeds relevant to healthcare AI, generating alerts and updating models as needed. 179 | 180 | * **Adversarial Training:** 181 | * **Focus:** Important for building robustness against attacks specifically designed to target healthcare AI systems, such as medical image manipulation or diagnostic evasion. 182 | * **Examples:** 183 | * **0.7:** Model is not trained to be resistant to any specific types of attacks relevant to healthcare. 184 | * **0.4:** Model is trained with some basic adversarial examples, but not specifically tailored to medical imaging or diagnosis. 185 | * **0.1:** Model undergoes continuous adversarial training using a variety of attack techniques relevant to the healthcare industry, such as image manipulation and diagnostic evasion attacks. 186 | 187 | **AA (Adversarial Attack Surface)** 188 | 189 | * **Model Inversion:** 190 | * **Focus:** High risk if sensitive patient data can be extracted from models, potentially violating privacy regulations like HIPAA. 191 | * **Examples:** 192 | * **0.8:** An attacker can reconstruct patient medical images or other PHI from a diagnostic model's outputs. 193 | * **0.4:** Some measures are in place to limit model output precision, but they do not fully prevent model inversion attacks. 194 | * **0.1:** Model is trained with differential privacy or other techniques that provide strong guarantees against model inversion. 195 | 196 | * **Model Extraction:** 197 | * **Focus:** Medium to high risk for proprietary healthcare AI models, as theft could lead to loss of intellectual property and competitive advantage. 198 | * **Examples:** 199 | * **0.7:** An attacker can create a functional copy of a proprietary diagnostic model by querying its API. 200 | * **0.4:** API access is rate-limited, but an attacker can still extract the model over a longer period. 201 | * **0.1:** Strong defenses against model extraction are in place, such as anomaly detection on API queries and model watermarking. 202 | 203 | * **Membership Inference:** 204 | * **Focus:** High risk, especially for models trained on sensitive patient data, as it could reveal whether specific individuals were part of the training set, violating privacy. 205 | * **Examples:** 206 | * **0.7:** An attacker can easily determine if a particular patient's data was used to train a diagnostic model. 207 | * **0.4:** Some regularization techniques are used, but they do not fully prevent membership inference attacks. 208 | * **0.1:** Model is trained with differential privacy or other techniques that provide strong guarantees against membership inference. 
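
The model extraction and membership inference mitigations above repeatedly point to rate limiting and anomaly detection on API queries. As a purely illustrative aid, the sketch below shows one minimal way such per-client query monitoring could be wired up in front of a diagnostic model's API; the `WINDOW_SECONDS` and `QUERY_LIMIT` thresholds and the `flag_client` handler are hypothetical placeholders and are not prescribed by the AIVSS framework.

```python
import time
from collections import defaultdict, deque
from typing import Optional

# Hypothetical policy values -- tune per deployment; not prescribed by AIVSS.
WINDOW_SECONDS = 3600   # sliding window length (1 hour)
QUERY_LIMIT = 500       # queries allowed per client per window before flagging

_query_log = defaultdict(deque)  # client_id -> timestamps of recent queries


def record_query(client_id: str, now: Optional[float] = None) -> bool:
    """Record one model-API query; return True if the client should be flagged.

    A client exceeding QUERY_LIMIT queries inside WINDOW_SECONDS is treated as a
    possible model-extraction or membership-inference probe (the AA sub-metrics).
    """
    now = time.time() if now is None else now
    window = _query_log[client_id]

    # Discard timestamps that have aged out of the sliding window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()

    window.append(now)
    return len(window) > QUERY_LIMIT


def flag_client(client_id: str) -> None:
    """Placeholder response: alert security operations and throttle the client."""
    print(f"[ALERT] {client_id} exceeded its query budget; review for extraction attempts")


if __name__ == "__main__":
    # Simulate a burst of queries from a single client hammering the model API.
    for i in range(600):
        if record_query("client-42", now=1_000_000.0 + i):
            flag_client("client-42")
            break
```

In practice this kind of check would typically live in an API gateway or SIEM pipeline and would be combined with model watermarking and differential privacy, as described in the 0.1-level examples above.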
209 | 210 | **LL (Lifecycle Vulnerabilities)** 211 | 212 | * **Development:** 213 | * **Focus:** Secure development practices are crucial to prevent vulnerabilities from being introduced during the model development phase. 214 | * **Examples:** 215 | * **0.7:** Developers use personal laptops with inadequate security controls to develop and train models on sensitive patient data. 216 | * **0.4:** Some secure coding guidelines are followed, but there is no formal secure development lifecycle (SDL) in place. 217 | * **0.1:** Secure development lifecycle (SDL) practices are strictly followed, including code reviews, static analysis, and vulnerability scanning, with access controls on development resources. 218 | 219 | * **Training:** 220 | * **Focus:** Protecting the training environment and data is essential to prevent data breaches, poisoning attacks, and other security incidents. 221 | * **Examples:** 222 | * **0.8:** Training data is stored on an unsecured server with no encryption or access controls. 223 | * **0.4:** Training data is encrypted at rest, but access controls are not strictly enforced. 224 | * **0.1:** Training is performed in a secure and isolated environment with strict access controls, data encryption, and regular security audits. 225 | 226 | * **Deployment:** 227 | * **Focus:** Secure deployment practices are necessary to protect models from unauthorized access, tampering, and other attacks. 228 | * **Examples:** 229 | * **0.7:** Model is deployed on a public server with no authentication required to access its API. 230 | * **0.4:** Model is deployed behind a firewall, but API keys are shared among multiple users. 231 | * **0.1:** Model is deployed in a secure cloud environment with strong authentication, authorization, and regular security updates, adhering to healthcare industry best practices. 232 | 233 | * **Operations:** 234 | * **Focus:** Continuous monitoring and incident response are crucial for detecting and responding to security incidents in a timely manner. 235 | * **Examples:** 236 | * **0.8:** No security monitoring or incident response plan is in place for the deployed AI system. 237 | * **0.4:** Basic security monitoring is performed, but there is no dedicated team or process for responding to incidents. 238 | * **0.1:** A dedicated security operations center (SOC) monitors the system 24/7, with automated incident response capabilities and regular security audits. 239 | 240 | **GV (Governance and Validation)** 241 | 242 | * **Compliance:** 243 | * **Focus:** Extremely high emphasis. Healthcare organizations must comply with a wide range of regulations, including HIPAA, GDPR, and other data privacy and security laws. 244 | * **Examples:** 245 | * **0.9:** The AI system collects and processes patient data without obtaining proper consent, violating data privacy regulations. 246 | * **0.5:** Some efforts are made to comply with regulations, but there are significant gaps and no formal compliance program. 247 | * **0.1:** The system is fully compliant with all applicable regulations (HIPAA, GDPR), with a dedicated compliance team, regular audits, and a proactive approach to adapting to new regulations. 248 | 249 | * **Auditing:** 250 | * **Focus:** Regular audits are essential to ensure the security, fairness, ethical performance, and regulatory compliance of AI systems in healthcare. 251 | * **Examples:** 252 | * **0.8:** No audits are conducted of the AI system's design, development, deployment, or operation. 
253 | * **0.4:** Infrequent or limited audits are performed, focusing only on specific aspects (e.g., code security). 254 | * **0.1:** Regular independent audits are conducted by reputable third parties, covering all aspects of the AI system lifecycle, with clear audit trails and documentation, specifically addressing healthcare regulations and ethical guidelines. 255 | 256 | * **Risk Management:** 257 | * **Focus:** AI risks should be fully integrated into the organization's overall enterprise risk management framework. 258 | * **Examples:** 259 | * **0.7:** AI risks are not considered in the organization's risk management processes. 260 | * **0.4:** Basic risk assessments are conducted for AI systems, but they are not integrated into the broader enterprise risk management framework. 261 | * **0.1:** A comprehensive AI risk management framework is in place, with specific processes for identifying, assessing, mitigating, and monitoring AI risks, fully integrated into the organizational risk framework. 262 | 263 | * **Human Oversight:** 264 | * **Focus:** Appropriate human oversight is crucial for critical healthcare decisions made by AI systems, ensuring patient safety and ethical considerations. 265 | * **Examples:** 266 | * **0.8:** Treatment recommendations are automatically generated by an AI system with no human review by clinicians. 267 | * **0.4:** Limited human oversight is in place, but it is primarily reactive and not well-defined. 268 | * **0.1:** Clear mechanisms for human review and intervention are established for critical healthcare decisions made by the AI system, with well-defined roles and responsibilities for clinicians. 269 | 270 | * **Ethical Framework Alignment:** 271 | * **Focus:** Healthcare organizations should adhere to ethical principles in the design, development, and deployment of AI systems. 272 | * **Examples:** 273 | * **0.7:** No consideration of ethical frameworks or principles in the development of the AI system. 274 | * **0.4:** Basic awareness of ethical guidelines, but limited implementation and no formal ethical review process. 275 | * **0.1:** The AI system's design and operation demonstrably align with established ethical frameworks (e.g., the Belmont Report, relevant medical ethics guidelines) and the organization's own ethical guidelines, with a specific focus on patient well-being and autonomy. 276 | 277 | **CS (Cloud Security Alliance LLM Taxonomy) - Adapted for Healthcare** 278 | 279 | * **Model Manipulation:** 280 | * **Focus:** High risk, as manipulated LLMs could generate false or misleading medical information, impacting diagnosis, treatment, or patient communication. 281 | * **Examples:** 282 | * **0.8:** An LLM used for generating patient education materials is vulnerable to prompt injection attacks that could lead to inaccurate or harmful medical advice being provided. 283 | * **0.4:** Some input filtering is in place, but the LLM can still be manipulated with sophisticated prompts. 284 | * **0.1:** Strong defenses against model manipulation are in place, including robust input validation, adversarial training, and output sanitization, specifically tailored for medical accuracy. 285 | 286 | * **Data Poisoning:** 287 | * **Focus:** High risk, as poisoned training data could lead to biased or inaccurate medical models, impacting diagnosis, treatment recommendations, or drug discovery. 
288 | * **Examples:** 289 | * **0.8:** An attacker successfully poisons the training data for a diagnostic model, causing it to misdiagnose a particular condition. 290 | * **0.4:** Some outlier detection is used, but the system is still vulnerable to targeted data poisoning attacks. 291 | * **0.1:** Strong data validation, anomaly detection, and provenance tracking mechanisms are in place to prevent and detect data poisoning, with a focus on ensuring the integrity of medical data sources. 292 | 293 | * **Sensitive Data Disclosure:** 294 | * **Focus:** Extremely high risk, as LLMs could inadvertently leak confidential patient information (PHI) or other sensitive medical data in their outputs. 295 | * **Examples:** 296 | * **0.9:** An LLM used for summarizing patient records inadvertently reveals PHI in its summaries. 297 | * **0.5:** Some output filtering is in place, but the LLM may still disclose sensitive information under certain circumstances. 298 | * **0.1:** Strong access controls, encryption, and output sanitization mechanisms are in place, along with rigorous testing to prevent PHI disclosure, complying with HIPAA and other relevant regulations. 299 | 300 | * **Model Stealing:** 301 | * **Focus:** Medium to high risk for proprietary LLMs used for diagnosis, drug discovery, or other healthcare applications, as theft could lead to loss of intellectual property and competitive advantage. 302 | * **Examples:** 303 | * **0.7:** An attacker can extract a proprietary LLM used for drug discovery by repeatedly querying its API. 304 | * **0.4:** API access is rate-limited, but a determined attacker could still extract the model over time. 305 | * **0.1:** Strong defenses against model stealing are in place, including anomaly detection on API queries and model watermarking. 306 | 307 | * **Failure/Malfunctioning:** 308 | * **Focus:** Extremely high risk, as malfunctions in LLMs used for critical healthcare operations could lead to patient harm, delayed treatment, or disruption of care. 309 | * **Examples:** 310 | * **0.8:** An LLM used for generating treatment recommendations malfunctions, leading to incorrect or harmful advice being provided to clinicians. 311 | * **0.4:** Some error handling is in place, but the system may still experience downtime or produce incorrect outputs under certain conditions. 312 | * **0.1:** The LLM is designed for high availability and fault tolerance, with robust error handling, monitoring, and redundancy mechanisms, rigorously tested for reliability in a healthcare setting. 313 | 314 | * **Insecure Supply Chain:** 315 | * **Focus:** Medium to high risk, as vulnerabilities in third-party LLMs or libraries could compromise the security of healthcare applications. 316 | * **Examples:** 317 | * **0.6:** The organization uses a third-party LLM for medical image analysis without thoroughly vetting its security. 318 | * **0.3:** Some due diligence is performed on third-party LLMs, but there is no continuous monitoring for vulnerabilities. 319 | * **0.1:** Strong supply chain security practices are in place, including thorough security assessments of third-party LLMs and continuous monitoring for vulnerabilities. 320 | 321 | * **Insecure Apps/Plugins:** 322 | * **Focus:** Medium to high risk, as insecure integrations with third-party apps or plugins could introduce vulnerabilities into healthcare systems. 323 | * **Examples:** 324 | * **0.6:** An insecure plugin used with an LLM allows an attacker to access sensitive patient data. 
325 | * **0.3:** Some security measures are in place for apps/plugins, but they are not comprehensive. 326 | * **0.1:** Strong security guidelines and a rigorous vetting process are in place for all apps/plugins that interact with the LLM, with specific attention to healthcare data security standards. 327 | 328 | * **Denial of Service (DoS):** 329 | * **Focus:** High risk, as DoS attacks on LLMs used for critical healthcare operations could disrupt patient care and have serious consequences. 330 | * **Examples:** 331 | * **0.7:** An LLM used for triaging patients in an emergency department is easily overwhelmed by a DoS attack, making it unavailable. 332 | * **0.4:** Some rate limiting is in place, but the system is still vulnerable to sophisticated DoS attacks. 333 | * **0.1:** Strong defenses against DoS attacks are in place, including traffic filtering, rate limiting, and auto-scaling, ensuring high availability for critical healthcare applications. 334 | 335 | * **Loss of Governance/Compliance:** 336 | * **Focus:** Extremely high risk, as non-compliance with healthcare regulations (e.g., HIPAA, GDPR) can lead to severe penalties, reputational damage, and patient harm. 337 | * **Examples:** 338 | * **0.8:** An LLM used for processing patient data is not compliant with HIPAA regulations, leading to potential privacy violations. 339 | * **0.4:** Some compliance efforts are made, but there are significant gaps and no formal compliance program for the LLM. 340 | * **0.1:** The LLM is fully compliant with all relevant healthcare regulations, with a dedicated compliance team, regular audits, and a proactive approach to adapting to new regulations. 341 | 342 | **3. Mitigation Multiplier: Tailored for Healthcare** 343 | 344 | Given the generally strong emphasis on security and compliance in healthcare, but also the evolving nature of AI risks, the mitigation multiplier should reflect a moderate stance: 345 | 346 | * **Strong Mitigation:** 1.0 347 | * **Moderate Mitigation:** 1.2 348 | * **Weak/No Mitigation:** 1.4 349 | 350 | **Example Assessment (Illustrative)** 351 | 352 | *(Refer to the full AIVSS framework for a complete example. This section highlights the healthcare industry-specific adaptations.)* 353 | 354 | Let's consider a hypothetical AI-powered diagnostic system used by a hospital for analyzing medical images: 355 | 356 | ```python 357 | vulnerability = { 358 | # ... 
(Base Metrics - assessed using standard security practices) 359 | 'ai_specific_metrics': { 360 | 'model_robustness': { 361 | 'evasion_resistance': 0.8, # High susceptibility to image manipulation attacks 362 | 'gradient_masking': 0.5, # Some gradient obfuscation, but could be improved 363 | 'robustness_certification': 0.6 # Basic testing, but not specific to medical image attacks 364 | }, 365 | 'data_sensitivity': { 366 | 'data_confidentiality': 0.9, # Highly sensitive PHI with inadequate protection 367 | 'data_integrity': 0.7, # Some data integrity checks, but not robust enough for medical images 368 | 'data_provenance': 0.6 # Some data source information, but incomplete lineage 369 | }, 370 | 'ethical_impact': { 371 | 'bias_discrimination': 0.7, # Risk of biased diagnoses based on demographic factors 372 | 'transparency_explainability': 0.6, # Limited explainability, making it hard for clinicians to trust the system 373 | 'accountability': 0.4, # Some accountability defined, but needs improvement 374 | 'societal_impact': 0.6 # Some consideration of healthcare access, but more needed 375 | }, 376 | 'decision_criticality': { 377 | 'safety_critical': 0.8, # High, as incorrect diagnoses can lead to patient harm 378 | 'financial_impact': 0.6, # Potential for financial losses due to errors and regulatory penalties 379 | 'reputational_damage': 0.8, # High risk of reputational damage from misdiagnoses or data breaches 380 | 'operational_disruption': 0.7 # Potential for significant disruption to diagnostic workflows 381 | }, 382 | 'adaptability': { 383 | 'continuous_monitoring': 0.5, # Basic monitoring, but limited real-time anomaly detection for medical images 384 | 'retraining_capabilities': 0.4, # Manual retraining possible, but not automated or triggered by performance changes 385 | 'threat_intelligence_integration': 0.3, # Limited use of threat intelligence in healthcare context 386 | 'adversarial_training': 0.4 # Some adversarial training, but not comprehensive for medical image attacks 387 | }, 388 | 'adversarial_attack_surface': { 389 | 'model_inversion': 0.7, # High risk, as patient medical images are highly sensitive 390 | 'model_extraction': 0.5, # Moderate risk, as the model may contain proprietary diagnostic techniques 391 | 'membership_inference': 0.6 # Moderate to high risk, as attackers might try to infer if specific patients' data was used in training 392 | }, 393 | 'lifecycle_vulnerabilities': { 394 | 'development': 0.5, # Some secure development practices, but not comprehensive 395 | 'training': 0.6, # Basic security measures for the training environment, but data security needs improvement 396 | 'deployment': 0.5, # Some security measures in place, but not fully robust 397 | 'operations': 0.6 # Basic security monitoring and incident response, but could be more proactive 398 | }, 399 | 'governance_and_validation': { 400 | 'compliance': 0.7, # Some compliance efforts (e.g., HIPAA), but gaps remain 401 | 'auditing': 0.4, # Infrequent or limited audits, not comprehensive for healthcare AI 402 | 'risk_management': 0.5, # AI risks partially integrated into the organizational risk framework 403 | 'human_oversight': 0.6, # Some human review of diagnoses, but not systematic 404 | 'ethical_framework_alignment': 0.5 # Some awareness of ethical guidelines, but limited implementation 405 | }, 406 | 'cloud_security_llm': { 407 | 'model_manipulation': 0.7, # Vulnerable to prompt injection if using LLMs for report generation or analysis 408 | 'data_poisoning': 0.6, # Risk of data poisoning, 
especially if using external medical data sources 409 | 'sensitive_data_disclosure': 0.8, # High risk of PHI leakage through the LLM 410 | 'model_stealing': 0.4, # Some risk if the LLM is proprietary 411 | 'failure_malfunctioning': 0.7, # High risk, as malfunctions could disrupt diagnosis or treatment 412 | 'insecure_supply_chain': 0.5, # Some reliance on third-party LLMs or libraries, but not thoroughly vetted 413 | 'insecure_apps_plugins': 0.4, # Some risk from third-party integrations 414 | 'denial_of_service': 0.7, # Vulnerable to DoS attacks, potentially disrupting critical operations 415 | 'loss_of_governance_compliance': 0.7 # High risk of non-compliance related to LLM usage and PHI 416 | } 417 | }, 418 | # ... (Impact Metrics, Temporal Metrics - assessed using standard practices) 419 | 'mitigation_multiplier': 1.2, # Example: Moderate Mitigation - reflecting the need for improvement in some areas 420 | 'model_complexity_multiplier': 1.3 # Example: Moderately Complex Model 421 | } 422 | 423 | # ... (Calculations using the AIVSS formula and adapted weights) 424 | ``` 425 | 426 | **Key Considerations for the Example:** 427 | 428 | * **Context:** This is a simplified example. A real-world assessment would involve a much more detailed evaluation of each sub-category based on the specific diagnostic system and its environment. 429 | * **Healthcare Focus:** The scores and justifications reflect the specific priorities and concerns of the healthcare industry, such as patient safety, data privacy (PHI), regulatory compliance (HIPAA, GDPR), and the potential for biased or inaccurate diagnoses. 430 | * **Adaptation:** This example demonstrates how the AIVSS framework can be adapted to a specific industry by adjusting weights, tailoring scoring criteria, and providing industry-specific examples. 431 | 432 | **Conclusion** 433 | 434 | This adapted AIVSS framework provides a tailored approach for healthcare organizations to assess and manage the security risks associated with their AI systems. By focusing on the unique challenges and priorities of the healthcare sector, this adaptation enables organizations to: 435 | 436 | * **Identify and prioritize AI-specific vulnerabilities** that could lead to patient harm, privacy violations, regulatory penalties, and reputational damage. 437 | * **Develop more effective mitigation strategies** tailored to the specific threats faced by healthcare institutions. 438 | * **Demonstrate a strong security and compliance posture** to regulators, patients, and partners. 439 | * **Promote the responsible and trustworthy development and deployment of AI** in healthcare, ultimately improving patient care and outcomes. 440 | 441 | This document should be used in conjunction with the full AIVSS framework and the accompanying AI threat taxonomy appendix. Continuous updates and community feedback will be essential to ensure that this adaptation remains relevant and effective in the rapidly evolving landscape of AI security in the healthcare sector. 442 | 443 | -------------------------------------------------------------------------------- /AIVSS-Chinese.md: -------------------------------------------------------------------------------- 1 | 2 | # 人工智能漏洞评分系统 (AIVSS) # 3 | 4 | ## 1. 简介 ## 5 | 6 | 人工智能漏洞评分系统 (AIVSS)提供了一个标准化、全面的框架,用于评估和量化与人工智能系统相关的安全风险。该框架特别关注大型语言模型 (LLM) 和基于云的部署,同时仍适用于广泛的人工智能系统。AIVSS 将传统的安全漏洞评分概念应用于人工智能的独特特征和挑战,并借鉴了领先的人工智能威胁分类法和安全标准。本文档概述了 AIVSS 框架,包括详细的评分细则、实施检查表以及环境因素的考虑。 7 | 8 | ## 2. 
AIVSS 的必要性 ## 9 | 10 | 传统的漏洞评分系统,例如通用漏洞评分系统 (CVSS),不足以应对人工智能系统带来的独特安全挑战。这些挑战包括: 11 | 12 | * **对抗性攻击:** 人工智能系统容易受到对抗性攻击,这些攻击通过精心设计的输入来操纵模型行为,这是传统系统无法充分捕捉到的威胁。 13 | * **模型退化:** 人工智能模型可能会由于概念漂移或数据中毒而随着时间的推移而退化,从而影响其准确性和可靠性。 14 | * **生命周期漏洞:** 人工智能系统具有复杂的生命周期,从数据收集和训练到部署和维护,每个阶段都会引入潜在的漏洞。 15 | * **伦理和社会影响:** 人工智能系统可能具有重大的伦理和社会影响,例如偏见和歧视,这些在传统的安全评估中并未考虑。 16 | * **人工智能的动态性:** 人工智能系统通常是动态和自适应的,这使得静态评分方法的效果较差。 17 | 18 | AIVSS 通过提供一个针对人工智能特定安全风险的全面框架来应对这些挑战。 19 | 20 | ## 3. 框架组件 ## 21 | 22 | AIVSS 包含以下关键组件: 23 | 24 | ### 3.1. 基本指标 ### 25 | 26 | 基本指标捕获了漏洞的基本特征,这些特征在不同时间和不同环境中保持不变。 27 | 28 | * **攻击向量 (AV):** 反映漏洞利用的上下文。 29 | * 网络 (N):0.85 30 | * 相邻网络 (A):0.62 31 | * 本地 (L):0.55 32 | * 物理 (P):0.2 33 | * **攻击复杂性 (AC):** 衡量攻击者控制之外必须存在才能利用漏洞的条件。 34 | * 低 (L):0.77 35 | * 高 (H):0.44 36 | * **所需权限 (PR):** 描述攻击者在成功利用漏洞之前必须拥有的权限级别。 37 | * 无 (N):0.85 38 | * 低 (L):0.62 39 | * 高 (H):0.27 40 | * **用户交互 (UI):** 捕获除攻击者之外的用户参与成功入侵易受攻击组件的要求。 41 | * 无 (N):0.85 42 | * 必需 (R):0.62 43 | * **范围 (S):** 衡量一个易受攻击组件中的漏洞是否会影响其安全范围之外的组件中的资源。 44 | * 未更改 (U):1.0 45 | * 已更改 (C):1.5 46 | 47 | ### 3.2. 人工智能特定指标 ### 48 | 49 | 人工智能特定指标捕获了与人工智能系统相关的独特漏洞和风险。这些指标是根据详细的评分细则(在第 4 节中提供)进行评估的。 50 | 51 | ``` 52 | AISpecificMetrics = [MR × DS × EI × DC × AD × AA × LL × GV × CS] × ModelComplexityMultiplier 53 | ``` 54 | 55 | * **MR (模型稳健性):** 评估系统对对抗性攻击和模型退化的抵抗力。 56 | * **DS (数据敏感性):** 评估与人工智能系统使用的数据的机密性、完整性和来源相关的风险。 57 | * **EI (伦理影响):** 考虑潜在的偏见、透明度问题、问责制问题和社会影响。 58 | * **DC (决策关键性):** 衡量人工智能系统做出的不正确或恶意决策的潜在后果。 59 | * **AD (适应性):** 评估系统适应不断变化的威胁并在一段时间内保持安全性的能力。 60 | * **AA (对抗性攻击面):** 评估系统暴露于各种对抗性攻击技术的程度。 61 | * **LL (生命周期漏洞):** 考虑人工智能系统生命周期不同阶段的安全风险。 62 | * **GV (治理和验证):** 评估治理机制和验证过程的存在性和有效性。 63 | * **CS (云安全联盟 LLM 分类法):** 解决云环境中 LLM 的特定威胁,如 CSA LLM 威胁分类法所定义。 64 | * **模型复杂度乘数:** 根据人工智能模型的复杂度调整人工智能特定指标分数的因子(对于简单模型为 1.0,对于高度复杂模型为 1.5)。 65 | 66 | ### 3.3. 环境指标 ### 67 | 68 | 环境指标反映了人工智能系统部署环境的特征,这些特征会影响整体风险。 69 | 70 | * **机密性要求 (CR):** 衡量维护人工智能系统处理的数据的机密性的重要性。 71 | * **完整性要求 (IR):** 衡量维护数据和人工智能系统输出的完整性的重要性。 72 | * **可用性要求 (AR):** 衡量确保人工智能系统为其预期目的的可用性的重要性。 73 | * **社会影响要求 (SIR):** 衡量减轻人工智能系统潜在负面社会影响的重要性。 74 | 75 | 这些指标的评级如下: 76 | 77 | * 未定义 (X):1.0(默认值,不修改分数) 78 | * 低 (L):0.5 79 | * 中 (M):1.0 80 | * 高 (H):1.5 81 | 82 | ### 3.4. 修正的基本指标 ### 83 | 84 | 这些指标基于基本指标,但可以根据特定环境进行修改: 85 | 86 | * 修正的攻击向量 (MAV) 87 | * 修正的攻击复杂性 (MAC) 88 | * 修正的所需权限 (MPR) 89 | * 修正的用户交互 (MUI) 90 | * 修正的范围 (MS) 91 | 92 | 这些指标的评级方式与基本指标相同,并增加: 93 | 94 | * 未定义 (X):使用未修改的基本指标值。 95 | 96 | ## 4. 
人工智能特定指标的详细评分细则(分数越高 = 越严重) ## 97 | 98 | 人工智能特定指标中的每个子类别都按 0.0 到 1.0 的等级进行评分,具有以下一般解释: 99 | 100 | * **0.0:未知漏洞:** 表示没有已知漏洞或对特定威胁有正式证明的抵抗力。 101 | * **0.1 - 0.3:低漏洞:** 表示存在低漏洞,并采取了强有力的缓解措施,但仍可能存在一些小的弱点。 102 | * **0.4 - 0.6:中等漏洞:** 表示存在中等漏洞,并采取了一些缓解措施,但仍存在明显的弱点。 103 | * **0.7 - 1.0:严重/高漏洞:** 表示存在严重漏洞,几乎没有或没有采取缓解措施。 104 | 105 | **MR (模型稳健性)** 106 | 107 | * **规避抵抗** 108 | * **0.0:** 正式验证了对各种规避攻击的稳健性。 109 | * **0.1-0.3:** 对大多数已知的规避攻击具有稳健性,采用了多种防御机制(例如,对抗性训练、输入净化、认证的稳健性)。 110 | * **0.4-0.6:** 容易受到某些规避攻击,进行了基本的对抗性训练或输入验证。 111 | * **0.7-1.0:** 非常容易受到常见的规避攻击(例如,FGSM、PGD)。没有或只有极少的防御措施。 112 | * **示例:** 113 | * **0.0:** 模型的稳健性通过形式化方法得到证明。 114 | * **0.2:** 模型结合使用对抗性训练、输入过滤和认证的稳健性技术。 115 | * **0.5:** 模型使用对抗性样本进行训练,但仍然容易受到更复杂的攻击。 116 | * **0.8:** 通过对输入图像添加小的扰动,模型很容易被欺骗。 117 | 118 | * **梯度掩蔽/混淆** 119 | * **0.0:** 梯度完全隐藏或正式证明无法恢复。 120 | * **0.1-0.3:** 使用了强大的梯度掩蔽技术(例如,破碎梯度、温度计编码),使基于梯度的攻击变得更加困难。 121 | * **0.4-0.6:** 采用了基本的梯度混淆方法(例如,添加噪声),但梯度仍然可以部分恢复。 122 | * **0.7-1.0:** 梯度易于访问和解释,没有使用掩蔽技术。 123 | * **示例:** 124 | * **0.0:** 模型使用同态加密或其他方法使梯度完全无法访问。 125 | * **0.2:** 模型使用破碎梯度等先进技术,使基于梯度的攻击在计算上非常昂贵。 126 | * **0.5:** 对梯度添加了一些噪声,但它们仍然揭示了有关模型的信息。 127 | * **0.9:** 模型的梯度可以轻松计算和可视化。 128 | 129 | * **稳健性认证** 130 | * **0.0:** 从信誉良好的第三方组织获得正式的稳健性认证。 131 | * **0.1-0.3:** 使用多种指标(例如,CLEVER、Robustness Gym)对各种攻击进行了严格的稳健性测试。 132 | * **0.4-0.6:** 对有限的攻击集或使用简单的指标进行了基本的稳健性测试。 133 | * **0.7-1.0:** 没有进行稳健性测试。 134 | * **示例:** 135 | * **0.0:** 模型已通过公认的认证机构认证,可抵御特定类型的攻击。 136 | * **0.2:** 使用像 Robustness Gym 这样的综合稳健性测试框架评估模型。 137 | * **0.5:** 模型针对具有有限扰动预算范围的 FGSM 攻击进行了测试。 138 | * **0.8:** 没有针对对抗性示例的稳健性测试。 139 | 140 | **DS (数据敏感性)** 141 | 142 | * **数据机密性** 143 | * **0.0:** 使用差分隐私或同态加密等技术完全匿名化数据。 144 | * **0.1-0.3:** 对静态和传输中的数据使用强加密(例如,AES-256),并实施严格的访问控制和密钥管理实践。 145 | * **0.4-0.6:** 敏感数据具有基本的访问控制(例如,密码),但没有加密。 146 | * **0.7-1.0:** 高度敏感的数据(例如,PII、财务数据)在没有或只有极少保护的情况下存储或处理。 147 | * **示例:** 148 | * **0.0:** 数据已完全匿名化,并且可以证明与个人无关。 149 | * **0.2:** 数据在静态和传输中均已加密,并具有严格的访问控制和密钥轮换策略。 150 | * **0.5:** 数据访问受用户角色限制,但数据以明文形式存储。 151 | * **0.9:** 训练数据包括所有开发人员都可以访问的未加密的 PII。 152 | 153 | * **数据完整性** 154 | * **0.0:** 使用区块链或 Merkle 树等技术正式验证数据完整性。 155 | * **0.1-0.3:** 实施了强大的完整性检查(例如,数字签名、加密哈希)和篡改检测机制。 156 | * **0.4-0.6:** 使用了基本的完整性检查(例如,校验和),但没有防篡改机制。 157 | * **0.7-1.0:** 没有数据完整性检查,数据可以轻松修改而不被检测到。 158 | * **示例:** 159 | * **0.0:** 数据存储在区块链上,确保不变性和防篡改完整性。 160 | * **0.2:** 数据经过数字签名,任何修改都会被检测到并发出警报。 161 | * **0.5:** 在访问时使用校验和来验证数据完整性。 162 | * **0.8:** 数据可以在没有任何检测的情况下被更改。 163 | 164 | * **数据来源** 165 | * **0.0:** 数据来源经过正式验证和可审计,并具有确保数据源真实性和可信度的机制。 166 | * **0.1-0.3:** 跟踪详细的数据沿袭,包括所有转换和处理步骤,并具有清晰的审计跟踪。 167 | * **0.4-0.6:** 有关数据源的基本信息可用,但沿袭不完整或不清楚。 168 | * **0.7-1.0:** 没有关于数据来源、收集方法或转换的信息。 169 | * **示例:** 170 | * **0.0:** 数据来源经过加密验证且防篡改。 171 | * **0.2:** 跟踪完整的数据沿袭,包括所有处理步骤和数据所有者。 172 | * **0.5:** 记录了数据源,但未明确记录应用的转换。 173 | * **0.9:** 数据的来源和收集方法未知。 174 | 175 | **EI (伦理影响)** 176 | 177 | * **偏见和歧视** 178 | * **0.0:** 系统在不同群体中表现出公平和无偏见,并持续监控和审核偏见。 179 | * **0.1-0.3:** 使用多种指标(例如,机会均等、预测率平价)进行严格的公平性测试,并应用了偏见缓解技术(例如,重新加权、对抗性去偏)。 180 | * **0.4-0.6:** 对潜在偏见有一些认识,监控了基本的公平性指标(例如,人口统计均等),但没有积极的缓解措施。 181 | * **0.7-1.0:** 歧视性结果的风险很高,没有使用偏见检测或缓解方法。 182 | * **示例:** 183 | * **0.0:** 系统的公平性经过正式验证并持续监控。 184 | * **0.2:** 系统使用对抗性去偏等技术进行训练,并定期审核公平性。 185 | * **0.5:** 监控公平性指标,但不采取任何措施来解决已识别的偏见。 186 | * **0.9:** 系统始终对某些人口群体产生有偏见的输出。 187 | 188 | * **透明度和可解释性** 189 | * **0.0:** 系统的决策过程完全透明且可正式解释,并建立了明确的因果关系。 190 | * **0.1-0.3:** 高度可解释,系统使用固有可解释的模型(例如,决策树)或为所有决策提供可靠和全面的解释。 191 | * **0.4-0.6:** 有限的可解释性,可以生成一些事后解释(例如,LIME、SHAP),但它们可能不可靠或不全面。 192 | * **0.7-1.0:** 黑盒系统,无法深入了解决策过程。 193 | * **示例:** 194 | 
* **0.0:** 系统的逻辑完全透明,并且可以正式验证。 195 | * **0.2:** 系统使用可解释的模型或为每个决策提供详细且可靠的解释。 196 | * **0.5:** 可以生成事后解释,但它们并不总是准确或完整的。 197 | * **0.8:** 没有提供系统决策的解释。 198 | 199 | * **问责制** 200 | * **0.0:** 完全问责制,并具有补救、纠正和独立监督机制。 201 | * **0.1-0.3:** 建立了明确的问责制框架,并定义了处理错误和争议的角色、责任和流程。 202 | * **0.4-0.6:** 对开发人员或运营商分配了一些责任,但没有正式的问责制框架。 203 | * **0.7-1.0:** 对系统的行为或错误没有明确的问责制。 204 | * **示例:** 205 | * **0.0:** 系统具有正式的问责制框架,并具有独立审核和公开报告机制。 206 | * **0.2:** 为开发、部署和运营定义了明确的角色和责任,并制定了事件响应计划。 207 | * **0.5:** 开发团队通常负责,但没有明确的错误处理程序。 208 | * **0.9:** 不清楚系统出错时谁负责。 209 | 210 | * **社会影响** 211 | * **0.0:** 系统旨在最大限度地提高积极的社会影响并最大限度地减少负面后果,并持续监控和与受影响的社区互动。 212 | * **0.1-0.3:** 进行了全面的社会影响评估,考虑了广泛的利益相关者和潜在危害,并制定了缓解策略。 213 | * **0.4-0.6:** 对潜在的社会影响有一些考虑,但没有全面的评估或主动缓解措施。 214 | * **0.7-1.0:** 负面社会影响的风险很高(例如,失业、操纵、信任侵蚀),没有评估或缓解措施。 215 | * **示例:** 216 | * **0.0:** 系统采用强有力的道德框架设计,促进公平、透明和社会福祉。 217 | * **0.2:** 进行了全面的社会影响评估,并制定了缓解策略。 218 | * **0.5:** 开发人员承认潜在的负面影响,但没有采取具体步骤来解决这些影响。 219 | * **0.8:** 该系统可用于大规模监视或传播错误信息,而没有任何保障措施。 220 | 221 | **DC (决策关键性)** 222 | 223 | * **安全关键** 224 | * **0.0:** 系统经过正式验证,符合安全关键标准(例如,汽车领域的 ISO 26262、医疗器械领域的 IEC 62304)。 225 | * **0.1-0.3:** 进行了严格的安全测试,包括边缘情况和故障场景,并具有故障安全机制和人工监督。 226 | * **0.4-0.6:** 采取了基本的安全措施(例如,一些冗余),但没有严格的安全测试或正式验证。 227 | * **0.7-1.0:** 系统用于安全关键应用(例如,自动驾驶、医疗诊断),没有适当的安全考虑或故障安全机制。 228 | * **示例:** 229 | * **0.0:** 系统已通过认证,符合其应用领域的相关安全标准。 230 | * **0.2:** 系统经过严格的安全测试,并具有多种故障安全机制。 231 | * **0.5:** 系统有一些备份系统,但它们没有经过全面测试。 232 | * **0.9:** 系统用于控制关键功能,没有任何冗余或故障安全机制。 233 | 234 | * **财务影响** 235 | * **0.0:** 系统旨在最大限度地降低财务风险,并具有实时欺诈预防、异常检测和全面的保险范围。 236 | * **0.1-0.3:** 实施了强大的财务控制和欺诈检测机制,并进行定期审核以识别和减轻财务风险。 237 | * **0.4-0.6:** 采取了一些措施来减轻财务风险(例如,交易限额),但没有全面的风险评估或欺诈预防机制。 238 | * **0.7-1.0:** 由于系统错误或恶意攻击而造成重大财务损失的风险很高,没有采取任何保护措施。 239 | * **示例:** 240 | * **0.0:** 系统具有多层财务控制、实时欺诈预防和针对财务损失的保险。 241 | * **0.2:** 系统使用先进的欺诈检测算法并进行定期财务审核。 242 | * **0.5:** 系统有一些交易限额和基本的欺诈监控。 243 | * **0.8:** 系统错误可能导致在没有任何检测的情况下进行未经授权的大额交易。 244 | 245 | * **声誉损害** 246 | * **0.0:** 系统旨在最大限度地降低声誉风险,并持续监控公众认知、主动与利益相关者互动,并制定了稳健的危机管理计划。 247 | * **0.1-0.3:** 进行了声誉风险评估,考虑了各种场景和利益相关者,并制定了沟通计划和缓解策略。 248 | * **0.4-0.6:** 对声誉风险有一些认识,对公众认知的监控有限,但没有采取主动措施来应对负面宣传。 249 | * **0.7-1.0:** 由于系统错误、偏见或安全漏洞而造成严重声誉损害的风险很高,没有缓解策略。 250 | * **示例:** 251 | * **0.0:** 系统设计为透明和道德的,最大限度地降低了声誉损害的风险,并且公司在负责任的人工智能实践方面拥有良好的记录。 252 | * **0.2:** 进行了声誉风险评估,并制定了危机沟通计划。 253 | * **0.5:** 公司监控社交媒体上的负面评论,但没有计划解决这些评论。 254 | * **0.8:** 系统错误或偏见可能导致广泛的公众批评和信任丧失。 255 | 256 | * **运营中断** 257 | * **0.0:** 系统设计为具有高可用性和弹性,并具有实时监控、自动恢复和定期测试故障转移机制。 258 | * **0.1-0.3:** 强大的运营控制,包括冗余、故障转移机制和全面的业务连续性和灾难恢复计划。 259 | * **0.4-0.6:** 采取了一些措施来减轻运营风险(例如,有限的冗余),但没有全面的业务连续性计划。 260 | * **0.7-1.0:** 由于系统故障或攻击而导致重大运营中断的风险很高,没有备份系统或恢复计划。 261 | * **示例:** 262 | * **0.0:** 系统设计为具有多层冗余和自动恢复的 24/7 全天候可用性。 263 | * **0.2:** 系统具有定期测试和更新的全面业务连续性计划。 264 | * **0.5:** 系统具有一些冗余组件,但故障转移程序没有定期测试。 265 | * **0.8:** 系统故障可能会导致关键业务运营中断,而没有任何备份。 266 | 267 | **AD (适应性)** 268 | 269 | * **持续监控** 270 | * **0.0:** 实时监控,对检测到的威胁自动响应,包括动态模型调整和回滚功能。 271 | * **0.1-0.3:** 全面监控系统输入、输出和内部状态,并具有异常检测算法和针对可疑活动的自动警报。 272 | * **0.4-0.6:** 采取了基本的监控措施(例如,记录系统输出),但分析有限,没有自动警报。 273 | * **0.7-1.0:** 没有针对对抗性攻击、异常或性能下降的监控。 274 | * **示例:** 275 | * **0.0:** 系统具有实时入侵检测和自动响应功能。 276 | * **0.2:** 系统使用 SIEM 系统监控异常并生成警报。 277 | * **0.5:** 系统日志已存储,但仅定期手动分析。 278 | * **0.9:** 没有收集日志,也没有执行监控。 279 | 280 | * **再训练能力** 281 | * **0.0:** 由于性能下降、概念漂移或新数据的可用性而触发持续和自动的再训练,且人工干预最少。 282 | * **0.1-0.3:** 建立了自动再训练管道,允许使用新数据和模型改进进行定期更新。 283 | * **0.4-0.6:** 可以手动再训练,但不经常且耗时,自动化程度有限。 284 | * **0.7-1.0:** 没有再训练模型的能力,或者再训练需要大量的人工工作和停机时间。 285 | * **示例:** 286 | * **0.0:** 
模型不断学习并适应新数据和不断变化的条件。 287 | * **0.2:** 模型使用自动管道定期自动再训练。 288 | * **0.5:** 模型可以手动再训练,但这需要大量的精力和停机时间。 289 | * **0.8:** 如果不从头开始重建模型,则无法更新模型。 290 | 291 | * **威胁情报集成** 292 | * **0.0:** 基于威胁情报的主动威胁搜寻,自动分析和关联威胁数据,以在潜在风险影响系统之前识别和减轻这些风险。 293 | * **0.1-0.3:** 将威胁情报源集成到安全监控和响应系统中,提供有关新威胁的自动警报和更新。 294 | * **0.4-0.6:** 使用了基本的威胁情报(例如,手动查看威胁报告),但没有系统地集成到安全运营中。 295 | * **0.7-1.0:** 没有与威胁情报源或其他安全信息源集成。 296 | * **示例:** 297 | * **0.0:** 系统使用威胁情报主动识别和减轻潜在漏洞。 298 | * **0.2:** 系统自动摄取和分析威胁情报源,为相关威胁生成警报。 299 | * **0.5:** 安全团队偶尔会查看威胁情报报告,但不采取具体行动。 300 | * **0.9:** 安全团队不知道当前对人工智能系统的威胁。 301 | 302 | * **对抗性训练** 303 | * **0.0:** 使用不断发展的攻击技术进行持续的对抗性训练,随着新攻击的发现而纳入新攻击,并使用形式化验证方法来确保稳健性。 304 | * **0.1-0.3:** 使用更大的扰动预算,针对各种攻击(例如,PGD、C&W)进行稳健的对抗性训练,并使用多种技术(例如,集成对抗性训练、认证防御)。 305 | * **0.4-0.6:** 使用有限的攻击类型集(例如,FGSM)和小的扰动预算进行基本的对抗性训练。 306 | * **0.7-1.0:** 在模型开发期间没有使用对抗性训练。 307 | * **示例:** 308 | * **0.0:** 模型经过持续的对抗性训练,并针对特定的攻击模型正式验证了稳健性。 309 | * **0.2:** 模型使用不同的对抗性训练技术和攻击类型的组合进行训练。 310 | * **0.5:** 模型使用 FGSM 生成的对抗性示例进行训练。 311 | * **0.8:** 模型没有经过训练来抵抗任何对抗性示例。 312 | 313 | **AA (对抗性攻击面)** 314 | 315 | * **模型反演** 316 | * **0.0:** 模型可证明能够抵抗模型反演攻击,并对训练数据的隐私提供正式保证。 317 | * **0.1-0.3:** 对模型反演有强大的防御能力,例如差分隐私或数据净化技术,大大增加了重建训练数据的难度。 318 | * **0.4-0.6:** 采取了一些措施来减轻模型反演(例如,限制模型输出精度),但仍然存在重大风险。 319 | * **0.7-1.0:** 模型反演攻击的风险很高,可以从模型输出或梯度中轻松重建敏感的训练数据。 320 | * **示例:** 321 | * **0.0:** 模型被正式证明在特定攻击模型下能够抵抗模型反演。 322 | * **0.2:** 模型使用差分隐私进行训练,为模型反演提供了强大的保护。 323 | * **0.5:** 模型的输出被四舍五入或扰动以使反演更加困难,但仍可能泄漏一些信息。 324 | * **0.9:** 攻击者可以轻松地从模型的输出中重建人脸或其他敏感数据。 325 | 326 | * **模型提取** 327 | * **0.0:** 模型可证明能够抵抗模型提取,并对创建功能副本的难度提供正式保证。 328 | * **0.1-0.3:** 对模型提取有强大的防御能力,例如 API 查询异常检测、模型水印和与用户的法律协议,这使得窃取模型的难度和成本大大增加。 329 | * **0.4-0.6:** 采取了一些措施来减轻模型提取(例如,速率限制、水印),但坚定的攻击者仍然可以成功。 330 | * **0.7-1.0:** 模型提取的风险很高,攻击者可以通过查询其 API 轻松创建模型的功能副本。 331 | * **示例:** 332 | * **0.0:** 模型被设计为能够抵抗模型提取,并且其功能无法通过黑盒查询复制。 333 | * **0.2:** 模型使用水印和异常检测来检测和防止提取尝试。 334 | * **0.5:** API 访问受到速率限制,但攻击者仍然可以在较长时间内提取模型。 335 | * **0.8:** 攻击者可以通过发出大量 API 调用来创建模型的副本。 336 | 337 | * **成员推理** 338 | * **0.0:** 模型可证明能够抵抗成员推理攻击,并对单个训练数据点的隐私提供正式保证。 339 | * **0.1-0.3:** 对成员推理有强大的防御能力,例如差分隐私或模型堆叠,大大降低了攻击者推断成员资格的能力。 340 | * **0.4-0.6:** 采取了一些措施来减轻成员推理(例如,正则化、丢弃),但仍然存在重大风险。 341 | * **0.7-1.0:** 成员推理攻击的风险很高,攻击者可以轻松确定特定数据点是否用于模型的训练集中。 342 | * **示例:** 343 | * **0.0:** 模型被正式证明在特定攻击模型下能够抵抗成员推理。 344 | * **0.2:** 模型使用差分隐私进行训练,为成员推理提供了强大的保护。 345 | * **0.5:** 模型使用正则化技术,这可能会降低成员推理的风险,但没有正式保证。 346 | * **0.9:** 攻击者可以轻松确定特定个人的数据是否用于训练模型。 347 | 348 | **LL (生命周期漏洞)** 349 | 350 | * **开发** 351 | * **0.0:** 具有代码正式验证、严格访问控制和持续监控安全威胁的安全开发环境。 352 | * **0.1-0.3:** 遵循安全开发生命周期 (SDL) 实践,包括代码审查、静态分析和漏洞扫描,并对开发资源进行访问控制。 353 | * **0.4-0.6:** 在开发环境中采取了基本的安全措施(例如,开发人员工作站安装了防病毒软件),有一些安全编码准则,但没有正式的安全开发生命周期 (SDL)。 354 | * **0.7-1.0:** 不安全的开发环境,没有安全编码实践,对开发资源没有访问控制。 355 | * **示例:** 356 | * **0.0:** 开发环境是隔离的并持续监控,并使用形式化方法来验证关键代码组件的安全性。 357 | * **0.2:** 遵循 SDL 实践,包括代码审查、静态分析和漏洞扫描,并根据角色限制对代码存储库的访问。 358 | * **0.5:** 开发人员使用公司提供的具有基本安全软件的笔记本电脑,并且制定了一些安全编码准则。 359 | * **0.8:** 开发人员在没有安全控制的个人笔记本电脑上工作,并且代码存储在没有访问限制的公共存储库中。 360 | 361 | * **训练** 362 | * **0.0:** 具有训练过程正式验证、严格访问控制和持续监控入侵和异常的安全且隔离的训练环境。 363 | * **0.1-0.3:** 具有访问控制、静态和传输中数据加密以及定期安全审核的安全训练环境。 364 | * **0.4-0.6:** 在训练环境中采取了基本的安全措施(例如,训练数据存储在受密码保护的服务器上),但没有加密或严格的访问控制。 365 | * **0.7-1.0:** 不安全的训练环境,没有数据安全或访问控制,训练数据在不安全的系统上存储和处理。 366 | * **示例:** 367 | * **0.0:** 训练在具有严格访问控制、持续监控和训练过程正式验证的安全区域中执行。 368 | * **0.2:** 训练数据在静态和传输中均已加密,访问根据角色进行限制,并且定期审核训练环境的安全性。 369 | * **0.5:** 训练数据存储在受密码保护的服务器上,但访问没有严格控制。 370 | * **0.8:** 训练数据存储在没有任何加密或访问控制的公共云服务器上。 371 | 372 | * **部署** 373 | * **0.0:** 
具有持续监控、自动安全修补和部署过程正式验证的安全且隔离的部署环境。 374 | * **0.1-0.3:** 具有强身份验证和授权、定期安全更新和入侵检测系统的安全部署环境。 375 | * **0.4-0.6:** 在部署环境中采取了基本的安全措施(例如,模型部署在防火墙后面),但没有强身份验证或授权机制。 376 | * **0.7-1.0:** 不安全的部署环境,没有访问控制或安全监控,模型部署在没有任何保护的公开访问的服务器上。 377 | * **示例:** 378 | * **0.0:** 模型部署在具有严格访问控制、持续监控和自动安全修补的安全区域中。 379 | * **0.2:** 模型部署在具有强身份验证、授权和定期安全更新的安全云环境中。 380 | * **0.5:** 模型部署在防火墙后面,但 API 密钥在多个用户之间共享。 381 | * **0.8:** 模型部署在公共服务器上,访问其 API 不需要身份验证。 382 | 383 | * **运营** 384 | * **0.0:** 具有自动事件响应能力的持续安全监控、定期安全审核和专门的安全运营中心 (SOC)。 385 | * **0.1-0.3:** 使用 SIEM 系统进行全面的安全监控,针对可疑活动发出自动警报,以及定期测试的定义明确的事件响应计划。 386 | * **0.4-0.6:** 基本的安全监控(例如,手动查看日志),有限的事件响应能力,没有正式的事件响应计划。 387 | * **0.7-1.0:** 没有安全监控或事件响应计划,没有收集或分析系统日志。 388 | * **示例:** 389 | * **0.0:** 专门的 SOC 全天候监控系统,具有自动事件响应能力和定期安全审核。 390 | * **0.2:** 使用 SIEM 系统监控安全事件、生成警报并触发事件响应程序。 391 | * **0.5:** 每周收集并手动查看系统日志,并且有一个基本的事件响应计划。 392 | * **0.8:** 没有收集日志,也没有响应安全事件的流程。 393 | 394 | **GV (治理和验证)** 395 | 396 | * **合规性** 397 | * **0.0:** 系统超出法规要求并制定了合规性方面的行业最佳实践,并采取主动措施来适应新法规。 398 | * **0.1-0.3:** 完全遵守相关法规和行业标准,并设有专门的合规团队和定期审核。 399 | * **0.4-0.6:** 对法规有基本的了解,有一些临时的合规工作,但没有正式的合规计划。 400 | * **0.7-1.0:** 不了解或不遵守相关法规(例如,GDPR、CCPA、HIPAA)或行业标准。 401 | * **示例:** 402 | * **0.0:** 系统设计为默认合规,超出法规要求并制定行业最佳实践。 403 | * **0.2:** 系统完全符合所有适用的法规,并进行定期审核和设有专门的合规团队。 404 | * **0.5:** 为遵守法规做出了一些努力,但存在重大差距,并且没有正式的合规计划。 405 | * **0.8:** 系统在未经用户同意或适当保护的情况下收集和处理个人数据,违反了数据隐私法规。 406 | 407 | * **审计** 408 | * **0.0:** 由信誉良好的第三方进行定期独立审核,并对系统的安全性、公平性和道德绩效进行正式验证。 409 | * **0.1-0.3:** 进行定期的内部审核,涵盖人工智能系统生命周期的各个方面,并具有清晰的审核跟踪和文档。 410 | * **0.4-0.6:** 不经常或有限的审核(例如,仅审核代码是否存在安全漏洞),没有独立的验证。 411 | * **0.7-1.0:** 没有对人工智能系统的设计、开发、部署或运营进行审核。 412 | * **示例:** 413 | * **0.0:** 每年由信誉良好的第三方进行独立审核,并公开报告结果。 414 | * **0.2:** 进行定期的内部审核,涵盖安全性、公平性和绩效,并具有详细的审核跟踪。 415 | * **0.5:** 在部署之前审核代码是否存在安全漏洞,但不进行其他审核。 416 | * **0.8:** 没有维护审核日志,也没有执行审核。 417 | 418 | * **风险管理** 419 | * **0.0:** 主动和持续的人工智能风险管理,并设有专门的人工智能风险管理团队、定期风险评估,并高度重视预测和减轻新出现的人工智能风险。 420 | * **0.1-0.3:** 建立了全面的人工智能风险管理框架,并具有识别、评估、减轻和监控人工智能风险的具体流程,并完全集成到组织风险框架中。 421 | * **0.4-0.6:** 对人工智能系统进行基本的风险评估,缓解策略有限,人工智能风险部分集成到组织风险框架中。 422 | * **0.7-1.0:** 没有人工智能特定的风险管理流程,在整体组织风险框架中没有考虑人工智能风险。 423 | * **示例:** 424 | * **0.0:** 人工智能风险管理是一个持续的过程,与组织的整体风险管理和治理结构相结合。 425 | * **0.2:** 建立了全面的人工智能风险管理框架,并进行定期风险评估和制定缓解计划。 426 | * **0.5:** 对人工智能风险进行临时评估,缓解策略有限。 427 | * **0.8:** 在组织的风险管理流程中没有考虑人工智能风险。 428 | 429 | * **人工监督** 430 | * **0.0:** 人在环路系统具有明确定义的角色和职责,明确的人机协作程序,以及在系统运行的各个阶段进行人工监督的机制。 431 | * **0.1-0.3:** 在系统的决策过程中具有明确的人工审查和干预机制,并为人工操作员定义了明确的角色和职责。 432 | * **0.4-0.6:** 有限的人工监督,主要是被动的(例如,用户可以报告错误),没有明确的人工干预或否决机制。 433 | * **0.7-1.0:** 在人工智能系统的决策过程中没有人工监督或干预。 434 | * **示例:** 435 | * **0.0:** 系统设计用于人机协作,人类在决策过程中发挥核心作用。 436 | * **0.2:** 系统具有人工操作员在特定情况下审查和否决其决策的机制。 437 | * **0.5:** 用户可以报告错误,但没有人工干预系统决策的流程。 438 | * **0.8:** 系统在没有任何人工控制或监控的情况下自主运行。 439 | 440 | * **道德框架一致性** 441 | * **0.0:** 系统明显遵守并促进道德人工智能原则,并持续监控和审核道德绩效。 442 | * **0.1-0.3:** 系统设计和运营与已建立的道德框架(例如,经合组织人工智能原则、蒙特利尔负责任人工智能宣言)保持一致,并具有解决道德问题的机制。 443 | * **0.4-0.6:** 对道德准则有基本的认识,实施有限,没有正式的道德审查流程。 444 | * **0.7-1.0:** 在人工智能系统的设计、开发或部署中没有考虑道德框架或原则。 445 | * **示例:** 446 | * **0.0:** 定期评估系统的道德绩效,并且它积极促进道德人工智能原则。 447 | * **0.2:** 系统设计结合了相关道德框架的原则,并且有一个解决道德问题的流程。 448 | * **0.5:** 开发人员了解道德准则,但没有将它们正式集成到系统的设计中。 449 | * **0.8:** 系统在开发和部署时没有考虑道德影响。 450 | 451 | **CS (云安全联盟 LLM 分类法)** 452 | 453 | * **模型操纵** 454 | * **0.0:** 系统可证明能够抵抗模型操纵,并对提示注入和其他对抗性技术的稳健性进行正式验证。 455 | * **0.1-0.3:** 对模型操纵有强大的防御能力(例如,输入过滤、对抗性训练、输出验证),使得操纵模型行为变得非常困难。 456 | * **0.4-0.6:** 对操纵采取了一些防御措施(例如,基本的输入净化),但仍然存在漏洞,并且通过一些努力可以操纵模型。 457 | * **0.7-1.0:** 
非常容易受到模型操纵,包括提示注入和其他对抗性技术,没有或只有极少的防御措施。 458 | * **示例:** 459 | * **0.0:** 模型的抗提示注入能力经过正式验证。 460 | * **0.2:** 模型结合使用输入过滤、对抗性训练和输出验证来防御操纵。 461 | * **0.5:** 模型具有基本的输入净化功能,但仍然可以通过精心设计的提示进行操纵。 462 | * **0.8:** 模型很容易受到提示注入攻击。 463 | 464 | * **数据中毒** 465 | * **0.0:** 系统可证明能够抵抗数据中毒,并对训练数据的完整性和安全性提供正式保证。 466 | * **0.1-0.3:** 实施了强大的数据验证、异常检测和来源跟踪机制,使得成功毒化训练数据变得非常困难。 467 | * **0.4-0.6:** 采取了一些措施来减轻数据中毒(例如,异常值检测),但仍然存在风险,并且有针对性的中毒攻击仍然可能成功。 468 | * **0.7-1.0:** 数据中毒的风险很高,没有或只有极少的措施来确保训练数据的完整性和安全性。 469 | * **示例:** 470 | * **0.0:** 训练数据存储在具有加密完整性验证的不可变账本上。 471 | * **0.2:** 使用强大的数据验证、异常检测和来源跟踪机制来防止和检测数据中毒。 472 | * **0.5:** 使用基本的异常值检测,但复杂的中毒攻击仍然可能成功。 473 | * **0.8:** 训练数据很容易被篡改,并且没有检测中毒的机制。 474 | 475 | * **敏感数据泄露** 476 | * **0.0:** 系统可证明能够防止敏感数据泄露,并对敏感信息的隐私提供正式保证。 477 | * **0.1-0.3:** 实施了强大的访问控制、加密和输出净化机制,使得从系统中提取敏感数据变得非常困难。 478 | * **0.4-0.6:** 采取了一些措施来防止数据泄漏(例如,输出过滤),但仍然存在漏洞,并且在某些情况下可能会泄露敏感信息。 479 | * **0.7-1.0:** 敏感数据泄露的风险很高,没有或只有极少的措施来保护系统处理或存储的敏感信息。 480 | * **示例:** 481 | * **0.0:** 系统使用同态加密或其他隐私保护技术来防止任何敏感数据泄露。 482 | * **0.2:** 使用强大的访问控制、加密和输出净化来防止数据泄漏。 483 | * **0.5:** 模型输出经过过滤以删除潜在的敏感信息,但仍可能发生一些泄漏。 484 | * **0.8:** 模型可能会在其输出中泄露敏感信息,并且没有防止数据泄露的保护措施。 485 | 486 | * **模型窃取** 487 | * **0.0:** 模型可证明能够抵抗模型窃取,并对创建功能副本的难度提供正式保证。 488 | * **0.1-0.3:** 对模型窃取有强大的防御能力(例如,API 查询异常检测、模型水印、法律协议),这使得窃取模型的难度和成本大大增加。 489 | * **0.4-0.6:** 采取了一些措施来减轻模型窃取(例如,速率限制),但坚定的攻击者仍然可以成功。 490 | * **0.7-1.0:** 模型窃取的风险很高,攻击者可以通过查询其 API 轻松创建模型的功能副本。 491 | * **示例:** 492 | * **0.0:** 模型被设计为能够抵抗模型提取,并且其功能无法通过黑盒查询复制。 493 | * **0.2:** 模型结合使用水印、异常检测和法律协议来阻止和检测模型窃取。 494 | * **0.5:** API 访问受到速率限制,但攻击者仍然可以在较长时间内提取模型。 495 | * **0.8:** 攻击者可以通过发出大量 API 调用来创建模型的副本。 496 | 497 | * **故障/失灵** 498 | * **0.0:** 系统设计为具有高可用性和容错能力,并对其可靠性进行正式验证。 499 | * **0.1-0.3:** 实施了强大的错误处理、监控和冗余机制,大大降低了故障或失灵的风险。 500 | * **0.4-0.6:** 采取了一些措施来确保可靠性(例如,基本的错误处理),但仍然存在风险,并且系统可能会出现停机或在某些条件下产生不正确的输出。 501 | * **0.7-1.0:** 故障或失灵的风险很高,没有或只有极少的措施来确保系统的可靠性。 502 | * **示例:** 503 | * **0.0:** 系统设计为具有多层冗余和故障转移机制,并且其可靠性经过正式验证。 504 | * **0.2:** 系统具有强大的错误处理、监控和自愈能力。 505 | * **0.5:** 系统具有基本的错误处理和日志记录功能,但可能会因意外错误而停机。 506 | * **0.8:** 系统容易崩溃或出错,并且没有确保其持续运行的机制。 507 | 508 | * **不安全的供应链** 509 | * **0.0:** 安全且可审核的供应链,并对所有第三方组件和依赖项进行正式验证。 510 | * **0.1-0.3:** 实施了强大的供应链安全实践(例如,代码签名、依赖项验证、定期审核),最大限度地降低了供应链攻击的风险。 511 | * **0.4-0.6:** 采取了一些措施来减轻供应链风险(例如,使用可信来源),但第三方组件中仍可能存在漏洞。 512 | * **0.7-1.0:** 供应链漏洞的风险很高,没有或只有极少的措施来确保第三方组件和依赖项的安全性。 513 | * **示例:** 514 | * **0.0:** 所有第三方组件都经过正式的安全验证,并且持续监控供应链是否存在漏洞。 515 | * **0.2:** 在整个供应链中遵循强大的安全实践,包括代码签名、依赖项验证和定期审核。 516 | * **0.5:** 使用来自信誉良好的来源的第三方库,但它们没有经过彻底的安全漏洞审查。 517 | * **0.8:** 系统依赖具有已知漏洞的过时或未修补的第三方组件。 518 | 519 | * **不安全的应用程序/插件** 520 | * **0.0:** 对应用程序/插件强制执行安全的开发和集成实践,并对其安全性进行正式验证。 521 | * **0.1-0.3:** 针对应用程序/插件制定了强大的安全准则和审查流程,最大限度地降低了第三方集成引入漏洞的风险。 522 | * **0.4-0.6:** 对应用程序/插件采取了一些安全措施(例如,沙盒),但仍然存在风险,并且可能通过不安全的集成引入漏洞。 523 | * **0.7-1.0:** 不安全的应用程序/插件引入漏洞的风险很高,没有或只有极少的措施来确保第三方集成的安全性。 524 | * **示例:** 525 | * **0.0:** 所有应用程序/插件都经过严格的安全审查,并在允许与系统集成之前进行正式验证。 526 | * **0.2:** 针对应用程序/插件开发制定了强大的安全准则,并使用审查流程来最大限度地降低风险。 527 | * **0.5:** 应用程序/插件是沙盒化的,但它们仍然可以访问敏感数据或功能。 528 | * **0.8:** 可以轻松安装应用程序/插件,而无需任何安全检查,这可能会将漏洞引入系统。 529 | 530 | * **拒绝服务 (DoS)** 531 | * **0.0:** 系统可证明能够抵抗 DoS 攻击,并对其在高负载或恶意流量下的可用性提供正式保证。 532 | * **0.1-0.3:** 对 DoS 攻击有强大的防御能力(例如,流量过滤、速率限制、自动扩展),使得中断系统的可用性变得非常困难。 533 | * **0.4-0.6:** 采取了一些措施来减轻 DoS 攻击(例如,基本的速率限制),但系统仍然可能容易受到复杂攻击。 534 | * **0.7-1.0:** 非常容易受到 DoS 攻击,没有或只有极少的措施来保护系统的可用性。 535 | * **示例:** 536 | * **0.0:** 系统设计为能够承受大量流量峰值,并且其抗 DoS 攻击能力经过正式验证。 537 | * **0.2:** 系统结合使用流量过滤、速率限制和自动扩展来减轻 DoS 攻击。 538 | * **0.5:** 系统具有基本的速率限制,但仍然可能被大量请求淹没。 539 
| * **0.8:** 通过发送大量请求或恶意流量可以轻松地使系统不可用。 540 | 541 | * **失去治理/合规性** 542 | * **0.0:** 系统达到或超过所有相关的法规和治理要求,并采取主动措施来适应新法规,并高度重视保持合规性。 543 | * **0.1-0.3:** 实施了强大的合规框架和控制措施,确保遵守相关的法规和治理政策。 544 | * **0.4-0.6:** 做出了一些合规努力,但仍然存在差距,并且系统可能无法完全满足所有法规或治理要求。 545 | * **0.7-1.0:** 不遵守法规或治理政策的风险很高,没有或只有极少的措施来确保遵守。 546 | * **示例:** 547 | * **0.0:** 系统设计为默认合规,并具有自动机制来确保遵守法规和政策。 548 | * **0.2:** 定期审核系统的合规性,并由专门的团队确保满足所有要求。 549 | * **0.5:** 为遵守法规做出了一些努力,但存在重大差距,并且没有正式的合规计划。 550 | * **0.8:** 系统不符合数据隐私法规,并且没有确保遵守内部政策的机制。 551 | 552 | ## 5. 评分方法 ## 553 | 554 | **基本公式** 555 | 556 | ``` 557 | AIVSS_Score = [ 558 | (w₁ × ModifiedBaseScore) + 559 | (w₂ × AISpecificMetrics) + 560 | (w₃ × ImpactMetrics) 561 | ] × TemporalMetrics × MitigationMultiplier 562 | 563 | 其中:0 ≤ AIVSS_Score ≤ 10 564 | ``` 565 | 566 | * **w₁, w₂, w₃:** 分配给每个组件(修正的基本指标、人工智能特定指标、影响指标)的权重。建议的起始点:w₁ = 0.3,w₂ = 0.5,w₃ = 0.2(更加重视人工智能特定风险)。根据特定的人工智能系统及其风险状况进行调整。 567 | * **时间指标:** 根据可利用性、补救级别和报告可信度进行调整(类似于 CVSS 时间分数)。 568 | * **可利用性 (E):** 569 | * 未定义 (ND):1.0 570 | * 未经验证 (U):0.9 571 | * 概念验证 (P):0.95 572 | * 功能性 (F):1.0 573 | * 高 (H):1.0 574 | * **补救级别 (RL):** 575 | * 未定义 (ND):1.0 576 | * 官方修复 (O):0.95 577 | * 临时修复 (T):0.96 578 | * 变通方法 (W):0.97 579 | * 不可用 (U):1.0 580 | * **报告可信度 (RC):** 581 | * 未定义 (ND):1.0 582 | * 未知 (U):0.92 583 | * 合理 (R):0.96 584 | * 已确认 (C):1.0 585 | * **缓解乘数:** 根据缺乏有效缓解措施而增加分数的因子(范围从 1.0 到 1.5)。1.0 = 强有力的缓解措施;1.5 = 没有/薄弱的缓解措施。 586 | 587 | ## 6. 组件计算 ## 588 | 589 | **1. 修正的基本指标** 590 | 591 | ``` 592 | ModifiedBaseScore = min(10, [MAV × MAC × MPR × MUI × MS] × ScopeMultiplier) 593 | ``` 594 | 595 | 其中修正的基本指标 (MAV, MAC, MPR, MUI, MS) 源自基本指标,并根据特定环境和使用环境指标进行调整。每个修正的基本指标都可以按照与基本指标相同的方式进行评级,并增加: 596 | 597 | * 未定义 (X):使用未修改的基本指标值。 598 | 599 | **2. 人工智能特定指标** 600 | 601 | ``` 602 | AISpecificMetrics = [MR × DS × EI × DC × AD × AA × LL × GV × CS] × ModelComplexityMultiplier 603 | ``` 604 | 605 | * 每个指标 (MR, DS, EI, DC, AD, AA, LL, GV, CS) 根据每个子类别中漏洞的严重程度,使用上面提供的详细评分细则从 0.0 到 1.0 进行评分(分数越高 = 问题越严重)。 606 | * **模型复杂度乘数:** 用于解释更高级模型的攻击面和复杂性增加的因子(1.0 到 1.5)。 607 | 608 | **3. 影响指标** 609 | 610 | ``` 611 | ImpactMetrics = (C + I + A + SI) / 4 612 | ``` 613 | 614 | * **C (机密性影响):** 对数据机密性的影响。 615 | * **I (完整性影响):** 对数据和系统完整性的影响。 616 | * **A (可用性影响):** 对系统可用性的影响。 617 | * **SI (社会影响):** 更广泛的社会危害(例如,歧视、操纵)。由 EI(伦理影响)子类别提供信息。 618 | 619 | **严重性级别(对于 C、I、A、SI):** 620 | 621 | * 无:0.0 622 | * 低:0.22 623 | * 中:0.55 624 | * 高:0.85 625 | * 严重:1.0 626 | 627 | **4. 环境分数** 628 | 629 | 环境分数是通过使用环境指标修改基本分数来计算的。该公式综合考虑了以下因素: 630 | 631 | ``` 632 | EnvironmentalScore = [(ModifiedBaseScore + (Environmental Component)) × TemporalMetrics] × (1 + EnvironmentalMultiplier) 633 | ``` 634 | 635 | 环境组件源自人工智能特定指标,并根据环境上下文进行调整: 636 | 637 | ``` 638 | EnvironmentalComponent = [CR × IR × AR × SIR] × AISpecificMetrics 639 | ``` 640 | 641 | 其中: 642 | 643 | * CR、IR、AR、SIR 分别是机密性、完整性、可用性和社会影响要求。 644 | * 环境乘数根据 CR、IR、AR、SIR 未涵盖的特定环境因素调整分数。 645 | 646 | **风险类别** 647 | 648 | ``` 649 | 严重:9.0 - 10.0 650 | 高: 7.0 - 8.9 651 | 中: 4.0 - 6.9 652 | 低: 0.1 - 3.9 653 | 无: 0.0 654 | ``` 655 | 656 | ## 7. 
实施指南 ## 657 | 658 | **先决条件** 659 | 660 | * 访问人工智能系统架构详细信息 661 | * 安全评估工具 662 | * 了解机器学习/人工智能概念和正在评估的特定人工智能模型。 663 | * 具备道德人工智能原则和潜在社会影响方面的专业知识。 664 | * 熟悉云安全原则,特别是如果人工智能系统是基于云的或 LLM,则熟悉 CSA LLM 威胁分类法。 665 | * 具有漏洞分析经验,特别是在人工智能/机器学习系统的背景下。 666 | 667 | **角色和责任:** 668 | 669 | * **人工智能安全团队/专家:** 领导 AIVSS 评估,与其他团队协调,确保准确性和完整性。 670 | * **人工智能开发人员/数据科学家:** 提供技术细节,协助识别漏洞,实施缓解措施。 671 | * **安全工程师:** 评估基本指标,评估开发、训练和部署环境的安全性,为整体评估做出贡献。 672 | * **合规/风险官:** 确保与法规和组织风险管理框架保持一致。 673 | * **道德人工智能官/审查委员会:** 评估道德影响并提供减轻道德风险的指导。 674 | 675 | ## 8. AIVSS 评估检查表 ## 676 | 677 | 此检查表为组织进行 AIVSS 评估提供了一个简化且可操作的指南。 678 | 679 | **阶段 1:系统和环境定义** 680 | 681 | * [ ] **1.1** 确定要评估的人工智能系统,包括其组件、数据流和依赖项。 682 | * [ ] **1.2** 定义系统的运行环境,包括其部署模型(云、本地、混合)、网络配置和用户群。 683 | * [ ] **1.3** 根据系统的具体情况确定环境指标 (CR, IR, AR, SIR)。 684 | * [ ] **1.4** 根据环境因素记录修正的基本指标 (MAV, MAC, MPR, MUI, MS)。 685 | 686 | **阶段 2:基本指标和人工智能特定指标评估** 687 | 688 | * [ ] **2.1** 根据已识别的漏洞评估基本指标 (AV, AC, PR, UI, S)。 689 | * [ ] **2.2** 使用详细的评分细则评估每个人工智能特定指标 (MR, DS, EI, DC, AD, AA, LL, GV, CS): 690 | * [ ] **2.2.1** 模型稳健性 (MR):规避抵抗、梯度掩蔽、稳健性认证。 691 | * [ ] **2.2.2** 数据敏感性 (DS):数据机密性、数据完整性、数据来源。 692 | * [ ] **2.2.3** 伦理影响 (EI):偏见和歧视、透明度和可解释性、问责制、社会影响。 693 | * [ ] **2.2.4** 决策关键性 (DC):安全关键、财务影响、声誉损害、运营中断。 694 | * [ ] **2.2.5** 适应性 (AD):持续监控、再训练能力、威胁情报集成、对抗性训练。 695 | * [ ] **2.2.6** 对抗性攻击面 (AA):模型反演、模型提取、成员推理。 696 | * [ ] **2.2.7** 生命周期漏洞 (LL):开发、训练、部署、运营。 697 | * [ ] **2.2.8** 治理和验证 (GV):合规性、审计、风险管理、人工监督、道德框架一致性。 698 | * [ ] **2.2.9** 云安全联盟 LLM 分类法 (CS):模型操纵、数据中毒、敏感数据泄露、模型窃取、故障/失灵、不安全的供应链、不安全的应用程序/插件、拒绝服务 (DoS)、失去治理/合规性。 699 | * [ ] **2.3** 根据评估的人工智能模型确定模型复杂度乘数。 700 | 701 | **阶段 3:影响和时间评估** 702 | 703 | * [ ] **3.1** 根据漏洞的潜在后果评估影响指标 (C, I, A, SI)。 704 | * [ ] **3.2** 根据当前的可利用性、可用的补救措施和报告可信度评估时间指标 (E, RL, RC)。 705 | 706 | **阶段 4:缓解和评分** 707 | 708 | * [ ] **4.1** 评估现有缓解措施的有效性并确定缓解乘数。 709 | * [ ] **4.2** 计算修正的基本分数。 710 | * [ ] **4.3** 计算人工智能特定指标分数。 711 | * [ ] **4.4** 计算影响指标分数。 712 | * [ ] **4.5** 计算环境组件。 713 | * [ ] **4.6** 计算环境分数。 714 | * [ ] **4.7** 使用公式生成最终的 AIVSS 分数。 715 | 716 | **阶段 5:报告和补救** 717 | 718 | * [ ] **5.1** 在综合报告中记录评估结果,包括 AIVSS 分数、详细的指标分数、理由和支持证据。 719 | * [ ] **5.2** 将评估结果传达给相关的利益相关者(技术团队、管理层、董事会)。 720 | * [ ] **5.3** 根据 AIVSS 分数和已识别的漏洞制定和优先考虑补救建议。 721 | * [ ] **5.4** 实施建议的缓解措施并跟踪进度。 722 | * [ ] **5.5** 在实施缓解措施后重新评估人工智能系统,以验证其有效性并更新 AIVSS 分数。 723 | 724 | ## 9. 示例评估: ## 725 | 726 | ```python 727 | # 示例漏洞评估(说明性和简化) 728 | vulnerability = { 729 | 'attack_vector': 'Network', # 0.85 730 | 'attack_complexity': 'High', # 0.44 731 | 'privileges_required': 'Low', # 0.62 732 | 'user_interaction': 'None', # 0.85 733 | 'scope': 'Unchanged', # 1.0 734 | 'model_robustness': { 735 | 'evasion_resistance': 0.7, # 对规避的高度敏感性 736 | 'gradient_masking': 0.8, # 梯度易于访问 737 | }, 738 | 'data_sensitivity': { 739 | 'data_confidentiality': 0.9, # 具有极少保护的敏感数据 740 | 'data_integrity': 0.7 # 没有数据完整性检查 741 | }, 742 | 'ethical_impact': { 743 | 'bias_discrimination': 0.8, # 歧视性结果的高风险 744 | 'transparency_explainability': 0.7, # 黑盒系统 745 | }, 746 | 'cloud_security': { 747 | 'model_manipulation': 0.8, # 容易受到提示注入的攻击 748 | 'data_poisoning': 0.6, # 数据中毒的一些风险 749 | 'sensitive_data_disclosure': 0.7, # 敏感数据泄露的风险 750 | 'model_stealing': 0.5, # 一些模型窃取缓解措施,但仍然存在风险 751 | 'failure_malfunctioning': 0.7, # 故障风险 752 | 'insecure_supply_chain': 0.6, # 一些供应链风险 753 | 'insecure_apps_plugins': 0.4, # 一些应用程序/插件安全性,但仍然存在风险 754 | 'denial_of_service': 0.8, # 容易受到 DoS 攻击 755 | 'loss_of_governance_compliance': 0.7 # 不合规的风险 756 | }, 757 | # ... 
(具有子类别的其他人工智能特定指标) 758 | 'confidentiality_impact': 'High', # 0.85 759 | 'integrity_impact': 'Medium', # 0.55 760 | 'availability_impact': 'Low', # 0.22 761 | 'societal_impact': 'Medium', # 0.55 762 | 'temporal_metrics': { 763 | 'exploitability': 'Proof-of-Concept', # 0.95 764 | 'remediation_level': 'Temporary Fix', # 0.96 765 | 'report_confidence': 'Confirmed' # 1.0 766 | }, 767 | 'mitigation_multiplier': 1.4, # 示例:薄弱的缓解措施 768 | 'model_complexity_multiplier': 1.4, # 示例:复杂模型(例如,大型语言模型) 769 | 'environmental_metrics': { 770 | 'cr': 'High', # 1.5 771 | 'ir': 'Medium', # 1.0 772 | 'ar': 'Low', # 0.5 773 | 'sir': 'Medium', # 1.0 774 | }, 775 | 776 | } 777 | 778 | # 根据环境调整基本指标 779 | vulnerability['modified_attack_vector'] = vulnerability['attack_vector'] # 无变化 780 | vulnerability['modified_attack_complexity'] = vulnerability['attack_complexity'] * 0.5 # 示例:在此环境中较低的复杂性 781 | vulnerability['modified_privileges_required'] = vulnerability['privileges_required'] # 无变化 782 | vulnerability['modified_user_interaction'] = vulnerability['user_interaction'] # 无变化 783 | vulnerability['modified_scope'] = vulnerability['scope'] # 无变化 784 | 785 | # 计算分数(简化和说明性) 786 | 787 | # 修正的基本指标 788 | modified_base_score = min(10, vulnerability['modified_attack_vector'] * vulnerability['modified_attack_complexity'] * vulnerability['modified_privileges_required'] * vulnerability['modified_user_interaction'] * vulnerability['modified_scope']) # = 0.098 789 | 790 | # 人工智能特定指标 - 示例计算(为简单起见使用平均值): 791 | mr_score = (vulnerability['model_robustness']['evasion_resistance'] + 792 | vulnerability['model_robustness']['gradient_masking']) / 2 # = 0.75 793 | ds_score = (vulnerability['data_sensitivity']['data_confidentiality'] + 794 | vulnerability['data_sensitivity']['data_integrity']) / 2 # = 0.8 795 | ei_score = (vulnerability['ethical_impact']['bias_discrimination'] + 796 | vulnerability['ethical_impact']['transparency_explainability']) / 2 # = 0.75 797 | 798 | # 云安全 (CS) - 使用详细的细则和平均值: 799 | cs_score = (vulnerability['cloud_security']['model_manipulation'] + 800 | vulnerability['cloud_security']['data_poisoning'] + 801 | vulnerability['cloud_security']['sensitive_data_disclosure'] + 802 | vulnerability['cloud_security']['model_stealing'] + 803 | vulnerability['cloud_security']['failure_malfunctioning'] + 804 | vulnerability['cloud_security']['insecure_supply_chain'] + 805 | vulnerability['cloud_security']['insecure_apps_plugins'] + 806 | vulnerability['cloud_security']['denial_of_service'] + 807 | vulnerability['cloud_security']['loss_of_governance_compliance']) / 9 # = 0.64 808 | 809 | # 假设计算了其他人工智能特定指标,我们有这些分数(说明性): 810 | dc_score = 0.7 811 | ad_score = 0.55 812 | aa_score = 0.6 813 | ll_score = 0.75 814 | gv_score = 0.8 815 | 816 | # 计算整体人工智能特定指标分数: 817 | ais_score = (mr_score * ds_score * ei_score * dc_score * ad_score * aa_score * ll_score * gv_score * cs_score) * vulnerability['model_complexity_multiplier'] # = 0.095 818 | 819 | # 影响指标 820 | impact_score = (vulnerability['confidentiality_impact'] + vulnerability['integrity_impact'] + vulnerability['availability_impact'] + vulnerability['societal_impact']) / 4 # = 0.543 821 | 822 | # 时间指标 - 为简单起见使用平均值: 823 | temporal_score = (vulnerability['temporal_metrics']['exploitability'] + 824 | vulnerability['temporal_metrics']['remediation_level'] + 825 | vulnerability['temporal_metrics']['report_confidence']) / 3 # = 0.97 826 | 827 | # 环境组件 828 | environmental_component = (vulnerability['environmental_metrics']['cr'] * vulnerability['environmental_metrics']['ir'] * 
vulnerability['environmental_metrics']['ar'] * vulnerability['environmental_metrics']['sir']) * ais_score # = 0.072 829 | # 环境分数 830 | environmental_score = min(10, ((modified_base_score + environmental_component) * temporal_score) * vulnerability['mitigation_multiplier']) # = 0.232 831 | 832 | # 最终的 AIVSS 分数 833 | final_score = ((0.3 * modified_base_score) + (0.5 * ais_score) + (0.2 * impact_score)) * temporal_score * vulnerability['mitigation_multiplier'] # = 0.323 834 | ``` 835 | 836 | **示例解读:** 837 | 838 | 在此示例中,最终的 AIVSS 分数约为 **0.323**,属于**低**风险类别。考虑了环境分数,根据具体的环境要求稍微修改了基本分数。 839 | 840 | **重要注意事项:** 841 | 842 | * **说明性:** 这只是一个简化的示例。实际评估将涉及对每个子类别的更全面评估。 843 | * **加权:** 权重 (w₁, w₂, w₃) 会显著影响最终分数。组织应根据其特定的风险状况仔细考虑适当的权重。 844 | * **上下文:** 应始终在特定人工智能系统、其预期用途和安全事件潜在后果的背景下解释 AIVSS 分数。 845 | 846 | ## 10. 报告和沟通: ## 847 | 848 | * **评估报告:** 应生成一份综合报告,包括: 849 | * 被评估人工智能系统的摘要。 850 | * 最终的 AIVSS 分数和风险类别。 851 | * 每个指标和子类别的详细分数。 852 | * 分配分数的理由,参考评分细则和收集的证据。 853 | * 关键漏洞及其潜在影响的分析。 854 | * 根据漏洞的严重程度优先考虑的缓解建议。 855 | * 附录,包含支持性文档(例如,威胁模型、评估数据)。 856 | * **沟通:** 应将评估结果传达给相关的利益相关者,包括: 857 | * **技术团队:** 为补救工作提供信息。 858 | * **管理层:** 支持风险管理决策。 859 | * **董事会:** 提供组织人工智能安全状况的概述。 860 | 861 | ## 11. 与风险管理框架集成: ## 862 | 863 | * **映射:** 可以将 AIVSS 指标映射到组织风险管理框架(例如,NIST 网络安全框架、ISO 27001)中的现有风险类别。 864 | * **风险评估:** 可以将 AIVSS 评估纳入更广泛的风险评估中。 865 | * **审核:** 可以将 AIVSS 用作审核人工智能系统的框架。 866 | 867 | ## 12. 附录:人工智能威胁分类法 ## 868 | 869 | | 分类法 | 描述 | 链接 | 870 | | :--------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------- | 871 | | **MITRE ATLAS** | 基于真实世界观察的对手策略和技术知识库,专门针对机器学习系统的威胁。它提供了一个理解对抗性机器学习生命周期的框架,并包括攻击案例研究。 | [https://atlas.mitre.org/](https://atlas.mitre.org/) | 872 | | **NIST AI 100-2 E2023** | 对抗性机器学习的分类法,包括攻击、防御和后果。它提供了一个详细的框架,用于理解和分类人工智能系统的威胁,并提供风险管理指导。 | [https://csrc.nist.gov/pubs/ai/100/2/e2023/final](https://csrc.nist.gov/pubs/ai/100/2/e2023/final) | 873 | | **欧盟 HLEG 可信人工智能** | 由欧盟委员会人工智能高级专家组制定的可信人工智能道德准则。它侧重于以人为本的人工智能原则,包括公平性、透明度、问责制和社会福祉。 | [https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai](https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai) | 874 | | **ISO/IEC JTC 1/SC 42** | 一个制定人工智能国际标准的国际标准机构。它涵盖人工智能的各个方面,包括风险管理、可信度、偏见和治理。 | [https://www.iso.org/committee/6794475.html](https://www.iso.org/committee/6794475.html) | 875 | | **AI Incident Database** | 一个真实世界人工智能系统事件的数据库,包括故障、事故和恶意攻击。它为理解与人工智能相关的风险和制定风险管理策略提供了有价值的数据。 | [https://incidentdatabase.ai/](https://incidentdatabase.ai/) | 876 | | **DARPA's GARD** | 保证人工智能抵抗欺骗 (GARD) 计划旨在开发针对人工智能系统对抗性攻击的防御措施。它侧重于开发能够抵御欺骗或操纵企图的强大人工智能。 | [https://www.darpa.mil/program/guaranteeing-ai-robustness-against-deception](https://www.darpa.mil/program/guaranteeing-ai-robustness-against-deception) | 877 | | **OECD AI Principles** | 经济合作与发展组织 (OECD) 采纳的负责任管理可信人工智能的原则。它们涵盖了包容性增长、以人为本的价值观、透明度、稳健性和问责制等方面。 | [https://oecd.ai/en/ai-principles](https://oecd.ai/en/ai-principles) | 878 | | **MITRE Atlas Matrix** | 对抗性机器学习威胁矩阵是一个框架,用于捕获攻击者攻击机器学习系统所使用的策略、技术和程序。它的结构类似于 ATT&CK 框架,但专门针对机器学习领域。 | [https://atlas.mitre.org/](https://atlas.mitre.org/) | 879 | | 
**CSA LLM Threat Taxonomy** | 定义了与云中大型语言模型相关的常见威胁。关键类别包括模型操纵、数据中毒、敏感数据泄露、模型窃取以及特定于基于云的 LLM 部署的其他威胁。 | [https://cloudsecurityalliance.org/artifacts/csa-large-language-model-llm-threats-taxonomy](https://cloudsecurityalliance.org/artifacts/csa-large-language-model-llm-threats-taxonomy) | 880 | | **MIT AI Threat Taxonomy** | 全面分类人工智能的攻击面、对抗性技术和治理漏洞。它详细介绍了各种类型的攻击,并提供了缓解策略。 | [https://arxiv.org/pdf/2408.12622](https://arxiv.org/pdf/2408.12622) | 881 | | **OWASP Top 10 for LLMs** | 强调了大型语言模型应用程序最关键的安全风险。它涵盖了诸如提示注入、数据泄漏、不安全的输出处理和模型拒绝服务等漏洞。 | [https://owasp.org/www-project-top-10-for-large-language-model-applications/](https://owasp.org/www-project-top-10-for-large-language-model-applications/) | 882 | 883 | **注意:** 此表格旨在作为入门,可能并不详尽。随着人工智能安全领域的不断发展,可能会出现新的分类法和框架。 884 | 885 | ## 13. 持续改进 ## 886 | 887 | 此 AIVSS 框架应被视为一份活文件。这是 0.1 版,它将被修订和更新。鼓励组织提供反馈,为其发展做出贡献,并根据其特定需求进行调整。将定期发布更新,以纳入新的研究、威胁情报和最佳实践。 888 | 889 | ## 14. 结论 ## 890 | 891 | AIVSS 框架提供了一种结构化和全面的方法来评估和量化人工智能系统的安全风险。通过使用此框架以及提供的检查表,组织可以更好地了解其人工智能特定的漏洞,优先考虑补救工作,并改善其整体人工智能安全状况。详细的评分细则、包含相关的人工智能威胁分类法、增加环境分数以及关注实际实施,使 AIVSS 成为保护人工智能未来的宝贵工具。持续改进、社区参与以及适应不断变化的威胁形势,对于 AIVSS 的长期成功和采用至关重要。 892 | 893 | **免责声明:** 894 | 895 | AIVSS 是一个评估和评分人工智能系统安全风险的框架。它不能保证安全,也不应作为做出安全决策的唯一依据。组织应将 AIVSS 与其他安全最佳实践结合使用,并在评估和减轻人工智能安全风险时始终运用自己的判断。 896 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | # Artificial Intelligence Vulnerability Scoring System (AIVSS) # 3 | 4 | ## 1. Introduction ## 5 | 6 | The Artificial Intelligence Vulnerability Scoring System (AIVSS) provides a standardized, comprehensive framework for evaluating and quantifying the security risks associated with AI systems. This framework focuses particularly on Large Language Models (LLMs) and cloud-based deployments, while remaining applicable to a wide range of AI systems. AIVSS adapts traditional security vulnerability scoring concepts to the unique characteristics and challenges of AI, drawing insights from leading AI threat taxonomies and security standards. This document outlines the AIVSS framework, including detailed scoring rubrics, an implementation checklist, and considerations for environmental factors. 7 | 8 | ## 2. The Need for AIVSS ## 9 | 10 | Traditional vulnerability scoring systems, such as the Common Vulnerability Scoring System (CVSS), are insufficient for addressing the unique security challenges posed by AI systems. These challenges include: 11 | 12 | * **Adversarial Attacks:** AI systems are vulnerable to adversarial attacks that manipulate model behavior through crafted inputs, a threat not adequately captured by traditional systems. 13 | * **Model Degradation:** AI models can degrade over time due to concept drift or data poisoning, impacting their accuracy and reliability. 14 | * **Lifecycle Vulnerabilities:** AI systems have complex lifecycles, from data collection and training to deployment and maintenance, each stage introducing potential vulnerabilities. 15 | * **Ethical and Societal Impacts:** AI systems can have significant ethical and societal implications, such as bias and discrimination, which are not considered in traditional security assessments. 16 | * **Dynamic Nature of AI:** AI systems are often dynamic and adaptive, making static scoring methods less effective. 17 | 18 | AIVSS addresses these challenges by providing a comprehensive framework tailored to the specific security risks of AI. 19 | 20 | ## 3. 
Framework Components ## 21 | 22 | AIVSS consists of the following key components: 23 | 24 | ### 3.1. Base Metrics ### 25 | 26 | Base Metrics capture the fundamental characteristics of a vulnerability that are constant over time and across different environments. 27 | 28 | * **Attack Vector (AV):** Reflects the context by which vulnerability exploitation is possible. 29 | * Network (N): 0.85 30 | * Adjacent Network (A): 0.62 31 | * Local (L): 0.55 32 | * Physical (P): 0.2 33 | * **Attack Complexity (AC):** Measures the conditions beyond the attacker's control that must exist to exploit the vulnerability. 34 | * Low (L): 0.77 35 | * High (H): 0.44 36 | * **Privileges Required (PR):** Describes the level of privileges an attacker must possess before successfully exploiting the vulnerability. 37 | * None (N): 0.85 38 | * Low (L): 0.62 39 | * High (H): 0.27 40 | * **User Interaction (UI):** Captures the requirement for a user, other than the attacker, to participate in the successful compromise of the vulnerable component. 41 | * None (N): 0.85 42 | * Required (R): 0.62 43 | * **Scope (S):** Measures whether a vulnerability in one vulnerable component impacts resources in components beyond its security scope. 44 | * Unchanged (U): 1.0 45 | * Changed (C): 1.5 46 | 47 | ### 3.2. AI-Specific Metrics ### 48 | 49 | AI-Specific Metrics capture the unique vulnerabilities and risks associated with AI systems. These metrics are evaluated based on a detailed scoring rubric (provided in Section 4). 50 | 51 | ``` 52 | AISpecificMetrics = [MR × DS × EI × DC × AD × AA × LL × GV × CS] × ModelComplexityMultiplier 53 | ``` 54 | 55 | * **MR (Model Robustness):** Assesses the system's resilience to adversarial attacks and model degradation. 56 | * **DS (Data Sensitivity):** Evaluates the risks associated with the confidentiality, integrity, and provenance of the data used by the AI system. 57 | * **EI (Ethical Implications):** Considers potential biases, transparency issues, accountability concerns, and societal impacts. 58 | * **DC (Decision Criticality):** Measures the potential consequences of incorrect or malicious decisions made by the AI system. 59 | * **AD (Adaptability):** Assesses the system's ability to adapt to evolving threats and maintain security over time. 60 | * **AA (Adversarial Attack Surface):** Evaluates the system's exposure to various adversarial attack techniques. 61 | * **LL (Lifecycle Vulnerabilities):** Considers security risks at different stages of the AI system's lifecycle. 62 | * **GV (Governance and Validation):** Assesses the presence and effectiveness of governance mechanisms and validation processes. 63 | * **CS (Cloud Security Alliance LLM Taxonomy):** Addresses specific threats to LLMs in cloud environments, as defined by the CSA LLM Threat Taxonomy. 64 | * **ModelComplexityMultiplier:** A factor that adjusts the AI-Specific Metrics score based on the complexity of the AI model (ranging from 1.0 for simple models to 1.5 for highly complex models). 65 | 66 | ### 3.3. Environmental Metrics ### 67 | 68 | Environmental Metrics reflect the characteristics of the AI system's deployment environment that can influence the overall risk. 69 | 70 | * **Confidentiality Requirement (CR):** Measures the importance of maintaining the confidentiality of the data processed by the AI system. 71 | * **Integrity Requirement (IR):** Measures the importance of maintaining the integrity of the data and the AI system's outputs. 
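To make the AI-Specific Metrics formula in Section 3.2 concrete, here is a minimal sketch that multiplies a set of hypothetical sub-scores for the nine metrics and applies an assumed model complexity multiplier. None of the values below come from a real assessment.

```python
# Minimal sketch of the Section 3.2 aggregation, using invented sub-scores.
# Each value is the 0.0-1.0 rubric score from Section 4 (higher = more severe).
ai_metrics = {
    "MR": 0.6, "DS": 0.5, "EI": 0.4, "DC": 0.7, "AD": 0.5,
    "AA": 0.6, "LL": 0.5, "GV": 0.4, "CS": 0.6,
}
model_complexity_multiplier = 1.2   # assumed: moderately complex model (allowed range 1.0-1.5)

ai_specific_score = 1.0
for value in ai_metrics.values():
    ai_specific_score *= value                    # straight product, as in the formula
ai_specific_score *= model_complexity_multiplier

print(f"AISpecificMetrics = {ai_specific_score:.4f}")
```

Because the nine sub-scores are multiplied rather than averaged, a single low sub-score pulls the product down sharply; the weights and temporal adjustments described later in the scoring methodology are applied on top of this component.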
72 | * **Availability Requirement (AR):** Measures the importance of ensuring the AI system's availability for its intended purpose. 73 | * **Societal Impact Requirement (SIR):** Measures the importance of mitigating potential negative societal impacts of the AI system. 74 | 75 | These are rated: 76 | 77 | * Not Defined (X): 1.0 (default, does not modify the score) 78 | * Low (L): 0.5 79 | * Medium (M): 1.0 80 | * High (H): 1.5 81 | 82 | ### 3.4. Modified Base Metrics ### 83 | 84 | These metrics are based on the Base Metrics but can be modified according to the specific environment: 85 | 86 | * Modified Attack Vector (MAV) 87 | * Modified Attack Complexity (MAC) 88 | * Modified Privileges Required (MPR) 89 | * Modified User Interaction (MUI) 90 | * Modified Scope (MS) 91 | 92 | These metrics are rated the same way as Base Metrics, with the addition of: 93 | 94 | * Not Defined (X): Uses the unmodified Base Metric value. 95 | 96 | ## 4. Detailed Scoring Rubric for AI-Specific Metrics (Higher Score = More Severe) ## 97 | 98 | Each sub-category within the AI-Specific Metrics is scored on a scale of 0.0 to 1.0, with the following general interpretation: 99 | 100 | * **0.0:** **No Known Vulnerability:** Indicates no known vulnerability or a formally proven resistance to the specific threat. 101 | * **0.1 - 0.3:** **Low Vulnerability:** Indicates a low vulnerability with strong mitigation in place, but some minor weaknesses may still exist. 102 | * **0.4 - 0.6:** **Medium Vulnerability:** Indicates a moderate vulnerability with some mitigation, but significant weaknesses remain. 103 | * **0.7 - 1.0:** **Critical/High Vulnerability:** Indicates a severe vulnerability with little to no mitigation in place. 104 | 105 | **MR (Model Robustness)** 106 | 107 | * **Evasion Resistance** 108 | * **0.0:** Formally verified robustness against a wide range of evasion attacks. 109 | * **0.1-0.3:** Robust to most known evasion attacks, multiple defense mechanisms employed (e.g., adversarial training, input sanitization, certified robustness). 110 | * **0.4-0.6:** Susceptible to some evasion attacks, basic adversarial training or input validation in place. 111 | * **0.7-1.0:** Highly susceptible to common evasion attacks (e.g., FGSM, PGD). No or minimal defenses. 112 | * **Examples:** 113 | * **0.0:** Model's robustness proven through formal methods. 114 | * **0.2:** Model uses a combination of adversarial training, input filtering, and certified robustness techniques. 115 | * **0.5:** Model trained with adversarial examples, but still vulnerable to more sophisticated attacks. 116 | * **0.8:** Model easily fooled by adding small perturbations to input images. 117 | 118 | * **Gradient Masking/Obfuscation** 119 | * **0.0:** Gradients are completely hidden or formally proven to be unrecoverable. 120 | * **0.1-0.3:** Strong gradient masking techniques used (e.g., Shattered Gradients, Thermometer Encoding), making gradient-based attacks significantly more difficult. 121 | * **0.4-0.6:** Basic gradient obfuscation methods employed (e.g., adding noise), but gradients can still be partially recovered. 122 | * **0.7-1.0:** Gradients are easily accessible and interpretable, no masking techniques used. 123 | * **Examples:** 124 | * **0.0:** Model uses homomorphic encryption or other methods to make gradients completely inaccessible. 125 | * **0.2:** Model uses advanced techniques like shattered gradients to make gradient-based attacks computationally expensive. 
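As a concrete illustration of why freely accessible gradients matter for both Evasion Resistance and Gradient Masking, the sketch below crafts a single FGSM-style perturbation against a toy logistic-regression model. The weights, input, and perturbation budget are hypothetical, and a real evaluation would run stronger attacks such as PGD against the production model.

```python
import numpy as np

# Toy logistic-regression "model": p(y=1 | x) = sigmoid(w @ x + b).
# Weights, bias, input, and label are hypothetical values chosen for readability.
w = np.array([1.2, -0.8, 0.5])
b = 0.1
x = np.array([0.4, 0.3, -0.2])
y = 1.0        # true label
eps = 0.1      # attacker's L-infinity perturbation budget

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(x_in):
    """Cross-entropy loss for the true label and its gradient with respect to the input."""
    p = sigmoid(w @ x_in + b)
    loss = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    grad_x = (p - y) * w          # dL/dx for logistic regression
    return float(loss), grad_x

loss_clean, grad = loss_and_grad(x)
x_adv = x + eps * np.sign(grad)   # FGSM: step along the sign of the input gradient
loss_adv, _ = loss_and_grad(x_adv)

print(f"clean loss {loss_clean:.3f} -> adversarial loss {loss_adv:.3f}")
```

Denying or obfuscating the `grad` quantity computed above is precisely what the higher-scoring gradient masking bands describe.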
126 | * **0.5:** Some noise is added to gradients, but they still reveal information about the model. 127 | * **0.9:** Model's gradients can be easily calculated and visualized. 128 | 129 | * **Robustness Certification** 130 | * **0.0:** Formal robustness certification obtained from a reputable third-party organization. 131 | * **0.1-0.3:** Rigorous robustness testing against a wide range of attacks and using multiple metrics (e.g., CLEVER, Robustness Gym). 132 | * **0.4-0.6:** Basic robustness testing against a limited set of attacks or using simple metrics. 133 | * **0.7-1.0:** No robustness testing performed. 134 | * **Examples:** 135 | * **0.0:** Model certified by a recognized certification body for robustness against specific attack types. 136 | * **0.2:** Model evaluated using a comprehensive robustness testing framework like Robustness Gym. 137 | * **0.5:** Model tested against FGSM attacks with a limited range of perturbation budgets. 138 | * **0.8:** No testing for robustness against adversarial examples. 139 | 140 | **DS (Data Sensitivity)** 141 | 142 | * **Data Confidentiality** 143 | * **0.0:** Data fully anonymized using techniques like differential privacy or homomorphic encryption. 144 | * **0.1-0.3:** Strong encryption (e.g., AES-256) used for data at rest and in transit, strict access controls and key management practices in place. 145 | * **0.4-0.6:** Sensitive data with basic access controls (e.g., passwords), but no encryption. 146 | * **0.7-1.0:** Highly sensitive data (e.g., PII, financial data) stored or processed with no or minimal protection. 147 | * **Examples:** 148 | * **0.0:** Data is fully anonymized and provably unlinkable to individuals. 149 | * **0.2:** Data encrypted at rest and in transit, with strict access controls and key rotation policies. 150 | * **0.5:** Data access restricted by user roles, but data is stored in plain text. 151 | * **0.9:** Training data includes unencrypted PII accessible to all developers. 152 | 153 | * **Data Integrity** 154 | * **0.0:** Data integrity formally verified using techniques like blockchain or Merkle trees. 155 | * **0.1-0.3:** Strong integrity checks (e.g., digital signatures, cryptographic hashes) and tamper detection mechanisms in place. 156 | * **0.4-0.6:** Basic integrity checks (e.g., checksums) used, but no tamper-proof mechanisms. 157 | * **0.7-1.0:** No data integrity checks, data can be easily modified without detection. 158 | * **Examples:** 159 | * **0.0:** Data is stored on a blockchain, ensuring immutability and tamper-proof integrity. 160 | * **0.2:** Data is digitally signed, and any modification is detected and alerted. 161 | * **0.5:** Checksums used to verify data integrity upon access. 162 | * **0.8:** Data can be altered without any detection. 163 | 164 | * **Data Provenance** 165 | * **0.0:** Data provenance formally verified and auditable, with mechanisms to ensure the authenticity and trustworthiness of the data source. 166 | * **0.1-0.3:** Detailed data lineage tracked, including all transformations and processing steps, with a clear audit trail. 167 | * **0.4-0.6:** Basic information about data sources available, but lineage is incomplete or unclear. 168 | * **0.7-1.0:** No information about data origin, collection methods, or transformations. 169 | * **Examples:** 170 | * **0.0:** Data provenance is cryptographically verified and tamper-proof. 171 | * **0.2:** Full data lineage is tracked, including all processing steps and data owners. 
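The checksum- and signature-based controls referenced under Data Integrity and Data Provenance can be prototyped with nothing more than the standard library. The sketch below records a SHA-256 digest plus minimal provenance metadata for a training file and re-verifies it later; the file name, source label, and manifest path are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_provenance(data_file: Path, source: str, manifest: Path) -> None:
    """Append an integrity + provenance entry for data_file to a JSON-lines manifest."""
    entry = {
        "file": data_file.name,
        "sha256": sha256_of(data_file),
        "source": source,                               # e.g. upstream system or vendor
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    with manifest.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")

def verify(data_file: Path, manifest: Path) -> bool:
    """Return True if the file's digest matches its most recent manifest entry.

    Assumes the file has been recorded at least once.
    """
    entries = [json.loads(line) for line in manifest.read_text().splitlines()]
    latest = [e for e in entries if e["file"] == data_file.name][-1]
    return latest["sha256"] == sha256_of(data_file)

# Hypothetical usage:
# record_provenance(Path("train.csv"), source="vendor-export", manifest=Path("manifest.jsonl"))
# assert verify(Path("train.csv"), Path("manifest.jsonl"))
```

A plain checksum manifest like this corresponds to the middle of the rubric; tamper-evident storage or signed manifests are what push a system toward the lower (better) bands.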
172 | * **0.5:** Data sources are documented, but the transformations applied are not clearly recorded. 173 | * **0.9:** Origin and collection method of the data are unknown. 174 | 175 | **EI (Ethical Implications)** 176 | 177 | * **Bias and Discrimination** 178 | * **0.0:** System demonstrably fair and unbiased across different groups, with ongoing monitoring and auditing for bias. 179 | * **0.1-0.3:** Rigorous fairness testing using multiple metrics (e.g., equal opportunity, predictive rate parity) and bias mitigation techniques applied (e.g., re-weighting, adversarial debiasing). 180 | * **0.4-0.6:** Some awareness of potential bias, basic fairness metrics (e.g., demographic parity) monitored, but no active mitigation. 181 | * **0.7-1.0:** High risk of discriminatory outcomes, no bias detection or mitigation methods used. 182 | * **Examples:** 183 | * **0.0:** System's fairness is formally verified and continuously monitored. 184 | * **0.2:** System is trained using techniques like adversarial debiasing and regularly audited for fairness. 185 | * **0.5:** Fairness metrics are monitored, but no actions are taken to address identified biases. 186 | * **0.9:** System consistently produces biased outputs against certain demographic groups. 187 | 188 | * **Transparency and Explainability** 189 | * **0.0:** System's decision-making process is fully transparent and formally explainable, with clear causal relationships established. 190 | * **0.1-0.3:** Highly explainable, system uses inherently interpretable models (e.g., decision trees) or provides reliable and comprehensive explanations for all decisions. 191 | * **0.4-0.6:** Limited explainability, some post-hoc explanations (e.g., LIME, SHAP) can be generated, but they may not be reliable or comprehensive. 192 | * **0.7-1.0:** Black-box system, no insight into decision-making process. 193 | * **Examples:** 194 | * **0.0:** System's logic is fully transparent and can be formally verified. 195 | * **0.2:** System uses an interpretable model or provides detailed and reliable explanations for each decision. 196 | * **0.5:** Post-hoc explanations can be generated, but they are not always accurate or complete. 197 | * **0.8:** No explanation provided for the system's decisions. 198 | 199 | * **Accountability** 200 | * **0.0:** Full accountability with mechanisms for redress, remediation, and independent oversight. 201 | * **0.1-0.3:** Clear accountability framework in place, with defined roles, responsibilities, and processes for addressing errors and disputes. 202 | * **0.4-0.6:** Some responsibility assigned to developers or operators, but no formal accountability framework. 203 | * **0.7-1.0:** No clear lines of accountability for system's actions or errors. 204 | * **Examples:** 205 | * **0.0:** System has a formal accountability framework with mechanisms for independent audits and public reporting. 206 | * **0.2:** Clear roles and responsibilities defined for development, deployment, and operation, with an incident response plan. 207 | * **0.5:** Development team is generally responsible, but there are no clear procedures for handling errors. 208 | * **0.9:** Unclear who is responsible when the system makes a mistake. 209 | 210 | * **Societal Impact** 211 | * **0.0:** System designed to maximize positive societal impact and minimize negative consequences, with ongoing monitoring and engagement with affected communities. 
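Several of the EI sub-categories above refer to concrete fairness metrics such as demographic parity. As a minimal illustration, the sketch below compares positive-prediction rates across two groups on hypothetical predictions; a real assessment would use multiple metrics, statistically meaningful samples, and documented mitigation steps.

```python
# Hypothetical model outputs: 1 = positive decision (e.g. application approved), 0 = negative.
predictions = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0]
groups      = ["A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"]

def positive_rate(group: str) -> float:
    """Share of positive predictions for one demographic group."""
    preds = [p for p, g in zip(predictions, groups) if g == group]
    return sum(preds) / len(preds)

rate_a, rate_b = positive_rate("A"), positive_rate("B")
dp_difference = abs(rate_a - rate_b)   # demographic parity difference

print(f"P(pos | A) = {rate_a:.2f}, P(pos | B) = {rate_b:.2f}, gap = {dp_difference:.2f}")
# A large, persistent gap is one signal of the discriminatory outcomes
# described in the upper bands of the Bias and Discrimination rubric.
```

Monitoring a gap like this without acting on it corresponds to the middle band; combining several metrics with active mitigation and ongoing audits is what the lower bands describe.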
212 | * **0.1-0.3:** Thorough societal impact assessment conducted, considering a wide range of stakeholders and potential harms, with mitigation strategies in place. 213 | * **0.4-0.6:** Some consideration of potential societal impacts, but no comprehensive assessment or proactive mitigation. 214 | * **0.7-1.0:** High risk of negative societal impacts (e.g., job displacement, manipulation, erosion of trust), no assessment or mitigation. 215 | * **Examples:** 216 | * **0.0:** System is designed with a strong ethical framework, promoting fairness, transparency, and societal well-being. 217 | * **0.2:** A comprehensive societal impact assessment has been conducted, and mitigation strategies are in place. 218 | * **0.5:** Developers acknowledge potential negative impacts but have not taken concrete steps to address them. 219 | * **0.8:** System could be used for mass surveillance or to spread misinformation without any safeguards. 220 | 221 | **DC (Decision Criticality)** 222 | 223 | * **Safety-Critical** 224 | * **0.0:** System formally verified to meet safety-critical standards (e.g., ISO 26262 for automotive, IEC 62304 for medical devices). 225 | * **0.1-0.3:** Rigorous safety testing performed, including edge cases and failure scenarios, with failsafe mechanisms and human oversight. 226 | * **0.4-0.6:** Basic safety measures in place (e.g., some redundancy), but no rigorous safety testing or formal verification. 227 | * **0.7-1.0:** System used in safety-critical applications (e.g., autonomous driving, medical diagnosis) without proper safety considerations or failsafe mechanisms. 228 | * **Examples:** 229 | * **0.0:** System is certified to meet relevant safety standards for its application domain. 230 | * **0.2:** System undergoes rigorous safety testing and has multiple failsafe mechanisms in place. 231 | * **0.5:** System has some backup systems, but they have not been thoroughly tested. 232 | * **0.9:** System used to control a critical function without any redundancy or failsafe mechanisms. 233 | 234 | * **Financial Impact** 235 | * **0.0:** System designed to minimize financial risks, with real-time fraud prevention, anomaly detection, and comprehensive insurance coverage. 236 | * **0.1-0.3:** Robust financial controls and fraud detection mechanisms in place, regular audits conducted to identify and mitigate financial risks. 237 | * **0.4-0.6:** Some measures to mitigate financial risks (e.g., transaction limits), but no comprehensive risk assessment or fraud prevention mechanisms. 238 | * **0.7-1.0:** High risk of significant financial loss due to system errors or malicious attacks, no safeguards in place. 239 | * **Examples:** 240 | * **0.0:** System has multiple layers of financial controls, real-time fraud prevention, and insurance against financial losses. 241 | * **0.2:** System uses advanced fraud detection algorithms and undergoes regular financial audits. 242 | * **0.5:** System has some transaction limits and basic fraud monitoring. 243 | * **0.8:** System errors could lead to large unauthorized transactions without any detection. 244 | 245 | * **Reputational Damage** 246 | * **0.0:** System designed to minimize reputational risks, with ongoing monitoring of public perception, proactive engagement with stakeholders, and a robust crisis management plan. 247 | * **0.1-0.3:** Reputational risk assessment conducted, considering various scenarios and stakeholders, with communication plans and mitigation strategies in place. 
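The DC sub-categories lend themselves to simple operational guardrails. As one illustration of the "transaction limits and basic fraud monitoring" band under Financial Impact, the sketch below flags transactions that exceed a hard limit or deviate strongly from recent history; the limit, threshold, and amounts are assumptions chosen for readability.

```python
from statistics import mean, stdev

recent_amounts = [120.0, 95.5, 130.25, 110.0, 101.75, 98.0, 125.5]  # hypothetical history
HARD_LIMIT = 1_000.0       # assumed per-transaction limit
Z_THRESHOLD = 3.0          # flag anything more than 3 standard deviations above the mean

def flag_transaction(amount: float) -> list[str]:
    """Return the (possibly empty) list of reasons this transaction should be reviewed."""
    reasons = []
    if amount > HARD_LIMIT:
        reasons.append("exceeds hard transaction limit")
    mu, sigma = mean(recent_amounts), stdev(recent_amounts)
    if sigma > 0 and (amount - mu) / sigma > Z_THRESHOLD:
        reasons.append("statistical outlier versus recent history")
    return reasons

print(flag_transaction(105.0))    # -> []
print(flag_transaction(4_500.0))  # -> both checks fire
```

Controls of this kind sit in the 0.4-0.6 band; real-time fraud prevention, comprehensive risk assessment, and insurance coverage are what distinguish the lower bands.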
248 | * **0.4-0.6:** Some awareness of reputational risks, limited monitoring of public perception, but no proactive measures to address negative publicity. 249 | * **0.7-1.0:** High risk of severe reputational damage due to system errors, biases, or security breaches, no mitigation strategies. 250 | * **Examples:** 251 | * **0.0:** System is designed to be transparent and ethical, minimizing the risk of reputational damage, and the company has a strong track record of responsible AI practices. 252 | * **0.2:** A reputational risk assessment has been conducted, and a crisis communication plan is in place. 253 | * **0.5:** Company monitors social media for negative comments but has no plan to address them. 254 | * **0.9:** System errors or biases could lead to widespread public criticism and loss of trust. 255 | 256 | * **Operational Disruption** 257 | * **0.0:** System designed for high availability and resilience, with real-time monitoring, automated recovery, and regular testing of failover mechanisms. 258 | * **0.1-0.3:** Robust operational controls, including redundancy, failover mechanisms, and a comprehensive business continuity and disaster recovery plan. 259 | * **0.4-0.6:** Some measures to mitigate operational risks (e.g., limited redundancy), but no comprehensive business continuity plan. 260 | * **0.7-1.0:** High risk of significant operational disruption due to system failures or attacks, no backup systems or recovery plans. 261 | * **Examples:** 262 | * **0.0:** System is designed for 24/7 availability with multiple layers of redundancy and automated recovery. 263 | * **0.2:** System has a comprehensive business continuity plan that is regularly tested and updated. 264 | * **0.5:** System has some redundant components, but failover procedures are not regularly tested. 265 | * **0.8:** System failure could bring down critical business operations with no backup. 266 | 267 | **AD (Adaptability)** 268 | 269 | * **Continuous Monitoring** 270 | * **0.0:** Real-time monitoring with automated response to detected threats, including dynamic model adaptation and rollback capabilities. 271 | * **0.1-0.3:** Comprehensive monitoring of system inputs, outputs, and internal states, with anomaly detection algorithms and automated alerts for suspicious activity. 272 | * **0.4-0.6:** Basic monitoring in place (e.g., logging system outputs), but limited analysis and no automated alerts. 273 | * **0.7-1.0:** No monitoring for adversarial attacks, anomalies, or performance degradation. 274 | * **Examples:** 275 | * **0.0:** System has real-time intrusion detection and automated response capabilities. 276 | * **0.2:** System uses a SIEM system to monitor for anomalies and generate alerts. 277 | * **0.5:** System logs are stored but only analyzed manually on a periodic basis. 278 | * **0.9:** No logs are collected, and no monitoring is performed. 279 | 280 | * **Retraining Capabilities** 281 | * **0.0:** Continuous and automated retraining triggered by performance degradation, concept drift, or the availability of new data, with minimal human intervention. 282 | * **0.1-0.3:** Automated retraining pipeline in place, allowing for regular updates with new data and model improvements. 283 | * **0.4-0.6:** Manual retraining possible, but infrequent and time-consuming, with limited automation. 284 | * **0.7-1.0:** No capability to retrain the model, or retraining requires significant manual effort and downtime. 
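In practice, Continuous Monitoring and Retraining Capabilities meet in a drift check that decides when retraining should be triggered. Below is a minimal sketch that tracks a rolling window of labelled outcomes against an assumed deployment-time baseline; the baseline, tolerance, and simulated degradation are illustrative.

```python
import random
from collections import deque

BASELINE_ACCURACY = 0.92       # assumed accuracy measured at deployment time
DEGRADATION_TOLERANCE = 0.05   # trigger retraining if rolling accuracy drops by more than this
WINDOW = 200                   # number of most recent labelled predictions to track

recent_outcomes: deque = deque(maxlen=WINDOW)   # True = prediction was correct

def record_outcome(correct: bool) -> bool:
    """Record one labelled outcome; return True when retraining should be triggered."""
    recent_outcomes.append(correct)
    if len(recent_outcomes) < WINDOW:
        return False                            # not enough evidence yet
    rolling_accuracy = sum(recent_outcomes) / len(recent_outcomes)
    return rolling_accuracy < BASELINE_ACCURACY - DEGRADATION_TOLERANCE

# Hypothetical stream in which the model silently degrades after step 500.
random.seed(0)
for step in range(1000):
    correct = random.random() < (0.95 if step < 500 else 0.80)
    if record_outcome(correct):
        print(f"retraining triggered at step {step}")
        break
```

A system in the lower (better) bands would feed a trigger like this directly into an automated retraining pipeline rather than a manual, downtime-heavy process.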
285 | * **Examples:** 286 | * **0.0:** Model continuously learns and adapts to new data and changing conditions. 287 | * **0.2:** Model is automatically retrained on a regular schedule using an automated pipeline. 288 | * **0.5:** Model can be retrained manually, but it requires significant effort and downtime. 289 | * **0.8:** Model cannot be updated without rebuilding it from scratch. 290 | 291 | * **Threat Intelligence Integration** 292 | * **0.0:** Proactive threat hunting based on threat intelligence, with automated analysis and correlation of threat data to identify and mitigate potential risks before they impact the system. 293 | * **0.1-0.3:** Threat intelligence feeds integrated into security monitoring and response systems, providing automated alerts and updates on emerging threats. 294 | * **0.4-0.6:** Basic threat intelligence used (e.g., manually reviewing threat reports), but not systematically integrated into security operations. 295 | * **0.7-1.0:** No integration with threat intelligence feeds or other sources of security information. 296 | * **Examples:** 297 | * **0.0:** System uses threat intelligence to proactively identify and mitigate potential vulnerabilities. 298 | * **0.2:** System automatically ingests and analyzes threat intelligence feeds, generating alerts for relevant threats. 299 | * **0.5:** Security team occasionally reviews threat intelligence reports but takes no specific actions. 300 | * **0.9:** Security team is not aware of current threats to AI systems. 301 | 302 | * **Adversarial Training** 303 | * **0.0:** Continuous adversarial training with evolving attack techniques, incorporating new attacks as they are discovered, and using formal verification methods to ensure robustness. 304 | * **0.1-0.3:** Robust adversarial training against a wide range of attacks (e.g., PGD, C&W) with larger perturbation budgets, using multiple techniques (e.g., ensemble adversarial training, certified defenses). 305 | * **0.4-0.6:** Basic adversarial training with a limited set of attack types (e.g., FGSM) and small perturbation budgets. 306 | * **0.7-1.0:** No adversarial training used during model development. 307 | * **Examples:** 308 | * **0.0:** Model undergoes continuous adversarial training and is formally verified for robustness against specific attack models. 309 | * **0.2:** Model is trained using a combination of different adversarial training techniques and attack types. 310 | * **0.5:** Model is trained with FGSM-generated adversarial examples. 311 | * **0.8:** Model is not trained to be resistant to any adversarial examples. 312 | 313 | **AA (Adversarial Attack Surface)** 314 | 315 | * **Model Inversion** 316 | * **0.0:** Model provably resistant to model inversion attacks, with formal guarantees on the privacy of the training data. 317 | * **0.1-0.3:** Strong defenses against model inversion, such as differential privacy or data sanitization techniques, significantly increasing the difficulty of reconstructing training data. 318 | * **0.4-0.6:** Some measures to mitigate model inversion (e.g., limiting model output precision), but significant risks remain. 319 | * **0.7-1.0:** High risk of model inversion attacks, sensitive training data can be easily reconstructed from model outputs or gradients. 320 | * **Examples:** 321 | * **0.0:** Model is formally proven to be resistant to model inversion under specific attack models. 322 | * **0.2:** Model is trained with differential privacy, providing strong guarantees against model inversion. 
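One of the partial mitigations this rubric refers to is limiting the precision of what the model returns to callers. The sketch below keeps only the top few classes and rounds their probabilities before the response leaves the service; the class count, rounding, and raw scores are illustrative. Coarsening the output reduces, but does not eliminate, the signal available to inversion and membership-inference attacks, which is why it corresponds to the middle of the scale rather than the 0.1-0.3 band.

```python
import numpy as np

def limit_output(probs: np.ndarray, top_k: int = 3, decimals: int = 2) -> dict[int, float]:
    """Return only the top_k class probabilities, rounded to `decimals` places.

    Coarsening the response removes some of the fine-grained confidence signal
    that model-inversion and membership-inference attacks rely on.
    """
    order = np.argsort(probs)[::-1][:top_k]          # indices of the top_k classes
    return {int(i): round(float(probs[i]), decimals) for i in order}

# Hypothetical raw softmax output from a 6-class model:
raw = np.array([0.6113, 0.2087, 0.0954, 0.0511, 0.0222, 0.0113])
print(limit_output(raw))   # {0: 0.61, 1: 0.21, 2: 0.1}
```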
323 | * **0.5:** Model's output is rounded or perturbed to make inversion more difficult, but some information may still be leaked. 324 | * **0.9:** An attacker can easily reconstruct faces or other sensitive data from the model's outputs. 325 | 326 | * **Model Extraction** 327 | * **0.0:** Model provably resistant to model extraction, with formal guarantees on the difficulty of creating a functional copy. 328 | * **0.1-0.3:** Strong defenses against model extraction, such as anomaly detection on API queries, model watermarking, and legal agreements with users, making it significantly more difficult and costly to steal the model. 329 | * **0.4-0.6:** Some measures to mitigate model extraction (e.g., rate limiting, watermarking), but a determined attacker can still succeed. 330 | * **0.7-1.0:** High risk of model extraction, attackers can easily create a functional copy of the model by querying its API. 331 | * **Examples:** 332 | * **0.0:** Model is designed to be resistant to model extraction, and its functionality cannot be replicated through black-box queries. 333 | * **0.2:** Model uses watermarking and anomaly detection to detect and prevent extraction attempts. 334 | * **0.5:** API access is rate-limited, but an attacker can still extract the model over a longer period. 335 | * **0.8:** An attacker can create a copy of the model by making a large number of API calls. 336 | 337 | * **Membership Inference** 338 | * **0.0:** Model provably resistant to membership inference attacks, with formal guarantees on the privacy of individual training data points. 339 | * **0.1-0.3:** Strong defenses against membership inference, such as differential privacy or model stacking, significantly reducing the attacker's ability to infer membership. 340 | * **0.4-0.6:** Some measures to mitigate membership inference (e.g., regularization, dropout), but significant risks remain. 341 | * **0.7-1.0:** High risk of membership inference attacks, attackers can easily determine whether a specific data point was used in the model's training set. 342 | * **Examples:** 343 | * **0.0:** Model is formally proven to be resistant to membership inference under specific attack models. 344 | * **0.2:** Model is trained with differential privacy, providing strong protection against membership inference. 345 | * **0.5:** Model uses regularization techniques that may reduce the risk of membership inference, but no formal guarantees. 346 | * **0.9:** An attacker can easily determine if a particular individual's data was used to train the model. 347 | 348 | **LL (Lifecycle Vulnerabilities)** 349 | 350 | * **Development** 351 | * **0.0:** Secure development environment with formal verification of code, strict access controls, and continuous monitoring for security threats. 352 | * **0.1-0.3:** Secure development lifecycle (SDL) practices followed, including code reviews, static analysis, and vulnerability scanning, with access controls on development resources. 353 | * **0.4-0.6:** Basic security measures in the development environment (e.g., developer workstations have antivirus software), some secure coding guidelines, but no formal secure development lifecycle (SDL). 354 | * **0.7-1.0:** Insecure development environment, no secure coding practices, no access controls on development resources. 355 | * **Examples:** 356 | * **0.0:** Development environment is isolated and continuously monitored, with formal methods used to verify the security of critical code components. 
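The static-analysis step mentioned in the Development rubric does not have to start with heavy tooling; even a small scan for obviously hard-coded credentials catches the kind of mistake described in its upper bands. The patterns below are illustrative and deliberately incomplete, not a substitute for a real SDL toolchain.

```python
import re
import sys
from pathlib import Path

# Deliberately small, illustrative set of patterns for hard-coded credentials.
SUSPICIOUS_PATTERNS = {
    "AWS access key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic secret assignment": re.compile(r"(?i)(api_key|secret|password)\s*=\s*['\"][^'\"]{8,}['\"]"),
    "private key header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_file(path: Path) -> list[str]:
    """Return human-readable findings for one source file."""
    findings = []
    text = path.read_text(errors="ignore")
    for name, pattern in SUSPICIOUS_PATTERNS.items():
        for match in pattern.finditer(text):
            line_no = text.count("\n", 0, match.start()) + 1
            findings.append(f"{path}:{line_no}: possible {name}")
    return findings

if __name__ == "__main__":
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for source in root.rglob("*.py"):
        for finding in scan_file(source):
            print(finding)
```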
357 | * **0.2:** SDL practices are followed, including code reviews, static analysis, and vulnerability scanning, with access to code repositories restricted based on roles. 358 | * **0.5:** Developers use company-provided laptops with basic security software, and some secure coding guidelines are in place. 359 | * **0.8:** Developers work on personal laptops with no security controls, and code is stored in a public repository without access restrictions. 360 | 361 | * **Training** 362 | * **0.0:** Secure and isolated training environment with formal verification of the training process, strict access controls, and continuous monitoring for intrusions and anomalies. 363 | * **0.1-0.3:** Secure training environment with access controls, data encryption at rest and in transit, and regular security audits. 364 | * **0.4-0.6:** Basic security measures in the training environment (e.g., training data stored on a password-protected server), but no encryption or strict access controls. 365 | * **0.7-1.0:** Insecure training environment, no data security or access controls, training data stored and processed on unsecured systems. 366 | * **Examples:** 367 | * **0.0:** Training is performed in a secure enclave with strict access controls, continuous monitoring, and formal verification of the training process. 368 | * **0.2:** Training data is encrypted at rest and in transit, access is restricted based on roles, and the training environment is regularly audited for security. 369 | * **0.5:** Training data is stored on a password-protected server, but access is not strictly controlled. 370 | * **0.8:** Training data is stored on a public cloud server without any encryption or access controls. 371 | 372 | * **Deployment** 373 | * **0.0:** Secure and isolated deployment environment with continuous monitoring, automated security patching, and formal verification of the deployment process. 374 | * **0.1-0.3:** Secure deployment environment with strong authentication and authorization, regular security updates, and intrusion detection systems. 375 | * **0.4-0.6:** Basic security measures in the deployment environment (e.g., model deployed behind a firewall), but no strong authentication or authorization mechanisms. 376 | * **0.7-1.0:** Insecure deployment environment, no access controls or security monitoring, model deployed on publicly accessible servers without any protection. 377 | * **Examples:** 378 | * **0.0:** Model is deployed in a secure enclave with strict access controls, continuous monitoring, and automated security patching. 379 | * **0.2:** Model is deployed in a secure cloud environment with strong authentication, authorization, and regular security updates. 380 | * **0.5:** Model is deployed behind a firewall, but API keys are shared among multiple users. 381 | * **0.8:** Model is deployed on a public server with no authentication required to access its API. 382 | 383 | * **Operations** 384 | * **0.0:** Continuous security monitoring with automated incident response capabilities, regular security audits, and a dedicated security operations center (SOC). 385 | * **0.1-0.3:** Comprehensive security monitoring using a SIEM system, automated alerts for suspicious activity, and a well-defined incident response plan that is regularly tested. 386 | * **0.4-0.6:** Basic security monitoring (e.g., manually reviewing logs), limited incident response capabilities, no formal incident response plan. 
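The step up from manually reviewed logs to automated alerts can be modest. The sketch below counts failed authentication events per client in structured log records and emits an alert when a threshold is crossed; the record format, client identifiers, and threshold are assumptions for illustration.

```python
from collections import Counter

# Hypothetical structured audit-log records from an AI inference API.
log_records = [
    {"client": "10.0.0.5", "event": "auth_failure"},
    {"client": "10.0.0.5", "event": "auth_failure"},
    {"client": "10.0.0.5", "event": "auth_failure"},
    {"client": "10.0.0.5", "event": "auth_failure"},
    {"client": "10.0.0.5", "event": "auth_failure"},
    {"client": "10.0.0.9", "event": "prediction"},
    {"client": "10.0.0.9", "event": "auth_failure"},
]

ALERT_THRESHOLD = 5   # assumed number of failures that warrants an alert

failures = Counter(r["client"] for r in log_records if r["event"] == "auth_failure")
for client, count in failures.items():
    if count >= ALERT_THRESHOLD:
        print(f"ALERT: {count} authentication failures from {client}; "
              f"open an incident per the response plan")
```

Wiring alerts like this into a tested incident response plan is what separates the 0.1-0.3 band from purely manual review.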
387 | * **0.7-1.0:** No security monitoring or incident response plan, system logs not collected or analyzed. 388 | * **Examples:** 389 | * **0.0:** A dedicated SOC monitors the system 24/7, with automated incident response capabilities and regular security audits. 390 | * **0.2:** A SIEM system is used to monitor security events, generate alerts, and trigger incident response procedures. 391 | * **0.5:** System logs are collected and manually reviewed on a weekly basis, and there is a basic incident response plan. 392 | * **0.8:** No logs are collected, and there is no process for responding to security incidents. 393 | 394 | **GV (Governance and Validation)** 395 | 396 | * **Compliance** 397 | * **0.0:** System exceeds regulatory requirements and sets industry best practices for compliance, with a proactive approach to adapting to new regulations. 398 | * **0.1-0.3:** Full compliance with relevant regulations and industry standards, with a dedicated compliance team and regular audits. 399 | * **0.4-0.6:** Basic understanding of regulations, some ad-hoc compliance efforts, but no formal compliance program. 400 | * **0.7-1.0:** No awareness of or compliance with relevant regulations (e.g., GDPR, CCPA, HIPAA) or industry standards. 401 | * **Examples:** 402 | * **0.0:** System is designed to be compliant by design, exceeding regulatory requirements and setting industry best practices. 403 | * **0.2:** System is fully compliant with all applicable regulations, with regular audits and a dedicated compliance team. 404 | * **0.5:** Some efforts are made to comply with regulations, but there are significant gaps and no formal compliance program. 405 | * **0.8:** System collects and processes personal data without user consent or proper safeguards, violating data privacy regulations. 406 | 407 | * **Auditing** 408 | * **0.0:** Regular independent audits by reputable third parties, with formal verification of the system's security, fairness, and ethical performance. 409 | * **0.1-0.3:** Regular internal audits conducted, covering all aspects of the AI system lifecycle, with clear audit trails and documentation. 410 | 411 | * **0.4-0.6:** Infrequent or limited audits (e.g., only auditing code for security vulnerabilities), with no independent verification. 412 | * **0.7-1.0:** No auditing of the AI system's design, development, deployment, or operation. 413 | * **Examples:** 414 | * **0.0:** Independent audits are conducted annually by a reputable third party, with the results publicly reported. 415 | * **0.2:** Regular internal audits are conducted, covering security, fairness, and performance, with detailed audit trails. 416 | * **0.5:** Code is audited for security vulnerabilities before deployment, but no other audits are conducted. 417 | * **0.8:** No audit logs are maintained, and no audits are performed. 418 | 419 | * **Risk Management** 420 | * **0.0:** Proactive and continuous AI risk management, with a dedicated AI risk management team, regular risk assessments, and a strong focus on anticipating and mitigating emerging AI risks. 421 | * **0.1-0.3:** Comprehensive AI risk management framework in place, with specific processes for identifying, assessing, mitigating, and monitoring AI risks, fully integrated into the organizational risk framework. 422 | * **0.4-0.6:** Basic risk assessment for AI systems, limited mitigation strategies, AI risks partially integrated into the organizational risk framework. 
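A "comprehensive AI risk management framework" sounds heavyweight, but the underlying record-keeping can start small. The sketch below models a minimal risk register with a likelihood-times-impact score; the entries, owners, and dates are invented for illustration, and the 1-5 scale is an assumption rather than part of AIVSS.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class AIRiskEntry:
    """One row of a minimal AI risk register (fields are illustrative, not prescriptive)."""
    risk: str
    likelihood: int          # 1 (rare) .. 5 (almost certain)
    impact: int              # 1 (negligible) .. 5 (severe)
    owner: str
    mitigation: str
    next_review: date

    @property
    def score(self) -> int:
        return self.likelihood * self.impact

register = [
    AIRiskEntry("Prompt injection against customer-facing LLM", 4, 4,
                "ml-platform", "input screening + output validation", date(2025, 3, 1)),
    AIRiskEntry("Training-data poisoning via third-party feed", 2, 5,
                "data-engineering", "provenance manifest + anomaly checks", date(2025, 2, 1)),
]

for entry in sorted(register, key=lambda e: e.score, reverse=True):
    print(f"{entry.score:>2}  {entry.risk}  (owner: {entry.owner}, review {entry.next_review})")
```

Keeping such a register current, tied to mitigation owners, and integrated with the organization-wide risk framework is what distinguishes the 0.1-0.3 band from ad-hoc assessments.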
423 | * **0.7-1.0:** No AI-specific risk management processes, AI risks not considered in the overall organizational risk framework. 424 | * **Examples:** 425 | * **0.0:** AI risk management is a continuous process, integrated with the organization's overall risk management and governance structures. 426 | * **0.2:** A comprehensive AI risk management framework is in place, with regular risk assessments and mitigation plans. 427 | * **0.5:** AI risks are assessed on an ad-hoc basis, with limited mitigation strategies. 428 | * **0.8:** AI risks are not considered in the organization's risk management processes. 429 | 430 | * **Human Oversight** 431 | * **0.0:** Human-in-the-loop system with well-defined roles and responsibilities, clear procedures for human-machine collaboration, and mechanisms for human oversight at various stages of the system's operation. 432 | * **0.1-0.3:** Clear mechanisms for human review and intervention in the system's decision-making process, with well-defined roles and responsibilities for human operators. 433 | * **0.4-0.6:** Limited human oversight, primarily reactive (e.g., users can report errors), no clear mechanisms for human intervention or override. 434 | * **0.7-1.0:** No human oversight or intervention in the AI system's decision-making process. 435 | * **Examples:** 436 | * **0.0:** System is designed for human-machine collaboration, with humans playing a central role in the decision-making process. 437 | * **0.2:** System has mechanisms for human operators to review and override its decisions in specific cases. 438 | * **0.5:** Users can report errors, but there is no process for human intervention in the system's decisions. 439 | * **0.8:** System operates autonomously without any human control or monitoring. 440 | 441 | * **Ethical Framework Alignment** 442 | * **0.0:** System demonstrably adheres to and promotes ethical AI principles, with ongoing monitoring and auditing of ethical performance. 443 | * **0.1-0.3:** System design and operation align with established ethical frameworks (e.g., OECD AI Principles, Montreal Declaration for Responsible AI), with mechanisms for addressing ethical concerns. 444 | * **0.4-0.6:** Basic awareness of ethical guidelines, limited implementation, no formal ethical review process. 445 | * **0.7-1.0:** No consideration of ethical frameworks or principles in the design, development, or deployment of the AI system. 446 | * **Examples:** 447 | * **0.0:** System's ethical performance is regularly assessed, and it actively promotes ethical AI principles. 448 | * **0.2:** System design incorporates principles from relevant ethical frameworks, and there is a process for addressing ethical concerns. 449 | * **0.5:** Developers are aware of ethical guidelines but have not formally integrated them into the system's design. 450 | * **0.8:** System is developed and deployed without any consideration for ethical implications. 451 | 452 | **CS (Cloud Security Alliance LLM Taxonomy)** 453 | 454 | * **Model Manipulation** 455 | * **0.0:** System provably resistant to model manipulation, with formal verification of robustness against prompt injection and other adversarial techniques. 456 | * **0.1-0.3:** Strong defenses against model manipulation (e.g., input filtering, adversarial training, output validation), making it difficult to manipulate the model's behavior. 457 | * **0.4-0.6:** Some defenses against manipulation (e.g., basic input sanitization), but vulnerabilities remain, and the model can be manipulated with some effort. 
458 | * **0.7-1.0:** Highly vulnerable to model manipulation, including prompt injection and other adversarial techniques, with no or minimal defenses in place. 459 | * **Examples:** 460 | * **0.0:** Model's resistance to prompt injection is formally verified. 461 | * **0.2:** Model uses a combination of input filtering, adversarial training, and output validation to defend against manipulation. 462 | * **0.5:** Model has basic input sanitization, but can still be manipulated by carefully crafted prompts. 463 | * **0.8:** Model is easily manipulated by prompt injection attacks. 464 | 465 | * **Data Poisoning** 466 | * **0.0:** System provably resistant to data poisoning, with formal guarantees on the integrity and security of the training data. 467 | * **0.1-0.3:** Strong data validation, anomaly detection, and provenance tracking mechanisms in place, making it very difficult to successfully poison the training data. 468 | * **0.4-0.6:** Some measures to mitigate data poisoning (e.g., outlier detection), but risks remain, and targeted poisoning attacks may still be possible. 469 | * **0.7-1.0:** High risk of data poisoning, with no or minimal measures to ensure the integrity and security of the training data. 470 | * **Examples:** 471 | * **0.0:** Training data is stored on an immutable ledger with cryptographic verification of its integrity. 472 | * **0.2:** Robust data validation, anomaly detection, and provenance tracking mechanisms are used to prevent and detect data poisoning. 473 | * **0.5:** Basic outlier detection is used, but sophisticated poisoning attacks may still succeed. 474 | * **0.8:** Training data can be easily tampered with, and there are no mechanisms to detect poisoning. 475 | 476 | * **Sensitive Data Disclosure** 477 | * **0.0:** System provably prevents sensitive data disclosure, with formal guarantees on the privacy of sensitive information. 478 | * **0.1-0.3:** Strong access controls, encryption, and output sanitization mechanisms in place, making it very difficult to extract sensitive data from the system. 479 | * **0.4-0.6:** Some measures to prevent data leakage (e.g., output filtering), but vulnerabilities remain, and sensitive information may be disclosed under certain circumstances. 480 | * **0.7-1.0:** High risk of sensitive data disclosure, with no or minimal measures to protect sensitive information processed or stored by the system. 481 | * **Examples:** 482 | * **0.0:** System uses homomorphic encryption or other privacy-preserving techniques to prevent any sensitive data disclosure. 483 | * **0.2:** Strong access controls, encryption, and output sanitization are used to prevent data leakage. 484 | * **0.5:** Model outputs are filtered to remove potentially sensitive information, but some leakage may still occur. 485 | * **0.8:** Model may reveal sensitive information in its outputs, and there are no safeguards against data exfiltration. 486 | 487 | * **Model Stealing** 488 | * **0.0:** Model provably resistant to model stealing, with formal guarantees on the difficulty of creating a functional copy. 489 | * **0.1-0.3:** Strong defenses against model stealing (e.g., anomaly detection on API queries, model watermarking, legal agreements), making it significantly more difficult and costly to steal the model. 490 | * **0.4-0.6:** Some measures to mitigate model stealing (e.g., rate limiting), but a determined attacker can still succeed. 
491 | * **0.7-1.0:** High risk of model stealing, and attackers can easily create a functional copy of the model by querying its API. 492 | * **Examples:** 493 | * **0.0:** Model is designed to be resistant to model extraction, and its functionality cannot be replicated through black-box queries. 494 | * **0.2:** Model uses a combination of watermarking, anomaly detection, and legal agreements to deter and detect model stealing. 495 | * **0.5:** API access is rate-limited, but an attacker can still extract the model over a longer period. 496 | * **0.8:** An attacker can create a copy of the model by making a large number of API calls. 497 | 498 | * **Failure/Malfunctioning** 499 | * **0.0:** System designed for high availability and fault tolerance, with formal verification of its reliability. 500 | * **0.1-0.3:** Robust error handling, monitoring, and redundancy mechanisms in place, significantly reducing the risk of failures or malfunctions. 501 | * **0.4-0.6:** Some measures to ensure reliability (e.g., basic error handling), but risks remain, and the system may experience downtime or produce incorrect outputs under certain conditions. 502 | * **0.7-1.0:** High risk of failures or malfunctions, with no or minimal measures to ensure the system's reliability. 503 | * **Examples:** 504 | * **0.0:** System is designed with multiple layers of redundancy and failover mechanisms, and its reliability is formally verified. 505 | * **0.2:** System has robust error handling, monitoring, and self-healing capabilities. 506 | * **0.5:** System has basic error handling and logging, but may experience downtime due to unexpected errors. 507 | * **0.8:** System is prone to crashes or errors, and there are no mechanisms to ensure its continuous operation. 508 | 509 | * **Insecure Supply Chain** 510 | * **0.0:** Secure and auditable supply chain, with formal verification of all third-party components and dependencies. 511 | * **0.1-0.3:** Strong supply chain security practices in place (e.g., code signing, dependency verification, regular audits), minimizing the risk of supply chain attacks. 512 | * **0.4-0.6:** Some measures to mitigate supply chain risks (e.g., using trusted sources), but vulnerabilities may still exist in third-party components. 513 | * **0.7-1.0:** High risk of supply chain vulnerabilities, with no or minimal measures to ensure the security of third-party components and dependencies. 514 | * **Examples:** 515 | * **0.0:** All third-party components are formally verified for security, and the supply chain is continuously monitored for vulnerabilities. 516 | * **0.2:** Strong security practices are followed throughout the supply chain, including code signing, dependency verification, and regular audits. 517 | * **0.5:** Third-party libraries are used from reputable sources, but they are not thoroughly vetted for security vulnerabilities. 518 | * **0.8:** System relies on outdated or unpatched third-party components with known vulnerabilities. 519 | 520 | * **Insecure Apps/Plugins** 521 | * **0.0:** Secure development and integration practices for apps/plugins enforced, with formal verification of their security. 522 | * **0.1-0.3:** Strong security guidelines and vetting process for apps/plugins, minimizing the risk of vulnerabilities introduced by third-party integrations. 523 | * **0.4-0.6:** Some security measures for apps/plugins (e.g., sandboxing), but risks remain, and vulnerabilities may be introduced through insecure integrations. 
524 | * **0.7-1.0:** High risk of vulnerabilities from insecure apps/plugins, with no or minimal measures to ensure the security of third-party integrations. 525 | * **Examples:** 526 | * **0.0:** All apps/plugins undergo a rigorous security review and are formally verified before being allowed to integrate with the system. 527 | * **0.2:** Strong security guidelines are in place for app/plugin development, and a vetting process is used to minimize risks. 528 | * **0.5:** Apps/plugins are sandboxed, but they may still have access to sensitive data or functionalities. 529 | * **0.8:** Apps/plugins can be easily installed without any security checks, potentially introducing vulnerabilities into the system. 530 | 531 | * **Denial of Service (DoS)** 532 | * **0.0:** System provably resistant to DoS attacks, with formal guarantees on its availability under high load or malicious traffic. 533 | * **0.1-0.3:** Strong defenses against DoS attacks (e.g., traffic filtering, rate limiting, auto-scaling), making it very difficult to disrupt the system's availability. 534 | * **0.4-0.6:** Some measures to mitigate DoS attacks (e.g., basic rate limiting), but the system may still be vulnerable to sophisticated attacks. 535 | * **0.7-1.0:** Highly vulnerable to DoS attacks, with no or minimal measures to protect the system's availability. 536 | * **Examples:** 537 | * **0.0:** System is designed to withstand massive traffic spikes and is formally verified for its resistance to DoS attacks. 538 | * **0.2:** System uses a combination of traffic filtering, rate limiting, and auto-scaling to mitigate DoS attacks. 539 | * **0.5:** System has basic rate limiting, but can still be overwhelmed by a large number of requests. 540 | * **0.8:** System can be easily made unavailable by sending a large number of requests or malicious traffic. 541 | 542 | * **Loss of Governance/Compliance** 543 | * **0.0:** System meets or exceeds all relevant regulatory and governance requirements, with a proactive approach to adapting to new regulations and a strong focus on maintaining compliance. 544 | * **0.1-0.3:** Strong compliance framework and controls in place, ensuring adherence to relevant regulations and governance policies. 545 | * **0.4-0.6:** Some compliance efforts, but gaps remain, and the system may not fully meet all regulatory or governance requirements. 546 | * **0.7-1.0:** High risk of non-compliance with regulations or governance policies, with no or minimal measures to ensure adherence. 547 | * **Examples:** 548 | * **0.0:** System is designed to be compliant by design, with automated mechanisms to ensure adherence to regulations and policies. 549 | * **0.2:** System is regularly audited for compliance, and a dedicated team ensures that all requirements are met. 550 | * **0.5:** Some efforts are made to comply with regulations, but there are significant gaps and no formal compliance program. 551 | * **0.8:** System does not meet data privacy regulations, and there are no mechanisms to ensure compliance with internal policies. 552 | 553 | ## 5. Scoring Methodology ## 554 | 555 | **Base Formula** 556 | 557 | ``` 558 | AIVSS_Score = [ 559 | (w₁ × ModifiedBaseScore) + 560 | (w₂ × AISpecificMetrics) + 561 | (w₃ × ImpactMetrics) 562 | ] × TemporalMetrics × MitigationMultiplier 563 | 564 | Where: 0 ≤ AIVSS_Score ≤ 10 565 | ``` 566 | 567 | * **w₁, w₂, w₃:** Weights assigned to each component (Modified Base, AI-Specific, Impact). 
Suggested starting point: w₁ = 0.3, w₂ = 0.5, w₃ = 0.2 (giving more weight to AI-specific risks). Adjust based on the specific AI system and its risk profile. 568 | * **TemporalMetrics:** Adjustments based on exploitability, remediation level, and report confidence (similar to CVSS Temporal Score). 569 | * **Exploitability (E):** 570 | * Not Defined (ND): 1.0 571 | * Unproven (U): 0.9 572 | * Proof-of-Concept (P): 0.95 573 | * Functional (F): 1.0 574 | * High (H): 1.0 575 | * **Remediation Level (RL):** 576 | * Not Defined (ND): 1.0 577 | * Official Fix (O): 0.95 578 | * Temporary Fix (T): 0.96 579 | * Workaround (W): 0.97 580 | * Unavailable (U): 1.0 581 | * **Report Confidence (RC):** 582 | * Not Defined (ND): 1.0 583 | * Unknown (U): 0.92 584 | * Reasonable (R): 0.96 585 | * Confirmed (C): 1.0 586 | * **MitigationMultiplier:** A factor (ranging from 1.0 to 1.5) that *increases* the score based on the *lack* of effective mitigations. 1.0 = Strong Mitigation; 1.5 = No/Weak Mitigation. 587 | 588 | ## 6. Component Calculations ## 589 | 590 | **1. Modified Base Metrics** 591 | 592 | ``` 593 | ModifiedBaseScore = min(10, [MAV × MAC × MPR × MUI × MS] × ScopeMultiplier) 594 | ``` 595 | Where the Modified Base Metrics (MAV, MAC, MPR, MUI, MS) are derived from the Base Metrics, adjusted according to the specific environment and using the Environmental Metrics. Each Modified Base Metric can be rated the same way as the Base Metrics, with the addition of: 596 | 597 | * Not Defined (X): Uses the unmodified Base Metric value. 598 | 599 | **2. AI-Specific Metrics** 600 | 601 | ``` 602 | AISpecificMetrics = [MR × DS × EI × DC × AD × AA × LL × GV × CS] × ModelComplexityMultiplier 603 | ``` 604 | 605 | * Each metric (MR, DS, EI, DC, AD, AA, LL, GV, CS) is scored from 0.0 to 1.0 based on the severity of the vulnerability in each sub-category, using the detailed scoring rubric provided above (higher score = more severe issue). 606 | * **ModelComplexityMultiplier:** A factor (1.0 to 1.5) to account for the increased attack surface and complexity of more advanced models. 607 | 608 | **3. Impact Metrics** 609 | 610 | ``` 611 | ImpactMetrics = (C + I + A + SI) / 4 612 | ``` 613 | 614 | * **C (Confidentiality Impact):** Impact on data confidentiality. 615 | * **I (Integrity Impact):** Impact on data and system integrity. 616 | * **A (Availability Impact):** Impact on system availability. 617 | * **SI (Societal Impact):** Broader societal harms (e.g., discrimination, manipulation). Informed by the EI (Ethical Implications) sub-categories. 618 | 619 | **Severity Levels (for C, I, A, SI):** 620 | 621 | * None: 0.0 622 | * Low: 0.22 623 | * Medium: 0.55 624 | * High: 0.85 625 | * Critical: 1.0 626 | 627 | **4. Environmental Score** 628 | 629 | The Environmental Score is calculated by modifying the Base Score with the Environmental metrics. The formula integrates these considerations: 630 | 631 | ``` 632 | EnvironmentalScore = [(ModifiedBaseScore + (Environmental Component)) × TemporalMetrics] × (1 + EnvironmentalMultiplier) 633 | ``` 634 | 635 | Environmental Component is derived from the AI-Specific Metrics, adjusted based on the environmental context: 636 | 637 | ``` 638 | EnvironmentalComponent = [CR × IR × AR × SIR] × AISpecificMetrics 639 | ``` 640 | 641 | Where: 642 | 643 | * CR, IR, AR, SIR are the Confidentiality, Integrity, Availability, and Societal Impact Requirements, respectively. 644 | * EnvironmentalMultiplier adjusts the score based on specific environmental factors not covered by CR, IR, AR, SIR. 
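
To make the component formulas above concrete, the following is a minimal Python sketch of the Section 6 calculations. It is illustrative only: the helper names and default weights are our own choices, not part of the specification, and it assumes each metric has already been scored according to the rubric in Section 4.

```python
# Illustrative helpers for the Section 6 component calculations.
# Function names and defaults are examples only; inputs are assumed to be
# metric values already scored per the AIVSS rubric (higher = more severe).

def modified_base_score(mav, mac, mpr, mui, ms, scope_multiplier=1.0):
    """ModifiedBaseScore = min(10, [MAV x MAC x MPR x MUI x MS] x ScopeMultiplier)."""
    return min(10, mav * mac * mpr * mui * ms * scope_multiplier)

def ai_specific_metrics(mr, ds, ei, dc, ad, aa, ll, gv, cs, model_complexity_multiplier=1.0):
    """AISpecificMetrics = [MR x DS x EI x DC x AD x AA x LL x GV x CS] x ModelComplexityMultiplier."""
    return mr * ds * ei * dc * ad * aa * ll * gv * cs * model_complexity_multiplier

def impact_metrics(c, i, a, si):
    """ImpactMetrics = (C + I + A + SI) / 4."""
    return (c + i + a + si) / 4

def environmental_component(cr, ir, ar, sir, ai_specific):
    """EnvironmentalComponent = [CR x IR x AR x SIR] x AISpecificMetrics."""
    return cr * ir * ar * sir * ai_specific

def environmental_score(mod_base, env_component, temporal, environmental_multiplier=0.0):
    """EnvironmentalScore = [(ModifiedBaseScore + EnvironmentalComponent) x Temporal] x (1 + EnvironmentalMultiplier)."""
    return min(10, (mod_base + env_component) * temporal * (1 + environmental_multiplier))

def aivss_score(mod_base, ai_specific, impact, temporal, mitigation, w1=0.3, w2=0.5, w3=0.2):
    """AIVSS_Score = [(w1 x Base) + (w2 x AI-Specific) + (w3 x Impact)] x Temporal x Mitigation, clamped to 0-10."""
    return min(10, (w1 * mod_base + w2 * ai_specific + w3 * impact) * temporal * mitigation)
```

The worked example in Section 9 performs these same calculations inline; expressing them as functions simply makes it easier to plug in different metric values or weights.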
645 | 646 | **Risk Categories** 647 | 648 | ``` 649 | Critical: 9.0 - 10.0 650 | High: 7.0 - 8.9 651 | Medium: 4.0 - 6.9 652 | Low: 0.1 - 3.9 653 | None: 0.0 654 | ``` 655 | 656 | ## 7. Implementation Guide ## 657 | 658 | **Prerequisites** 659 | 660 | * Access to AI system architecture details 661 | * Security assessment tools 662 | * Understanding of ML/AI concepts and the specific AI model under assessment. 663 | * Expertise in ethical AI principles and potential societal impacts. 664 | * Familiarity with cloud security principles, particularly the CSA LLM Threat Taxonomy if the AI system is cloud-based or an LLM. 665 | * Experience with vulnerability analysis, particularly in the context of AI/ML systems. 666 | 667 | **Roles and Responsibilities:** 668 | 669 | * **AI Security Team/Specialist:** Leads the AIVSS assessment, coordinates with other teams, ensures accuracy and completeness. 670 | * **AI Developers/Data Scientists:** Provide technical details, assist in identifying vulnerabilities, implement mitigations. 671 | * **Security Engineers:** Assess base metrics, evaluate the security of development, training, and deployment environments, contribute to the overall assessment. 672 | * **Compliance/Risk Officer:** Ensures alignment with regulations and organizational risk management frameworks. 673 | * **Ethical AI Officer/Review Board:** Evaluates ethical implications and provides guidance on mitigating ethical risks. 674 | 675 | ## 8. AIVSS Assessment Checklist ## 676 | 677 | This checklist provides a simplified and actionable guide for organizations to conduct an AIVSS assessment. 678 | 679 | **Phase 1: System and Environment Definition** 680 | 681 | * [ ] **1.1** Identify the AI system to be assessed, including its components, data flows, and dependencies. 682 | * [ ] **1.2** Define the system's operational environment, including its deployment model (cloud, on-premise, hybrid), network configuration, and user base. 683 | * [ ] **1.3** Determine the Environmental metrics (CR, IR, AR, SIR) based on the system's specific context. 684 | * [ ] **1.4** Document the Modified Base Metrics (MAV, MAC, MPR, MUI, MS) based on environmental factors. 685 | 686 | **Phase 2: Base and AI-Specific Metrics Evaluation** 687 | 688 | * [ ] **2.1** Evaluate the Base Metrics (AV, AC, PR, UI, S) based on the identified vulnerability. 689 | * [ ] **2.2** Assess each AI-Specific Metric (MR, DS, EI, DC, AD, AA, LL, GV, CS) using the detailed scoring rubric: 690 | * [ ] **2.2.1** Model Robustness (MR): Evasion Resistance, Gradient Masking, Robustness Certification. 691 | * [ ] **2.2.2** Data Sensitivity (DS): Data Confidentiality, Data Integrity, Data Provenance. 692 | * [ ] **2.2.3** Ethical Implications (EI): Bias and Discrimination, Transparency and Explainability, Accountability, Societal Impact. 693 | * [ ] **2.2.4** Decision Criticality (DC): Safety-Critical, Financial Impact, Reputational Damage, Operational Disruption. 694 | * [ ] **2.2.5** Adaptability (AD): Continuous Monitoring, Retraining Capabilities, Threat Intelligence Integration, Adversarial Training. 695 | * [ ] **2.2.6** Adversarial Attack Surface (AA): Model Inversion, Model Extraction, Membership Inference. 696 | * [ ] **2.2.7** Lifecycle Vulnerabilities (LL): Development, Training, Deployment, Operations. 697 | * [ ] **2.2.8** Governance and Validation (GV): Compliance, Auditing, Risk Management, Human Oversight, Ethical Framework Alignment. 
698 | * [ ] **2.2.9** Cloud Security Alliance LLM Taxonomy (CS): Model Manipulation, Data Poisoning, Sensitive Data Disclosure, Model Stealing, Failure/Malfunctioning, Insecure Supply Chain, Insecure Apps/Plugins, Denial of Service (DoS), Loss of Governance/Compliance.
699 | * [ ] **2.3** Determine the Model Complexity Multiplier based on the assessed AI model.
700 | 
701 | **Phase 3: Impact and Temporal Assessment**
702 | 
703 | * [ ] **3.1** Assess the Impact Metrics (C, I, A, SI) based on the potential consequences of the vulnerability.
704 | * [ ] **3.2** Evaluate the Temporal Metrics (E, RL, RC) based on the current exploitability, available remediation, and report confidence.
705 | 
706 | **Phase 4: Mitigation and Scoring**
707 | 
708 | * [ ] **4.1** Evaluate the effectiveness of existing mitigations and determine the Mitigation Multiplier.
709 | * [ ] **4.2** Calculate the Modified Base Score.
710 | * [ ] **4.3** Calculate the AI-Specific Metrics Score.
711 | * [ ] **4.4** Calculate the Impact Metrics Score.
712 | * [ ] **4.5** Calculate the Environmental Component.
713 | * [ ] **4.6** Calculate the Environmental Score.
714 | * [ ] **4.7** Generate the final AIVSS Score using the formula.
715 | 
716 | **Phase 5: Reporting and Remediation**
717 | 
718 | * [ ] **5.1** Document the assessment findings in a comprehensive report, including the AIVSS score, detailed metric scores, justifications, and supporting evidence.
719 | * [ ] **5.2** Communicate the assessment results to relevant stakeholders (technical teams, management, board of directors).
720 | * [ ] **5.3** Develop and prioritize recommendations for remediation based on the AIVSS score and the identified vulnerabilities.
721 | * [ ] **5.4** Implement the recommended mitigations and track progress.
722 | * [ ] **5.5** Re-assess the AI system after implementing mitigations to validate their effectiveness and update the AIVSS score.
723 | 
724 | ## 9. Example Assessment: ##
725 | 
726 | ```python
727 | # Example vulnerability assessment (illustrative and simplified; qualitative labels are shown as comments)
728 | vulnerability = {
729 |     'attack_vector': 0.85,            # Network
730 |     'attack_complexity': 0.44,        # High
731 |     'privileges_required': 0.62,      # Low
732 |     'user_interaction': 0.85,         # None
733 |     'scope': 1.0,                     # Unchanged
734 |     'model_robustness': {
735 |         'evasion_resistance': 0.7,    # High susceptibility to evasion
736 |         'gradient_masking': 0.8,      # Gradients easily accessible
737 |     },
738 |     'data_sensitivity': {
739 |         'data_confidentiality': 0.9,  # Sensitive data with minimal protection
740 |         'data_integrity': 0.7         # No data integrity checks
741 |     },
742 |     'ethical_impact': {
743 |         'bias_discrimination': 0.8,   # High risk of discriminatory outcomes
744 |         'transparency_explainability': 0.7,  # Black-box system
745 |     },
746 |     'cloud_security': {
747 |         'model_manipulation': 0.8,    # Vulnerable to prompt injection
748 |         'data_poisoning': 0.6,        # Some risk of data poisoning
749 |         'sensitive_data_disclosure': 0.7,  # Risk of sensitive data leakage
750 |         'model_stealing': 0.5,        # Some model stealing mitigations, but risks remain
751 |         'failure_malfunctioning': 0.7,  # Risk of failures
752 |         'insecure_supply_chain': 0.6,  # Some supply chain risks
753 |         'insecure_apps_plugins': 0.4,  # Some app/plugin security, but risks remain
754 |         'denial_of_service': 0.8,     # Vulnerable to DoS
755 |         'loss_of_governance_compliance': 0.7  # Risk of non-compliance
756 |     },
757 |     # ... (Other AI-specific metrics with sub-categories)
758 |     'confidentiality_impact': 0.85,   # High
759 |     'integrity_impact': 0.55,         # Medium
760 |     'availability_impact': 0.22,      # Low
761 |     'societal_impact': 0.55,          # Medium
762 |     'temporal_metrics': {
763 |         'exploitability': 0.95,       # Proof-of-Concept
764 |         'remediation_level': 0.96,    # Temporary Fix
765 |         'report_confidence': 1.0      # Confirmed
766 |     },
767 |     'mitigation_multiplier': 1.4,     # Example: Weak Mitigation
768 |     'model_complexity_multiplier': 1.4,  # Example: Complex model (e.g., large language model)
769 |     'environmental_metrics': {
770 |         'cr': 1.5,                    # High
771 |         'ir': 1.0,                    # Medium
772 |         'ar': 0.5,                    # Low
773 |         'sir': 1.0,                   # Medium
774 |     },
775 | 
776 | }
777 | 
778 | # Adjust Base Metrics according to environment
779 | vulnerability['modified_attack_vector'] = vulnerability['attack_vector']  # No change
780 | vulnerability['modified_attack_complexity'] = vulnerability['attack_complexity'] * 0.5  # Example: Lower complexity in this environment
781 | vulnerability['modified_privileges_required'] = vulnerability['privileges_required']  # No change
782 | vulnerability['modified_user_interaction'] = vulnerability['user_interaction']  # No change
783 | vulnerability['modified_scope'] = vulnerability['scope']  # No change
784 | 
785 | # Calculate score (simplified and illustrative)
786 | 
787 | # Modified Base Metrics
788 | modified_base_score = min(10, vulnerability['modified_attack_vector'] * vulnerability['modified_attack_complexity'] * vulnerability['modified_privileges_required'] * vulnerability['modified_user_interaction'] * vulnerability['modified_scope'])  # ≈ 0.098
789 | 
790 | # AI-Specific Metrics - Example calculations (using average for simplicity):
791 | mr_score = (vulnerability['model_robustness']['evasion_resistance'] +
792 |             vulnerability['model_robustness']['gradient_masking']) / 2  # = 0.75
793 | ds_score = (vulnerability['data_sensitivity']['data_confidentiality'] +
794 |             vulnerability['data_sensitivity']['data_integrity']) / 2  # = 0.8
795 | ei_score = (vulnerability['ethical_impact']['bias_discrimination'] +
796 |             vulnerability['ethical_impact']['transparency_explainability']) / 2  # = 0.75
797 | 
798 | # Cloud Security (CS) - using the detailed rubric and averaging:
799 | cs_score = (vulnerability['cloud_security']['model_manipulation'] +
800 |             vulnerability['cloud_security']['data_poisoning'] +
801 |             vulnerability['cloud_security']['sensitive_data_disclosure'] +
802 |             vulnerability['cloud_security']['model_stealing'] +
803 |             vulnerability['cloud_security']['failure_malfunctioning'] +
804 |             vulnerability['cloud_security']['insecure_supply_chain'] +
805 |             vulnerability['cloud_security']['insecure_apps_plugins'] +
806 |             vulnerability['cloud_security']['denial_of_service'] +
807 |             vulnerability['cloud_security']['loss_of_governance_compliance']) / 9  # ≈ 0.64
808 | 
809 | # Assume other AI-specific metrics are calculated and we have these scores (Illustrative):
810 | dc_score = 0.7
811 | ad_score = 0.55
812 | aa_score = 0.6
813 | ll_score = 0.75
814 | gv_score = 0.8
815 | 
816 | # Calculate the overall AI-Specific Metrics score:
817 | ais_score = (mr_score * ds_score * ei_score * dc_score * ad_score * aa_score * ll_score * gv_score * cs_score) * vulnerability['model_complexity_multiplier']  # ≈ 0.056
818 | 
819 | # Impact Metrics
820 | impact_score = (vulnerability['confidentiality_impact'] + vulnerability['integrity_impact'] + vulnerability['availability_impact'] + vulnerability['societal_impact']) / 4  # ≈ 0.54
821 | 
822 | # Temporal Metrics - using average for simplicity:
823 | temporal_score = (vulnerability['temporal_metrics']['exploitability'] +
824 |                   vulnerability['temporal_metrics']['remediation_level'] +
825 |                   vulnerability['temporal_metrics']['report_confidence']) / 3  # = 0.97
826 | 
827 | # Environmental Component
828 | environmental_component = (vulnerability['environmental_metrics']['cr'] * vulnerability['environmental_metrics']['ir'] * vulnerability['environmental_metrics']['ar'] * vulnerability['environmental_metrics']['sir']) * ais_score  # ≈ 0.042
829 | # Environmental Score
830 | environmental_score = min(10, ((modified_base_score + environmental_component) * temporal_score) * vulnerability['mitigation_multiplier'])  # ≈ 0.19
831 | 
832 | # Final AIVSS Score
833 | final_score = ((0.3 * modified_base_score) + (0.5 * ais_score) + (0.2 * impact_score)) * temporal_score * vulnerability['mitigation_multiplier']  # ≈ 0.23
834 | ```
835 | 
836 | **Interpretation of the Example:**
837 | 
838 | In this example, the final AIVSS score is approximately **0.23**, which falls into the **Low** risk category. The Environmental Score (≈ 0.19) is calculated alongside it, adjusting the Modified Base Score to reflect the system's specific environmental requirements.
839 | 
840 | **Important Considerations:**
841 | 
842 | * **Illustrative:** This is a simplified example. Real-world assessments will involve a more thorough evaluation of each sub-category.
843 | * **Weighting:** The weights (w₁, w₂, w₃) can significantly influence the final score. Organizations should carefully consider the appropriate weights based on their specific risk profiles.
844 | * **Context:** The interpretation of the AIVSS score should always be done in the context of the specific AI system, its intended use, and the potential consequences of a security incident.
845 | 
846 | ## 10. Reporting and Communication: ##
847 | 
848 | * **Assessment Report:** A comprehensive report should be generated, including:
849 |     * A summary of the AI system being assessed.
850 |     * The final AIVSS score and risk category.
851 |     * Detailed scores for each metric and sub-category.
852 |     * Justification for the assigned scores, referencing the scoring rubric and evidence gathered.
853 |     * An analysis of the key vulnerabilities and their potential impact.
854 |     * Recommendations for mitigation, prioritized based on the severity of the vulnerabilities.
855 |     * Appendices with supporting documentation (e.g., threat models, assessment data).
856 | * **Communication:** Assessment findings should be communicated to relevant stakeholders, including:
857 |     * **Technical Teams:** To inform remediation efforts.
858 |     * **Management:** To support risk management decisions.
859 |     * **Board of Directors:** To provide an overview of the organization's AI security posture.
860 | 
861 | ## 11. Integration with Risk Management Frameworks: ##
862 | 
863 | * **Mapping:** AIVSS metrics can be mapped to existing risk categories within organizational risk management frameworks (e.g., NIST Cybersecurity Framework, ISO 27001).
864 | * **Risk Assessments:** AIVSS assessments can be incorporated into broader risk assessments.
865 | * **Audits:** AIVSS can be used as a framework for conducting audits of AI systems.
866 | 
867 | ## 12. Appendix: AI Threat Taxonomies ##
868 | 
869 | | Taxonomy | Description | Link |
870 | | :--- | :--- | :--- |
871 | | **MITRE ATLAS** | A knowledge base of adversary tactics and techniques based on real-world observations, specifically focused on threats to machine learning systems. It provides a framework for understanding the adversarial ML lifecycle and includes case studies of attacks. | [https://atlas.mitre.org/](https://atlas.mitre.org/) |
872 | | **NIST AI 100-2 E2023** | A taxonomy of adversarial machine learning, including attacks, defenses, and consequences. It provides a detailed framework for understanding and categorizing threats to AI systems and offers guidance on risk management. | [https://csrc.nist.gov/pubs/ai/100/2/e2023/final](https://csrc.nist.gov/pubs/ai/100/2/e2023/final) |
873 | | **EU HLEG Trustworthy AI** | Ethics guidelines for trustworthy AI developed by the European Commission's High-Level Expert Group on Artificial Intelligence. It focuses on human-centric AI principles, including fairness, transparency, accountability, and societal well-being. | https://digital-strategy.ec.europa |
874 | | **ISO/IEC JTC 1/SC 42** | An international standards body developing standards for artificial intelligence. It covers various aspects of AI, including risk management, trustworthiness, bias, and governance. | [https://www.iso.org/committee/6794475.html](https://www.iso.org/committee/6794475.html) |
875 | | **AI Incident Database** | A database of real-world incidents involving AI systems, including failures, accidents, and malicious attacks. It provides valuable data for understanding the risks associated with AI and informing risk management strategies. | [https://incidentdatabase.ai/](https://incidentdatabase.ai/) |
876 | | **DARPA's GARD** | The Guaranteeing AI Robustness against Deception (GARD) program aims to develop defenses against adversarial attacks on AI systems. It focuses on developing robust AI that can withstand attempts to deceive or manipulate it. | [https://www.darpa.mil/research/programs/guaranteeing-ai-robustness-against-deception](https://www.darpa.mil/research/programs/guaranteeing-ai-robustness-against-deception) |
877 | | **OECD AI Principles** | Principles for responsible stewardship of trustworthy AI, adopted by the Organisation for Economic Co-operation and Development (OECD). They cover aspects such as inclusive growth, human-centered values, transparency, robustness, and accountability. | [https://oecd.ai/en/ai-principles](https://oecd.ai/en/ai-principles) |
878 | | **MITRE Atlas Matrix** | Adversarial ML Threat Matrix is a framework that captures the tactics, techniques, and procedures used by adversaries to attack ML systems. It is structured similarly to the ATT&CK framework but specialized for the domain of machine learning. | [https://atlas.mitre.org/](https://atlas.mitre.org/) |
879 | | **CSA LLM Threat Taxonomy** | Defines common threats related to large language models in the cloud.
Key categories include model manipulation, data poisoning, sensitive data disclosure, model stealing, and others specific to cloud-based LLM deployments. | [https://cloudsecurityalliance.org/artifacts/csa-large-language-model-llm-threats-taxonomy](https://cloudsecurityalliance.org/artifacts/csa-large-language-model-llm-threats-taxonomy) | 880 | | **MIT AI Threat Taxonomy** | Comprehensive classification of attack surfaces, adversarial techniques, and governance vulnerabilities of AI. It details various types of attacks and provides mitigation strategies. | [https://arxiv.org/pdf/2408.12622](https://arxiv.org/pdf/2408.12622) | 881 | | **OWASP Top 10 for LLMs** | Highlights the most critical security risks for large language model applications. It covers vulnerabilities like prompt injection, data leakage, insecure output handling, and model denial of service, among others. | [https://owasp.org/www-project-top-10-for-large-language-model-applications/](https://owasp.org/www-project-top-10-for-large-language-model-applications/) | 882 | 883 | **Note:** This table is intended to be a starting point and may not be exhaustive. New taxonomies and frameworks may emerge as the field of AI security continues to evolve. 884 | 885 | ## 13. Continuous Improvement ## 886 | 887 | This AIVSS framework should be treated as a living document. It will be revised and updated. Organizations are encouraged to provide feedback, contribute to its development, and adapt it to their specific needs. Regular updates will be released to incorporate new research, threat intelligence, and best practices. 888 | 889 | ## 14. Conclusion ## 890 | 891 | The AIVSS framework provides a structured and comprehensive approach to assessing and quantifying the security risks of AI systems. By using this framework, along with the provided checklist, organizations can gain a better understanding of their AI-specific vulnerabilities, prioritize remediation efforts, and improve their overall AI security posture. The detailed scoring rubric, the inclusion of relevant AI threat taxonomies, the addition of the Environmental Score, and the focus on practical implementation make AIVSS a valuable tool for securing the future of AI. Continuous improvement, community engagement, and adaptation to the evolving threat landscape will be crucial for the long-term success and adoption of AIVSS. 892 | 893 | **Disclaimer:** 894 | 895 | AIVSS is a framework for assessing and scoring the security risks of AI systems. It is not a guarantee of security, and it should not be used as the sole basis for making security decisions. Organizations should use AIVSS in conjunction with other security best practices and should always exercise their own judgment when assessing and mitigating AI security risks. 896 | --------------------------------------------------------------------------------