├── component-diagram.png ├── state-machine-diagram.png ├── LICENSE ├── SECURITY.md ├── .github └── workflows │ └── linter.yml ├── README.md ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md └── DESIGN.md /component-diagram.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/godaddy/ans-registry/main/component-diagram.png -------------------------------------------------------------------------------- /state-machine-diagram.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/godaddy/ans-registry/main/state-machine-diagram.png -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright 2025 GoDaddy Operating Company, LLC. 2 | 3 | Licensed under the Apache License, Version 2.0 (the "License"); 4 | you may not use this file except in compliance with the License. 5 | You may obtain a copy of the License at 6 | 7 | http://www.apache.org/licenses/LICENSE-2.0 8 | 9 | Unless required by applicable law or agreed to in writing, software 10 | distributed under the License is distributed on an "AS IS" BASIS, 11 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | See the License for the specific language governing permissions and 13 | limitations under the License. -------------------------------------------------------------------------------- /SECURITY.md: -------------------------------------------------------------------------------- 1 | # Reporting Security Issues 2 | 3 | We take security very seriously at GoDaddy. We appreciate your efforts to 4 | responsibly disclose your findings, and will make every effort to acknowledge 5 | your contributions. 6 | 7 | ## Where should I report security issues? 8 | 9 | In order to give the community time to respond and upgrade, we strongly urge you 10 | to report all security issues privately. 11 | 12 | To report a security issue in one of our Open Source projects, email us directly 13 | at **oss@godaddy.com** and include the word "SECURITY" in the subject line. 14 | 15 | This mail is delivered to our Open Source Security team. 16 | 17 | After the initial reply to your report, the team will keep you informed of the 18 | progress being made towards a fix and announcement, and may ask for additional 19 | information or guidance. 
-------------------------------------------------------------------------------- /.github/workflows/linter.yml: -------------------------------------------------------------------------------- 1 | --- 2 | name: Lint 3 | 4 | on: 5 | push: 6 | 7 | permissions: {} 8 | 9 | jobs: 10 | lint: 11 | name: Lint 12 | runs-on: ubuntu-latest 13 | 14 | permissions: 15 | contents: read 16 | packages: read 17 | # To report GitHub Actions status checks 18 | statuses: write 19 | 20 | steps: 21 | - name: Checkout code 22 | uses: actions/checkout@v5 23 | with: 24 | # super-linter needs the full git history to get the 25 | # list of files that changed across commits 26 | fetch-depth: 0 27 | persist-credentials: false 28 | 29 | - name: Super-linter 30 | uses: super-linter/super-linter@v8.2.1 # x-release-please-version 31 | env: 32 | # To report GitHub Actions status checks 33 | GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} 34 | # Default branch for comparison (only changed files are linted) 35 | DEFAULT_BRANCH: main 36 | # Only lint changed files for better performance 37 | VALIDATE_ALL_CODEBASE: false 38 | # Enable markdown linting (primary file type in this repo) 39 | VALIDATE_MARKDOWN: true 40 | # Enable YAML linting (for workflow files) 41 | VALIDATE_YAML: true 42 | # Generate GitHub Actions step summary for better visibility 43 | ENABLE_GITHUB_ACTIONS_STEP_SUMMARY: true 44 | 45 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Agent Name Service (ANS) Registry 2 | 3 | A production-ready registry system for secure AI agent discovery and identity verification. The ANS Registry enables autonomous agents to find and trust each other across organizational boundaries without requiring bilateral agreements. 4 | 5 | ## Status 6 | 7 | The registry is operational with REST APIs for agent discovery and search. 8 | 9 | > **Repository Intent**: This repository follows a design-first approach with OpenAPI specification alignment. The architecture and API contracts are defined in the design documentation, ensuring implementation consistency and enabling API-first development practices. 10 | 11 | ## Overview 12 | 13 | The ANS Registry provides cryptographic identity and trust infrastructure for AI agents. Every agent identity is anchored to a verifiable Fully Qualified Domain Name (FQDN), creating a permanent, discoverable address that remains stable while agent software versions evolve. 14 | 15 | ## Core Design Principles 16 | 17 | 1. **FQDN-Anchored Identity**: Agent identities are cryptographically bound to globally resolvable domain names, providing a verifiable trust anchor. 18 | 2. **Dual Certificate Model**: 19 | - **Public Server Certificate**: Long-lasting certificate for the stable FQDN (time-based lifecycle) 20 | - **Private Identity Certificate**: Event-driven certificate for specific versioned agents (immutability-enforced) 21 | 3. **Strict Immutability**: Any code or metadata change requires a new version and registration, ensuring accountability per software version. 22 | 4. **Transparency Log**: Immutable, append-only ledger using Merkle trees, providing cryptographic proof of all registration history. 23 | 5. **Decentralized Discovery**: Registration Authority publishes lifecycle events to pub/sub; third-party services build competitive discovery indexes. 
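For illustration, the design documentation's canonical example pairs a version-bound identity with the stable, FQDN-anchored address it resolves to (the values below are examples only):

```text
# Version-bound ANSName (immutable; a new one is registered for every release)
mcp://sentimentAnalyzer.textAnalysis.PID-1234.v1.0.0.example.com

# Stable FQDN trust anchor (unchanged as software versions evolve)
sentimentAnalyzer.example.com
```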
24 | 25 | ## Key Features 26 | 27 | - **PKI-Based Trust**: Certificate Authority and Registration Authority issue X.509 certificates for agent authentication 28 | - **Version-Centric Lifecycle**: Event-driven registration where each software version gets its own immutable identity 29 | - **Protocol-Agnostic**: Supports A2A, MCP, and ACP protocols through a unified registration model 30 | - **Domain Control Validation**: ACME DNS-01 challenge verifies agent ownership before registration 31 | - **Cryptographic Verification**: Merkle inclusion proofs enable independent verification of agent registrations 32 | 33 | ## Architecture 34 | 35 | The system consists of three main components: 36 | 37 | - **Registration Authority (RA)**: Orchestrates validation, certificate issuance, and log sealing 38 | - **Transparency Log (TL)**: Immutable, cryptographically verifiable ledger of all agent lifecycle events 39 | - **Key Management System (KMS)**: Centralized root of trust for signing Merkle tree roots 40 | 41 | ## Documentation 42 | 43 | - **[DESIGN.md](DESIGN.md)**: Complete architecture and design documentation 44 | 45 | ## Design Goals 46 | 47 | The ANS Registry addresses the O(n²) scaling problem of bilateral agent agreements by providing: 48 | 49 | - **Universal Discovery**: Standard method for agents to discover each other across protocols 50 | - **Automated Trust**: Cryptographic identity verification without manual configuration 51 | - **Auditability**: Complete, verifiable history of all agent registrations and lifecycle events 52 | - **Ecosystem Enablement**: Foundation for competitive marketplaces and discovery services 53 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | In the interest of fostering an open and welcoming environment, we as 6 | contributors and maintainers pledge to making participation in our project and 7 | our community a harassment-free experience for everyone, regardless of age, body 8 | size, disability, ethnicity, sex characteristics, gender identity and expression, 9 | level of experience, education, socio-economic status, nationality, personal 10 | appearance, race, religion, or sexual identity and orientation. 11 | 12 | ## Our Standards 13 | 14 | Examples of behavior that contributes to creating a positive environment 15 | include: 16 | 17 | * Using welcoming and inclusive language 18 | * Being respectful of differing viewpoints and experiences 19 | * Gracefully accepting constructive criticism 20 | * Focusing on what is best for the community 21 | * Showing empathy towards other community members 22 | 23 | Examples of unacceptable behavior by participants include: 24 | 25 | * The use of sexualized language or imagery and unwelcome sexual attention or 26 | advances 27 | * Trolling, insulting/derogatory comments, and personal or political attacks 28 | * Public or private harassment 29 | * Publishing others' private information, such as a physical or electronic 30 | address, without explicit permission 31 | * Other conduct which could reasonably be considered inappropriate in a 32 | professional setting 33 | 34 | ## Our Responsibilities 35 | 36 | Project maintainers are responsible for clarifying the standards of acceptable 37 | behavior and are expected to take appropriate and fair corrective action in 38 | response to any instances of unacceptable behavior. 
39 | 40 | Project maintainers have the right and responsibility to remove, edit, or 41 | reject comments, commits, code, wiki edits, issues, and other contributions 42 | that are not aligned to this Code of Conduct, or to ban temporarily or 43 | permanently any contributor for other behaviors that they deem inappropriate, 44 | threatening, offensive, or harmful. 45 | 46 | ## Scope 47 | 48 | This Code of Conduct applies within all project spaces, and it also applies when 49 | an individual is representing the project or its community in public spaces. 50 | Examples of representing a project or community include using an official 51 | project e-mail address, posting via an official social media account, or acting 52 | as an appointed representative at an online or offline event. Representation of 53 | a project may be further defined and clarified by project maintainers. 54 | 55 | ## Enforcement 56 | 57 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 58 | reported by contacting the project team at oss@godaddy.com. All 59 | complaints will be reviewed and investigated and will result in a response that 60 | is deemed necessary and appropriate to the circumstances. The project team is 61 | obligated to maintain confidentiality with regard to the reporter of an incident. 62 | Further details of specific enforcement policies may be posted separately. 63 | 64 | Project maintainers who do not follow or enforce the Code of Conduct in good 65 | faith may face temporary or permanent repercussions as determined by other 66 | members of the project's leadership. 67 | 68 | ## Attribution 69 | 70 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, 71 | available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html 72 | 73 | [homepage]: https://www.contributor-covenant.org 74 | 75 | For answers to common questions about this code of conduct, see 76 | https://www.contributor-covenant.org/faq 77 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | Everyone is welcome to contribute to GoDaddy's Open Source Software. 4 | Contributing doesn’t just mean submitting pull requests. To get involved, 5 | you can report or triage bugs, and participate in discussions on the 6 | evolution of each project. 7 | 8 | No matter how you want to get involved, we ask that you first learn what’s 9 | expected of anyone who participates in the project by reading the Contribution 10 | Guidelines and our [Code of Conduct][coc]. 11 | 12 | ## Answering Questions 13 | 14 | One of the most important and immediate ways you can support this project is 15 | to answer questions in [Github][issues]. Whether you’re 16 | helping a newcomer understand a feature or troubleshooting an edge case with a 17 | seasoned developer, your knowledge and experience with a programming language 18 | can go a long way to help others. 19 | 20 | ## Reporting Bugs 21 | 22 | **Do not report potential security vulnerabilities here. Refer to 23 | [SECURITY.md](./SECURITY.md) for more details about the process of reporting 24 | security vulnerabilities.** 25 | 26 | Before submitting a ticket, please search our [Issue Tracker][issues] to make 27 | sure it does not already exist and have a simple replication of the behavior. 
If 28 | the issue is isolated to one of the dependencies of this project, please create 29 | a GitHub issue in that project. All dependencies should be open source software 30 | and can be found on GitHub. 31 | 32 | Submit a ticket for your issue, assuming one does not already exist: 33 | - Create it on the project's [Issue Tracker][issues]. 34 | - Clearly describe the issue by following the template layout. 35 | - Make sure to include steps to reproduce the bug. 36 | - A reproducible (unit) test could be helpful in solving the bug. 37 | - Describe the environment that (re)produced the problem. 38 | 39 | ## Triaging bugs or contributing code 40 | 41 | If you're triaging a bug, first make sure that you can reproduce it. Once a bug 42 | can be reproduced, reduce it to the smallest amount of code possible. Reasoning 43 | about a sample or unit test that reproduces a bug in just a few lines of code 44 | is easier than reasoning about a longer sample. 45 | 46 | From a practical perspective, contributions are as simple as: 47 | 1. Fork and clone the repo, [see GitHub's instructions if you need help.][fork] 48 | 1. Create a branch for your PR with `git checkout -b pr/your-branch-name` 49 | 1. Make changes on the branch of your forked repository. 50 | 1. When committing, reference your issue (if present) and include a note about 51 | the fix. 52 | 1. Please also add/update unit tests for your changes. 53 | 1. Push the changes to your fork and submit a pull request to the main 54 | development branch (`main`) of the project's repository. 55 | 56 | If you are interested in making a large change and feel unsure about its overall 57 | effect, start with opening an Issue in the project's [Issue Tracker][issues] 58 | with a high-level proposal and discuss it with the core contributors through 59 | GitHub comments. After reaching a consensus with core 60 | contributors about the change, discuss the best way to go about implementing it. 61 | 62 | > Tip: Keep your main branch pointing at the original repository and make 63 | > pull requests from branches on your fork. To do this, run: 64 | > ``` 65 | > git remote add upstream https://github.com/godaddy/ans-registry.git 66 | > git fetch upstream 67 | > git branch --set-upstream-to=upstream/main main 68 | > ``` 69 | > This will add the original repository as a "remote" called "upstream", then 70 | > fetch the git information from that remote, then set your local main 71 | > branch to use the upstream main branch whenever you run git pull. Then you 72 | > can make all of your pull request branches based on this main branch. 73 | > Whenever you want to update your version of main, do a regular git pull. 74 | 75 | ## Code Review 76 | 77 | Any open source project relies heavily on code review to improve software 78 | quality. All significant changes, by all developers, must be reviewed before 79 | they are committed to the repository. Code reviews are conducted on GitHub 80 | through comments on pull requests or commits. The developer responsible for a 81 | code change is also responsible for making all necessary review-related changes. 82 | 83 | Sometimes code reviews will take longer than you would hope for, especially for 84 | larger features. Here are some accepted ways to speed up review times for your 85 | patches: 86 | 87 | - Review other people’s changes. If you help out, others will be more 88 | willing to do the same for you. 89 | - Split your change into multiple smaller changes. 
The smaller your change, 90 | the higher the probability that somebody will take a quick look at it. 91 | 92 | **Note that anyone is welcome to review and give feedback on a change, but only 93 | people with commit access to the repository can approve it.** 94 | 95 | ## Attribution of Changes 96 | 97 | When contributors submit a change to this project, after that change is 98 | approved, other developers with commit access may commit it for the author. When 99 | doing so, it is important to retain correct attribution of the contribution. 100 | Generally speaking, Git handles attribution automatically. 101 | 102 | ## Code Style and Documentation 103 | 104 | Ensure that your contribution follows the standards set by the project's style 105 | guide with respect to patterns, naming, documentation and testing. 106 | 107 | # Additional Resources 108 | 109 | - [General GitHub Documentation](https://help.github.com/) 110 | - [GitHub Pull Request documentation](https://help.github.com/send-pull-requests/) 111 | 112 | [issues]: https://github.com/godaddy/ans-registry/issues 113 | [coc]: ./CODE_OF_CONDUCT.md 114 | [fork]: https://help.github.com/en/articles/fork-a-repo -------------------------------------------------------------------------------- /DESIGN.md: -------------------------------------------------------------------------------- 1 | # Enhanced ANS/RA Software Architecture Document 2 | 3 | ## 1.0 Introduction and goals 4 | 5 | ### 1.1 Purpose 6 | This document describes the Agent Name Service (ANS) Registry system architecture. It shows different architectural views for communication between software architects, developers, and stakeholders. 7 | 8 | ### 1.2 Context and background 9 | This architecture implements the principles from initial industry drafts. It creates immutable, version-bound agent identities that coexist with traditional time-based public TLS certificates. 10 | 11 | ### 1.3 Foundational principles and departures from prior art 12 | 13 | This architecture builds on concepts from "Agent Name Service for Secure AI Agent Discovery" by Narajala, Huang, Habler, and Sheriff (OWASP). The Enhanced ANS/RA evolves these concepts for production with departures for operational robustness, verifiability, and internet-scale deployment. 14 | 15 | **1. Foundational principle: Identity is anchored to a stable FQDN** 16 | 17 | Every agent identity must anchor to a unique, globally resolvable Fully Qualified Domain Name (FQDN). 18 | 19 | The `ANSName` structure ties to a verifiable DNS asset rather than a logical construct. The `ANSName` components construct the agent's FQDN using: `agentName + . + extension`. 20 | 21 | This domain name becomes the agent's permanent, discoverable address. It remains consistent for Agent Hosting Platforms and discovery services while the versioned `ANSName` changes with software updates. Trust artifacts bind cryptographically to the AHP's verified domain control, making the domain the trust chain root. 22 | 23 | **2. Other key architectural enhancements** 24 | 25 | Building on this FQDN foundation, the architecture introduces: 26 | 27 | * **Trust bootstrapping via domain control validation:** The Registration Authority (RA) uses ACME DNS-01 challenge to verify ownership. The RA attests to agent identity only after the owner proves domain control. 28 | 29 | * **Decentralized discovery model:** A publish-subscribe model enables decentralized discovery. 
The RA publishes signed attestation events, creating a competitive market of third-party discovery services rather than centralized queries. 30 | 31 | * **Version-centric lifecycle:** Any change to agent software or capabilities requires a new versioned `ANSName` and registration. This event-driven lifecycle creates a granular audit trail compared to time-based models. 32 | 33 | * **Dual-certificate identity model:** Two certificates resolve the conflict between public web trust and software versioning. A Public Server Certificate grants universal trust for the stable FQDN. A Private Identity Certificate enables fast, automated attestation for the version-bound `ANSName`. 34 | 35 | * **Explicit discoverable schemas:** The Agent Card requires each protocol to link to a canonical JSON Schema URL. This makes schemas first-class, discoverable artifacts for developers, unlike prior models where schema logic remains internal to server-side protocol adapters. 36 | 37 | ### 1.4 Definitions, Acronyms, and Abbreviations 38 | 39 | | Term | Definition | 40 | | :--- | :--- | 41 | | **A2A** | Agent-to-Agent protocol: A protocol for agent collaboration, notably used by Google. | 42 | | **ACP** | Agent Communication Protocol: An open standard for interoperability between AI agents, built on a REST-based architecture. | 43 | | **AHP** | Agent Hosting Platform: The client system that hosts agent code and initiates all lifecycle requests on behalf of its customer. | 44 | | **ANS** | Agent Name Service: A universal directory service for secure AI agent discovery and interoperability. | 45 | | **CA** | Certificate Authority: An entity that issues digital certificates. Public APIs (OCSP) and distribution points (CRLs) are hosted by a CA to check the real-time revocation status of a certificate. | 46 | | **CSR** | Certificate Signing Request: A message sent from an applicant to a CA in order to apply for a digital identity certificate. | 47 | | **DNS** | Domain Name System: The decentralized, hierarchical naming system for computers, services, or other resources connected to the Internet. | 48 | | **FQDN** | Fully Qualified Domain Name: The complete, unambiguous domain name that specifies a specific host's absolute location in the DNS hierarchy. It is composed of the hostname and all parent domains. | 49 | | **KMS** | Key Management System: The centralized system that manages and protects the cryptographic root of trust for the registry. | 50 | | **MCP** | Model Context Protocol: A protocol that focuses on an agent's communication with its tools. | 51 | | **PKI** | Public Key Infrastructure: A set of roles, policies, and procedures needed to create, manage, and revoke digital certificates. | 52 | | **RA** | Registration Authority: The central orchestrator that automates the agent registration, validation, and attestation process. | 53 | | **SAD** | Software Architecture Document: This document. | 54 | | **TL** | Transparency Log: The immutable, cryptographically verifiable ledger that contains a record of all registered agent identities. | 55 | 56 | ### 1.5 Project goals 57 | 58 | The project creates a single immutable Transparency Log to record all agent identity events. Server Certificates separate from Identity Certificates due to different lifecycles: Server Certificates track web server uptime while Identity Certificates track software versions. Every code change requires a new identity registration. 
59 | 60 | The `ra_id` field identifies which Registration Authority instance processed each request for forensic investigation. Agent Hosting Platforms manage public key infrastructure and DNS operations, removing these burdens from agent developers. 61 | 62 | ### 1.6 Business goals and use cases 63 | 64 | The ANS Registry enables autonomous AI agents to find and trust each other across organizational boundaries. Without this trust layer, every agent provider needs bilateral agreements with every potential partner, creating an O(n²) scaling problem. 65 | 66 | The registry automates certificate lifecycle management, DNS record provisioning, and cryptographic identity binding. SMBs can integrate their customer support chatbots with third-party payment processors and CRM systems without manual configuration. The verifiable identity model enables business models where agents charge per API call or require subscriptions with cryptographic proof of service delivery. 67 | 68 | ## 2.0 Architectural views 69 | 70 | ### 2.1 Logical view (static component relationships) 71 | 72 | The system has three security domains interconnected by the Registration Authority. 73 | 74 | The Agent Platform Domain contains the Agent Hosting Platform and the public-facing DNS Provider. The Trust Authority Domain includes the Centralized Key Management System, the Transparency Log, and the Provider Registry as core trust infrastructure. The Certificate Domain contains both the Public and Private Certificate Authorities. 75 | 76 | ![High-Level Component Diagram of the ANS Registry Ecosystem](component-diagram.png) 77 | *Figure 1: High-Level Component Diagram of the ANS Registry Ecosystem* 78 | 79 | ### 2.2 Process view (dynamic interaction and flow) 80 | 81 | The Agent Registration Flow dominates system behavior. The RA receives a request, proxies validation and provisioning, aggregates the final state, seals a non-repudiable entry into the Transparency Log upon success, then delivers artifacts to the AHP. 82 | 83 | A continuous, post-registration Integrity Verification Flow runs in the background via the ANS Integrity Monitor (AIM). The AIM queries provisioned DNS records for active agents, compares them to expected states from registration, and triggers remediation when detecting unauthorized changes or discrepancies. 84 | 85 | ## 3.0 Component model 86 | 87 | The ANS Registry ecosystem contains major systems and components. The architecture connects a central Registration Authority System with the Agent Hosting Platform (AHP) System and Internet Infrastructure Dependencies. 88 | 89 | ### 3.1 The Registration Authority system 90 | 91 | The RA serves as the trusted third party at the center of the ANS ecosystem. It validates agent identities, orchestrates certificate issuance, and seals immutable records into the Transparency Log. 92 | 93 | **3.1.1 Registration Authority:** The stateful orchestration engine processes registration requests, coordinates with external services, and manages the agent lifecycle from registration through revocation. 94 | 95 | **3.1.2 Centralized Key Management System (KMS):** AWS KMS (or similar) holds the private key that signs the Transparency Log's Merkle Tree Root. This key provides the cryptographic root of trust for the system. 96 | 97 | **3.1.3 Provider Registry:** Maps immutable ProviderIDs (`PID-1234`) to current legal entity names. When "AcmeCorp" becomes "MegaCorp", one record updates instead of re-registering thousands of agents. 
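The ProviderID indirection can be pictured with a minimal sketch (illustrative only; the data structures and identifiers below are not part of the specification):

```python
# Illustrative sketch of the Provider Registry indirection (not normative).
# Agents embed the immutable ProviderID in their ANSName; only the Provider
# Registry maps that ID to the current legal entity name.

provider_registry = {
    "PID-1234": "AcmeCorp",  # current legal entity name
}

registered_agents = [
    "mcp://sentimentAnalyzer.textAnalysis.PID-1234.v1.0.0.example.com",
    "mcp://sentimentAnalyzer.textAnalysis.PID-1234.v1.1.0.example.com",
]

# A corporate rename touches exactly one record; the ANSNames above (and
# their sealed Transparency Log entries) do not change.
provider_registry["PID-1234"] = "MegaCorp"

def legal_name_for(ans_name: str) -> str:
    """Resolve the owning entity's current legal name from an ANSName."""
    provider_id = next(p for p in ans_name.split(".") if p.startswith("PID-"))
    return provider_registry[provider_id]

print(legal_name_for(registered_agents[0]))  # -> MegaCorp
```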
98 | 99 | 100 | 101 | **3.1.4 ANS Integrity Monitor:** The AIM is defined as an external, third-party ecosystem role (see Section 3.4.2). The RA may run its own AIM instance as a reference implementation and for operational awareness. The architecture treats monitoring as a decentralized function. 102 | 103 | The RA's internal AIM instance validates the attestation chain continuously. It verifies DNS records, Agent Card cryptographic integrity, and linked capability schemas against registration hashes. It triggers remediation when discrepancies occur. 104 | 105 | **3.1.5 Interfaces Hosted by the RA System:** 106 | * **Dynamic Badge Lander:** Shows real-time trust status for agents - green checkmark if valid, red X if compromised 107 | * **Audit Log Viewer:** Forensic history of state changes with cryptographic proofs 108 | * **Lifecycle Management APIs:** Private API endpoints where AHPs submit registrations, renewals, and revocations 109 | 110 | ### 3.2 The Agent Hosting Platform system 111 | 112 | The AHP client system hosts the agent's code and business logic. It initiates lifecycle requests to the RA System on behalf of the agent's owner. 113 | 114 | The AHP maintains public-facing resources associated with the agent's FQDN: the Agent Functional Endpoints, the Agent Card metadata file, and JSON Schema files linked from the Agent Card for each supported protocol. 115 | 116 | **3.2.1 Agent Hosting Platform:** The core client platform integrates with the RA's APIs. 117 | 118 | **3.2.2 Interfaces Hosted by the AHP System:** 119 | * **Agent Functional Endpoint (API):** Live service with the agent's core functionality 120 | * **Agent Card (Data):** Machine-readable JSON file at a canonical URL containing agent capability metadata, with URLs pointing to each protocol's JSON Schema definition 121 | 122 | ### 3.3 Foundational infrastructure & dependencies 123 | 124 | Core third-party systems support the ANS/RA architecture. 125 | 126 | **3.3.1 Transparency Log (TL):** 127 | 128 | The immutable, cryptographically verifiable ledger contains permanent records of agent identity lifecycle events. The TL uses a global Merkle tree architecture. 129 | 130 | Events process in batches every 5 seconds or after 1000 events. Each event receives a globally unique, monotonically increasing sequence number for append-only semantics. 131 | 132 | The system uses SHA-256 binary Merkle trees for cryptographic integrity. Leaf nodes contain deterministic hashes of canonicalized event data. The Key Management System signs each new Merkle root after batch completion, creating a verifiable checkpoint. 133 | 134 | The TL generates cryptographic proofs for event existence at specific positions. The architecture supports O(log n) append operations by caching intermediate node hashes. 135 | 136 | **3.3.1.1 Public verification interface:** 137 | 138 | The Transparency Log exposes a REST API for external verifiers. 
The API provides: 139 | - Current and historical Signed Tree Heads (STH) 140 | - Merkle inclusion proofs for specific events 141 | - Consistency proofs between tree states 142 | - Public signing keys for verification 143 | - Event queries by ANS name or batch identifier 144 | 145 | **3.3.1.2 Key distribution mechanism:** 146 | 147 | The Transparency Log implements multi-channel key distribution: 148 | - Primary: HTTPS endpoints at `/v1/keys/*` for current and historical public keys 149 | - Secondary: DNS TXT records for key fingerprint verification 150 | 151 | Each key is associated with a `tree_version` that increments on rotation. Public keys support caching with ETags and cache-control headers. 152 | 153 | **3.3.1.3 Producer Signature Validation:** 154 | 155 | Producer signatures are validated internally upon event receipt and are included in sealed log entries: 156 | - Internal validation confirms event authenticity from authorized RA instances 157 | - Producer signatures become part of the immutable record 158 | - Producer public keys remain private, disclosed only for authorized forensic investigations 159 | - External verifiers trust the TL's validation as part of the trust model 160 | 161 | **3.3.2 Pub/Sub system:** 162 | 163 | The asynchronous messaging bus receives lifecycle events from the RA for Discovery Service consumption. Each event payload includes a digital signature from the RA for authenticity and integrity verification. 164 | 165 | **3.3.3 Producer authentication and event submission:** 166 | 167 | The Transparency Log uses two-phase trust establishment for event producers (RA instances). 168 | 169 | **Phase 1 - Key registration:** 170 | Registration Authority instances register public keys with the Transparency Log before submitting events. Registration establishes cryptographic identity through an internal API. Keys are registered with an instance identifier (`ra_id`), signing algorithm, and validity period. 171 | 172 | Authentication uses JWT tokens from the IAM system. The `ra_id` in token claims must match the registered public key identifier. Key rotation supports configurable overlap periods. Keys can be pre-registered with future `valid_from` dates. 173 | 174 | **Phase 2 - Event submission:** 175 | Producers submit events signed with registered private keys. Each event includes a detached JWS signature and `producer_key_id`. The Transparency Log validates signatures using registered public keys before accepting events. 176 | 177 | Validated events receive globally unique sequence numbers and enter the batch processing queue. The `log_id` provides unique identification for idempotency checking. 178 | 179 | **Internal validation flow:** 180 | Producers submit events with signatures to `/internal/v1/events`. The TL retrieves the producer's public key and validates the signature and event structure. Valid events receive sequence numbers and are queued for batch processing. Producers receive acknowledgments with sequence numbers. Events are sealed into the Merkle tree during batch processing. 181 | 182 | **3.3.4 DNS provider:** 183 | 184 | Manages public DNS zone files for agent FQDNs. The RA interacts via API (e.g., Domain Connect) to provision records. 185 | 186 | ### 3.3.5 Certificate Authorities (CAs) 187 | 188 | The architecture uses two distinct Certificate Authority types: 189 | 190 | * **Public Certificate Authority (Public CA):** Standard, universally trusted CA (e.g., Let's Encrypt, DigiCert) issues the Public Server Certificate (`PubSC`) securing the agent's stable FQDN. 
Public revocation services (OCSP/CRL) serve any internet client. 191 | 192 | * **Private Certificate Authority (Private CA):** Dedicated CA operated by the Registration Authority issues event-driven Private Identity Certificates (`PriCC`) attesting to version-bound `ANSName`. Revocation services remain private to the ANS ecosystem. 193 | 194 | This separation allows the `PubSC` to follow slow, time-based public WebPKI rules for universal compatibility while the `PriCC` maintains the fast, event-driven lifecycle needed for agent identities. 195 | 196 | ### 3.4 Third-party ecosystem 197 | 198 | The RA is the source of truth for registration, while independent external systems consume RA System public outputs for value-added services. 199 | 200 | **3.4.1 Discovery service:** 201 | 202 | Third-party applications consume the RA's Pub/Sub feed to build searchable agent indexes accessible through their own UI and API. 203 | 204 | **3.4.2 ANS Monitoring Service:** 205 | 206 | Third-party ANS Monitoring Services provide continuous integrity verification for registered agents. These services consume the RA's public event feed to build a list of active agents and their expected states (metadata hashes). They perform the ANS Integrity Monitor (AIM) function as independent, external services. 207 | 208 | A competitive marketplace of monitoring services can emerge, offering different service levels: verification frequency, geographic distribution of workers, and alerting features. The RA is the source of truth for registration. ANS Monitors audit the live state of the internet against that truth. 209 | 210 | ## 4.0 Data model & integrity 211 | 212 | ### 4.1 The canonical ANSName structure 213 | 214 | The immutable, six-part identifier for a registered agent acts as its primary, version-bound identity. Canonical Example (post-ADR 005): `mcp://sentimentAnalyzer.textAnalysis.PID-1234.v1.0.0.example.com` 215 | 216 | | Component | Description | Example | 217 | | :--- | :--- | :--- | 218 | | **Protocol** | The mandatory scheme that specifies the agent's communication protocol. | `mcp` | 219 | | **agentName** | The unique, provider-assigned name for the agent service. Also the subdomain for constructing the agent's FQDN (e.g., `agentName.extension`). | `sentimentAnalyzer`| 220 | | **capability** | The high-level function the agent exposes. | `textAnalysis` | 221 | | **ProviderID**| A non-semantic, unique, and immutable identifier for the owning entity (per ADR 005). | `PID-1234` | 222 | | **version** | A semantic version (`major.minor.patch`) that is strictly bound to the agent's code. | `v1.0.0` | 223 | | **extension** | A fully-qualified domain name acting as the trust anchor. | `example.com` | 224 | 225 | ### 4.2 Cryptographic and operational identifiers 226 | 227 | #### 4.2.1 Operational Instance ID (ra_id) 228 | 229 | The `ra_id` uniquely identifies the specific runtime instance of the RA that processed a request - for example `RA-USEAST1-PROD-03` for the third production instance in AWS us-east-1. It supplies granular data for forensic auditing and traceability. The specific format and generation method for the `ra_id` are detailed in `IMPLEMENTATION_GUIDE.md`. 230 | 231 | #### 4.2.2 Cryptographic Root ID (kms_key_id) 232 | Identifies the unique key within the KMS used to sign the Merkle Tree Root (e.g., `arn:aws:kms:us-west-2:key/RootKey-A`). 233 | 234 | ### 4.3 Certificate integrity 235 | 236 | **Server Certificate:** Issued by a Public CA, contains the FQDN in the SAN for standard TLS. 
237 | 238 | **Identity Certificate:** Issued by a Private CA, contains the full `ANSName` as a `uniformResourceIdentifier` within the SAN for an unbreakable cryptographic binding. 239 | 240 | ### 4.4 Agent state lifecycle 241 | 242 | The `agent_state` field follows a defined lifecycle and transitions between states based on specific events recorded in the Transparency Log. The diagram below illustrates all possible states and their triggering events. 243 | 244 | ![State Machine Diagram of the Agent Registration Lifecycle](state-machine-diagram.png) 245 | *Figure 2: State Machine Diagram of the Agent Registration Lifecycle* 246 | 247 | ### 4.5 Cryptographic data integrity standards 248 | 249 | The ANS Registry components require these cryptographic standards for consistent and verifiable data integrity: 250 | 251 | #### 4.5.1 JSON Canonicalization Scheme (JCS) 252 | 253 | All JSON data requiring cryptographic signing or hashing must use JSON Canonicalization Scheme (RFC 8785) before cryptographic operations. 254 | 255 | JCS ensures the same logical JSON object produces the same byte sequence through deterministic serialization. Different implementations produce identical hashes for the same data, maintaining cross-platform consistency. Leaf hashes in the Transparency Log remain stable and verifiable. 256 | 257 | #### 4.5.2 JSON Web Signature (JWS) format 258 | 259 | All digital signatures in the ANS ecosystem MUST use JWS Detached Signature format. 260 | 261 | **Detached Signature Requirement:** 262 | 263 | Signatures MUST be stored separately from their payloads at all levels of the system. The payload MUST NOT be embedded within the JWS structure (no Base64URL-encoded payload in the JWS). 264 | 265 | **Preventing Circular Dependencies:** 266 | 267 | When a signature resides in the same JSON object as the data it signs, the signature fields must be excluded from the signed payload. The signature scope must be explicitly defined (e.g., "all fields except signature and signature_kms_key_id"). During verification, implementations must extract only the signed fields before canonicalization and verification. Without exclusion, a signature would need to include itself, creating an impossible scenario. 268 | 269 | **Benefits:** 270 | 271 | The detached approach enables: 272 | - Independent storage and transmission of data and signatures 273 | - Processing without Base64 encoding/decoding overhead 274 | - Separation of data and attestation concerns 275 | - Adding or verifying signatures without modifying original data 276 | - JSON structures without circular dependency issues 277 | 278 | **Technical Requirements:** 279 | 280 | The default algorithm is ES256 (ECDSA with P-256 and SHA-256), with provisions for algorithm agility. 281 | 282 | Protected headers must include: 283 | - `alg`: signing algorithm 284 | - `kid`: key identifier for the signing key 285 | - `typ`: type indicator (e.g., "JWT" for attestations) 286 | - `tsp`: Unix timestamp of signature creation 287 | - `raid`: RA instance identifier that created the signature 288 | 289 | The payload consists of the JCS-canonicalized JSON object being signed (stored separately). The signature format follows `..signature` (note the two dots with empty payload section). 290 | 291 | Use cases include Transparency Log batch signatures (Signed Tree Heads), RA attestation badges, Pub/Sub event payloads, and critical lifecycle requests such as revocations. 
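As a concrete illustration of these requirements, the following minimal Python sketch produces a detached ES256 signature over a canonicalized payload. It is not normative: `canonicalize()` only approximates RFC 8785 (full JCS additionally constrains number and string serialization), key handling is simplified, and the `kid` and `raid` values are placeholders.

```python
# Illustrative sketch of a detached ES256 JWS over a canonicalized payload.
# Not normative: real signing keys would come from the KMS or the registered
# producer keys, not be generated inline as shown here.
import base64
import json
import time

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec, utils


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def canonicalize(obj) -> bytes:
    # Approximation of JCS (RFC 8785): sorted keys, no extra whitespace, UTF-8.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode()


def sign_detached(payload: dict, private_key, kid: str, ra_id: str) -> str:
    protected = {"alg": "ES256", "kid": kid, "typ": "JWT",
                 "tsp": int(time.time()), "raid": ra_id}
    header_b64 = b64url(canonicalize(protected))
    payload_b64 = b64url(canonicalize(payload))
    signing_input = f"{header_b64}.{payload_b64}".encode()
    der_signature = private_key.sign(signing_input, ec.ECDSA(hashes.SHA256()))
    r, s = utils.decode_dss_signature(der_signature)  # JWS needs raw r||s bytes
    signature_b64 = b64url(r.to_bytes(32, "big") + s.to_bytes(32, "big"))
    # Detached compact serialization: the payload section between the dots is
    # left empty; the payload itself is stored or transmitted separately.
    return f"{header_b64}..{signature_b64}"


# Example usage with a throwaway P-256 key and placeholder identifiers.
key = ec.generate_private_key(ec.SECP256R1())
event = {"type": "registered",
         "ans_name": "mcp://sentimentAnalyzer.textAnalysis.PID-1234.v1.0.0.example.com"}
print(sign_detached(event, key, kid="tl-key-1", ra_id="RA-USEAST1-PROD-03"))
```

Verification reverses the steps: re-canonicalize the separately stored payload, rebuild the signing input, and check the signature against the registered public key.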
292 | 293 | **JSON Structure and Signature Scope:** 294 | To prevent circular dependencies when signatures are stored within JSON objects: 295 | 296 | 1. **Explicit Field Exclusion Pattern:** 297 | ```json 298 | { 299 | "data_field_1": "value1", 300 | "data_field_2": "value2", 301 | "signature": "..." // This field is NOT included in the signed payload 302 | } 303 | ``` 304 | Signature covers: `{"data_field_1": "value1", "data_field_2": "value2"}` 305 | 306 | 2. **Nested Structure Pattern (Recommended):** 307 | ```json 308 | { 309 | "data": { 310 | "field_1": "value1", 311 | "field_2": "value2" 312 | }, 313 | "signature": "..." // Signs only the "data" object 314 | } 315 | ``` 316 | Signature covers: The entire `data` object 317 | 318 | 3. **Multi-Level Signature Pattern:** 319 | ```json 320 | { 321 | "event": { 322 | "type": "registered", 323 | "ans_name": "...", 324 | "producer_signature": "..." // Signs the event minus this field 325 | }, 326 | "batch_signature": "..." // Signs the entire event object 327 | } 328 | ``` 329 | - `producer_signature` covers: event object excluding itself 330 | - `batch_signature` covers: entire event object (including producer_signature) 331 | 332 | ## 5.0 Trust, security, & attestation 333 | 334 | Core security boundaries and trust mechanisms define the ANS Registry ecosystem. 335 | 336 | ### 5.1 Principle of layered trust 337 | 338 | A three-layer hierarchy anchors trust. The Identity Layer uses the strictly defined, version-bound `ANSName`. The Cryptographic Layer signs the Merkle Tree Root of the Transparency Log with a key controlled by the Centralized KMS. The Operational Layer uniquely identifies the specific RA instance that performed validation through the `ra_id` for forensic accountability. 339 | 340 | #### 5.1.1 Tiers of Trust: Layering DANE on the PKI Foundation 341 | 342 | The ANS Registry uses DANE (DNS-Based Authentication of Named Entities) to strengthen the PKI trust model. DANE binds X.509 certificates to DNS names through TLSA records, creating cryptographic proof that a certificate belongs to a specific domain. Different tiers of trust require more robust use of DANE. 343 | 344 | **Bronze Tier: Standard PKI**. Basic TLS using certificates from public CAs. Agents accept any valid certificate signed by a trusted CA. This provides encryption and basic authentication but remains vulnerable to CA compromise or mistaken issuance. 345 | 346 | **Silver Tier: DANE-Enhanced PKI**. Adds TLSA record validation to Bronze tier checks. The agent's certificate fingerprint is published in DNS via a TLSA record at _443._tcp.agent.example.com. Clients verify both the certificate chain and the TLSA record match. If either check fails, the connection is rejected. 347 | 348 | This requires: 349 | - Valid certificate from trusted CA 350 | - Matching TLSA record in DNS 351 | - DNSSEC signatures on the DNS zone 352 | 353 | **Gold Tier: DANE with Trust-on-First-Use**. Extends Silver tier with persistent trust storage. On first connection, clients store the validated certificate fingerprint locally. Future connections require the same certificate, preventing undetected substitution even if both a CA and DNS are compromised simultaneously. 354 | 355 | **Implementation in ANS Registry**. 
The RA provisions TLSA records during agent registration: `_443._tcp.agent.example.com IN TLSA 3 0 1 [sha256_hash]` 356 | 357 | Where: 358 | - 3 = Domain-issued end-entity certificate (DANE-EE) 359 | - 0 = Match the full certificate 360 | - 1 = SHA-256 hash 361 | 362 | Clients implementing Silver tier validation: 363 | 1. Resolve TLSA record for the agent's domain 364 | 2. Validate DNSSEC signatures 365 | 3. Connect via TLS and obtain certificate 366 | 4. Verify certificate hash matches TLSA record 367 | 5. Proceed only if all checks pass 368 | 369 | This approach prevents certificate substitution attacks that standard PKI cannot detect, while remaining compatible with clients that only support Bronze tier validation. 370 | 371 | ### 5.2 Attestation process and verifiability 372 | Attestation proves that an agent's identity was successfully validated. A multi-layered system anchored in DNS and the Transparency Log makes this verifiable. 373 | 374 | #### 5.2.1 DNS trust anchor 375 | The RA provisions a `_ra-badge` TXT record containing the URL to the agent's unique Dynamic Badge Lander. This record establishes the initial discovery point for verification. 376 | 377 | #### 5.2.2 Immediate status check (Dynamic Badge Lander) 378 | The RA-hosted page answers "Is this agent trustworthy right now?" by displaying the current agent state from the latest Transparency Log entry, the Merkle inclusion proof for the registration event, and the signed attestation badge with verifiable JWS signature. 379 | 380 | #### 5.2.3 Cryptographic verification path 381 | High-assurance verification requires validating this chain: 382 | 1. **DNS Record Integrity:** Verify the `_ra-badge` TXT record via DNSSEC 383 | 2. **Badge Signature:** Validate the JWS signature on the attestation badge using the RA's public key 384 | 3. **Merkle Inclusion:** Verify the inclusion proof showing the event exists in the Transparency Log 385 | 4. **Root Signature:** Validate the Signed Tree Head (STH) using the KMS key identifier 386 | 5. **State Consistency:** Check that the agent's current state matches the latest log entry 387 | 388 | #### 5.2.4 Deep forensic history (Audit Log Viewer) 389 | The Badge Lander links to the Audit Log Viewer, which contains the complete chronological history of all state transitions, Merkle inclusion proofs for each historical event, the ability to verify the entire chain of custody, and cross-references to related events (registrations, renewals, revocations). 390 | 391 | ### 5.3 Operational and forensic integrity 392 | Any version change to the `ANSName` mandates revocation and re-registration of the Identity Certificate to enforce immutability. The `ra_id` in the Transparency Log allows auditors to isolate all log entries processed by a single compromised operational instance. Using separate Server and Identity Certificates prevents a compromise or expiration of one from affecting the other. 393 | 394 | ### 5.4 Key management and storage 395 | The RA never generates, handles, or has access to an agent's private keys. The AHP remains solely responsible for the entire lifecycle of the private key. The RA uses separate, unique authentication credentials for each external service integration. These credentials are rotated on a regular schedule and stored in a dedicated secret management system. 396 | 397 | ### 5.5 Agent discovery model 398 | Agent discovery is intentionally decoupled from the core RA. 
The architecture creates a competitive ecosystem of third-party Discovery Service Providers by broadcasting all public lifecycle events via the Pub/Sub System. 399 | 400 | The discovery process works in two stages. First, real-time indexing occurs when a Discovery Service subscribes to the Pub/Sub feed. It must cryptographically verify the signature on each event payload to confirm it originated from a trusted RA. Upon receiving a verified event payload, it immediately indexes the essential metadata contained within the message's `meta` object, using the stable `FQDN` as the primary key. New agents become discoverable for basic queries within seconds. 401 | 402 | Second, asynchronous augmentation happens when the Discovery Service's crawler uses the `agent_card_url` from the event payload to optionally fetch the full, provider-hosted Agent Card. It then parses this file to augment its index with advanced metadata including detailed capability descriptions and parameter schemas. 403 | 404 | ### 5.6 Coexistence with other trust models 405 | The ANS Registry architecture serves as a foundational identity layer without replacing existing authentication schemes. An agent hosted at a stable FQDN supports multiple authentication protocols simultaneously. 406 | 407 | A client using a token-based protocol like OpenAI's ACP connects to the agent's stable FQDN, secured by the standard, 408 | time-based Public Server Certificate. An ANS-aware agent connects to the same FQDN but initiates mTLS, 409 | presenting its event-driven Private Identity Certificate to prove its specific, version-bound `ANSName`. 410 | 411 | This coexistence model addresses two scenarios. Simple agents or legacy clients may fall back to token-based 412 | authentication over standard TLS, which is a lower-assurance interaction pattern. High-assurance, ANS-to-ANS 413 | communication between agents registered with different RAs requires a different solution to bridge the private trust 414 | domains. ADR 009 (Solving the Trust Bootstrap Problem via a Client-Side Trust Provisioner) details the architectural 415 | solution and its evolution in a multi-provider ecosystem. 416 | 417 | ### 5.7 Channel vs. message-level security 418 | The architecture uses two layers of security: channel security and message-level security. 419 | 420 | All point-to-point API calls (e.g., AHP-to-RA, RA-to-CA) are secured using TLS for channel security. TLS protects authentication, confidentiality, and integrity for the communication session. For most transient, synchronous commands between trusted parties, channel security suffices. 421 | 422 | Payloads receive digital signatures when they represent durable, non-repudiable artifacts intended for third-party or asynchronous verification. Digital signatures prove origin and integrity that persists long after the communication session ends. The RA Attestation Badge JSON carries the RA's public attestation. The Pub/Sub Event Payload allows subscribers to verify event authenticity. Critical AHP requests like Agent Revocation Requests require signatures for non-repudiation. 423 | 424 | The signature verification hierarchy operates at three levels. Level 1 makes TL root signatures publicly verifiable using keys from `/v1/keys/*`. Level 2 makes RA attestation badges publicly verifiable using the RA's published public key. Level 3 includes producer signatures in log entries but verifies them only internally. 
Their presence demonstrates the complete chain of custody without requiring external verification infrastructure. 425 | 426 | The multi-level signature pattern looks like this: 427 | ```json 428 | { 429 | "event": { 430 | "type": "registered", 431 | "ans_name": "...", 432 | "producer_signature": "...", // Included but not publicly verifiable 433 | "producer_key_id": "..." // For forensic reference 434 | }, 435 | "tree_head": { 436 | "root_hash": "...", 437 | "tree_signature": "..." // Publicly verifiable via /v1/keys/* 438 | } 439 | } 440 | ``` 441 | 442 | Producer signatures form part of the immutable record. Tree signatures can be verified publicly via `/v1/keys/*` endpoints. Producer signatures maintain the complete audit trail. 443 | 444 | For internal verification, the TL validates producer signatures using registered keys from an internal registry. For public verification, anyone can verify tree signatures using publicly distributed TL keys. Producer signatures are included in responses to maintain record completeness. The separation reduces complexity for external verifiers while preserving forensic capabilities. 445 | 446 | ### 5.8 Private vs. public audit trails 447 | The ANS Registry maintains two distinct types of logs for different purposes: a private operational log and the public Transparency Log. 448 | 449 | The Private Operational Log resides in the RA's internal database for detailed, fine-grained forensic analysis and debugging of the RA's internal workflows. The RA's operators and developers use it to track all internal milestone events like `domain_validation_complete` and `certificate_issued`. 450 | 451 | The Public Transparency Log (TL) acts as the immutable, cryptographically verifiable ledger for public consumption. It attests to finalized state changes in a non-repudiable way. The TL contains only final, meaningful public events like `ra_badge_created`, `agent_revoked`, and `agent_renewed`. 452 | 453 | ### 5.9 Producer key management 454 | 455 | The TL maintains a private registry of producer (RA instance) public keys for internal signature validation. 456 | 457 | Each RA instance must register at least one active public key before submitting events. Keys must specify the signing algorithm (ES256, RS256, etc.) and should include an expiration date to enforce rotation practices. The `ra_id` in the key registration must match the `ra_id` in the IAM JWT token claims. 458 | 459 | The key rotation protocol allows new keys to be registered with future `valid_from` dates. During rotation, both old and new keys remain active for an overlap period (default 24 hours). Zero-downtime rotation handles in-flight events gracefully. Old keys are automatically marked as expired after the overlap period. 460 | 461 | Producer private keys never leave the RA instance. Public keys are only accessible via the internal API with proper authentication. Compromised keys can be immediately revoked via the `/internal/v1/producer-keys/{key_id}` DELETE endpoint. Historical signatures remain valid even after key expiration but not after revocation. 462 | 463 | All producer keys are retained indefinitely for forensic analysis. Key usage statistics track total signatures and last used timestamps. During security incidents, specific producer keys can be queried to identify affected events. 464 | 465 | ### 5.10 Ecosystem security considerations 466 | Query privacy is out of scope for the RA since it does not handle discovery queries. 
However, third-party Discovery Services built on the ANS platform need critical security and privacy protections. Discovery Services should implement privacy-preserving techniques mentioned in the OWASP paper: Private Information Retrieval and Anonymized Query Relays. 467 | 468 | ### 5.11 Ecosystem integrity and remediation 469 | 470 | To support the externalization of the ANS Integrity Monitor (AIM) role, the RA requires a secure mechanism to receive and act upon integrity failure reports without creating an attack vector. A malicious monitoring service could attempt to disable valid agents by flooding the RA with false reports. 471 | 472 | The remediation process must follow these principles: 473 | 474 | 1. **Monitors report, the RA adjudicates:** External ANS Monitoring Services cannot command the RA to change an agent's state. They publish findings. The RA remains the sole authority for state changes. 475 | 476 | 2. **Public, signed reports:** Monitors should publish findings (successes and failures) to their own public, cryptographically signed feeds. This creates transparency and a verifiable reputation for the monitor. Monitors that frequently report false positives lose credibility. 477 | 478 | 3. **Adjudication by quorum:** The RA's internal Remediation Service must not take automated action (e.g., SUSPENDED state) based on a single report from one monitor. It must require corroborating reports from multiple, independent, reputable monitoring services before triggering automated state changes. 479 | 480 | 4. **Verifiable evidence required:** Any integrity failure report submitted to the RA must contain verifiable, cryptographic proof of the detected discrepancy. The RA's Remediation Service must independently re-verify this evidence before accepting the report as valid. 481 | 482 | ## 6.0 Operational flow 483 | 484 | The lifecycle of an agent's identity within the ANS ecosystem differs from simpler, time-based directory models. Where those models give a single registration a Time-to-Live (TTL), the ANS architecture employs a more granular, event-driven lifecycle. Each unique software version of an agent (including metadata such as the Agent Card) is registered with its own immutable `ANSName`. The validity of this identity is not tied to a registration TTL, but to the validity period of its underlying, cryptographically bound Identity Certificate. This version-centric approach, governed by the principle of "Strict Immutability," provides a much more precise and auditable trust model. 485 | 486 | ### 6.1 Initial registration flow (full orchestration) 487 | The end-to-end process for new agent registration follows a multi-stage approach. An `ANSName` can be reserved in a `pending` state before all technical validations are complete, then transition to an `active` state. 488 | 489 | ```mermaid 490 | sequenceDiagram 491 | participant AHP 492 | participant RA 493 | participant CA as Public CA 494 | participant DNS as DNS Provider 495 | participant TL as Transparency Log<br/>(TL) 496 | participant PubSub as Pub/Sub System 497 | participant Discovery as Discovery Service 498 | 499 | autonumber 500 | Note over AHP,Discovery: Registration Request 501 | AHP->>RA: POST /register<br/>(Request Payload) 502 | activate RA 503 | RA-->>AHP: 202 Accepted 504 | deactivate RA 505 | 506 | Note over RA,DNS: Certificate Issuance with ACME DNS Challenge 507 | RA->>DNS: Add ACME<br/>Challenge Record 508 | activate DNS 509 | DNS-->>RA: 200 OK 510 | deactivate DNS 511 | 512 | RA->>CA: Request<br/>Validation 513 | activate CA 514 | 515 | Note over CA,DNS: CA validator queries<br/>DNS for ACME<br/>challenge record 516 | 517 | CA->>CA: Finalize &<br/>Issue Certs 518 | 519 | CA-->>RA: Return<br/>Certificates 520 | deactivate CA 521 | 522 | Note over RA,DNS: DNS Provisioning 523 | RA->>DNS: Provision<br/>Final Records 524 | activate DNS 525 | DNS-->>RA: 200 OK 526 | deactivate DNS 527 | 528 | Note over RA,TL: Transparency Log Sealing 529 | RA->>TL: Seal<br/>Attestation Event 530 | activate TL 531 | TL-->>RA: Return log_id 532 | deactivate TL 533 | 534 | Note over RA,AHP: Certificate Delivery 535 | RA->>AHP: Deliver<br/>Certificates 536 | 537 | Note over RA,Discovery: Event Publishing 538 | RA->>PubSub: Publish Registration Event 539 | activate PubSub 540 | PubSub->>Discovery: Notify Subscriber 541 | deactivate PubSub 542 | ``` 543 | *Figure 3: Sequence Diagram of the Initial Agent Registration Flow* 544 | 545 | #### 6.1.1 Stage 1: Pending registration 546 | The AHP initiates the registration process. 547 | 548 | * **Submission:** The AHP submits a `Registration Request` via a secure `POST` to the RA's Lifecycle Management API. This JSON payload must contain: 549 | * Transactional Intent (e.g., `request_type: "new_registration"`). 550 | * Identity Components (the agent's stable FQDN and the `ANSName` components). 551 | * Cryptographic Materials (CSRs for both Server and Identity certificates). 552 | * Full Agent Card (the complete JSON object, embedded as `agent_card_payload`). 553 | * **Initial Validation & State Change:** The RA validates the payload schema. If valid, the RA creates an internal record for the `ANSName` and sets its status to `pending`. In this state, no public actions are taken. 554 | 555 | #### 6.1.2 Stage 2: Activation 556 | Once the RA has a complete and valid `pending` registration, activation begins. 557 | 558 | * **Asynchronous Validation:** The RA orchestrates the required external validations. These checks MUST all pass before activation can proceed and include: 559 | * **Organization Identity Verification:** Verifying the legal entity of the provider for OV-level attestations. 560 | * **Domain Control Validation:** The Registration Authority performs the ACME DNS-01 challenge. This cryptographically verifies the AHP's control over the domain. 561 | * **Schema Integrity Validation:** For each protocol in the Agent Card's `protocolExtensions` block, the RA fetches the schema content from the provided URL. It then calculates the hash and verifies it matches the `schema.hash` value. 562 | 563 | * **Atomic Activation Process:** Upon the successful completion of all validations, the RA performs the following irreversible sequence: 564 | * **a. Hybrid Certificate Issuance:** The RA procures the time-based Server Certificate and the event-driven, version-bound Identity Certificate. 565 | * **b. DNS Provisioning:** The RA publishes the full suite of agent DNS records (`HTTPS`, `TLSA`, `_ans`, `_ra-badge`). 566 | * **c. Event Payload Generation:** The RA prepares the `agent_registered` event payload for public notification and auditing (a sketch of this step follows this list). This involves two key actions with the `agent_card_payload`: 567 | * **Hashing for Integrity:** It calculates a cryptographic hash of the full `Agent Card` content and stores **only this hash**. This serves as the authoritative "fingerprint" for future integrity checks by the AIM. 568 | * **Summarizing for Discovery:** It extracts a lightweight summary (the `description` and `capabilities` list) to include in the event's `meta` object for efficient indexing by discovery services. 569 | * **d. Log Sealing (Point of No Return):** The RA submits the prepared and signed event payload to the central Transparency Log service, where it is batched, sealed into the Merkle tree, and made immutable. 570 | * **e. Artifact Delivery:** The RA securely delivers the new certificates to the AHP. 571 | * **f. Public Notification:** As the final action, the RA publishes the rich "hybrid event" payload (generated in step c) to the Pub/Sub system. This broadcast announces the new, valid agent to the discovery ecosystem. 
572 | 573 | #### 6.1.3 Key information flows 574 | * **AHP to RA:** `Registration Request` JSON payload. 575 | * **RA to External Services:** Validation requests to CAs and DNS Providers. 576 | * **RA to AHP:** Validation challenges, status updates, final issued certificates, and `log_id`. 577 | * **RA to Pub/Sub System:** The final `agent_registered` event payload. 578 | 579 | ### 6.2 Agent update/version bump 580 | Any code change triggers a complete re-registration. Even fixing a typo in the Agent Card requires a new version number, a new Identity Certificate, and eventually, the retirement of the old identity—but the old version remains ACTIVE while the new one is validated. 581 | 582 | When the AHP detects a change: 583 | 584 | 1. The AHP increments the semantic version in the ANSName (e.g., `v1.0.0` becomes `v1.0.1`) 585 | 2. The AHP submits a new registration request with the incremented version number and a fresh CSR for the Identity Certificate 586 | 3. CRITICAL: The old version remains ACTIVE during this entire process 587 | 588 | The RA runs full validation again - checking organization identity and domain control - because version changes require the same trust verification as initial registration. The RA issues a new Private Identity Certificate bound to the new version. The Public Server Certificate stays active since it's tied to the FQDN, not the version. 589 | 590 | When the RA successfully validates and seals the new version into the Transparency Log: 591 | - The new version is marked as `ACTIVE` 592 | - The old version is simultaneously marked as `DEPRECATED` (not retired immediately) 593 | - Discovery services receive this signal and hide the old version from search results 594 | - After a 24-72 hour grace period, the old Identity Certificate gets cryptographically revoked 595 | 596 | During this transition, the AHP runs both versions in parallel behind a load balancer. Once the new version is stable, the AHP atomically switches all traffic to it and decommissions the old code. The FQDN never changes - partners keep connecting to the same address while the identity silently updates. 597 | 598 | If the new registration fails validation: 599 | - The old version remains ACTIVE and unaffected 600 | - No service interruption occurs 601 | - The AHP can retry with corrected information 602 | 603 | The entire successful process exchanges just two messages: the AHP sends the new ANSName and CSR, the RA returns the new Identity Certificate and log ID. 604 | 605 | ```mermaid 606 | sequenceDiagram 607 | participant AHP 608 | participant RA 609 | participant DNS as DNS Provider 610 | participant CA as Private CA 611 | participant TL as Transparency Log
(TL) 612 | participant PubSub as Pub/Sub System 613 | participant Discovery as Discovery Service 614 | 615 | autonumber 616 | Note over AHP,Discovery: Agent Update/Version Bump Flow 617 | 618 | Note over AHP: Step 1: Immutability Trigger 619 | AHP->>AHP: Code/Config Change
v1.0.0 → v1.0.1 620 | 621 | Note over AHP,RA: Step 2: New Registration Request 622 | AHP->>RA: POST /register
(New ANSName v1.0.1
+ Identity CSR) 623 | activate RA 624 | Note right of RA: Alternatively, the Identity CSR
can be submitted later
in a separate call 625 | RA->>RA: Identify the previous
version by FQDN 626 | RA-->>AHP: 202 Accepted 627 | deactivate RA 628 | 629 | Note over RA: Step 3: Re-validate Domain Control 630 | RA->>DNS: Validate the domain control 631 | activate DNS 632 | DNS->>DNS: Add ACME
Challenge Record
Query and Delete 633 | DNS-->>RA: Confirm domain is
under control 634 | deactivate DNS 635 | 636 | Note over RA,CA: Step 4: Certificate Issuance 637 | RA->>CA: Request Identity
Certificate (v1.0.1) 638 | activate CA 639 | Note left of RA: Only identity cert requested here.
Server cert inheritance flow:
1. Check if previous version has valid cert
2. If yes, validate domain control (ACME)
3. If validation passes, inherit cert
4. Otherwise, AHP must submit server CSR
640 | CA->>CA: Issue Identity
Certificate 641 | CA-->>RA: Return new
Identity Certificate 642 | deactivate CA 643 | 644 | Note over RA,TL: Step 5: Log Sealing & Retirement 645 | 646 | RA->>TL: Seal New Version
(v1.0.1 registered) 647 | activate TL 648 | TL-->>RA: 200 Return log_id 649 | deactivate TL 650 | Note left of RA: New Version (v1.0.1)
Pending → Active 651 | 652 | RA->>TL: Update Old Version
(v1.0.0 → Deprecated) 653 | activate TL 654 | TL-->>RA: Confirmed 655 | deactivate TL 656 | Note left of RA: Old Version (v1.0.0)
Active → Deprecated 657 | 658 | Note over RA,AHP: Certificate Delivery 659 | AHP->>RA: Poll Certificates 660 | RA-->>AHP: Deliver new
Identity Certificate 661 | 662 | Note over RA,Discovery: Event Publishing 663 | 664 | RA->>PubSub: Publish Deprecation
Event (v1.0.0) 665 | activate PubSub 666 | PubSub->>Discovery: Notify: Old Version
Deprecated 667 | deactivate PubSub 668 | 669 | RA->>PubSub: Publish Registration
Event (v1.0.1) 670 | activate PubSub 671 | PubSub->>Discovery: Notify: New Version 672 | deactivate PubSub 673 | 674 | Note over Discovery: Update Index:
Hide v1.0.0
Show v1.0.1 675 | 676 | Note over AHP: Step 6: Atomic Transition 677 | AHP->>AHP: Reconfigure Load Balancer
Route traffic to v1.0.1 678 | AHP->>AHP: Decommission v1.0.0 679 | 680 | Note over RA,CA: After Grace Period (24-72h) 681 | RA->>CA: Revoke the old (v1.0.0)
Identity Cert 682 | activate CA 683 | CA-->>RA: Confirmed 684 | deactivate CA 685 | ``` 686 | *Figure 4: Sequence Diagram of the Agent Update/Version Bump Flow* 687 | 688 | ### 6.3 Agent renewal 689 | Certificate renewal happens when an Identity Certificate approaches expiration but the agent code hasn't changed. The AHP submits a new CSR for the exact same ANSName - no version increment. The RA performs lightweight re-validation, issues a fresh Identity Certificate with extended validity, and seals an `agent_renewed` event into the log. The renewed certificate gets delivered to the AHP for seamless rotation. 690 | 691 | ### 6.4 Agent deregistration/revocation 692 | When an agent shuts down permanently, the AHP sends a signed revocation request to the RA. The RA immediately revokes the Identity Certificate at the Private CA, seals an `agent_revoked` event into the Transparency Log, and removes all DNS service discovery records. The revocation takes effect within minutes through OCSP/CRL distribution. 693 | 694 | ### 6.5 Operational roles and responsibilities in DNS management 695 | 696 | | Actor | Initial Registration Tasks | Ongoing Lifecycle Tasks | Deregistration Tasks | 697 | | :--- | :--- | :--- | :--- | 698 | | **Agent Provider**| Owns domain, authorizes RA via API, manages A/AAAA records. | Monitors renewals, submits config changes. | Submits deregistration request, revokes RA access. | 699 | | **Registration Authority**| Performs ACME DNS-01 challenge, provisions permanent records. | Re-runs ACME challenge, updates records. | Deletes all agent-specific records. | 700 | | **DNS Provider** | Hosts authorization endpoint, processes RA's API requests. | Processes RA's modification requests. | Processes deletion requests from the RA. | 701 | 702 | ### 6.6 Managing parallel release tracks 703 | A `releaseChannel` field in the Agent Card (e.g., "stable", "beta") manages parallel tracks. Each version has a unique `ANSName`, and each channel's lifecycle is managed independently. 704 | 705 | ### 6.7 Handling rollbacks 706 | Rollbacks use a "roll-forward" procedure. To roll back from a buggy `v1.0.1`, the AHP deploys the old stable code as a new version (`v1.0.2`), registers it, and performs an atomic cutover, triggering deprecation of the buggy `v1.0.1` identity. 707 | 708 | ## 6.8 Ongoing integrity verification 709 | Third-party ANS Monitoring Services perform continuous integrity verification. 710 | 711 | A monitoring service's Scheduler periodically enqueues verification jobs for all active agents learned from the RA's public event feed. Geographically distributed Verification Workers consume these jobs in parallel. 712 | 713 | Each worker performs deep verification of the attestation hash chain: 714 | 715 | 1. **DNS pointer validation:** Workers perform authoritative DNS queries to retrieve `_ans` and `_ra-badge` records with full DNSSEC validation. 716 | 2. **Agent card integrity check:** Workers fetch the Agent Card, calculate its hash, and verify it against the authoritative `capabilities_hash` recorded by the RA at registration. 717 | 3. **Schema integrity check:** Workers parse the Agent Card, fetch content from each schema.url, calculate its hash, and verify it against the schema.hash from the Agent Card. 718 | 719 | When workers detect failures, they report to the monitoring service's central system. The service can alert customers (agent owners) and publish signed findings for consumption by the RA and broader ecosystem, as detailed in Section 5.11. 
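The hash-chain checks in steps 2 and 3 above lend themselves to a brief illustration. The following Python sketch shows one way a verification worker could perform them, assuming the Agent Card URL has already been resolved from the `_ans` record, that `capabilities_hash` carries a `sha256:` prefix, and that `schema.hash` values use the `sha256-` prefix shown in Appendix A; the helper names and error handling are illustrative only and not part of the specification.

```python
# Illustrative sketch only: a minimal verification worker for steps 2 and 3.
# The DNS pointer validation in step 1 (DNSSEC-validated `_ans` / `_ra-badge`
# lookups) is assumed to have already produced `card_url`.
import hashlib
import json
from urllib.request import urlopen


def _sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _fetch(url: str) -> bytes:
    with urlopen(url, timeout=10) as response:
        return response.read()


def verify_agent_card(card_url: str, capabilities_hash: str) -> list[str]:
    """Return a list of failure descriptions; an empty list means all checks passed."""
    failures: list[str] = []
    raw_card = _fetch(card_url)

    # Step 2: Agent Card integrity check against the fingerprint sealed by the RA.
    if _sha256_hex(raw_card) != capabilities_hash.removeprefix("sha256:"):
        failures.append("Agent Card hash does not match capabilities_hash")

    # Step 3: Schema integrity check for every declared protocol extension.
    card = json.loads(raw_card)
    for protocol, extension in card.get("protocolExtensions", {}).items():
        schema_bytes = _fetch(extension["schema"])
        if _sha256_hex(schema_bytes) != extension["hash"].removeprefix("sha256-"):
            failures.append(f"Schema hash mismatch for protocol '{protocol}'")

    return failures
```

A worker that receives a non-empty failure list would then emit a report of the kind shown in Appendix A.5.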
720 | 721 | ### 6.9 Managing private CA migration (root rotation) 722 | The RA will eventually need to change its Private CA provider due to security incidents, contract changes, or technical upgrades. This "root rotation" requires careful orchestration - a simple cutover would break trust for all existing agents. 723 | 724 | The migration will be handled via a gradual, multi-step process that establishes a Transitional Period where both the old and new Private CAs are trusted simultaneously. 725 | 726 | 1. **Update the Trust Bundle:** The RA will instruct all AHPs to update the `trusted_private_ca_chain.pem` file used by their agents. This updated file will contain the public root certificates for BOTH the old and the new Private CAs. After this step, all agents in the ecosystem will be capable of trusting Identity Certificates issued by either CA. 727 | 2. **Begin Issuance from New CA:** The RA will switch its internal systems to issue all new and renewed Identity Certificates from the new Private CA. 728 | 3. **Decommission Old CA:** After a transition period of 1-2 years, once all active Identity Certificates in the ecosystem have been naturally replaced with ones from the new CA, the old Private CA can be safely decommissioned. The RA will then notify AHPs that they can update their trust bundles to remove the old CA's root. 729 | 730 | ## 7.0 Architectural decisions (ADRs) 731 | 732 | ### 7.1 ADR 001: Separation of certificates for identity vs. TLS 733 | 734 | | Item | Description | 735 | | :--- | :--- | 736 | | **Context** | An agent requires both a stable, publicly trusted endpoint for HTTPS communication and a separate, strictly version-bound identity for secure agent-to-agent interactions. Tying the highly volatile `ANSName` to a Server Certificate with a fixed validity period would create a massive and operationally unsustainable certificate re-issuance burden. | 737 | | **Decision** | The system uses two separate certificates: a **Server Certificate** from a Public CA with a time-based lifecycle, and an **Identity Certificate** from a Private CA with an event-driven lifecycle. | 738 | | **Certificate Comparison** | To clarify their distinct roles, their attributes are compared below:

Each row lists the Public Server Certificate value first, then the Private Identity Certificate value:
**Purpose:** Secure the endpoint (like a website) / Prove the agent's specific identity
**Subject:** Stable FQDN (e.g., agent.example.com) / Volatile ANSName (e.g., ...v1.0.1...)
**Lifecycle:** Time-based (e.g., 90 days) / Event-driven (revoked on any update)
**Issuer:** Public CA / Private CA
**Primary Use Case:** Standard TLS for clients and simple agents / High-assurance mTLS for ANS-to-ANS collaboration
| 739 | | **Rationale** | Separation isolates the high-frequency churn of the Identity Certificate from the stable renewal cycle of the Server Certificate. The dual-certificate model allows a single agent endpoint to support both ANS-to-ANS mTLS and traditional clients with token-based protocols. To make this distinction clear, this architecture refers to the Public Server Certificate's lifecycle as 'time-based', as it is governed by external CA/B Forum standards. This is in direct contrast to the 'event-driven' lifecycle of the Private Identity Certificate, which is dictated by the agent's software version. | 740 | 741 | ### 7.2 ADR 002: Necessity of the ra_id with a centralized KMS 742 | 743 | | Item | Description | 744 | | :--- | :--- | 745 | | **Context** | The system uses a Centralized KMS to sign the TL's Merkle Tree Root, so all valid log entries are signed with the same cryptographic key (`kms_key_id`). The question is whether this single cryptographic link is sufficient for all failure scenarios, particularly a non-cryptographic compromise (e.g., a buggy or breached server instance). | 746 | | **Decision** | Every Transparency Log entry must include both the Cryptographic Root ID (`kms_key_id`) and the Operational Instance ID (`ra_id`). | 747 | | **Rationale** | **Forensic Accountability.** The `kms_key_id` proves the signature is cryptographically valid, while the `ra_id` identifies the specific operational server that performed the validation and initiated the signing request. This distinction is necessary for auditing and allows selective revocation of attestations processed by a single faulty instance without distrusting the entire log. | 748 | 749 | ### 7.3 ADR 003: RA as orchestrator, not primary identity validator 750 | 751 | | Item | Description | 752 | | :--- | :--- | 753 | | **Context** | The Registration Authority's role requires complex validation checks, such as legal entity verification (Organization Identity) and technical domain control. The architectural question is whether the RA should implement and own all of this complex, specialized logic internally. | 754 | | **Decision** | The RA acts as an Orchestrator and State Aggregator, proxying validation requests to specialized internal and external services and aggregating their pass/fail responses. | 755 | | **Rationale** | **Specialization and Security.** Using existing, hardened services for identity verification and DNS management reduces the RA's complexity and attack surface. The RA can focus on its core function: acting as the gateway to the log-sealing process. | 756 | 757 | ### 7.4 ADR 004: Enforcing strict ANSName immutability 758 | 759 | | Item | Description | 760 | | :--- | :--- | 761 | | **Context** | When an agent's code is updated, its `ANSName` version is incremented. A decision must be made on how to handle the identity certificate for the old version to prevent an ambiguous state where multiple versions are simultaneously "valid," while also allowing for a graceful migration. | 762 | | **Decision** | Any change to the semantic version of the `ANSName` MUST be treated as a new identity. This mandates the formal retirement of the old Identity Certificate, managed in two stages: 1. The old version's status is changed to `DEPRECATED` for a short grace period (e.g., 24-72 hours). 2. At the end of the grace period, the old Identity Certificate MUST be explicitly and cryptographically revoked. | 763 | | **Rationale** | Trust and Non-Repudiation. 
The two-stage process allows a brief migration window while maintaining the principle of one active version. The `DEPRECATED` status signals the transition publicly, and cryptographic revocation prevents trust in the old version after the grace period. | 764 | 765 | ### 7.5 ADR 005: Decoupling provider identity for operational flexibility 766 | 767 | | Item | Description | 768 | | :--- | :--- | 769 | | **Context** | Using a mutable, human-readable provider name (e.g., AcmeCorp) as a component of the immutable `ANSName` created a significant operational risk. A corporate acquisition or rebranding would force a mass re-registration of every single agent owned by that provider. | 770 | | **Decision** | The provider name component in the `ANSName` is replaced with a non-semantic, unique, and immutable `ProviderID` (e.g., `PID-1234`). A separate, high-trust `Provider Registry` is introduced to manage the mapping between this immutable `ProviderID` and its current, mutable legal entity name. | 771 | | **Rationale** | Decoupling technical identity from business identity keeps the `ANSName` immutable while allowing flexibility for business events like acquisitions. A single update in the `Provider Registry` handles rebranding without mass re-registration. | 772 | 773 | ### 7.6 ADR 006: Bring-your-own-certificate (BYOC) policy 774 | 775 | | Item | Description | 776 | | :--- | :--- | 777 | | **Context** | An AHP may already possess a valid X.509 certificate for their service and may wish to use it in the registration process instead of having the RA issue a new one. A formal policy is needed to define if and when this is permissible. | 778 | | **Decision** | The BYOC policy is different for the two certificate types:

1. **Public Server Certificates:** BYOC is **PERMITTED, with a critical caveat.**
2. **Private Identity Certificates:** BYOC is strictly **PROHIBITED**. |

**For Server Certificates:** AHPs can use existing public certificates for convenience, but the RA must still perform independent Domain Control Validation (e.g., ACME DNS-01) at registration. The certificate does not replace live validation.

**For Identity Certificates:** The Private Identity Certificate represents the RA's attestation of a validated `ANSName`. Accepting third-party certificates would compromise the RA's role as trust root - like a notary signing an unwitnessed document. The RA must control issuance to maintain integrity. | 780 | 781 | ### 7.7 ADR 007: Multi-protocol agent support 782 | 783 | | Item | Description | 784 | | :--- | :--- | 785 | | **Context** | Agents often support multiple communication protocols (e.g., both conversational `a2a` and transactional `mcp`) from a single FQDN. The architecture must define how one agent identity represents multiple protocols. | 786 | | **Decision** | To balance a singular identity with functional flexibility, the following model is adopted:

1. **One Version, One `ANSName`:** Each unique software version of an agent is represented by one and only one canonical `ANSName`. The AHP must designate a single "primary" protocol to be used in this identifier.

2. **Agent Card is Authoritative for Functionality:** The Agent Card is the sole authoritative source for the *complete list* of all supported protocols, endpoints, and capabilities.

3. **Schemas are External and Linked:** Each protocol listed in the `protocolExtensions` block of the Agent Card MUST link to its own canonical JSON Schema via a `schema` URL. | 787 | | **Rationale** | The model trades complete functional description in the `ANSName` for a singular cryptographic identity with one Identity Certificate. Functional complexity moves to the Agent Card, a richer and more flexible document. External linked schemas promote modularity and prevent bloat. The design favors unified identity and operational efficiency over separate FQDNs per protocol, though AHPs can still register multiple single-protocol agents if preferred. | 788 | 789 | ### 7.8 ADR 008: Detached signature storage requirement 790 | 791 | | Item | Description | 792 | | :--- | :--- | 793 | | **Context** | Digital signatures in the ANS ecosystem need to be stored and transmitted alongside their corresponding data payloads. The architectural question is whether signatures should be embedded within the data (as in standard JWS Compact Serialization with Base64URL-encoded payloads) or stored separately as detached signatures. | 794 | | **Decision** | All digital signatures in the ANS Registry MUST be stored detached from their payloads at every level of the system. This applies to:

1. Transparency Log signatures; worker signatures on events and KMS signatures on Merkle roots.
2. RA attestation badges; signatures are stored in separate fields from the attestation data.
3. Pub/Sub event payloads; event data and signatures are distinct JSON fields.
4. API request signatures; request bodies and their signatures are transmitted separately. | 795 | | **Rationale** | Detached signatures separate data from cryptographic proof architecturally. Benefits include:

1. Performance: No Base64 encoding/decoding overhead for large payloads.
2. Storage Efficiency: Native format storage without ~33% Base64 expansion.
3. Processing Flexibility: Data can be processed, indexed, or queried without JWS extraction.
4. Signature Composability: Multiple parties can add signatures without modifying original data.
5. Streaming Support: Large payloads can be streamed while signatures are handled separately.
6. Clear Data Model: Structure clearly distinguishes signed data from signatures. | 796 | 797 | ### 7.9 ADR 009: Solving the trust bootstrap problem via a client-side trust provisioner 798 | 799 | | Item | Description | 800 | | :--- | :--- | 801 | | **Context** | The Private CA for Identity Certificates (ADR 001) enables the event-driven, version-bound lifecycle required by the architecture's trust model. Non-ANS-aware agents cannot trust certificates from this private authority. Hard mTLS failures occur as expected per TLS protocol analysis. This barrier could prevent a competitive marketplace of multiple, interoperable RAs. A mechanism must distribute and manage trust in these private roots. | 802 | | **Decision** | The architecture uses a client-side ANS Trust Provisioner (or "Bootstrapper") to solve the trust bootstrap problem. This component manages the agent's trust store through two phases:

1. Initial (Single-RA) Phase: The provisioner contains the root certificate of the bootstrapping RA and ensures secure installation.

2. Federated (Multi-RA) Phase: The provisioner becomes a "Federated Trust Manager" configured with a master trust anchor for a Federation Registry. It fetches, verifies, and caches a trust bundle containing root certificates of all compliant RAs. | 803 | | **Rationale** | Infrastructure-level alternatives were evaluated and rejected. Public CAs cannot support the required lifecycle and validation model (ADR 001). Private CA inclusion in universal trust stores violates CA/B Forum requirements.

Trust requires explicit participant consent. A client-side component provides the only scalable management mechanism. The provisioner abstracts trust bundle complexity and enables interoperability. Agents verify peers from any compliant RA without manual AHP configuration. This transforms the trust bootstrap problem into automated tooling, enabling the federated model in Section 9.2. The provisioner behavior and Federation Registry format are candidates for RFC standardization. | 804 | 805 | ### 7.10 ADR 010: Enforcing separation of duties for Gold Tier attestation 806 | 807 | | Item | Description | 808 | | :--- | :--- | 809 | | **Context** | The Gold Tier trust model (Section 5.1.1) requires two independent checks: PKI validation against a trusted RA root and DANE validation against an owner-controlled `TLSA` record in DNSSEC. The RA can manage DNS records via Domain Connect. A compromised RA could issue a fraudulent Identity Certificate and publish a matching fraudulent `TLSA` record, defeating defense-in-depth. | 810 | | **Decision** | Gold Tier status requires strict separation between certificate issuance and DNS attestation. The RA's automated DNS permissions must exclude write access to the `_ans-identity._tls` `TLSA` record. The Agent Owner or AHP manages this DNS record. The RA issues the certificate; the owner publishes its hash in DNS. | 811 | | **Rationale** | This separation maintains independence between the two verification paths required for defense-in-depth. The owner-controlled DNS zone becomes the root of trust for agent identity, protecting against RA compromise or misbehavior. Gold Tier becomes a zero-trust mechanism where the owner continuously affirms trust through a separate secure channel rather than delegating to the RA. | 812 | 813 | ### 7.11 ADR 011: Establishing a canonical registrar identifier (`registrar_id`) for federation 814 | 815 | | Item | Description | 816 | | :--- | :--- | 817 | | **Context** | A competitive ANS/RA marketplace requires unambiguous identification of each compliant RA for federated discovery, verification, and auditing. The `ANSName` must remain registrar-agnostic for agent portability and to prevent vendor lock-in. The operational `ra_id` is too granular. | 818 | | **Decision** | The Registrar ID (`registrar_id`) is a unique, stable, public string assigned to each approved RA (e.g., `ra-prime`). The `registrar_id` is excluded from the `ANSName`. It appears in the `_ra-badge` DNS record (as `registrar`) and in all Transparency Log event payloads to identify the originating RA. | 819 | | **Rationale** | This decouples agent identity from the current registrar. Agents move between RAs by updating DNS records (`url` and `registrar`) without changing the `ANSName`. This preserves identity portability while enabling federated routing and trust verification. The `registrar_id` serves as the Federation Registry primary key for a scalable, auditable multi-provider ecosystem. | 820 | 821 | ## 8.0 Non-functional requirements (NFRs) 822 | 823 | ### 8.1 Operational requirements (performance and availability) 824 | 825 | | Category | ID | Requirement Description | 826 | | :--- | :--- | :--- | 827 | | Availability | NFR-A-01 | The RA service must maintain a minimum uptime of 99.9%. | 828 | | Availability | NFR-A-02 | The ANS Integrity Monitor subsystem must maintain a minimum uptime of 99.9%. | 829 | | Performance| NFR-P-01 | The end-to-end Agent Registration flow must complete in a median time of < 120 seconds. 
| 830 | | Performance| NFR-P-02 | The critical step of sealing a validated entry into the TL must take a median time of < 500 milliseconds. | 831 | | Performance| NFR-P-03 | Identity Certificate revocation requests must be reflected in OCSP/CRL data within a maximum of 5 minutes. | 832 | | Performance| NFR-P-04 | Any unauthorized change to a provisioned DNS record must be detected by the AIM within a maximum of 24 hours. | 833 | | Performance| NFR-P-05 | The Transparency Log must process event batches within 5 seconds under normal load conditions. | 834 | | Performance| NFR-P-06 | Merkle tree append operations must complete in O(log n) time by utilizing cached intermediate nodes. | 835 | | Performance| NFR-P-07 | Inclusion proof generation must complete in < 100ms for any event in the log. | 836 | | Scalability | NFR-S-01 | The system must support scaling to process a minimum of 1,000 full agent registrations per hour. | 837 | | Scalability | NFR-S-06 | The AIM system must be architected to complete a full verification cycle of all N active agents within the time window defined by NFR-P-04. | 838 | | Scalability | NFR-S-07 | The Transparency Log must scale to support billions of events while maintaining sub-second append performance. | 839 | | Event Submission | FR-14 | The TL must validate producer signatures before accepting events. | 840 | | Event Submission | FR-15 | The TL must support idempotent event submission using log_id. | 841 | | Event Submission | FR-16 | The TL must assign globally unique sequence numbers atomically. | 842 | | Key Management | FR-17 | The TL must support producer key registration with validity periods. | 843 | | Key Management | FR-18 | The TL must support zero-downtime key rotation with overlap periods. | 844 | 845 | ### 8.2 Security and auditability requirements 846 | 847 | | Category | ID | Requirement Description | 848 | | :--- | :--- |:--------------------------------------------------------------------------------------------------------------------------| 849 | | Integrity | NFR-S-02 | The TL must be cryptographically protected by a Merkle Tree structure to ensure it is tamper-evident. | 850 | | Integrity | NFR-S-03 | All RA instances must use the same centralized KMS (`kms_key_id`) as the Root of Trust for signing the TL. | 851 | | Auditability| NFR-S-04 | Every entry sealed into the TL must contain the unique Operational Instance ID (`ra_id`) for forensic accountability. | 852 | | Binding | NFR-S-05 | The Private Identity Certificate must enforce the cryptographic binding of the full `ANSName` via a URI SAN. | 853 | | Integrity | NFR-S-06 | All event data must be canonicalized using JCS before hashing to ensure deterministic Merkle tree construction. | 854 | | Integrity | NFR-S-07 | The Transparency Log must enforce strict append-only semantics through hash chaining and sequence validation. | 855 | | Auditability| NFR-S-08 | Every batch in the Transparency Log must include both start and end sequence numbers for complete batch reconstruction. | 856 | | Auditability| NFR-S-09 | Historical Merkle tree states must be preserved to enable consistency proofs between any two points in time. | 857 | | Compliance | NFR-C-01 | The RA's validation process must adhere to all relevant policies for Organization Identity and Domain Control validation. | 858 | | Compliance | NFR-C-02 | The Transparency Log implementation must support RFC 6962-compatible monitoring and auditing interfaces. 
| 859 | | Authentication | NFR-S-10 | All internal API endpoints must require IAM JWT token authentication. | 860 | | Authentication | NFR-S-11 | The ra_id must match the ra_id in requests and registered keys. | 861 | | Integrity | NFR-S-12 | Producer signatures must be validated before event acceptance. | 862 | | Integrity | NFR-S-13 | Events must be immutable once assigned a sequence number. | 863 | | Availability | NFR-S-14 | Key rotation must not cause event submission failures. | 864 | | Auditability| NFR-S-15 | All producer keys must be retained for forensic analysis. | 865 | | Auditability| NFR-S-16 | Failed signature validations must be logged with details. | 866 | 867 | *Note on NFR-S-02 (Integrity): This makes the log tamper-evident. If a malicious actor tries to alter even a single character in a past entry, the final root hash will change completely, providing immediate, mathematical proof of tampering.* 868 | 869 | *Note on NFR-S-04 (Auditability): If a single RA server is ever compromised or discovered to have a bug, auditors can use the `ra_id` to instantly find every single registration that was processed by that specific instance, allowing for precise, surgical remediation.* 870 | 871 | ### 8.3 Failure modes and resilience 872 | The system's expected behavior during key component failures is described below. 873 | 874 | #### Scenario: Extended AHP Unavailability 875 | * **Consequence:** The agent's Identity and Server Certificates will expire, causing ANS-to-ANS mTLS connections to fail and the agent's public endpoint to become inaccessible. 876 | * **RA Role:** The RA will detect the expired status and report the failure of trust on the Dynamic Badge Lander. The RA cannot auto-renew certificates, as the AHP must control its private keys. 877 | 878 | #### Scenario: Extended CA Unavailability 879 | * **Consequence:** All operations requiring new certificate issuance (registrations, renewals, version bumps) will be blocked. 880 | * **RA Role:** The RA will queue or fail the pending requests and report the external dependency failure. Existing, valid agents will remain fully operational. 881 | 882 | #### Scenario: Agent Provider's Domain Name Expiration 883 | * **Consequence:** The agent's public endpoint will become unreachable, and all subsequent Domain Control Validation checks will fail. 884 | * **RA Role:** The RA must treat a persistent Domain Control failure as a security event and update the status of all associated `ANSName`s to `INVALID` or `REVOKED`. 885 | 886 | #### Scenario: DNS Provider or DNS Resolution Unavailability 887 | * **Case 1: DNS Provider API Unavailability:** The RA cannot provision or modify DNS records. New registrations and updates will stall. Existing agents will be unaffected. 888 | * **Case 2: Global DNS Resolution Failure:** This is a catastrophic failure of core internet infrastructure. All DNS-based discovery and endpoint resolution would fail for all agents. The ANS Registry is fundamentally dependent on a functioning global DNS. 889 | 890 | #### Scenario: Service Credential Failure 891 | * **Consequence:** The RA loses the ability to authenticate to a critical external dependency. 892 | * **RA Role:** The RA MUST enter a degraded state where it continues to serve read-only requests (e.g., for the Dynamic Badge Lander) but fails gracefully for any new write operations. New registrations and updates are queued or rejected with clear error messages, and high-priority alerts are sent to system administrators. 
The system MUST support zero-downtime credential rotation to mitigate this failure scenario. 893 | 894 | #### Scenario: Service Credential Failure 895 | * **Consequence:** The RA temporarily loses the ability to communicate with a critical external dependency needed to complete the activation of a registration. 896 | * **RA Role:** The RA MUST be designed for graceful degradation. 897 | * It continues to accept new registration requests, validates them, and places them in a `pending` state. The AHP receives a `202 Accepted` response, confirming the request has been successfully queued. 898 | * The RA's internal, asynchronous workers will then periodically retry the failed operation (e.g., provisioning DNS records). 899 | * If the credential failure is extended beyond a defined SLA, the RA's monitoring system MUST alert system administrators for manual intervention. 900 | * Read-only operations (like the Dynamic Badge Lander) for already-active agents remain fully operational. 901 | 902 | ## 9.0 Open issues / future work 903 | 904 | ### 9.1 Open issues (current limitations) 905 | * **Full DNSSEC Integration:** The RA must perform a full cryptographic check of the DNSSEC chain when validating a domain, not just check for record existence. 906 | * **KMS Key Rotation Strategy:** A formal, operational playbook for zero-downtime rotation of the central KMS signing key is required. The Transparency Log will use a `tree_version` field that increments with each key rotation, allowing verifiers to identify which key to use for historical proof verification. The strategy must address: 907 | * How to maintain a mapping of tree versions to KMS key IDs 908 | * The transition period where both old and new keys may be in use 909 | * How to communicate key rotations to verifiers 910 | * Long-term storage and accessibility of historical public keys 911 | * **Producer Key Backup and Recovery:** A formal procedure for backing up and recovering producer keys in disaster recovery scenarios needs to be defined. This should address: 912 | * Secure backup storage location for producer public keys 913 | * Recovery procedure if the internal key registry is corrupted 914 | * Coordination with RA instances for key re-registration 915 | 916 | ### 9.2 Future work (planned enhancements) 917 | * Develop Component Registries for Standardization: Create and govern formal registries for `ANSName` components (like `capability`) to ensure long-term ecosystem health. 918 | * Granular ANSName Revocation: Introduce mechanisms to revoke only specific components of the `ANSName` (e.g., a single compromised capability) rather than the entire identity. 919 | * Automated Policy Engine: Externalize the RA's validation rules into a separate policy engine to improve maintainability. 920 | * Transparency Log Consistency Proofs: Implement RFC 6962-compliant consistency proofs to enable cryptographic verification that the log has not been tampered with between any two historical states. 921 | * Client SDK/CLI for High-Assurance Verification: Develop a lightweight client library (e.g., an `ans_verifier` package) for the end-to-end, high-assurance verification flow. The SDK handles the ANS-to-ANS mTLS handshake, DNS record lookups, real-time Badge status checks, and the full hash-chain validation (Agent Card and schemas). A simple, high-level function (e.g., `verifier.connect()`) abstracts security-critical complexity for AHP developers. 
922 | * Define Formal Policy for Wildcard Certificates: Develop a comprehensive policy and risk assessment framework for the use of wildcard certificates. The policy must define whether wildcards are permissible for Server and/or Identity Certificates and what specific security controls are required. 923 | * Automated ANSName and Capability Suggestion: Develop an AI-driven feature to inspect an agent's code or documentation and automatically propose a compliant and accurate `ANSName` and capabilities list. 924 | * Automated Credential Rotation: Implement a fully automated, zero-downtime credential rotation mechanism for all external service integrations. 925 | * Develop First-Class Support for ZKP Attestations: Implement the standards and tooling required for Zero-Knowledge Proofs (ZKPs) as a fully supported feature. Define a standard schema for advertising ZKP-enabled capabilities within the Agent Card, enhance RA protocol adapters to validate this schema, and create developer SDKs for ZKP-based agent interactions. 926 | * Evolve to a Federated, Multi-RA Ecosystem: The single-RA model described is the necessary bootstrap phase for the ecosystem. The long-term architectural vision is a competitive, interoperable marketplace of hundreds of compliant RAs, as envisioned in the original IETF and OWASP drafts. This federated state requires the "Federated Trust Manager" mode of the ANS Trust Provisioner (see ADR 009) and governance by a new standards body (e.g., an "ANS Forum" analogous to the CA/B Forum). This body will maintain the central, secure Federation Registry and define the policies for RA compliance, creating an open standard. 927 | * Develop RA-to-RA Federation Protocol: The current architecture enables federated models via client-side trust. A resilient ecosystem requires formal server-side communication between registrars. Future work should define: 928 | * Specialized communication channels: Distinct, prioritized channels for inter-registrar communication: 929 | * Anchor Thread for critical security events (revocations) 930 | * Signal Thread for eventually consistent data (policies, reputation) 931 | * Probe Thread for health monitoring 932 | * Zero-trust identity: Secure federation using cryptographic workload identity standards like SPIFFE, replacing network-based controls with zero-trust security for the registrar network. 933 | * Support generic verifiable claims: Move beyond validating claimed capabilities to verifying how capabilities were built. This extensible approach avoids building support for specific claim types: 934 | * Agent Card extension: Add a generic verifiableClaims array to the Agent Card schema. Each entry contains a type (e.g., "AIBOMv1", "SOC2ComplianceProof"), a hash, and a url. 935 | * Attestation sealing: The RA hashes the entire Agent Card payload, including verifiableClaims, and seals it in the Transparency Log. This allows ecosystem innovation on verifiable evidence types while keeping the RA protocol focused on attestation. 
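To make the verifiable-claims proposal above more concrete, a minimal sketch follows. The `type`, `hash`, and `url` field names mirror the bullet above; the example claim type, URL, and the sorted-key JSON canonicalization are assumptions for illustration only, since this extension is still future work.

```python
# Illustrative sketch only: a hypothetical `verifiableClaims` entry and the
# check a consumer (not the RA) might run against it. The RA itself would
# simply hash and seal the full Agent Card, including this array.
import hashlib
import json

agent_card_fragment = {
    "verifiableClaims": [
        {
            "type": "AIBOMv1",                    # claim type, interpreted by the ecosystem
            "hash": "sha256-0f1e2d...",           # fingerprint of the referenced evidence document
            "url": "https://developer.example.com/evidence/aibom.json",  # hypothetical location
        }
    ]
}


def claim_matches(evidence: bytes, declared_hash: str) -> bool:
    """Check fetched evidence against the hash declared in the Agent Card."""
    return hashlib.sha256(evidence).hexdigest() == declared_hash.removeprefix("sha256-")


# What the RA would seal: a fingerprint of the whole card, verifiableClaims included.
card_fingerprint = hashlib.sha256(
    json.dumps(agent_card_fragment, sort_keys=True, separators=(",", ":")).encode("utf-8")
).hexdigest()
```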
936 | 937 | ### 9.3 Public verification requirements 938 | 939 | The Transparency Log MUST provide the following public verification capabilities: 940 | 941 | * Key Distribution: Public endpoints to retrieve current and historical TL signing keys 942 | * Inclusion Proofs: Mathematical proof that an event exists in the tree at a specific position 943 | * Consistency Proofs: Proof that the tree has grown correctly between two sizes 944 | * Batch Verification: Ability to verify all events within a signed batch 945 | * Complete Records: All log entries include producer signatures as part of the immutable record 946 | 947 | Verification scope: 948 | * External verifiers can mathematically verify tree integrity and inclusion proofs 949 | * Producer signature validation is performed internally by the TL 950 | * The complete event record (including producer signatures) is sealed in the Merkle tree 951 | * Trust in producer validation is inherited from trust in the TL operator 952 | 953 | Verification MUST NOT require: 954 | * Access to producer public keys (internal validation only) 955 | * Authentication for read-only verification operations 956 | * Knowledge of internal RA implementation details 957 | 958 | ### 9.4 Internal API security model 959 | 960 | The TL's internal API implements defense-in-depth security: 961 | 962 | * **Authentication Layers:** 963 | * Layer 1: Network isolation (private VPC/subnet) 964 | * Layer 2: IAM JWT token authentication for all requests 965 | * Layer 3: ra_id validation between signed event and registered keys 966 | 967 | * **Audit Trail:** 968 | * All internal API calls are logged with full request context 969 | * Failed authentication attempts trigger security alerts 970 | * Producer key operations are logged for compliance 971 | 972 | * **Operational Safety:** 973 | * Key revocation includes mandatory reason codes 974 | * Idempotency prevents duplicate event submission 975 | 976 | ## 10.0 Implementation view and technology stack 977 | 978 | ### 10.1 Recommended software architecture 979 | For the internal software design of the RA application, a Hexagonal Architecture (also known as Ports and Adapters) is recommended. This pattern cleanly separates the core domain logic from infrastructure concerns (e.g., databases, external APIs), promoting testability and maintainability. 980 | 981 | ### 10.2 External service integration 982 | The RA must integrate with multiple external services to fulfill its orchestration responsibilities. The authentication methods and purpose for each major external dependency are defined below. All credentials used for these integrations MUST be stored in a secure secret management system and rotated on a regular schedule. 983 | 984 | | Service | Purpose | Authentication Method | 985 | | :--- | :--- | :--- | 986 | | **DNS Provider API** | Provision DNS records | OAuth 2.0 or JWT tokens | 987 | | **Public CA (ACME)** | Issue Server Certificates | ACME protocol | 988 | | **Private CA API** | Issue Identity Certificates | JWT Bearer Token | 989 | | **Transparency Log KMS** | Sign Merkle Tree roots | AWS IAM Instance Role | 990 | 991 | ## Appendix A: Data structure examples 992 | 993 | The following examples show consistent data structures for developers on both the AHP and RA teams. All examples are based on the registration of a single agent: the "Velocity Air Flight Booker." 
994 | 995 | ### A.1 Lifecycle API payload example (registration request) 996 | This is the "superset" JSON payload submitted by an AHP to the RA's `/register` endpoint. It is the primary input to the RA system and contains the transactional intent, identity components, and the full Agent Card embedded within it. Note that this contains a "superset" payload submitted by the AHP to register the multi-protocol agent. The CSRs are shown as truncated strings for brevity. 997 | 998 | ```json 999 | { 1000 | "request_type": "new_registration", 1001 | "fqdn": "support-agent.my-support-co.com", 1002 | "ansName_components": { 1003 | "protocol": "a2a", 1004 | "agentName": "support", 1005 | "primary_capability": "customerService", 1006 | "providerID": "PID-MSC-11", 1007 | "version": "v1.5.0" 1008 | }, 1009 | "server_certificate_csr": "...", 1010 | "identity_certificate_csr": "...", 1011 | "agent_card_payload": { 1012 | "ansName": "a2a://support.customerService.PID-MSC-11.v1.5.0.my-support-co.com", 1013 | "name": "MySupportCo Omni-Channel Agent", 1014 | "description": "Provides customer support via conversational and transactional interfaces.", 1015 | "endpoints": { 1016 | "chat": { "url": "wss://[support-agent.my-support-co.com/a2a](https://support-agent.my-support-co.com/a2a)" }, 1017 | "rest": { "url": "[https://support-agent.my-support-co.com/mcp](https://support-agent.my-support-co.com/mcp)" } 1018 | }, 1019 | "capabilities": [ 1020 | { "name": "lookupOrder", "protocol": "a2a" }, 1021 | { "name": "getTicketStatus", "protocol": "mcp" } 1022 | ], 1023 | "protocolExtensions": { 1024 | "a2a": { 1025 | "version": "1.0", 1026 | "schema": "[https://developer.my-support-co.com/schemas/a2a/v1.json](https://developer.my-support-co.com/schemas/a2a/v1.json)", 1027 | "hash": "sha256-abc123def456..." 1028 | }, 1029 | "mcp": { 1030 | "version": "1.0", 1031 | "schema": "[https://developer.my-support-co.com/schemas/mcp/v1.json](https://developer.my-support-co.com/schemas/mcp/v1.json)", 1032 | "hash": "sha256-abc123def456..." 1033 | } 1034 | } 1035 | } 1036 | } 1037 | ``` 1038 | 1039 | ### A.2 Agent card example 1040 | This is the rich metadata file hosted by the AHP at the URL specified in the _ans DNS record. It is the agent's public "business card." 1041 | 1042 | ```json 1043 | { 1044 | "ansName": "a2a://support.customerService.PID-MSC-11.v1.5.0.my-support-co.com", 1045 | "name": "MySupportCo Omni-Channel Agent", 1046 | "releaseChannel": "stable", 1047 | "description": "Provides customer support via conversational and transactional interfaces.", 1048 | "endpoints": { 1049 | "chat": { 1050 | "url": "wss://[support-agent.my-support-co.com/a2a](https://support-agent.my-support-co.com/a2a)" 1051 | }, 1052 | "rest": { 1053 | "url": "[https://support-agent.my-support-co.com/mcp](https://support-agent.my-support-co.com/mcp)" 1054 | } 1055 | }, 1056 | "securitySchemes": { 1057 | "agentAuth": { 1058 | "type": "mutual_tls", 1059 | "description": "mTLS using the agent's Private Identity Certificate." 1060 | } 1061 | }, 1062 | "capabilities": [ 1063 | { 1064 | "name": "lookupOrder", 1065 | "protocol": "a2a", 1066 | "description": "Looks up order details in a conversational flow." 1067 | }, 1068 | { 1069 | "name": "getTicketStatus", 1070 | "protocol": "mcp", 1071 | "description": "Gets the status of a support ticket transactionally." 
1072 | } 1073 | ], 1074 | "protocolExtensions": { 1075 | "a2a": { 1076 | "version": "1.0", 1077 | "schema": "[https://developer.my-support-co.com/schemas/a2a/v1.json](https://developer.my-support-co.com/schemas/a2a/v1.json)", 1078 | "hash": "sha256-abc123def456..." 1079 | }, 1080 | "mcp": { 1081 | "version": "1.0", 1082 | "schema": "[https://developer.my-support-co.com/schemas/mcp/v1.json](https://developer.my-support-co.com/schemas/mcp/v1.json)", 1083 | "hash": "sha256-abc123def456..." 1084 | } 1085 | } 1086 | } 1087 | ``` 1088 | 1089 | ### A.3 Pub/Sub event payload example 1090 | This is the rich "hybrid event" payload published by the RA upon successful registration. This same JSON object is the data that is cryptographically sealed into the chronological Transparency Log, and made accessible to auditors via the Audit Log Viewer. 1091 | 1092 | **Note on Producer Signatures:** 1093 | - The `producer_signature` field contains the RA instance's signature on the event 1094 | - This signature was validated internally by the TL before accepting the event 1095 | - The signature is included to maintain the complete chain of custody 1096 | - Producer keys are not publicly distributed; trust in validation is part of the TL trust model 1097 | 1098 | ```json 1099 | { 1100 | "log_id": "550e8400-e29b-41d4-a716-446655440000", 1101 | "sequence_number": 1234567, 1102 | "ans_name": "a2a://support.customerService.PID-MSC-11.v1.5.0.my-support-co.com", 1103 | "fqdn": "support-agent.my-support-co.com", 1104 | "agent_card_url": "https://support-agent.my-support-co.com/agent-card.json", 1105 | "agent_state": "registered", 1106 | "producer_signature": "eyJhbGciOiJFUzI1NiIsImtpZCI6InJhMS1wcm9kLWtleS0yMDI0LTAxIn0...", 1107 | "producer_key_id": "ra1-prod-key-2024-01", 1108 | "meta": { 1109 | "description": "Provides customer support via conversational and transactional interfaces.", 1110 | "capabilities": ["lookupOrder", "getTicketStatus"], 1111 | "provider": "MySupportCo", 1112 | "registered_date": "2025-10-05T18:00:00Z", 1113 | "endpoint": "wss://support-agent.my-support-co.com/a2a", 1114 | "validation_type": "acme-dns-01-ov", 1115 | "cert_types": { 1116 | "server": "x509-ov-server", 1117 | "identity": "x509-ov-client-ans" 1118 | }, 1119 | "ra_badge_url": "https://transparency.ra.ansregistry.com/registration/reg-8v2k7x9p" 1120 | }, 1121 | "signature_kms_key_id": "arn:aws:kms:us-east-1:123456789012:key/RootKey-A", 1122 | "signature": "eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9..." 1123 | } 1124 | ``` 1125 | 1126 | ### A.4 RA attestation badge JSON example 1127 | This is the response from the Dynamic Badge Lander UI endpoint. It contains two parts: the signed attestation (which is immutable) and the current Merkle proof (which updates as the tree grows). 
1128 | 1129 | **Structure Design (Nested Pattern):** 1130 | This example shows the complete log entry including the producer signature: 1131 | - The `attestation` object contains the complete event record 1132 | - The `producer_signature` is included as part of the immutable record 1133 | - The `attestation_signature` signs the ENTIRE `attestation` object 1134 | - The `current_proof` contains its own signature that covers only the `root_hash` 1135 | 1136 | ```json 1137 | { 1138 | "attestation": { 1139 | "ra_id": "RA-USEAST1-PROD-03", 1140 | "event_timestamp": "2025-10-05T18:00:00Z", 1141 | "log_id": "reg-multi-protocol", 1142 | "sequence_number": 1234567, 1143 | "ans_name": "a2a://support.customerService.PID-MSC-11.v1.5.0.my-support-co.com", 1144 | "fqdn": "support-agent.my-support-co.com", 1145 | "status": "VERIFIED", 1146 | "attestations": { 1147 | "organization_validation": "success", 1148 | "domain_validation": "acme-dns-01", 1149 | "cert_types": { 1150 | "server": "x509-ov-server", 1151 | "identity": "x509-ov-client-ans" 1152 | }, 1153 | "server_cert_fingerprint": "sha256:a1b2c3d4...", 1154 | "identity_cert_fingerprint": "sha256:e5f6g7h8...", 1155 | "capabilities_hash": "sha256:fedcba98...", 1156 | "dnssec_status": "validated" 1157 | }, 1158 | "producer_signature": "eyJhbGciOiJFUzI1NiIsImtpZCI6IlBST0QtS0VZLTEyMyIsInR5cCI6IkpXVCIsInRzcCI6MTczMzI1MjQwMCwicmFpZCI6IlJBLVVTRUFTVDEtUFJPRC0wMyJ9...", 1159 | "producer_key_id": "ra1-prod-key-2024-01", 1160 | "leaf_hash": "sha256:abc123def456..." 1161 | }, 1162 | "attestation_signature": "eyJhbGciOiJFUzI1NiIsImtpZCI6IlJBLUtFWS00NTYiLCJ0eXAiOiJKV1QiLCJ0c3AiOjE3MzMyNTI0MDAsInJhaWQiOiJSQS1VU0VBU1QxLVBST0QtMDMifQ...", 1163 | "current_proof": { 1164 | "leaf_index": 1234567, 1165 | "tree_size": 9876543, 1166 | "tree_version": 1, 1167 | "path": ["sha256:def456789abc...", "sha256:ghi789012def...", "...up to ~40-47 hashes for very large trees..."], 1168 | "root_hash": "sha256:current1234...", 1169 | "root_signature": "eyJhbGciOiJFUzI1NiIsImtpZCI6ImFybjphd3M6a21zLXVzLWVhc3QtMToxMjM0NTY3ODkwMTI6a2V5L1Jvb3RLZXktQSIsInR5cCI6IkpXVCJ9..." 1170 | } 1171 | } 1172 | ``` 1173 | 1174 | **Detailed Signature Coverage:** 1175 | 1176 | 1. **producer_signature** covers all fields in the attestation EXCEPT: 1177 | - `producer_signature` itself 1178 | - `producer_key_id` 1179 | - `leaf_hash` (which is computed after signing) 1180 | 1181 | 2. **attestation_signature** covers: 1182 | - The ENTIRE `attestation` object as shown (including producer_signature and leaf_hash) 1183 | 1184 | 3. **root_signature** covers: 1185 | - Only the string value of `root_hash` (not the entire proof object) 1186 | 1187 | **Verification Process:** 1188 | 1. Verify `attestation_signature` covers the `attestation` object using the RA's public key 1189 | 2. The `producer_signature` is included for completeness but requires internal keys to verify 1190 | 3. Use the `current_proof` to mathematically verify that `leaf_hash` is included in the tree 1191 | 4. Verify `root_signature` covers the `root_hash` using the TL's public key 1192 | 1193 | **Note:** The producer signature is part of the sealed record but cannot be independently verified without access to internal producer keys. This is by design - external verifiers rely on the TL's operational integrity for producer validation. 
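As a companion to the verification process above, the sketch below recomputes the Merkle root from `leaf_hash` and the `current_proof` fields (step 3). It follows the RFC 6962/9162 audit-path convention, which is an assumption about the tree's concrete hashing rules (NFR-C-02 only mandates RFC 6962-compatible interfaces); hex decoding of the `sha256:`-prefixed strings and the two JWS checks in steps 1 and 4 are left to the caller.

```python
# Illustrative sketch only: recompute the Merkle root for step 3 of the
# verification process. Assumes RFC 6962/9162-style hashing (0x01 prefix for
# interior nodes); the caller decodes the "sha256:<hex>" strings to bytes and
# verifies attestation_signature and root_signature separately.
import hashlib


def _node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf_hash: bytes, leaf_index: int, tree_size: int,
                     path: list[bytes], root_hash: bytes) -> bool:
    """RFC 9162-style inclusion-proof verification against a signed root hash."""
    if leaf_index >= tree_size:
        return False
    fn, sn = leaf_index, tree_size - 1
    node = leaf_hash
    for sibling in path:
        if sn == 0:
            return False
        if fn % 2 == 1 or fn == sn:
            node = _node(sibling, node)          # current node is a right child
            if fn % 2 == 0:
                while fn % 2 == 0 and fn != 0:   # promote past levels where the rightmost node has no sibling
                    fn >>= 1
                    sn >>= 1
        else:
            node = _node(node, sibling)          # current node is a left child
        fn >>= 1
        sn >>= 1
    return sn == 0 and node == root_hash
```

If the recomputed value equals `current_proof.root_hash`, and `root_signature` verifies against the TL's published key, the badge's attestation is provably included in the log.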
1194 | 1195 | **Note on Path Length:** The path contains log₂(tree_size) hashes: 1196 | - 1 billion events ≈ 30 hashes (< 1KB) 1197 | - 1 trillion events ≈ 40 hashes (~1.3KB) 1198 | - 10 trillion events ≈ 44 hashes (~1.4KB) 1199 | 1200 | Even at extreme scale, the path remains a reasonable size to include in responses. 1201 | 1202 | **Tree Version:** The `tree_version` field increments when a new KMS signing key is activated. Verifiers use this to select the correct historical key for old proofs. The system supports key rotation without invalidating existing proofs, maintaining a clear audit trail of key changes. 1203 | 1204 | ### A.5 ANS integrity monitor failure report example 1205 | This is an example of the payload published by an AIM worker when it detects an integrity failure. This event is consumed by the Remediation Service. 1206 | 1207 | ```json 1208 | { 1209 | "event_type": "integrity_failure_detected", 1210 | "event_timestamp": "2025-10-06T11:00:00Z", 1211 | "worker_id": "aim-worker-ap-southeast-2-def456", 1212 | "fqdn": "support-agent.my-support-co.com", 1213 | "ans_name": "a2a://support.customerService.PID-MSC-11.v1.5.0.my-support-co.com", 1214 | "check": { 1215 | "record_type": "_ans", 1216 | "failure_type": "MISMATCH", 1217 | "expected_value": "v=ans1; url=[https://support-agent.my-support-co.com/agent-card.json](https://support-agent.my-support-co.com/agent-card.json)", 1218 | "actual_value": "v=ans1; url=[https://malicious-site.com/evil-card.json](https://malicious-site.com/evil-card.json)" 1219 | } 1220 | } 1221 | ``` 1222 | 1223 | ### A.6 Lifecycle API payload example (revocation request) 1224 | This is an example of the JSON payload submitted by an AHP to the RA's /revoke endpoint. It is a simple, direct instruction. 1225 | 1226 | ```json 1227 | { 1228 | "request_type": "agent_revocation", 1229 | "ans_name": "a2a://support.customerService.PID-MSC-11.v1.5.0.my-support-co.com", 1230 | "reason_code": "DECOMMISSIONED", 1231 | "reason_description": "Service is being retired." 1232 | } 1233 | ``` 1234 | 1235 | ### A.7 Pub/Sub event payload example (revocation event) 1236 | This is the resulting event published by the RA after successfully processing the revocation request and sealing it into the log. 1237 | 1238 | **Signature Scope:** Same as A.3 - the `signature` field signs all fields EXCEPT `signature` and `signature_kms_key_id`. 1239 | 1240 | ```json 1241 | { 1242 | "fqdn": "support-agent.my-support-co.com", 1243 | "agent_card_url": "https://support-agent.my-support-co.com/agent-card.json", 1244 | "ans_name": "a2a://support.customerService.PID-MSC-11.v1.5.0.my-support-co.com", 1245 | "agent_state": "revoked", 1246 | "meta": { 1247 | "description": "Provides customer support via conversational and transactional interfaces.", 1248 | "capabilities": ["lookupOrder", "getTicketStatus"], 1249 | "provider": "MySupportCo", 1250 | "registered_date": "2025-10-05T18:00:00Z", 1251 | "event_timestamp": "2025-11-20T14:00:00Z", 1252 | "endpoint": "wss://support-agent.my-support-co.com/a2a", 1253 | "validation_type": "acme-dns-01-ov", 1254 | "cert_types": { 1255 | "server": "x509-ov-server", 1256 | "identity": "x509-ov-client-ans" 1257 | }, 1258 | "revocation_reason": "DECOMMISSIONED", 1259 | "ra_badge_url": "https://transparency.ra.ansregistry.com/registration/reg-8v2k7x9p" 1260 | }, 1261 | "signature_kms_key_id": "arn:aws:kms:us-east-1:123456789012:key/RootKey-A", 1262 | "signature": "eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9..." 
1263 | } 1264 | ``` 1265 | 1266 | ### A.8 DNS record examples 1267 | This is an example of the suite of DNS records provisioned by the RA for the registered agent. 1268 | 1269 | ```DNS Zone file 1270 | $ORIGIN support-agent.my-support-co.com. 1271 | $TTL 3600 1272 | 1273 | ; --- Core Location Records (Managed by AHP) --- 1274 | @ IN A 203.0.113.50 1275 | @ IN AAAA 2001:db8::50 1276 | 1277 | ; --- (Records below provisioned by RA) --- 1278 | ; Specifies that the service supports HTTP/2 ("h2") 1279 | @ IN HTTPS 1 . alpn=h2 1280 | 1281 | ; Binds the TLS certificate to the DNS name via DNSSEC 1282 | _443._tcp IN TLSA 3 1 1 1283 | 1284 | ; Points to the AHP-hosted Agent Card 1285 | _ans IN TXT "v=ans1; url=https://support-agent.my-support-co.com/agent-card.json" 1286 | 1287 | ; Points to the RA-hosted Dynamic Badge Lander 1288 | _ra-badge IN TXT "v=ra-badge1; url=https://transparency.ra.ansregistry.com/registration/reg-8v2k7x9p" 1289 | 1290 | ; --- Gold Tier Addition (Requires DNSSEC) --- 1291 | ; For maximum, financial-grade security, Gold Tier agents add one more record. 1292 | ; CRITICAL: Per ADR 010, this record MUST be managed exclusively by the agent 1293 | ; owner (or their AHP), not the RA. It acts as the owner's final, independent 1294 | ; attestation, ensuring that even a compromised RA cannot forge an agent's 1295 | ; identity because it cannot create a matching DNS record. 1296 | _ans-identity._tls IN TLSA 3 1 1 1297 | ``` 1298 | 1299 | ### A.9 Producer key registration example 1300 | 1301 | Example of a producer (RA instance) registering a public key: 1302 | 1303 | **Request:** 1304 | ```http 1305 | POST /internal/v1/producer-keys 1306 | Content-Type: application/json 1307 | 1308 | { 1309 | "key_id": "ra1-prod-key-2024-01", 1310 | "public_key_pem": "-----BEGIN PUBLIC KEY-----\nMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE...", 1311 | "algorithm": "ES256", 1312 | "ra_id": "ra-instance-prod-us-east-1", 1313 | "valid_from": "2024-01-15T00:00:00Z", 1314 | "expires_at": "2025-01-15T00:00:00Z", 1315 | "metadata": { 1316 | "environment": "production", 1317 | "region": "us-east-1", 1318 | "rotation_of": "ra1-prod-key-2023-01" 1319 | } 1320 | } 1321 | ``` 1322 | 1323 | **Response:** 1324 | ```json 1325 | { 1326 | "key_id": "ra1-prod-key-2024-01", 1327 | "status": "active", 1328 | "fingerprint": "sha256:a1b2c3d4...", 1329 | "created_at": "2024-01-15T00:00:00Z" 1330 | } 1331 | ``` --------------------------------------------------------------------------------