├── assets ├── citadel.png ├── AIAC-1.1.0.png └── AIAC-Governance-1.1.0.png ├── CHANGELOG.md ├── .github ├── CODE_OF_CONDUCT.md ├── ISSUE_TEMPLATE.md └── PULL_REQUEST_TEMPLATE.md ├── LICENSE.md ├── CONTRIBUTING.md ├── .gitignore ├── README.md ├── Citadel-WAF-Alignment.md └── CITADEL-TECHNICAL-GUIDE.md /assets/citadel.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/foundry-citadel-platform/HEAD/assets/citadel.png -------------------------------------------------------------------------------- /assets/AIAC-1.1.0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/foundry-citadel-platform/HEAD/assets/AIAC-1.1.0.png -------------------------------------------------------------------------------- /assets/AIAC-Governance-1.1.0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/foundry-citadel-platform/HEAD/assets/AIAC-Governance-1.1.0.png -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | ## [project-title] Changelog 2 | 3 | 4 | # x.y.z (yyyy-mm-dd) 5 | 6 | *Features* 7 | * ... 8 | 9 | *Bug Fixes* 10 | * ... 11 | 12 | *Breaking Changes* 13 | * ... 14 | -------------------------------------------------------------------------------- /.github/CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Microsoft Open Source Code of Conduct 2 | 3 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 4 | 5 | Resources: 6 | 7 | - [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/) 8 | - [Microsoft Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) 9 | - Contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or concerns 10 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | 4 | > Please provide us with the following information: 5 | > --------------------------------------------------------------- 6 | 7 | ### This issue is for a: (mark with an `x`) 8 | ``` 9 | - [ ] bug report -> please search issues before submitting 10 | - [ ] feature request 11 | - [ ] documentation issue or request 12 | - [ ] regression (a behavior that used to work and stopped in a new release) 13 | ``` 14 | 15 | ### Minimal steps to reproduce 16 | > 17 | 18 | ### Any log messages given by the failure 19 | > 20 | 21 | ### Expected/desired behavior 22 | > 23 | 24 | ### OS and Version? 25 | > Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?) 26 | 27 | ### Versions 28 | > 29 | 30 | ### Mention any other details that might be useful 31 | 32 | > --------------------------------------------------------------- 33 | > Thanks! We'll be in touch soon. 34 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | ## Purpose 2 | 3 | * ... 4 | 5 | ## Does this introduce a breaking change? 
6 | 7 | ``` 8 | [ ] Yes 9 | [ ] No 10 | ``` 11 | 12 | ## Pull Request Type 13 | What kind of change does this Pull Request introduce? 14 | 15 | 16 | ``` 17 | [ ] Bugfix 18 | [ ] Feature 19 | [ ] Code style update (formatting, local variables) 20 | [ ] Refactoring (no functional changes, no api changes) 21 | [ ] Documentation content changes 22 | [ ] Other... Please describe: 23 | ``` 24 | 25 | ## How to Test 26 | * Get the code 27 | 28 | ``` 29 | git clone [repo-address] 30 | cd [repo-name] 31 | git checkout [branch-name] 32 | npm install 33 | ``` 34 | 35 | * Test the code 36 | 37 | ``` 38 | ``` 39 | 40 | ## What to Check 41 | Verify that the following are valid 42 | * ... 43 | 44 | ## Other Information 45 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) Microsoft Corporation. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing to [project-title] 2 | 3 | This project welcomes contributions and suggestions. Most contributions require you to agree to a 4 | Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us 5 | the rights to use your contribution. For details, visit [Contributor License Agreements](https://cla.opensource.microsoft.com). 6 | 7 | When you submit a pull request, a CLA bot will automatically determine whether you need to provide 8 | a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions 9 | provided by the bot. You will only need to do this once across all repos using our CLA. 10 | 11 | - [Code of Conduct](#coc) 12 | - [Issues and Bugs](#issue) 13 | - [Feature Requests](#feature) 14 | - [Submission Guidelines](#submit) 15 | 16 | ## Code of Conduct 17 | Help us keep this project open and inclusive. Please read and follow our [Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 18 | 19 | ## Found an Issue? 20 | If you find a bug in the source code or a mistake in the documentation, you can help us by 21 | [submitting an issue](#submit-issue) to the GitHub Repository. Even better, you can 22 | [submit a Pull Request](#submit-pr) with a fix. 23 | 24 | ## Want a Feature? 
25 | You can *request* a new feature by [submitting an issue](#submit-issue) to the GitHub 26 | Repository. If you would like to *implement* a new feature, please submit an issue with 27 | a proposal for your work first, to be sure that we can use it. 28 | 29 | * **Small Features** can be crafted and directly [submitted as a Pull Request](#submit-pr). 30 | 31 | ## Submission Guidelines 32 | 33 | ### Submitting an Issue 34 | Before you submit an issue, search the archive, maybe your question was already answered. 35 | 36 | If your issue appears to be a bug, and hasn't been reported, open a new issue. 37 | Help us to maximize the effort we can spend fixing issues and adding new 38 | features, by not reporting duplicate issues. Providing the following information will increase the 39 | chances of your issue being dealt with quickly: 40 | 41 | * **Overview of the Issue** - if an error is being thrown a non-minified stack trace helps 42 | * **Version** - what version is affected (e.g. 0.1.2) 43 | * **Motivation for or Use Case** - explain what are you trying to do and why the current behavior is a bug for you 44 | * **Browsers and Operating System** - is this a problem with all browsers? 45 | * **Reproduce the Error** - provide a live example or a unambiguous set of steps 46 | * **Related Issues** - has a similar issue been reported before? 47 | * **Suggest a Fix** - if you can't fix the bug yourself, perhaps you can point to what might be 48 | causing the problem (line of code or commit) 49 | 50 | You can file new issues by providing the above information at the corresponding repository's issues link: 51 | replace`[organization-name]` and `[repository-name]` in 52 | `https://github.com/[organization-name]/[repository-name]/issues/new` . 53 | 54 | ### Submitting a Pull Request (PR) 55 | Before you submit your Pull Request (PR) consider the following guidelines: 56 | 57 | * Search the repository's [pull requests](https://github.com/[organization-name]/[repository-name]/pulls) for an open or closed PR 58 | that relates to your submission. You don't want to duplicate effort. 59 | 60 | * Make your changes in a new git fork: 61 | 62 | * Commit your changes using a descriptive commit message 63 | * Push your fork to GitHub: 64 | * In GitHub, create a pull request 65 | * If we suggest changes then: 66 | * Make the required updates. 67 | * Rebase your fork and force push to your GitHub repository (this will update your Pull Request): 68 | 69 | ```shell 70 | git rebase main -i 71 | git push -f 72 | ``` 73 | 74 | That's it! Thank you for your contribution! 75 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | ## Ignore Visual Studio temporary files, build results, and 2 | ## files generated by popular Visual Studio add-ons. 
3 | ## 4 | ## Get latest from https://github.com/github/gitignore/blob/main/VisualStudio.gitignore 5 | 6 | # User-specific files 7 | *.rsuser 8 | *.suo 9 | *.user 10 | *.userosscache 11 | *.sln.docstates 12 | *.env 13 | 14 | # User-specific files (MonoDevelop/Xamarin Studio) 15 | *.userprefs 16 | 17 | # Mono auto generated files 18 | mono_crash.* 19 | 20 | # Build results 21 | [Dd]ebug/ 22 | [Dd]ebugPublic/ 23 | [Rr]elease/ 24 | [Rr]eleases/ 25 | x64/ 26 | x86/ 27 | [Ww][Ii][Nn]32/ 28 | [Aa][Rr][Mm]/ 29 | [Aa][Rr][Mm]64/ 30 | [Aa][Rr][Mm]64[Ee][Cc]/ 31 | bld/ 32 | [Oo]bj/ 33 | [Oo]ut/ 34 | [Ll]og/ 35 | [Ll]ogs/ 36 | 37 | # Build results on 'Bin' directories 38 | **/[Bb]in/* 39 | # Uncomment if you have tasks that rely on *.refresh files to move binaries 40 | # (https://github.com/github/gitignore/pull/3736) 41 | #!**/[Bb]in/*.refresh 42 | 43 | # Visual Studio 2015/2017 cache/options directory 44 | .vs/ 45 | # Uncomment if you have tasks that create the project's static files in wwwroot 46 | #wwwroot/ 47 | 48 | # Visual Studio 2017 auto generated files 49 | Generated\ Files/ 50 | 51 | # MSTest test Results 52 | [Tt]est[Rr]esult*/ 53 | [Bb]uild[Ll]og.* 54 | *.trx 55 | 56 | # NUnit 57 | *.VisualState.xml 58 | TestResult.xml 59 | nunit-*.xml 60 | 61 | # Approval Tests result files 62 | *.received.* 63 | 64 | # Build Results of an ATL Project 65 | [Dd]ebugPS/ 66 | [Rr]eleasePS/ 67 | dlldata.c 68 | 69 | # Benchmark Results 70 | BenchmarkDotNet.Artifacts/ 71 | 72 | # .NET Core 73 | project.lock.json 74 | project.fragment.lock.json 75 | artifacts/ 76 | 77 | # ASP.NET Scaffolding 78 | ScaffoldingReadMe.txt 79 | 80 | # StyleCop 81 | StyleCopReport.xml 82 | 83 | # Files built by Visual Studio 84 | *_i.c 85 | *_p.c 86 | *_h.h 87 | *.ilk 88 | *.meta 89 | *.obj 90 | *.idb 91 | *.iobj 92 | *.pch 93 | *.pdb 94 | *.ipdb 95 | *.pgc 96 | *.pgd 97 | *.rsp 98 | # but not Directory.Build.rsp, as it configures directory-level build defaults 99 | !Directory.Build.rsp 100 | *.sbr 101 | *.tlb 102 | *.tli 103 | *.tlh 104 | *.tmp 105 | *.tmp_proj 106 | *_wpftmp.csproj 107 | *.log 108 | *.tlog 109 | *.vspscc 110 | *.vssscc 111 | .builds 112 | *.pidb 113 | *.svclog 114 | *.scc 115 | 116 | # Chutzpah Test files 117 | _Chutzpah* 118 | 119 | # Visual C++ cache files 120 | ipch/ 121 | *.aps 122 | *.ncb 123 | *.opendb 124 | *.opensdf 125 | *.sdf 126 | *.cachefile 127 | *.VC.db 128 | *.VC.VC.opendb 129 | 130 | # Visual Studio profiler 131 | *.psess 132 | *.vsp 133 | *.vspx 134 | *.sap 135 | 136 | # Visual Studio Trace Files 137 | *.e2e 138 | 139 | # TFS 2012 Local Workspace 140 | $tf/ 141 | 142 | # Guidance Automation Toolkit 143 | *.gpState 144 | 145 | # ReSharper is a .NET coding add-in 146 | _ReSharper*/ 147 | *.[Rr]e[Ss]harper 148 | *.DotSettings.user 149 | 150 | # TeamCity is a build add-in 151 | _TeamCity* 152 | 153 | # DotCover is a Code Coverage Tool 154 | *.dotCover 155 | 156 | # AxoCover is a Code Coverage Tool 157 | .axoCover/* 158 | !.axoCover/settings.json 159 | 160 | # Coverlet is a free, cross platform Code Coverage Tool 161 | coverage*.json 162 | coverage*.xml 163 | coverage*.info 164 | 165 | # Visual Studio code coverage results 166 | *.coverage 167 | *.coveragexml 168 | 169 | # NCrunch 170 | _NCrunch_* 171 | .NCrunch_* 172 | .*crunch*.local.xml 173 | nCrunchTemp_* 174 | 175 | # MightyMoose 176 | *.mm.* 177 | AutoTest.Net/ 178 | 179 | # Web workbench (sass) 180 | .sass-cache/ 181 | 182 | # Installshield output folder 183 | [Ee]xpress/ 184 | 185 | # DocProject is a documentation generator add-in 186 | 
DocProject/buildhelp/ 187 | DocProject/Help/*.HxT 188 | DocProject/Help/*.HxC 189 | DocProject/Help/*.hhc 190 | DocProject/Help/*.hhk 191 | DocProject/Help/*.hhp 192 | DocProject/Help/Html2 193 | DocProject/Help/html 194 | 195 | # Click-Once directory 196 | publish/ 197 | 198 | # Publish Web Output 199 | *.[Pp]ublish.xml 200 | *.azurePubxml 201 | # Note: Comment the next line if you want to checkin your web deploy settings, 202 | # but database connection strings (with potential passwords) will be unencrypted 203 | *.pubxml 204 | *.publishproj 205 | 206 | # Microsoft Azure Web App publish settings. Comment the next line if you want to 207 | # checkin your Azure Web App publish settings, but sensitive information contained 208 | # in these scripts will be unencrypted 209 | PublishScripts/ 210 | 211 | # NuGet Packages 212 | *.nupkg 213 | # NuGet Symbol Packages 214 | *.snupkg 215 | # The packages folder can be ignored because of Package Restore 216 | **/[Pp]ackages/* 217 | # except build/, which is used as an MSBuild target. 218 | !**/[Pp]ackages/build/ 219 | # Uncomment if necessary however generally it will be regenerated when needed 220 | #!**/[Pp]ackages/repositories.config 221 | # NuGet v3's project.json files produces more ignorable files 222 | *.nuget.props 223 | *.nuget.targets 224 | 225 | # Microsoft Azure Build Output 226 | csx/ 227 | *.build.csdef 228 | 229 | # Microsoft Azure Emulator 230 | ecf/ 231 | rcf/ 232 | 233 | # Windows Store app package directories and files 234 | AppPackages/ 235 | BundleArtifacts/ 236 | Package.StoreAssociation.xml 237 | _pkginfo.txt 238 | *.appx 239 | *.appxbundle 240 | *.appxupload 241 | 242 | # Visual Studio cache files 243 | # files ending in .cache can be ignored 244 | *.[Cc]ache 245 | # but keep track of directories ending in .cache 246 | !?*.[Cc]ache/ 247 | 248 | # Others 249 | ClientBin/ 250 | ~$* 251 | *~ 252 | *.dbmdl 253 | *.dbproj.schemaview 254 | *.jfm 255 | *.pfx 256 | *.publishsettings 257 | orleans.codegen.cs 258 | 259 | # Including strong name files can present a security risk 260 | # (https://github.com/github/gitignore/pull/2483#issue-259490424) 261 | #*.snk 262 | 263 | # Since there are multiple workflows, uncomment next line to ignore bower_components 264 | # (https://github.com/github/gitignore/pull/1529#issuecomment-104372622) 265 | #bower_components/ 266 | 267 | # RIA/Silverlight projects 268 | Generated_Code/ 269 | 270 | # Backup & report files from converting an old project file 271 | # to a newer Visual Studio version. Backup files are not needed, 272 | # because we have git ;-) 273 | _UpgradeReport_Files/ 274 | Backup*/ 275 | UpgradeLog*.XML 276 | UpgradeLog*.htm 277 | ServiceFabricBackup/ 278 | *.rptproj.bak 279 | 280 | # SQL Server files 281 | *.mdf 282 | *.ldf 283 | *.ndf 284 | 285 | # Business Intelligence projects 286 | *.rdl.data 287 | *.bim.layout 288 | *.bim_*.settings 289 | *.rptproj.rsuser 290 | *- [Bb]ackup.rdl 291 | *- [Bb]ackup ([0-9]).rdl 292 | *- [Bb]ackup ([0-9][0-9]).rdl 293 | 294 | # Microsoft Fakes 295 | FakesAssemblies/ 296 | 297 | # GhostDoc plugin setting file 298 | *.GhostDoc.xml 299 | 300 | # Node.js Tools for Visual Studio 301 | .ntvs_analysis.dat 302 | node_modules/ 303 | 304 | # Visual Studio 6 build log 305 | *.plg 306 | 307 | # Visual Studio 6 workspace options file 308 | *.opt 309 | 310 | # Visual Studio 6 auto-generated workspace file (contains which files were open etc.) 311 | *.vbw 312 | 313 | # Visual Studio 6 auto-generated project file (contains which files were open etc.) 
314 | *.vbp 315 | 316 | # Visual Studio 6 workspace and project file (working project files containing files to include in project) 317 | *.dsw 318 | *.dsp 319 | 320 | # Visual Studio 6 technical files 321 | *.ncb 322 | *.aps 323 | 324 | # Visual Studio LightSwitch build output 325 | **/*.HTMLClient/GeneratedArtifacts 326 | **/*.DesktopClient/GeneratedArtifacts 327 | **/*.DesktopClient/ModelManifest.xml 328 | **/*.Server/GeneratedArtifacts 329 | **/*.Server/ModelManifest.xml 330 | _Pvt_Extensions 331 | 332 | # Paket dependency manager 333 | **/.paket/paket.exe 334 | paket-files/ 335 | 336 | # FAKE - F# Make 337 | **/.fake/ 338 | 339 | # CodeRush personal settings 340 | **/.cr/personal 341 | 342 | # Python Tools for Visual Studio (PTVS) 343 | **/__pycache__/ 344 | *.pyc 345 | 346 | # Cake - Uncomment if you are using it 347 | #tools/** 348 | #!tools/packages.config 349 | 350 | # Tabs Studio 351 | *.tss 352 | 353 | # Telerik's JustMock configuration file 354 | *.jmconfig 355 | 356 | # BizTalk build output 357 | *.btp.cs 358 | *.btm.cs 359 | *.odx.cs 360 | *.xsd.cs 361 | 362 | # OpenCover UI analysis results 363 | OpenCover/ 364 | 365 | # Azure Stream Analytics local run output 366 | ASALocalRun/ 367 | 368 | # MSBuild Binary and Structured Log 369 | *.binlog 370 | MSBuild_Logs/ 371 | 372 | # AWS SAM Build and Temporary Artifacts folder 373 | .aws-sam 374 | 375 | # NVidia Nsight GPU debugger configuration file 376 | *.nvuser 377 | 378 | # MFractors (Xamarin productivity tool) working folder 379 | **/.mfractor/ 380 | 381 | # Local History for Visual Studio 382 | **/.localhistory/ 383 | 384 | # Visual Studio History (VSHistory) files 385 | .vshistory/ 386 | 387 | # BeatPulse healthcheck temp database 388 | healthchecksdb 389 | 390 | # Backup folder for Package Reference Convert tool in Visual Studio 2017 391 | MigrationBackup/ 392 | 393 | # Ionide (cross platform F# VS Code tools) working folder 394 | **/.ionide/ 395 | 396 | # Fody - auto-generated XML schema 397 | FodyWeavers.xsd 398 | 399 | # VS Code files for those working on multiple tools 400 | .vscode/* 401 | !.vscode/settings.json 402 | !.vscode/tasks.json 403 | !.vscode/launch.json 404 | !.vscode/extensions.json 405 | !.vscode/*.code-snippets 406 | 407 | # Local History for Visual Studio Code 408 | .history/ 409 | 410 | # Built Visual Studio Code Extensions 411 | *.vsix 412 | 413 | # Windows Installer files from build outputs 414 | *.cab 415 | *.msi 416 | *.msix 417 | *.msm 418 | *.msp 419 | 420 | # Local files 421 | local/** -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Executive Summary: AI Governance at Speed 2 | 3 | ## Bridging Governance Requirements and Developer Velocity with Foundry Citadel Platform 4 | 5 | --- 6 | 7 | ## The AI Governance Imperative 8 | 9 | As AI systems become more powerful and integrated into everyday life, **governance is no longer a "nice-to-have"; it's a must**. Whether you're aligning to emerging regulations like the EU AI Act, meeting internal standards for risk and safety, or ensuring your AI systems are meeting your enterprise's business goals with scale and efficiency, the ability to govern AI responsibly at speed is a game-changer. 10 | 11 | --- 12 | 13 | ## The Governance-Velocity Paradox 14 | 15 | Yet, **governance and developer velocity often feel fundamentally misaligned**. 
Organizations face critical bottlenecks: 16 | 17 | - **Manual Risk Assessments**: Frequently time-consuming and lacking standardization 18 | - **Scattered Evaluation Tools**: Fragmented across different teams and systems 19 | - **Unclear Governance Requirements**: Ambiguous policies that are difficult to operationalize 20 | - **Implementation Gaps**: Policies rarely map cleanly to real-world technical implementation 21 | 22 | **The result?** Bottlenecks and delays that frustrate both governance teams and developers, slowing AI adoption and increasing organizational risk. 23 | 24 | --- 25 | 26 | ## The Collaboration Challenge 27 | 28 | Effective AI governance demands a new balance—**one that enforces oversight without impeding innovation**. It also requires multiple stakeholders collaborating effectively with each other: 29 | 30 | ### 👔 **Compliance Officers & Chief AI Officers** 31 | Must determine **what needs to be assessed** to comply with company policies and regulations 32 | 33 | ### 👨‍💻 **AI Developers & Engineering Teams** 34 | Need to **operationalize these requirements** by generating the right qualitative and quantitative evidence 35 | 36 | **Unfortunately**, the handshake between these personas is often not smooth and can create friction in the governance process. Traditional methods tend to create friction, slowing down deployment or leading to incomplete compliance. **It's a trade-off most organizations can no longer afford.** 37 | 38 | --- 39 | 40 | ## The Foundry Citadel Platform Solution 41 | 42 | **That's where Foundry Citadel Platform steps in.** 43 | 44 | Foundry Citadel Platform is a comprehensive solution accelerator that bridges the gap between governance requirements and technical implementation, enabling organizations to: 45 | 46 | ### 🛡️ **Govern AI Responsibly** 47 | - **Unified AI Gateway**: Single control point for all AI model access with enterprise-wide policy enforcement 48 | - **Automated Compliance**: Built-in safety checks, content filtering, and policy validation without manual intervention 49 | - **Central AI Registry**: Catalog and govern all AI assets—models, agents, and tools—across the enterprise 50 | 51 | ### 📊 **Maintain Complete Visibility** 52 | - **Platform-Level Observability**: Centralized monitoring across all AI workloads without code changes 53 | - **Agent-Level Tracing**: Detailed execution paths for debugging and quality assurance 54 | - **Automated Evaluations**: Continuous quality, safety, and compliance assessments applied consistently 55 | 56 | ### 🚀 **Accelerate Innovation** 57 | - **Pre-built Templates**: One-click deployment of secure, governed AI environments 58 | - **Flexible Development Options**: From low-code (Azure Logic Apps Agent Loop), managed agents runtime (AI Foundry Agents) to pro-code (Microsoft Agent Framework, LangChain,...) 
59 | - **DevOps Integration**: CI/CD pipelines with automated testing and evaluation 60 | 61 | --- 62 | 63 | ## Key Business Outcomes 64 | 65 | Organizations adopting Foundry Citadel Platform achieve: 66 | 67 | | Outcome | Impact | 68 | |---------|--------| 69 | | **🎯 Faster Time-to-Value** | Deploy AI solutions in days, not months, with pre-configured infrastructure | 70 | | **🔒 Reduced Risk** | Automated governance ensures compliance from day one | 71 | | **💰 Cost Control** | Granular usage tracking and quota enforcement per team/project | 72 | | **📈 Scalable Adoption** | Repeatable patterns that grow with your organization | 73 | | **🤝 Cross-Functional Alignment** | Clear contracts between governance and development teams | 74 | | **🌐 Universal Gateway & Registry** | Unified access, governance and discovery of central AI assets | 75 | 76 | --- 77 | 78 | ## Enterprise Statistics Driving Citadel Adoption 79 | 80 | Real-world enterprise challenges that Citadel addresses: 81 | 82 | - **62%** of practitioners cite **security concerns** as the top blocker to wider AI adoption 83 | - **71%** of enterprises struggle to **track AI usage, enforce quotas, and report costs** per team 84 | - **47%** of organizations require **explicit guardrails** before deploying autonomous AI agents safely 85 | - **70%** of customers need an **AI registry** for LLMs, agents, and tools to adopt AI at scale 86 | 87 | --- 88 | 89 | ## The Three Pillars of Foundry Citadel Platform 90 | 91 | ### 1️⃣ **Governance & Security** – *Trustworthy AI Operations at Scale* 92 | Without centralized AI governance, organizations face unpredictable costs, reliability issues, security risks, and compliance nightmares. Citadel builds guardrails into every AI call through: 93 | - Unified AI Gateway for centralized control 94 | - Granular access control and key management 95 | - Multi-cloud and hybrid support 96 | - AI content safety and prompt shields 97 | - Central AI registry for agents and tools 98 | 99 | ### 2️⃣ **Observability & Compliance** – *End-to-End Monitoring, Evaluation & Trust* 100 | Full visibility creates trust and confidence. Citadel provides holistic observability through: 101 | - **Platform-Level**: Centralized APM, usage tracking, automated evaluations, and enterprise alerting 102 | - **Agent-Level**: Detailed execution traces, performance monitoring, and debugging tools 103 | - **Rich Dashboards**: Integrated views for both operational and development teams 104 | 105 | ### 3️⃣ **AI Development Velocity** – *Accelerating Innovation with Templates & Tools* 106 | Build fast, build right. 
Citadel empowers teams to innovate quickly within established guardrails:
107 | - Pre-built deployment templates
108 | - Flexible agent development options (low-code to pro-code)
109 | - Citadel AI Registry for asset discovery and reuse
110 | - DevOps integration for continuous delivery
111 | 
112 | ---
113 | 
114 | ## Two-Tier Architecture for Enterprise Scale
115 | 
116 | ### **Citadel Governance Hub (CGH)** – Central Control Plane
117 | The enterprise-wide governance layer providing:
118 | - Unified AI gateway for centralized access to all AI models and MCP tools
119 | - Universal AI registry for discovery and cataloging
120 | - Platform-level evaluations and compliance reporting
121 | - Usage analytics and cost allocation
122 | - Centralized security and safety enforcement
123 | 
124 | ### **Citadel Agent Spoke (CAS)** – Domain-Specific Deployments
125 | Secure, isolated environments for AI agent workloads featuring:
126 | - Azure AI Foundry with agent capabilities
127 | - Comprehensive AI services (Search, Cosmos DB, Storage)
128 | - Zero Trust architecture with private endpoints
129 | - Auto-scaling container infrastructure
130 | - Hub-spoke integration with enterprise networks
131 | 
132 | ---
133 | 
134 | ## From Challenge to Solution: The Citadel Advantage
135 | 
136 | | Traditional Approach | Foundry Citadel Platform |
137 | |---------------------|-------------------------|
138 | | ❌ Manual risk assessments | ✅ Automated compliance checks |
139 | | ❌ Scattered evaluation tools | ✅ Unified observability platform |
140 | | ❌ Unclear requirements | ✅ Codified governance contracts |
141 | | ❌ Implementation gaps | ✅ Pre-built, proven patterns |
142 | | ❌ Friction between teams | ✅ Streamlined collaboration |
143 | | ❌ Slow deployment cycles | ✅ Rapid, repeatable deployments |
144 | 
145 | ---
146 | 
147 | ## Strategic Partnerships & Integrations
148 | 
149 | Foundry Citadel Platform bridges the gap between governance requirements and technical implementation through strategic integrations:
150 | 
151 | - **Azure AI Foundry**: Enterprise AI platform with advanced model catalog, managed agent services, and AI evaluations/observability
152 | - **Azure API Management**: Unified AI gateway for governance and policy enforcement
153 | - **Azure Monitor & Application Insights**: Comprehensive observability
154 | - **Azure Content Safety**: Automated GenAI safety checks and content filtering
155 | - **Microsoft Entra ID**: Identity and access management
156 | - **Microsoft Defender for AI**: Threat detection and security monitoring
157 | - **Microsoft Purview**: Data governance and sensitivity labeling
158 | 
159 | ---
160 | 
161 | ## The Bottom Line
162 | 
163 | **Effective AI governance no longer means choosing between speed and safety.**
164 | 
165 | Foundry Citadel Platform enables organizations to:
166 | - ✅ **Deploy AI with confidence** – knowing governance is built-in, not bolted-on
167 | - ✅ **Scale AI responsibly** – with consistent policies across all projects
168 | - ✅ **Accelerate innovation** – within secure, compliant guardrails
169 | - ✅ **Bridge organizational silos** – aligning governance, development, and operations
170 | 
171 | ---
172 | 
173 | ## Call to Action
174 | 
175 | The challenge of governing AI at speed is precisely why Foundry Citadel Platform exists.
By providing a comprehensive solution that addresses governance, observability, and development velocity in an integrated way, Citadel transforms the traditional trade-off between control and innovation into a **synergistic relationship**.
176 | 
177 | **Organizations can now:**
178 | 1. Establish enterprise-wide AI governance from day one
179 | 2. Empower developers with self-service, governed AI capabilities
180 | 3. Maintain complete visibility and control as AI adoption scales
181 | 4. Meet regulatory requirements with automated compliance evidence
182 | 5. Accelerate time-to-value while reducing organizational risk
183 | 
184 | ---
185 | 
186 | ## Next Steps
187 | 
188 | To learn more about how Foundry Citadel Platform can help your organization govern AI responsibly at speed:
189 | 
190 | - **📘 Review the Full Documentation**: See the [Citadel Technical Guide](./CITADEL-TECHNICAL-GUIDE.md) for comprehensive technical details
191 | - **🏗️ Explore the AI Hub Gateway (Citadel Governance Hub)**: Visit the [AI Hub Gateway repository](https://aka.ms/ai-hub-gateway)
192 | - **🤖 Deploy a Citadel Agent Spoke (CAS)**: Check out the [AI Landing Zones repository](https://github.com/Azure/AI-Landing-Zones)
193 | - **💬 Engage with Our Team**: Reach out to discuss your specific governance and AI adoption challenges
194 | 
195 | ---
196 | 
197 | *"Build the future, safely"* – Foundry Citadel Platform provides the **speed** that business demands with the **safeguards** that IT requires, all in one comprehensive, evolving platform.
198 | 
--------------------------------------------------------------------------------
/Citadel-WAF-Alignment.md:
--------------------------------------------------------------------------------
1 | # Foundry Citadel Platform - Azure Well-Architected Framework Alignment
2 | 
3 | > **How Citadel Implements Microsoft Well-Architected Framework Principles for AI Workloads**
4 | 
5 | **Document Version:** 1.0
6 | **Last Updated:** November 10, 2025
7 | **Reference:** [Azure Well-Architected Framework - AI Design Principles](https://learn.microsoft.com/en-us/azure/well-architected/ai/design-principles)
8 | 
9 | ---
10 | 
11 | ## Overview
12 | 
13 | The **Foundry Citadel Platform** is architected to align with the Microsoft Well-Architected Framework (WAF) for AI workloads, delivering enterprise-grade AI solutions through three core pillars:
14 | 
15 | - **Governance & Security** - Enterprise-grade controls, responsible AI, and data protection
16 | - **Observability & Compliance** - Comprehensive monitoring, auditing, and regulatory compliance
17 | - **AI Development Velocity** - Accelerated development with best practices and automation
18 | 
19 | This document demonstrates how Citadel's concrete technical implementations address the five WAF pillars: **Reliability**, **Security**, **Cost Optimization**, **Operational Excellence**, and **Performance Efficiency**.
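Throughout this document, "unified AI gateway" means that application code talks to one OpenAI-compatible endpoint governed by Azure API Management rather than to individual model services. A minimal sketch of what that looks like from a client application, assuming a gateway URL, gateway-issued key, and model name that are illustrative placeholders rather than values shipped with Citadel:

```python
# Minimal sketch: an application calls a model through the Citadel AI Gateway
# instead of calling Azure OpenAI directly. Environment variable names, the
# gateway URL, and the model name are hypothetical placeholders.
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ["CITADEL_GATEWAY_URL"],   # the APIM-fronted, OpenAI-compatible endpoint
    api_key=os.environ["CITADEL_GATEWAY_KEY"],    # gateway-issued key, not a master model key
)

response = client.chat.completions.create(
    model="gpt-4o",  # the gateway routes this to an approved backend deployment
    messages=[{"role": "user", "content": "Summarize our AI usage policy in one sentence."}],
)
print(response.choices[0].message.content)
```

Because every call flows through that single endpoint, the rate limits, token quotas, content safety checks, and usage logging discussed under the pillars below apply uniformly without changes to application code.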
20 | 21 | --- 22 | 23 | ## Architecture Alignment Summary 24 | 25 | | WAF Pillar | Alignment Status | Key Citadel Capabilities | 26 | |------------|------------------|--------------------------| 27 | | **Reliability** | 🟢 Strong | Multi-region support, high availability, automated failover, resilient architecture | 28 | | **Security** | 🟢 Strong | Zero Trust, content safety, RBAC, encryption, network isolation | 29 | | **Cost Optimization** | 🟢 Strong | Usage tracking, quota management, auto-scaling, cost attribution | 30 | | **Operational Excellence** | 🟢 Strong | Automated monitoring, CI/CD integration, DevOps/AIOps support | 31 | | **Performance Efficiency** | 🟢 Strong | Load balancing, auto-scaling, performance monitoring, quality metrics | 32 | 33 | **Legend:** 🟢 Strong | 🟡 Partial | 🔴 Limited 34 | 35 | --- 36 | 37 | ## 1. Reliability - Building Resilient AI Workloads 38 | 39 | ### WAF Principle: Design Reliable AI Systems 40 | 41 | Citadel ensures AI workloads remain available and can recover from failures while maintaining model performance over time. 42 | 43 | ### Citadel Implementation 44 | 45 | #### Multi-Region High Availability 46 | - **Multi-region LLM deployments** with automated failover for continuous service availability 47 | - **Reliable state** for conversation history and agent state on Cosmos DB and Azure Monitor 48 | - **Availability zones support** for critical components in supported regions 49 | 50 | #### Fault Tolerance & Resilience 51 | - **AI Gateway (Azure API Management)** provides circuit breakers, retry logic, and bulkhead patterns 52 | - **Distributed architecture** with separate Citadel Governance Hub and Citadel Agents Spoke landing zones following Hub/Spoke model 53 | - **Network isolation** with NSGs and private endpoints preventing cascading failures 54 | - **Service isolation** through containerization and separate resource boundaries where every agentic deployment in separate Citadel Agent Spoke with central RBAC through Citadel Governance Hub 55 | 56 | #### Operational Reliability 57 | - **Automated workflows via Logic Apps** reducing manual intervention and human error 58 | - **Azure AI Foundry managed runtime** providing reliable, maintained agent execution environment. 59 | - **Version-controlled infrastructure** using Bicep and source controlled configurations (Citadel Contracts) for consistent, repeatable deployments of both central components and day-2 configurations 60 | 61 | ### Key Features 62 | 63 | | Feature | Benefit | Implementation | 64 | |---------|---------|----------------| 65 | | Multi-Region LLM | Ensures API availability even during regional outages | Automated failover between LLM backends/regions | 66 | | High-Availability Gateway | 99.95% SLA for API requests | Azure API Management Premium tier with multi-availability-zones and/or multi-region | 67 | | Distributed Data Stores | Data remains accessible during failures | Leveraging Cosmos DB and Azure Monitor log analytics | 68 | | Auto-Recovery | Minimizes downtime from transient failures | Circuit breakers and exponential backoff retry policies | 69 | 70 | --- 71 | 72 | ## 2. Security - Protecting AI Workloads and Data 73 | 74 | ### WAF Principle: Secure AI Systems and Earn User Trust 75 | 76 | Citadel implements defense-in-depth security with Zero Trust architecture, content safety, and comprehensive data protection. 
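To make the content-safety layer concrete before the implementation details below, the following sketch shows the kind of screening Citadel applies to prompts and responses, expressed client-side with the azure-ai-contentsafety package. In the platform this check runs as a gateway policy rather than in application code, and the endpoint, key variables, and severity threshold here are illustrative assumptions:

```python
# Illustrative pre-flight screening of a user prompt with Azure AI Content Safety.
# In Citadel this screening is applied centrally at the AI Gateway; the endpoint,
# key, and severity threshold below are example values only.
import os
from azure.core.credentials import AzureKeyCredential
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions

client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["CONTENT_SAFETY_KEY"]),
)

user_prompt = "Summarise the quarterly incident report for the operations team."

# Analyze the text across the built-in harm categories (hate, sexual, violence,
# self-harm) and reject it if any category exceeds the example severity threshold.
result = client.analyze_text(AnalyzeTextOptions(text=user_prompt))
if any((item.severity or 0) >= 2 for item in result.categories_analysis):
    raise ValueError("Prompt rejected by content safety policy")
```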
77 | 78 | ### Citadel Implementation 79 | 80 | #### Earn User Trust with Responsible AI 81 | - **Azure AI Content Safety integration** for all incoming requests and outgoing responses 82 | - **Prompt Shield protection** against jailbreak attempts and prompt injection attacks 83 | - **Protected content detection** screening for sensitive data at both AI Gateway level and at LLM model level (through Microsoft Purview) 84 | - **Bidirectional content moderation** ensuring both user inputs and AI outputs are safe 85 | - **Groundedness detection** validating AI responses against source documents to prevent hallucinations through AI Foundry Evals 86 | 87 | #### Data Protection at All Layers 88 | - **Encryption at rest** with platfrom managed keys 89 | - **Encryption in transit** enforced via HTTPS/TLS 1.2+ for all communication 90 | - **Private endpoints** for all AI services eliminating public internet exposure 91 | - **Network security groups (NSGs)** controlling traffic flow between subnets 92 | - **Virtual network integration** for all compute and data services 93 | 94 | #### Robust Access Management 95 | - **Gateway-keys pattern** - No direct API key exposure to users or applications 96 | - **Managed identities** for service-to-service authentication eliminating stored credentials 97 | - **Azure RBAC integration** providing granular permissions across all components 98 | - **Role-based authorization at AI Gateway** enforcing least-privilege access per user/team 99 | - **Azure Key Vault** for centralized secrets management with audit logging 100 | 101 | #### Network Segmentation & Zero Trust 102 | - **Zero Trust architecture** with assume breach mentality 103 | - **Dedicated subnets with NSGs** for each service tier (web, app, data, AI) 104 | - **Private networking** for container images, training data, and source code 105 | - **Separate landing zones** (CGH for governance, CAS for agents) with controlled connectivity 106 | - **Hub-spoke network topology** with centralized security controls 107 | 108 | #### Security Testing & Compliance 109 | - **CI/CD integration** for automated security scanning in deployment pipelines 110 | - **Security policy enforcement** at gateway level before requests reach AI services 111 | - **Container vulnerability scanning** in Azure Container Registry 112 | - **Microsoft Purview integration** for data classification and governance 113 | - **Audit logging** of all access and operations for compliance reporting 114 | 115 | #### Minimize Attack Surface 116 | - **Authentication required** for all inferencing endpoints - no anonymous access 117 | - **Constrained API design** through AI Gateway limiting exposed functionality 118 | - **API versioning and deprecation** allowing secure evolution of interfaces 119 | - **Rate/token limiting and throttling** preventing abuse and resource exhaustion 120 | - **Input validation** at multiple layers preventing injection attacks 121 | 122 | ### Key Features 123 | 124 | | Feature | Benefit | Implementation | 125 | |---------|---------|----------------| 126 | | Content Safety | Prevents harmful content from entering or leaving the system | Azure AI Content Safety with custom policies | 127 | | Zero Trust Networking | Eliminates implicit trust, reduces breach impact | Private endpoints, NSGs, no public internet access | 128 | | Managed Identities | No credentials in code or config files | Azure Managed Identity for all service-to-service auth | 129 | | Gateway-Keys Pattern | Centralized access control and monitoring | API keys 
managed at gateway, not exposed to clients | 130 | | Data Encryption | Protects sensitive data at rest and in transit | Platform managed or with CMK encryption with Key Vault integration | 131 | 132 | --- 133 | 134 | ## 3. Cost Optimization - Maximizing ROI 135 | 136 | ### WAF Principle: Optimize Costs Without Sacrificing Quality 137 | 138 | Citadel provides comprehensive cost visibility, tracking, and optimization to maximize return on AI investments. 139 | 140 | ### Citadel Implementation 141 | 142 | #### Determine Cost Drivers 143 | - **Granular usage analytics** tracking consumption by team, use case, and individual agent 144 | - **Token consumption trends** with historical analysis in Cosmos DB 145 | - **Cost attribution dashboard** in Citadel Governance Hub showing spend breakdown 146 | - **Resource tagging strategy** enabling chargeback and showback models 147 | - **Integrated Azure Cost Management** with budget alerts and forecasting 148 | 149 | #### Pay for What You Intend to Use 150 | - **Auto-scaling Container Apps & Foundry Agents** automatically adjusting compute based on demand 151 | - **Multiple AI service tiers** supporting different performance and cost profiles 152 | - **Serverless options** via Logic Apps and Azure Functions for event-driven workloads 153 | - **Consumption-based pricing** for applicable compoenets (like Azure OpenAI pay-as-you-go) 154 | - **Flexible deployment options** allowing teams to choose cost-performance balance 155 | 156 | #### Use What You Pay For (Minimize Waste) 157 | - **Token quotas and rate limiting** preventing accidental overspending 158 | - **Auto-scaling with scale-to-zero** deallocating resources during idle periods 159 | - **Centralized monitoring** of utilization metrics identifying underused resources 160 | - **Cost accountability** assigned to operations teams with regular reviews 161 | - **Automated resource cleanup** removing unused deployments and test environments 162 | 163 | #### Optimize Operational Costs 164 | - **Automated workflows** via Logic Apps reducing manual operational overhead 165 | - **PaaS-first approach** minimizing infrastructure management costs 166 | - **Shared infrastructure** across multiple agents and teams reducing duplication 167 | - **DevOps automation** reducing time-to-market and manual deployment costs 168 | 169 | ### Key Features 170 | 171 | | Feature | Benefit | Implementation | 172 | |---------|---------|----------------| 173 | | Usage Analytics | Understand where AI costs are incurred | Real-time dashboards with drill-down by dimension | 174 | | Token Quotas | Prevent runaway costs from misbehaving agents | Configurable limits per user/team/agent | 175 | | Auto-Scaling | Pay only for active workloads | Container Apps/Foundry Agents with scale-to-zero capability | 176 | | Cost Attribution | Chargeback/showback to business units | Tagging and reporting by cost center | 177 | | Monitoring & Alerts | Proactive cost anomaly detection | Azure Monitor alerts on budget thresholds | 178 | 179 | --- 180 | 181 | ## 4. Operational Excellence - Automation and Continuous Improvement 182 | 183 | ### WAF Principle: Streamline Operations and Enable Innovation 184 | 185 | Citadel enables DevOps, and GenAIOps practices with comprehensive automation, monitoring, and safe deployment patterns. 
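Before moving to the operational-excellence practices, the cost controls from the previous pillar can be pictured with a small, self-contained example: per-team token totals from gateway logs compared against monthly quotas. This is illustrative only; in Citadel the enforcement happens in Azure API Management policies and the reporting in Azure Monitor dashboards, and the team names and quota values below are invented:

```python
# Illustrative per-team token accounting of the kind Citadel's gateway logs and
# dashboards provide; real enforcement happens in API Management policies.
from collections import defaultdict

MONTHLY_QUOTAS = {"finance-bot": 2_000_000, "hr-assistant": 500_000}  # example values

def check_quotas(usage_records):
    """usage_records: iterable of (team, tokens_used) pairs taken from gateway logs."""
    totals = defaultdict(int)
    for team, tokens in usage_records:
        totals[team] += tokens
    return {
        team: {
            "used": used,
            "quota": MONTHLY_QUOTAS.get(team),
            "over_quota": used > MONTHLY_QUOTAS.get(team, float("inf")),
        }
        for team, used in totals.items()
    }

print(check_quotas([("finance-bot", 1_800_000), ("finance-bot", 350_000), ("hr-assistant", 120_000)]))
```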
186 | 187 | ### Citadel Implementation Recommendations 188 | 189 | #### Minimize Operational Burden 190 | - **PaaS-first architecture** using managed services (AI Foundry, API Management, Container Apps) 191 | - **Managed identities** eliminating credential rotation and secret management overhead 192 | - **Automated workflow orchestration** via Logic Apps for common operational tasks 193 | - **Infrastructure-as-Code (Bicep)** enabling one-click deployments and consistent environments 194 | - **Template-based agent deployment** empowering developers while maintaining governance 195 | 196 | #### Automated Monitoring with Actionable Alerts 197 | - **Azure Monitor Application Insights** integrated across all components 198 | - **Comprehensive dashboards** at platform and individual agent levels 199 | - **Actionable alerts** with context-specific remediation guidance 200 | - **Enterprise notification integration** (Teams, email, ticketing systems) 201 | - **Automated quality measurements** with trend analysis and anomaly detection 202 | - **End-to-end tracing** from user request through AI processing to response 203 | 204 | #### Detect and Mitigate Model Performance Issues 205 | - **Automated evaluations** at platform level measuring groundedness, relevance, coherence 206 | - **CI/CD integration** for regression testing before deployment 207 | - **Quality metrics tracking** over time identifying model drift 208 | - **Conversation replay capability** for debugging and quality analysis 209 | - **Feedback collection** from users feeding continuous improvement 210 | 211 | #### Safe Deployments 212 | - **CI/CD pipelines** with automated testing gates 213 | - **Multiple deployment strategies** support (blue/green, canary, rolling updates) 214 | - **Pre-production testing environments** mirroring production configuration 215 | - **Automated rollback capabilities** when health checks fail 216 | - **Change tracking and audit logs** for compliance and troubleshooting 217 | 218 | #### Evaluate and Improve User Experience 219 | - **User feedback mechanisms** integrated into agent interfaces 220 | - **Conversation logging with consent** enabling analysis and improvement 221 | - **Engagement metrics** tracking user satisfaction and agent effectiveness 222 | - **Session analytics** understanding user behavior patterns 223 | - **Continuous improvement loop** from feedback to model refinement 224 | 225 | ### Key Features 226 | 227 | | Feature | Benefit | Implementation | 228 | |---------|---------|----------------| 229 | | Comprehensive Monitoring | Full visibility into AI workload health | Application Insights with custom metrics and logs | 230 | | Automated Evaluations | Ensure quality before and after deployment | AI Foundry evaluation pipelines in CI/CD | 231 | | DevOps Integration | Accelerate development while maintaining quality | GitHub/Azure DevOps with automated gates | 232 | | Feedback Loops | Continuous improvement from production insights | User feedback, conversation analytics, quality metrics | 233 | | Infrastructure-as-Code | Consistent, repeatable deployments | Bicep templates with version control | 234 | 235 | --- 236 | 237 | ## 5. Performance Efficiency - Optimizing AI Workload Performance 238 | 239 | ### WAF Principle: Meet Performance Requirements Efficiently 240 | 241 | Citadel ensures AI workloads meet performance targets through proper resource allocation, monitoring, and continuous optimization. 
242 | 243 | ### Citadel Implementation 244 | 245 | #### Establish Performance Benchmarks 246 | - **Agent-level performance monitoring** tracking latency, throughput, and token consumption 247 | - **Quality metrics tracking** measuring groundedness, relevance, coherence, fluency 248 | - **Continuous re-evaluation** ensuring performance remains within acceptable ranges 249 | - **Baseline establishment** for each agent type and use case 250 | - **Performance trend analysis** identifying degradation over time 251 | 252 | #### Evaluate and Right-Size Resources 253 | - **Multiple SKU options** allowing teams to balance performance and cost 254 | - **Load balancing via Application Gateway** distributing traffic for optimal resource utilization 255 | - **Auto-scaling Container Apps** dynamically adjusting resources based on actual demand 256 | - **Container resource quotas** preventing resource contention and ensuring fair allocation 257 | 258 | #### Collect and Analyze Performance Metrics 259 | - **Telemetry from all layers** - data pipeline, orchestration, model inference, and UI 260 | - **Query latency and throughput tracking** with percentile analysis (p50, p95, p99) 261 | - **End-to-end tracing** of agent execution identifying bottlenecks 262 | - **Token consumption monitoring** optimizing prompt engineering for efficiency 263 | - **Near real-time dashboards** enabling quick performance issue identification 264 | 265 | #### Continuous Performance Improvement 266 | - **Automated metric collection** feeding analysis and optimization 267 | - **CI/CD integration** for performance regression testing 268 | - **Production feedback loops** informing optimization decisions 269 | - **Performance optimization recommendations** based on observed patterns 270 | - **Caching strategies** reducing redundant processing and API calls 271 | 272 | #### Load Balancing and Distribution 273 | - **Multi-region load distribution** balancing traffic across Azure LLM deployments 274 | 275 | ### Key Features 276 | 277 | | Feature | Benefit | Implementation | 278 | |---------|---------|----------------| 279 | | Performance Monitoring | Near real-time visibility into latency and throughput | Application Insights with custom telemetry | 280 | | Auto-Scaling | Automatically match resources to demand | Container Apps with CPU/memory-based triggers | 281 | | Load Balancing | Distribute traffic for optimal performance | Application Gateway with backend health monitoring | 282 | | Quality Metrics | Ensure AI outputs meet standards efficiently | Automated evaluation of groundedness, relevance | 283 | | Resource Optimization | Right-size compute for cost-performance balance | Monitoring with recommendations engine | 284 | 285 | --- 286 | 287 | ## Cross-Cutting Capabilities 288 | 289 | ### Governance & Control 290 | 291 | Citadel Governance Hub (CGH) provides centralized governance across all WAF pillars: 292 | 293 | - **Policy Enforcement** - Centralized security, cost, and quality policies applied consistently 294 | - **Usage Analytics** - Real-time visibility into consumption patterns and costs 295 | - **Compliance Reporting** - Audit trails, access logs, and regulatory compliance dashboards 296 | - **Resource Management** - Centralized control over AI model deployments and configurations 297 | - **Team Isolation** - Multi-tenancy with resource boundaries and access controls 298 | 299 | ### Observability 300 | 301 | Comprehensive observability enables all WAF pillars: 302 | 303 | - **Application Insights Integration** - 
Full-stack monitoring from UI to AI backend
304 | - **Custom Dashboards** - Role-specific views for developers, operations, security, executives
305 | - **Distributed Tracing** - End-to-end request tracking across service boundaries
306 | - **Log Aggregation** - Centralized logging with advanced query and analysis capabilities
307 | - **Alerting & Notification** - Context-aware alerts with automated remediation
308 | 
309 | ### DevOps & Automation
310 | 
311 | Platform automation accelerates delivery while maintaining quality:
312 | 
313 | - **CI/CD Pipelines** - Automated build, test, deploy for agents and infrastructure
314 | - **Infrastructure-as-Code** - Bicep templates for consistent environment provisioning
315 | - **Automated Testing** - Unit, integration, and quality tests in deployment pipeline
316 | - **Version Control** - Git-based workflow for code, configuration, and policies
317 | - **Self-Service Deployment** - Empowering teams while maintaining governance guardrails
318 | 
319 | ---
320 | 
321 | ## Well-Architected Framework Trade-offs
322 | 
323 | Citadel provides balanced approaches to common WAF trade-offs:
324 | 
325 | ### Security vs. Performance
326 | - **Configurable security levels** - Adjust Content Safety strictness based on use case
327 | - **Private endpoints optional** - Choose network isolation vs. simplified connectivity
328 | 
329 | ### Cost vs. Reliability
330 | - **Multi-region optional** - Deploy single-region for cost, multi-region for high availability
331 | - **Tiered deployment patterns** - Basic, standard, premium configurations with clear trade-offs
332 | - **Auto-scaling boundaries** - Set maximum scale to control costs while ensuring performance
333 | 
334 | ### Performance vs. Cost
335 | - **PTU vs. PAYG** - Choose reserved vs. consumption pricing for Azure LLM
336 | - **Caching strategies** - Reduce costs and improve performance for repeated queries
337 | 
338 | ### Developer Velocity vs. Governance
339 | - **Template-based with guardrails** - Teams deploy reusable templates within policy boundaries
340 | - **Automated compliance** - Security scanning and policy enforcement handled centrally
341 | - **Flexible approval gates** - Required for production, optional for development
342 | 
343 | ---
344 | 
345 | ## Getting Started with WAF Alignment
346 | 
347 | ### 1. Assess Your Requirements
348 | 
349 | Determine your priorities across WAF pillars:
350 | 
351 | - **Mission-critical workloads** - Emphasize reliability and security
352 | - **Cost-sensitive projects** - Focus on cost optimization and right-sizing
353 | - **Innovation initiatives** - Prioritize developer velocity and experimentation
354 | - **Regulated industries** - Ensure security and compliance
355 | 
356 | ### 2. Configure Citadel for Your Needs
357 | 
358 | Citadel's modular architecture allows customization:
359 | 
360 | - **Network topology** - Hub-spoke vs. single VNet based on isolation needs
361 | - **Deployment scope** - Single-region vs. multi-region based on availability requirements
362 | - **Compute tier** - Container Apps vs. AI Foundry based on control and cost needs
363 | - **Monitoring depth** - Adjust telemetry collection based on operational requirements
364 | 
365 | ### 3.
Implement Best Practices 366 | 367 | Follow Citadel's reference implementations: 368 | 369 | - **Use Infrastructure-as-Code** - Deploy via Bicep templates for consistency 370 | - **Enable all security features** - Private endpoints, managed identities, Content Safety 371 | - **Configure monitoring** - Set up dashboards and alerts appropriate for your team 372 | - **Establish governance** - Define policies, quotas, and approval workflows 373 | 374 | ### 4. Continuous Improvement 375 | 376 | Leverage Citadel's observability for ongoing optimization: 377 | 378 | - **Review cost reports** - Monthly analysis of spending patterns and optimization opportunities 379 | - **Monitor performance** - Track latency and quality metrics, adjust resources as needed 380 | - **Security audits** - Regular review of access logs, security alerts, compliance status 381 | - **Model quality** - Continuous evaluation and refinement based on production feedback 382 | 383 | --- 384 | 385 | ## Additional Resources 386 | 387 | ### Documentation 388 | - [Citadel Technical Guide](./CITADEL-TECHNICAL-GUIDE.md) - Complete platform architecture and components 389 | - [Contributing Guide](./CONTRIBUTING.md) - How to extend and customize Citadel 390 | 391 | ### External References 392 | - [Azure Well-Architected Framework](https://learn.microsoft.com/en-us/azure/well-architected/) 393 | - [Azure Well-Architected Framework for AI](https://learn.microsoft.com/en-us/azure/well-architected/ai/) 394 | - [Azure AI Foundry Documentation](https://learn.microsoft.com/en-us/azure/ai-foundry/) 395 | - [Responsible AI Principles](https://www.microsoft.com/en-us/ai/responsible-ai) 396 | 397 | --- 398 | 399 | ## Conclusion 400 | 401 | The **Foundry Citadel Platform** provides comprehensive alignment with the Microsoft Well-Architected Framework for AI workloads through: 402 | 403 | ✅ **Strong Security** - Zero Trust, content safety, encryption, and access management 404 | ✅ **High Reliability** - Multi-region support, fault tolerance, and automated recovery 405 | ✅ **Cost Efficiency** - Granular tracking, quotas, auto-scaling, and optimization 406 | ✅ **Operational Excellence** - Comprehensive monitoring, automation, and safe deployments 407 | ✅ **Performance Optimization** - Load balancing, auto-scaling, and continuous monitoring 408 | 409 | By building on Azure's platform services and implementing proven patterns, Citadel enables organizations to deploy enterprise-grade AI solutions that balance governance, security, cost, performance, and innovation velocity—all while maintaining alignment with Microsoft's Well-Architected Framework principles. 410 | 411 | --- -------------------------------------------------------------------------------- /CITADEL-TECHNICAL-GUIDE.md: -------------------------------------------------------------------------------- 1 | # Foundry Citadel Platform 2 | 3 | >*Scalable **AI Landing Zone** with Governance, Observability & Rapid Development* 4 | 5 | Foundry **Citadel** Platform is a solution accelerator designed as a **supplemental AI landing zone** that integrates seamlessly with your Azure environment. It provides a **secure, scalable foundation** for running AI applications and agents in production – with **unified governance**, **end-to-end observability**, and tools to **accelerate development**. 
Citadel delivers a **pre-configured reference architecture** (aligned to Azure’s Cloud Adoption and Well-Architected Frameworks) that can be deployed with one click and includes ready-made code, templates, and documentation following Microsoft’s best practices. This comprehensive approach helps organisations adopt AI **responsibly and efficiently**, ensuring that advanced AI agents can be developed **quickly** while remaining **well-managed** and **compliant** with enterprise requirements. 6 | 7 | > ### Citadel Adoption Signals 8 | > _Enterprise teams highlight these blockers and enablers for scaling AI responsibly._ 9 | 10 | | 🛡️ **Security** | 📊 **Consumption** | 🧭 **Guardrails** | 🗂️ **Registry** | 11 | | --- | --- | --- | --- | 12 | | **62 %** of practitioners cite security concerns as the top blocker to wider AI or agent adoption. | **71 %** of enterprises struggle to track AI usage, enforce quotas, and report costs per team. | **47 %** of organisations require explicit guardrails before deploying autonomous AI agents safely. | **70 %** of customers need an AI registry for both agents and tools to adopt AI at scale. | 13 | 14 | > 🧩 _Citadel turns these pain points into platform strengths—governed access, transparent consumption, defensible guardrails, and a shared catalog of reusable AI capabilities._ 15 | 16 | These challenges highlight why **Citadel’s capabilities** are crucial. **Foundry Citadel Platform** focuses on **three key pillars** – **Governance & Security**, **Observability & Compliance**, and **AI Development Velocity** – to address these concerns end-to-end. Below, we outline each pillar and the core capabilities provided, with architecture components and features that ensure enterprise-grade AI deployments: 17 | 18 | *** 19 | 20 | ## **1. Governance & Security Pillar** – *Trustworthy AI Operations at Scale* 21 | 22 | > ### Why Governance Matters 23 | > Without centralized AI governance, organisations face **unpredictable costs, reliability issues, security risks, developer friction,** and compliance nightmares. Citadel fixes this by building guardrails into every AI call. 24 | 25 | **Foundry Citadel Platform** implements strong governance and security controls so that enterprises can adopt generative AI **safely and in compliance**. Key capabilities of this pillar include a **unified AI gateway** for all model access, granular policy enforcement, and robust safety mechanisms: 26 | 27 | * **🔐 Unified AI Gateway:** At the core of Citadel’s security is the **“AI Gateway”** – a central entry point (built on Azure API Management) through which **all AI model requests** are routed. This gateway enforces organisation-wide policies consistently. For example, it implements **universal LLM policies like rate limiting and token quotas** to prevent misuse or cost overrun. **No application calls the model directly**; instead, apps call the gateway, which authenticates and forwards requests to the appropriate model (Azure OpenAI, open-source, or even third-party services like Amazon Bedrock) while applying the required controls. This design **centralises oversight** of all AI consumption. 28 | 29 | * **🗝️ **Granular Access Control & Key Management:**** Citadel’s gateway introduces a **gateway-keys model access pattern** for developers. Rather than embedding master API keys for various AI services, teams use **managed credentials issued by the gateway**. 30 | The gateway can map these to backend keys or identity tokens, ensuring that **no master keys are directly exposed** in code. 
Access can be segmented by team or use-case, with **role-based authorisation** (e.g. only approved apps or users can invoke certain AI endpoints) for greater security. This prevents uncontrolled use of AI services and allows rapid **revocation or rotation** of credentials from a single place. 31 | 32 | * **🔑 Credential Management:** Citadel secures API keys and service credentials by leveraging Azure Key Vault. Secrets are stored securely and accessed at runtime, ensuring that **no raw keys are exposed in code or logs**. 33 | 34 | * **🛡️ Policy Enforcement and Compliance:** The governance layer allows administrators to define and enforce a range of **custom policies**. These include **traffic mediation rules** (e.g. routing requests to different model endpoints based on content or load) and **usage policies** (per-user or per-app call rate limits and monthly token budgets). It also supports complex **expressions for policies** – for example, automatically choosing an Azure OpenAI instance in a specific region for compliance, or requiring certain **request headers/tags for auditing**. All usage is captured centrally, enabling compliance auditing and simplifying answer to the question *“Who is using which model, and how?”*. 35 | 36 | * **🌐 Multi-Cloud and Hybrid Support:** Citadel’s governance is flexible – it can govern not only Azure OpenAI, but also **open-source model servers or third-party AI APIs**. The AI Gateway speaks **OpenAI-compatible APIs** natively, meaning it can front-end virtually any generative model service. For instance, it can direct certain requests to Azure OpenAI or to an on-premises GPU-VM model, or even to Amazon Bedrock, all under the same policy umbrella. This multi-cloud ability gives organisations a **single control plane** for heterogeneous AI systems. Citadel’s gateway and related services can themselves run on-premises if needed (via APIM self-hosted gateways), supporting scenarios with strict data residency or partially air-gapped networks. 37 | 38 | * **🛡️ AI Content Safety & Guardrails:** Citadel includes built-in **AI safety** mechanisms to enforce responsible AI usage. Every request and response can be scanned by **Azure AI Content Safety** – which detects **hate speech, violent or sexual content, self-harm indications, and other harmful outputs**. If an application user tries to prompt an agent to produce disallowed content or if a model’s answer contains such content, the system can **block or filter** that response automatically. Citadel’s safety system also includes **“prompt shields”** that detect attempts to jailbreak the agent with malicious instructions hidden in user input or documents. This protects the AI agents from executing unintended commands. Additionally, **“protected content”** checks can recognise if a model’s answer includes large verbatim excerpts of known copyrighted text (lyrics, articles, etc.) and prevent accidental leakage of such content. These guardrails give organisations confidence that AI systems won’t go off-policy or create liability. 39 | 40 | * **📊 Central Monitoring & Cost Governance:** All AI usage through the gateway is logged centrally (calls, tokens used, timings, outcome). FCP provides **built-in reports and dashboards** to track this usage by application or department. This **solves the cost attribution problem** – e.g. you can see how many tokens the Finance team’s chatbot consumed this week and enforce per-team quotas. 
It also enables **cost optimisation** – detecting anomalous spikes or inefficient prompt usage. Combined with Azure Monitor, admins can set **alerts** (e.g. if a project exceeds its monthly AI budget, or if a spike in requests suggests a rogue script). By providing this transparency and control, Citadel helps prevent the “blank cheque” scenario of uncontrolled AI API spend. It effectively addresses the **“shadow AI”** governance nightmare by keeping all AI calls within the managed guardrails. 41 | 42 | * **📘 Central AI Registry for Agents and Tools:** FCP provides a unified **AI Registry** powered by the **Model Context Protocol (MCP)**, enabling organisations to manage and discover both **first-party** and **third-party** AI agents and tools. This registry acts as a central catalog where teams can securely share, document, and govern AI capabilities across the enterprise. By standardising metadata and access policies, the registry ensures that all agents and tools – whether developed in-house or sourced externally – are easily discoverable and can be integrated seamlessly into workflows. This capability fosters collaboration, reduces duplication of effort, and ensures consistent governance for all AI assets. 43 | 44 | * **🔒 Data Security:** Citadel ensures the protection of sensitive data in AI workflows by integrating with Microsoft Purview. This enables governance through **data sensitivity labels and policies**, ensuring that sensitive information remains within approved boundaries. For example, an AI agent accessing a database will operate under Purview’s oversight, with all usage logged and any policy violations (such as accessing restricted customer data) flagged for review. 45 | 46 | **Governance & Security Features and Components:** The table below summarises some of the key governance components of Citadel and their roles: 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | 75 | 76 | 77 | 78 | 79 | 80 | 81 |
| Governance Feature | Description |
|--------------------|-------------|
| **Unified AI Gateway** | Central gateway that mediates every AI call. It applies global policies (rate limits, authentication, routing) and provides a single secure endpoint for clients. This ensures all AI usage is centrally visible and controlled. |
| **Policy Engine** | Rich rule framework to enforce business rules – e.g. restrict certain models to specific regions, apply token quotas per user, or inject safety prompts. Administrators can write custom policies or use built-in templates for common requirements. |
| **Managed Credentials** | Uses gateway keys, with or without identity-platform-issued tokens (such as Microsoft Entra ID), to abstract backend secrets. Developers no longer handle raw master keys for AI services – the gateway issues tokens/keys with scoped access. This prevents key leakage and allows instant revocation if needed. |
| **Content Safety Filters** | Automated checks on prompts and responses using Azure AI Content Safety. Flags or removes profanity, hate, sexual or violent content, and can block outputs that violate compliance policies (e.g. privacy or confidential data). |
| **AI Registry & Catalog** | A registry (via Azure API Center) for discovering and managing AI endpoints and tools (known as MCP servers). This catalogue lets teams securely share AI “skills” (Agents, APIs, functions) across the enterprise with proper metadata and governance. |
| **Multi-cloud Connectors** | Built-in support to govern AI services beyond Azure. The gateway can proxy requests to open-source model APIs or other clouds’ AI endpoints (e.g. Bedrock) securely. This ensures consistent security and monitoring even for third-party AI services. |
| **Azure Key Vault** | Secure store for secrets and credentials used by AI apps/agents. All API keys, connection strings, etc., used by agents or the gateway are kept in the spoke Key Vault and accessed via managed identities. This eliminates hard-coded secrets and protects sensitive data at rest. |
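Because the gateway speaks OpenAI-compatible APIs, existing client code usually only needs a new endpoint and a gateway-issued key. The snippet below is a minimal sketch of that pattern; the environment variable names, the gateway URL shape, and the `Ocp-Apim-Subscription-Key` header are assumptions that depend on how your APIM instance is configured.

```python
# Minimal sketch of an app calling models through the Unified AI Gateway instead of
# hitting Azure OpenAI directly. The gateway URL, the "Ocp-Apim-Subscription-Key"
# header, and the environment variable names are illustrative assumptions - check
# your APIM configuration for the actual values.
import os

from openai import OpenAI

client = OpenAI(
    # The gateway exposes an OpenAI-compatible surface, so the standard SDK works.
    base_url=os.environ["CITADEL_GATEWAY_URL"],    # e.g. https://<apim-name>.azure-api.net/openai
    api_key=os.environ["CITADEL_GATEWAY_KEY"],     # team-scoped key issued by the gateway
    default_headers={
        # Many APIM gateways also expect the subscription key as a header.
        "Ocp-Apim-Subscription-Key": os.environ["CITADEL_GATEWAY_KEY"],
    },
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # logical model name; the gateway routes it to an approved backend
    messages=[{"role": "user", "content": "Summarise our travel expense policy."}],
)
print(response.choices[0].message.content)
```

Rate limits, token quotas, and content safety checks are applied by the gateway itself, so this calling code stays unchanged as policies evolve.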
82 | 83 | **In practice, these governance features mean AI applications can be deployed with confidence.** For example, if you build a GPT-based internal assistant, it will run through Citadel’s gateway – **ensuring it only answers within approved data sources, filters any policy-breaking content, and logs its activity**. Administrators remain in control: they can update a policy to block a newly discovered prompt attack pattern, or quickly see which prompts are costing the most. FCP thus fosters **strong customer and stakeholder trust in AI** by providing the oversight needed beyond just “having a model”. Governance is no longer a roadblock – it’s baked into the platform so that **compliance officers and developers can collaborate effectively** without endless manual reviews. The result is faster deployment of AI solutions **“with the guardrails on”**, avoiding the common pitfalls of unchecked AI experimentation (data leaks, runaway costs, or reputational damage). 84 | 85 | *** 86 | 87 | ## **2. Observability & Compliance Pillar** – *End-to-End Monitoring, Evaluation & Trust* 88 | 89 | > ### Full Visibility = Trust & Confidence 90 | > Citadel provides **holistic observability** for AI systems through a **dual-layer approach**: centralised monitoring at the platform level and detailed tracing at the agent level. This ensures teams can debug issues, assure quality, and govern compliance in real time. *"You need a dashboard, not a crystal ball"* to manage AI. 91 | 92 | The **Observability & Compliance** pillar of **Foundry Citadel Platform** equips organisations with the tools to **monitor, trace, and evaluate** AI agents and LLMs behaviour continuously through a structured **layered observability approach**. This ensures that AI applications are not a "black box" – instead, they are transparent and auditable at both platform and agent levels, which is essential for maintaining reliability and trust in their outputs. 93 | 94 | ### **🏗️ Platform-Level Observability** 95 | 96 | Platform observability provides **centralised monitoring and governance** across all AI workflows, offering enterprise-grade visibility without requiring any agent code changes: 97 | 98 | * **📊 Central Application Performance Monitoring (APM):** Citadel integrates seamlessly with **Azure Monitor Application Insights** to provide comprehensive platform-wide APM capabilities. This centralised monitoring captures infrastructure-level metrics, performance data, and system health indicators across all AI workloads. Teams gain visibility into resource utilisation, system bottlenecks, and overall platform performance without needing to instrument individual agents. 99 | 100 | * **📈 Detailed Usage Tracking per Team/Use Case/Agent:** The platform provides **granular usage analytics** that can be segmented by team, use case, or individual agent. 
This includes tracking metrics such as: 101 | * **Token consumption trends** broken down by team, project, or agent 102 | * **Request volumes and patterns** across different use cases 103 | * **Cost allocation and budgeting** with detailed spend visibility per organisational unit 104 | * **User adoption patterns** and engagement metrics across different AI applications 105 | 106 | For example, operations teams can see that *"the Sales team's Q&A Bot consumed 1.2M tokens (cost ~£60) today across 5,000 requests, while the Legal team's document analysis agent used 800K tokens across 200 complex queries."* This granular tracking enables accurate **cost management, capacity planning, and resource allocation** across the organisation. 107 | 108 | * **🔍 Centralised AI Evaluation (No Code Changes Required):** One of FCP's key strengths is its ability to run **comprehensive AI evaluations** without requiring any modifications to agent code. The platform can: 109 | * **Automatically intercept and evaluate** AI outputs using predefined and custom metrics 110 | * Run **periodic batch evaluations** on historical data (e.g., evaluate 10% of conversations overnight) 111 | * Provide **comparative analysis** between different time periods, agent versions, or teams 112 | * Support **custom business-specific evaluators** that can be deployed centrally and applied across multiple agents 113 | 114 | The evaluation framework includes a comprehensive suite of **pre-defined metrics**: 115 | * *Response Quality Metrics:* **groundedness** (did the answer stick to the provided data sources?), **relevance** (did it address the user's query?), **coherence and fluency** of the language, and **completeness** (did it follow all instructions and provide all parts of the answer?) 116 | * *Retrieval Accuracy:* for agents using knowledge bases, Citadel checks whether facts in answers occur in retrieved documents (measuring **truthfulness** to sources) 117 | * *Safety Metrics:* evaluation for **potential harms** – offensive language, biased content, and **"jailbreak" susceptibility** (was the agent tricked into breaking rules?) 118 | 119 | This centralised approach means quality assurance and safety evaluations are **consistent across all AI applications** and can be managed by a central AI governance team without requiring development resources from individual agent teams. 120 | 121 | * **🚨 Enterprise Alerts and Automated Remediation:** For sensitive AI use cases, the platform provides **sophisticated alerting and automated response capabilities**: 122 | * **Configurable alert rules** on critical metrics (e.g., groundedness scores below thresholds, token usage spikes, error rate increases) 123 | * **Automated remediation actions** such as temporarily disabling agents, switching to backup models, or escalating to human oversight 124 | * **Integration with enterprise notification systems** (Teams, email, ITSM platforms) for immediate response 125 | * **Compliance monitoring** with automated reporting for regulatory requirements 126 | 127 | For high-stakes scenarios, teams can configure cascading responses – for instance, if safety scores drop below acceptable levels, the system can automatically route queries to human reviewers while alerting the responsible teams. Early warning of anomalies (maybe a new version of the model started "hallucinating" more, or an API that agents rely on is down) is critical for maintaining **high uptime and trust**. 
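As an illustration of the platform-level usage tracking described above, the sketch below queries the central Log Analytics workspace for daily token consumption per gateway subscription. The workspace ID variable, table name, and column names are assumptions – adjust them to match how your gateway emits usage records.

```python
# Sketch of querying centralised gateway usage logs from the Log Analytics workspace.
# The workspace ID, the table name (ApiManagementGatewayLlmLog is one possibility),
# and the column names are assumptions; the sketch also assumes the query succeeds.
import os
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

kusto = """
ApiManagementGatewayLlmLog
| summarize total_tokens = sum(TotalTokens) by bin(TimeGenerated, 1d), SubscriptionName
| order by TimeGenerated desc
"""

result = client.query_workspace(
    workspace_id=os.environ["LOG_ANALYTICS_WORKSPACE_ID"],
    query=kusto,
    timespan=timedelta(days=7),  # last week of usage
)

for table in result.tables:
    for row in table.rows:
        print(dict(zip(table.columns, row)))
```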
128 | 129 | ### **🤖 Agent-Level Observability** 130 | 131 | Agent observability provides **detailed, granular insights** into individual AI agent behaviour, enabling deep debugging and optimisation: 132 | 133 | * **📋 Detailed Execution Traces:** Citadel guidance for agent deployments allows records comprehensive **execution traces** for each AI query or conversation. These traces capture **every step an agent takes** – from the initial user prompt, to system and tool prompts, all intermediate reasoning or chain-of-thought messages, calls to external tools or knowledge bases, and the final response. Along with the content, traces log **parameters, model identities, and timing** (latency and token counts) for each step. These traces are visualised in a **structured timeline** for developers and engineers. 134 | 135 | For example, if an AI agent uses a calculator API as part of its reasoning, you will see the exact API call and result in the trace. This level of insight makes it far easier to **debug issues** – such as figuring out *why* an agent gave a wrong answer (maybe it chose a flawed chain of actions), or why latency spiked (perhaps one tool took too long). Traces are stored durably (via Azure Application Insights/Log Analytics), allowing comparative analysis between runs and even between different versions of an agent. In short, **every conversation or action path is observable**; nothing is truly "hidden" behind an AI magic curtain. 136 | 137 | * **⚡ Performance Monitoring:** Agent-level monitoring captures detailed performance metrics including: 138 | * **Response latency** broken down by reasoning steps, tool calls, and model inference 139 | * **Token usage patterns** for each component of the agent's workflow 140 | * **Tool utilisation efficiency** and success rates 141 | * **Memory and resource consumption** during agent execution 142 | 143 | This granular performance data enables developers to identify bottlenecks, optimise agent workflows, and ensure consistent performance across different scenarios. 144 | 145 | * **🎯 Agent-Specific Quality Evaluations:** Beyond platform-wide evaluations, agent-level observability includes metrics tailored to specific agent behaviours: 146 | * **Intent fulfilment** (did the agent actually achieve what the user asked?) 147 | * **Tool use correctness** (did it call the right tool with correct parameters?) 148 | * **Reasoning efficiency** (were there unnecessary steps or redundant operations?) 149 | * **Multi-step coordination** (for complex agents with multiple reasoning phases) 150 | 151 | These agent-specific metrics provide developers with actionable insights for improving agent design and prompt engineering. 152 | 153 | * **🔧 Advanced Debugging & Diagnostics:** FCP's guidance provides rich tools to **search and inspect logs and traces** for any session or conversation. It has powerful filtering capabilities. This helps in **root cause analysis**. Moreover, developers can **replay traces**: taking a stored conversation and running it step-by-step (either on the same or updated version of the agent) to reproduce issues or test fixes. 
154 | 155 | * **🔄 Continuous Improvement Integration:** Agent observability feeds directly into the **development and deployment lifecycle**: 156 | * **CI/CD integration** with automated testing using historical prompt datasets 157 | * **A/B testing capabilities** for comparing different agent versions 158 | * **Performance regression detection** when deploying new agent versions 159 | * **Feedback loop integration** allowing insights from production to inform development 160 | 161 | For example, AI Evaluation tests can be integrated with your CI/CD pipeline: whenever a new version of an agent is deployed, a battery of **automated tests and evaluations** can run (using stored prompt datasets) to compare its performance versus the previous version. If any metric regresses or new safety issues appear, the deployment can be halted or flagged. This DevOps-style approach – sometimes called **"AIOps"** – ensures that quality is maintained even as the AI system evolves. 162 | 163 | ### **🔗 Unified Platform & Agent Observability** 164 | 165 | The true power of FCP's observability recommendations lies in the **seamless integration** with platform layer and **clear guidance** for agent layers: 166 | 167 | * **🎛️ Unified Dashboards:** Ready-to-use dashboards provide both **platform-wide overviews** and **agent-specific drill-downs**. Operations teams can monitor overall system health while developers can dive deep into individual agent performance. These dashboards give a **bird's-eye view** of the system including: 168 | * **Platform metrics:** Overall usage, cost trends, system performance, and compliance status 169 | * **Agent metrics:** Individual performance, quality scores, and usage patterns 170 | * **Comparative analytics:** Performance trends across teams, use cases, and time periods 171 | 172 | For example, an operations engineer can see that *"today, across all teams, we served 15,000 AI requests, consumed 4.2M tokens (cost ~£180), with average platform latency 1.5s and 99.2% uptime, while the Sales Q&A Bot specifically had 1.8s average response time with 2 minor safety flags."* Having this in one place allows both **technical and business stakeholders to stay informed**. The dashboard is dynamic – teams can drill down into specific time windows or filter by scenario/agent. 173 | 174 | * **🚨 Coordinated Alerting:** Alerts can be configured at both platform and agent levels, with **intelligent escalation paths** that consider both individual agent issues and platform-wide concerns. For instance, if multiple agents start showing performance degradation simultaneously, this might indicate a platform-level issue rather than individual agent problems. 175 | 176 | * **📊 Cross-Layer Analytics:** The platform provides **correlation analysis** between platform metrics and agent performance (when Azure Monitor used end-to-end), helping teams understand how infrastructure changes, model updates, or usage patterns affect individual agent behaviour and overall system performance. 177 | 178 | **Key Observability Tools in Citadel:** 179 | 180 | 181 | 182 | 183 | 184 | 185 | 186 | 187 | 188 | 189 | 190 | 191 | 192 | 193 | 194 | 195 | 196 | 197 | 198 | 199 | 200 | 201 | 202 | 203 | 204 | 205 | 206 | 207 | 208 | 209 | 210 | 211 | 212 | 213 | 214 | 215 | 216 | 217 | 218 | 219 | 220 | 221 | 222 | 223 | 224 | 225 | 226 | 227 | 228 | 229 | 230 | 231 | 232 | 233 | 234 | 235 | 236 |
| Observability Feature | Purpose | Layer |
|-----------------------|---------|-------|
| **Central APM Monitoring** | Infrastructure-level monitoring, resource utilisation, and system health indicators across all AI workloads without requiring agent code changes. | Platform |
| **Usage Analytics & Cost Tracking** | Granular tracking of token consumption, request patterns, and cost allocation segmented by team, use case, or agent for enterprise resource management. | Platform |
| **Centralised AI Evaluations** | Automated quality, safety, and compliance evaluations applied consistently across all agents without requiring code modifications from development teams. | Platform |
| **Enterprise Alerting & Remediation** | Sophisticated alerting with automated responses for sensitive use cases, including agent disabling, human escalation, and compliance notifications. | Platform |
| **End-to-End Tracing** | Captures every step of an AI agent's reasoning and interactions (prompts, tool calls, responses), enabling transparent debugging and post-mortem analysis. | Agent |
| **Agent Performance Monitoring** | Detailed real-time metrics including response latency breakdown, token usage patterns, tool efficiency, and resource consumption per agent. | Agent |
| **Agent-Specific Evaluations** | Tailored quality metrics for individual agent behaviours including intent fulfilment, tool use correctness, and reasoning efficiency. | Agent |
| **Advanced Debugging Tools** | Powerful querying, filtering, and trace replay capabilities for root-cause analysis and issue reproduction at the agent level. | Agent |
| **Unified Dashboards** | Integrated visual dashboards providing both platform-wide overviews and agent-specific drill-downs for comprehensive operational visibility. | Both |
| **Continuous Improvement Loop** | Connects operational data back to development with CI/CD integration, A/B testing, and regression detection for ongoing AI system enhancement. | Both |
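To make the agent-level tracing concrete, here is a minimal sketch that emits OpenTelemetry spans for an agent turn and exports them to Application Insights via the Azure Monitor distro. The span names, attribute keys, and connection-string variable are illustrative assumptions; agent frameworks that follow Citadel's guidance typically emit far richer telemetry automatically.

```python
# Minimal sketch of emitting agent-level traces to Application Insights with
# OpenTelemetry. Span names, attribute keys and the connection-string variable are
# illustrative assumptions; real agent frameworks may emit richer GenAI semantics.
import os

from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

# Route OpenTelemetry spans to the spoke's Application Insights resource.
configure_azure_monitor(connection_string=os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"])
tracer = trace.get_tracer("citadel.sample-agent")


def answer(question: str) -> str:
    with tracer.start_as_current_span("agent.invoke") as span:
        span.set_attribute("agent.question", question)

        with tracer.start_as_current_span("agent.tool.calculator") as tool_span:
            result = 21 * 2  # stand-in for a real tool call
            tool_span.set_attribute("tool.result", result)

        with tracer.start_as_current_span("agent.llm.completion") as llm_span:
            reply = f"The answer is {result}."  # stand-in for a model call via the gateway
            llm_span.set_attribute("llm.output_chars", len(reply))

        return reply


print(answer("What is 21 * 2?"))
```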
237 | 238 | All these observability measures ensure AI systems are **reliable and accountable**. Teams using Citadel principals can confidently answer **"What is my AI doing and why?"** at any time – a question that's otherwise hard to address. This pillar thus mitigates one of the biggest barriers to enterprise AI adoption: the fear of not knowing what the AI might do. With Citadel, **governance and observability go hand-in-hand**: if the Governance pillar is about setting the rules and guardrails, the Observability pillar is about **watching and verifying** adherence to those rules, and catching anything that falls outside. Together, they create a closed-loop system for responsible AI management, where issues are not only prevented but also detected and learned from in an ongoing cycle. 239 | 240 | *** 241 | 242 | ## **3. AI Development Velocity Pillar** – *Accelerating Innovation with Templates & Tools* 243 | 244 | > ### Build Fast, Build Right 245 | > Citadel provides both **low-code and pro-code** pathways to build AI agentic solutions, so teams can experiment and innovate quickly. Pre-built templates, integratable DevOps guidance, and flexible model choices enable rapid iteration *without* sacrificing governance or quality. 246 | 247 | While governance and oversight are crucial, **Foundry Citadel Platform** is not a single tool but an **AI Landing Zone**—a pre-configured set of Azure resources designed to help organizations **move quickly** and capitalize on AI opportunities. The **AI Development Velocity** pillar ensures that the platform **empowers AI developers and data scientists** with a spectrum of agentic platform choices, frameworks, and reusable assets. FCP strikes a critical balance: it enables rapid development **within established guardrails** through a template-based approach, so speed doesn’t come at the cost of security or oversight. Key aspects of this pillar include: 248 | 249 | * **🚀 Pre-built Deployment Templates:** FCP accelerates the provisioning of cloud environments with predefined deployment templates that can target single or multiple types of agents. These templates are integrated with the platform's central governance and security, allowing teams to quickly establish a production-ready environment for building and operating agents without manual configuration. 250 | 251 | * **🤖 Flexible Agent Development Models:** FCP supports a variety of agent types, allowing teams to choose the right approach for their needs. Customers may use one or a mix of these agent types, even within a single multi-agent system. The built-in types include: 252 | 253 | * **Copilot Studio Agents:** For a low-code approach, FCP integrates with **Copilot Studio**, a fully managed graphical interface for building and deploying AI agents. This drag-and-drop environment allows developers and power-users to design AI workflows visually. Within FCP, Copilot Studio agents are enhanced by integrating with the **Citadel AI Registry**, enabling them to securely discover and reuse existing agents and tools (like Model Context Protocol servers), ensuring governance even in a low-code context. 254 | 255 | * **Managed Runtime Agents:** For developers who want more control without managing the underlying infrastructure, FCP offers managed runtimes. Options include the **AI Foundry Agent Service**, which provides a scalable environment for hosting agent logic, and **Logic Apps Agent Loop**, which allows for the creation of serverless agent workflows. 
These runtimes provide a balance of flexibility and operational simplicity. 256 | 257 | * **Bring-Your-Own (BYO) Agents:** For maximum flexibility, FCP allows teams to bring their own agent architectures. Developers can leverage Microsoft's first-party AI orchestrators like **Semantic Kernel**, **Agent Framework**, and **AutoGen**, or third-party orchestrators such as **LangChain**. These agents can be containerized and deployed into Citadel’s environment, inheriting the platform's governance, security, and observability benefits. 258 | 259 | * **📚 The Citadel AI Registry:** At the heart of the platform's governance is the **Unified AI Gateway (CGH)** and its native capability to expose an **AI Registry**, which serves as a central catalog for all AI assets. Integration with the Citadel Governance Hub AI Gateway happens in two primary ways: 260 | * Getting **Managed AI Access:** Agents get secure access to LLMs, AI services, and published AI tools from the registry. This ensures that only approved models and tools are used within predefined capacity and security context. 261 | * **Publishing:** Service and team owners can publish their own tools and agents into the central AI Registry, making them discoverable and reusable by other teams across the organization. This fosters a secure collaborative environment and prevents duplication of effort. 262 | 263 | * **📦 One-Click Deployment & Reusable Blueprints:** To truly accelerate time-to-value, FCP provides automation for **environment setup and deployment**. The entire FCP reference architecture can be deployed via an **automated script or template** (e.g., Bicep), essentially a **“one-click deploy”** of the agent AI landing zone. This drastically reduces the initial setup time. On top of this, Microsoft offers **Gold Standard** assets—ready-made AI solution blueprints for common patterns like "Chat with your data" or "Conversation summarization." These blueprints come with code, configuration, and deployment scripts, serving as accelerators that allow teams to adapt proven solutions rather than starting from scratch. 264 | 265 | * **🔄 DevOps Integration & Lifecycle Management:** FCP treats AI solutions with the same rigor as any software project, embedding them into the DevOps toolchain. It provides seamless integration with **GitHub and Azure DevOps**, allowing developers to use pre-configured environments like **GitHub Codespaces**. Automated CI/CD pipelines can run evaluation suites on pull requests to ensure quality and deploy updated agents to staging or production environments. With **APIs and CLI tools**, teams can programmatically manage AI Gateway policies and safety settings as part of a release. FCP also supports **A/B testing and shadow deployments**, enabling teams to run multiple versions of an agent in parallel, compare their performance using observability data, and bring Agile principles to AI development. 266 | 267 | **Key Development Tools & Components:** 268 | 269 | 270 | 271 | 272 | 273 | 274 | 275 | 276 | 277 | 278 | 279 | 280 | 281 | 282 | 283 | 284 | 285 | 286 | 287 | 288 | 289 | 290 | 291 | 292 | 293 | 294 |
| Development Accelerator | Role in Citadel |
|-------------------------|-----------------|
| **Deployment Templates** | Pre-built, one-click templates (e.g. Bicep) to provision a secure, governed cloud environment for single or multiple agent types, accelerating time-to-production. |
| **Flexible Agent Runtimes**<br>(Copilot Studio, Managed Runtime, BYO) | Supports a spectrum of development models, from low-code (Copilot Studio) to managed services (AI Foundry Agent Service) and fully custom "Bring-Your-Own" orchestrators (Semantic Kernel, LangChain), allowing teams to choose the best fit for their use case. |
| **Citadel AI Registry** | A central, governed catalog for discovering, managing, and reusing AI assets. It provides managed access to LLMs and tools and allows teams to publish their own, fostering collaboration and preventing redundant work. |
| **Reusable Blueprints**<br>(Gold Standard Solutions) | End-to-end solution examples that demonstrate common AI patterns. They serve as accelerators for new projects, embodying proven architectures and best practices. |
| **DevOps Integration** | Integrates with GitHub and Azure DevOps for CI/CD, automated testing, and lifecycle management of AI solutions. Supports A/B testing and canary releases to bring modern software engineering speed to AI development. |
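As a sketch of the CI/CD integration above, the script below replays a small stored prompt set through the gateway and fails the pipeline when too few responses contain an expected phrase. The dataset path, environment variables, and the 90% pass bar are assumptions; a real pipeline would run a full evaluation suite rather than simple substring checks.

```python
# Sketch of a CI/CD evaluation gate: replay stored prompts through the gateway and
# fail the pipeline if too few responses contain their expected key phrase.
# The dataset path, environment variables and the 90% pass bar are assumptions.
import json
import os
import sys

from openai import OpenAI

client = OpenAI(
    base_url=os.environ["CITADEL_GATEWAY_URL"],
    api_key=os.environ["CITADEL_GATEWAY_KEY"],
)

with open("eval/prompts.jsonl", encoding="utf-8") as f:
    cases = [json.loads(line) for line in f]  # each line: {"prompt": ..., "must_contain": ...}

passed = 0
for case in cases:
    reply = client.chat.completions.create(
        model=os.environ.get("EVAL_MODEL", "gpt-4o-mini"),
        messages=[{"role": "user", "content": case["prompt"]}],
    ).choices[0].message.content
    passed += case["must_contain"].lower() in reply.lower()

score = passed / len(cases)
print(f"Evaluation pass rate: {score:.0%} ({passed}/{len(cases)})")
if score < 0.9:  # block the deployment on regression
    sys.exit(1)
```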
295 | 296 | All these capabilities mean that teams can innovate **rapidly** with AI. They can start with an idea, quickly assemble an MVP agent using existing building blocks, test it with real data (with governance in place), and iterate to improve it – all in a matter of days or weeks rather than months. Citadel’s approach of providing both low-code and pro-code options also ensures that **different personas can collaborate**: a business analyst could craft an initial agent behavior in Copilot Studio, then a software engineer could refine it using the code SDK for more complex logic – all deploying to the same managed environment. 297 | 298 | Crucially, **development speed does not mean throwing caution to the wind**. Every agent built on Citadel pillars and guidance, no matter how fast it was created, **runs within the secure, monitored framework** described in the previous sections. This means organisations can encourage experimentation and pilot projects without fear: if something grows in importance, the governance and reliability scaffolding is already there for it. Citadel effectively frees teams from the heavy lifting of creating a safe AI infrastructure from scratch, so they can focus on applying AI in ways that differentiate their business (be it new customer experiences, process automation, or decision support). 299 | 300 | Finally, this pillar embodies the idea of **scaling AI responsibly**. Once you’ve built one successful solution, Citadel makes it easier to rollout others (since the platform is already in place) and to templatise your approach. Over time, the catalogue of internal tools and connectors will grow – a “**network effect**” where each new AI project potentially adds reusable pieces for future projects. This accelerates AI adoption across the organisation in a governed way, helping build an **“AI factory”** capability. In summary, Citadel turns the typically slow, risky journey of AI solution development into a **fast, repeatable, and governed process**, accelerating innovation while maintaining **enterprise-grade standards**. 301 | 302 | *** 303 | 304 | ## **Architecture Overview:** *Inside the Foundry Citadel Platform Landing Zone* 305 | 306 | To support the three pillars above, **Foundry Citadel Platform (FCP)** implements a **reference architecture** that covers all necessary layers – from networking and compute to integration and data. It is essentially an **extension of your Azure Landing Zone** tailored for AI workloads, meant to run alongside your existing cloud setup (reusing things like your networking, identity, and governance foundations). Here is a high-level look at the key components of the FCP architecture and how they relate to the pillars: 307 | 308 | ![AI-Agents-Citadel-Architecture](/assets/AIAC-1.1.0.png) 309 | 310 | The above architecture ensures that FCP is not a monolithic product but a **collection of Azure services** wired together in a reference design. This modularity means it can be adapted – e.g., if an organisation has an existing logging solution, that can be integrated, or if they prefer AKS over Container Apps for specific compliance reasons, the design supports that swap. 311 | 312 | Critically, the **governance, observability, and dev velocity features are achieved by these components working in harmony**. 
313 | 
314 | For instance, the **Unified AI Gateway (Governance)** is a single point of entry for LLMs, published agents, and tools, which allows it to mediate traffic, enforce governance policies, and provide observability at a platform level across all AI agents and apps. 
315 | 
316 | Also, because the **landing zone is separate but connected** to the main enterprise landing zone, it can be introduced without disrupting existing applications – it’s an **add-on landing zone for AI** that still connects back to the core (network peering to the company’s Hub VNet, adhering to central governance via Azure Policy, etc.). 
317 | 
318 | To summarise the architecture in simpler terms: **Citadel’s landing zone is like a secure factory for AI agents**. The **Unified AI Gateway** is the fortified front door and guard for LLMs, tools & agents, the **Agent hosting** (AI Foundry, containers, apps, ...) is the assembly line where work gets done, and the **observability layer** is the set of instruments and dials that supervisors use to monitor the process and outcomes both at platform-level and agent-level. 
319 | 
320 | All of this is delivered with a blueprint so that organisations can set it up quickly and be confident that nothing was left out in the design (security, networking, ops, all accounted for). It gives you the **peace of mind** that as you scale up AI usage, you have an architecture that can handle **growth in users, agents, and integrations** without sacrificing control or performance. 
321 | 
322 | **Foundry Citadel Platform** is divided into two deployments: the central **Citadel Governance Hub (CGH)** landing zone representing the **governance and security** pillar, and **Citadel Agent Spoke (CAS)** landing zones for single agents or multi-agent systems serving specific use cases or business units, representing the **AI development velocity** pillar. 
323 | 
324 | The **Observability and compliance** pillar spans both CGH and CAS landing zones, providing unified monitoring and evaluation capabilities. 
325 | 
326 | ### **Citadel Governance Hub (CGH)**: Central governance & security 
327 | 
328 | The **Citadel Governance Hub (CGH)** is an enterprise-grade solution accelerator that establishes a centralized, governable, and observable control plane for all AI service consumption across multiple teams, use cases, and environments. Often referred to as the **AI Hub Gateway**, CGH replaces fragmented, unmonitored, key-based model access with a **unified AI gateway** pattern built on Azure API Management (APIM), adding intelligent routing, security enforcement, compliance guardrails, usage analytics, an AI registry, and automated onboarding. 
329 | 
330 | This elevates AI consumption from ad hoc experimentation to a scalable, auditable, and cost-attributable platform capability. 
331 | 
332 | > **🔗 Explore the AI Hub Gateway Repo:** 
333 | > For detailed guidance on deploying and operating the AI Hub Gateway—including architecture, templates, and best practices—visit the official [**AI Hub Gateway repository**](https://aka.ms/ai-hub-gateway). 
334 | >
335 | > This resource is your starting point for hands-on instructions, reference implementations, and operational insights to accelerate secure, governed AI adoption in your enterprise. 336 | 337 | #### 🏗️ What Gets Deployed 338 | 339 | ![Azure components](./assets/AIAC-Governance-1.1.0.png) 340 | 341 | | Component | Purpose | Enterprise Features | 342 | |-----------|---------|-------------------| 343 | | **🚪 API Management** | Unified AI gateway | LLM governance, AI resiliency, AI registry gateway | 344 | | **📘 API Center** | Universal AI Registry | Discovery of available AI tools, agents and AI services for 1st and 3rd party | 345 | | **🔍 AI Foundry** | Platform Observability and Compliance | Platform AI Evaluations & Compliance reports | 346 | | **📊 Log Analytics Workspace** | LLM Logs, metrics & audits | scalable enterprise telemetry ingestion and storage | 347 | | **📊 Application Insights** | Platform monitoring & analytics | performance dashboards, automated alerts | 348 | | **📨 Event Hub** | Usage data streaming & processing | Usage streaming, custom logging | 349 | | **🛡️ Azure Content Safety** | Centralized LLM protection | Prompt Shield and Content Safety protections | 350 | | **💳 Azure Language Service** | PII entity detection | Natural language based PII entity detection, anonymization | 351 | | **🗄️ Cosmos DB** | Usage analytics & cost allocation | Long term storage of usage, automatic scaling | 352 | | **⚡ Logic App** | Event processing & data transformation | Workflow-based processing of ingested usage/logs & AI Eval workflow | 353 | | **🔐 Managed Identity** | Zero-credential authentication | Secure service-to-service communication | 354 | | **🔗 Virtual Network** | Private connectivity & isolation | BYO-VNET support, private endpoints | 355 | | **🤖 Azure OpenAI (OPTIONAL)** | Multi-region OpenAI deployments (3 regions) | GPT-models, Realtime API, fully private | 356 | 357 | *** 358 | 359 | ### **Citadel Agent Spoke (CAS)**: Local AI development velocity 360 | 361 | The **Citadel Agent Spoke (CAS)** provides a comprehensive, enterprise-ready infrastructure foundation for deploying and scaling AI agent workloads on Azure. Built on Azure Verified Modules (AVM), CAS delivers a secure, network-isolated environment optimized for generative AI applications and agent services per domain or workload. The architecture centers around Azure AI Foundry with integrated agent capabilities, supported by a full suite of AI services, data stores, and enterprise-grade security controls. 362 | 363 | > **🔗 Explore the Citadel Agent Spoke Repo:** 364 | > For comprehensive guidance on deploying and operating Citadel Agent Spokes (CAS)—including architecture, deployment templates, and enterprise best practices—visit the official [**Citadel Agent Spoke repository**](https://github.com/Azure/AI-Landing-Zones). 365 | >
366 | > This resource provides step-by-step instructions, reference implementations, and operational insights to help you rapidly build, scale, and manage AI agent solutions in a secure, governed Azure environment. 367 | 368 | #### 🏗️ What Gets Deployed 369 | 370 | | Component | Purpose | Enterprise Features | 371 | |-----------|---------|-------------------| 372 | | **🤖 Azure AI Foundry** | AI agent development platform with Standard Agent Services | Agent capability hosts, project management, private networking, managed identities | 373 | | **🚪 API Management** | Unified AI gateway and service orchestration | LLM governance, API versioning, traffic control, usage analytics, security policies | 374 | | **🔍 Azure AI Search** | Vector and hybrid search for RAG patterns | Private endpoints, semantic search, vector indexing, enterprise security | 375 | | **🗄️ Azure Cosmos DB** | Distributed database for agent state and conversations | Global distribution, multi-region failover, private connectivity | 376 | | **💾 Azure Storage Account** | Blob storage for documents and model artifacts | Private endpoints, hierarchical namespace, lifecycle management | 377 | | **🔐 Azure Key Vault** | Secrets and certificate management | Private endpoints, RBAC integration, HSM-backed keys | 378 | | **📊 Azure Container Apps** | Containerized AI applications and microservices | Auto-scaling, managed environments, private networking | 379 | | **📦 Azure Container Registry** | Container image registry for AI workloads | Private endpoints, vulnerability scanning, geo-replication | 380 | | **📈 Application Insights** | Telemetry and performance monitoring | Custom metrics, distributed tracing, alerting | 381 | | **🌐 Virtual Network** | Network isolation and security | Private endpoints, NSGs, subnet segmentation | 382 | | **🌍 Application Gateway** | Web application firewall and load balancing | WAF protection, SSL termination, path-based routing | 383 | | **💻 Jump VM** | Secure access to private resources | Bastion integration, managed maintenance, RBAC | 384 | | **🏗️ Build VM** | DevOps and CI/CD operations | Automated deployments, secure build environment | 385 | | **🔒 Network Security Groups** | Subnet-level security controls | Fine-grained traffic rules, security logging | 386 | | **🌐 Private DNS Zones** | Name resolution for private endpoints | Automated DNS management, secure resolution | 387 | 388 | >Note: Many of the above components highlighted as part of CAS like network and Application Gateway are optional with toggles to provision new, do not provision or use existing. 
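The managed-identity and Key Vault rows above translate into very little application code. The sketch below shows the zero-credential pattern an agent running in the spoke might use to read a secret over the private endpoint; the vault URL variable and secret name are illustrative assumptions.

```python
# Minimal sketch of the zero-credential pattern: an agent in the spoke (e.g. on
# Container Apps) uses its managed identity to read a secret from the spoke Key
# Vault over the private endpoint. The vault URL and secret name are assumptions.
import os

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# DefaultAzureCredential picks up the managed identity at runtime (and a developer
# login locally), so no API key or connection string ever appears in code or config.
credential = DefaultAzureCredential()
client = SecretClient(
    vault_url=os.environ["SPOKE_KEY_VAULT_URI"],  # e.g. https://<vault-name>.vault.azure.net
    credential=credential,
)

search_api_key = client.get_secret("ai-search-query-key").value  # hypothetical secret name
print("Retrieved secret of length:", len(search_api_key))
```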
389 | 390 | #### Key Enterprise Capabilities 391 | 392 | ##### 🤖 **AI Agent Infrastructure** 393 | - **Agent Capability Hosts**: Dedicated infrastructure for AI agent services with Azure AI Foundry 394 | - **Project Management**: Multi-tenant project isolation and management capabilities 395 | - **Standard Agent Services**: Pre-configured agent runtime environment with networking integration/isolation 396 | 397 | ##### 🔐 **Security & Compliance** 398 | - **Zero Trust Architecture**: All services communicate through private endpoints 399 | - **Network Isolation**: Dedicated subnets with network security groups for each service tier 400 | - **Identity & Access Management**: Managed identities and RBAC integration across all components 401 | - **Secrets Management**: Centralized key and certificate management with Azure Key Vault 402 | 403 | ##### 🚀 **Scalability & Performance** 404 | - **Auto-scaling Container Apps**: Elastic compute for variable AI workloads 405 | - **Global Distribution**: Multi-region capabilities with Cosmos DB and geo-replication 406 | - **Load Balancing**: Application Gateway with WAF for high availability and security 407 | - **Caching & CDN**: Built-in caching strategies for optimal performance 408 | 409 | ##### 🔧 **DevOps & Operations** 410 | - **Infrastructure as Code**: Complete Bicep templates with Azure Verified Modules 411 | - **Monitoring & Observability**: Comprehensive telemetry with AI Foundry observability powered by Application Insights and Log Analytics 412 | - **Automated Deployments**: CI/CD integration with build agents and maintenance windows 413 | - **Configuration Management**: Centralized app configuration with Azure App Configuration 414 | 415 | ##### 🌍 **Networking & Connectivity** 416 | - **Hub-Spoke Architecture**: VNet peering capabilities for enterprise network integration 417 | - **Private Connectivity**: All AI services accessible only through private endpoints 418 | - **DNS Management**: Automated private DNS zone configuration and management 419 | - **Firewall Protection**: Azure Firewall with threat intelligence and custom rules 420 | 421 | ##### 📈 **Data & Analytics** 422 | - **Vector Search**: Advanced AI Search capabilities for RAG and semantic search scenarios 423 | - **Document Storage**: Hierarchical blob storage with lifecycle management policies 424 | - **State Management**: Distributed database for conversation history and agent state 425 | - **Configuration Store**: Centralized configuration management with feature flags 426 | 427 | #### Deployment Flexibility 428 | 429 | The blueprint supports both stand alone **greenfield deployments** and **integration with existing Foundry Citadel Platform - Citadel Governance Hub (CGH)**: 430 | 431 | - **Create New**: Deploy all components as new resources with optimized defaults 432 | - **Reuse Existing**: Integrate with existing virtual networks, DNS zones, and shared services 433 | - **Hybrid Approach**: Mix of new and existing resources based on organizational requirements 434 | 435 | This enterprise-ready blueprint provides the foundation for building, deploying, and scaling AI agent solutions while maintaining the highest standards of security, compliance, and operational excellence. 
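For illustration, the sketch below drives a blueprint deployment from a script and passes toggle-style parameters for the create-new versus reuse-existing choices described above. The template path and parameter names are hypothetical – consult the Citadel Agent Spoke repository for the actual parameter surface.

```python
# Sketch of driving a CAS blueprint deployment from a script, passing toggle-style
# parameters for "create new" vs "reuse existing" resources. The template file path
# and parameter names below are hypothetical placeholders.
import subprocess

parameters = {
    "deployVirtualNetwork": "false",        # hypothetical: reuse an existing VNet
    "existingVnetResourceId": "/subscriptions/.../virtualNetworks/hub-spoke-vnet",
    "deployApplicationGateway": "true",     # hypothetical: provision a new App Gateway
}

command = [
    "az", "deployment", "group", "create",
    "--resource-group", "rg-citadel-spoke-dev",
    "--template-file", "infra/main.bicep",  # hypothetical path to the blueprint entry point
]
for name, value in parameters.items():
    command += ["--parameters", f"{name}={value}"]

subprocess.run(command, check=True)  # raises if the deployment fails
```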
436 | 437 | ### **Citadel Governance Hub Integration** – *Automated Alignment Between Agents & Guardrails* 438 | 439 | Citadel streamlines the handshake between each **Citadel Agent Spoke** and the central **Citadel Governance Hub**, ensuring that every agent inherits the platform’s security, policy, and observability standards from day one. Through a fully automated onboarding flow, teams can codify their integration in source control and wire it directly into CI/CD pipelines—enabling repeatable deployments, rapid environment cloning, and verifiable governance drift checks. 440 | 441 | * **AI Access Contract:** Declares the governed dependencies an agent needs—LLMs, AI services, tools (MCP), and reusable agents—along with the precise access policies (model selection, capacity, regions, safety requirements). When automated, this contract guarantees consistent consumption guardrails across environments and simplifies approvals by making entitlements explicit. 442 | * **AI Publish Contract:** Describes the tools and agents a spoke exposes back to the hub, including the publishing rules, ownership metadata, and security posture. Automation turns this into a predictable cataloging workflow, accelerating time-to-discovery, enforcing compliance gates, and keeping the enterprise AI registry continuously in sync. 443 | 444 | By treating governance onboarding as code, organisations gain **audit-ready traceability**, **faster release cycles**, and **reduced manual effort**, while ensuring every agent remains within the Citadel’s unified policy perimeter. 445 | 446 | 447 | 448 | ## **Conclusion & Next Steps** 449 | 450 | **Foundry Citadel Platform (FCP)** brings together everything an enterprise needs to **build, run, and scale AI-powered solutions responsibly**. By focusing on the three pillars of **Governance & Security**, **Observability & Compliance**, and **Development Velocity**, it ensures that AI projects can move fast from idea to production **with the right safety nets in place** at every stage. Organisations adopting FCP can accelerate their AI journey: teams are empowered to create powerful AI agents (from simple chatbots to complex multi-agent systems) using a rich set of tools and templates, while central IT can rest assured that **proper controls and insights** are enforced globally through the **Citadel Governance Hub (CGH)** and delivered securely through **Citadel Agent Spokes (CAS)**. 451 | 452 | In practice, Citadel’s impact is significant: it can **accelerate time-to-value** for AI initiatives (by providing out-of-box infrastructure and best practices), and at the same time **reduce the risks** that typically accompany AI experiments (thanks to its rigorous governance and monitoring). It helps answer common executive concerns in AI projects – *“How do we prevent sensitive data leaks? How do we ensure the AI stays reliable and fair? How do we integrate these new AI apps into our existing systems and culture?”* – by providing a proven solution. This platform has already been leveraged in various industries, from finance (where auditability and security are paramount) to retail and manufacturing (where rapid innovation and cost control are key). Early adopters have reported increased confidence in deploying generative AI for critical use cases, knowing they can track usage, attribute costs, and meet compliance requirements. 453 | 454 | In essence, **Foundry Citadel Platform** enables enterprises to **innovate with AI at scale – safely, efficiently, and transparently**. 
It represents a move from ad-hoc AI experiments to a **disciplined AI engineering approach**: akin to going from crafting one-off artisan pieces to running a well-oiled factory that can produce reliable, high-quality products repeatedly. With FCP, organisations can unlock the tremendous potential of generative AI and autonomous agents *“with the confidence that comes from having a Citadel around your AI operations.”* 455 | 456 | *(Placeholder: Diagram of the AI Foundry Citadel Reference Architecture and Pillars)* 457 | 458 | In summary, **Foundry Citadel Platform** helps you **“build the future, safely”** – delivering the **speed** that business demands, with the **safeguards** that IT requires, all in one comprehensive, evolving platform. It is your organisation’s Citadel in the new world of AI – providing **protection, structure, and strength** as you scale new heights with enterprise AI. 459 | --------------------------------------------------------------------------------