├── .gitignore ├── example ├── strategy.md └── data-mesh │ └── data-product │ └── output-port │ └── files │ ├── 0001-data-product-output-port-files.cue │ ├── 0001-data-product-output-port-files-example.yaml │ └── 0001-data-product-output-port-files.md ├── gdr_template.md ├── README.md └── LICENSE /.gitignore: -------------------------------------------------------------------------------- 1 | /*.iml 2 | .$* 3 | .DS_Store 4 | /.idea/** 5 | /.idea/* 6 | /.idea 7 | 8 | -------------------------------------------------------------------------------- /example/strategy.md: -------------------------------------------------------------------------------- 1 | # ACME Strategy 2 | 3 | In ACME, as of the 2023-2025 industrial plan, we have decided to adopt the Data Mesh paradigm as our data strategy. 4 | 5 | ## Document purpose 6 | 7 | The purpose of this document is to collect the high-level status quo, according to all the specific decisions that are formalized as **GDRs**. The related GDR ids shall be reported when describing a specific top-level vision. 8 | 9 | ## Content Summary 10 | 11 | The strategy for the ACME data mesh implementation is divided into 4 sections, according to the 4 pillars of the paradigm: 12 | 13 | 1. Decentralized Domain Ownership 14 | 2. Data As A Product 15 | 3. Self-serve infrastructure as a Platform 16 | 4. Federated Computational Governance 17 | 18 | ### Decentralized Domain Ownership 19 | 20 | TBD 21 | 22 | ### Data As A Product 23 | 24 | ACME considers an acceptable "Data Product" an asset with . 25 | 26 | When it comes to Output Ports, the following types are currently supported: 27 | - files: these must be created according to the specifications in [GDR-0001](./data-mesh/data-product/output-port/files/0001-data-product-output-port-files.md). 
28 | - SQL: these must be created according to the specifications in TBD 29 | - events: these must be created according to the specifications in TBD 30 | 31 | ### Self-serve infrastructure as a Platform 32 | 33 | TBD 34 | 35 | ### Federated Computational Governance 36 | 37 | TBD 38 | 39 | ## Operating model 40 | 41 | The operating model used to reach the decisions formalized as GDRs is ... 42 | 43 | ## More 44 | 45 | TBD 46 | 47 | 48 | 49 | -------------------------------------------------------------------------------- /gdr_template.md: -------------------------------------------------------------------------------- 1 | 2 | # {Governance Decision Record title} 3 | 4 | ![NEW](https://img.shields.io/badge/HISTORY-NEW-brightgreen?style=flat&logo=CodeReview) 5 | ![AMENDS](https://img.shields.io/badge/HISTORY-AMENDS-green?style=flat&logo=CodeReview) 6 | ![SUPERCEDES](https://img.shields.io/badge/HISTORY-SUPERCEDES-yellowgreen?style=flat&logo=CodeReview) 7 | ![AMENDED](https://img.shields.io/badge/HISTORY-AMENDED-yellow?style=flat&logo=CodeReview) 8 | ![SUPERCEDED](https://img.shields.io/badge/HISTORY-SUPERCEDED-orange?style=flat&logo=CodeReview) 9 | ![DEPRECATED](https://img.shields.io/badge/HISTORY-DEPRECATED-black?style=flat&logo=CodeReview) 10 | 11 | (in the case of amend* and supercede* the related policy should be linked) 12 | 13 | ![DRAFT](https://img.shields.io/badge/LIFECYCLE-DRAFT-blue?style=flat&logo=StackShare) 14 | ![APPROVED](https://img.shields.io/badge/LIFECYCLE-APPROVED-brightgreen?style=flat&logo=StackShare) 15 | ![REJECTED](https://img.shields.io/badge/LIFECYCLE-REJECTED-red?style=flat&logo=StackShare) 16 | 17 | 18 | ## Context 19 | 20 | (where the policy applies, and why) 21 | 22 | ## Decision 23 | 24 | (The policy content, including code/data/metadata snippets if necessary) 25 | 26 | ### Lifecycle 27 | 28 | (What would create a breaking change and what would not) 29 | 30 | ## Consequences and accepted trade-offs 31 | 32 | (pros, cons, impact, tech debt 
implied) 33 | 34 | ## Implementation Steward 35 | 36 | (A set of persons, roles, or groups) 37 | 38 | ## Where the policy becomes computational 39 | 40 | - **LOCAL POLICY**: (when the policy is applied as part of the local scope of a data asset, or data product, e.g. live execution context of an orchestrated set of pipelines) 41 | - **GLOBAL POLICY**: (when the policy is applied to all the assets, e.g. to all the data products using a specific output port) 42 | 43 | - policy code: (link to the policy-as-code related file, if any) 44 | -------------------------------------------------------------------------------- /example/data-mesh/data-product/output-port/files/0001-data-product-output-port-files.cue: -------------------------------------------------------------------------------- 1 | // 2 | // To validate the YAML run: 3 | // 4 | // cue vet .yaml .cue 5 | // 6 | 7 | #OM_DataType: string & =~"(?i)^(NUMBER|TINYINT|SMALLINT|INT|BIGINT|BYTEINT|BYTES|FLOAT|DOUBLE|DECIMAL|NUMERIC|TIMESTAMP|TIME|DATE|DATETIME|INTERVAL|STRING|MEDIUMTEXT|TEXT|CHAR|VARCHAR|BOOLEAN|BINARY|VARBINARY|ARRAY|BLOB|LONGBLOB|MEDIUMBLOB|MAP|STRUCT|UNION|SET|GEOGRAPHY|ENUM|JSON)$" 8 | #OM_Constraint: string & =~"(?i)^(NULL|NOT_NULL|UNIQUE|PRIMARY_KEY)$" 9 | #Version: string & =~"^[0-9]+\\.[0-9]+\\..+$" 10 | #Id: string & =~"^[a-zA-Z0-9:._-]+$" 11 | #DataProductId: #Id 12 | #ComponentId: #Id 13 | #URL: string & =~"^https?://[a-zA-Z0-9@:%._~#=&/?]*$" 14 | 15 | #OM_TableData: { 16 | columns: [... string] 17 | rows: [... 
[...]] 18 | } 19 | 20 | #OM_Tag: { 21 | tagFQN: string 22 | description?: string | null 23 | source: string & =~"(?i)^(Tag|Glossary)$" 24 | labelType: string & =~"(?i)^(Manual|Propagated|Automated|Derived)$" 25 | state: string & =~"(?i)^(Suggested|Confirmed)$" 26 | href?: string | null 27 | } 28 | 29 | #OM_Column: { 30 | name: string 31 | dataType: #OM_DataType 32 | if dataType =~ "(?i)^(ARRAY)$" { 33 | arrayDataType: #OM_DataType 34 | } 35 | if dataType =~ "(?i)^(CHAR|VARCHAR|BINARY|VARBINARY)$" { 36 | dataLength: number 37 | } 38 | dataTypeDisplay?: string | null 39 | description?: string | null 40 | fullyQualifiedName?: string | null 41 | tags?: [... #OM_Tag] 42 | constraint?: #OM_Constraint | null 43 | ordinalPosition?: number | null 44 | if dataType =~ "(?i)^(JSON)$" { 45 | jsonSchema: string 46 | } 47 | if dataType =~ "(?i)^(MAP|STRUCT|UNION)$" { 48 | children: [... #OM_Column] 49 | } 50 | } 51 | 52 | #DataContract: { 53 | schema: [... #OM_Column] 54 | serializationFormat: string & =~"(?i)^(parquet|deltalake|iceberg)$" 55 | compressionCodec: string & =~"(?i)^(snappy|gzip)$" 56 | biTempBusinessTs: string 57 | biTempWriteTs: string 58 | termsAndConditions: string 59 | partitionedBy?: [... string] 60 | endpoint?: string | null 61 | SLA: #SLA 62 | } 63 | 64 | #SLA: { 65 | intervalOfChange: string 66 | timeliness: string 67 | upTime: number 68 | } 69 | 70 | #OutputPortSpecificFiles: { 71 | // Mandatory 72 | resourceGroup: =~"^.{2,}$" 73 | storageAccount: =~"^[a-z0-9]{3,24}$" 74 | container: =~"^[a-z0-9-]{3,63}$" 75 | directory?: [... string] 76 | 77 | performance: *"Standard" | "Premium" 78 | geoReplication: *"GRS" | "LRS" 79 | accessTier: *"Hot" | "Cool" 80 | 81 | // Fixed 82 | region: "germanywestcentral" 83 | accountKind: "StorageV2" 84 | shared_access_key_enabled: false 85 | infrastructure_encryption_enabled: true 86 | versioning_enabled: true | false 87 | ... 
88 | } 89 | 90 | #Component: { 91 | id: #ComponentId 92 | name: string & =~"^[a-z0-9-]{3,40}$" 93 | description: string & =~"^\\W*(?:\\w+\\b\\W*){10,200}$" 94 | version: #Version 95 | platform: string & "Azure" 96 | technology: string & "ADLSgen2" 97 | outputPortType: string & "Files" 98 | kind: string & =~"(?i)^(outputport|_)$" 99 | tags?: [... #OM_Tag] 100 | dataContract: #DataContract 101 | sampleData: #OM_TableData 102 | specific: #OutputPortSpecificFiles 103 | } 104 | 105 | components: [#Component, ...#Component] 106 | -------------------------------------------------------------------------------- /example/data-mesh/data-product/output-port/files/0001-data-product-output-port-files-example.yaml: -------------------------------------------------------------------------------- 1 | components: 2 | - id: urn:dmb:cmp:cards:ccfrauds:1:top_10_historical-year 3 | name: top-10-ccfrauds-historical-year 4 | description: an aggregation of top 10 x y for the historical series of credit cards frauds, partitioned by year 5 | kind: outputport 6 | version: 1.0.1 7 | platform: Azure 8 | technology: ADLSgen2 9 | outputPortType: Files 10 | dataContract: 11 | serializationFormat: parquet 12 | compressionCodec: snappy 13 | partitionedBy: [ year ] 14 | schema: 15 | - name: customerId 16 | dataType: string 17 | description: global addressable identifier for a customer 18 | constraint: PRIMARY_KEY 19 | tags: 20 | - tagFQN: GlobalAddressableIdentifier 21 | source: Tag 22 | labelType: Manual 23 | state: Confirmed 24 | - name: name 25 | dataType: string 26 | description: customer's first name 27 | constraint: NOT_NULL 28 | tags: 29 | - tagFQN: PII 30 | source: Tag 31 | labelType: Manual 32 | state: Confirmed 33 | - name: surname 34 | dataType: string 35 | description: customer's last name 36 | constraint: NOT_NULL 37 | tags: 38 | - tagFQN: PII 39 | source: Tag 40 | labelType: Manual 41 | state: Confirmed 42 | - name: businessTs 43 | dataType: timestamp 44 | description: the business 
timestamp, to be leveraged for time-travelling 45 | constraint: NOT_NULL 46 | tags: [ ] 47 | - name: writeTs 48 | dataType: timestamp 49 | description: the technical (write) timestamp, to be leveraged for time-travelling 50 | constraint: NOT_NULL 51 | tags: [ ] 52 | - name: amount 53 | dataType: double 54 | description: the amount in EUR, 2 decimals precision 55 | constraint: NOT_NULL 56 | - name: year 57 | dataType: int 58 | description: reference year of the event 59 | constraint: NOT_NULL 60 | SLA: 61 | intervalOfChange: "1 hour" 62 | timeliness: "10 minutes" 63 | upTime: 99.9 64 | termsAndConditions: confidential, it shouldn't be exported outside of the organization 65 | endpoint: https://acmedpcardssvilop.blob.core.windows.net/cards-ccfrauds 66 | biTempBusinessTs: businessTs 67 | biTempWriteTs: writeTs 68 | tags: 69 | - tagFQN: cards 70 | source: Tag 71 | labelType: Manual 72 | state: Confirmed 73 | - tagFQN: analytical 74 | source: Tag 75 | labelType: Manual 76 | state: Confirmed 77 | - tagFQN: svil 78 | source: Tag 79 | labelType: Manual 80 | state: Confirmed 81 | - tagFQN: PII 82 | source: Tag 83 | labelType: Manual 84 | state: Confirmed 85 | 86 | specific: 87 | subscription: 0f876e36-124c-77f1-aabb-e543b3d2b3ad 88 | resourceGroup: cards.data_products 89 | storageAccount: acmecardsdpsvilop 90 | container: cards-ccfrauds 91 | shared_access_key_enabled: false 92 | infrastructure_encryption_enabled: true 93 | versioning_enabled: true 94 | accountKind: StorageV2 95 | directory: 96 | - /1 97 | geoReplication: GRS 98 | accessTier: Hot 99 | performance: Premium 100 | region: germanywestcentral -------------------------------------------------------------------------------- /example/data-mesh/data-product/output-port/files/0001-data-product-output-port-files.md: -------------------------------------------------------------------------------- 1 | # Data Product Output Port Files 2 | 3 | 
![NEW](https://img.shields.io/badge/HISTORY-NEW-brightgreen?style=flat&logo=CodeReview) 4 | ![APPROVED](https://img.shields.io/badge/LIFECYCLE-APPROVED-brightgreen?style=flat&logo=StackShare) 5 | 6 | ## Context 7 | 8 | One of the accepted output port types for Data Products is "FILES", which usually includes: 9 | 10 | - the storage technology and related configurations 11 | - the serialization format 12 | - the compression codec 13 | - lifecycle information (cost management, ownership, ...) 14 | - security and compliance constraints 15 | - data contract 16 | 17 | From an architectural perspective, we're talking about files on a storage system/container (HDFS, Ozone, MinIO, S3, 18 | ADLSgen2, etc.). 19 | 20 | ## Decision 21 | 22 | Our company must comply with the Financial Services regulations for EMEA; here follows what Data Product developers 23 | can/should/must require via the platform: 24 | 25 | - **STORAGE SPEC**: 26 | - technology: Storage account with Azure Data Lake storage gen2 (`StorageV2`) 27 | - region: due to regulatory constraints, the only allowed region is Germany West Central (`germanywestcentral`) 28 | - encryption: `true` or `false`, depending on whether a _PII_ tag is assigned 29 | - public access: `none` 30 | - performance: `Standard` or `Premium` 31 | - access tier: `Hot` or `Cool` 32 | - redundancy: `GRS` or `ZRS` 33 | - tags: ``, `` (if any), `dp`, `` 34 | - shared access key enabled: `false`, always mandatory 35 | - **STORAGE STRUCTURE**: 36 | - different storage accounts for different _company_, _domains_, _subdomains_, _environments_, _usage_ (`op` as in 37 | Output Port). Storage Account name should be `acmedpop`, no 38 | spaces/hyphens/underscores. The storage account must be part of a resource group (1 per domain). 39 | - different containers for different data products. Container name should 40 | be `-`. 
41 | - different directories for different **major** versions (assuming Semantic Versioning is leveraged) and 42 | different output ports of the data product 43 | - different subdirectories for different output ports 44 | - different subdirectories for different partitions/buckets depending on the partitioning/bucketing 45 | strategy 46 | - **FORMAT**: 47 | - serialization: `parquet`, `iceberg`, or `deltalake` 48 | - compression codec: `snappy`, `gzip`, or `none` 49 | - partitionedBy: [list of partition keys] 50 | - **DATA CONTRACT** should include: 51 | - schema (according to 52 | the [OpenMetadata](https://docs.open-metadata.org/metadata-standard/schemas/entities/table#column) specification) 53 | - serialization format 54 | - compression codec 55 | - bi-temporality business time reference 56 | - bi-temporality technical time reference 57 | - SLA/SLO 58 | 59 | Not all of these properties must be specified in the specification, since the mandatory ones will be applied as default 60 | behaviours by the related specific provisioner. 61 | 62 | Here follows the specific metadata: 63 | 64 | ```yaml 65 | # [ inside an output port component object ] 66 | 67 | id: # according to platform's standards 68 | name: # minimum 3, max 30 chars with hyphens 69 | platform: Azure # mandatory 70 | technology: ADLSgen2 # mandatory 71 | outputPortType: Files # mandatory 72 | kind: outputport # mandatory 73 | description: # minimum 10, maximum 200 words 74 | version: # major.minor.patch 75 | dataContract: 76 | serializationFormat: parquet # or "deltalake", or "iceberg" 77 | compressionCodec: snappy # or "none" or "gzip" 78 | partitionedBy: [ , ] # array of partition keys 79 | schema: { ... } # according to OpenMetadata spec, including fields tagging for PII 80 | SLA: 81 | intervalOfChange: # descriptive, e.g. "1 hour" 82 | timeliness: # descriptive, e.g. 
"10 minutes" 83 | upTime: # like 99.9 84 | termsAndConditions: 85 | endpoint: 86 | biTempBusinessTs: businessTs # according to the schema 87 | biTempWriteTs: writeTs # according to the schema 88 | tags: [] # array of Tags as per the OpenMetadata spec 89 | specific: 90 | subscription: "1234567-1234-1234-1234-12345567788" # azure subscription id 91 | resourceGroup: cards.data_products # azure resource group in that subscription, in the form domain.dataproduct_name 92 | storageAccount: acmecardsdpsvilop # azure storage account id 93 | container: cards-ccfrauds # container in that storage account, in the form domain-dataproduct_name 94 | shared_access_key_enabled: false # true | false 95 | infrastructure_encryption_enabled: true # true if PII | false otherwise 96 | versioning_enabled: true # mandatory 97 | accountKind: StorageV2 # mandatory 98 | directory: /1 # optional 99 | geoReplication: GRS # or ZRS 100 | accessTier: Hot # or Cool 101 | performance: Premium # or Standard 102 | region: germanywestcentral # mandatory 103 | ``` 104 | 105 | The resulting specification (NOTE: here we reference 106 | the [Data Product Specification](https://github.com/agile-lab-dev/Data-Product-Specification/blob/main/example.yaml), in 107 | particular focused on the Output Port component) is reported as an example 108 | in [0001-data-product-output-port-files-example.yaml](0001-data-product-output-port-files-example.yaml). 109 | 110 | ### LIFECYCLE 111 | 112 | The following changes are considered _BREAKING_: 113 | 114 | - schema changes, in case no out-of-the-box backward compatibility is guaranteed 115 | - changes to the bi-temporality fields and, in general, to the data contract 116 | - serialization format changes 117 | - compression codec changes 118 | 119 | In the case of a breaking change, the **major** version of the Data Product should change (e.g. 1.x.y -> 2.0.0). 120 | 121 | All the other changes are considered _NOT BREAKING_ and should lead to a minor or patch change of the Data Product's version. 
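The version-bump rule above can be sketched in code. The following is a minimal illustration only (the helper name and the change labels are hypothetical, not part of this GDR or of any provisioner):

```python
# Hypothetical sketch of the version-bump rule: breaking changes bump the
# major version (e.g. 1.x.y -> 2.0.0); all other changes bump minor or patch.

# Change kinds this GDR lists as BREAKING (labels are illustrative).
BREAKING_CHANGES = {
    "schema-incompatible",    # schema change without backward compatibility
    "bi-temporality-fields",  # changes to biTempBusinessTs / biTempWriteTs
    "serialization-format",
    "compression-codec",
}

def next_version(current: str, change: str) -> str:
    """Return the next semantic version given the kind of change."""
    major, minor, patch = (int(p) for p in current.split("."))
    if change in BREAKING_CHANGES:
        return f"{major + 1}.0.0"          # breaking: 1.x.y -> 2.0.0
    if change == "new-optional-field":     # illustrative non-breaking addition
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"  # any other non-breaking change
```

For instance, `next_version("1.0.1", "serialization-format")` returns `"2.0.0"`, while a non-breaking addition only bumps the minor version.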
122 | 123 | ## Consequences and accepted trade-offs 124 | 125 | Domains must comply with this serialization format and compression codec. In case of a major version change, data product 126 | owners should inform all the consumers, documenting the changelog and migration plan (including the time period during 127 | which both old and new versions are kept in parallel, so as to avoid breaking changes on the consumers' side). Obviously, 128 | this implies higher maintenance costs for the DP owner. 129 | 130 | For regulatory reasons, only one Azure region is admitted. 131 | 132 | The Premium tier is more expensive, but the massive layout of the data product requires this level of performance 133 | for analytical use cases. 134 | 135 | ## Implementation Steward 136 | 137 | The Data Product Owner for requesting the proper provisioning. 138 | The Platform Team for implementing the governance policy-as-code on the platform. 139 | 140 | ## Where the policy becomes computational 141 | 142 | - **LOCAL POLICY**: the Data Product Owner should clone and reuse the template for the FILES output port on ADLSgen2 143 | - **GLOBAL POLICY**: the Platform Team should implement an automated validation of the data product specification in the 144 | section specific to the output port's metadata at deploy-time (while requesting the infrastructure provisioning in a 145 | self-service fashion), so as to detect non-compliant requirements along with breaking changes. 146 | - policy code: [0001-data-product-output-port-files.cue](0001-data-product-output-port-files.cue) 147 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Governance Decision Record 2 | 3 | The Governance Decision Record (**GDR**) is a specification model for (computational) data governance policies inspired by [ADR](https://adr.github.io/) (Architectural Decision Record). 
Its goal is to enable the creation of version-controlled data governance policies that include: 4 | 5 | - a **policy lifecycle** state 6 | - a **policy history** state 7 | - the policy **title** 8 | - the **context** 9 | - the **decision** 10 | - the **consequences** and accepted trade-offs 11 | 12 | These sections are basically shared with the ADR model. This specification, which aims to fit the Data Mesh context but can also be used for different data management paradigms, adds some more sections: 13 | 14 | - an **implementation steward** 15 | - where the **policy becomes computational** 16 | 17 | Having documented and version-controlled policies is also useful to enable distributed (federated), async, but tracked and organized work by the governance team (the *federated* governance team, in the storytelling of Data Mesh). 18 | 19 | Documents are usually created when taking design decisions in IT. Like the ADR in software architectures, the GDR's goal is to enable structured/versioned/governable federated work on a git repository that can include code (policy-as-code), thus closing the gap with the "platforms" world - where most of the governance decisions must be executed or made live: in fact, a GDR paired with policies-as-code can be directly accessed by a governance platform, thus offering the "computational" policy capability. When this capability is also orchestrated as part of a more complex lifecycle of technical assets (like self-serve provisioning for data products), then the picture is complete. Agile Lab has made this view a real thing, creating [Witboost Data Mesh Boost](https://www.agilelab.it/witboost-data-mesh-main). 20 | 21 | Let's deep dive into each section. 22 | 23 | ## Policy Lifecycle State 24 | This can be as simple as a label tracking the **lifecycle** state of a policy. 
Common states are: 25 | 26 | - `DRAFT`, when a policy is being developed and still needs to be formally approved, or has been submitted for approval; 27 | - `APPROVED`, when a policy has been formally approved: this makes it actionable and a reference for the overall governance; 28 | - `REJECTED`, when a policy has been formally rejected (after the approval process). 29 | 30 | In the [GDR template](gdr_template.md) file, some pre-compiled web-rendered labels are provided. 31 | 32 | ## Policy History State 33 | This can be as simple as a label tracking the **history** state of a policy. Common states are: 34 | 35 | - `NEW`, when a policy is created for the first time and doesn't amend or supercede an existing one; 36 | - `AMENDS` or `AMENDED`, when an approved policy amends (or is amended by) another existing policy; 37 | - `SUPERCEDES` or `SUPERCEDED`, when a policy supercedes (or is superceded by) another existing policy; 38 | - `DEPRECATED`, when a policy ceases to be valid/applied and no other one amends or supercedes it. 39 | 40 | **NOTE:** in the case of amend* and supercede*, the related policy should be linked. 41 | 42 | In the [GDR template](gdr_template.md) file, some pre-compiled web-rendered labels are provided. 43 | 44 | ## Context 45 | This section describes the context where the policy applies (and why). 46 | 47 | ## Decision 48 | The decision the policy aims to apply. 49 | 50 | ### Lifecycle 51 | Declare what changes to the metadata (or anything else) would be considered BREAKING and what NOT BREAKING. This is important to implement automations at the platform level and create a robust change management process based on trust between data producers and consumers. 52 | 53 | ## Consequences and accepted trade-offs 54 | What we accept to happen while the policy is applied, including pros (improvements) and cons (impacts, rework, new accountabilities or requirements). 
Since there's no "universally optimal decision", the policy should also report the trade-offs the organization is going to accept with this policy, which in some scenarios could mean making the accumulated tech debt explicit (a note on tech debt: this is usually hidden and hard to track; when it's made explicit, it's easier to measure and keep track of the overall tech debt, system quality in terms of architecture and behaviour, etc.). 55 | 56 | ## Implementation Steward 57 | Who is supposed to take care of the implementation (we talk about implementation since the policy, as in the context of Data Mesh, is supposed to become as "computational" as possible, thus leading to the automation of the data management practice, probably with the help of a backing platform). It can also be the role with the accountability to follow the application of such a policy. 58 | 59 | ## Where the policy becomes computational 60 | The specific points in the architecture, the platform, the system, the context, etc. where this policy (and its checks, if any) is implemented so as to become an _automation_ (thus becoming _"computational"_). 61 | 62 | This is split into **LOCAL** and **GLOBAL** policy: the former addresses a policy locally implemented/applied/verified (in the context of Data Mesh, this could be a Data Product Owner wanting to calculate and measure the Data Quality over data at rest in the DP's output ports; this is specific to that context and does not affect others, like other domains or DP owners), while the latter is for policies globally applied (e.g. the S3 bucket provisionable for Data Products' output ports can only be in the `eu-central-1` AWS region). 63 | 64 | If using a descriptive modelling language, a metadata validation policy-as-code file can be provided (probably it will be integrated in the platform, e.g. using CUE lang for YAML). 65 | 66 | ----------- 67 | 68 | ### How to make use of this policy model? 69 | 70 | An example of usage includes: 71 | 1. 
setting up a git repo 72 | 2. (optional) installing a tool so that every contributor follows the same process (which is a good idea to document in the repo itself), e.g. [adr-tools](https://github.com/npryce/adr-tools) 73 | 3. keeping track of the governance policies to create by leveraging the issue tracking system of the git repo, making use of all the features the issue tracking system provides (like labels, epics, etc ...) 74 | 4. working out the policy issues, creating the related merge requests 75 | 5. implementing the policy, leveraging the [template](gdr_template.md) provided here 76 | 6. providing a metadata model, example, and validation (policy-as-code) file 77 | 7. when the policy is ready, merging it (according to the governance process) and making it executive. 78 | 79 | An important **note** on points 3, 4, 5, 6, and 7: in the case of **Data Mesh**, the federated governance team (which includes SMEs, Subject Matter Experts, coming from all the most meaningful units of the company, like engineering, security, and compliance, as well as the domains' representative spokespersons) should collaborate within their own perimeters of expertise. Probably, a Federated Governance Team "core members" group (e.g. the Platform team) could take care of the final merge of the policies as in point 7, thus also acting as a final validation. 80 | 81 | The policies can (will) evolve over time during the data platform lifecycle. In order to accommodate and embrace change, it's suggested to create a folder for every GDR and name the GDR (policy) file with the notation `xxxx-policy-content-or-decision.md` (in case the Markdown format is used for the policy document; xxxx is a monotonically increasing id that tracks the policy's evolutions/versions). 
Generally speaking, multiple different GDRs (addressing different decisions of the same area of application) are supposed to coexist within the same folder: in the case of governance policies this could lead to misunderstanding of the incremental sequence id, but grouping into nested folders/subfolders can still be used. 82 | 83 | When evolving an existing policy, it's important to take care of the policy lifecycle state, especially when amending or superceding existing policies. By using the 1:1 folder-to-policy ratio, it's straightforward to identify the most recent (and supposedly currently valid) policy for every context. 84 | 85 | *NOTE*: it could be worthwhile to also have a very high-level document reporting the current state of the system/company according to (and reporting) all the decisions leading to the current status quo. 86 | 87 | ### Example 88 | 89 | A pretty exhaustive example policy and the related metadata + policy-as-code validation files are provided in the [example](example/data-mesh/data-product/output-port/files) folder. In this example, the specific architectural decision (now a GDR) describes how an *Output Port* of type "FILES" should be defined, provisioned, configured, described, and validated. 90 | The folder contains 3 files: 91 | - [0001-data-product-output-port-files.md](example/data-mesh/data-product/output-port/files/0001-data-product-output-port-files.md) containing the descriptive GDR (as an implementation of the [template](gdr_template.md)). 
In this example, we took inspiration from the Financial Services world (in terms of constraints); 92 | - [0001-data-product-output-port-files.cue](example/data-mesh/data-product/output-port/files/0001-data-product-output-port-files.cue) containing the [CUE lang](https://cuelang.org) policy-as-code validation file, which will supposedly be integrated into the Data Mesh self-service infrastructure-as-a-platform; 93 | - [0001-data-product-output-port-files-example.yaml](example/data-mesh/data-product/output-port/files/0001-data-product-output-port-files-example.yaml) containing an example of metadata specification with real-world values. 94 | 95 | The GDR versioning assumes this is the first policy created to address this governance topic. 96 | 97 | The overall vision is reported in the [top-level strategy file](./example/strategy.md). 98 | 99 | The policy metadata can be validated with the policy-as-code file using the CUE CLI (if installed): 100 | 101 | ```bash 102 | cue vet example/data-mesh/data-product/output-port/files/0001-data-product-output-port-files-example.yaml example/data-mesh/data-product/output-port/files/0001-data-product-output-port-files.cue 103 | ``` 104 | 105 | ## Coming next 106 | 107 | Future releases could include: 108 | - an organizational process for the governance meetings 109 | - a workflow to manage the policies' lifecycle 110 | - more examples 111 | 112 | ## License 113 | 114 | The proposed approach, template, examples, and policy-as-code files are shared with the community under the [Apache 2.0](LICENSE) LICENSE. 115 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 
8 | 
9 |       "License" shall mean the terms and conditions for use, reproduction,
10 |       and distribution as defined by Sections 1 through 9 of this document.
11 | 
12 |       "Licensor" shall mean the copyright owner or entity authorized by
13 |       the copyright owner that is granting the License.
14 | 
15 |       "Legal Entity" shall mean the union of the acting entity and all
16 |       other entities that control, are controlled by, or are under common
17 |       control with that entity. For the purposes of this definition,
18 |       "control" means (i) the power, direct or indirect, to cause the
19 |       direction or management of such entity, whether by contract or
20 |       otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 |       outstanding shares, or (iii) beneficial ownership of such entity.
22 | 
23 |       "You" (or "Your") shall mean an individual or Legal Entity
24 |       exercising permissions granted by this License.
25 | 
26 |       "Source" form shall mean the preferred form for making modifications,
27 |       including but not limited to software source code, documentation
28 |       source, and configuration files.
29 | 
30 |       "Object" form shall mean any form resulting from mechanical
31 |       transformation or translation of a Source form, including but
32 |       not limited to compiled object code, generated documentation,
33 |       and conversions to other media types.
34 | 
35 |       "Work" shall mean the work of authorship, whether in Source or
36 |       Object form, made available under the License, as indicated by a
37 |       copyright notice that is included in or attached to the work
38 |       (an example is provided in the Appendix below).
39 | 
40 |       "Derivative Works" shall mean any work, whether in Source or Object
41 |       form, that is based on (or derived from) the Work and for which the
42 |       editorial revisions, annotations, elaborations, or other modifications
43 |       represent, as a whole, an original work of authorship. For the purposes
44 |       of this License, Derivative Works shall not include works that remain
45 |       separable from, or merely link (or bind by name) to the interfaces of,
46 |       the Work and Derivative Works thereof.
47 | 
48 |       "Contribution" shall mean any work of authorship, including
49 |       the original version of the Work and any modifications or additions
50 |       to that Work or Derivative Works thereof, that is intentionally
51 |       submitted to Licensor for inclusion in the Work by the copyright owner
52 |       or by an individual or Legal Entity authorized to submit on behalf of
53 |       the copyright owner. For the purposes of this definition, "submitted"
54 |       means any form of electronic, verbal, or written communication sent
55 |       to the Licensor or its representatives, including but not limited to
56 |       communication on electronic mailing lists, source code control systems,
57 |       and issue tracking systems that are managed by, or on behalf of, the
58 |       Licensor for the purpose of discussing and improving the Work, but
59 |       excluding communication that is conspicuously marked or otherwise
60 |       designated in writing by the copyright owner as "Not a Contribution."
61 | 
62 |       "Contributor" shall mean Licensor and any individual or Legal Entity
63 |       on behalf of whom a Contribution has been received by Licensor and
64 |       subsequently incorporated within the Work.
65 | 
66 |    2. Grant of Copyright License. Subject to the terms and conditions of
67 |       this License, each Contributor hereby grants to You a perpetual,
68 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 |       copyright license to reproduce, prepare Derivative Works of,
70 |       publicly display, publicly perform, sublicense, and distribute the
71 |       Work and such Derivative Works in Source or Object form.
72 | 
73 |    3. Grant of Patent License. Subject to the terms and conditions of
74 |       this License, each Contributor hereby grants to You a perpetual,
75 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 |       (except as stated in this section) patent license to make, have made,
77 |       use, offer to sell, sell, import, and otherwise transfer the Work,
78 |       where such license applies only to those patent claims licensable
79 |       by such Contributor that are necessarily infringed by their
80 |       Contribution(s) alone or by combination of their Contribution(s)
81 |       with the Work to which such Contribution(s) was submitted. If You
82 |       institute patent litigation against any entity (including a
83 |       cross-claim or counterclaim in a lawsuit) alleging that the Work
84 |       or a Contribution incorporated within the Work constitutes direct
85 |       or contributory patent infringement, then any patent licenses
86 |       granted to You under this License for that Work shall terminate
87 |       as of the date such litigation is filed.
88 | 
89 |    4. Redistribution. You may reproduce and distribute copies of the
90 |       Work or Derivative Works thereof in any medium, with or without
91 |       modifications, and in Source or Object form, provided that You
92 |       meet the following conditions:
93 | 
94 |       (a) You must give any other recipients of the Work or
95 |           Derivative Works a copy of this License; and
96 | 
97 |       (b) You must cause any modified files to carry prominent notices
98 |           stating that You changed the files; and
99 | 
100 |       (c) You must retain, in the Source form of any Derivative Works
101 |           that You distribute, all copyright, patent, trademark, and
102 |           attribution notices from the Source form of the Work,
103 |           excluding those notices that do not pertain to any part of
104 |           the Derivative Works; and
105 | 
106 |       (d) If the Work includes a "NOTICE" text file as part of its
107 |           distribution, then any Derivative Works that You distribute must
108 |           include a readable copy of the attribution notices contained
109 |           within such NOTICE file, excluding those notices that do not
110 |           pertain to any part of the Derivative Works, in at least one
111 |           of the following places: within a NOTICE text file distributed
112 |           as part of the Derivative Works; within the Source form or
113 |           documentation, if provided along with the Derivative Works; or,
114 |           within a display generated by the Derivative Works, if and
115 |           wherever such third-party notices normally appear. The contents
116 |           of the NOTICE file are for informational purposes only and
117 |           do not modify the License. You may add Your own attribution
118 |           notices within Derivative Works that You distribute, alongside
119 |           or as an addendum to the NOTICE text from the Work, provided
120 |           that such additional attribution notices cannot be construed
121 |           as modifying the License.
122 | 
123 |       You may add Your own copyright statement to Your modifications and
124 |       may provide additional or different license terms and conditions
125 |       for use, reproduction, or distribution of Your modifications, or
126 |       for any such Derivative Works as a whole, provided Your use,
127 |       reproduction, and distribution of the Work otherwise complies with
128 |       the conditions stated in this License.
129 | 
130 |    5. Submission of Contributions. Unless You explicitly state otherwise,
131 |       any Contribution intentionally submitted for inclusion in the Work
132 |       by You to the Licensor shall be under the terms and conditions of
133 |       this License, without any additional terms or conditions.
134 |       Notwithstanding the above, nothing herein shall supersede or modify
135 |       the terms of any separate license agreement you may have executed
136 |       with Licensor regarding such Contributions.
137 | 
138 |    6. Trademarks. This License does not grant permission to use the trade
139 |       names, trademarks, service marks, or product names of the Licensor,
140 |       except as required for reasonable and customary use in describing the
141 |       origin of the Work and reproducing the content of the NOTICE file.
142 | 
143 |    7. Disclaimer of Warranty. Unless required by applicable law or
144 |       agreed to in writing, Licensor provides the Work (and each
145 |       Contributor provides its Contributions) on an "AS IS" BASIS,
146 |       WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 |       implied, including, without limitation, any warranties or conditions
148 |       of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 |       PARTICULAR PURPOSE. You are solely responsible for determining the
150 |       appropriateness of using or redistributing the Work and assume any
151 |       risks associated with Your exercise of permissions under this License.
152 | 
153 |    8. Limitation of Liability. In no event and under no legal theory,
154 |       whether in tort (including negligence), contract, or otherwise,
155 |       unless required by applicable law (such as deliberate and grossly
156 |       negligent acts) or agreed to in writing, shall any Contributor be
157 |       liable to You for damages, including any direct, indirect, special,
158 |       incidental, or consequential damages of any character arising as a
159 |       result of this License or out of the use or inability to use the
160 |       Work (including but not limited to damages for loss of goodwill,
161 |       work stoppage, computer failure or malfunction, or any and all
162 |       other commercial damages or losses), even if such Contributor
163 |       has been advised of the possibility of such damages.
164 | 
165 |    9. Accepting Warranty or Additional Liability. While redistributing
166 |       the Work or Derivative Works thereof, You may choose to offer,
167 |       and charge a fee for, acceptance of support, warranty, indemnity,
168 |       or other liability obligations and/or rights consistent with this
169 |       License. However, in accepting such obligations, You may act only
170 |       on Your own behalf and on Your sole responsibility, not on behalf
171 |       of any other Contributor, and only if You agree to indemnify,
172 |       defend, and hold each Contributor harmless for any liability
173 |       incurred by, or claims asserted against, such Contributor by reason
174 |       of your accepting any such warranty or additional liability.
175 | 
176 |    END OF TERMS AND CONDITIONS
177 | 
178 |    APPENDIX: How to apply the Apache License to your work.
179 | 
180 |       To apply the Apache License to your work, attach the following
181 |       boilerplate notice, with the fields enclosed by brackets "[]"
182 |       replaced with your own identifying information. (Don't include
183 |       the brackets!) The text should be enclosed in the appropriate
184 |       comment syntax for the file format. We also recommend that a
185 |       file or class name and description of purpose be included on the
186 |       same "printed page" as the copyright notice for easier
187 |       identification within third-party archives.
188 | 
189 |    Copyright [yyyy] [name of copyright owner]
190 | 
191 |    Licensed under the Apache License, Version 2.0 (the "License");
192 |    you may not use this file except in compliance with the License.
193 |    You may obtain a copy of the License at
194 | 
195 |        http://www.apache.org/licenses/LICENSE-2.0
196 | 
197 |    Unless required by applicable law or agreed to in writing, software
198 |    distributed under the License is distributed on an "AS IS" BASIS,
199 |    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 |    See the License for the specific language governing permissions and
201 |    limitations under the License.
202 | 
--------------------------------------------------------------------------------