├── .github └── workflows │ ├── CODEOWNERS │ └── cla.yml ├── CLA.md ├── CODE_OF_CONDUCT.md ├── LICENSE ├── README.md ├── SECURITY.md ├── images ├── SML-Object-Hierarchy.png ├── sml-image-medium.png ├── sml-logo-large.png └── sml-logo-small.png └── sml-reference ├── calculation.md ├── catalog.md ├── composite-model.md ├── connection.md ├── dataset.md ├── dimension.md ├── metric.md ├── model.md ├── package.md └── row-security.md /.github/workflows/CODEOWNERS: -------------------------------------------------------------------------------- 1 | # Require one of these two users to review any change in the repo 2 | * @diannewood @svetoslavpetkov 3 | -------------------------------------------------------------------------------- /.github/workflows/cla.yml: -------------------------------------------------------------------------------- 1 | name: "CLA Assistant" 2 | 3 | on: 4 | issue_comment: 5 | types: [created] 6 | pull_request_target: 7 | types: [opened,closed,synchronize] 8 | 9 | permissions: 10 | actions: write 11 | contents: write 12 | pull-requests: write 13 | statuses: write 14 | 15 | jobs: 16 | cla-assistant: 17 | runs-on: "ubuntu-latest" 18 | steps: 19 | - name: "CLA Assistant" 20 | if: "(github.event.comment.body == 'recheck' || github.event.comment.body == 'I have read the CLA Document and I hereby sign the CLA') || github.event_name == 'pull_request_target'" 21 | uses: contributor-assistant/github-action@v2.5.1 22 | env: 23 | GITHUB_TOKEN: "${{ secrets.GITHUB_TOKEN }}" 24 | PERSONAL_ACCESS_TOKEN: "${{ secrets.PERSONAL_ACCESS_TOKEN }}" 25 | with: 26 | remote-organization-name: "semanticdatalayer" 27 | remote-repository-name: "sml-cla" 28 | path-to-signatures: "signatures/version1/cla.json" 29 | path-to-document: "https://github.com/semanticdatalayer/SML/blob/main/CLA.md" 30 | branch: "main" 31 | allowlist: "dependabot[bot],greenkeeper[bot]" -------------------------------------------------------------------------------- /CLA.md: -------------------------------------------------------------------------------- 1 | # AtScale Individual Contributor License Agreement 2 | 3 | Thank you for your interest in contributing to open source software projects (“Projects”) made available by AtScale Inc. (“AtScale”). This Individual Contributor License Agreement (“Agreement”) sets out the terms governing any source code, object code, bug fixes, configuration changes, tools, specifications, documentation, data, materials, feedback, information or other works of authorship that you submit or have submitted, in any form and in any manner, to AtScale in respect of any Projects (collectively “Contributions”). If you have any questions respecting this Agreement, please contact ATSCALE SUPPORT EMAIL. 4 | 5 | You agree that the following terms apply to all of your past, present and future Contributions. Except for the licenses granted in this Agreement, you retain all of your right, title and interest in and to your Contributions. 6 | 7 | **Copyright License.** You hereby grant, and agree to grant, to AtScale a non-exclusive, perpetual, irrevocable, worldwide, fully-paid, royalty-free, transferable copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, and distribute your Contributions and such derivative works, with the right to sublicense the foregoing rights through multiple tiers of sublicensees. 
8 |
9 | **Patent License.** You hereby grant, and agree to grant, to AtScale a non-exclusive, perpetual, irrevocable, worldwide, fully-paid, royalty-free, transferable patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer your Contributions, where such license applies only to those patent claims licensable by you that are necessarily infringed by your Contributions alone or by combination of your Contributions with the Project to which such Contributions were submitted, with the right to sublicense the foregoing rights through multiple tiers of sublicensees.
10 |
11 | **Moral Rights.** To the fullest extent permitted under applicable law, you hereby waive, and agree not to assert, all of your “moral rights” in or relating to your Contributions for the benefit of AtScale, its assigns, and their respective direct and indirect sublicensees.
12 |
13 | **Third Party Content/Rights.** If your Contribution includes or is based on any source code, object code, bug fixes, configuration changes, tools, specifications, documentation, data, materials, feedback, information or other works of authorship that were not authored by you (“Third Party Content”) or if you are aware of any third party intellectual property or proprietary rights associated with your Contribution (“Third Party Rights”), then you agree to include with the submission of your Contribution full details respecting such Third Party Content and Third Party Rights, including, without limitation, identification of which aspects of your Contribution contain Third Party Content or are associated with Third Party Rights, the owner/author of the Third Party Content and Third Party Rights, where you obtained the Third Party Content, and any applicable third party license terms or restrictions respecting the Third Party Content and Third Party Rights. For greater certainty, the foregoing obligations respecting the identification of Third Party Content and Third Party Rights do not apply to any portion of a Project that is incorporated into your Contribution to that same Project.
14 |
15 | **Representations.** You represent that, other than the Third Party Content and Third Party Rights identified by you in accordance with this Agreement, you are the sole author of your Contributions and are legally entitled to grant the foregoing licenses and waivers in respect of your Contributions. If your Contributions were created in the course of your employment with your past or present employer(s), you represent that such employer(s) has authorized you to make your Contributions on behalf of such employer(s) or such employer(s) has waived all of their right, title or interest in or to your Contributions.
16 |
17 | **Disclaimer.** To the fullest extent permitted under applicable law, your Contributions are provided on an "as is" basis, without any warranties or conditions, express or implied, including, without limitation, any implied warranties or conditions of non-infringement, merchantability or fitness for a particular purpose. You are not required to provide support for your Contributions, except to the extent you desire to provide support.
18 |
19 | **No Obligation.** You acknowledge that AtScale is under no obligation to use or incorporate your Contributions into any of the Projects. The decision to use or incorporate your Contributions into any of the Projects will be made at the sole discretion of AtScale or its authorized delegates.
20 | 21 | **Disputes.** This Agreement shall be governed by and construed in accordance with the laws of the State of New York, United States of America, without giving effect to its principles or rules regarding conflicts of laws, other than such principles directing application of New York law. The parties hereby submit to venue in, and jurisdiction of the courts located in New York, New York for purposes relating to this Agreement. In the event that any of the provisions of this Agreement shall be held by a court or other tribunal of competent jurisdiction to be unenforceable, the remaining portions hereof shall remain in full force and effect. 22 | 23 | **Assignment.** You agree that AtScale may assign this Agreement, and all of its rights, obligations and licenses hereunder. -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | In the interest of fostering an open and welcoming environment, we as 6 | contributors and maintainers pledge to making participation in our project and 7 | our community a harassment-free experience for everyone, regardless of age, body 8 | size, disability, ethnicity, gender identity and expression, level of experience, 9 | nationality, personal appearance, race, religion, or sexual identity and 10 | orientation. 11 | 12 | ## Our Standards 13 | 14 | Examples of behavior that contributes to creating a positive environment 15 | include: 16 | 17 | - Using welcoming and inclusive language 18 | - Being respectful of differing viewpoints and experiences 19 | - Gracefully accepting constructive criticism 20 | - Focusing on what is best for the community 21 | - Showing empathy towards other community members 22 | 23 | Examples of unacceptable behavior by participants include: 24 | 25 | - The use of sexualized language or imagery and unwelcome sexual attention or 26 | advances 27 | - Trolling, insulting/derogatory comments, and personal or political attacks 28 | - Public or private harassment 29 | - Publishing others' private information, such as a physical or electronic 30 | address, without explicit permission 31 | - Other conduct which could reasonably be considered inappropriate in a 32 | professional setting 33 | 34 | ## Our Responsibilities 35 | 36 | Project maintainers are responsible for clarifying the standards of acceptable 37 | behavior and are expected to take appropriate and fair corrective action in 38 | response to any instances of unacceptable behavior. 39 | 40 | Project maintainers have the right and responsibility to remove, edit, or 41 | reject comments, commits, code, wiki edits, issues, and other contributions 42 | that are not aligned to this Code of Conduct, or to ban temporarily or 43 | permanently any contributor for other behaviors that they deem inappropriate, 44 | threatening, offensive, or harmful. 45 | 46 | ## Scope 47 | 48 | This Code of Conduct applies both within project spaces and in public spaces 49 | when an individual is representing the project or its community. Examples of 50 | representing a project or community include using an official project e-mail 51 | address, posting via an official social media account, or acting as an appointed 52 | representative at an online or offline event. Representation of a project may be 53 | further defined and clarified by project maintainers. 
54 |
55 | ## Enforcement
56 |
57 | Instances of abusive, harassing, or otherwise unacceptable behavior may be
58 | reported by contacting the project team at **abuse@atscale.com**. All
59 | complaints will be reviewed and investigated and will result in a response that
60 | is deemed necessary and appropriate to the circumstances. The project team is
61 | obligated to maintain confidentiality with regard to the reporter of an incident.
62 | Further details of specific enforcement policies may be posted separately.
63 |
64 | Project maintainers who do not follow or enforce the Code of Conduct in good
65 | faith may face temporary or permanent repercussions as determined by other
66 | members of the project's leadership.
67 |
68 | ## Attribution
69 |
70 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71 | available at <https://contributor-covenant.org/version/1/4/>
72 |
73 | [homepage]: https://contributor-covenant.org
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ![logo](images/sml-logo-large.png) 2 | 3 | # What is SML? 4 | Semantic Modeling Language, or SML for short, encompasses over a decade of hands-on development, solving use cases for hundreds of customers across industries such as finance, healthcare, retail, manufacturing, CPG, and more. SML covers more than just tabular use cases. 
At its core, it is a multidimensional semantic modeling language that supports metrics, dimensions, hierarchies, semi-additive measures, many-to-many relationships, cell-based expressions, and much more. 5 | 6 | SML delivers on the following requirements: 7 | 8 | 1. **Object-oriented**: SML is an object-oriented language that promotes composability and inheritance. This allows semantic objects to be shared within other semantic objects and across organizations, supporting easy and consistent model-building. 9 | 2. **Comprehensive**: SML is based on more than a decade of modeling experience across various industry verticals and use cases. SML handles multi-dimensional constructs and serves as a superset of all other existing semantic modeling languages. 10 | 3. **Familiar**: SML is based on YAML, a widely adopted, human-readable, industry-standard syntax. 11 | 4. **CI/CD Friendly**: SML is code, so it is compatible with Git and CI/CD practices for version control, automated deployment, and software lifecycle management. 12 | 5. **Extensible**: SML syntax can be enhanced to support additional properties and features. 13 | 6. **Open**: SML is Apache open-sourced to support community innovation and is free to use in any application or use case. 14 | 15 | ## What's in this repository? 16 | 17 | Open-sourcing SML aims to promote the building of reusable models and semantic objects. We are making the SML specification available for public consumption and collaboration. Soon, we will add software tools to make serializations and translations from various semantic dialects easier. 18 | 19 | We are or will be open-sourcing the following: 20 | 21 | 1. **A YAML-based Language Specification**: The SML specification is documented and encompasses tabular and multidimensional constructs. 22 | 2. **Pre-built Semantic Models**: The GitHub repository contains pre-built semantic models that incorporate standard data models, such as TPC-DS, common training models like Worldwide Importers and AdventureWorks, and marketplace models like Snowplow and CRISP. We expect to add semantic models for SaaS applications such as Salesforce, Google Analytics, and Jira soon. 23 | 3. **SML SDK**: An SDK that facilitates the programmatic reading and writing of SML syntax. 24 | 4. **SML CLIs**: Command line interfaces (CLIs) for installing SML dependencies and validating SML syntax. This includes a reference CLI for deploying SML models to a proprietary semantic layer platform. 25 | 5. **Semantic Converters**: A CLI for translating other semantic modeling languages to and from SML, including Snowflake Cortex semantic models, Databricks UC Metrics, and Power BI semantic models. 
26 |
27 | ## SML Example
28 | The following is an example of an SML `model` object:
29 |
30 | ```
31 | unique_name: Internet Sales
32 | object_type: model
33 | label: Internet Sales
34 | visible: true
35 |
36 | relationships:
37 |   - unique_name: factinternetsales_Date_Dimension_Order
38 |     from:
39 |       dataset: factinternetsales
40 |       join_columns:
41 |         - orderdatekey
42 |     to:
43 |       dimension: Date Dimension
44 |       level: DayMonth
45 |     role_play: "Order {0}"
46 |
47 | dimensions:
48 |   - Color Dimension
49 |   - Size Dimension
50 |   - Style Dimension
51 |   - Weight
52 |
53 | metrics:
54 |   - unique_name: orderquantity
55 |     folder: Sales Metrics
56 |
57 |   - unique_name: salesamount
58 |     folder: Sales Metrics
59 | ```
60 |
61 | ## SML Object Hierarchy
62 | The following diagram illustrates the key SML objects and their relationships:
63 |
64 | ```mermaid
65 | erDiagram
66 |     CATALOG }|..|{ MODEL : has
67 |     CATALOG }|..|{ PACKAGE : "may have"
68 |     MODEL ||--|{ DIMENSION : references
69 |     MODEL ||--|{ METRIC : references
70 |     MODEL ||--|{ METRIC_CALC : references
71 |     MODEL ||--|{ PACKAGE : "may reference"
72 |     MODEL ||--|{ DATASET : references
73 |     DIMENSION ||--|{ DATASET : references
74 |     MODEL ||--|{ ROW_SECURITY : "may reference"
75 |     DIMENSION ||--|{ ROW_SECURITY : "may reference"
76 |     METRIC ||--|{ DATASET : references
77 |     METRIC_CALC ||--|{ METRIC : "may reference"
78 |     METRIC_CALC ||--|{ DIMENSION : "may reference"
79 |     DATASET ||--|{ CONNECTION : references
80 | ```
81 |
82 | ## SML Object Documentation
83 |
84 | The following sections describe the different SML object types as well
85 | as the properties available for each:
86 |
87 | - [Catalog](sml-reference/catalog.md) - Defines the control file for an SML repository. It contains all repository-level definitions.
88 | - [Package](sml-reference/package.md) - Defines references to additional Git repositories whose objects can be used in the current repository.
89 | - [Model](sml-reference/model.md) - Defines the logical, business-friendly representation on top of the physical data.
90 | - [Dimension](sml-reference/dimension.md) - Defines the logical collection of attributes and hierarchies for supporting drill-down.
91 | - [Row Security](sml-reference/row-security.md) - Defines row-level data access rules for users and groups.
92 | - [Metric](sml-reference/metric.md) - Defines a numeric value representing a summarized (or aggregated) column.
93 | - [Calculation](sml-reference/calculation.md) - Defines an expression to combine, evaluate, or manipulate other metrics defined in the model.
94 | - [Dataset](sml-reference/dataset.md) - Defines columns on a physical table or query. Columns can be defined as SQL expressions.
95 | - [Connection](sml-reference/connection.md) - Defines a database and schema for connecting datasets to the physical data platform.
96 | - [Composite Model](sml-reference/composite-model.md) - Defines a model made up of multiple other models.
97 |
98 | ## SML Converters
99 |
100 | [SML Converters and Tooling](https://github.com/semanticdatalayer/sml-converters) - Library of bi-directional SML converters and tooling for different semantic layer platforms.
101 |
102 | ## Model Library
103 |
104 | ### Tutorial Models
105 | 1. [Internet Sales Model](https://github.com/semanticdatalayer/sml-models-tutorials-internet-sales) - a simple, single-fact model derived from the fictitious AdventureWorks retail dataset.
106 | 2.
[World Wide Importers Model](https://github.com/semanticdatalayer/sml-models-tutorials-ww-importers) - a more complex, multi-fact model representing a fictional wholesale and distribution company. 107 | 3. [TPC-DS Model](https://github.com/semanticdatalayer/sml-models-tutorials-tpcds) - a complex, multi-fact model that encodes the [TPC-DS](https://www.tpc.org/tpcds/) benchmark model in SML. 108 | 4. [TPC-H Model](https://github.com/semanticdatalayer/sml-models-tutorials-tpch) - a complex, multi-fact model that encodes the [TPC-H](https://www.tpc.org/tpch/) benchmark model in SML. 109 | 5. [AdventureWorks2012 Model](https://github.com/semanticdatalayer/sml-models-tutorials-adventureworks2012) - the standard Microsoft SSAS tutorial in SML. 110 | 111 | ### Data Warehouse Usage/Cost Models 112 | 1. [Snowflake Usage Model](https://github.com/semanticdatalayer/sml-models-usage-snowflake) - a semantic model for analyzing Snowflake credit and data warehouse usage. 113 | 114 | ### Marketplace Models 115 | 1. [Snowplow Digital Analytics Model](https://github.com/semanticdatalayer/sml-models-snowplow) - Snowplow empowers organizations to create a scalable, first-party data foundation so marketing and data teams can effectively analyze and tackle Customer 360 use cases. 116 | 2. [CRISP CPG Retail and Distributor Data Model](https://github.com/semanticdatalayer/sml-models-crisp-cpg-retail) - Crisp connects to over 40 leading U.S. retailers and distributors. 117 | -------------------------------------------------------------------------------- /SECURITY.md: -------------------------------------------------------------------------------- 1 | Thanks for helping make **AtScale** Inc projects safe for everyone. 2 | 3 | ## Security 4 | 5 | AtScale takes the security of our software products and services seriously, including all of the open source code repositories managed through our GitHub organizations, such as [AtScale](https://github.com/AtScaleInc). 6 | 7 | Even though [open source repositories are outside of the scope of our bug bounty program](https://bounty.github.com/index.html#scope) and therefore not eligible for bounty rewards, we will ensure that your finding gets passed along to the appropriate maintainers for remediation. 8 | 9 | ## Reporting Security Issues 10 | 11 | If you believe you have found a security vulnerability in any AtScale-owned repository, please report it to us through coordinated disclosure. 12 | 13 | **Please do not report security vulnerabilities through public GitHub issues, discussions, or pull requests.** 14 | 15 | Instead, please send an email to **security@atscale.com**. 16 | 17 | Please include as much of the information listed below as you can to help us better understand and resolve the issue: 18 | 19 | - The type of issue (e.g., buffer overflow, SQL injection, or cross-site scripting) 20 | - Full paths of source file(s) related to the manifestation of the issue 21 | - The location of the affected source code (tag/branch/commit or direct URL) 22 | - Any special configuration required to reproduce the issue 23 | - Step-by-step instructions to reproduce the issue 24 | - Proof-of-concept or exploit code (if possible) 25 | - Impact of the issue, including how an attacker might exploit the issue 26 | 27 | This information will help us triage your report more quickly. 
28 |
29 | ## Policy
30 |
31 | See [AtScale's Vulnerability Disclosure Program Policy - This page is currently under construction.](XXX)
--------------------------------------------------------------------------------
/images/SML-Object-Hierarchy.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/semanticdatalayer/SML/8f4ced8b3e398d25551c7a631bd16b8ebbf39c19/images/SML-Object-Hierarchy.png
--------------------------------------------------------------------------------
/images/sml-image-medium.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/semanticdatalayer/SML/8f4ced8b3e398d25551c7a631bd16b8ebbf39c19/images/sml-image-medium.png
--------------------------------------------------------------------------------
/images/sml-logo-large.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/semanticdatalayer/SML/8f4ced8b3e398d25551c7a631bd16b8ebbf39c19/images/sml-logo-large.png
--------------------------------------------------------------------------------
/images/sml-logo-small.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/semanticdatalayer/SML/8f4ced8b3e398d25551c7a631bd16b8ebbf39c19/images/sml-logo-small.png
--------------------------------------------------------------------------------
/sml-reference/calculation.md:
--------------------------------------------------------------------------------
1 | # Calculation
2 |
3 | Calculation files define custom MDX expressions for creating calculated
4 | metrics in SML. They can be used to combine, evaluate, or manipulate
5 | other metrics defined in the model. For example, you can do simple math
6 | operations to combine metrics, or simple comparison operations to return
7 | a given metric value when certain conditions are met.
8 |
9 | In SML, calculation files are a subset of metrics. The separation of
10 | calculation metrics from other types enables you to easily create
11 | boilerplate calculations that can be used across multiple metrics.
12 |
13 | Sample `calculation` file:
14 |
15 | ```yaml
16 | unique_name: Average Catalog Unit Net Profit
17 | object_type: metric_calc
18 | label: Average Catalog Unit Net Profit
19 | expression: "[Measures].[m_cs_net_profit_sum]/[Measures].[m_cs_quantity_sum]"
20 | ```
21 |
22 | # Entity Relationships
23 |
24 | ```mermaid
25 | classDiagram
26 |     class MetricCalc{
27 |       String unique_name
28 |       String label
29 |       String description
30 |       const object_type
31 |       String format
32 |       String expression
33 |       String mdx_aggregation_function
34 |       Boolean is_hidden
35 |     }
36 | ```
37 |
38 | # Calculation Properties
39 |
40 | ## unique_name
41 |
42 | - **Type:** string
43 | - **Required:** Y
44 |
45 | The unique name of the calculation. This must be unique across all
46 | repositories and subrepositories.
47 |
48 | ## object_type
49 |
50 | - **Type:** const
51 | - **Required:** Y
52 |
53 | The type of object defined by the file. For calculations, the value of
54 | this property must be `metric_calc`.
55 |
56 | ## label
57 |
58 | - **Type:** string
59 | - **Required:** Y
60 |
61 | The name of the calculation as it appears in SML. This value does
62 | not need to be unique.
63 |
64 | ## expression
65 |
66 | - **Type:** string
67 | - **Required:** Y
68 |
69 | The MDX expression to use for the calculation.
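For example, a conditional expression that guards a ratio against division by zero could look like the following sketch, reusing the metric names from the sample file above (`IIF` is a standard MDX function):

```
IIF([Measures].[m_cs_quantity_sum] > 0,
    [Measures].[m_cs_net_profit_sum] / [Measures].[m_cs_quantity_sum],
    NULL)
```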
70 |
71 | This expression must be written in MDX syntax, surrounded by quotes (").
72 | Additionally, it can only operate on existing metrics in the model, and
73 | must return a numeric value.
74 |
75 | ## description
76 |
77 | - **Type:** string
78 | - **Required:** N
79 |
80 | A description of the calculation.
81 |
82 | ## format
83 |
84 | - **Type:** string
85 | - **Required:** N
86 |
87 | The format of the values returned by the calculation. You can use one of
88 | SML's built-in named formats, or a custom string format.
89 |
90 | Supported named formats:
91 |
92 | - `fixed`
93 | - `general number`
94 | - `none`
95 | - `percent`
96 | - `scientific`
97 | - `standard`
98 |
99 | Custom format strings should be in quotes (") and contain one to four
100 | sections, separated by semicolons.
101 |
102 | ## is_hidden
103 |
104 | - **Type:** boolean
105 | - **Required:** N
106 |
107 | Determines whether the calculation is visible in BI tools.
108 |
109 | Supported values:
110 |
111 | - `true`
112 | - `false`
113 |
114 | ## mdx_aggregation_function
115 |
116 | - **Type:** string
117 | - **Required:** N
118 |
119 | Supported values:
120 |
121 | - `SUM`
122 | - `MAX`
123 | - `MIN`
124 | - `COUNT`
125 | - `AVG`
126 |
--------------------------------------------------------------------------------
/sml-reference/catalog.md:
--------------------------------------------------------------------------------
1 | # Catalog
2 |
3 | `catalog.yml` (the catalog file) is the control file for an SML
4 | repository. It contains all repository-level definitions, such as the
5 | repository name and settings for building aggregates. Each repository
6 | must contain a file called `catalog.yml` at the root level.
7 |
8 | Sample `catalog` file:
9 |
10 | ```yaml
11 | unique_name: sml-models
12 | object_type: catalog
13 | label: SML Model Library
14 | version: 1.0
15 | aggressive_agg_promotion: false
16 | build_speculative_aggs: false
17 | ```
18 |
19 | # Entity Relationships
20 |
21 | ```mermaid
22 | classDiagram
23 |     class Catalog{
24 |       String unique_name
25 |       const object_type
26 |       String label
27 |       int version
28 |       Boolean aggressive_agg_promotion
29 |       Boolean build_speculative_aggs
30 |       Object dataset_properties
31 |     }
32 | ```
33 |
34 | # Catalog Properties
35 |
36 | ## unique_name
37 |
38 | - **Type:** string
39 | - **Required:** Y
40 |
41 | The name of the repository. This must be unique across all repositories
42 | and subrepositories.
43 |
44 | ## object_type
45 |
46 | - **Type:** const
47 | - **Required:** Y
48 |
49 | The type of object defined by the file. For `catalog.yml`, this must be
50 | `catalog`.
51 |
52 | ## label
53 |
54 | - **Type:** string
55 | - **Required:** Y
56 |
57 | The name of the repository, as it appears in the consumption tool. This value does
58 | not need to be unique.
59 |
60 | ## version
61 |
62 | - **Type:** number
63 | - **Required:** Y
64 |
65 | The version of SML being used.
66 |
67 | ## aggressive_agg_promotion
68 |
69 | - **Type:** boolean
70 | - **Required:** Y
71 |
72 | Enables/disables aggressive aggregate promotion for the repository. When
73 | enabled, all aggregates referenced by a query are considered for
74 | promotion, regardless of whether a join to other non-preferred or
75 | non-aggregate datasets was required.
76 |
77 | Supported values:
78 |
79 | - `true`
80 | - `false`
81 |
82 | ## build_speculative_aggs
83 |
84 | - **Type:** boolean
85 | - **Required:** Y
86 |
87 | Enables/disables speculative aggregates for the repository.
88 |
89 | When enabled, the query engine automatically creates aggregate tables
90 | that it anticipates being useful based on your models. These are
91 | intended to make queries from client BI tools run faster than they
92 | would with demand-defined aggregates alone.
93 |
94 | **Note:** Speculative aggregates are also called prediction-defined aggregates.
95 |
96 | Supported values:
97 |
98 | - `true`
99 | - `false`
100 |
101 | ## dataset_properties
102 |
103 | - **Type:** object
104 | - **Required:** N
105 |
106 | Defines dataset properties to use within the repository.
107 |
108 | Supported properties:
109 |
110 | - `allow_aggregates`: Boolean, optional. Enables the query engine to
111 |   create aggregates for datasets in the repository.
112 | - `allow_local_aggs`: Boolean, optional. Enables local aggregation for
113 |   datasets in the repository.
114 | - `allow_peer_aggs`: Boolean, optional. Enables aggregation on data
115 |   derived from datasets in data warehouses that are different from the
116 |   source dataset.
117 | - `allow_preferred_aggs`: Boolean, optional. Allows aggregates to be built
118 |   in preferred storage.
119 |
120 | Specify the `unique_name` of the dataset followed by the properties and
121 | values you want to set for it at the repository level. For example:
122 |
123 |     dataset1:
124 |       allow_peer_aggs: true
125 |
126 | **Note:** Datasets are typically defined at the repository level, in
127 | `catalog.yml`; however, datasets used by a specific model (typically
128 | fact datasets) can be defined within the model itself.
--------------------------------------------------------------------------------
/sml-reference/composite-model.md:
--------------------------------------------------------------------------------
1 | # Composite Model
2 |
3 | Composite models combine multiple other models that all share a dimension, along with calculations specific to the composite model itself. They are defined by `composite_model` files in SML.
4 |
5 | When you deploy a composite model, all of its referenced objects are deployed as a single model. Note, however, that deployed composite models do not include the following objects from their referenced models:
6 | - User-defined aggregates
7 | - Partitions
8 | - Perspectives
9 | - Drill-throughs
10 |
11 | Sample `composite_model` file:
12 |
13 | ```yaml
14 | unique_name: TPCDS - Composite
15 | object_type: composite_model
16 | label: TPCDS - Composite
17 | description: This is a composite model that combines TPC-DS subject-area models.
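# Note: per the property rules below, the models listed next must be regular
# (non-composite) models that all share at least one common dimension, and the
# metrics listed after them must be existing metric_calc objects.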
18 | 19 | models: 20 | - TPC-DS Catalog Sales 21 | - TPC-DS Inventory 22 | - TPC-DS Store Promotion 23 | - TPC-DS Store Returns 24 | - TPC-DS Store Sales 25 | - TPC-DS Web Sales 26 | 27 | metrics: 28 | - unique_name: Store and Web Purchased Amount 29 | folder: Time Relative 30 | - unique_name: Catalog Purchased Amount Growth 31 | folder: Time Relative 32 | - unique_name: m_ws_cs_ext_sales_price_sum 33 | folder: Time Relative 34 | ``` 35 | 36 | # Entity Relationships 37 | 38 | ```mermaid 39 | classDiagram 40 | CompositeModel ..> ModelReference 41 | CompositeModel ..> MetricReference 42 | namespace CompositeModels{ 43 | class CompositeModel{ 44 | String unique_name 45 | const object_type 46 | String label 47 | Array~ModelReference~ models 48 | Array~MetricReference~ metrics 49 | } 50 | class ModelReference{ 51 | String unique_name 52 | } 53 | class MetricReference{ 54 | String unique_name 55 | String folder 56 | } 57 | } 58 | ``` 59 | 60 | # Composite Model Properties 61 | 62 | ## unique_name 63 | 64 | - **Type:** string 65 | - **Required:** Y 66 | 67 | The unique name of the composite model. This must be unique across all repositories and subrepositories. 68 | 69 | ## object_type 70 | 71 | - **Type:** const 72 | - **Required:** Y 73 | 74 | The type of object defined by the file. For composite models, this must be `composite_model`. 75 | 76 | ## label 77 | 78 | - **Type:** string 79 | - **Required:** Y 80 | 81 | The name of the composite model as it appears in the consumption tool. This value does not need to be unique. 82 | 83 | ## description 84 | 85 | - **Type:** string 86 | - **Required:** N 87 | 88 | A description of the composite model. 89 | 90 | ## models 91 | 92 | - **Type:** array 93 | - **Required:** Y 94 | 95 | A list of the models that make up the composite model. These must meet the following criteria: 96 | 97 | - They cannot be other composite models. 98 | - They must all have at least one dimension in common. 99 | 100 | 101 | ## metrics 102 | 103 | - **Type:** array 104 | - **Required:** N 105 | 106 | A list of the calculations to include in the composite model. These must meet the following criteria: 107 | 108 | - They must be of object type `metric_calc`. 109 | - Each calculation’s MDX formula can only contain references to objects that appear in the referenced models. 110 | 111 | The `metrics` property supports the following properties: 112 | 113 | - `unique_name`: String, required. The unique name of the calculation. 114 | - `folder`: String, optional. The name of the folder in which this calculation appears in BI tools. 115 | -------------------------------------------------------------------------------- /sml-reference/connection.md: -------------------------------------------------------------------------------- 1 | # Connection 2 | 3 | Connection files define database connections and schemas for the 4 | repository. These are required to import fact and dimension datasets 5 | into your repository. 6 | 7 | Each connection file should define a single database connection *and* 8 | its schema. If you need to use additional schemas for the same database, 9 | each must be defined in a separate connection file. 
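For example, if the `tutorial_data` database also exposed a second schema, you would add a second connection file alongside the sample below (a sketch; the `tpch` schema name here is hypothetical):

```yaml
unique_name: Connection - TPCH
object_type: connection
label: Connection - TPCH
as_connection: Snowflake
database: tutorial_data
schema: tpch
```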
10 |
11 | Sample `connection` file:
12 |
13 | ```yaml
14 | unique_name: Connection - TPCDS
15 | label: Connection - TPCDS
16 | object_type: connection
17 | as_connection: Snowflake
18 | database: tutorial_data
19 | schema: tpcds
20 | ```
21 |
22 | # Entity Relationships
23 |
24 | ```mermaid
25 | classDiagram
26 |     class Connection{
27 |       String unique_name
28 |       String label
29 |       const object_type
30 |       String as_connection
31 |       String database
32 |       String schema
33 |     }
34 | ```
35 |
36 | # Connection Properties
37 |
38 | ## unique_name
39 |
40 | - **Type:** string
41 | - **Required:** Y
42 |
43 | A unique name for the database and the schema. This must be unique
44 | across all repositories and subrepositories.
45 |
46 | ## object_type
47 |
48 | - **Type:** const
49 | - **Required:** Y
50 |
51 | The type of object defined by this file. For connections, this value
52 | must be `connection`.
53 |
54 | ## label
55 |
56 | - **Type:** string
57 | - **Required:** Y
58 |
59 | The name of the database connection as it appears in the consumption tool. This value
60 | does not need to be unique.
61 |
62 | ## as_connection
63 |
64 | - **Type:** string
65 | - **Required:** Y
66 |
67 | The name of the database connection itself, excluding the schema.
68 |
69 | ## database
70 |
71 | - **Type:** string
72 | - **Required:** Y
73 |
74 | The source database used for this connection.
75 |
76 | ## schema
77 |
78 | - **Type:** string
79 | - **Required:** Y
80 |
81 | The source schema used for this connection.
82 |
--------------------------------------------------------------------------------
/sml-reference/dataset.md:
--------------------------------------------------------------------------------
1 | # Dataset
2 |
3 | Dataset files define datasets to use in the repository. Each dataset
4 | file in your repository must correspond to either a physical table/view
5 | in your database, or the results of a `SELECT` statement.
6 |
7 | **Note:** Dataset files must define *all* columns in the physical tables
8 | they reference, and can therefore be quite large. Because of this,
9 | we recommend sharing these files across repositories.
10 | 11 | Sample `dataset` file: 12 | 13 | ```yaml 14 | unique_name: store_sales 15 | object_type: dataset 16 | label: store_sales 17 | connection_id: Connection - TPCDS 18 | table: store_sales 19 | 20 | columns: 21 | - name: Net Profit Tier 22 | data_type: string 23 | sql: "CASE WHEN \"ss_net_profit\" > 25000 THEN 'More than 25000' 24 | WHEN \"ss_net_profit\" BETWEEN 3000 AND 25000 THEN '3000-25000' 25 | WHEN \"ss_net_profit\" BETWEEN 2000 AND 3000 THEN '2000-3000' 26 | WHEN \"ss_net_profit\" BETWEEN 300 AND 2000 THEN '300-2000' 27 | WHEN \"ss_net_profit\" BETWEEN 250 AND 300 THEN '250-300' 28 | WHEN \"ss_net_profit\" BETWEEN 200 AND 250 THEN '200-250' 29 | WHEN \"ss_net_profit\" BETWEEN 150 AND 200 THEN '150-200' 30 | WHEN \"ss_net_profit\" BETWEEN 100 AND 150 THEN '100-150' 31 | WHEN \"ss_net_profit\" BETWEEN 50 AND 100 THEN ' 50-100' 32 | WHEN \"ss_net_profit\" BETWEEN 0 AND 50 THEN ' 0- 50' 33 | ELSE ' 50 or Less' 34 | END" 35 | dialects: 36 | - dialect: DatabricksSQL 37 | sql: "CASE WHEN ss_net_profit > 25000 THEN 'More than 25000' 38 | WHEN ss_net_profit BETWEEN 3000 AND 25000 THEN '3000-25000' 39 | WHEN ss_net_profit BETWEEN 2000 AND 3000 THEN '2000-3000' 40 | WHEN ss_net_profit BETWEEN 300 AND 2000 THEN '300-2000' 41 | WHEN ss_net_profit BETWEEN 250 AND 300 THEN '250-300' 42 | WHEN ss_net_profit BETWEEN 200 AND 250 THEN '200-250' 43 | WHEN ss_net_profit BETWEEN 150 AND 200 THEN '150-200' 44 | WHEN ss_net_profit BETWEEN 100 AND 150 THEN '100-150' 45 | WHEN ss_net_profit BETWEEN 50 AND 100 THEN ' 50-100' 46 | WHEN ss_net_profit BETWEEN 0 AND 50 THEN ' 0- 50' 47 | ELSE ' 50 or Less' 48 | END" 49 | - dialect: BigQuery 50 | sql: "CASE WHEN ss_net_profit > 25000 THEN 'More than 25000' 51 | WHEN ss_net_profit BETWEEN 3000 AND 25000 THEN '3000-25000' 52 | WHEN ss_net_profit BETWEEN 2000 AND 3000 THEN '2000-3000' 53 | WHEN ss_net_profit BETWEEN 300 AND 2000 THEN '300-2000' 54 | WHEN ss_net_profit BETWEEN 250 AND 300 THEN '250-300' 55 | WHEN ss_net_profit BETWEEN 200 AND 250 THEN '200-250' 56 | WHEN ss_net_profit BETWEEN 150 AND 200 THEN '150-200' 57 | WHEN ss_net_profit BETWEEN 100 AND 150 THEN '100-150' 58 | WHEN ss_net_profit BETWEEN 50 AND 100 THEN ' 50-100' 59 | WHEN ss_net_profit BETWEEN 0 AND 50 THEN ' 0- 50' 60 | ELSE ' 50 or Less' 61 | END" 62 | - name: Purchased Amount in Store 63 | data_type: "decimal(16,8)" 64 | sql: "((\"ss_ext_list_price\"-\"ss_ext_wholesale_cost\"-\"ss_ext_discount_amt\")+\"ss_ext_sales_price\")/2" 65 | dialects: 66 | - dialect: DatabricksSQL 67 | sql: "((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2" 68 | - dialect: BigQuery 69 | sql: "((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2" 70 | - name: ss row counter 71 | data_type: int 72 | sql: "1" 73 | - name: ss_addr_sk 74 | data_type: long 75 | - name: ss_cdemo_sk 76 | data_type: long 77 | - name: ss_coupon_amt 78 | data_type: "decimal(7,2)" 79 | - name: ss_customer_sk 80 | data_type: long 81 | - name: ss_ext_discount_amt 82 | data_type: "decimal(7,2)" 83 | - name: ss_ext_list_price 84 | data_type: "decimal(7,2)" 85 | - name: ss_ext_sales_price 86 | data_type: "decimal(7,2)" 87 | - name: ss_ext_tax 88 | data_type: "decimal(7,2)" 89 | - name: ss_ext_wholesale_cost 90 | data_type: "decimal(7,2)" 91 | - name: ss_hdemo_sk 92 | data_type: long 93 | - name: ss_item_sk 94 | data_type: long 95 | - name: ss_list_price 96 | data_type: "decimal(7,2)" 97 | - name: ss_net_paid 98 | data_type: "decimal(7,2)" 99 | - name: ss_net_paid_inc_tax 100 | 
data_type: "decimal(7,2)" 101 | - name: ss_net_profit 102 | data_type: "decimal(7,2)" 103 | - name: ss_promo_sk 104 | data_type: long 105 | - name: ss_quantity 106 | data_type: long 107 | - name: ss_sales_price 108 | data_type: "decimal(7,2)" 109 | - name: ss_sold_date_sk 110 | data_type: long 111 | - name: ss_sold_time_sk 112 | data_type: long 113 | - name: ss_store_sk 114 | data_type: long 115 | - name: ss_ticket_number 116 | data_type: long 117 | - name: ss_wholesale_cost 118 | data_type: "decimal(7,2)" 119 | - name: sales price tier 120 | data_type: string 121 | sql: "CASE WHEN \"ss_sales_price\" > 200 THEN '200 and More' 122 | WHEN \"ss_sales_price\" BETWEEN 150 AND 200 THEN '150-200' 123 | WHEN \"ss_sales_price\" BETWEEN 100 AND 150 THEN '100-150' 124 | WHEN \"ss_sales_price\" BETWEEN 50 AND 100 THEN ' 50-100' 125 | ELSE ' 50 and Less' END" 126 | dialects: 127 | - dialect: DatabricksSQL 128 | sql: "CASE WHEN ss_sales_price > 200 THEN '200 and More' 129 | WHEN ss_sales_price BETWEEN 150 AND 200 THEN '150-200' 130 | WHEN ss_sales_price BETWEEN 100 AND 150 THEN '100-150' 131 | WHEN ss_sales_price BETWEEN 50 AND 100 THEN ' 50-100' 132 | ELSE ' 50 and Less' END" 133 | - dialect: BigQuery 134 | sql: "CASE WHEN ss_sales_price > 200 THEN '200 and More' 135 | WHEN ss_sales_price BETWEEN 150 AND 200 THEN '150-200' 136 | WHEN ss_sales_price BETWEEN 100 AND 150 THEN '100-150' 137 | WHEN ss_sales_price BETWEEN 50 AND 100 THEN ' 50-100' 138 | ELSE ' 50 and Less' END" 139 | ``` 140 | 141 | # Entity Relationships 142 | 143 | ```mermaid 144 | classDiagram 145 | Dataset *-- Column 146 | Dataset *-- Alternate 147 | Dataset *-- Incremental 148 | Incremental ..> Column 149 | Column *-- MapColumn 150 | Dataset *-- Dialect 151 | Column *-- Dialect 152 | namespace Datasets{ 153 | class Dataset{ 154 | String unique_name 155 | const object_type 156 | String label 157 | String description 158 | String connection_id 159 | String sql 160 | String table 161 | Array~Column~ columns 162 | Array~Dialect~ dialects 163 | Boolean immutable 164 | Alternate alternate 165 | Incremental incremental 166 | } 167 | class Column{ 168 | String name 169 | const data_type 170 | String sql 171 | Array~Dialect~ dialects 172 | String parent_column 173 | MapColumn map 174 | } 175 | class Dialect{ 176 | String dialect 177 | String sql 178 | } 179 | class Alternate{ 180 | String type 181 | String connection_id 182 | String table 183 | String sql 184 | } 185 | class Incremental{ 186 | String column 187 | String grace_period 188 | } 189 | class MapColumn{ 190 | String field_terminator 191 | String key_terminator 192 | String key_type 193 | String value_type 194 | Boolean is_prefixed 195 | } 196 | } 197 | ``` 198 | 199 | # Dataset Properties 200 | 201 | ## unique_name 202 | 203 | - **Type:** string 204 | - **Required:** Y 205 | 206 | The unique name of the dataset. This must be unique across all 207 | repositories and subrepositories. 208 | 209 | ## object_type 210 | 211 | - **Type:** const 212 | - **Required:** Y 213 | 214 | The type of object defined by the file. For datasets, the value of this 215 | property must be `dataset`. 216 | 217 | ## label 218 | 219 | - **Type:** string 220 | - **Required:** Y 221 | 222 | The name of the dataset, as it appears in the consumption tool. This value does not 223 | need to be unique. 224 | 225 | ## connection_id 226 | 227 | - **Type:** string 228 | - **Required:** Y 229 | 230 | The `unique_name` of the connection object that defines the database and 231 | schema in which the dataset is stored. 
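The link is by name: the dataset's `connection_id` repeats the connection's `unique_name`. Using the samples in this reference:

```yaml
# in the connection file (see connection.md)
unique_name: Connection - TPCDS

# in this dataset file
connection_id: Connection - TPCDS
```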
232 | 233 | ## sql 234 | 235 | - **Type:** string 236 | - **Required:** Required if `table` is not defined 237 | 238 | A SQL query used to pull data from a specific connection defined within 239 | the repository, similar to a database view. This determines whether the 240 | dataset file defines a query dataset. 241 | 242 | If you want to run the query on other types of databases, use 243 | the `dialects` property to define additional dialects for it to run in. 244 | 245 | ### dialects 246 | 247 | - **Type:** array 248 | - **Required:** N 249 | 250 | Defines alternate dialects for the `sql` statement so that it can run on 251 | other types of databases. You can define as many alternate dialects as 252 | needed. 253 | 254 | Supported properties: 255 | 256 | - `dialect`: String, required. An alternate SQL dialect. Supported 257 | values: 258 | - `Snowflake` 259 | - `Postgresql` 260 | - `DatabricksSQL` 261 | - `BigQuery` 262 | - `Iris` 263 | 264 | - `sql`: String, required. The alternate SQL statement. 265 | 266 | ## table 267 | 268 | - **Type:** string 269 | - **Required:** Required if `sql` is not defined 270 | 271 | The name of the table in the database that the dataset is based on. 272 | 273 | ## columns 274 | 275 | - **Type:** array 276 | - **Required:** Y 277 | 278 | Defines the columns available in the dataset. 279 | 280 | **Note:** You should define *all* columns available in the dataset. This 281 | is especially important for dataset files that are shared across 282 | multiple repositories. 283 | 284 | The `columns` property within a dataset file supports the following 285 | properties. 286 | 287 | ### name 288 | 289 | - **Type:** string 290 | - **Required:** Y 291 | 292 | The name of the column. 293 | 294 | ### data_type 295 | 296 | - **Type:** string 297 | - **Required:** Required unless this column is a `map` 298 | 299 | The data type of the values within the column. 300 | 301 | Supported values: 302 | 303 | - `string` 304 | - `int` 305 | - `long` 306 | - `bigint` 307 | - `tinyint` 308 | - `float` 309 | - `double` 310 | - `decimal` 311 | - `decimal(x,y)` 312 | - `number` 313 | - `number(x,y)` 314 | - `numeric(x,y)` 315 | - `boolean` 316 | - `date` 317 | - `datetime` 318 | - `timestamp` 319 | 320 | ### sql 321 | 322 | - **Type:** string 323 | - **Required:** N 324 | 325 | Defines the column as a calculated column. 326 | 327 | Calculated columns enable you to add simple data transformations to the 328 | dataset. These can be used as the basis of model attributes, just like 329 | any other dataset column. 330 | 331 | The value of this property should be a valid SQL statement that can be 332 | run as part of the `SELECT` list of a query. 333 | 334 | The SQL statement is passed directly to the underlying database when the 335 | query runs, so it must be in a syntax that is supported by your chosen 336 | engine. If you want to run the query on other types of databases, use 337 | the `dialects` property to define additional dialects for it to run in. 338 | 339 | ### dialects 340 | 341 | - **Type:** array 342 | - **Required:** N 343 | 344 | Defines alternate dialects for the `sql` statement so that it can run on 345 | other types of databases. You can define as many alternate dialects as 346 | needed. 347 | 348 | Supported properties: 349 | 350 | - `dialect`: String, required. An alternate SQL dialect. Supported 351 | values: 352 | - `Snowflake` 353 | - `Postgresql` 354 | - `DatabricksSQL` 355 | - `BigQuery` 356 | - `Iris` 357 | 358 | - `sql`: String, required. The alternate SQL statement. 
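For reference, a minimal sketch of a query dataset — one defined by the dataset-level `sql` property instead of `table`. The query itself is hypothetical; the column names come from the `store_sales` sample above:

```yaml
unique_name: store_sales_summary
object_type: dataset
label: store_sales_summary
connection_id: Connection - TPCDS
sql: "SELECT ss_item_sk, ss_quantity, ss_net_profit FROM store_sales"
columns:
  - name: ss_item_sk
    data_type: long
  - name: ss_quantity
    data_type: long
  - name: ss_net_profit
    data_type: "decimal(7,2)"
```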
359 | 
360 | ### map
361 | 
362 | - **Type:** object
363 | - **Required:** N
364 | 
365 | Defines a map to use to create a calculated column.
366 | 
367 | Supported properties:
368 | 
369 | - `field_terminator`: String, required. The delimiter used to separate
370 |   the key:value pairs. This must be in quotes (").
371 | - `key_terminator`: String, required. The delimiter used to separate the
372 |   individual keys from their values. This must be in quotes (").
373 | - `key_type`: String, required. The data type of the map's keys.
374 | - `value_type`: String, required. The data type of the map's values.
375 | - `is_prefixed`: Boolean, optional. Indicates whether the first character is a delimiter.
376 | 
377 | The mapped columns are defined as separate columns within the dataset
378 | file. Each of these must have the `parent_column` property.
379 | 
380 | ### parent_column
381 | 
382 | - **Type:** string
383 | - **Required:** Required for mapped columns
384 | 
385 | For mapped columns only. Specifies the `map` column used to create this
386 | column.
387 | 
388 | ## description
389 | 
390 | - **Type:** string
391 | - **Required:** N
392 | 
393 | A description of the dataset.
394 | 
395 | ## incremental
396 | 
397 | - **Type:** object
398 | - **Required:** N
399 | 
400 | Enables incremental builds for the dataset. When the engine performs an
401 | incremental rebuild of an aggregate table, it appends new rows and
402 | updates existing rows that fall within a specified period of time
403 | (called the grace period).
404 | 
405 | The `incremental` property supports the following properties.
406 | 
407 | ### column
408 | 
409 | - **Type:** string
410 | - **Required:** Y
411 | 
412 | The name of the dataset column to use as an incremental indicator. This
413 | column must have values that increase monotonically, such as a numeric
414 | UNIX timestamp showing seconds since epoch, or a Timestamp/DateTime. The
415 | values in this column enable the query engine both to append rows to
416 | an aggregate table and update rows during an incremental rebuild.
417 | 
418 | The values of this column should be of one of the following data types:
419 | Long, Integer, Decimal (38,0) (Snowflake only), Timestamp, DateTime.
420 | 
421 | If you do not have a column that meets these criteria, you may need to
422 | create a calculated column to transform string or datetime values into
423 | the right data type.
424 | 
425 | ### grace_period
426 | 
427 | - **Type:** string
428 | - **Required:** Y
429 | 
430 | When the semantic engine starts an incremental build, the `grace_period`
431 | determines how far back in time the engine looks for updates to rows in
432 | the dataset; for example, one week or 15 days.
433 | 
434 | The value of this property should be an integer, followed by the time
435 | unit. The time unit can be any of the following: `s` (second), `m`
436 | (minute), `h` (hour), `d` (day), `w` (week).
437 | 
438 | For example, setting the value to `'100s'` sets the grace period to 100
439 | seconds. Setting it to `'1w'` sets the grace period to one week.
440 | 
441 | ## immutable
442 | 
443 | - **Type:** boolean
444 | - **Required:** N
445 | 
446 | Indicates whether the dataset rarely changes. The semantic engine
447 | uses this information when running incremental builds of aggregates that
448 | use joins on dimensions that do not change often.
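As a rough illustration of the `map`, `parent_column`, and `incremental` properties described above, the fragment below sketches a map column with one mapped child column, plus an incremental configuration. All column names are hypothetical:

```yaml
# Hypothetical fragment; column names are illustrative only.
columns:
  # Map column holding key:value pairs such as "color:red,size:XL";
  # map columns do not take a data_type
  - name: attributes_map
    map:
      field_terminator: ","
      key_terminator: ":"
      key_type: string
      value_type: string
  # Mapped column derived from the map column above
  - name: color
    data_type: string
    parent_column: attributes_map

# Incremental builds: look one week back for updated rows
incremental:
  column: updated_at
  grace_period: '1w'
```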
449 | -------------------------------------------------------------------------------- /sml-reference/dimension.md: -------------------------------------------------------------------------------- 1 | # Dimension 2 | 3 | Dimension files define the dimensions used in the model. A *dimension* 4 | is a logical collection of attributes that are bound to specific columns 5 | in a source dataset. These attributes are in turn used to group and 6 | filter metric data at query time. 7 | 8 | SML supports the following types of dimensions: 9 | 10 | - **Normal:** Dimensions that are based on a dataset. All data for a 11 | normal dimension is normalized into a single table or view. There are 12 | two types of normal dimensions: 13 | - **Standard:** Can have any type of hierarchy. 14 | - **Time:** Must have a time hierarchy. 15 | 16 | - **Degenerate:** A dimension that is based on one or more columns in a 17 | fact dataset. 18 | 19 | - **Shared degenerate:** A dimension that is based on one or more 20 | columns that are common to two or more fact datasets. 21 | 22 | - **Snowflake:** A logical dimension that is composed of multiple 23 | underlying physical datasets. 24 | 25 | - **Many-to-many:** Also called multi-valued. This is when a fact 26 | dataset row refers to more than one row in a dimension dataset. In 27 | SML, this is modeled by defining a dimensional bridge or junction 28 | table to resolve the many-to-many relationship. 29 | 30 | Sample `dimension` file: 31 | 32 | ```yaml 33 | unique_name: Store Dimension 34 | object_type: dimension 35 | label: Store Dimension 36 | type: standard 37 | 38 | hierarchies: 39 | 40 | - unique_name: Store Dimension 41 | label: Store Dimension 42 | folder: Store Attributes 43 | filter_empty: "yes" 44 | 45 | levels: 46 | 47 | - unique_name: d_store_country 48 | 49 | - unique_name: d_store_state 50 | 51 | - unique_name: d_store_county 52 | 53 | - unique_name: d_store_city 54 | 55 | - unique_name: Store Dimension 56 | 57 | secondary_attributes: 58 | 59 | - unique_name: d_s_floor_space 60 | label: Store Floor Space 61 | folder: Store Attributes 62 | dataset: store 63 | name_column: s_floor_space 64 | key_columns: 65 | - s_floor_space 66 | sort_column: s_floor_space 67 | 68 | - unique_name: d_s_manager 69 | label: Store Manager 70 | folder: Store Attributes 71 | dataset: store 72 | name_column: s_manager 73 | key_columns: 74 | - s_manager 75 | sort_column: s_manager 76 | 77 | - unique_name: d_s_number_employees 78 | label: Store Number of Employees 79 | folder: Store Attributes 80 | dataset: store 81 | name_column: s_number_employees 82 | key_columns: 83 | - s_number_employees 84 | sort_column: s_number_employees 85 | 86 | - unique_name: d_store_company_id 87 | label: Store Company ID 88 | folder: Store Attributes 89 | dataset: store 90 | name_column: s_company_id 91 | key_columns: 92 | - s_company_id 93 | sort_column: s_company_id 94 | 95 | - unique_name: d_store_name 96 | label: Store Name 97 | folder: Store Attributes 98 | dataset: store 99 | name_column: s_store_name 100 | key_columns: 101 | - s_store_name 102 | sort_column: s_store_name 103 | 104 | level_attributes: 105 | 106 | - unique_name: d_store_city 107 | label: Store City 108 | dataset: store 109 | name_column: s_city 110 | key_columns: 111 | - s_country 112 | - s_state 113 | - s_city 114 | 115 | - unique_name: d_store_country 116 | label: Store Country 117 | dataset: store 118 | name_column: s_country 119 | key_columns: 120 | - s_country 121 | 122 | - unique_name: d_store_county 123 | label: Store County 
124 |     dataset: store
125 |     name_column: s_county
126 |     key_columns:
127 |       - s_state
128 |       - s_county
129 | 
130 |   - unique_name: d_store_state
131 |     label: Store State
132 |     dataset: store
133 |     name_column: s_state
134 |     key_columns:
135 |       - s_country
136 |       - s_state
137 | 
138 |   - unique_name: Store Dimension
139 |     label: Store Number
140 |     is_unique_key: true
141 |     dataset: store
142 |     name_column: s_store_sk
143 |     key_columns:
144 |       - s_store_sk
145 | ```
146 | 
147 | # Entity Relationships
148 | 
149 | ```mermaid
150 | classDiagram
151 |     Relationship *-- From
152 |     Relationship *-- To
153 |     Dimension *-- Relationship
154 |     Dimension *-- Hierarchy
155 |     Dimension *-- LevelAttribute
156 |     Dimension *-- CalculationGroup
157 |     Hierarchy *-- DefaultMember
158 |     Hierarchy *-- Level
159 |     LevelAttribute *-- CustomEmptyMember
160 |     LevelAttribute *-- SharedDegenerateColumns
161 |     Level *-- SecondaryAttribute
162 |     Level *-- MetricalAttribute
163 |     Level *-- Alias
164 |     Level *-- ParallelPeriods
165 |     CalculationGroup *-- CalculatedMembers
166 |     SecondaryAttribute *-- CustomEmptyMember
167 |     Alias *-- CustomEmptyMember
168 |     MetricalAttribute *-- CustomEmptyMember
169 |     namespace Dimensions{
170 |         class Dimension{
171 |             String unique_name
172 |             String label
173 |             const object_type
174 |             String description
175 |             enum type
176 |             Boolean is_degenerate
177 |             Array~Hierarchy~ hierarchies
178 |             Array~LevelAttribute~ level_attributes
179 |             Array~Relationship~ relationships
180 |             Array~CalculationGroup~ calculation_groups
181 |         }
182 |         class Relationship{
183 |             String unique_name
184 |             Object from
185 |             Object to
186 |             String role_play
187 |             String type
188 |         }
189 |         class From{
190 |             String dataset
191 |             Array~String~ join_columns
192 |             String hierarchy
193 |             String level
194 |         }
195 |         class To{
196 |             String dimension
197 |             String level
198 |             String row_security
199 |         }
200 |         class Hierarchy{
201 |             String unique_name
202 |             String label
203 |             String description
204 |             String folder
205 |             enum filter_empty
206 |             DefaultMember default_member
207 |             Array~Level~ levels
208 |         }
209 |         class DefaultMember{
210 |             String expression
211 |             Boolean apply_only_when_in_query
212 |         }
213 |         class Level{
214 |             String unique_name
215 |             Array~SecondaryAttribute~ secondary_attributes
216 |             Array~Alias~ aliases
217 |             Array~MetricalAttribute~ metrics
218 |             Array~ParallelPeriods~ parallel_periods
219 |             Boolean is_hidden
220 |         }
221 |         class Alias{
222 |             String unique_name
223 |             String label
224 |             String description
225 |             String dataset
226 |             String name_column
227 |             String sort_column
228 |             String folder
229 |             Boolean is_hidden
230 |             String format
231 |             Boolean exclude_from_dim_agg
232 |             Boolean is_aggregatable
233 |             Boolean exclude_from_fact_agg
234 |             Array~CustomEmptyMember~ custom_empty_member
235 |         }
236 |         class MetricalAttribute{
237 |             String unique_name
238 |             String label
239 |             String description
240 |             String folder
241 |             String format
242 |             String calculation_method
243 |             String dataset
244 |             String column
245 |             Boolean is_hidden
246 |             Boolean exclude_from_dim_agg
247 |             Boolean is_aggregatable
248 |             Boolean exclude_from_fact_agg
249 |             CustomEmptyMember custom_empty_member
250 |             enum unrelated_dimensions_handling
251 |             Array~String~ allowed_calcs_for_dma
252 |         }
253 |         class ParallelPeriods{
254 |             String level
255 |             Array~String~ key_columns
256 |         }
257 |         class LevelAttribute{
258 |             String unique_name
259 |             String label
260 |             String description
261 |             String dataset
262 |             String name_column
263 |             String sort_column
264 |             Array~String~ key_columns
265 |             Boolean contains_unique_names
266 |             Boolean is_hidden
267 |             Boolean is_unique_key
268 |             Boolean exclude_from_dim_agg
269 |             Boolean is_aggregatable
270 |             Boolean exclude_from_fact_agg
271 |             String time_unit
272 |             Array~String~ allowed_calcs_for_dma
273 |             CustomEmptyMember custom_empty_member
274 |             String folder
275 |             Array~SharedDegenerateColumns~ shared_degenerate_columns
276 |         }
277 |         class SharedDegenerateColumns {
278 |             String dataset
279 |             String name_column
280 |             String sort_column
281 |             Array~String~ key_columns
282 |             Boolean is_unique_key
283 |         }
284 |         class SecondaryAttribute{
285 |             String unique_name
286 |             String label
287 |             String description
288 |             String folder
289 |             String dataset
290 |             String name_column
291 |             String sort_column
292 |             Array~String~ key_columns
293 |             Boolean exclude_from_dim_agg
294 |             Boolean is_aggregatable
295 |             Boolean exclude_from_fact_agg
296 |             Array~String~ allowed_calcs_for_dma
297 |             CustomEmptyMember custom_empty_member
298 |             Boolean is_hidden
299 |             Boolean contains_unique_names
300 |         }
301 |         class CustomEmptyMember{
302 |             Array~String~ key
303 |             String name
304 |             String sort_name
305 |         }
306 |         class CalculationGroup{
307 |             String unique_name
308 |             String label
309 |             String description
310 |             String folder
311 |             Array~CalculatedMembers~ calculated_members
312 |         }
313 |         class CalculatedMembers{
314 |             String unique_name
315 |             String description
316 |             String format
317 |             String expression
318 |             Boolean use_input_metric_format
319 |             String template
320 |         }
321 |     }
322 | ```
323 | 
324 | # Dimension Properties
325 | 
326 | ## unique_name
327 | 
328 | - **Type:** string
329 | - **Required:** Y
330 | 
331 | The unique name of the dimension. This must be unique across all
332 | repositories and subrepositories.
333 | 
334 | ## object_type
335 | 
336 | - **Type:** const
337 | - **Required:** Y
338 | 
339 | The type of object defined by the file. For dimensions, this value must
340 | be `dimension`.
341 | 
342 | ## label
343 | 
344 | - **Type:** string
345 | - **Required:** Y
346 | 
347 | The name of the dimension, as it appears in the consumption tool. This value does not
348 | need to be unique.
349 | 
350 | ## description
351 | 
352 | - **Type:** string
353 | - **Required:** N
354 | 
355 | A description of the dimension.
356 | 
357 | ## type
358 | 
359 | - **Type:** enum
360 | - **Required:** N
361 | 
362 | The type of dimension defined by this file.
363 | 
364 | Supported values:
365 | 
366 | - `standard`: Can have any type of hierarchy.
367 | - `time`: Must have a time hierarchy.
368 | 
369 | ## is_degenerate
370 | 
371 | - **Type:** boolean
372 | - **Required:** N
373 | 
374 | Determines whether the dimension is degenerate.
375 | 
376 | ## hierarchies
377 | 
378 | - **Type:** array
379 | - **Required:** Y
380 | 
381 | Defines the hierarchies within the dimension.
382 | 
383 | Hierarchies organize the dimension attributes into categories or levels,
384 | where each level is a subdivision of the level above. Every logical
385 | dimension you create has at least one hierarchy with at least one level.
386 | 
387 | ## level_attributes
388 | 
389 | - **Type:** array
390 | - **Required:** Y
391 | 
392 | Level attributes are attributes associated with a particular dimension
393 | hierarchy. Every hierarchy has a key level attribute, which is the most
394 | granular representation of the dimension's data. Only level attributes
395 | can be used to define relationships between datasets and other
396 | dimensions.
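As a minimal sketch of how `hierarchies` and `level_attributes` fit together, each level in a hierarchy references a level attribute by its `unique_name`. The dataset and column names below are hypothetical:

```yaml
# Hypothetical two-level dimension fragment; names are illustrative only.
hierarchies:
  - unique_name: Region Hierarchy
    label: Region Hierarchy
    levels:
      - unique_name: d_country
      - unique_name: d_city

level_attributes:
  - unique_name: d_country
    label: Country
    dataset: region
    name_column: country
    key_columns:
      - country
  # Key level attribute: the most granular representation of the data
  - unique_name: d_city
    label: City
    is_unique_key: true
    dataset: region
    name_column: city_name
    key_columns:
      - city_id
```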
397 | 398 | ## relationships 399 | 400 | - **Type:** array 401 | - **Required:** N 402 | 403 | The `relationships` property in a dimension file defines the 404 | relationships to embedded and snowflake dimensions. 405 | 406 | **Note:** The relationships between the model's fact datasets and first 407 | order dimensions (fact relationships) are defined in model files. 408 | 409 | ## calculation_groups 410 | 411 | - **Type:** array 412 | - **Required:** N 413 | 414 | The `calculation_groups` property in a dimension file defines 415 | calculation groups to use in the dimension. 416 | 417 | Dimension calculation groups offer a simplifying alternative to 418 | calculated metrics by enabling the expression of boiler-plate 419 | calculations across multiple metrics. This feature defines calculations 420 | as dimension members and removes static references to individual 421 | measures. 422 | 423 | # Relationship Properties 424 | 425 | ## unique_name 426 | 427 | - **Type:** string 428 | - **Required:** N 429 | 430 | The unique name of the relationship. This must be unique within the 431 | dimension. 432 | 433 | ## from 434 | 435 | - **Type:** object 436 | - **Required:** Y 437 | 438 | Defines the side of the relationship that contains the physical dataset 439 | that you want to connect to another dimension. 440 | 441 | Supported properties: 442 | 443 | - `dataset`: String, required. The physical dataset you want to link to 444 | a dimension. 445 | - `join_columns`: Array, required. The column(s) within the `dataset` 446 | that you want to use for the join. 447 | - `hierarchy`: String, optional. The hierarchy within the dimension from 448 | which the relationship should originate. 449 | - `level`: String, optional. The level within the `hierarchy` from which 450 | the relationship should originate. 451 | 452 | For snowflake relationships (as defined by the `type` property), you 453 | only need to define `dataset` and `join_columns`. 454 | 455 | ## to 456 | 457 | - **Type:** object 458 | - **Required:** Y 459 | 460 | Defines the dimension that the `from` dataset is linked to. 461 | 462 | Supported properties: 463 | 464 | - `dimension`: String. The name of the dimension the `from` dataset is 465 | linked to. 466 | - `level`: String, required if `row_security` is undefined. The key 467 | level within the dimension to use for the relationship. 468 | - `row_security`: String, required if `level` is undefined. For security 469 | relationships, the row 470 | security object that the `from` dataset is linked to. 471 | 472 | For snowflake relationships (as defined by the `type` property), you 473 | only need to define `level`. 474 | 475 | ## type 476 | 477 | - **Type:** string 478 | - **Required:** Y 479 | 480 | Defines the relationship as either embedded or snowflake. 481 | 482 | Supported values: 483 | 484 | - `embedded`: A secondary relationship, or one that connects a primary 485 | dimension to a secondary dimension. 486 | - `snowflake`: A relationship that connects one of several underlying 487 | physical datasets together to create a snowflake dimension. 488 | 489 | ## role_play 490 | 491 | - **Type:** string 492 | - **Required:** N 493 | 494 | For role-playing relationships only. Defines the role-playing template 495 | for the relationship. 496 | 497 | The role-playing template is the prefix or suffix that is added to every 498 | attribute in the role-played dimension. You can also specify both a 499 | prefix and a suffix. 
500 | 
501 | This value must be in one of the following formats (including quotation
502 | marks):
503 | 
504 | - **Prefix:** `"prefix {0}"`
505 | - **Suffix:** `"{0} suffix"`
506 | - **Prefix and suffix:** `"prefix {0} suffix"`
507 | 
508 | For example, if you wanted to use the prefix **Order**, you would set
509 | `role_play` to `"Order {0}"`.
510 | 
511 | # Calculation Group Properties
512 | 
513 | ## unique_name
514 | 
515 | - **Type:** string
516 | - **Required:** Y
517 | 
518 | The name of the calculation group. This must be unique within the
519 | dimension.
520 | 
521 | ## label
522 | 
523 | - **Type:** string
524 | - **Required:** Y
525 | 
526 | The name of the calculation group, as it appears in the BI consumer. This value does not need to be unique.
527 | 
528 | ## calculated_members
529 | 
530 | - **Type:** array
531 | - **Required:** Y
532 | 
533 | Defines the individual calculated members in the group.
534 | 
535 | **Note:** The first member in the list is considered the default member of the group.
536 | 
537 | ## description
538 | 
539 | - **Type:** string
540 | - **Required:** N
541 | 
542 | A description of the calculation group.
543 | 
544 | ## folder
545 | 
546 | - **Type:** string
547 | - **Required:** N
548 | 
549 | The name of the folder in which the calculation group is displayed in BI tools.
550 | 
551 | # Calculated Members Properties
552 | 
553 | ## unique_name
554 | 
555 | - **Type:** string
556 | - **Required:** Y
557 | 
558 | The name of the calculated member. This must be unique within the dimension.
559 | 
560 | ## description
561 | 
562 | - **Type:** string
563 | - **Required:** N
564 | 
565 | A description of the calculated member.
566 | 
567 | ## template
568 | 
569 | - **Type:** string
570 | - **Required:** Required if `expression` is not specified.
571 | 
572 | Sets the calculation to one of SML's built-in MDX expression templates.
573 | 
574 | Supported templates:
575 | 
576 | `Current`, `Previous`, `Current vs Previous`, `Current vs Previous Pct`, `Next`, `Current vs Next`, `Current vs Next Pct`, `Pct of Total`, `Pct of Parent`, `Last Year`, `Current vs Last Year`, `Current vs Last Year Pct`, `Year to Date`, `Quarter to Date`, `Month to Date`, `Month Moving Average 3 Month`, `Moving Average 30 Period`, `Moving Average 5 Period`, `Moving Std Dev 30 Period`, `Moving Std Dev 5 Period`
577 | 
578 | If you do not want to use a built-in template, you can define a custom expression using the `expression` property (see below).
579 | 
580 | ## expression
581 | 
582 | - **Type:** string
583 | - **Required:** Required if `template` is not specified.
584 | 
585 | The MDX expression for the calculated member. This value should be quoted.
586 | 
587 | **Note:** If you plan on referencing a calculation via the `Aggregate` MDX function in your calculated member, ensure that the computed metric has an aggregation function set. You can do this by including the `mdx_aggregation_function` property in the calculation file. If you do not set an aggregation function, you may encounter errors at query time.
588 | 
589 | You can alternatively use the `template` property (see above) to use one of SML's built-in MDX expression templates instead of a custom one.
590 | 
591 | ## format
592 | 
593 | - **Type:** String
594 | - **Required:** N
595 | 
596 | The format in which query results are returned. You can use one of SML's built-in named formats or a custom format string.
597 | 
598 | Supported named formats:
599 | 
600 | `fixed`, `general number`, `none`, `percent`, `scientific`, `standard`
601 | 
602 | Custom format strings should be in quotes and contain one to four sections, separated by semicolons.
603 | 
604 | You can alternatively configure the calculated member to return results in the format defined for the input metric by including the `use_input_metric_format` property (see below).
605 | 
606 | ## use_input_metric_format
607 | 
608 | - **Type:** boolean
609 | - **Required:** N
610 | 
611 | When `true`, query results always use the formatting defined for the input metric. This is useful for calculations that can't have a standard output format.
612 | 
613 | When `false`, the results are formatted according to the `format` property.
614 | 
615 | # Hierarchy Properties
616 | 
617 | ## unique_name
618 | 
619 | - **Type:** string
620 | - **Required:** Y
621 | 
622 | The unique name of the hierarchy. This must be unique within the dimension.
623 | 
624 | ## label
625 | 
626 | - **Type:** string
627 | - **Required:** Y
628 | 
629 | The name of the hierarchy. This value does not need to be unique.
630 | 
631 | ## description
632 | 
633 | - **Type:** string
634 | - **Required:** N
635 | 
636 | A description of the hierarchy.
637 | 
638 | ## folder
639 | 
640 | - **Type:** string
641 | - **Required:** N
642 | 
643 | The name of the folder in which to display this hierarchy in BI tools. If your model has a lot of dimensional hierarchies, folders are a good way to organize them.
644 | 
645 | ## filter_empty
646 | 
647 | - **Type:** string
648 | - **Required:** N
649 | 
650 | Configures the join behavior for the hierarchy, which determines how empty values are handled in client BI tools. The value you specify must be in quotes.
651 | 
652 | Supported values:
653 | 
654 | - `yes`: Query results in BI tools only include members that join to
655 |   the fact dataset (inner join behavior). Members with no matching
656 |   entries in the fact dataset are still included if the client BI
657 |   tool requests them.
658 | 
659 | - `no`: Query results include all members of the dimension, even
660 |   those that have no matching entries in the fact dataset (outer
661 |   join behavior). This occurs unless the client BI tool specifically
662 |   requests to have these values filtered out.
663 | 
664 | - `always`: Query results only include members that join to the fact
665 |   dataset (inner join behavior). This typically provides the best
666 |   performance.
667 | 
668 | ## default_member
669 | 
670 | - **Type:** object
671 | - **Required:** N
672 | 
673 | Defines a default member for the hierarchy. Default members of dimensional hierarchies serve together as a default filter for MDX queries on the model.
674 | 
675 | When adding default members, be aware of the following:
676 | 
677 | - If a query specifies a level in a hierarchy that has a default member, the default is not used.
678 | - Default hierarchy members are *not* used in queries that populate select fields and filter dialogs in BI clients.
679 | - You cannot specify secondary attributes as default dimension members. Doing so will cause queries to fail.
680 | - Setting default members on dimensions with multiple hierarchies can produce unexpected results, as it is easy to forget about the default member filtering on another hierarchy.
681 | 
682 | `default_member` supports the following properties:
683 | 
684 | - `expression`: String, required. An MDX expression that specifies the default member. This value must be quoted.
685 | - `apply_only_when_in_query`: Boolean, optional. When `true`, the default hierarchical member is only applied when it is explicitly included in a query. This enables you to selectively apply default constraints for meta dimensions (calculation groups or similar dimensions) that represent calculations or parameters, rather than data. 686 | 687 | ## levels 688 | 689 | - **Type:** array 690 | - **Required:** Y 691 | 692 | Defines the levels within the hierarchy. You can include as many levels as needed in the list. 693 | 694 | # Level Properties 695 | 696 | ## unique_name 697 | 698 | - **Type:** string 699 | - **Required:** Y 700 | 701 | Specifies the unique name of the level. This must be unique within the dimension. 702 | 703 | ## secondary_attributes 704 | 705 | - **Type:** array 706 | - **Required:** N 707 | 708 | Defines secondary attributes for the dimension level. Secondary attributes are dimensional attributes that are not the dimension's key, and are not part of a hierarchy. 709 | 710 | SML supports the following types of secondary attributes: 711 | 712 | - **Dimensional:** Provides an independent "dimensional" attribute for 713 | grouping metric data. This is the default type of secondary attribute. 714 | - **Level alias:** Enables the creation of tabular reports that select 715 | hierarchical expressions without forcing the user to drill down a 716 | hierarchy. 717 | 718 | **Note:** Secondary attributes cannot be used to create relationships 719 | between datasets and dimensions. 720 | 721 | ## aliases 722 | 723 | - **Type:** array 724 | - **Required:** N 725 | 726 | Defines secondary attributes that can be used as aliases for specific hierarchy levels within BI tools. 727 | 728 | ## metrics 729 | 730 | - **Type:** array 731 | - **Required:** N 732 | 733 | Defines metrics for the level. 734 | 735 | ## parallel_periods 736 | 737 | - **Type:** array 738 | - **Required:** N 739 | 740 | For levels in time dimensions only. Defines a custom parallel period for the level. You can use custom parallel periods to compare members in different levels of a time hierarchy that aren't in the same relative child position; for example, the last week of a 53-week year to that of a 52-week year. 741 | 742 | You can define as many parallel periods for a level as needed. 743 | 744 | ## is_hidden 745 | 746 | - **Type:** boolean 747 | - **Required:** N 748 | 749 | Determines whether the level is visible in BI tools. 750 | 751 | # Secondary Attributes Properties 752 | 753 | ## unique_name 754 | 755 | - **Type:** string 756 | - **Required:** Y 757 | 758 | The unique name of the secondary attribute. This must be unique within the dimension. 759 | 760 | ## label 761 | 762 | - **Type:** string 763 | - **Required:** Y 764 | 765 | The name of the secondary attribute, as it appears in the BI consumer. This value does not need to be unique. 766 | 767 | ## description 768 | 769 | - **Type:** string 770 | - **Required:** N 771 | 772 | A description of the secondary attribute. 773 | 774 | ## folder 775 | 776 | - **Type:** string 777 | - **Required:** N 778 | 779 | The name of the folder in which the attribute is displayed in BI tools. 780 | 781 | ## is_hidden 782 | 783 | - **Type:** boolean 784 | - **Required:** N 785 | 786 | Determines whether the attribute is visible in BI tools. 
787 | 
788 | Supported values:
789 | 
790 | - `false` (default)
791 | - `true`
792 | 
793 | ## contains\_unique\_names
794 | 
795 | - **Type:** boolean
796 | - **Required:** N
797 | 
798 | Determines whether each member of this attribute has a unique name. Do not enable this functionality if two members have different keys but the same name.
799 | 
800 | Supported values:
801 | 
802 | - `true`
803 | - `false`
804 | 
805 | ## dataset
806 | 
807 | - **Type:** string
808 | - **Required:** Y
809 | 
810 | The dataset that contains the `key_columns` the secondary attribute is based on.
811 | 
812 | ## name_column
813 | 
814 | - **Type:** string
815 | - **Required:** Y
816 | 
817 | The dataset column that the attribute is based on.
818 | 
819 | ## key_columns
820 | 
821 | - **Type:** array
822 | - **Required:** Y
823 | 
824 | A list of the key columns that the attribute is based on. If the attribute has a compound key, you should specify all columns that make up the key as a list.
825 | 
826 | ## sort_column
827 | 
828 | - **Type:** string
829 | - **Required:** N
830 | 
831 | The column used to sort the attribute's values in result sets. (This only applies to MDX queries.)
832 | 
833 | ## allowed\_calcs\_for_dma
834 | 
835 | - **Type:** array
836 | - **Required:** N
837 | 
838 | A list of the calculation types that can be used to create dimensionally modified aggregates for the secondary attribute. Note that when working with a time dimension, you can only define calculation types if the `time_unit` property for the level is set to `day` or longer.
839 | 
840 | ## exclude\_from\_dim_agg
841 | 
842 | - **Type:** boolean
843 | - **Required:** N
844 | 
845 | Excludes this attribute from system generated dimension-only aggregates. This is useful if the attribute contains a large number (millions) of distinct values that you don't want to aggregate.
846 | 
847 | ## is_aggregatable
848 | 
849 | - **Type:** boolean
850 | - **Required:** N
851 | 
852 | Determines whether the attribute's member values can be aggregated. When enabled, an `All` member is created for the attribute, whose value is the aggregation of all of the attribute's member values. The `All` member sits at the top of the attribute's hierarchy, though it is not a part of the attribute itself. It often serves as the attribute's default member.
853 | 
854 | ## exclude\_from\_fact_agg
855 | 
856 | - **Type:** boolean
857 | - **Required:** N
858 | 
859 | Excludes this attribute from system generated fact-based aggregates. This is useful if the attribute contains a large number (millions) of distinct values that you don't want to aggregate.
860 | 
861 | ## custom\_empty\_member
862 | 
863 | - **Type:** object
864 | - **Required:** N
865 | 
866 | Defines a custom empty member for the attribute. This feature allows fact data with missing or invalid foreign key values to be isolated and independently aggregated from those with valid foreign key values. Because fact records with invalid foreign keys are aggregated separately from records referencing valid dimension members, analysts can easily spot data integrity problems and further investigate them. Use this feature to ensure that un-joinable values are included in query results and aggregated under a specially designated dimension member called the Custom Empty Member.
867 | 
868 | # Alias Properties
869 | 
870 | ## unique_name
871 | 
872 | - **Type:** String
873 | - **Required:** Y
874 | 
875 | The unique name of the alias. This must be unique within the dimension.
876 | 
877 | ## label
878 | 
879 | - **Type:** String
880 | - **Required:** Y
881 | 
882 | The name of the alias as it appears in the consumption tool and BI tools. This value does not need to be unique.
883 | 
884 | ## description
885 | 
886 | - **Type:** String
887 | - **Required:** N
888 | 
889 | A description of the alias.
890 | 
891 | ## folder
892 | 
893 | - **Type:** String
894 | - **Required:** N
895 | 
896 | The name of the folder in which the alias appears in BI tools.
897 | 
898 | ## format
899 | 
900 | - **Type:** String
901 | - **Required:** N
902 | 
903 | The format in which query results are returned. You can use one of SML's built-in named formats or a custom format string.
904 | 
905 | Supported named formats:
906 | 
907 | `fixed`, `general number`, `none`, `percent`, `scientific`, `standard`
908 | 
909 | Custom format strings should be in quotes and contain one to four sections, separated by semicolons.
910 | 
911 | ## dataset
912 | 
913 | - **Type:** String
914 | - **Required:** Y
915 | 
916 | The source dataset that contains the column that the alias is based on.
917 | 
918 | ## name_column
919 | 
920 | - **Type:** String
921 | - **Required:** Y
922 | 
923 | The dataset column that the alias is based on.
924 | 
925 | ## sort_column
926 | 
927 | - **Type:** String
928 | - **Required:** Y
929 | 
930 | The column to use to sort the values in result sets. This applies to MDX queries only (queries received through the XMLA interface).
931 | 
932 | ## is_hidden
933 | 
934 | - **Type:** boolean
935 | - **Required:** N
936 | 
937 | Determines whether the alias is visible in BI tools.
938 | 
939 | Supported values:
940 | 
941 | - `true`
942 | - `false`
943 | 
944 | ## exclude\_from\_dim_agg
945 | 
946 | - **Type:** boolean
947 | - **Required:** N
948 | 
949 | Excludes this alias from system generated dimension-only
950 | aggregates. This is useful if the alias contains a large number
951 | (millions) of distinct values that you don't want to aggregate.
952 | 
953 | Supported values:
954 | 
955 | - `true`
956 | - `false`
957 | 
958 | ## is_aggregatable
959 | 
960 | - **Type:** boolean
961 | - **Required:** N
962 | 
963 | Determines whether the alias's member values can be aggregated. When enabled, an `All` member is created for the alias, whose value is the aggregation of all of the alias's member values. The `All` member sits at the top of the alias's hierarchy, though it is not a part of the alias itself. It often serves as the alias's default member.
964 | 
965 | ## exclude\_from\_fact_agg
966 | 
967 | - **Type:** boolean
968 | - **Required:** N
969 | 
970 | Excludes this alias from system generated fact-based
971 | aggregates. This is useful if the alias contains a large number
972 | (millions) of distinct values that you don't want to aggregate.
973 | 
974 | Supported values:
975 | 
976 | - `true`
977 | - `false`
978 | 
979 | ## custom\_empty\_member
980 | 
981 | - **Type:** object
982 | - **Required:** N
983 | 
984 | Defines custom empty member values for the alias. This feature allows fact data with missing or invalid foreign key values to be isolated and independently aggregated from those with valid foreign key values. Because fact records with invalid foreign keys are aggregated separately from records referencing valid dimension members, analysts can easily spot data integrity problems and further investigate them. Use this feature to ensure that un-joinable values are included in query results and aggregated under a specially designated dimension member called the Custom Empty Member.
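Putting the alias properties together, here is a minimal sketch of an alias defined on a hierarchy level. The dataset and column names are hypothetical; note how `sort_column` differs from `name_column`, which is a common reason for defining an alias:

```yaml
# Hypothetical alias on a level; names are illustrative only.
levels:
  - unique_name: Product Name
    aliases:
      - unique_name: Product Name by Launch Date
        label: Product Name by Launch Date
        dataset: dimproduct
        name_column: productname
        sort_column: launchdate
        folder: Product Attributes
        is_hidden: false
```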
985 | 
986 | 
987 | # Metrical Attribute Properties
988 | 
989 | ## unique_name
990 | 
991 | - **Type:** String
992 | - **Required:** Y
993 | 
994 | The unique name of the metrical attribute. This must be unique within the dimension.
995 | 
996 | ## label
997 | 
998 | - **Type:** String
999 | - **Required:** Y
1000 | 
1001 | The name of the metrical attribute as it appears in the consumption tool. This value does not need to be unique.
1002 | 
1003 | ## description
1004 | 
1005 | - **Type:** String
1006 | - **Required:** N
1007 | 
1008 | A description of the metrical attribute.
1009 | 
1010 | ## folder
1011 | 
1012 | - **Type:** String
1013 | - **Required:** N
1014 | 
1015 | The name of the folder in which the metrical attribute appears in BI tools.
1016 | 
1017 | ## format
1018 | 
1019 | - **Type:** String
1020 | - **Required:** N
1021 | 
1022 | The format in which query results are returned. You can use one of SML's built-in named formats or a custom format string.
1023 | 
1024 | Supported named formats:
1025 | 
1026 | `fixed`, `general number`, `none`, `percent`, `scientific`, `standard`
1027 | 
1028 | Custom format strings should be in quotes and contain one to four sections, separated by semicolons.
1029 | 
1030 | ## dataset
1031 | 
1032 | - **Type:** String
1033 | - **Required:** Y
1034 | 
1035 | The source dataset that contains the column that the metrical attribute is based on.
1036 | 
1037 | ## column
1038 | 
1039 | - **Type:** String
1040 | - **Required:** Y
1041 | 
1042 | The dataset column that the metrical attribute is based on.
1043 | 
1044 | ## calculation_method
1045 | - **Type:** string
1046 | - **Required:** Y
1047 | 
1048 | The calculation to apply to the data.
1049 | 
1050 | Supported values:
1051 | 
1052 | `average`, `count distinct`, `count non-null`, `estimated count distinct`, `maximum`, `minimum`,
1053 | `percentile`, `stddev_pop`, `stddev_samp`, `sum`, `var_pop`, `var_samp`
1054 | 
1055 | ## is_hidden
1056 | 
1057 | - **Type:** boolean
1058 | - **Required:** N
1059 | 
1060 | Determines whether the metrical attribute is visible in BI tools.
1061 | 
1062 | Supported values:
1063 | 
1064 | - `true`
1065 | - `false`
1066 | 
1067 | ## exclude\_from\_dim_agg
1068 | 
1069 | - **Type:** boolean
1070 | - **Required:** N
1071 | 
1072 | Excludes this metrical attribute from system generated dimension-only
1073 | aggregates. This is useful if the metrical attribute contains a large number
1074 | (millions) of distinct values that you don't want to aggregate.
1075 | 
1076 | Supported values:
1077 | 
1078 | - `true`
1079 | - `false`
1080 | 
1081 | ## is_aggregatable
1082 | 
1083 | - **Type:** boolean
1084 | - **Required:** N
1085 | 
1086 | Determines whether the metrical attribute's member values can be aggregated. When enabled, an `All` member is created for the attribute, whose value is the aggregation of all of the attribute's member values. The `All` member sits at the top of the attribute's hierarchy, though it is not a part of the attribute itself. It often serves as the metrical attribute's default member.
1087 | 
1088 | ## exclude\_from\_fact_agg
1089 | 
1090 | - **Type:** boolean
1091 | - **Required:** N
1092 | 
1093 | Excludes this metrical attribute from system generated fact-based
1094 | aggregates. This is useful if the metrical attribute contains a large number
1095 | (millions) of distinct values that you don't want to aggregate.
1096 | 1097 | Supported values: 1098 | 1099 | - `true` 1100 | - `false` 1101 | 1102 | ## custom\_empty\_member 1103 | 1104 | - **Type:** object 1105 | - **Required:** N 1106 | 1107 | Defines custom empty member values for the metrical attribute. This feature allows fact data with missing or invalid foreign key values to be isolated and independently aggregated from those with valid foreign key values. Because fact records with invalid foreign keys are aggregated separately from records referencing valid dimension members, analysts can easily spot data integrity problems and further investigate them. Use this feature to ensure that un-joinable values are included in query results and aggregated under a specially designated dimension member called the Custom Empty Member. 1108 | 1109 | ## unrelated\_dimensions\_handling 1110 | 1111 | - **Type:** enum 1112 | - **Required:** N 1113 | 1114 | Determines how the query engine behaves when all of the following conditions are true: 1115 | 1116 | - A client queries a model that contains multiple fact datasets. 1117 | - The data in each fact dataset are at a different level of 1118 | granularity than the data in the other fact datasets. 1119 | - The query references dimensions that are not related to the 1120 | metrics being queried. 1121 | 1122 | Supported values: 1123 | 1124 | - `error`: Query Engine rejects the query and returns an error message. 1125 | - `empty`: Query Engine displays empty cells in the query results. 1126 | - `repeat`: In the query results, Query Engine repeats the values for the secondary metrical attribute at a level of aggregation that is determined from the shared dimensions in the query. 1127 | 1128 | # Parallel Period Properties 1129 | 1130 | ## level 1131 | 1132 | - **Type:** string 1133 | - **Required:** Y 1134 | 1135 | The level to compare the current level to. Both levels must be in the same time hierarchy. 1136 | 1137 | ## key_columns 1138 | 1139 | - **Type:** array 1140 | - **Required:** Y 1141 | 1142 | The key column(s) in the dimension's source table that contain key values pointing to the desired parallel period. 1143 | 1144 | # Level Attributes Properties 1145 | 1146 | ## unique_name 1147 | 1148 | - **Type:** string 1149 | - **Required:** Y 1150 | 1151 | The unique name of the level attribute. This must be unique within the 1152 | dimension. 1153 | 1154 | ## label 1155 | 1156 | - **Type:** string 1157 | - **Required:** Y 1158 | 1159 | The name of the level attribute, as it appears in the consumption tool. This value 1160 | does not need to be unique. 1161 | 1162 | ## dataset 1163 | 1164 | - **Type:** string 1165 | - **Required:** Required if `shared_degenerate_columns` is not specified. 1166 | 1167 | The source dataset that contains the columns that this level attribute 1168 | is based on. 1169 | 1170 | ## name_column 1171 | 1172 | - **Type:** string 1173 | - **Required:** Required if `shared_degenerate_columns` is not specified. 1174 | 1175 | The column whose values appear for this level in BI tools. For example, 1176 | the key may be a product ID number, but you want users to see product 1177 | names instead. 1178 | 1179 | ## key_columns 1180 | 1181 | - **Type:** array 1182 | - **Required:** Required if `shared_degenerate_columns` is not specified. 1183 | 1184 | The dataset column that the level attribute is based on. If the level 1185 | has a compound key, list all columns that make up the key. 1186 | 1187 | If the key consists of one column, the values in that column must be 1188 | unique. 
If the key is a compound key, the columns together must provide
1189 | unique values.
1190 | 
1191 | ## shared_degenerate_columns
1192 | 
1193 | - **Type:** array
1194 | - **Required:** Required if `dataset`, `name_column`, and `key_columns` are not specified.
1195 | 
1196 | Defines the dimension as a shared degenerate dimension (one composed of data from multiple fact datasets).
1197 | 
1198 | Shared degenerate dimensions must meet the following requirements:
1199 | 
1200 | - You must use the same number of columns from each fact dataset.
1201 | - The data types across the columns used must be consistent:
1202 | 
1203 |   - The name columns must all have the same data type.
1204 |   - The key columns must all have the same data type.
1205 | 
1206 | - If you set a sort column on one dataset, you must set one on every other dataset as well.
1207 |   Additionally, all sort columns across all datasets must have the same data type.
1208 | - For shared degenerate dimensions with multiple levels, all levels must be defined using the same datasets as the other levels.
1209 | - Shared degenerate dimensions cannot contain secondary attributes.
1210 | 
1211 | Supported properties:
1212 | 
1213 | - `dataset`: String, required. A fact dataset the shared degenerate dimension is based on.
1214 | - `name_column`: String, required. The column from the dataset whose values appear for the dimension in the consumption tool. For example, you can use a product ID column as the `key_columns`, but use a product name column as the `name_column`. All name columns must have the same data type.
1215 | - `key_columns`: Array, required. The column from the dataset that the shared degenerate dimension is based on. You can only include one `key_columns` value per dataset. All key columns must have the same data type.
1216 | - `sort_column`: String, optional. Defines the column from the dataset that is used to sort query results. If this property is specified for one dataset, it must be specified for all others. Additionally, all sort columns must have the same data type.
1217 | - `is_unique_key`: Boolean, optional. Determines whether values of the `key_columns` column are unique for each row.
1218 | 
1219 | For example:
1220 | 
1221 | ```yaml
1222 | level_attributes:
1223 | 
1224 |   - unique_name: Order Degen Shared Level
1225 |     label: Order Degen Shared Level
1226 | 
1227 |     shared_degenerate_columns:
1228 |       - name_column: salesordernumber
1229 |         key_columns:
1230 |           - salesordernumber
1231 |         sort_column: salesordernumber
1232 |         dataset: factinternetsales
1233 |         is_unique_key: false
1234 |       - name_column: productkey
1235 |         key_columns:
1236 |           - productkey
1237 |         sort_column: productkey
1238 |         dataset: dimproduct
1239 |         is_unique_key: false
1240 | ```
1241 | 
1242 | ## description
1243 | 
1244 | - **Type:** string
1245 | - **Required:** N
1246 | 
1247 | A description of the level attribute.
1248 | 
1249 | ## is_hidden
1250 | 
1251 | - **Type:** boolean
1252 | - **Required:** N
1253 | 
1254 | Determines whether the level attribute is visible in BI tools.
1255 | 
1256 | Supported values:
1257 | 
1258 | - `true`
1259 | - `false`
1260 | 
1261 | ## is\_unique\_key
1262 | 
1263 | - **Type:** boolean
1264 | - **Required:** N
1265 | 
1266 | Determines whether the `key_columns` values are unique for each row.
1267 | 
1268 | Supported values:
1269 | 
1270 | - `true`: The key column values are unique for each row.
The join 1271 | behavior considers the first matching row at query runtime. 1272 | - `false`: The key column values are multi-valued. The join behavior 1273 | considers all matching rows at query runtime. 1274 | 1275 | ## contains\_unique\_names 1276 | 1277 | - **Type:** boolean 1278 | - **Required:** N 1279 | 1280 | Determines whether each member of this level attribute has a unique 1281 | name. Do not enable this functionality if two members have different 1282 | keys but the same name. 1283 | 1284 | Supported values: 1285 | 1286 | - `true` 1287 | - `false` 1288 | 1289 | ## exclude\_from\_dim_agg 1290 | 1291 | - **Type:** boolean 1292 | - **Required:** N 1293 | 1294 | Excludes this level attribute from system generated dimension-only 1295 | aggregates. This is useful if the attribute contains a large number 1296 | (millions) of distinct values that you don't want to aggregate. 1297 | 1298 | Supported values: 1299 | 1300 | - `true` 1301 | - `false` 1302 | 1303 | ## is_aggregatable 1304 | 1305 | - **Type:** boolean 1306 | - **Required:** N 1307 | 1308 | Determines whether the level attribute's member values can be aggregated. When enabled, an `All` member is created for the attribute, whose value is the aggregation of all of the attribute's member values. The `All` member sits at the top of the attribute's hierarchy, though it is not a part of the attribute itself. It often serves as the level attribute's default member. 1309 | 1310 | ## exclude\_from\_fact_agg 1311 | 1312 | - **Type:** boolean 1313 | - **Required:** N 1314 | 1315 | Excludes this level attribute from system generated fact-based 1316 | aggregates. This is useful if the attribute contains a large number 1317 | (millions) of distinct values that you don't want to aggregate. 1318 | 1319 | Supported values: 1320 | 1321 | - `true` 1322 | - `false` 1323 | 1324 | ## sort_column 1325 | 1326 | - **Type:** string 1327 | - **Required:** N 1328 | 1329 | Defines the column to sort query results on. By default, this is the 1330 | `name_column`; however, you can optionally use this property to specify 1331 | a different column. 1332 | 1333 | **Note:** This only applies to MDX queries (queries received through the 1334 | XMLA interface). 1335 | 1336 | ## allowed\_calcs\_for_dma 1337 | 1338 | - **Type:** array 1339 | - **Required:** N 1340 | 1341 | A list of the calculations that can be used when creating dimensionally 1342 | modified aggregates for the level attribute. 1343 | 1344 | ## folder 1345 | 1346 | - **Type:** string 1347 | - **Required:** N 1348 | 1349 | The name of the folder in which this level attribute appears in BI 1350 | tools. 1351 | 1352 | ## time_unit 1353 | 1354 | - **Type:** string 1355 | - **Required:** N 1356 | 1357 | For time dimensions only. The unit of time to use. 1358 | 1359 | Supported values: 1360 | 1361 | `year`, `halfyear`, `trimester`, `quarter`, `month`, 1362 | `week`, `day`, `hour`, `minute`, `second`, `undefined` 1363 | 1364 | # Custom Empty Member Properties 1365 | 1366 | ## key 1367 | 1368 | - **Type:** array 1369 | - **Required:** Y 1370 | 1371 | A list of the empty member values to use for key fields. 1372 | 1373 | ## name 1374 | 1375 | - **Type:** string 1376 | - **Required:** Y 1377 | 1378 | The empty member value to use for name fields. 1379 | 1380 | ## sort_name 1381 | 1382 | - **Type:** string 1383 | - **Required:** N 1384 | 1385 | The empty member value to use for the attribute's sort column, if one is specified. 
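For illustration, a level attribute with a custom empty member might look like the following sketch. The names are hypothetical, and the `-1` key is shown only as a common convention for unknown foreign keys:

```yaml
# Hypothetical level attribute with a custom empty member.
level_attributes:
  - unique_name: d_customer
    label: Customer
    dataset: customer
    name_column: customer_name
    key_columns:
      - customer_key
    # Fact rows whose customer_key has no match are grouped here
    custom_empty_member:
      key:
        - "-1"
      name: Unknown Customer
      sort_name: Unknown Customer
```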
1386 | 
1387 | 
--------------------------------------------------------------------------------
/sml-reference/metric.md:
--------------------------------------------------------------------------------
1 | # Metric
2 | 
3 | Metric files define metrics to be used in your repository. A metric is a
4 | numeric value representing a summarized (or aggregated) dataset column,
5 | such as the sum of sales or average order quantity. Metrics always
6 | result from an aggregate calculation applied to one or more columns of a
7 | fact dataset.
8 | 
9 | SML supports the following types of metrics:
10 | 
11 | - **Additive:** Metrics whose values can be summarized for any dimension
12 |   attribute of the model and then combined consistently.
13 | - **Non-additive:** Metrics whose values cannot be summed across any
14 |   dimensional groupings using basic addition, since this would typically
15 |   produce an inaccurate result. The most common example of a
16 |   non-additive metric is a distinct count of an attribute value.
17 | - **Semi-additive:** Metrics whose values can be summarized for some
18 |   dimensions in a model, but not all. Ratios such as average are also
19 |   considered semi-additive metrics.
20 | 
21 | Sample `metric` file:
22 | 
23 | ```yaml
24 | unique_name: m_catalog_sales_coupon_amount_avg
25 | object_type: metric
26 | label: Catalog Sales Average Coupon Amount
27 | calculation_method: average
28 | dataset: catalog_sales
29 | column: cs_sales_price
30 | ```
31 | 
32 | # Entity Relationships
33 | 
34 | ```mermaid
35 | classDiagram
36 |     Metric *-- SemiAdditive
37 |     namespace Metrics{
38 |         class Metric{
39 |             String unique_name
40 |             String label
41 |             const object_type
42 |             String description
43 |             String calculation_method
44 |             String dataset
45 |             String column
46 |             SemiAdditive semi_additive
47 |             int compression
48 |             String named_quantiles
49 |             String format
50 |             enum unrelated_dimensions_handling
51 |             Boolean is_hidden
52 |         }
53 |         class SemiAdditive{
54 |             String position
55 |             Array~Relationship~ relationships
56 |         }
57 |     }
58 | ```
59 | 
60 | # Metric Properties
61 | 
62 | ## unique_name
63 | 
64 | - **Type:** string
65 | - **Required:** Y
66 | 
67 | The unique name of the metric. This must be unique across all
68 | repositories and subrepositories.
69 | 
70 | ## object_type
71 | 
72 | - **Type:** const
73 | - **Required:** Y
74 | 
75 | The type of object defined by the file. For metrics, the value of this
76 | property must be `metric`.
77 | 
78 | ## label
79 | 
80 | - **Type:** string
81 | - **Required:** Y
82 | 
83 | The name of the metric, as it appears in the consumption tool. This value does not
84 | need to be unique.
85 | 
86 | ## calculation_method
87 | 
88 | - **Type:** string
89 | - **Required:** Y
90 | 
91 | The method used to aggregate query results for the metric.
92 | 
93 | Supported values:
94 | 
95 | - `average`
96 | - `count distinct`
97 | - `count non-null`
98 | - `estimated count distinct`
99 | - `maximum`
100 | - `minimum`
101 | - `percentile`
102 | - `stddev_pop`
103 | - `stddev_samp`
104 | - `sum`
105 | - `sum distinct`
106 | - `var_pop`
107 | - `var_samp`
108 | 
109 | The calculation method you can use depends on the type of metric you're
110 | creating:
111 | 
112 | - **Semi-additive:** `average`, `sum`, `minimum`, `maximum`
113 | - **Non-additive:** `count distinct`, `sum distinct`, `percentile`
114 | - **Additive:** All other options
115 | 
116 | ## dataset
117 | 
118 | - **Type:** string
119 | - **Required:** Y
120 | 
121 | The source dataset that contains the column the metric is based on.
122 | 
123 | ## column
124 | 
125 | - **Type:** string
126 | - **Required:** Y
127 | 
128 | The specific column within the `dataset` that the metric is based on.
129 | 
130 | ## description
131 | 
132 | - **Type:** string
133 | - **Required:** N
134 | 
135 | A description of the metric.
136 | 
137 | ## semi_additive
138 | 
139 | - **Type:** object
140 | - **Required:** N
141 | 
142 | Defines the metric as a semi-additive metric.
143 | 
144 | `semi_additive` supports the following properties.
145 | 
146 | ### position
147 | 
148 | - **Type:** string
149 | - **Required:** Y
150 | 
151 | Determines whether the metric is First Non-Empty, Last Non-Empty, First Child, or Last Child.
152 | 
153 | Supported values:
154 | - `first`
155 | - `last`
156 | - `first_child` - introduced in version `1.1`
157 | - `last_child` - introduced in version `1.1`
158 | 
159 | ### relationships
160 | 
161 | - **Type:** Array
162 | - **Required:** Required if `degenerate_dimensions` is undefined; otherwise, it is optional.
163 | 
164 | A list of the relationships connecting to the dimensional attributes whose values should not be summarized by the metric. You can define as many as needed in the list.
165 | 
166 | **Note:** This list should not include relationships to degenerate dimensions; those must be defined via the `degenerate_dimensions` property (see below).
167 | 
168 | Relationships to embedded dimensions must be defined as indented lists, whose subitems construct the path to the nested dimension:
169 | 
170 | ```yaml
171 | position: first
172 | relationships:
173 |   - relationship1
174 |   - relationship2
175 |   -
176 |     - relationship3
177 |     - relationship4
178 |   - relationship5
179 | ```
180 | 
181 | In the above example, `relationship1`, `relationship2`, and `relationship5` all define relationships to attributes in first-level dimensions. The third item in the list defines the path to an embedded dimension, connected to the fact table via `relationship3` and `relationship4`.
182 | 
183 | **Note:** For semi-additive metrics that contain embedded dimensions, the following restrictions apply to the non-additive component of first non-empty/last non-empty value metrics:
184 | - Only the leaf levels of the embedded dimension hierarchies can be referenced. Note that this does not restrict the ability to query at higher levels of the embedded dimension hierarchy.
185 | - Embedded dimensions cannot be referenced via many-to-many relationships.
186 | - Embedded dimensions cannot be referenced through a path that involves role-playing.
187 | 
188 | 
189 | ### degenerate_dimensions
190 | 
191 | - **Type:** Array
192 | - **Required:** Required if `relationships` is undefined; otherwise, it is optional.
193 | 
194 | A list of degenerate dimensions whose values should not be summarized.
195 | 
196 | **Note:** This list must only include degenerate dimensions; non-degenerate dimensions must be defined via the `relationships` property (see above).
197 | 
198 | Supported properties:
199 | 
200 | - `name`: String, required. The unique name of the degenerate dimension.
201 | - `level`: String, required. The specific level within the degenerate dimension.
202 | 
203 | For example:
204 | 
205 | ```yaml
206 | position: first
207 | degenerate_dimensions:
208 |   - name: dim1
209 |     level: level1
210 |   - name: dim2
211 |     level: level2
212 | ```
213 | 
214 | ## compression
215 | 
216 | - **Type:** number
217 | - **Required:** N
218 | 
219 | Only for non-additive metrics using a `calculation_method` of
220 | `percentile`. Defines the compression score the semantic engine uses when
221 | estimating percentile values for query results.
222 | 
223 | You can specify a value from 1 to 50,000.
224 | 
225 | Using a higher compression score yields more accurate query results but
226 | requires more memory from the engine to process. You may need to run
227 | tests to determine the right level of compression for your needs.
228 | 
229 | 
230 | ## named_quantiles
231 | 
232 | - **Type:** string
233 | - **Required:** Required if `calculation_method` is `percentile`
234 | 
235 | Only for non-additive metrics using a `calculation_method` of
236 | `percentile`. Defines the quantile to use for query results.
237 | 
238 | Supported values:
239 | 
240 | - `quartiles`
241 | - `median`
242 | - `deciles`
243 | 
244 | ## format
245 | 
246 | - **Type:** string
247 | - **Required:** N
248 | 
249 | The format in which query results are returned. You can use one of
250 | SML's built-in named formats or a custom format string.
251 | 
252 | Supported named formats:
253 | 
254 | - `fixed`
255 | - `general number`
256 | - `none`
257 | - `percent`
258 | - `scientific`
259 | - `standard`
260 | 
261 | Custom format strings should be in quotes and contain one to four
262 | sections, separated by semicolons.
263 | 
264 | ## unrelated_dimensions_handling
265 | 
266 | - **Type:** string
267 | - **Required:** N
268 | 
269 | Determines how the semantic engine behaves when all of the following
270 | conditions are true:
271 | 
272 | - A client queries a model that contains multiple fact datasets.
273 | - The data in each fact dataset are at a different level of granularity
274 |   than the data in the other fact datasets.
275 | - The query references dimensions that are not related to the metrics
276 |   being queried.
277 | 
278 | Supported values:
279 | 
280 | - `error`: Query Engine rejects the query and returns an error message.
281 | - `empty`: Query Engine displays empty cells in the query results.
282 | - `repeat`: In the query results, Query Engine repeats the values for the
283 |   metric at a level of aggregation that is determined from the shared
284 |   dimensions in the query.
285 | 
286 | ## is_hidden
287 | 
288 | - **Type:** boolean
289 | - **Required:** N
290 | 
291 | Determines whether the metric is visible in BI tools.
292 | 
--------------------------------------------------------------------------------
/sml-reference/model.md:
--------------------------------------------------------------------------------
1 | # Model
2 | 
3 | Model files define SML models. In SML, a model is a metadata
4 | layer that overlays a multi-dimensional model format on top of the
5 | datasets stored in a connected database. The model is virtual, meaning
6 | the data is not moved or processed up front. Instead, it contains the
7 | logic about how to process and optimize the data at query runtime.
8 | 
9 | **Note:** Some properties can appear in both `catalog.yml` and model
10 | files. Those defined in model files override their counterparts in
11 | `catalog.yml`.
264 | ## unrelated_dimensions_handling
265 | 
266 | - **Type:** string
267 | - **Required:** N
268 | 
269 | Determines how the semantic engine behaves when all of the following
270 | conditions are true:
271 | 
272 | - A client queries a model that contains multiple fact datasets.
273 | - The data in each fact dataset are at a different level of granularity
274 |   than the data in the other fact datasets.
275 | - The query references dimensions that are not related to the metrics
276 |   being queried.
277 | 
278 | Supported values:
279 | 
280 | - `error`: Query Engine rejects the query and returns an error message.
281 | - `empty`: Query Engine displays empty cells in the query results.
282 | - `repeat`: In the query results, Query Engine repeats the values for the
283 |   metric at a level of aggregation that is determined from the shared
284 |   dimensions in the query.
285 | 
286 | ## is_hidden
287 | 
288 | - **Type:** boolean
289 | - **Required:** N
290 | 
291 | Determines whether the metric is visible in BI tools.
292 | 
--------------------------------------------------------------------------------
/sml-reference/model.md:
--------------------------------------------------------------------------------
1 | # Model
2 | 
3 | Model files define SML models. In SML, a model is a metadata
4 | layer that overlays a multi-dimensional model format on top of the
5 | datasets stored in a connected database. The model is virtual, meaning
6 | the data is not moved or processed up front. Instead, it contains the
7 | logic about how to process and optimize the data at query runtime.
8 | 
9 | **Note:** Some properties can appear in both `catalog.yml` and model
10 | files. Those defined in model files override their counterparts in
11 | `catalog.yml`.
12 | 
13 | Sample `model` file:
14 | 
15 | ```yaml
16 | unique_name: Internet Sales
17 | object_type: model
18 | label: Internet Sales
19 | visible: true
20 | 
21 | relationships:
22 | 
23 |   - unique_name: factinternetsales_Date_Dimension_Order
24 |     from:
25 |       dataset: factinternetsales
26 |       join_columns:
27 |         - orderdatekey
28 |     to:
29 |       dimension: Date Dimension
30 |       level: DayMonth
31 |     role_play: "Order {0}"
32 | 
33 |   - unique_name: factinternetsales_Date_Dimension_Ship
34 |     from:
35 |       dataset: factinternetsales
36 |       join_columns:
37 |         - shipdatekey
38 |     to:
39 |       dimension: Date Dimension
40 |       level: DayMonth
41 |     role_play: "Ship {0}"
42 | 
43 |   - unique_name: factinternetsales_Date_Dimension_Order_1
44 |     from:
45 |       dataset: factinternetsales
46 |       join_columns:
47 |         - orderdatekey
48 |     to:
49 |       dimension: Date Dimension
50 |       level: Reporting_Day
51 |     role_play: "Order {0}"
52 | 
53 |   - unique_name: factinternetsales_Date_Dimension_Ship_1
54 |     from:
55 |       dataset: factinternetsales
56 |       join_columns:
57 |         - shipdatekey
58 |     to:
59 |       dimension: Date Dimension
60 |       level: Reporting_Day
61 |     role_play: "Ship {0}"
62 | 
63 |   - unique_name: factinternetsales_Product_Dimension
64 |     from:
65 |       dataset: factinternetsales
66 |       join_columns:
67 |         - productkey
68 |     to:
69 |       dimension: Product Dimension
70 |       level: Product Name
71 | 
72 |   - unique_name: factinternetsales_Order_Dimension
73 |     from:
74 |       dataset: factinternetsales
75 |       join_columns:
76 |         - salesorderlinenumber
77 |         - salesordernumber
78 |         - currencykey
79 |     to:
80 |       dimension: Order Dimension
81 |       level: order_line_item
82 | 
83 |   - unique_name: factinternetsales_Date_Dimension_Order_2
84 |     from:
85 |       dataset: factinternetsales
86 |       join_columns:
87 |         - orderdatekey
88 |     to:
89 |       dimension: Date Dimension
90 |       level: customday
91 |     role_play: "Order {0}"
92 | 
93 |   - unique_name: factinternetsales_Date_Dimension_Ship_2
94 |     from:
95 |       dataset: factinternetsales
96 |       join_columns:
97 |         - shipdatekey
98 |     to:
99 |       dimension: Date Dimension
100 |       level: customday
101 |     role_play: "Ship {0}"
102 | 
103 |   - unique_name: factinternetsales_Customer_Dimension
104 |     from:
105 |       dataset: factinternetsales
106 |       join_columns:
107 |         - customerkey
108 |     to:
109 |       dimension: Customer Dimension
110 |       level: Customer Name
111 | 
112 | dimensions:
113 |   - Color Dimension
114 |   - Size Dimension
115 |   - Style Dimension
116 |   - Weight
117 | 
118 | metrics:
119 | 
120 |   - unique_name: orderquantity
121 |     folder: Sales Metrics
122 | 
123 |   - unique_name: salesamount
124 |     folder: Sales Metrics
125 | 
126 | perspectives:
127 |   - unique_name: Internet Sales - No PII
128 |     dimensions:
129 |       - hierarchies:
130 |           - levels:
131 |               - Customer Name
132 |             name: Customer Hierarchy
133 |         name: Customer Dimension
134 |         secondary_attributes:
135 |           - d_firstname
136 |           - d_lastname
137 | 
138 | drillthroughs:
139 | 
140 |   - unique_name: Customer Details
141 |     attributes:
142 | 
143 |       - name: State
144 |         dimension: Geography Dimension
145 | 
146 |       - name: Customer Name
147 |         dimension: Customer Dimension
148 | 
149 |       - name: City
150 |         dimension: Geography Dimension
151 | 
152 |       - name: Zip Code
153 |         dimension: Geography Dimension
154 | 
155 |     metrics:
156 |       - orderquantity
157 |       - salesamount
158 | 
159 |   - unique_name: Shipping Details
160 |     attributes:
161 | 
162 |       - name: Size
163 |         dimension: Size Dimension
164 | 
165 |       - name: Style
166 |         dimension: Style Dimension
167 | 
168 |       - name: Color
169 |         dimension: Color Dimension
170 | 
171 |       - name: Customer Name
172 |         dimension: Customer Dimension
173 | 
174 |       - name: Product Name
175 |         dimension: Product Dimension
176 | 
177 |     metrics:
178 |       - orderquantity
179 |       - salesamount
180 | ```
181 | 
182 | # Entity Relationships
183 | 
184 | ```mermaid
185 | classDiagram
186 |     Relationship *-- From
187 |     Relationship *-- To
188 |     Model ..> DimensionReference
189 |     Model ..> MetricReference
190 |     Model *-- Relationship
191 |     Model *-- Perspective
192 |     Model *-- Aggregate
193 |     Model *-- Partition
194 |     Model *-- Drillthrough
195 |     Aggregate *-- AttributeReference
196 |     Perspective *-- PerspectiveDimension
197 |     PerspectiveDimension *-- PerspectiveHierarchy
198 |     Drillthrough *-- AttributeReferenceDrillthrough
199 |     namespace Models{
200 |         class Model{
201 |             String unique_name
202 |             const object_type
203 |             String label
204 |             String description
205 |             Array~Relationship~ relationships
206 |             Array~String~ dimensions
207 |             Array~MetricReference~ metrics
208 |             Array~Aggregate~ aggregates
209 |             Array~Perspective~ perspectives
210 |             Array~Drillthrough~ drillthroughs
211 |             Array~Partition~ partitions
212 |             Boolean include_default_drillthrough
213 |             Object overrides
214 |         }
215 |         class Relationship{
216 |             String unique_name
217 |             Object from
218 |             Object to
219 |             String role_play
220 |             String type
221 |         }
222 |         class From{
223 |             String dataset
224 |             Array~String~ join_columns
225 |         }
226 |         class To{
227 |             String dimension
228 |             String level
229 |             String row_security
230 |         }
231 |         class Aggregate{
232 |             String unique_name
233 |             String label
234 |             enum caching
235 |             Array~String~ metrics
236 |             Array~AttributeReference~ attributes
237 |         }
238 |         class Drillthrough{
239 |             String unique_name
240 |             String notes
241 |             Array~String~ metrics
242 |             Array~AttributeReferenceDrillthrough~ attributes
243 |         }
244 |         class AttributeReferenceDrillthrough{
245 |             String name
246 |             String dimension
247 |             Array~String~ relationships_path
248 |         }
249 |         class AttributeReference{
250 |             String name
251 |             String dimension
252 |             String partition
253 |             String distribution
254 |             Array~String~ relationships_path
255 |         }
256 |         class Partition{
257 |             String unique_name
258 |             String dimension
259 |             String attribute
260 |             String type
261 |         }
262 |         class Perspective{
263 |             String unique_name
264 |             Array~String~ metrics
265 |             Array~PerspectiveDimension~ dimensions
266 |         }
267 |         class MetricReference{
268 |             String unique_name
269 |             String folder
270 |         }
271 |         class DimensionReference{
272 |             String unique_name
273 |         }
274 |         class PerspectiveDimension{
275 |             String name
276 |             Array~PerspectiveHierarchy~ hierarchies
277 |             Array~String~ secondary_attributes
278 |             Array~String~ relationships_path
279 |         }
280 |         class PerspectiveHierarchy{
281 |             String name
282 |             Array~String~ levels
283 |         }
284 |     }
285 | ```
286 | 
287 | # Model Properties
288 | 
289 | ## unique_name
290 | 
291 | - **Type:** string
292 | - **Required:** Y
293 | 
294 | The unique name of the model. This must be unique across all
295 | repositories and subrepositories.
296 | 
297 | ## object_type
298 | 
299 | - **Type:** const
300 | - **Required:** Y
301 | 
302 | The type of object defined by the file. For models, the value of this
303 | property should be `model`.
304 | 
305 | ## label
306 | 
307 | - **Type:** string
308 | - **Required:** Y
309 | 
310 | The name of the model, as it appears in the consumption tool. This value does not
311 | need to be unique.
312 | 
313 | ## relationships
314 | 
315 | - **Type:** array
316 | - **Required:** Y
317 | 
318 | Defines the relationships between the model's fact datasets and
319 | first-order dimensions. These are called fact relationships.
320 | 
321 | **Note:** These relationships are separate from those defined at the
322 | dimension level: relationships at the model level involve fact datasets,
323 | while those at the dimension level do not.
324 | 
325 | **Note:** Degenerate dimensions have relationships to the fact datasets on
326 | which they are based. However, these dimensions do not need a
327 | `relationships` property as they are created by referencing the fact
328 | dataset columns directly.
329 | 
330 | If you do not want to add relationships to the model, the value of this
331 | property must be `[]`. For example: `relationships: []`
332 | 
333 | The `relationships` property of a model file supports the following
334 | properties.
335 | 
336 | ### unique_name
337 | 
338 | - **Type:** string
339 | - **Required:** Y
340 | 
341 | The unique name of the relationship. This must be unique within the
342 | model file.
343 | 
344 | ### from
345 | 
346 | - **Type:** object
347 | - **Required:** Y
348 | 
349 | Defines the side of the relationship that contains the physical fact
350 | dataset. Typically, this is a join column in the fact dataset.
351 | 
352 | Supported properties:
353 | 
354 | - `dataset`: String, required. The physical fact dataset you want to
355 |   link to a dimension.
356 | - `join_columns`: Array, required. The columns within the `dataset` that
357 |   you want to use as join columns.
358 | 
359 | ### to
360 | 
361 | - **Type:** object
362 | - **Required:** Y
363 | 
364 | Defines the dimension that the `from` dataset is linked to.
365 | 
366 | Supported properties:
367 | 
368 | - `dimension`: String, required if `row_security` is undefined. The name
369 |   of the dimension to which the `from` dataset is joined.
370 | - `level`: String, required if `row_security` is undefined. The
371 |   `unique_name` of the level attribute within the `dimension` to use for
372 |   the relationship.
373 | - `row_security`: String, required if `dimension` and `level` are
374 |   undefined. For security relationships, the row
375 |   security object that the `from` dataset is joined to.
376 | 
377 | ### role_play
378 | 
379 | - **Type:** string
380 | - **Required:** N
381 | 
382 | For role-playing relationships only. Defines the role-playing template
383 | for the relationship.
384 | 
385 | The role-playing template is the prefix and/or suffix that is added to
386 | every attribute in the role-played dimension.
387 | 
388 | This value must be in one of the following formats (including quotation
389 | marks):
390 | 
391 | - **Prefix:** `"prefix {0}"`
392 | - **Suffix:** `"{0} suffix"`
393 | - **Prefix and suffix:** `"prefix {0} suffix"`
394 | 
395 | For example, if you wanted to use the prefix **Order**, you would set
396 | `role_play` to `"Order {0}"`.
397 | 
398 | ## metrics
399 | 
400 | - **Type:** array
401 | - **Required:** Y
402 | 
403 | A list of references to metrics and calculations used in the model.
404 | 
405 | Supported properties:
406 | 
407 | - `unique_name`: String, required. The unique name of the metric or
408 |   calculation.
409 | - `folder`: String, optional. The name of the folder in which the
410 |   metric/calculation is displayed in BI tools. If your model has a lot
411 |   of metrics/calculations, folders are a good way to organize them.
412 | 
413 | **Note:** If you do not want to add metrics to the model, the value of
414 | this property must be `[]`. For example: `metrics: []`
415 | 
416 | ## description
417 | 
418 | - **Type:** string
419 | - **Required:** N
420 | 
421 | A description of the model.
422 | 
423 | ## dimensions
424 | 
425 | - **Type:** array
426 | - **Required:** N
427 | 
428 | A list of references to degenerate dimensions defined on a specific fact
429 | dataset in the model.
430 | 
431 | ## perspectives
432 | 
433 | - **Type:** array
434 | - **Required:** N
435 | 
436 | Perspectives are deployable subsets of the data model. They are meant to
437 | make it easier for analysts to query only the subset of data that is
438 | relevant to their purposes or responsibilities. Rather than provide
439 | analysts with the entire data model, you can make specific dimensions,
440 | hierarchies, levels, secondary attributes, measures, and calculated
441 | measures invisible to them.
442 | 
443 | **Note:** We recommend that you add perspectives *after* a model has
444 | been fully tested. Although you can edit a model after adding
445 | perspectives, any changes might require you to update the perspectives
446 | to hide new objects that would otherwise be visible to all users.
447 | 
448 | The semantic engine imposes no limit on the number of perspectives that
449 | you can add to a model. Perspectives contain no data themselves, but are
450 | simply virtual views of the data.
451 | 
452 | The `perspectives` property in a model file supports the following
453 | properties.
454 | 
455 | ### unique_name
456 | 
457 | - **Type:** string
458 | - **Required:** Y
459 | 
460 | The unique name of the perspective. This must be unique within the model
461 | file.
462 | 
463 | ### metrics
464 | 
465 | - **Type:** array
466 | - **Required:** N
467 | 
468 | A list of the specific metrics and calculations to be hidden in the
469 | perspective.
470 | 
471 | ### dimensions
472 | 
473 | - **Type:** array
474 | - **Required:** N
475 | 
476 | A list of the specific dimensions and their hierarchies to be hidden in the
477 | perspective.
478 | 
479 | By default, all objects within a dimension are visible. The lowest-granularity objects specified are
480 | hidden, and the objects above them are not. Hiding a level in a hierarchy hides all levels below it.
481 | Hiding a hierarchy hides all levels in it. Hiding a dimension hides all objects within it, including hierarchies
482 | and secondary attributes. If a dimension is not hidden, secondary attributes can be hidden individually.
483 | 
484 | Supported properties:
485 | 
486 | - `name`: String, required. The name of the dimension to be hidden in the
487 |   perspective.
488 | 
489 | - `hierarchies`: Array, optional. A list of the specific hierarchies
490 |   within the dimension to hide in the perspective. Supported properties:
491 |   - `name`: String, required. The name of the hierarchy.
492 |   - `levels`: Array, optional. Defines a single level in the hierarchy to be hidden in the perspective. All levels below the specified level will also be hidden. Only one level should be provided.
493 | 
494 | - `secondary_attributes`: Array, optional. A list of the dimension's
495 |   secondary attributes to hide in the perspective.
496 | 
497 | - `relationships_path`: Array, optional. A list of relationships used to specify role-playing.
498 | 
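For example, the following sketch (reusing names from the sample model above, for illustration only) hides the `salesamount` metric, hides the `Customer Name` level and everything below it in `Customer Hierarchy`, and hides two secondary attributes:

```yaml
perspectives:
  - unique_name: Internet Sales - Limited
    metrics:
      - salesamount
    dimensions:
      - name: Customer Dimension
        hierarchies:
          - name: Customer Hierarchy
            levels:
              - Customer Name
        secondary_attributes:
          - d_firstname
          - d_lastname
```
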
499 | ## drillthroughs
500 | 
501 | - **Type:** array
502 | - **Required:** N
503 | 
504 | In BI tools, a drillthrough enables you to view detailed information
505 | about a specific cell within a visualization as needed. This provides an
506 | alternative to including lots of fine-grained attributes in large pivot
507 | tables, which can result in performance issues. Moving these attributes
508 | to drillthroughs means they are only returned if a user requests them
509 | for a specific cell, rather than for the entire table.
510 | 
511 | In an SML model, you can define drillthroughs that include the specific
512 | level of detail to return for these types of queries.
513 | 
514 | The `drillthroughs` property in a model file supports the following
515 | properties.
516 | 
517 | ### unique_name
518 | 
519 | - **Type:** string
520 | - **Required:** Y
521 | 
522 | The unique name of the drillthrough. This must be unique within the
523 | model file.
524 | 
525 | ### metrics
526 | 
527 | - **Type:** array
528 | - **Required:** Y
529 | 
530 | A list of the metrics to include in the drillthrough.
531 | 
532 | ### notes
533 | 
534 | - **Type:** string
535 | - **Required:** N
536 | 
537 | Notes about the drillthrough.
538 | 
539 | ### attributes
540 | 
541 | - **Type:** array
542 | - **Required:** N
543 | 
544 | A list of the specific attributes to include in the drillthrough.
545 | 
546 | Supported properties:
547 | 
548 | - `name`: String, required. The name of the attribute to include in the
549 |   drillthrough.
550 | - `dimension`: String, optional. The dimension that the attribute
551 |   defined by `name` appears in.
552 | - `relationships_path`: Array, optional. A list of relationships that specifies the path to the attribute; used for role-playing.
553 | 
554 | ## aggregates
555 | 
556 | - **Type:** array
557 | - **Required:** N
558 | 
559 | The `aggregates` property in a model file enables you to add
560 | user-defined aggregates (UDAs).
561 | 
562 | In general, we recommend relying on the aggregate tables
563 | automatically generated by the semantic engine. However, there are cases
564 | that are not covered by system-defined aggregates. For example:
565 | 
566 | - **Metrics on dimensions:** The semantic engine does not generate
567 |   aggregate tables for metrics that are local to a dimension only (a
568 |   secondary metrical attribute in the model).
569 | - **Non-additive metrics:** The semantic engine does not generate
570 |   aggregate tables for non-additive metrics, which are useful for
571 |   distinct counts. This is because such an aggregate table defined for
572 |   one query would not be usable by other queries.
573 | 
574 | If you require aggregate tables that contain these types of dimensional
575 | attributes or metrics, you should define your own manually using the
576 | `aggregates` property.
577 | 
578 | The `aggregates` property in a model file supports the following
579 | properties.
580 | 
581 | ### unique_name
582 | 
583 | - **Type:** string
584 | - **Required:** Y
585 | 
586 | The unique name of the aggregate. This must be unique within the model
587 | file.
588 | 
589 | Aggregate table names used by the query engine are system-generated, but
590 | they include the first 14 characters of the user-supplied name at the
591 | end of the internal ID name. This name can help you identify when a
592 | user-defined aggregate is used in a query. For example:
593 | `as_agg_internal-id_my-uda-name`
594 | 
595 | ### label
596 | 
597 | - **Type:** string
598 | - **Required:** Y
599 | 
600 | The name of the aggregate, as it appears in the consumption tool. This value does not
601 | need to be unique.
602 | 
603 | ### caching
604 | 
605 | - **Type:** enum
606 | - **Required:** N
607 | 
608 | Controls whether the aggregate is pinned in the local cache.
609 | 
610 | Supported values:
611 | 
612 | - `engine-memory`
613 | 
614 | ### metrics
615 | 
616 | - **Type:** array
617 | - **Required:** Y
618 | 
619 | A list of the metrics and calculations to include in the aggregate
620 | definition. This is the data that is summarized in the resulting
621 | aggregate table.
622 | 
623 | ### attributes
624 | 
625 | - **Type:** array
626 | - **Required:** N
627 | 
628 | A list of the dimension attributes to include in the aggregate
629 | definition.
630 | 
631 | Supported properties:
632 | 
633 | - `name`: String, required. The name of the dimension attribute to
634 |   include. These values are used to group the summarized metric data in
635 |   the resulting aggregate table. Note that user-defined aggregate
636 |   definitions are fixed: they do not include every level of a hierarchy
637 |   unless they are explicitly defined.
638 | 
639 | - `dimension`: String, required. The dimension to which the attribute
640 |   defined by `name` belongs.
641 | 
642 | - `partition`: String, optional. Adds a partition to the aggregate, and
643 |   determines whether it should be defined on the key column, name
644 |   column, or both. Supported values: `name`, `key`, `name+key`
645 | 
646 |   When the engine builds an instance of this aggregate, it creates
647 |   a partition for each combination of values in the dimensional
648 |   attributes. The number of partitions depends on the
649 |   left-to-right order of the attributes, as well as the number of
650 |   values for each attribute.
651 | 
652 |   Essentially, the partitioning key functions as a `GROUP BY`
653 |   column. Queries against the aggregate must use this dimensional
654 |   attribute in a `WHERE` clause. A good candidate for a
655 |   partitioning key is a set of dimensional attributes that
656 |   together have a few hundred to under 1,000 value combinations.
657 | 
658 | - `distribution`: String, optional. The distribution keys to use when
659 |   creating the aggregate table. If your aggregate data warehouse
660 |   supports distribution keys, then the semantic engine uses the specified keys when
661 |   creating the aggregate table.
662 | 
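For example, a user-defined aggregate that summarizes `salesamount` by customer, partitioned on the key column, might look like the following sketch (names reused from the sample model above, for illustration only):

```yaml
aggregates:
  - unique_name: Sales by Customer
    label: Sales by Customer
    caching: engine-memory
    metrics:
      - salesamount
    attributes:
      - name: Customer Name
        dimension: Customer Dimension
        partition: key
```
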
663 | ## partitions
664 | 
665 | - **Type:** array
666 | - **Required:** N
667 | 
668 | The `partitions` property in a model file enables you to create
669 | prioritized partitioning hints that the semantic engine uses to create
670 | partitioned aggregate tables. The actual partitioning scheme used by the
671 | engine depends on a number of factors, including:
672 | 
673 | - Whether the aggregate includes a column that matches a partition hint.
674 | - Whether the semantic engine statistics suggest that partitioning would be
675 |   worthwhile.
676 | - Whether the target data warehouse supports table partitioning.
677 | 
678 | Within SML, all partitions used in a model are defined in the model file
679 | itself.
680 | 
681 | The `partitions` property in a model supports the following properties.
682 | 
683 | ### unique_name
684 | 
685 | - **Type:** string
686 | - **Required:** Y
687 | 
688 | The unique name of the partition. This must be unique within the model
689 | file.
690 | 
691 | ### dimension
692 | 
693 | - **Type:** string
694 | - **Required:** Y
695 | 
696 | The dimension that contains the `attribute` the partition is based on.
697 | 
698 | ### attribute
699 | 
700 | - **Type:** string
701 | - **Required:** Y
702 | 
703 | The attribute that the partition is based on.
704 | 
705 | ### type
706 | 
707 | - **Type:** string
708 | - **Required:** Y
709 | 
710 | Determines whether the partition is defined on the name column, key
711 | column, or both.
712 | 
713 | Supported values:
714 | 
715 | - `name`
716 | - `key`
717 | - `name+key`
718 | 
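For example, a partitioning hint on the `Reporting_Day` attribute of the `Date Dimension` might look like the following sketch (names reused from the sample model above, for illustration only):

```yaml
partitions:
  - unique_name: Order Date Partition
    dimension: Date Dimension
    attribute: Reporting_Day
    type: key
```
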
719 | ## dataset_properties
720 | 
721 | - **Type:** object
722 | - **Required:** N
723 | 
724 | Defines dataset properties that are specific to the model, rather than
725 | the repository.
726 | 
727 | Supported properties:
728 | 
729 | - `allow_aggregates`: Boolean, optional. Enables the semantic engine to
730 |   create aggregates for datasets in the repository.
731 | - `allow_local_aggs`: Boolean, optional. Enables local aggregation for
732 |   datasets in the repository.
733 | - `allow_peer_aggs`: Boolean, optional. Enables aggregation on data
734 |   derived from datasets in data warehouses that are different from the
735 |   source dataset.
736 | - `allow_preferred_aggs`: Boolean, optional. Allows aggregates to be built
737 |   in preferred storage.
738 | - `create_hinted_aggregate`: Boolean, optional. Enables the creation of
739 |   hinted aggregates for the dataset.
740 | 
741 | Specify the name of the dataset followed by the properties and values
742 | you want to set for it at the model level. For example:
743 | 
744 |     dataset1:
745 |       create_hinted_aggregate: true
746 | 
747 | ## overrides
748 | 
749 | - **Type:** object
750 | - **Required:** N
751 | 
752 | The `overrides` property in a model file enables the creation of query name overrides for metrics and degenerate dimensions referenced in a model.
753 | This scenario arises from legacy projects in which metrics or dimensions in different models used the same `unique_name` when deployed.
754 | Because uniqueness is enforced at the project scope, these metrics/dimensions must have different `unique_name` values in the repository.
755 | When a model is deployed, the overridden `unique_name` is replaced by the original value.
756 | As a best practice, never use the same `unique_name` for different objects across models; overrides should only be used when migrating
757 | from a legacy model where you want to maintain the same query names for existing interfaces.
758 | 
759 | **Note:** This applies only to degenerate dimensions, NOT to dimensions that take part in relationships.
760 | 
761 | - The object key must be a metric or a dimension referenced in the model.
762 | 
763 | ### query_name
764 | 
765 | - **Type:** string
766 | - **Required:** Y
767 | 
768 | The query name that the metric or dimension should be resolved to.
769 | 
770 | Sample `overrides`:
771 | 
772 | ```yaml
773 | overrides:
774 |   salesamount:
775 |     query_name: deployed query name for metric
776 |   Color Dimension:
777 |     query_name: deployed query name for dimension
778 | ```
--------------------------------------------------------------------------------
/sml-reference/package.md:
--------------------------------------------------------------------------------
1 | # Package
2 | 
3 | Package files enable you to define additional Git repositories whose
4 | objects can be used in the current repository. This enables you to share
5 | individual objects (such as dimensions) across multiple models.
6 | 
7 | Sample `package` file:
8 | 
9 | ```yaml
10 | packages:
11 |   - name: shared
12 |     url: https://github.com/company/shared
13 |     branch: main
14 |     version: 'latest' # 'commit:f35ce2d975cee7c8d95f9e4c93ef4946089950fd', 'tag:v2024.01'
15 |   - name: shared2
16 |     url: https://github.com/company/shared2
17 |     branch: main
18 |     version: 'latest' # 'commit:f35ce2d975cee7c8d95f9e4c93ef4946089950fd', 'tag:v2024.01'
19 | ```
20 | 
21 | # Entity Relationships
22 | 
23 | ```mermaid
24 | classDiagram
25 |     Packages *-- Package : Contains
26 |     class Packages{
27 |         int version
28 |         Array~Package~ packages
29 |     }
30 |     class Package{
31 |         String name
32 |         String url
33 |         String branch
34 |         String version
35 |     }
36 | ```
37 | 
38 | # Package Properties
39 | 
40 | ## version
41 | 
42 | - **Type:** number
43 | - **Required:** Y
44 | 
45 | The schema version for the file. The value of this property should be
46 | `1`.
47 | 
48 | ## packages
49 | 
50 | - **Type:** array
51 | - **Required:** Y
52 | 
53 | A list of the Git repositories that the current repository can use
54 | objects from.
55 | 
56 | `package` supports the following properties:
57 | 
58 | - `name`: String, required. The name of the repository.
59 | - `url`: String, required. The URL for the repository.
60 | - `branch`: String, required. The specific branch from the repository to
61 |   use.
62 | - `version`: String, required. The version of the repository to use: `latest`,
63 |   `commit:` followed by 8-40 alphanumeric characters of a commit ID, or `tag:` followed by a tag name.
64 | 
--------------------------------------------------------------------------------
/sml-reference/row-security.md:
--------------------------------------------------------------------------------
1 | # Row Security
2 | 
3 | Row security files enable you to define security objects, which restrict
4 | access to data in a model. These restrictions can be configured at
5 | either the user or the group level. When users run queries against a
6 | model, the semantic engine uses the `row_security` object as a runtime constraint.
7 | 
8 | Row security requires a separate dataset that maps user or group IDs to
9 | specific rows in a dimension or fact dataset. Each user or group can
10 | only access the data in rows that match the filter; for example, you can
11 | restrict a user's access to rows relating to specific countries only.
12 | 
13 | Once you create a row security object, you can use it to secure other
14 | dimensions and datasets in a model by creating a relationship from the
15 | dataset/dimension you want secured to the row security file.
16 | 
17 | Sample `row_security` file:
18 | 
19 | ```yaml
20 | unique_name: Country Security Filter
21 | label: Country Security Filter
22 | object_type: row_security
23 | dataset: user_country_mapping
24 | filter_key_column: country
25 | use_filter_key: true
26 | ids_column: username
27 | id_type: group
28 | scope: related
29 | secure_totals: true
30 | ```
31 | 
32 | How to reference the above `row_security` object in a `dimension`:
33 | 
34 | ```yaml
35 | - unique_name: GeographyDimension_CustomerDimSecurity
36 |   from:
37 |     hierarchy: Geography City
38 |     level: CountryCity
39 |     dataset: dim_geo_country
40 |     join_columns:
41 |       - country
42 |   to:
43 |     row_security: Country Security Filter
44 |   type: embedded
45 | ```
46 | 
47 | # Entity Relationships
48 | 
49 | ```mermaid
50 | classDiagram
51 |     class RowSecurity{
52 |         String unique_name
53 |         String label
54 |         const object_type
55 |         String description
56 |         String dataset
57 |         String filter_key_column
58 |         Boolean use_filter_key
59 |         String ids_column
60 |         String id_type
61 |         String scope
62 |         Boolean secure_totals
63 |     }
64 | ```
65 | 
66 | # Row Security Properties
67 | 
68 | ## unique_name
69 | 
70 | - **Type:** string
71 | - **Required:** Y
72 | 
73 | The unique name of the security object. This must be unique across all
74 | repositories and subrepositories.
75 | 
76 | ## object_type
77 | 
78 | - **Type:** const
79 | - **Required:** Y
80 | 
81 | The type of object defined by the file. For row security files, the
82 | value of this property must be `row_security`.
83 | 
84 | ## label
85 | 
86 | - **Type:** string
87 | - **Required:** Y
88 | 
89 | The name of the security object, as it appears in the consumption tool. This value
90 | does not need to be unique.
91 | 
92 | ## dataset
93 | 
94 | - **Type:** string
95 | - **Required:** Y
96 | 
97 | The dataset that contains the user-to-attribute mappings determining
98 | which rows each user/group can access.
99 | 
100 | ## filter_key_column
101 | 
102 | - **Type:** string
103 | - **Required:** Y
104 | 
105 | The column in the security dataset that defines the rows each user/group
106 | has access to.
107 | 
108 | ## ids_column
109 | 
110 | - **Type:** string
111 | - **Required:** Y
112 | 
113 | The column of the security dataset that contains user/group IDs.
114 | 
115 | ## id_type
116 | 
117 | - **Type:** string
118 | - **Required:** Y
119 | 
120 | Determines whether the IDs are for users or groups.
121 | 
122 | Supported values:
123 | 
124 | - `user`
125 | - `group`
126 | 
127 | ## scope
128 | 
129 | - **Type:** string
130 | - **Required:** Y
131 | 
132 | Determines which queries the security constraint is applied to.
133 | 
134 | Supported values:
135 | 
136 | - `related`: The security constraint is applied when the query selects
137 |   any dimension or secondary attribute that has a path to the security
138 |   dataset, as long as no fact table is used. The security constraint is
139 |   *not* applied to dimension-only queries that select multiple
140 |   dimensions related through a fact table.
141 | - `fact`: The security constraint is applied to the same queries as the
142 |   `related` option, as well as queries that include a measure from a
143 |   fact table connected to the secure dimension. The security constraint
144 |   is *not* applied to single-dimension-only queries that are related to
145 |   the secured dimension via the fact table. However,
146 |   multi-dimension-only queries do have security applied because they are
147 |   joined using a synthetic measure from the fact table that relates
148 |   them.
149 | - `all`: The security constraint is applied to all queries, unless there
150 |   is no path to the security dimension. This is the case with two
151 |   separate fact tables, each with its own unrelated dimensions.
152 | 
153 | ## description
154 | 
155 | - **Type:** string
156 | - **Required:** N
157 | 
158 | A description of the security object.
159 | 
160 | ## use_filter_key
161 | 
162 | - **Type:** boolean
163 | - **Required:** N
164 | 
165 | Determines how SML enforces security.
166 | 
167 | Supported values:
168 | 
169 | - `true`: The system first looks up the `filter_key_column` values using
170 |   the user or group's ID, and then uses those values as a constraint in a
171 |   second query against the fact dataset or dimension. Some data
172 |   warehouses perform better with this option.
173 | - `false`: The system enforces security by joining with the security
174 |   table.
175 | 
176 | ## secure_totals
177 | 
178 | - **Type:** boolean
179 | - **Required:** N
180 | 
181 | Enables/disables the secure totals functionality.
182 | 
183 | When enabled, the security restriction applies to the following:
184 | 
185 | - Subtotal measures of the secured hierarchy level or reachable
186 |   attributes of higher levels.
187 | - Queries that select secured fact tables (a `scope` of `all` or
188 |   `fact`), but do not select the secured dimension.
189 | - The grouping of the secured level.
190 | - The secured level's secondary attributes.
191 | - Attributes and nested dimensions that are reachable from hierarchy
192 |   levels lower than the secured level.
193 | 
194 | When secure totals is disabled, the security restriction only applies
195 | to the following:
196 | 
197 | - The grouping of the secured level.
198 | - The secured level's secondary attributes.
199 | - Attributes and nested dimensions that are reachable from hierarchy
200 |   levels lower than the secured level.
201 | 
202 | Supported values:
203 | 
204 | - `true` (default)
205 | - `false`
206 | 
--------------------------------------------------------------------------------