├── .github
    ├── ISSUE_TEMPLATE.md
    └── PULL_REQUEST_TEMPLATE.md
├── .gitignore
├── AzureDataFactory
    ├── README.md
    ├── adf-arm-template-cdm-to-dw
    │   ├── arm_template.json
    │   └── arm_template_parameters.json
    ├── adf-arm-template-databricks-cdm-to-dw
    │   ├── arm_template.json
    │   └── arm_template_parameters.json
    ├── arm-template-azure-function-app
    │   ├── DeploymentHelper.cs
    │   ├── deploy.ps1
    │   ├── deploy.sh
    │   ├── deployer.rb
    │   ├── parameters.json
    │   └── template.json
    └── azure-function-zip
    │   ├── Parse.cs
    │   └── cdmtodwparser.zip
├── AzureDatabricks
    ├── Library
    │   └── spark-cdm-assembly-0.3.jar
    ├── README.md
    └── Samples
    │   └── read-write-demo-wide-world-importers.py
├── AzureMachineLearning
    ├── CdmModel.py
    ├── README.md
    └── cdm-customer-classification-demo.ipynb
├── AzureSqlDataWarehouse
    ├── README.md
    └── WWI-Sales.sql
├── AzureSqlDatabase
    ├── README.md
    └── WideWorldImporters-Standard.bacpac
├── CDM
    ├── README.md
    ├── dotnet
    │   ├── Microsoft.CdmFolders.SampleLibraries.csproj
    │   ├── Model
    │   │   ├── Annotation.cs
    │   │   ├── AnnotationCollection.cs
    │   │   ├── Attribute.cs
    │   │   ├── AttributeCollection.cs
    │   │   ├── AttributeReference.cs
    │   │   ├── CsvFormatSettings.cs
    │   │   ├── CsvQuoteStyle.cs
    │   │   ├── CsvStyle.cs
    │   │   ├── DataObject.cs
    │   │   ├── DataType.cs
    │   │   ├── Entity.cs
    │   │   ├── EntityCollection.cs
    │   │   ├── FileFormatSettings.cs
    │   │   ├── LocalEntity.cs
    │   │   ├── MetadataObject.cs
    │   │   ├── MetadataObjectCollection.cs
    │   │   ├── Model.cs
    │   │   ├── ObjectCollection.cs
    │   │   ├── Partition.cs
    │   │   ├── PartitionCollection.cs
    │   │   ├── ReferenceEntity.cs
    │   │   ├── ReferenceModel.cs
    │   │   ├── ReferenceModelCollection.cs
    │   │   ├── Relationship.cs
    │   │   ├── RelationshipCollection.cs
    │   │   ├── SchemaCollection.cs
    │   │   ├── SchemaEntityInfo.cs
    │   │   ├── SingleKeyRelationship.cs
    │   │   └── UriExtensions.cs
    │   ├── README.md
    │   └── SerializationHelpers
    │   │   ├── CollectionsContractResolver.cs
    │   │   ├── SerializationOrderConstants.cs
    │   │   ├── StringEnumCamelCaseConverter.cs
    │   │   ├── TypeNameSerializationBinder.cs
    │   │   └── TypeNameSerializationBinderHelper.cs
    ├── python
    │   ├── CdmModel.py
    │   └── README.md
    └── schema
    │   ├── README.md
    │   ├── examples
    │       ├── OrdersProducts
    │       │   └── model.json
    │       └── OrdersProductsCustomersLinked
    │       │   └── model.json
    │   └── modeljsonschema.json
├── CHANGELOG.md
├── CONTRIBUTING.md
├── LICENSE.md
├── README.md
├── Tutorial
    ├── CDM-Azure-Data-Services-Integration-Tutorial.md
    ├── README.md
    └── media
    │   ├── adfauthor.png
    │   ├── adfpipeline.png
    │   ├── authormonitor.png
    │   ├── azuresqldb.png
    │   ├── cdmparserresources.png
    │   ├── createdataflow.png
    │   ├── dataflowase.png
    │   ├── dataflowase2.png
    │   ├── dataflowdone.png
    │   ├── dataflowsettings.png
    │   ├── folderlocation.png
    │   ├── mountcdmpipeline.png
    │   ├── overview.png
    │   ├── pipelineparameters.png
    │   ├── refreshdataflow.png
    │   ├── refresheddataflow.png
    │   ├── savedataflow.png
    │   ├── selecttables.png
    │   └── ssmsWWI.png
└── Using ADF Mapping Data Flows with CDM.pdf

/.github/ISSUE_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | 4 | > Please provide us with the following information: 5 | > --------------------------------------------------------------- 6 | 7 | ### This issue is for a: (mark with an `x`) 8 | ``` 9 | - [ ] bug report -> please search issues before submitting 10 | - [ ] feature request 11 | - [ ] documentation issue or request 12 | - [ ] regression (a behavior that used to work and stopped in a new release) 13 | ``` 14 | 15 | ### Minimal steps to reproduce 16 | > 17 | 18 | ### Any log messages given by the failure 19 | > 20 | 21 | ### Expected/desired behavior 22 | > 23 | 24 | ### OS and Version? 25 | > Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?) 26 | 27 | ### Versions 28 | > 29 | 30 | ### Mention any other details that might be useful 31 | 32 | > --------------------------------------------------------------- 33 | > Thanks! We'll be in touch soon. 
34 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | ## Purpose 2 | 3 | * ... 4 | 5 | ## Does this introduce a breaking change? 6 | 7 | ``` 8 | [ ] Yes 9 | [ ] No 10 | ``` 11 | 12 | ## Pull Request Type 13 | What kind of change does this Pull Request introduce? 14 | 15 | 16 | ``` 17 | [ ] Bugfix 18 | [ ] Feature 19 | [ ] Code style update (formatting, local variables) 20 | [ ] Refactoring (no functional changes, no api changes) 21 | [ ] Documentation content changes 22 | [ ] Other... Please describe: 23 | ``` 24 | 25 | ## How to Test 26 | * Get the code 27 | 28 | ``` 29 | git clone [repo-address] 30 | cd [repo-name] 31 | git checkout [branch-name] 32 | npm install 33 | ``` 34 | 35 | * Test the code 36 | 37 | ``` 38 | ``` 39 | 40 | ## What to Check 41 | Verify that the following are valid 42 | * ... 43 | 44 | ## Other Information 45 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | ## Ignore Visual Studio temporary files, build results, and 2 | ## files generated by popular Visual Studio add-ons. 
3 | ## 4 | ## Get latest from https://github.com/github/gitignore/blob/master/VisualStudio.gitignore 5 | 6 | # User-specific files 7 | *.suo 8 | *.user 9 | *.userosscache 10 | *.sln.docstates 11 | 12 | # User-specific files (MonoDevelop/Xamarin Studio) 13 | *.userprefs 14 | 15 | # Build results 16 | [Dd]ebug/ 17 | [Dd]ebugPublic/ 18 | [Rr]elease/ 19 | [Rr]eleases/ 20 | x64/ 21 | x86/ 22 | bld/ 23 | [Bb]in/ 24 | [Oo]bj/ 25 | [Ll]og/ 26 | 27 | # Visual Studio 2015/2017 cache/options directory 28 | .vs/ 29 | # Uncomment if you have tasks that create the project's static files in wwwroot 30 | #wwwroot/ 31 | 32 | # Visual Studio 2017 auto generated files 33 | Generated\ Files/ 34 | 35 | # MSTest test Results 36 | [Tt]est[Rr]esult*/ 37 | [Bb]uild[Ll]og.* 38 | 39 | # NUNIT 40 | *.VisualState.xml 41 | TestResult.xml 42 | 43 | # Build Results of an ATL Project 44 | [Dd]ebugPS/ 45 | [Rr]eleasePS/ 46 | dlldata.c 47 | 48 | # Benchmark Results 49 | BenchmarkDotNet.Artifacts/ 50 | 51 | # .NET Core 52 | project.lock.json 53 | project.fragment.lock.json 54 | artifacts/ 55 | **/Properties/launchSettings.json 56 | 57 | # StyleCop 58 | StyleCopReport.xml 59 | 60 | # Files built by Visual Studio 61 | *_i.c 62 | *_p.c 63 | *_i.h 64 | *.ilk 65 | *.meta 66 | *.obj 67 | *.iobj 68 | *.pch 69 | *.pdb 70 | *.ipdb 71 | *.pgc 72 | *.pgd 73 | *.rsp 74 | *.sbr 75 | *.tlb 76 | *.tli 77 | *.tlh 78 | *.tmp 79 | *.tmp_proj 80 | *.log 81 | *.vspscc 82 | *.vssscc 83 | .builds 84 | *.pidb 85 | *.svclog 86 | *.scc 87 | 88 | # Chutzpah Test files 89 | _Chutzpah* 90 | 91 | # Visual C++ cache files 92 | ipch/ 93 | *.aps 94 | *.ncb 95 | *.opendb 96 | *.opensdf 97 | *.sdf 98 | *.cachefile 99 | *.VC.db 100 | *.VC.VC.opendb 101 | 102 | # Visual Studio profiler 103 | *.psess 104 | *.vsp 105 | *.vspx 106 | *.sap 107 | 108 | # Visual Studio Trace Files 109 | *.e2e 110 | 111 | # TFS 2012 Local Workspace 112 | $tf/ 113 | 114 | # Guidance Automation Toolkit 115 | *.gpState 116 | 117 | # ReSharper is a .NET coding 
add-in 118 | _ReSharper*/ 119 | *.[Rr]e[Ss]harper 120 | *.DotSettings.user 121 | 122 | # JustCode is a .NET coding add-in 123 | .JustCode 124 | 125 | # TeamCity is a build add-in 126 | _TeamCity* 127 | 128 | # DotCover is a Code Coverage Tool 129 | *.dotCover 130 | 131 | # AxoCover is a Code Coverage Tool 132 | .axoCover/* 133 | !.axoCover/settings.json 134 | 135 | # Visual Studio code coverage results 136 | *.coverage 137 | *.coveragexml 138 | 139 | # NCrunch 140 | _NCrunch_* 141 | .*crunch*.local.xml 142 | nCrunchTemp_* 143 | 144 | # MightyMoose 145 | *.mm.* 146 | AutoTest.Net/ 147 | 148 | # Web workbench (sass) 149 | .sass-cache/ 150 | 151 | # Installshield output folder 152 | [Ee]xpress/ 153 | 154 | # DocProject is a documentation generator add-in 155 | DocProject/buildhelp/ 156 | DocProject/Help/*.HxT 157 | DocProject/Help/*.HxC 158 | DocProject/Help/*.hhc 159 | DocProject/Help/*.hhk 160 | DocProject/Help/*.hhp 161 | DocProject/Help/Html2 162 | DocProject/Help/html 163 | 164 | # Click-Once directory 165 | publish/ 166 | 167 | # Publish Web Output 168 | *.[Pp]ublish.xml 169 | *.azurePubxml 170 | # Note: Comment the next line if you want to checkin your web deploy settings, 171 | # but database connection strings (with potential passwords) will be unencrypted 172 | *.pubxml 173 | *.publishproj 174 | 175 | # Microsoft Azure Web App publish settings. Comment the next line if you want to 176 | # checkin your Azure Web App publish settings, but sensitive information contained 177 | # in these scripts will be unencrypted 178 | PublishScripts/ 179 | 180 | # NuGet Packages 181 | *.nupkg 182 | # The packages folder can be ignored because of Package Restore 183 | **/[Pp]ackages/* 184 | # except build/, which is used as an MSBuild target. 
185 | !**/[Pp]ackages/build/ 186 | # Uncomment if necessary however generally it will be regenerated when needed 187 | #!**/[Pp]ackages/repositories.config 188 | # NuGet v3's project.json files produces more ignorable files 189 | *.nuget.props 190 | *.nuget.targets 191 | 192 | # Microsoft Azure Build Output 193 | csx/ 194 | *.build.csdef 195 | 196 | # Microsoft Azure Emulator 197 | ecf/ 198 | rcf/ 199 | 200 | # Windows Store app package directories and files 201 | AppPackages/ 202 | BundleArtifacts/ 203 | Package.StoreAssociation.xml 204 | _pkginfo.txt 205 | *.appx 206 | 207 | # Visual Studio cache files 208 | # files ending in .cache can be ignored 209 | *.[Cc]ache 210 | # but keep track of directories ending in .cache 211 | !*.[Cc]ache/ 212 | 213 | # Others 214 | ClientBin/ 215 | ~$* 216 | *~ 217 | *.dbmdl 218 | *.dbproj.schemaview 219 | *.jfm 220 | *.pfx 221 | *.publishsettings 222 | orleans.codegen.cs 223 | 224 | # Including strong name files can present a security risk 225 | # (https://github.com/github/gitignore/pull/2483#issue-259490424) 226 | #*.snk 227 | 228 | # Since there are multiple workflows, uncomment next line to ignore bower_components 229 | # (https://github.com/github/gitignore/pull/1529#issuecomment-104372622) 230 | #bower_components/ 231 | 232 | # RIA/Silverlight projects 233 | Generated_Code/ 234 | 235 | # Backup & report files from converting an old project file 236 | # to a newer Visual Studio version. 
Backup files are not needed, 237 | # because we have git ;-) 238 | _UpgradeReport_Files/ 239 | Backup*/ 240 | UpgradeLog*.XML 241 | UpgradeLog*.htm 242 | ServiceFabricBackup/ 243 | *.rptproj.bak 244 | 245 | # SQL Server files 246 | *.mdf 247 | *.ldf 248 | *.ndf 249 | 250 | # Business Intelligence projects 251 | *.rdl.data 252 | *.bim.layout 253 | *.bim_*.settings 254 | *.rptproj.rsuser 255 | 256 | # Microsoft Fakes 257 | FakesAssemblies/ 258 | 259 | # GhostDoc plugin setting file 260 | *.GhostDoc.xml 261 | 262 | # Node.js Tools for Visual Studio 263 | .ntvs_analysis.dat 264 | node_modules/ 265 | 266 | # Visual Studio 6 build log 267 | *.plg 268 | 269 | # Visual Studio 6 workspace options file 270 | *.opt 271 | 272 | # Visual Studio 6 auto-generated workspace file (contains which files were open etc.) 273 | *.vbw 274 | 275 | # Visual Studio LightSwitch build output 276 | **/*.HTMLClient/GeneratedArtifacts 277 | **/*.DesktopClient/GeneratedArtifacts 278 | **/*.DesktopClient/ModelManifest.xml 279 | **/*.Server/GeneratedArtifacts 280 | **/*.Server/ModelManifest.xml 281 | _Pvt_Extensions 282 | 283 | # Paket dependency manager 284 | .paket/paket.exe 285 | paket-files/ 286 | 287 | # FAKE - F# Make 288 | .fake/ 289 | 290 | # JetBrains Rider 291 | .idea/ 292 | *.sln.iml 293 | 294 | # CodeRush 295 | .cr/ 296 | 297 | # Python Tools for Visual Studio (PTVS) 298 | __pycache__/ 299 | *.pyc 300 | 301 | # Cake - Uncomment if you are using it 302 | # tools/** 303 | # !tools/packages.config 304 | 305 | # Tabs Studio 306 | *.tss 307 | 308 | # Telerik's JustMock configuration file 309 | *.jmconfig 310 | 311 | # BizTalk build output 312 | *.btp.cs 313 | *.btm.cs 314 | *.odx.cs 315 | *.xsd.cs 316 | 317 | # OpenCover UI analysis results 318 | OpenCover/ 319 | 320 | # Azure Stream Analytics local run output 321 | ASALocalRun/ 322 | 323 | # MSBuild Binary and Structured Log 324 | *.binlog 325 | 326 | # NVidia Nsight GPU debugger configuration file 327 | *.nvuser 328 | 329 | # MFractors 
(Xamarin productivity tool) working folder 330 | .mfractor/ 331 | -------------------------------------------------------------------------------- /AzureDataFactory/README.md: -------------------------------------------------------------------------------- 1 | 2 | # OBSOLETE 3 | 4 | For information on how to use Azure Data Factory mapping data flows to read and write CDM entity data, see [Using ADF Mapping Data Flows with CDM.pdf](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/Using%20ADF%20Mapping%20Data%20Flows%20with%20CDM.pdf) 5 | 6 | --- 7 | 8 | 9 | 10 | # Use Azure Data Factory to load data from a CDM folder into SQL Data Warehouse 11 | 12 | This directory contains the usage details, samples and library to orchestrate your entire workflow with an Azure Data Factory pipeline. 13 | 14 | ## Content 15 | * *adf-arm-template-cdm-to-dw* - deploy this ARM template for the data factory and all its entities for the workflow that copies data from the new CDM folder and loads it into DW 16 | * *adf-arm-template-databricks-cdm-to-dw/* - deploy this ARM template for the data factory and all its entities if you wish to orchestrate the _entire_ Azure Data Services flow. This pipeline invokes the Databricks data preparation notebook and then the pipeline that copies data from the new CDM folder and loads it into DW 17 | 18 | * *arm-template-azure-function-app* - template to deploy the function app you will need to host your Azure function. 19 | * *azure-function-zip* - you will need to deploy this Azure function, which reads the entity definitions and translates them into SQL scripts that create the staging tables. It is invoked by the data factory pipeline. See "Parse.cs" for the code the function executes. 
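
The last item in the list above describes a function that reads entity definitions and emits SQL to create staging tables. The sketch below illustrates that idea in Python; it is not the code in "Parse.cs". It assumes a `model.json` document whose `entities` each carry a `name` and a list of `attributes` with `name` and `dataType`. The type mapping in `ASSUMED_SQL_TYPES`, the `staging` schema name, and the helper names `staging_table_sql` and `staging_scripts` are all hypothetical and chosen only for illustration.

```python
import json

# Assumed mapping from model.json data types (lower-cased) to SQL column types.
# The real parser may choose different types; this table is illustrative only.
ASSUMED_SQL_TYPES = {
    "string": "NVARCHAR(4000)",
    "int64": "BIGINT",
    "double": "FLOAT",
    "decimal": "DECIMAL(18, 4)",
    "boolean": "BIT",
    "datetime": "DATETIME2",
    "datetimeoffset": "DATETIMEOFFSET",
    "guid": "UNIQUEIDENTIFIER",
}


def staging_table_sql(entity, schema="staging"):
    """Build a CREATE TABLE statement for one entity definition."""
    columns = []
    for attribute in entity["attributes"]:
        data_type = attribute.get("dataType", "string").lower()
        # Unknown types fall back to a wide string column (an assumption).
        sql_type = ASSUMED_SQL_TYPES.get(data_type, "NVARCHAR(4000)")
        columns.append(f"    [{attribute['name']}] {sql_type} NULL")
    body = ",\n".join(columns)
    return f"CREATE TABLE [{schema}].[{entity['name']}] (\n{body}\n);"


def staging_scripts(model_json_text):
    """Return one CREATE TABLE script per entity that declares attributes."""
    model = json.loads(model_json_text)
    return [
        staging_table_sql(entity)
        for entity in model.get("entities", [])
        if entity.get("attributes")
    ]
```

Entities without an `attributes` list (for example, references to entities defined in another model) are skipped here, since there is nothing to turn into columns.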
20 | -------------------------------------------------------------------------------- /AzureDataFactory/adf-arm-template-cdm-to-dw/arm_template_parameters.json: -------------------------------------------------------------------------------- 1 | { 2 | "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#", 3 | "contentVersion": "1.0.0.0", 4 | "parameters": { 5 | "factoryName": { 6 | "value": "CdmToSqlDw" 7 | }, 8 | "AzureStorage_connectionString": { 9 | "value": "" 10 | }, 11 | "AzureSqlDW_connectionString": { 12 | "value": "" 13 | }, 14 | "ADLSGen2_url": { 15 | "value": "" 16 | }, 17 | "ADLSGen2_accountKey": { 18 | "value": "" 19 | } 20 | } 21 | } -------------------------------------------------------------------------------- /AzureDataFactory/adf-arm-template-databricks-cdm-to-dw/arm_template_parameters.json: -------------------------------------------------------------------------------- 1 | { 2 | "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#", 3 | "contentVersion": "1.0.0.0", 4 | "parameters": { 5 | "factoryName": { 6 | "value": "" 7 | }, 8 | "AzureSqlDW_connectionString": { 9 | "value": "" 10 | }, 11 | "ADLSGen2_accountKey": { 12 | "value": "" 13 | }, 14 | "AzureStorage_connectionString": { 15 | "value": "" 16 | }, 17 | "CDMParser_URL": { 18 | "value": "https://.azurewebsites.net/api/Parse" 19 | }, 20 | "AzureDatabricks_domain": { 21 | "value": "" 22 | }, 23 | "AzureDatabricks_accessToken": { 24 | "value": "" 25 | }, 26 | "ADLSGen2_url": { 27 | "value": "" 28 | } 29 | } 30 | } -------------------------------------------------------------------------------- /AzureDataFactory/arm-template-azure-function-app/DeploymentHelper.cs: -------------------------------------------------------------------------------- 1 | // Requires the following Azure NuGet packages and related dependencies: 2 | // package id="Microsoft.Azure.Management.Authorization" version="2.0.0" 3 | // package 
id="Microsoft.Azure.Management.ResourceManager" version="1.4.0-preview" 4 | // package id="Microsoft.Rest.ClientRuntime.Azure.Authentication" version="2.2.8-preview" 5 | 6 | using Microsoft.Azure.Management.ResourceManager; 7 | using Microsoft.Azure.Management.ResourceManager.Models; 8 | using Microsoft.Rest.Azure.Authentication; 9 | using Newtonsoft.Json; 10 | using Newtonsoft.Json.Linq; 11 | using System; 12 | using System.IO; 13 | 14 | namespace PortalGenerated 15 | { 16 | /// 17 | /// This is a helper class for deploying an Azure Resource Manager template 18 | /// More info about template deployments can be found here https://go.microsoft.com/fwLink/?LinkID=733371 19 | /// 20 | class DeploymentHelper 21 | { 22 | string subscriptionId = "your-subscription-id"; 23 | string clientId = "your-service-principal-clientId"; 24 | string clientSecret = "your-service-principal-client-secret"; 25 | string resourceGroupName = "resource-group-name"; 26 | string deploymentName = "deployment-name"; 27 | string resourceGroupLocation = "resource-group-location"; // must be specified for creating a new resource group 28 | string pathToTemplateFile = "path-to-template.json-on-disk"; 29 | string pathToParameterFile = "path-to-parameters.json-on-disk"; 30 | string tenantId = "tenant-id"; 31 | 32 | public async void Run() 33 | { 34 | // Try to obtain the service credentials 35 | var serviceCreds = await ApplicationTokenProvider.LoginSilentAsync(tenantId, clientId, clientSecret); 36 | 37 | // Read the template and parameter file contents 38 | JObject templateFileContents = GetJsonFileContents(pathToTemplateFile); 39 | JObject parameterFileContents = GetJsonFileContents(pathToParameterFile); 40 | 41 | // Create the resource manager client 42 | var resourceManagementClient = new ResourceManagementClient(serviceCreds); 43 | resourceManagementClient.SubscriptionId = subscriptionId; 44 | 45 | // Create or check that resource group exists 46 | 
EnsureResourceGroupExists(resourceManagementClient, resourceGroupName, resourceGroupLocation); 47 | 48 | // Start a deployment 49 | DeployTemplate(resourceManagementClient, resourceGroupName, deploymentName, templateFileContents, parameterFileContents); 50 | } 51 | 52 | /// 53 | /// Reads a JSON file from the specified path 54 | /// 55 | /// The full path to the JSON file 56 | /// The JSON file contents 57 | private JObject GetJsonFileContents(string pathToJson) 58 | { 59 | JObject templatefileContent = new JObject(); 60 | using (StreamReader file = File.OpenText(pathToJson)) 61 | { 62 | using (JsonTextReader reader = new JsonTextReader(file)) 63 | { 64 | templatefileContent = (JObject)JToken.ReadFrom(reader); 65 | return templatefileContent; 66 | } 67 | } 68 | } 69 | 70 | /// 71 | /// Ensures that a resource group with the specified name exists. If it does not, will attempt to create one. 72 | /// 73 | /// The resource manager client. 74 | /// The name of the resource group. 75 | /// The resource group location. Required when creating a new resource group. 76 | private static void EnsureResourceGroupExists(ResourceManagementClient resourceManagementClient, string resourceGroupName, string resourceGroupLocation) 77 | { 78 | if (resourceManagementClient.ResourceGroups.CheckExistence(resourceGroupName) != true) 79 | { 80 | Console.WriteLine(string.Format("Creating resource group '{0}' in location '{1}'", resourceGroupName, resourceGroupLocation)); 81 | var resourceGroup = new ResourceGroup(); 82 | resourceGroup.Location = resourceGroupLocation; 83 | resourceManagementClient.ResourceGroups.CreateOrUpdate(resourceGroupName, resourceGroup); 84 | } 85 | else 86 | { 87 | Console.WriteLine(string.Format("Using existing resource group '{0}'", resourceGroupName)); 88 | } 89 | } 90 | 91 | /// 92 | /// Starts a template deployment. 93 | /// 94 | /// The resource manager client. 95 | /// The name of the resource group. 96 | /// The name of the deployment. 
97 | /// The template file contents. 98 | /// The parameter file contents. 99 | private static void DeployTemplate(ResourceManagementClient resourceManagementClient, string resourceGroupName, string deploymentName, JObject templateFileContents, JObject parameterFileContents) 100 | { 101 | Console.WriteLine(string.Format("Starting template deployment '{0}' in resource group '{1}'", deploymentName, resourceGroupName)); 102 | var deployment = new Deployment(); 103 | 104 | deployment.Properties = new DeploymentProperties 105 | { 106 | Mode = DeploymentMode.Incremental, 107 | Template = templateFileContents, 108 | Parameters = parameterFileContents["parameters"].ToObject<JObject>() 109 | }; 110 | 111 | var deploymentResult = resourceManagementClient.Deployments.CreateOrUpdate(resourceGroupName, deploymentName, deployment); 112 | Console.WriteLine(string.Format("Deployment status: {0}", deploymentResult.Properties.ProvisioningState)); 113 | } 114 | } 115 | } -------------------------------------------------------------------------------- /AzureDataFactory/arm-template-azure-function-app/deploy.ps1: -------------------------------------------------------------------------------- 1 | <# 2 | .SYNOPSIS 3 | Deploys a template to Azure 4 | 5 | .DESCRIPTION 6 | Deploys an Azure Resource Manager template 7 | 8 | .PARAMETER subscriptionId 9 | The subscription id where the template will be deployed. 10 | 11 | .PARAMETER resourceGroupName 12 | The resource group where the template will be deployed. Can be the name of an existing or a new resource group. 13 | 14 | .PARAMETER resourceGroupLocation 15 | Optional, a resource group location. If specified, will try to create a new resource group in this location. If not specified, assumes resource group is existing. 16 | 17 | .PARAMETER deploymentName 18 | The deployment name. 19 | 20 | .PARAMETER templateFilePath 21 | Optional, path to the template file. Defaults to template.json. 
22 | 23 | .PARAMETER parametersFilePath 24 | Optional, path to the parameters file. Defaults to parameters.json. If file is not found, will prompt for parameter values based on template. 25 | #> 26 | 27 | param( 28 | [Parameter(Mandatory=$True)] 29 | [string] 30 | $subscriptionId, 31 | 32 | [Parameter(Mandatory=$True)] 33 | [string] 34 | $resourceGroupName, 35 | 36 | [string] 37 | $resourceGroupLocation, 38 | 39 | [Parameter(Mandatory=$True)] 40 | [string] 41 | $deploymentName, 42 | 43 | [string] 44 | $templateFilePath = "template.json", 45 | 46 | [string] 47 | $parametersFilePath = "parameters.json" 48 | ) 49 | 50 | <# 51 | .SYNOPSIS 52 | Registers RPs 53 | #> 54 | Function RegisterRP { 55 | Param( 56 | [string]$ResourceProviderNamespace 57 | ) 58 | 59 | Write-Host "Registering resource provider '$ResourceProviderNamespace'"; 60 | Register-AzureRmResourceProvider -ProviderNamespace $ResourceProviderNamespace; 61 | } 62 | 63 | #****************************************************************************** 64 | # Script body 65 | # Execution begins here 66 | #****************************************************************************** 67 | $ErrorActionPreference = "Stop" 68 | 69 | # sign in 70 | $context = Get-AzureRmContext; 71 | if(!$context.Account) { 72 | Write-Host "Logging in..."; 73 | Login-AzureRmAccount; 74 | } 75 | 76 | # select subscription 77 | Write-Host "Selecting subscription '$subscriptionId'"; 78 | Select-AzureRmSubscription -SubscriptionID $subscriptionId; 79 | 80 | # Register RPs 81 | $resourceProviders = @("microsoft.storage","microsoft.web"); 82 | if($resourceProviders.length) { 83 | Write-Host "Registering resource providers" 84 | foreach($resourceProvider in $resourceProviders) { 85 | RegisterRP($resourceProvider); 86 | } 87 | } 88 | 89 | #Create or check for existing resource group 90 | $resourceGroup = Get-AzureRmResourceGroup -Name $resourceGroupName -ErrorAction SilentlyContinue 91 | if(!$resourceGroup) { 92 | Write-Host "Resource group 
'$resourceGroupName' does not exist. To create a new resource group, please enter a location."; 93 | if(!$resourceGroupLocation) { 94 | $resourceGroupLocation = Read-Host "resourceGroupLocation"; 95 | } 96 | Write-Host "Creating resource group '$resourceGroupName' in location '$resourceGroupLocation'"; 97 | New-AzureRmResourceGroup -Name $resourceGroupName -Location $resourceGroupLocation 98 | } 99 | else{ 100 | Write-Host "Using existing resource group '$resourceGroupName'"; 101 | } 102 | 103 | # Start the deployment 104 | Write-Host "Starting deployment..."; 105 | if($parametersFilePath -and (Test-Path $parametersFilePath)) { 106 | New-AzureRmResourceGroupDeployment -ResourceGroupName $resourceGroupName -TemplateFile $templateFilePath -TemplateParameterFile $parametersFilePath; 107 | } else { 108 | New-AzureRmResourceGroupDeployment -ResourceGroupName $resourceGroupName -TemplateFile $templateFilePath; 109 | } -------------------------------------------------------------------------------- /AzureDataFactory/arm-template-azure-function-app/deploy.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | set -euo pipefail 3 | IFS=$'\n\t' 4 | 5 | # -e: immediately exit if any command has a non-zero exit status 6 | # -o: prevents errors in a pipeline from being masked 7 | # IFS new value is less likely to cause confusing bugs when looping arrays or arguments (e.g. 
$@) 8 | 9 | usage() { echo "Usage: $0 -i -g -n -l " 1>&2; exit 1; } 10 | 11 | declare subscriptionId="" 12 | declare resourceGroupName="" 13 | declare deploymentName="" 14 | declare resourceGroupLocation="" 15 | 16 | # Initialize parameters specified from command line 17 | while getopts ":i:g:n:l:" arg; do 18 | case "${arg}" in 19 | i) 20 | subscriptionId=${OPTARG} 21 | ;; 22 | g) 23 | resourceGroupName=${OPTARG} 24 | ;; 25 | n) 26 | deploymentName=${OPTARG} 27 | ;; 28 | l) 29 | resourceGroupLocation=${OPTARG} 30 | ;; 31 | esac 32 | done 33 | shift $((OPTIND-1)) 34 | 35 | #Prompt for parameters if some required parameters are missing 36 | if [[ -z "$subscriptionId" ]]; then 37 | echo "Your subscription ID can be looked up with the CLI using: az account show --out json " 38 | echo "Enter your subscription ID:" 39 | read subscriptionId 40 | [[ "${subscriptionId:?}" ]] 41 | fi 42 | 43 | if [[ -z "$resourceGroupName" ]]; then 44 | echo "This script will look for an existing resource group, otherwise a new one will be created " 45 | echo "You can create new resource groups with the CLI using: az group create " 46 | echo "Enter a resource group name" 47 | read resourceGroupName 48 | [[ "${resourceGroupName:?}" ]] 49 | fi 50 | 51 | if [[ -z "$deploymentName" ]]; then 52 | echo "Enter a name for this deployment:" 53 | read deploymentName 54 | fi 55 | 56 | if [[ -z "$resourceGroupLocation" ]]; then 57 | echo "If creating a *new* resource group, you need to set a location " 58 | echo "You can look up locations with the CLI using: az account list-locations " 59 | 60 | echo "Enter resource group location:" 61 | read resourceGroupLocation 62 | fi 63 | 64 | #templateFile Path - template file to be used 65 | templateFilePath="template.json" 66 | 67 | if [ ! -f "$templateFilePath" ]; then 68 | echo "$templateFilePath not found" 69 | exit 1 70 | fi 71 | 72 | #parameter file path 73 | parametersFilePath="parameters.json" 74 | 75 | if [ ! 
-f "$parametersFilePath" ]; then 76 | echo "$parametersFilePath not found" 77 | exit 1 78 | fi 79 | 80 | if [ -z "$subscriptionId" ] || [ -z "$resourceGroupName" ] || [ -z "$deploymentName" ]; then 81 | echo "Either one of subscriptionId, resourceGroupName, deploymentName is empty" 82 | usage 83 | fi 84 | 85 | #login to azure using your credentials 86 | az account show 1> /dev/null 87 | 88 | if [ $? != 0 ]; 89 | then 90 | az login 91 | fi 92 | 93 | #set the default subscription id 94 | az account set --subscription $subscriptionId 95 | 96 | set +e 97 | 98 | #Check for existing RG 99 | az group show --name $resourceGroupName 1> /dev/null 100 | 101 | if [ $? != 0 ]; then 102 | echo "Resource group with name" $resourceGroupName "could not be found. Creating new resource group.." 103 | set -e 104 | ( 105 | set -x 106 | az group create --name $resourceGroupName --location $resourceGroupLocation 1> /dev/null 107 | ) 108 | else 109 | echo "Using existing resource group..." 110 | fi 111 | 112 | #Start deployment 113 | echo "Starting deployment..." 114 | ( 115 | set -x 116 | az group deployment create --name "$deploymentName" --resource-group "$resourceGroupName" --template-file "$templateFilePath" --parameters "@${parametersFilePath}" 117 | ) 118 | 119 | if [ $? == 0 ]; 120 | then 121 | echo "Template has been successfully deployed" 122 | fi 123 | -------------------------------------------------------------------------------- /AzureDataFactory/arm-template-azure-function-app/deployer.rb: -------------------------------------------------------------------------------- 1 | require 'azure_mgmt_resources' 2 | 3 | class Deployer 4 | 5 | # Initialize the deployer class with subscription, resource group and resource group location. The class will raise an 6 | # ArgumentError if there are empty values for Tenant Id, Client Id or Client Secret environment variables. 
7 | # 8 | # @param [String] subscription_id the subscription to deploy the template 9 | # @param [String] resource_group the resource group to create or update and then deploy the template 10 | # @param [String] resource_group_location the location of the resource group 11 | def initialize(subscription_id, resource_group, resource_group_location) 12 | raise ArgumentError.new("Missing template file 'template.json' in current directory.") unless File.exist?('template.json') 13 | raise ArgumentError.new("Missing parameters file 'parameters.json' in current directory.") unless File.exist?('parameters.json') 14 | @resource_group = resource_group 15 | @subscription_id = subscription_id 16 | @resource_group_location = resource_group_location 17 | provider = MsRestAzure::ApplicationTokenProvider.new( 18 | ENV['AZURE_TENANT_ID'], 19 | ENV['AZURE_CLIENT_ID'], 20 | ENV['AZURE_CLIENT_SECRET']) 21 | credentials = MsRest::TokenCredentials.new(provider) 22 | @client = Azure::ARM::Resources::ResourceManagementClient.new(credentials) 23 | @client.subscription_id = @subscription_id 24 | end 25 | 26 | # Deploy the template to a resource group 27 | def deploy 28 | # ensure the resource group is created 29 | params = Azure::ARM::Resources::Models::ResourceGroup.new.tap do |rg| 30 | rg.location = @resource_group_location 31 | end 32 | @client.resource_groups.create_or_update(@resource_group, params).value! 
33 | 34 | # build the deployment from a json file template from parameters 35 | template = File.read(File.expand_path(File.join(__dir__, 'template.json'))) 36 | deployment = Azure::ARM::Resources::Models::Deployment.new 37 | deployment.properties = Azure::ARM::Resources::Models::DeploymentProperties.new 38 | deployment.properties.template = JSON.parse(template) 39 | deployment.properties.mode = Azure::ARM::Resources::Models::DeploymentMode::Incremental 40 | 41 | # build the deployment template parameters from Hash to {key: {value: value}} format 42 | deploy_params = File.read(File.expand_path(File.join(__dir__, 'parameters.json'))) 43 | deployment.properties.parameters = JSON.parse(deploy_params)["parameters"] 44 | 45 | # put the deployment to the resource group 46 | @client.deployments.create_or_update(@resource_group, 'azure-sample', deployment) 47 | end 48 | end 49 | 50 | # Get user inputs and execute the script 51 | if(ARGV.empty?) 52 | puts "Please specify subscriptionId resourceGroupName resourceGroupLocation as command line arguments" 53 | exit 54 | end 55 | 56 | subscription_id = ARGV[0] # Azure Subscription Id 57 | resource_group = ARGV[1] # The resource group for deployment 58 | resource_group_location = ARGV[2] # The resource group location 59 | 60 | msg = "\nInitializing the Deployer class with subscription id: #{subscription_id}, resource group: #{resource_group}" 61 | msg += "\nand resource group location: #{resource_group_location}...\n\n" 62 | puts msg 63 | 64 | # Initialize the deployer class 65 | deployer = Deployer.new(subscription_id, resource_group, resource_group_location) 66 | 67 | puts "Beginning the deployment... \n\n" 68 | # Deploy the template 69 | deployment = deployer.deploy 70 | 71 | puts "Done deploying!!" 
--------------------------------------------------------------------------------
/AzureDataFactory/arm-template-azure-function-app/parameters.json:
--------------------------------------------------------------------------------
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "sites_cdmparser_name": {
      "value": "",
      "metadata": {
        "description": "Globally unique name that identifies your new function app. Valid characters are a-z, 0-9, and -."
      }
    },
    "serverfarms_cdmparserPlan_name": {
      "value": "",
      "metadata": {
        "description": "The Azure function app runs on an App Service Plan. This is a unique name for the name of your app service plan."
      }
    },
    "storageAccounts_cdmparserstorage_name": {
      "value": "",
      "metadata": {
        "description": "Name of the storage account used by your function app. You can create a new one or use an existing one. Must be lowercase alphanumeric characters, globally unique, and between 3 and 24 characters in length."
      }
    },
    "hostNameBindings_cdmparser.azurewebsites.net_name": {
      "value": "",
      "metadata": {
        "description": "This will be in format of '.azurewebsites.net'"
      }
    }
  }
}
--------------------------------------------------------------------------------
/AzureDataFactory/arm-template-azure-function-app/template.json:
--------------------------------------------------------------------------------
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "sites_cdmparser_name": {
      "type": "string",
      "defaultValue": "[concat('CDM-', uniqueString(subscription().subscriptionId, resourceGroup().id), '-Site')]"
    },
    "serverfarms_cdmparserPlan_name": {
      "type": "string",
      "defaultValue": "[concat('CDM-', uniqueString(subscription().subscriptionId, resourceGroup().id), '-Plan')]"
    },
    "storageAccounts_cdmparserstorage_name": {
      "type": "string",
      "defaultValue": "[concat('cdm', uniqueString(subscription().subscriptionId, resourceGroup().id), 'stg')]"
    },
    "hostNameBindings_cdmparser.azurewebsites.net_name": {
      "type": "string",
      "defaultValue": "[concat(parameters('sites_cdmparser_name'), '.azurewebsites.net')]"
    }
  },
  "variables": {},
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "sku": {
        "name": "Standard_LRS",
        "tier": "Standard"
      },
      "kind": "Storage",
      "name": "[parameters('storageAccounts_cdmparserstorage_name')]",
      "apiVersion": "2018-07-01",
      "location": "centralus",
      "tags": {},
      "scale": null,
      "properties": {
        "networkAcls": {
          "bypass": "AzureServices",
          "virtualNetworkRules": [],
          "ipRules": [],
          "defaultAction": "Allow"
        },
        "supportsHttpsTrafficOnly": false,
        "encryption": {
          "services": {
            "file": {
              "enabled": true
            },
            "blob": {
              "enabled": true
            }
          },
          "keySource": "Microsoft.Storage"
        }
      },
      "dependsOn": []
    },
    {
      "type": "Microsoft.Web/serverfarms",
      "sku": {
        "name": "Y1",
        "tier": "Dynamic",
        "size": "Y1",
        "family": "Y",
        "capacity": 0
      },
      "kind": "functionapp",
      "name": "[parameters('serverfarms_cdmparserPlan_name')]",
      "apiVersion": "2016-09-01",
      "location": "West US",
      "scale": null,
      "properties": {
        "name": "[parameters('serverfarms_cdmparserPlan_name')]",
        "workerTierName": null,
        "adminSiteName": null,
        "hostingEnvironmentProfile": null,
        "perSiteScaling": false,
        "reserved": false,
        "targetWorkerCount": 0,
        "targetWorkerSizeId": 0
      },
      "dependsOn": []
    },
    {
      "type": "Microsoft.Web/sites",
      "kind": "functionapp",
      "name": "[parameters('sites_cdmparser_name')]",
      "apiVersion": "2016-08-01",
      "location": "West US",
      "identity": {
        "type": "SystemAssigned"
      },
      "tags": {},
      "scale": null,
      "properties": {
        "enabled": true,
        "hostNameSslStates": [
          {
            "name": "[concat(parameters('sites_cdmparser_name'),'.azurewebsites.net')]",
            "sslState": "Disabled",
            "virtualIP": null,
            "thumbprint": null,
            "toUpdate": null,
            "hostType": "Standard"
          },
          {
            "name": "[concat(parameters('sites_cdmparser_name'),'.scm.azurewebsites.net')]",
            "sslState": "Disabled",
            "virtualIP": null,
            "thumbprint": null,
            "toUpdate": null,
            "hostType": "Repository"
          }
        ],
        "serverFarmId": "[resourceId('Microsoft.Web/serverfarms', parameters('serverfarms_cdmparserPlan_name'))]",
        "reserved": false,
        "siteConfig": null,
        "scmSiteAlsoStopped": false,
        "hostingEnvironmentProfile": null,
        "clientAffinityEnabled": true,
        "clientCertEnabled": false,
        "hostNamesDisabled": false,
        "containerSize": 1536,
        "dailyMemoryTimeQuota": 0,
        "cloningInfo": null,
        "httpsOnly": false
      },
      "dependsOn": [
        "[resourceId('Microsoft.Web/serverfarms', parameters('serverfarms_cdmparserPlan_name'))]"
      ]
    },
    {
      "type": "Microsoft.Web/sites/hostNameBindings",
      "name": "[concat(parameters('sites_cdmparser_name'), '/', parameters('hostNameBindings_cdmparser.azurewebsites.net_name'))]",
      "apiVersion": "2016-08-01",
      "location": "West US",
      "scale": null,
      "properties": {
        "siteName": "cdmparser",
        "domainId": null,
        "hostNameType": "Verified"
      },
      "dependsOn": [
        "[resourceId('Microsoft.Web/sites', parameters('sites_cdmparser_name'))]"
      ]
    }
  ]
}
--------------------------------------------------------------------------------
/AzureDataFactory/azure-function-zip/Parse.cs:
--------------------------------------------------------------------------------
//
// Copyright (c) Microsoft. All rights reserved.
//

using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.Http.Formatting;
using System.Text;
using System.Threading.Tasks;
using System.Web;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.CdmFolders.ObjectModel;
using Newtonsoft.Json;
using Newtonsoft.Json.Converters;
using Newtonsoft.Json.Linq;

namespace cdmtosqldw
{
    public static class Parse
    {
        [FunctionName("Parse")]
        public static async Task<HttpResponseMessage> Run([HttpTrigger(AuthorizationLevel.Anonymous, "post", Route = null)]HttpRequestMessage req, TraceWriter log)
        {
            string requestContent = await req.Content.ReadAsStringAsync();
            JObject payload = JObject.Parse(requestContent);

            string modelJSON = payload["model"].ToString();
            JObject dataTypeMap = payload["dataTypeMap"] as JObject;

            JArray results = new JArray();
            Model model = new Model();
            try
            {
                model.FromJson(modelJSON);
                model.ValidateModel();

                foreach (LocalEntity entity in model.Entities)
                {
                    JObject entityJObject = new JObject();

                    string createTableQuery = "Create Table {0} ({1})";
                    StringBuilder columns = new StringBuilder();
                    string tableName = string.Format("[{0}]", entity.Name.Replace(" ", string.Empty));

                    // Look up the entity's modifiedTime in the raw payload and normalize it to UTC.
                    string modifiedTimeJPath = string.Format("$..entities[?(@.name == '{0}')].modifiedTime", entity.Name);
                    IsoDateTimeConverter dateTimeConvertor = new IsoDateTimeConverter()
                    {
                        DateTimeStyles = System.Globalization.DateTimeStyles.AdjustToUniversal,
                    };

                    string entityModifiedTime = payload.SelectToken(modifiedTimeJPath)?.ToString(Formatting.None, dateTimeConvertor).Replace("\"", "");

                    // Map every CDM attribute type to a SQL DW column type using the caller-supplied map.
                    JArray tableStructure = new JArray();
                    foreach (var attr in entity.Attributes)
                    {
                        string columnType = attr.DataType.ToString();
                        JToken sqlDwTypeToken = null;
                        if(!dataTypeMap.TryGetValue(columnType, StringComparison.OrdinalIgnoreCase, out sqlDwTypeToken))
                        {
                            throw new Exception($"No type mapping found for {columnType}");
                        }
                        string sqlDwType = sqlDwTypeToken.ToString();
                        string column = string.Format("[{0}] {1},", attr.Name, sqlDwType);
                        columns.Append(column);

                        JObject destColumnStructure = new JObject();
                        destColumnStructure.Add("name", attr.Name);
                        destColumnStructure.Add("type", columnType);
                        tableStructure.Add(destColumnStructure);
                    }

                    string columnsQuery = columns.ToString();
                    createTableQuery = string.Format(createTableQuery, tableName, columnsQuery.TrimEnd(','));

                    // Split each partition location into a folder path and a file name.
                    JArray dataFileLocationsArray = new JArray();
                    foreach (var partition in entity.Partitions)
                    {
                        string relativePath = HttpUtility.UrlDecode(partition.Location.AbsolutePath);
                        string folderPath = relativePath.Substring(1, relativePath.LastIndexOf('/'));
                        string filePath = relativePath.Substring(relativePath.LastIndexOf('/') + 1);

                        JObject datafileLocation = new JObject();
                        datafileLocation.Add("folderPath", folderPath);
                        datafileLocation.Add("filePath", filePath);
                        datafileLocation.Add("refreshTime", partition.RefreshTime.Value.DateTime.ToUniversalTime());
                        dataFileLocationsArray.Add(datafileLocation);
                    }

                    entityJObject.Add("name", entity.Name);
                    entityJObject.Add("tableName", tableName);
                    entityJObject.Add("modifiedTime", entityModifiedTime);
                    entityJObject.Add("tableStructure", tableStructure);
                    entityJObject.Add("query", createTableQuery);
                    entityJObject.Add("datafileLocations", dataFileLocationsArray);

                    results.Add(entityJObject);
                }

                JObject response = new JObject();
                response.Add("result", results);

                return req.CreateResponse(HttpStatusCode.OK, response);
            }
            catch (Exception e)
            {
                HttpResponseMessage responseMessage = req.CreateResponse(HttpStatusCode.BadRequest, string.Format("Failed to Parse Model. Error {0}", e.Message));
                return responseMessage;
            }
        }
    }
}

--------------------------------------------------------------------------------
/AzureDataFactory/azure-function-zip/cdmtodwparser.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/AzureDataFactory/azure-function-zip/cdmtodwparser.zip
--------------------------------------------------------------------------------
/AzureDatabricks/Library/spark-cdm-assembly-0.3.jar:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/AzureDatabricks/Library/spark-cdm-assembly-0.3.jar
--------------------------------------------------------------------------------
/AzureDatabricks/README.md:
--------------------------------------------------------------------------------
# Reading and writing CDM folders using Azure Databricks

This directory contains the usage details, samples, and library for reading and writing CDM folders with Spark data sources on Azure Databricks.
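
The sample notebook in this repository passes the same handful of options to every `com.microsoft.cdm` read and write. The following is a minimal sketch, assuming only the option names that appear in that notebook (`cdmModel`, `cdmFolder`, `cdmModelName`, `entity`, `appId`, `appKey`, `tenantId`), of collecting them into plain dictionaries. The helper names `cdm_auth_options`, `cdm_read_options`, `cdm_write_options`, and `apply_options` are hypothetical and are not part of the library.

```python
# Data source name used by the sample notebook.
CDM_FORMAT = "com.microsoft.cdm"


def cdm_auth_options(app_id, app_key, tenant_id):
    """Hypothetical helper: the three credential options the sample passes."""
    return {"appId": app_id, "appKey": app_key, "tenantId": tenant_id}


def cdm_read_options(model_json_url, entity, auth):
    """Hypothetical helper: options the sample uses to read one entity."""
    options = {"cdmModel": model_json_url, "entity": entity}
    options.update(auth)
    return options


def cdm_write_options(cdm_folder_url, model_name, entity, auth):
    """Hypothetical helper: options the sample uses to write one entity."""
    options = {
        "cdmFolder": cdm_folder_url,
        "cdmModelName": model_name,
        "entity": entity,
    }
    options.update(auth)
    return options


def apply_options(builder, options):
    """Call builder.option(key, value) for each entry and return the result.

    `builder` is anything with a chainable .option(key, value) method,
    such as a Spark DataFrameReader or DataFrameWriter.
    """
    for key, value in options.items():
        builder = builder.option(key, value)
    return builder
```

On a cluster with the library installed, `apply_options(spark.read.format(CDM_FORMAT), cdm_read_options(...)).load()` is intended to be equivalent to the chained `.option(...)` calls in the sample, although that has not been verified here.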

## Installing the library on Azure Databricks to read and write CDM folders
--------------------------------------------------------------------------------
/AzureDatabricks/Samples/read-write-demo-wide-world-importers.py:
--------------------------------------------------------------------------------
# Databricks notebook source
# MAGIC %md
# MAGIC
# MAGIC # Summary
# MAGIC
# MAGIC ### This notebook reads a CDM folder, applies transformations to some of the entities and then writes out all entities including the modified ones to a new CDM folder

# COMMAND ----------

# MAGIC %md
# MAGIC
# MAGIC # Define input variables

# COMMAND ----------

# Take as inputs the CDM folder locations. Switch to defaults if no inputs are specified

# Get values from the widgets if specified
dbutils.widgets.text("inputCDMFolderLocation", "", "InputCDMFolderLocation")
dbutils.widgets.text("outputCDMFolderLocation", "","OutputCDMFolderLocation")
inputLocation = dbutils.widgets.get("inputCDMFolderLocation")
outputLocation = dbutils.widgets.get("outputCDMFolderLocation")

# Default values if no values specified in widgets. Replace and with your values
if inputLocation == '':
    inputLocation = "https://.dfs.core.windows.net/powerbi//WideWorldImporters-Sales/model.json"

if outputLocation == '':
    outputLocation = "https://.dfs.core.windows.net/powerbi//WideWorldImporters-Sales-PrepTest"

# Parameters to authenticate to ADLS Gen 2. Replace with the Azure Key Vault-backed secret scope that you created. Refer to
# https://docs.azuredatabricks.net/user-guide/secrets/index.html for instructions
appID = dbutils.secrets.get(scope = "", key = "appID")
appKey = dbutils.secrets.get(scope = "", key = "appKey")
tenantID = dbutils.secrets.get(scope = "", key = "tenantID")

# Alternatively, specify the credentials in the notebook but that is not recommended
#appID = ""
#appKey = ""
#tenantID = ""


# COMMAND ----------

# MAGIC %md
# MAGIC
# MAGIC # Read entities from input CDM folder

# COMMAND ----------

salesOrderDf = (spark.read.format("com.microsoft.cdm")
    .option("cdmModel", inputLocation)
    .option("entity", "Sales Orders")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .load())

# COMMAND ----------

salesOrderLinesDf = (spark.read.format("com.microsoft.cdm")
    .option("cdmModel", inputLocation)
    .option("entity", "Sales OrderLines")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .load())

# COMMAND ----------

salesCustomerDf = (spark.read.format("com.microsoft.cdm")
    .option("cdmModel", inputLocation)
    .option("entity", "Sales Customers")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .load())

# COMMAND ----------

salesCustomerCategoriesDf = (spark.read.format("com.microsoft.cdm")
    .option("cdmModel", inputLocation)
    .option("entity", "Sales CustomerCategories")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .load())

# COMMAND ----------

salesBuyingGroupsDf = (spark.read.format("com.microsoft.cdm")
    .option("cdmModel", inputLocation)
    .option("entity", "Sales BuyingGroups")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .load())

# COMMAND ----------

warehouseStockItemsDf = (spark.read.format("com.microsoft.cdm")
    .option("cdmModel", inputLocation)
    .option("entity", "Warehouse StockItems")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .load())

# COMMAND ----------

warehouseColorsDf = (spark.read.format("com.microsoft.cdm")
    .option("cdmModel", inputLocation)
    .option("entity", "Warehouse Colors")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .load())

# COMMAND ----------

warehousePackageTypesDf = (spark.read.format("com.microsoft.cdm")
    .option("cdmModel", inputLocation)
    .option("entity", "Warehouse PackageTypes")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .load())

# COMMAND ----------

# MAGIC %md
# MAGIC
# MAGIC # Add an ‘Unassigned’ entry to the buying group entity

# COMMAND ----------

display(salesBuyingGroupsDf)

# COMMAND ----------

from pyspark.sql.types import StructType, StructField, LongType, StringType, DateType
from pyspark.sql.functions import to_date, lit

unassignedBuyingGroupDf = spark.sql("select -1, 'Unassigned', 0, to_date('2013-01-01 00:00:00.0000000', 'yyyy-MM-dd H:mm:ss.SSSSSSS'), to_date('9999-12-31 23:59:59', 'yyyy-MM-dd H:mm:ss')")

newSalesBuyingGroupsDf = salesBuyingGroupsDf.union(unassignedBuyingGroupDf)

display(newSalesBuyingGroupsDf)

# COMMAND ----------

display(newSalesBuyingGroupsDf)

# COMMAND ----------



# COMMAND ----------

# MAGIC %md
# MAGIC
# MAGIC # Replace NULL BuyingGroupID with -1 (Unassigned)

# COMMAND ----------

from pyspark.sql.functions import col, lit

salesCustomerDf.filter(col("BuyingGroupId").isNull()).count()

# COMMAND ----------

newSalesCustomerDf = salesCustomerDf.fillna({'BuyingGroupId' : -1})

# COMMAND ----------

newSalesCustomerDf.filter(col("BuyingGroupId").isNull()).count()

# COMMAND ----------

display(newSalesCustomerDf)

# COMMAND ----------

# MAGIC %md
# MAGIC
# MAGIC # Add computed column for history tracking

# COMMAND ----------

from pyspark.sql.functions import concat

newSalesCustomerDf = (newSalesCustomerDf.withColumn("ChangeTrackingHash",
    concat(col("BuyingGroupID"),
           col("StandardDiscountPercentage"),
           col("IsOnCreditHold"),
           col("DeliveryPostalCode"))))

# COMMAND ----------

# MAGIC %md
# MAGIC
# MAGIC # Exclude corporate customers

# COMMAND ----------

newSalesCustomerDf.createOrReplaceTempView("newSalesCustomer")
salesOrderDf.createOrReplaceTempView("salesOrders")
salesBuyingGroupsDf.createOrReplaceTempView("salesBuyingGroups")
salesCustomerCategoriesDf.createOrReplaceTempView("salesCustomerCategories")


# COMMAND ----------

corporateSalesCustomerDf = spark.sql("select c.* from newSalesCustomer c, salesCustomerCategories cc where c.customerCategoryID = cc.customerCategoryID and cc.CustomerCategoryName != 'Corporate'")

# COMMAND ----------

# MAGIC %md
# MAGIC
# MAGIC # Write all entities to output CDM folder

# COMMAND ----------

# Specify the CDM model name to output
cdmModelName = "Transformed-Wide-World-Importers"

# COMMAND ----------

(salesOrderDf.write.format("com.microsoft.cdm")
    .option("entity", "Sales Orders")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .option("cdmFolder", outputLocation)
    .option("cdmModelName", cdmModelName)
    .save())

# COMMAND ----------

(salesOrderLinesDf.write.format("com.microsoft.cdm")
    .option("entity", "Sales OrderLines")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .option("cdmFolder", outputLocation)
    .option("cdmModelName", cdmModelName)
    .save())

# COMMAND ----------

(corporateSalesCustomerDf.write.format("com.microsoft.cdm")
    .option("entity", "Sales Customers")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .option("cdmFolder", outputLocation)
    .option("cdmModelName", cdmModelName)
    .save())

# COMMAND ----------

(salesCustomerCategoriesDf.write.format("com.microsoft.cdm")
    .option("entity", "Sales CustomerCategories")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .option("cdmFolder", outputLocation)
    .option("cdmModelName", cdmModelName)
    .save())

# COMMAND ----------

(newSalesBuyingGroupsDf.write.format("com.microsoft.cdm")
    .option("entity", "Sales BuyingGroups")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .option("cdmFolder", outputLocation)
    .option("cdmModelName", cdmModelName)
    .save())

# COMMAND ----------

(warehouseStockItemsDf.write.format("com.microsoft.cdm")
    .option("entity", "Warehouse StockItems")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .option("cdmFolder", outputLocation)
    .option("cdmModelName", cdmModelName)
    .save())

# COMMAND ----------

(warehouseColorsDf.write.format("com.microsoft.cdm")
    .option("entity", "Warehouse Colors")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .option("cdmFolder", outputLocation)
    .option("cdmModelName", cdmModelName)
    .save())

# COMMAND ----------

(warehousePackageTypesDf.write.format("com.microsoft.cdm")
    .option("entity", "Warehouse PackageTypes")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .option("cdmFolder", outputLocation)
    .option("cdmModelName", cdmModelName)
    .save())

# COMMAND ----------


--------------------------------------------------------------------------------
/AzureMachineLearning/CdmModel.py:
--------------------------------------------------------------------------------
import json
import re
from enum import Enum
from collections import OrderedDict


class SchemaEntry(object):
    __unassigned = object()

    def __init__(self, name, cls, defaultValue = None, verbose = False):
        self.name = name
        self.cls = cls
        if defaultValue is None and issubclass(cls, list):
            defaultValue = cls()
        self.defaultValue = defaultValue
        self.verbose = verbose

    def shouldSerialize(self, value):
        if self.verbose:
            return True
        if issubclass(self.cls, list):
            return len(value) > 0
        return self.defaultValue != value


class PolymorphicMeta(type):
    classes = {}

    def __new__(cls, name, bases, attrs):
        cls = type.__new__(cls, name, bases, attrs)
        cls.classes[cls] = {cls.__name__ : cls} # TODO: abstract?
        cls.__appendBases(bases, cls)
        return cls

    @staticmethod
    def __appendBases(bases, cls):
        for base in bases:
            basemap = cls.classes.get(base, None)
            if basemap is not None:
                basemap[cls.__name__] = cls
                cls.__appendBases(base.__bases__, cls)

class Polymorphic(metaclass=PolymorphicMeta):
    @classmethod
    def fromJson(cls, value):
        actualClass = PolymorphicMeta.classes[cls][value["$type"]]
        return super(Polymorphic, actualClass).fromJson(value)

class Base(object):
    __ctors = {}
    schema = ()

    def __init__(self):
        for entry in self.schema:
            setattr(self, entry.name, entry.defaultValue)

    @classmethod
    def fromJson(cls, value):
        result = cls()
        for entry in cls.schema:
            element = value.pop(entry.name, result)
            if element != result:
                setattr(result, entry.name, cls.__getCtor(entry.cls)(element))
        result.customProperties = value
        return result

    @classmethod
    def __getCtor(cls, type):
        ctor = cls.__ctors.get(type, None)
        if not ctor:
            ctor = getattr(type, "fromJson", type)
            cls.__ctors[type] = ctor
        return ctor

    def validate(self):
        tmp = object()
        className = self.__class__.__name__
        for entry in self.schema:
            element = getattr(self, entry.name, tmp)
            if element != tmp and element is not None:
                if not isinstance(element, entry.cls):
                    raise TypeError("%s.%s must be of type %s" % (className, entry.name, entry.cls))
                getattr(element, "validate", lambda: None)()

    def toJson(self):
        result = OrderedDict()
        if isinstance(self, Polymorphic):
            result["$type"] = self.__class__.__name__
        for entry in self.schema:
            element = getattr(self, entry.name, result)
            if element != result and entry.shouldSerialize(element):
                result[entry.name] = getattr(element, "toJson", lambda: element)()
        result.update(getattr(self, "customProperties",
                             {}))
        return result

class ObjectCollection(list, Base):
    def append(self, item):
        if not isinstance(item, self.itemType):
            raise TypeError("item is not of type %s" % self.itemType)
        super(ObjectCollection, self).append(item)

    @classmethod
    def fromJson(cls, value):
        result = cls()
        # Plain item types such as str (Uri) have no fromJson; fall back to the type itself.
        ctor = getattr(cls.itemType, "fromJson", cls.itemType)
        for item in value:
            super(ObjectCollection, result).append(ctor(item))
        return result

    def toJson(self):
        result = []
        for item in self:
            # Plain values (for example Uri strings) serialize as themselves.
            result.append(getattr(item, "toJson", lambda: item)())
        return result

    def validate(self):
        for item in self:
            # Plain values have nothing to validate.
            getattr(item, "validate", lambda: None)()


String = str
Uri = str
DateTimeOffset = str

class JsonEnum(Enum):
    def toJson(self):
        return self.value

class CsvQuoteStyle(JsonEnum):
    Csv = "QuoteStyle.Csv"
    None_ = "QuoteStyle.None"

class CsvStyle(JsonEnum):
    QuoteAlways = "CsvStyle.QuoteAlways"
    QuoteAfterDelimiter = "CsvStyle.QuoteAfterDelimiter"

class DataType(JsonEnum):
    # TODO: Fix autogeneration
    Unclassified = "unclassified"
    String = "string"
    Int64 = "int64"
    Double = "double"
    DateTime = "dateTime"
    DateTimeOffset = "dateTimeOffset"
    Decimal = "decimal"
    Boolean = "boolean"
    Guid = "guid"
    Json = "json"

class Annotation(Base):
    schema = Base.schema + (
        SchemaEntry("name", String),
        SchemaEntry("value", String)
    )

    def validate(self):
        super().validate()
        className = self.__class__.__name__
        if not self.name:
            raise ValueError("%s.name is not set."
                             % (className, ))

class AnnotationCollection(ObjectCollection):
    itemType = Annotation

class MetadataObject(Base):
    schema = Base.schema + (
        SchemaEntry("name", String),
        SchemaEntry("description", String),
        SchemaEntry("annotations", AnnotationCollection)
    )

    nameLengthMin = 1
    nameLengthMax = 256
    invalidNameRegex = re.compile("^\\s|\\s$")
    descriptionLengthMax = 4000

    def __repr__(self):
        name = getattr(self, "name", None)
        className = self.__class__.__name__
        if name:
            return "<%s '%s'>" % (className, name)
        else:
            return "<%s>" % (className, )

    def validate(self):
        super().validate()
        className = self.__class__.__name__
        if self.name is not None:
            if len(self.name) > self.nameLengthMax or len(self.name) < self.nameLengthMin:
                raise ValueError("Length of %s.name (%d) is not between %d and %d." % (className, len(self.name), self.nameLengthMin, self.nameLengthMax))
            if self.invalidNameRegex.search(self.name):
                raise ValueError("%s.name cannot contain leading or trailing blank spaces or consist only of whitespace." % (className, ))
        if self.description is not None and len(self.description) > self.descriptionLengthMax:
            # Report the description's own length and limit (the format string takes three arguments).
            raise ValueError("Length of %s.description (%d) may not exceed %d." % (className, len(self.description), self.descriptionLengthMax))

class MetadataObjectCollection(ObjectCollection):
    def __getitem__(self, index):
        if type(index) == str:
            index = next((i for i,item in enumerate(self) if item.name == index), None)
        if index is None:
            return None
        return super(MetadataObjectCollection, self).__getitem__(index)

    def validate(self):
        super().validate()
        className = self.__class__.__name__
        s = set()
        for item in self:
            if item.name != None and item.name in s:
                raise ValueError("%s contains non-unique item name '%s'" % (className, item.name))
            s.add(item.name)

class DataObject(MetadataObject):
    schema = MetadataObject.schema + (
        SchemaEntry("isHidden", bool, False),
    )

    def validate(self):
        super().validate()
        className = self.__class__.__name__
        if self.name is None:
            raise ValueError("%s.name is not set" % (className, ))

class SchemaCollection(ObjectCollection):
    itemType = Uri

class Reference(Base):
    schema = Base.schema + (
        SchemaEntry("id", String),
        SchemaEntry("location", Uri)
    )

class ReferenceCollection(ObjectCollection):
    itemType = Reference

class AttributeReference(Base):
    schema = Base.schema + (
        SchemaEntry("entityName", String),
        SchemaEntry("attributeName", String)
    )

    def __eq__(self, other):
        return isinstance(other, self.__class__) and self.entityName == other.entityName and self.attributeName == other.attributeName

    def __ne__(self, other):
        return not self.__eq__(other)

    def validate(self):
        super().validate()
        className = self.__class__.__name__
        if not self.entityName:
            raise ValueError("%s.entityName is not set" % (className, ))
        if not self.attributeName:
            raise ValueError("%s.attributeName is not set" % (className, ))

class Relationship(Polymorphic, Base):
    pass

class SingleKeyRelationship(Relationship):
    schema = Relationship.schema + (
        SchemaEntry("fromAttribute", AttributeReference),
        SchemaEntry("toAttribute", AttributeReference)
    )

    def validate(self):
        super().validate()
        className = self.__class__.__name__
        if self.fromAttribute is None:
            raise ValueError("%s.fromAttribute is not set" % (className, ))
        if self.toAttribute is None:
            raise ValueError("%s.toAttribute is not set" % (className, ))
        if self.fromAttribute == self.toAttribute:
            raise ValueError("%s must exist between different attribute references" % (className, ))

class RelationshipCollection(ObjectCollection):
    itemType = Relationship

class FileFormatSettings(Polymorphic, Base):
    pass

class CsvFormatSettings(FileFormatSettings):
    schema = FileFormatSettings.schema + (
        SchemaEntry("columnHeaders", bool, False),
        SchemaEntry("delimiter", String, ","),
        SchemaEntry("quoteStyle", CsvQuoteStyle, CsvQuoteStyle.Csv),
        SchemaEntry("csvStyle", CsvStyle, CsvStyle.QuoteAlways),
        SchemaEntry("encoding", int, 65001) # UTF-8 code page
    )

class Partition(DataObject):
    schema = DataObject.schema + (
        SchemaEntry("refreshTime", DateTimeOffset),
        SchemaEntry("location", Uri),
        SchemaEntry("fileFormatSettings", FileFormatSettings)
    )

class PartitionCollection(MetadataObjectCollection):
    itemType = Partition

class Attribute(MetadataObject):
    schema = MetadataObject.schema + (
        SchemaEntry("dataCategory", String),
        SchemaEntry("dataType", DataType)
    )

    def __repr__(self):
        return "<[%s]>" % (getattr(self, "name", "(unnamed)"), )

class AttributeCollection(MetadataObjectCollection):
    itemType = Attribute

class Entity(Polymorphic, DataObject):
    invalidEntityNameRegex = re.compile("\\.|\"")

    def validate(self):
        super().validate()
        if self.invalidEntityNameRegex.search(self.name):
            raise ValueError("%s.name cannot contain dot or quotation mark." % (self.__class__.__name__, ))

class LocalEntity(Entity):
    schema = Entity.schema + (
        SchemaEntry("dataCategory", String),
        SchemaEntry("schemas", SchemaCollection),
        SchemaEntry("attributes", AttributeCollection),
        SchemaEntry("partitions", PartitionCollection)
    )

class ReferenceEntity(Entity):
    schema = Entity.schema + (
        SchemaEntry("refreshTime", DateTimeOffset),
        SchemaEntry("source", String),
        SchemaEntry("modelId", String)
    )

    def validate(self):
        super().validate()
        className = self.__class__.__name__
        if not self.source:
            raise ValueError("%s.source is not set." % (className, ))
        if not self.modelId:
            raise ValueError("%s.modelId is not set."
% (className, )) 342 | 343 | # TODO: Validate model references 344 | 345 | class EntityCollection(MetadataObjectCollection): 346 | itemType = Entity 347 | 348 | class Model(DataObject): 349 | schema = DataObject.schema + ( 350 | SchemaEntry("application", String), 351 | SchemaEntry("version", String), 352 | SchemaEntry("modifiedTime", DateTimeOffset), 353 | SchemaEntry("culture", String), 354 | SchemaEntry("referenceModels", ReferenceCollection), 355 | SchemaEntry("entities", EntityCollection, verbose=True), 356 | SchemaEntry("relationships", RelationshipCollection) 357 | ) 358 | 359 | currentSchemaVersion = "1.0" 360 | 361 | def __init__(self, name = None): 362 | super().__init__() 363 | self.name = name 364 | self.version = self.currentSchemaVersion 365 | 366 | @classmethod 367 | def fromJson(cls, value): 368 | if isinstance(value, str): 369 | value = json.loads(value) 370 | elif not isinstance(value, dict): 371 | value = json.load(value) 372 | return super(Model, cls).fromJson(value) 373 | 374 | def toJson(self): 375 | return json.dumps(super().toJson()) 376 | 377 | def validate(self, allowUnresolvedModelReferences = True): 378 | super().validate() 379 | if self.version != self.currentSchemaVersion: 380 | raise ValueError("Invalid model version '%s'" % (self.version, )) 381 | if not allowUnresolvedModelReferences: 382 | for entity in self.entities: 383 | if isinstance(entity, ReferenceEntity): 384 | found = next((model for model in self.referenceModels if model.id == entity.modelId), None) 385 | if found is None: 386 | raise ValueError("ReferenceEntity '%s' doesn't have a reference model" % (entity.name, )) 387 | -------------------------------------------------------------------------------- /AzureMachineLearning/README.md: -------------------------------------------------------------------------------- 1 | # Use Machine Learning to train a machine learning model 2 | 3 | This directory contains the notebook file and Python library needed in the Azure ML section of
the tutorial. -------------------------------------------------------------------------------- /AzureMachineLearning/cdm-customer-classification-demo.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 19, 6 | "metadata": { 7 | "scrolled": true 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "# install packages as needed\n", 12 | "#! pip install adal\n", 13 | "#! pip install pandas\n", 14 | "#! pip install scikit-learn \n", 15 | "\n" 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "## Utility methods for authenticating and retrieving data from ADLS Gen2 \n", 23 | "The next cell contains a series of helper methods which are primarily used to abstract away connectivity, security and enumeration.\n", 24 | "\n", 25 | "All of the secrets will need to be filled in with secrets of your own." 26 | ] 27 | }, 28 | { 29 | "cell_type": "code", 30 | "execution_count": 20, 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": [ 34 | "from adal import AuthenticationContext\n", 35 | "import requests\n", 36 | "import pandas as pd\n", 37 | "from datetime import datetime\n", 38 | "from io import StringIO, BytesIO\n", 39 | "\n", 40 | "def read_stream_from_adls(endpoint, auth):\n", 41 | " headers = {\"Authorization\": \"Bearer \" + auth['accessToken']}\n", 42 | " return requests.get(endpoint, data = None, headers = headers, stream=True)\n", 43 | "\n", 44 | "def read_from_adls(endpoint, auth):\n", 45 | " headers = {\"Authorization\": \"Bearer \" + auth['accessToken']}\n", 46 | " return requests.get(endpoint, data = None, headers = headers)\n", 47 | "\n", 48 | "# generate AAD token for REST API authentication\n", 49 | "def generate_aad_token():\n", 50 | " resource = \"https://storage.azure.com/\"\n", 51 | " client_secret = \"<>\"\n", 52 | " client_id = \"<>\"\n", 53 | " authority_url = \"<>\"\n", 54 | " auth_context = 
AuthenticationContext(authority_url, api_version = None)\n", 55 | " return auth_context.acquire_token_with_client_credentials(resource, client_id, client_secret)\n", 56 | "\n", 57 | "\n", 58 | "def type_converter(input_type):\n", 59 | " switcher = {\n", 60 | " 'boolean': 'bool',\n", 61 | " 'int64': 'int64'\n", 62 | " }\n", 63 | " return switcher.get(input_type, 'str')\n", 64 | "\n", 65 | "def read_from_adls_with_cdm_format(entity, schema = \"cdm\"):\n", 66 | " auth = generate_aad_token()\n", 67 | " csv_path = entity.partitions[0].location\n", 68 | " csv_bytes = read_stream_from_adls(endpoint = csv_path, auth = auth).content\n", 69 | " \n", 70 | " # read to pandas dataframe with defined schema from model.json\n", 71 | " names = [attribute.name for attribute in entity.attributes]\n", 72 | " types = dict([(attribute.name, type_converter(attribute.dataType.value)) for attribute in entity.attributes]) if schema == \"cdm\" else dict([(attribute.name, 'str') for attribute in entity.attributes])\n", 73 | " \n", 74 | " # Generate the data frame forcing the column names and types to be those from the model.json schema\n", 75 | " buff = BytesIO(csv_bytes)\n", 76 | " df = pd.read_csv(buff, names=names, dtype=types, na_filter = False)\n", 77 | " buff.close()\n", 78 | " return df" 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "metadata": {}, 84 | "source": [ 85 | "## Retrieve CDM specific metadata\n", 86 | "The first step is to read a model file that contains information about the CDM Entities that can be used later. 
This information will be supplied to the helper methods above so that the information in the model file can be used to ensure that the dataframe that is used for modelling at the end of this notebook is correct and matches the model specification.\n", 87 | "\n", 88 | "NOTE: the CdmModel.py file must be available so that it can be imported. The easiest way to do this is to keep it in the same directory as the notebook, although it can also be referenced as a library. This notebook assumes that it is in the same directory as the notebook." 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 21, 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [ 97 | "# read model.json\n", 98 | "import CdmModel\n", 99 | "\n", 100 | "model_endpoint = \"https<>WWI-Sales/model.json\"\n", 101 | "aad_token = generate_aad_token()\n", 102 | "model_json = read_from_adls(endpoint = model_endpoint, auth = aad_token).json()\n", 103 | "model = CdmModel.Model.fromJson(model_json)" 104 | ] 105 | }, 106 | { 107 | "cell_type": "markdown", 108 | "metadata": {}, 109 | "source": [ 110 | "# Scenario: Customer Order Classification\n", 111 | "\n", 112 | "Our hypothesis is that larger customers (by category) will have larger purchases (by invoice). Currently we have the following customer categories: {'Novelty Shop', 'Supermarket', 'Computer Store', 'General Retailer', 'Agent', 'Gift Store', 'Wholesaler', 'Corporate'}" 113 | ] 114 | }, 115 | { 116 | "cell_type": "markdown", 117 | "metadata": {}, 118 | "source": [ 119 | "## Preprocessing Data\n", 120 | "Before we start modelling we need to do some basic data preparation. 
\n", 121 | "\n", 122 | "Our first step is to read the data using the CDM information we got from the model file to enforce column naming and column type for each of the entities.\n", 123 | "\n", 124 | "Once we have good clean dataframes for each of the entities we join them to generate a single flat data frame that is the preferred input for most types of models.\n", 125 | "\n", 126 | "NOTE: The cell below can take some time to execute." 127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": 22, 132 | "metadata": {}, 133 | "outputs": [ 134 | { 135 | "data": { 136 | "text/html": [ 137 | "
<div>\n",
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>InvoiceLineID</th><th>InvoiceID</th><th>StockItemID</th><th>PackageTypeID</th><th>Quantity</th><th>UnitPrice</th><th>LineProfit</th><th>ExtendedPrice</th><th>DeliveryMethodID</th><th>CustomerID</th><th>CustomerCategoryID</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>1</td><td>67</td><td>7</td><td>10</td><td>230.0</td><td>850.0</td><td>2645.00</td><td>3</td><td>832</td><td>4</td></tr>\n",
"    <tr><th>1</th><td>97</td><td>45</td><td>164</td><td>7</td><td>50</td><td>112.0</td><td>2650.0</td><td>6440.00</td><td>3</td><td>832</td><td>4</td></tr>\n",
"    <tr><th>2</th><td>1363</td><td>496</td><td>8</td><td>9</td><td>3</td><td>240.0</td><td>454.5</td><td>828.00</td><td>3</td><td>832</td><td>4</td></tr>\n",
"    <tr><th>3</th><td>1364</td><td>496</td><td>196</td><td>7</td><td>72</td><td>4.1</td><td>151.2</td><td>339.48</td><td>3</td><td>832</td><td>4</td></tr>\n",
"    <tr><th>4</th><td>4050</td><td>1296</td><td>64</td><td>7</td><td>9</td><td>30.0</td><td>135.0</td><td>310.50</td><td>3</td><td>832</td><td>4</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" 242 | ], 243 | "text/plain": [ 244 | " InvoiceLineID InvoiceID StockItemID PackageTypeID Quantity UnitPrice \\\n", 245 | "0 1 1 67 7 10 230.0 \n", 246 | "1 97 45 164 7 50 112.0 \n", 247 | "2 1363 496 8 9 3 240.0 \n", 248 | "3 1364 496 196 7 72 4.1 \n", 249 | "4 4050 1296 64 7 9 30.0 \n", 250 | "\n", 251 | " LineProfit ExtendedPrice DeliveryMethodID CustomerID CustomerCategoryID \n", 252 | "0 850.0 2645.00 3 832 4 \n", 253 | "1 2650.0 6440.00 3 832 4 \n", 254 | "2 454.5 828.00 3 832 4 \n", 255 | "3 151.2 339.48 3 832 4 \n", 256 | "4 135.0 310.50 3 832 4 " 257 | ] 258 | }, 259 | "execution_count": 22, 260 | "metadata": {}, 261 | "output_type": "execute_result" 262 | } 263 | ], 264 | "source": [ 265 | "import numpy as np\n", 266 | "from IPython.display import display\n", 267 | "\n", 268 | "sales_customer_categories_df = read_from_adls_with_cdm_format(model.entities[\"Sales CustomerCategories\"], \"cdm\")[['CustomerCategoryID', 'CustomerCategoryName']]\n", 269 | "sales_customer_df = read_from_adls_with_cdm_format(model.entities[\"Sales Customers\"], \"default\")[['CustomerID', 'CustomerCategoryID']]\n", 270 | "sales_customer_df = sales_customer_df[['CustomerID', 'CustomerCategoryID']].astype(np.int64)\n", 271 | "\n", 272 | "sales_invoice_line_df = read_from_adls_with_cdm_format(model.entities[\"Sales InvoiceLines\"], \"cdm\")\n", 273 | "sales_invoice_df = read_from_adls_with_cdm_format(model.entities[\"Sales Invoices\"], \"cdm\")\n", 274 | "\n", 275 | "#Join the 2 elements of invoice together\n", 276 | "order_invoice_df = pd.merge(sales_invoice_line_df, sales_invoice_df, on=['InvoiceID'])\n", 277 | "\n", 278 | "#Join customers to their invoices and fix up the datatypes\n", 279 | "combined_df = pd.merge(order_invoice_df, sales_customer_df, on=['CustomerID'])\n", 280 | "combined_df = combined_df[['InvoiceLineID', 'InvoiceID', 'StockItemID', 'PackageTypeID', 'Quantity', 'UnitPrice', 'LineProfit', 'ExtendedPrice', 'DeliveryMethodID', 'CustomerID', 
'CustomerCategoryID']]\n", 281 | "\n", 282 | "#These columns come back as object we need them to be floats.\n", 283 | "for col in ['UnitPrice', 'LineProfit', 'ExtendedPrice']:\n", 284 | " combined_df[col] = combined_df[col].astype(np.float64)\n", 285 | " \n", 286 | "combined_df.head(5)" 287 | ] 288 | }, 289 | { 290 | "cell_type": "markdown", 291 | "metadata": {}, 292 | "source": [ 293 | "## Create Models\n", 294 | "Using sklearn we will take the CDM data and build a (simple) machine learning model from it. In this case we are going to build a simple logistic regression model just to demonstrate the process and approach." 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | "execution_count": 23, 300 | "metadata": {}, 301 | "outputs": [ 302 | { 303 | "name": "stdout", 304 | "output_type": "stream", 305 | "text": [ 306 | "Logistic Regression: 0.71\n" 307 | ] 308 | } 309 | ], 310 | "source": [ 311 | "from sklearn.model_selection import train_test_split\n", 312 | "from sklearn.linear_model import LogisticRegression\n", 313 | "\n", 314 | "#Set up the target and feature columns before splitting into training and testing\n", 315 | "target_df = combined_df['CustomerCategoryID']\n", 316 | "features_df = combined_df.drop(['CustomerCategoryID'], axis = 1) \n", 317 | "X_train, X_test, y_train, y_test = train_test_split(features_df, target_df, test_size = 0.3)\n", 318 | "\n", 319 | "lr = LogisticRegression()\n", 320 | "\n", 321 | "lr.fit(X_train, y_train)\n", 322 | "y_pred = lr.predict(X_test)\n", 323 | "\n", 324 | "lr_accuracy = lr.score(X_test, y_test)\n", 325 | "prob = lr.predict_proba(X_test)[:,1]\n", 326 | "\n", 327 | "print(\"Logistic Regression: \" + str(round(lr_accuracy,2)))" 328 | ] 329 | }, 330 | { 331 | "cell_type": "markdown", 332 | "metadata": {}, 333 | "source": [ 334 | "The accuracy of the result should be 0.71, which is not especially good. 
In a real modelling exercise we would go back and tweak the columns in the model, featurise them and also potentially experiment with parameters to the LR model. All in the hope that a more accurate model is possible." 335 | ] 336 | } 337 | ], 338 | "metadata": { 339 | "kernelspec": { 340 | "display_name": "Python [conda env:amlsdk]", 341 | "language": "python", 342 | "name": "conda-env-amlsdk-py" 343 | }, 344 | "language_info": { 345 | "codemirror_mode": { 346 | "name": "ipython", 347 | "version": 3 348 | }, 349 | "file_extension": ".py", 350 | "mimetype": "text/x-python", 351 | "name": "python", 352 | "nbconvert_exporter": "python", 353 | "pygments_lexer": "ipython3", 354 | "version": "3.6.6" 355 | } 356 | }, 357 | "nbformat": 4, 358 | "nbformat_minor": 2 359 | } 360 | -------------------------------------------------------------------------------- /AzureSqlDataWarehouse/README.md: -------------------------------------------------------------------------------- 1 | # Deploy the dimensional schema to the DW and transform the staged data 2 | 3 | This directory contains the SQL schema needed to deploy the dimensional schema. 4 | -------------------------------------------------------------------------------- /AzureSqlDatabase/README.md: -------------------------------------------------------------------------------- 1 | # Create the Wide World Importers database 2 | 3 | This directory contains the bacpac file needed to create the Wide World Importers database. 
-------------------------------------------------------------------------------- /AzureSqlDatabase/WideWorldImporters-Standard.bacpac: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/AzureSqlDatabase/WideWorldImporters-Standard.bacpac -------------------------------------------------------------------------------- /CDM/README.md: -------------------------------------------------------------------------------- 1 | # CDM & CDM Folders sample 2 | 3 | Data stored in the [Common Data Model (CDM)](https://docs.microsoft.com/common-data-model) format provides semantic consistency across apps and deployments. With the evolution of the CDM metadata system, the CDM brings the same structural consistency and semantic meaning to the data stored in Azure Data Lake Storage Gen2 Preview with hierarchical namespaces and folders that contain schematized data in standard CDM format. The standardized metadata and self-describing data in an Azure data lake facilitate metadata discovery and interoperability between data producers and consumers such as Power BI, Azure Data Factory, Azure Databricks, and Azure Machine Learning service. 4 | 5 | A "CDM folder" is a folder in Azure Data Lake Storage Gen2 that conforms to specific, well-defined, standardized metadata structures and self-describing data. This facilitates metadata discovery and interop between data producers (e.g. Dynamics 365 business application suite) and data consumers, such as Power BI analytics, Azure data platform services (e.g. Azure Machine Learning, Azure Data Factory, Azure Databricks, etc.) and turn-key SaaS applications (Dynamics 365 AI for Sales, etc.). The standardized metadata is defined in the model.json file, which exists in the folder and contains pointers to the actual data file locations. 
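
As a rough illustration of how model.json points at the data files, the sketch below parses a trimmed-down, hypothetical model.json and lists the partition locations of each entity. The sample document, the storage URL, and the helper name `list_partition_locations` are assumptions made for this example only; the field names (`entities`, `attributes`, `partitions`, `location`) follow the sample libraries in this directory.

```python
import json

# Hypothetical, trimmed-down model.json. The storage account, folder and
# partition names are illustrative only and do not refer to a real CDM folder.
SAMPLE_MODEL_JSON = """
{
  "name": "WWI-Sales",
  "version": "1.0",
  "entities": [
    {
      "$type": "LocalEntity",
      "name": "Sales Customers",
      "attributes": [
        {"name": "CustomerID", "dataType": "int64"},
        {"name": "CustomerName", "dataType": "string"}
      ],
      "partitions": [
        {"name": "part-0", "location": "https://example.invalid/WWI-Sales/part-0.csv"}
      ]
    }
  ]
}
"""


def list_partition_locations(model_json):
    """Map each entity name to the data file locations listed in its partitions."""
    model = json.loads(model_json)
    locations = {}
    for entity in model.get("entities", []):
        # Reference entities carry no partitions of their own, hence the default.
        partitions = entity.get("partitions", [])
        locations[entity["name"]] = [p["location"] for p in partitions]
    return locations


for entity_name, paths in list_partition_locations(SAMPLE_MODEL_JSON).items():
    print(entity_name, paths)
```

A consumer would typically fetch each location with an authenticated request and apply the attribute names and data types from the same entity when parsing the CSV data.
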
6 | 7 | The subfolders in this directory provide a set of sample libraries and schema files to read and write the model.json file used in other samples in this account. 8 | 9 | The latest versions of these libraries can be found in https://github.com/Microsoft/CDM 10 | 11 | ## More information 12 | - [CDM](https://docs.microsoft.com/common-data-model) 13 | - [CDM folders](https://docs.microsoft.com/common-data-model/data-lake) 14 | - [model.json](https://docs.microsoft.com/common-data-model/model-json) 15 | -------------------------------------------------------------------------------- /CDM/dotnet/Microsoft.CdmFolders.SampleLibraries.csproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | netstandard2.0 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | -------------------------------------------------------------------------------- /CDM/dotnet/Model/Annotation.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System.IO; 7 | using Newtonsoft.Json; 8 | 9 | /// 10 | /// Annotation 11 | /// 12 | [JsonObject(MemberSerialization.OptIn)] 13 | public class Annotation 14 | { 15 | /// 16 | /// Gets or sets the name 17 | /// 18 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 19 | public string Name { get; set; } 20 | 21 | /// 22 | /// Gets or sets the value 23 | /// 24 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 25 | public string Value { get; set; } 26 | 27 | /// 28 | /// Validates that loaded model is correct and can function. 
29 | /// 30 | internal void Validate() 31 | { 32 | if (string.IsNullOrWhiteSpace(this.Name)) 33 | { 34 | throw new InvalidDataException("Annotation Name is not set."); 35 | } 36 | } 37 | } 38 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/AnnotationCollection.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.Linq; 8 | 9 | /// 10 | /// AnnotationCollection 11 | /// 12 | public class AnnotationCollection : ObjectCollection 13 | { 14 | /// 15 | /// Indexer 16 | /// 17 | /// Name 18 | /// Value 19 | public string this[string name] => this.FirstOrDefault(a => StringComparer.OrdinalIgnoreCase.Equals(a.Name, name))?.Value; 20 | 21 | /// 22 | /// Validates that loaded model is correct and can function. 23 | /// 24 | internal void Validate() 25 | { 26 | foreach (var annotation in this) 27 | { 28 | annotation.Validate(); 29 | } 30 | } 31 | } 32 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/Attribute.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using Newtonsoft.Json; 7 | 8 | /// 9 | /// Attribute 10 | /// 11 | [JsonObject(MemberSerialization.OptIn)] 12 | public class Attribute : MetadataObject 13 | { 14 | /// 15 | /// Gets or sets the DataType 16 | /// 17 | [JsonProperty(DefaultValueHandling = DefaultValueHandling.Ignore)] 18 | public DataType DataType { get; set; } 19 | } 20 | } 21 | -------------------------------------------------------------------------------- /CDM/dotnet/Model/AttributeCollection.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.Linq; 8 | 9 | /// 10 | /// AttributeCollection 11 | /// 12 | public class AttributeCollection : MetadataObjectCollection 13 | { 14 | /// 15 | /// Initializes a new instance of the class. 16 | /// 17 | /// The parent 18 | public AttributeCollection(Entity parent) 19 | : base(parent) 20 | { 21 | } 22 | 23 | /// 24 | internal override void Validate(bool allowUnresolvedModelReferences = true) 25 | { 26 | base.Validate(allowUnresolvedModelReferences); 27 | this.ValidateUniqueNames(); 28 | } 29 | } 30 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/AttributeReference.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System.IO; 7 | using Newtonsoft.Json; 8 | 9 | /// 10 | /// AttributeReference 11 | /// 12 | [JsonObject(MemberSerialization.OptIn)] 13 | public class AttributeReference 14 | { 15 | /// 16 | /// Gets or sets the entity name 17 | /// 18 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 19 | public string EntityName { get; set; } 20 | 21 | /// 22 | /// Gets or sets the attribute name 23 | /// 24 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 25 | public string AttributeName { get; set; } 26 | 27 | /// 28 | /// Validates that loaded model is correct and can function. 29 | /// 30 | internal void Validate() 31 | { 32 | if (string.IsNullOrWhiteSpace(this.EntityName)) 33 | { 34 | throw new InvalidDataException($"{nameof(this.EntityName)} is not set for '{this.GetType().Name}'."); 35 | } 36 | 37 | if (string.IsNullOrWhiteSpace(this.AttributeName)) 38 | { 39 | throw new InvalidDataException($"{nameof(this.AttributeName)} is not set for '{this.GetType().Name}'."); 40 | } 41 | } 42 | } 43 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/CsvFormatSettings.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using Newtonsoft.Json; 7 | 8 | /// 9 | /// CSV file format settings 10 | /// 11 | [JsonObject(MemberSerialization.OptIn)] 12 | public class CsvFormatSettings : FileFormatSettings 13 | { 14 | /// 15 | /// Gets or sets a value indicating whether the csv contains headers 16 | /// 17 | [JsonProperty(DefaultValueHandling = DefaultValueHandling.Populate)] 18 | public bool ColumnHeaders { get; set; } = false; 19 | 20 | /// 21 | /// Gets or sets the csv delimiter 22 | /// 23 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 24 | public string Delimiter { get; set; } = ","; 25 | 26 | /// 27 | /// Gets or sets the quote style 28 | /// 29 | [JsonProperty(DefaultValueHandling = DefaultValueHandling.Populate)] 30 | public CsvQuoteStyle QuoteStyle { get; set; } = CsvQuoteStyle.Csv; 31 | 32 | /// 33 | /// Gets or sets the csv style 34 | /// 35 | [JsonProperty(DefaultValueHandling = DefaultValueHandling.Populate)] 36 | public CsvStyle CsvStyle { get; set; } = CsvStyle.QuoteAlways; 37 | 38 | /// 39 | public override FileFormatSettings Clone() 40 | { 41 | return new CsvFormatSettings 42 | { 43 | ColumnHeaders = this.ColumnHeaders, 44 | QuoteStyle = this.QuoteStyle, 45 | CsvStyle = this.CsvStyle, 46 | Delimiter = this.Delimiter 47 | }; 48 | } 49 | } 50 | } 51 | -------------------------------------------------------------------------------- /CDM/dotnet/Model/CsvQuoteStyle.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System.Runtime.Serialization; 7 | using Newtonsoft.Json; 8 | using Newtonsoft.Json.Converters; 9 | 10 | /// 11 | /// CSV quote style 12 | /// 13 | [JsonConverter(typeof(StringEnumConverter))] 14 | public enum CsvQuoteStyle 15 | { 16 | /// 17 | /// CSV quote style 18 | /// 19 | [EnumMember(Value = "QuoteStyle.Csv")] 20 | Csv, 21 | 22 | /// 23 | /// No quotes 24 | /// 25 | [EnumMember(Value = "QuoteStyle.None")] 26 | None, 27 | } 28 | } 29 | -------------------------------------------------------------------------------- /CDM/dotnet/Model/CsvStyle.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System.Runtime.Serialization; 7 | using Newtonsoft.Json; 8 | using Newtonsoft.Json.Converters; 9 | 10 | /// 11 | /// CSV style settings 12 | /// 13 | [JsonConverter(typeof(StringEnumConverter))] 14 | public enum CsvStyle 15 | { 16 | /// 17 | /// CSV quote style 18 | /// 19 | [EnumMember(Value = "CsvStyle.QuoteAlways")] 20 | QuoteAlways, 21 | 22 | /// 23 | /// No quotes 24 | /// 25 | [EnumMember(Value = "CsvStyle.QuoteAfterDelimiter")] 26 | QuoteAfterDelimiter, 27 | } 28 | } 29 | -------------------------------------------------------------------------------- /CDM/dotnet/Model/DataObject.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.IO; 8 | using Microsoft.CdmFolders.SampleLibraries.SerializationHelpers; 9 | using Newtonsoft.Json; 10 | 11 | /// 12 | /// DataObject 13 | /// 14 | [JsonObject(MemberSerialization.OptIn)] 15 | public abstract class DataObject : MetadataObject 16 | { 17 | /// 18 | /// Gets or sets a value indicating whether this object is hidden 19 | /// 20 | [JsonProperty(DefaultValueHandling = DefaultValueHandling.Ignore, Order = SerializationOrderConstants.DataObjectSerializationOrder)] 21 | public bool IsHidden { get; set; } 22 | 23 | private DataObject DataObjectParent => this.Parent as DataObject; 24 | 25 | /// 26 | internal override void Validate(bool allowUnresolvedModelReferences = true) 27 | { 28 | base.Validate(allowUnresolvedModelReferences); 29 | 30 | if (this.Name == null) 31 | { 32 | throw new InvalidDataException($"Name is not set for '{this.GetType().Name}'."); 33 | } 34 | } 35 | 36 | /// 37 | /// Copy from other data object 38 | /// 39 | /// the other data object 40 | protected void CopyFrom(DataObject other) 41 | { 42 | base.CopyFrom(other); 43 | this.IsHidden = other.IsHidden; 44 | } 45 | } 46 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/DataType.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using Microsoft.CdmFolders.SampleLibraries.SerializationHelpers; 7 | using Newtonsoft.Json; 8 | 9 | /// 10 | /// DataType 11 | /// 12 | [JsonConverter(typeof(StringEnumCamelCaseConverter))] 13 | public enum DataType 14 | { 15 | /// 16 | /// Unclassified 17 | /// 18 | Unclassified, 19 | 20 | /// 21 | /// String 22 | /// 23 | String, 24 | 25 | /// 26 | /// Int64 27 | /// 28 | Int64, 29 | 30 | /// 31 | /// Double 32 | /// 33 | Double, 34 | 35 | /// 36 | /// DateTime 37 | /// 38 | DateTime, 39 | 40 | /// 41 | /// DateTimeOffset 42 | /// 43 | DateTimeOffset, 44 | 45 | /// 46 | /// Decimal 47 | /// 48 | Decimal, 49 | 50 | /// 51 | /// Boolean 52 | /// 53 | Boolean, 54 | 55 | /// 56 | /// GUID 57 | /// 58 | Guid, 59 | 60 | /// 61 | /// Serialized json 62 | /// 63 | Json, 64 | } 65 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/Entity.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System.IO; 7 | using System.Text.RegularExpressions; 8 | 9 | /// 10 | /// Entity 11 | /// 12 | public abstract class Entity : DataObject 13 | { 14 | // Regex that detects dot or quotation mark 15 | private static readonly Regex InvalidNameRegex = new Regex(@"\.|""", RegexOptions.Compiled); 16 | 17 | /// 18 | internal override void Validate(bool allowUnresolvedModelReferences = true) 19 | { 20 | base.Validate(allowUnresolvedModelReferences); 21 | 22 | if (InvalidNameRegex.IsMatch(this.Name)) 23 | { 24 | throw new InvalidDataException($"Name of '{this.GetType().Name}' cannot contain dot or quotation mark."); 25 | } 26 | } 27 | } 28 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/EntityCollection.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using Newtonsoft.Json; 7 | 8 | /// 9 | /// EntityCollection 10 | /// 11 | /// The usage of TypeNameHandling.Auto was approved in this case by the Azure security team since the TypeNameSerializationBinder limits the scope only to this assembly 12 | [JsonArray(ItemTypeNameHandling = TypeNameHandling.Auto)] 13 | public class EntityCollection : MetadataObjectCollection 14 | { 15 | /// 16 | /// Initializes a new instance of the class. 
17 | /// 18 | /// The parent 19 | public EntityCollection(Model parent) 20 | : base(parent) 21 | { 22 | } 23 | 24 | /// 25 | internal override void Validate(bool allowUnresolvedModelReferences = true) 26 | { 27 | base.Validate(allowUnresolvedModelReferences); 28 | this.ValidateUniqueNames(); 29 | } 30 | } 31 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/FileFormatSettings.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using Newtonsoft.Json; 7 | 8 | /// 9 | /// File format settings abstract class 10 | /// 11 | public abstract class FileFormatSettings 12 | { 13 | /// 14 | /// Clone this file format settings 15 | /// 16 | /// The cloned settings 17 | public abstract FileFormatSettings Clone(); 18 | } 19 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/LocalEntity.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using Microsoft.CdmFolders.SampleLibraries.SerializationHelpers; 7 | using Newtonsoft.Json; 8 | 9 | /// 10 | /// Entity 11 | /// 12 | [JsonObject(MemberSerialization.OptIn)] 13 | public class LocalEntity : Entity 14 | { 15 | /// 16 | /// Initializes a new instance of the class. 
17 | /// 18 | public LocalEntity() 19 | { 20 | this.Attributes = new AttributeCollection(this); 21 | this.Partitions = new PartitionCollection(this); 22 | this.Schemas = new SchemaCollection(); 23 | } 24 | 25 | /// 26 | /// Gets or sets the attributes 27 | /// 28 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore, Order = SerializationOrderConstants.CollectionSerializationOrder)] 29 | public AttributeCollection Attributes { get; set; } 30 | 31 | /// 32 | /// Gets or sets the partitions 33 | /// 34 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore, Order = SerializationOrderConstants.CollectionSerializationOrder)] 35 | public PartitionCollection Partitions { get; set; } 36 | 37 | /// 38 | /// Gets or sets the schemas the entity implements 39 | /// 40 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore, Order = SerializationOrderConstants.CollectionSerializationOrder)] 41 | public SchemaCollection Schemas { get; set; } 42 | 43 | /// 44 | internal override void Validate(bool allowUnresolvedModelReferences = true) 45 | { 46 | base.Validate(allowUnresolvedModelReferences); 47 | 48 | this.Attributes.Validate(allowUnresolvedModelReferences); 49 | this.Partitions.Validate(allowUnresolvedModelReferences); 50 | this.Schemas.Validate(); 51 | } 52 | } 53 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/MetadataObject.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.IO; 8 | using System.Linq; 9 | using System.Text.RegularExpressions; 10 | using Microsoft.CdmFolders.SampleLibraries.SerializationHelpers; 11 | using Newtonsoft.Json; 12 | 13 | /// 14 | /// MetadataObject 15 | /// 16 | [JsonObject(MemberSerialization.OptIn)] 17 | public abstract class MetadataObject 18 | { 19 | private static readonly int DefaultNameLengthMin = 1; 20 | private static readonly int DefaultNameLengthMax = 256; 21 | 22 | // Regex that detects whitespace, leading blank spaces, or trailing blank spaces 23 | private static readonly Regex InvalidNameRegex = new Regex(@"^\s|\s$", RegexOptions.Compiled); 24 | 25 | private static readonly int DescriptionLengthMax = 4000; 26 | 27 | /// 28 | /// Gets the parent 29 | /// 30 | public MetadataObject Parent { get; internal set; } 31 | 32 | /// 33 | /// Gets or sets the name 34 | /// 35 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore, Order = SerializationOrderConstants.MetadataObjectSerializationOrder)] 36 | public string Name { get; set; } 37 | 38 | /// 39 | /// Gets or sets description 40 | /// 41 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore, Order = SerializationOrderConstants.MetadataObjectSerializationOrder)] 42 | public string Description { get; set; } 43 | 44 | /// 45 | /// Gets the annotations 46 | /// 47 | [JsonProperty(Order = SerializationOrderConstants.MetadataObjectSerializationOrder + SerializationOrderConstants.CollectionSerializationOrder)] 48 | public AnnotationCollection Annotations { get; } = new AnnotationCollection(); 49 | 50 | /// 51 | /// Gets the nameLengthMax 52 | /// 53 | protected virtual int NameLengthMax => DefaultNameLengthMax; 54 | 55 | /// 56 | /// Validates that loaded model is correct and can function. 57 | /// 58 | /// 59 | /// If set to True, the method will skip on validating MetadataObjects that can't be validated due to model references. 
60 | /// If set to False, the method will try to validate all MetadataObjects. Will throw if not possible to resolve. 61 | /// 62 | internal virtual void Validate(bool allowUnresolvedModelReferences = true) 63 | { 64 | this.Annotations.Validate(); 65 | 66 | // Validate Name ( If name is set ) 67 | if (this.Name != null) 68 | { 69 | if (this.Name.Length > this.NameLengthMax || this.Name.Length < DefaultNameLengthMin) 70 | { 71 | throw new InvalidDataException($"Name length of '{this.GetType().Name}' is incorrect. Name length: '{this.Name.Length}'. Minimum allowed length: '{DefaultNameLengthMin}'. Maximum allowed length: '{this.NameLengthMax}'"); 72 | } 73 | 74 | if (InvalidNameRegex.IsMatch(this.Name)) 75 | { 76 | throw new InvalidDataException($"Name of '{this.GetType().Name}' cannot contain leading or trailing blank spaces or consist only of whitespace."); 77 | } 78 | } 79 | 80 | // Validate Description 81 | if (this.Description != null && this.Description.Length > DescriptionLengthMax) 82 | { 83 | throw new InvalidDataException($"Description length of '{this.GetType().Name}' has exceeded maximum allowed length. Description length: '{this.Description.Length}'. Maximum allowed length: '{DescriptionLengthMax}'"); 84 | } 85 | } 86 | 87 | /// 88 | /// Copy from other metadata object 89 | /// 90 | /// The other metadata object 91 | protected void CopyFrom(MetadataObject other) 92 | { 93 | this.Parent = other.Parent; 94 | this.Name = other.Name; 95 | this.Description = other.Description; 96 | this.Annotations.Clear(); 97 | this.Annotations.AddRange(other.Annotations.Select(a => new Annotation { Name = a.Name, Value = a.Value })); 98 | } 99 | } 100 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/MetadataObjectCollection.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.Collections.Generic; 8 | using System.IO; 9 | using System.Linq; 10 | 11 | /// 12 | /// Base for all metadata object collections 13 | /// 14 | /// The collection type 15 | /// The parent 16 | public abstract class MetadataObjectCollection<T, TParent> : ObjectCollection<T> 17 | where T : MetadataObject 18 | where TParent : MetadataObject 19 | { 20 | /// 21 | /// Initializes a new instance of the class. 22 | /// 23 | /// The parent 24 | public MetadataObjectCollection(TParent parent) 25 | : base() 26 | { 27 | this.Parent = parent; 28 | } 29 | 30 | /// 31 | /// Initializes a new instance of the class. 32 | /// 33 | /// The parent 34 | /// The collection 35 | public MetadataObjectCollection(TParent parent, IEnumerable<T> collection) 36 | : base(collection) 37 | { 38 | this.Parent = parent; 39 | } 40 | 41 | /// 42 | /// Gets the parent 43 | /// 44 | public TParent Parent { get; } 45 | 46 | /// 47 | /// Indexer that returns the first element according to a given name 48 | /// 49 | /// The name 50 | /// The metadata object 51 | public T this[string name] => this.FirstOrDefault(item => StringComparer.OrdinalIgnoreCase.Equals(item.Name, name)); 52 | 53 | /// 54 | public override bool Contains(T item) 55 | { 56 | return this[item.Name] != null; 57 | } 58 | 59 | /// 60 | /// Validates that loaded model is correct and can function. 61 | /// 62 | /// 63 | /// If set to True, the method will skip validating MetadataObjects that can't be resolved due to model references. 64 | /// If set to False, the method will try to validate all MetadataObjects. Will throw if not possible to resolve. 
65 | /// 66 | internal virtual void Validate(bool allowUnresolvedModelReferences = true) 67 | { 68 | foreach (T metadataObj in this) 69 | { 70 | metadataObj.Validate(allowUnresolvedModelReferences); 71 | } 72 | } 73 | 74 | /// 75 | protected override void OnAdded(T item) 76 | { 77 | item.Parent = this.Parent; 78 | } 79 | 80 | /// 81 | /// Validate unique names 82 | /// 83 | protected void ValidateUniqueNames() 84 | { 85 | var uniqueNames = new HashSet<string>(StringComparer.OrdinalIgnoreCase); 86 | foreach (MetadataObject metadataObj in this.Where(obj => obj.Name != null)) 87 | { 88 | if (!uniqueNames.Contains(metadataObj.Name)) 89 | { 90 | uniqueNames.Add(metadataObj.Name); 91 | } 92 | else 93 | { 94 | throw new InvalidDataException($"'{this.GetType().Name}' contains non-unique item names."); 95 | } 96 | } 97 | } 98 | } 99 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/Model.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.Globalization; 8 | using System.IO; 9 | using Microsoft.CdmFolders.SampleLibraries.SerializationHelpers; 10 | using Newtonsoft.Json; 11 | using Newtonsoft.Json.Linq; 12 | 13 | /// 14 | /// The Model object represents an entity-relationship model with some extra metadata 15 | /// 16 | [JsonObject(MemberSerialization.OptIn)] 17 | public class Model : DataObject 18 | { 19 | /// 20 | /// The model file extension 21 | /// 22 | public const string ModelFileExtension = ".json"; 23 | 24 | /// 25 | /// The model file name 26 | /// 27 | public const string ModelFileName = "model" + ModelFileExtension; 28 | 29 | /// 30 | /// The current model schema version 31 | /// 32 | private const string CurrentSchemaVersion = "1.0"; 33 | 34 | private static readonly JsonSerializerSettings SerializeSettings = new JsonSerializerSettings() 35 | { 36 | SerializationBinder = new TypeNameSerializationBinder(), 37 | Formatting = Formatting.None, 38 | ContractResolver = new CollectionsContractResolver(), 39 | }; 40 | 41 | private static readonly JsonSerializerSettings DeserializeSettings = new JsonSerializerSettings() 42 | { 43 | SerializationBinder = new TypeNameSerializationBinder(), 44 | }; 45 | 46 | /// 47 | /// Initializes a new instance of the class. 
48 | /// 49 | public Model() 50 | { 51 | this.Version = CurrentSchemaVersion; 52 | this.Entities = new EntityCollection(this); 53 | this.Relationships = new RelationshipCollection(this); 54 | this.ReferenceModels = new ReferenceModelCollection(); 55 | } 56 | 57 | /// 58 | /// Gets or sets the schema version 59 | /// 60 | [JsonProperty(NullValueHandling = NullValueHandling.Include)] 61 | public string Version { get; set; } 62 | 63 | /// 64 | /// Gets the Entities 65 | /// 66 | [JsonProperty(Order = SerializationOrderConstants.CollectionSerializationOrder)] 67 | public EntityCollection Entities { get; } 68 | 69 | /// 70 | /// Gets the Relationships 71 | /// 72 | [JsonProperty(Order = SerializationOrderConstants.CollectionSerializationOrder)] 73 | public RelationshipCollection Relationships { get; } 74 | 75 | /// 76 | /// Gets the Reference Models 77 | /// 78 | [JsonProperty(Order = SerializationOrderConstants.CollectionSerializationOrder)] 79 | public ReferenceModelCollection ReferenceModels { get; } 80 | 81 | /// 82 | /// Gets or sets the Culture 83 | /// 84 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 85 | public CultureInfo Culture { get; set; } 86 | 87 | /// 88 | /// Gets or sets the modified time 89 | /// 90 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 91 | public DateTimeOffset? 
ModifiedTime { get; set; } 92 | 93 | /// 94 | /// Deserializes a model from JObject 95 | /// 96 | /// The JObject of the model 97 | /// The deserialized model 98 | public static Model Import(JObject jObject) 99 | { 100 | if (jObject == null) 101 | { 102 | return null; 103 | } 104 | 105 | var serializer = JsonSerializer.Create(DeserializeSettings); 106 | return jObject.ToObject<Model>(serializer); 107 | } 108 | 109 | /// 110 | /// FromJson 111 | /// 112 | /// The string representation of the Json 113 | public void FromJson(string modelJson) 114 | { 115 | this.Version = null; 116 | JsonConvert.PopulateObject(modelJson, this, DeserializeSettings); 117 | } 118 | 119 | /// 120 | /// Serializes the model as JObject 121 | /// 122 | /// JObject of the model 123 | public JObject Export() 124 | { 125 | var serializer = JsonSerializer.Create(SerializeSettings); 126 | return JObject.FromObject(this, serializer); 127 | } 128 | 129 | /// 130 | /// Serializes the model as string 131 | /// 132 | /// String of the model 133 | public string ToJson() 134 | { 135 | return JsonConvert.SerializeObject(this, SerializeSettings); 136 | } 137 | 138 | /// 139 | /// Validates that loaded model is correct and can function 140 | /// 141 | public void ValidateModel() 142 | { 143 | this.Validate(); 144 | } 145 | 146 | /// 147 | internal override void Validate(bool allowUnresolvedModelReferences = true) 148 | { 149 | if (this.Version != CurrentSchemaVersion) 150 | { 151 | throw new InvalidDataException($"Invalid model version: {this.Version}"); 152 | } 153 | 154 | base.Validate(allowUnresolvedModelReferences); 155 | 156 | this.ReferenceModels.Validate(); 157 | this.Entities.Validate(allowUnresolvedModelReferences); 158 | this.Relationships.Validate(allowUnresolvedModelReferences); 159 | } 160 | } 161 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/ObjectCollection.cs: -------------------------------------------------------------------------------- 1 | 
// Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System.Collections; 7 | using System.Collections.Generic; 8 | 9 | /// 10 | /// Base for all the available collection 11 | /// 12 | /// The collection type 13 | public abstract class ObjectCollection<T> : ICollection<T> 14 | { 15 | private readonly List<T> list; 16 | 17 | /// 18 | /// Initializes a new instance of the class. 19 | /// 20 | public ObjectCollection() 21 | { 22 | this.list = new List<T>(); 23 | } 24 | 25 | /// 26 | /// Initializes a new instance of the class with a base collection. 27 | /// 28 | /// The base collection 29 | public ObjectCollection(IEnumerable<T> collection) 30 | : this() 31 | { 32 | this.AddRange(collection); 33 | } 34 | 35 | /// 36 | public int Count => this.list.Count; 37 | 38 | /// 39 | public bool IsReadOnly => false; 40 | 41 | /// 42 | /// Indexer that returns the first element according to a given position 43 | /// 44 | /// the position 45 | /// The metadata object 46 | public T this[int index] => this.list[index]; 47 | 48 | /// 49 | public void Add(T item) 50 | { 51 | this.list.Add(item); 52 | this.OnAdded(item); 53 | } 54 | 55 | /// 56 | /// Add a range of items 57 | /// 58 | /// The items 59 | public void AddRange(IEnumerable<T> items) 60 | { 61 | foreach (var item in items) 62 | { 63 | this.Add(item); 64 | } 65 | } 66 | 67 | /// 68 | public void Clear() 69 | { 70 | this.list.Clear(); 71 | } 72 | 73 | /// 74 | public virtual bool Contains(T item) 75 | { 76 | return this.list.Contains(item); 77 | } 78 | 79 | /// 80 | public void CopyTo(T[] array, int arrayIndex) 81 | { 82 | this.list.CopyTo(array, arrayIndex); 83 | } 84 | 85 | /// 86 | public bool Remove(T item) 87 | { 88 | return this.list.Remove(item); 89 | } 90 | 91 | /// 92 | public IEnumerator<T> GetEnumerator() 93 | { 94 | return this.list.GetEnumerator(); 95 | } 96 | 97 | /// 98 | IEnumerator IEnumerable.GetEnumerator() 99 | 
{ 100 | return this.GetEnumerator(); 101 | } 102 | 103 | /// 104 | /// On add hook 105 | /// 106 | /// The item that was added 107 | protected virtual void OnAdded(T item) 108 | { 109 | } 110 | } 111 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/Partition.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.IO; 8 | using Microsoft.CdmFolders.SampleLibraries.SerializationHelpers; 9 | using Newtonsoft.Json; 10 | 11 | /// 12 | /// Partition 13 | /// 14 | [JsonObject(MemberSerialization.OptIn)] 15 | public class Partition : DataObject 16 | { 17 | /// 18 | /// Gets or sets the Refresh Time 19 | /// 20 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 21 | public DateTimeOffset? RefreshTime { get; set; } 22 | 23 | /// 24 | /// Gets or sets location 25 | /// 26 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 27 | public Uri Location { get; set; } 28 | 29 | /// 30 | /// Gets or sets file format settings 31 | /// 32 | /// The usage of TypeNameHandling.Auto was approved here by the Azure security team, since the TypeNameSerializationBinder limits the scope to this assembly only 33 | [JsonProperty(TypeNameHandling = TypeNameHandling.Auto, NullValueHandling = NullValueHandling.Ignore, Order = SerializationOrderConstants.ObjectSerializationOrder)] 34 | public FileFormatSettings FileFormatSettings { get; set; } 35 | 36 | /// 37 | /// Clones the partition 38 | /// 39 | /// A clone of the partition 40 | public Partition Clone() 41 | { 42 | var clone = new Partition(); 43 | clone.CopyFrom(this); 44 | clone.Parent = null; 45 | 46 | return clone; 47 | } 48 | 49 | /// 50 | /// Copy from other data object 51 | /// 52 | /// The other data object 53 | protected void CopyFrom(Partition other) 54 | { 
55 | base.CopyFrom(other); 56 | this.RefreshTime = other.RefreshTime; 57 | this.Location = (other.Location == null) ? null : new UriBuilder(other.Location).Uri; 58 | this.FileFormatSettings = other.FileFormatSettings?.Clone(); 59 | } 60 | } 61 | } 62 | -------------------------------------------------------------------------------- /CDM/dotnet/Model/PartitionCollection.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | /// 7 | /// PartitionCollection 8 | /// 9 | public class PartitionCollection : MetadataObjectCollection<Partition, Entity> 10 | { 11 | /// 12 | /// Initializes a new instance of the class. 13 | /// 14 | /// The parent 15 | public PartitionCollection(Entity parent) 16 | : base(parent) 17 | { 18 | } 19 | } 20 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/ReferenceEntity.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.IO; 8 | using System.Linq; 9 | using Newtonsoft.Json; 10 | 11 | /// 12 | /// An entity that references an entity defined in another model 
13 | /// 14 | [JsonObject(MemberSerialization.OptIn)] 15 | public class ReferenceEntity : Entity 16 | { 17 | /// 18 | /// Gets or sets Referenced Entity Name 19 | /// 20 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 21 | public string Source { get; set; } 22 | 23 | /// 24 | /// Gets or sets Referenced Model Id 25 | /// 26 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 27 | public string ModelId { get; set; } 28 | 29 | /// 30 | internal override void Validate(bool allowUnresolvedModelReferences = true) 31 | { 32 | base.Validate(allowUnresolvedModelReferences); 33 | 34 | if (string.IsNullOrWhiteSpace(this.Source)) 35 | { 36 | throw new InvalidDataException($"{nameof(this.Source)} is not set for '{this.GetType().Name}'."); 37 | } 38 | 39 | if (string.IsNullOrWhiteSpace(this.ModelId)) 40 | { 41 | throw new InvalidDataException($"{nameof(this.ModelId)} is not set for '{this.GetType().Name}'."); 42 | } 43 | 44 | if (!allowUnresolvedModelReferences) 45 | { 46 | ReferenceModel referenceModel = ((Model)this.Parent).ReferenceModels.FirstOrDefault(rm => StringComparer.OrdinalIgnoreCase.Equals(rm.Id, this.ModelId)); 47 | if (referenceModel == null) 48 | { 49 | throw new InvalidDataException($"{nameof(ReferenceEntity)} {this.Name} doesn't have single reference model."); 50 | } 51 | } 52 | } 53 | } 54 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/ReferenceModel.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.IO; 8 | using System.Web; 9 | using Newtonsoft.Json; 10 | 11 | /// 12 | /// Model reference 13 | /// 14 | [JsonObject(MemberSerialization.OptIn)] 15 | public class ReferenceModel 16 | { 17 | /// 18 | /// Gets or sets the model for the reference 19 | /// 20 | /// For references to models that are managed by Power BI the format must be $"{workspaceId}/{modelId}" 21 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 22 | public string Id { get; set; } 23 | 24 | /// 25 | /// Gets or sets the location 26 | /// 27 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 28 | public Uri Location { get; set; } 29 | 30 | /// 31 | /// Validate 32 | /// 33 | internal void Validate() 34 | { 35 | if (string.IsNullOrWhiteSpace(this.Id)) 36 | { 37 | throw new InvalidDataException($"{nameof(this.Id)} is not set for '{this.GetType().Name}'."); 38 | } 39 | 40 | if (this.Location == null) 41 | { 42 | throw new InvalidDataException($"{nameof(this.Location)} is not set for '{this.GetType().Name}'."); 43 | } 44 | 45 | if (HttpUtility.UrlDecode(this.Location.AbsoluteUri).IndexOf(Model.ModelFileName) == -1) 46 | { 47 | throw new InvalidDataException($"{this.GetType().Name} {this.Location} is incorrect. It should point to {Model.ModelFileName}."); 48 | } 49 | } 50 | } 51 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/ReferenceModelCollection.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.Collections.Generic; 8 | using System.IO; 9 | 10 | /// 11 | /// Reference Models Collection 12 | /// 13 | public class ReferenceModelCollection : ObjectCollection<ReferenceModel> 14 | { 15 | /// 16 | /// Validates that loaded model is correct and can function. 17 | /// 18 | internal void Validate() 19 | { 20 | var uniqueMonikers = new HashSet<string>(StringComparer.OrdinalIgnoreCase); 21 | 22 | foreach (var referenceModel in this) 23 | { 24 | referenceModel.Validate(); 25 | 26 | if (!uniqueMonikers.Contains(referenceModel.Id)) 27 | { 28 | uniqueMonikers.Add(referenceModel.Id); 29 | } 30 | else 31 | { 32 | throw new InvalidDataException($"'{this.GetType().Name}' contains non-unique monikers."); 33 | } 34 | } 35 | } 36 | } 37 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/Relationship.cs: -------------------------------------------------------------------------------- 1 | 2 | namespace Microsoft.CdmFolders.SampleLibraries 3 | { 4 | /// 5 | /// Relationship 6 | /// 7 | public abstract class Relationship : MetadataObject 8 | { 9 | /// 10 | protected override int NameLengthMax => 1024; 11 | } 12 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/RelationshipCollection.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using Newtonsoft.Json; 7 | 8 | /// 9 | /// RelationshipCollection 10 | /// 11 | /// The usage of TypeNameHandling.Auto was approved here by the Azure security team, since the TypeNameSerializationBinder limits the scope to this assembly only 12 | [JsonArray(ItemTypeNameHandling = TypeNameHandling.Auto)] 13 | public class RelationshipCollection : MetadataObjectCollection<Relationship, Model> 14 | { 15 | /// 16 | /// Initializes a new instance of the class. 17 | /// 18 | /// The parent 19 | public RelationshipCollection(Model parent) 20 | : base(parent) 21 | { 22 | } 23 | } 24 | } 25 | -------------------------------------------------------------------------------- /CDM/dotnet/Model/SchemaCollection.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.Collections.Generic; 8 | using System.IO; 9 | using System.Linq; 10 | using System.Web; 11 | 12 | /// 13 | /// Schema collection class 14 | /// 15 | public class SchemaCollection : ObjectCollection<Uri> 16 | { 17 | /// 18 | /// Gets the schema uri and entity name for each schema uri 19 | /// 20 | public IEnumerable<SchemaEntityInfo> EntitiesSpec => this.Select(uri => uri.GetSchemaEntityInfo()); 21 | 22 | /// 23 | /// Validate well formed unique schema set 24 | /// 25 | internal void Validate() 26 | { 27 | var set = new HashSet<string>(this.Where(u => u.IsSchemaUri()).Select(uri => uri.GetSchemaEntityInfo().ToString()), StringComparer.OrdinalIgnoreCase); 28 | if (set.Count != this.Count) 29 | { 30 | throw new InvalidDataException( 31 | $"schema collection contains non unique or invalid format schema items. 
Schema count = {this.Count}, valid, unique schema count = {set.Count}"); 32 | } 33 | } 34 | 35 | /// 36 | /// Check if the provided schema collection is logically equivalent to this one 37 | /// 38 | /// The schema collection to compare 39 | /// bool 40 | internal bool LogicallyEquivalent(SchemaCollection schemas) 41 | { 42 | if (schemas == null) 43 | { 44 | return false; 45 | } 46 | 47 | var set = new HashSet<string>(this.Select(uri => HttpUtility.UrlDecode(uri.AbsoluteUri)), StringComparer.OrdinalIgnoreCase); 48 | return set.SetEquals(schemas.Select(uri => HttpUtility.UrlDecode(uri.AbsoluteUri))); 49 | } 50 | } 51 | } 52 | -------------------------------------------------------------------------------- /CDM/dotnet/Model/SchemaEntityInfo.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | /// 7 | /// Entity information stored in a schema 8 | /// 9 | public class SchemaEntityInfo 10 | { 11 | /// 12 | /// Gets or sets the entity name 13 | /// 14 | public string EntityName { get; set; } 15 | 16 | /// 17 | /// Gets or sets the entity version 18 | /// 19 | public string EntityVersion { get; set; } 20 | 21 | /// 22 | /// Gets or sets the entity namespace 23 | /// 24 | public string EntityNamespace { get; set; } 25 | 26 | /// 27 | public override string ToString() 28 | { 29 | return string.IsNullOrEmpty(this.EntityNamespace) ? 30 | $"{this.EntityName}.{this.EntityVersion}.json" : 31 | $"{this.EntityNamespace}/{this.EntityName}.{this.EntityVersion}.json"; 32 | } 33 | } 34 | } 35 | -------------------------------------------------------------------------------- /CDM/dotnet/Model/SingleKeyRelationship.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 
2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.IO; 8 | using System.Linq; 9 | using Newtonsoft.Json; 10 | 11 | /// 12 | /// SingleKeyRelationship 13 | /// 14 | [JsonObject(MemberSerialization.OptIn)] 15 | public class SingleKeyRelationship : Relationship 16 | { 17 | /// 18 | /// Gets or sets the FromAttribute 19 | /// 20 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 21 | public AttributeReference FromAttribute { get; set; } 22 | 23 | /// 24 | /// Gets or sets the ToAttribute 25 | /// 26 | [JsonProperty(NullValueHandling = NullValueHandling.Ignore)] 27 | public AttributeReference ToAttribute { get; set; } 28 | 29 | /// 30 | internal override void Validate(bool allowUnresolvedModelReferences = true) 31 | { 32 | base.Validate(allowUnresolvedModelReferences); 33 | 34 | if (this.FromAttribute == null) 35 | { 36 | throw new InvalidDataException($"'{nameof(SingleKeyRelationship)}' - '{nameof(this.FromAttribute)}' is not set."); 37 | } 38 | 39 | if (this.ToAttribute == null) 40 | { 41 | throw new InvalidDataException($"'{nameof(SingleKeyRelationship)}' - '{nameof(this.ToAttribute)}' is not set."); 42 | } 43 | 44 | this.FromAttribute.Validate(); 45 | this.ToAttribute.Validate(); 46 | 47 | if (ReferenceEquals(this.FromAttribute, this.ToAttribute)) 48 | { 49 | throw new InvalidDataException($"'{nameof(SingleKeyRelationship)}' must exist between different attribute references."); 50 | } 51 | } 52 | } 53 | } -------------------------------------------------------------------------------- /CDM/dotnet/Model/UriExtensions.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries 5 | { 6 | using System; 7 | using System.IO; 8 | using System.Text.RegularExpressions; 9 | using System.Web; 10 | 11 | /// 12 | /// Reference Model Extensions 13 | /// 14 | internal static class UriExtensions 15 | { 16 | private const string CdmStandardRepoHost = "raw.githubusercontent.com"; 17 | private const string CdmStandardRepoPath = "microsoft/cdm/master/schemadocuments/core/applicationcommon/"; 18 | private static readonly Regex EntityInfoRegex = new Regex(@"^(?<name>[^.]*)\.(?<version>[0-9.]+)(\.cdm){0,1}\.json$", RegexOptions.Compiled); 19 | 20 | /// 21 | /// Utility method to make the model location canonical ( removes query string ) 22 | /// 23 | /// Reference Model Location 24 | /// Reference Model Location with query string removed 25 | public static Uri CanonicalizeModelLocation(this Uri location) 26 | { 27 | string absoluteUri = HttpUtility.UrlDecode(location.AbsoluteUri ?? string.Empty); 28 | 29 | if (absoluteUri.IndexOf(Model.ModelFileName) == -1) 30 | { 31 | throw new InvalidDataException( 32 | $"CanonicalizeModelLocation: Invalid model location. Received location absolute uri:'{absoluteUri}'. 
The location should point to {Model.ModelFileName}."); 33 | } 34 | 35 | return new Uri(absoluteUri.Substring(0, absoluteUri.IndexOf(Model.ModelFileName) + Model.ModelFileName.Length)); 36 | } 37 | 38 | /// 39 | /// Checks if the provided uri matches a model schema pattern 40 | /// 41 | /// The uri to check 42 | /// bool 43 | public static bool IsSchemaUri(this Uri uri) 44 | { 45 | if (uri == null) 46 | { 47 | return false; 48 | } 49 | 50 | string path = HttpUtility.UrlDecode(uri.AbsolutePath).Trim('/'); 51 | return 52 | path.EndsWith(Model.ModelFileExtension) && 53 | string.IsNullOrEmpty(uri.Query) && 54 | CdmStandardRepoHost.Equals(HttpUtility.UrlDecode(uri.Host), StringComparison.OrdinalIgnoreCase) && 55 | path.StartsWith(CdmStandardRepoPath, StringComparison.OrdinalIgnoreCase) && 56 | uri.GetSchemaEntityInfo() != null; 57 | } 58 | 59 | /// 60 | /// Get the entity name from a schema Uri 61 | /// 62 | /// the schema Uri 63 | /// The entity information or null 64 | public static SchemaEntityInfo GetSchemaEntityInfo(this Uri uri) 65 | { 66 | if (uri == null || string.IsNullOrEmpty(uri.AbsolutePath)) 67 | { 68 | return null; 69 | } 70 | 71 | string fullEntityInfo = HttpUtility.UrlDecode(uri.AbsolutePath).Trim('/'); 72 | int endOfPrefix = fullEntityInfo.IndexOf(CdmStandardRepoPath, StringComparison.OrdinalIgnoreCase); 73 | if (endOfPrefix == 0) 74 | { 75 | fullEntityInfo = fullEntityInfo.Substring(CdmStandardRepoPath.Length); 76 | } 77 | 78 | int endOfPath = fullEntityInfo.LastIndexOf('/'); 79 | string path = endOfPath < 0 ? string.Empty : fullEntityInfo.Substring(0, endOfPath); 80 | string entityInfo = endOfPath < 1 ? fullEntityInfo : fullEntityInfo.Substring(endOfPath + 1); 81 | Match match = EntityInfoRegex.Match(entityInfo); 82 | return !match.Success 83 | ? 
null 84 | : new SchemaEntityInfo 85 | { 86 | EntityName = match.Groups["name"].Value, 87 | EntityVersion = match.Groups["version"].Value, 88 | EntityNamespace = path, 89 | }; 90 | } 91 | } 92 | } -------------------------------------------------------------------------------- /CDM/dotnet/README.md: -------------------------------------------------------------------------------- 1 | # CDM Folders sample - .NET libraries 2 | 3 | This project contains sample .NET libraries to read and write a model.json file in a CDM folder. 4 | 5 | We'll continue to extend the capabilities of the libraries, please submit your suggestions as issues in the repo. -------------------------------------------------------------------------------- /CDM/dotnet/SerializationHelpers/CollectionsContractResolver.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries.SerializationHelpers 5 | { 6 | using System; 7 | using System.Collections; 8 | using System.Linq; 9 | using System.Reflection; 10 | using Newtonsoft.Json; 11 | using Newtonsoft.Json.Serialization; 12 | 13 | /// 14 | /// This resolver overrides the property serialization so that 15 | /// the property that is an empty collection is not serialized 16 | /// 17 | public class CollectionsContractResolver : CamelCasePropertyNamesContractResolver 18 | { 19 | /// 20 | protected override JsonDictionaryContract CreateDictionaryContract(Type objectType) 21 | { 22 | JsonDictionaryContract contract = base.CreateDictionaryContract(objectType); 23 | 24 | contract.DictionaryKeyResolver = propertyName => propertyName; 25 | 26 | return contract; 27 | } 28 | 29 | /// 30 | protected override JsonProperty CreateProperty(MemberInfo member, MemberSerialization memberSerialization) 31 | { 32 | JsonProperty property = base.CreateProperty(member, memberSerialization); 33 | 
Predicate<object> shouldSerializePredicates = property.ShouldSerialize; 34 | 35 | if (property.PropertyType != typeof(string) && property.PropertyType.GetInterfaces().Contains(typeof(IEnumerable))) 36 | { 37 | property.ShouldSerialize = sourceObj => (shouldSerializePredicates == null || shouldSerializePredicates(sourceObj)) 38 | && this.ShouldSerializeCollection(property, sourceObj); 39 | } 40 | 41 | return property; 42 | } 43 | 44 | private bool ShouldSerializeCollection(JsonProperty property, object sourceObj) 45 | { 46 | var enumerable = property.ValueProvider.GetValue(sourceObj) as IEnumerable; 47 | if (enumerable != null) 48 | { 49 | return enumerable.GetEnumerator().MoveNext(); 50 | } 51 | 52 | return false; 53 | } 54 | } 55 | } 56 | -------------------------------------------------------------------------------- /CDM/dotnet/SerializationHelpers/SerializationOrderConstants.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries.SerializationHelpers 5 | { 6 | /// 7 | /// SerializationOrderConstants 8 | /// 9 | internal static class SerializationOrderConstants 10 | { 11 | /// 12 | /// DataObjectSerializationOrder 13 | /// 14 | public const int DataObjectSerializationOrder = -2; 15 | 16 | /// 17 | /// MetadataObjectSerializationOrder 18 | /// 19 | public const int MetadataObjectSerializationOrder = -3; 20 | 21 | /// 22 | /// ObjectSerializationOrder 23 | /// 24 | public const int ObjectSerializationOrder = 10; 25 | 26 | /// 27 | /// CollectionSerializationOrder 28 | /// 29 | public const int CollectionSerializationOrder = 20; 30 | } 31 | } 32 | -------------------------------------------------------------------------------- /CDM/dotnet/SerializationHelpers/StringEnumCamelCaseConverter.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries.SerializationHelpers 5 | { 6 | using Newtonsoft.Json.Converters; 7 | 8 | /// 9 | /// StringEnumCamelCaseConverter 10 | /// 11 | public class StringEnumCamelCaseConverter : StringEnumConverter 12 | { 13 | /// 14 | /// Initializes a new instance of the StringEnumCamelCaseConverter class. 15 | /// 16 | public StringEnumCamelCaseConverter() 17 | { 18 | this.CamelCaseText = true; 19 | } 20 | } 21 | } 22 | -------------------------------------------------------------------------------- /CDM/dotnet/SerializationHelpers/TypeNameSerializationBinder.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries.SerializationHelpers 5 | { 6 | using System; 7 | using Newtonsoft.Json.Serialization; 8 | 9 | /// 10 | /// TypeNameSerializationBinder 11 | /// 12 | public class TypeNameSerializationBinder : ISerializationBinder 13 | { 14 | /// 15 | public void BindToName(Type serializedType, out string assemblyName, out string typeName) 16 | { 17 | TypeNameSerializationBinderHelper.BindToName(serializedType, out assemblyName, out typeName); 18 | } 19 | 20 | /// 21 | public Type BindToType(string assemblyName, string typeName) 22 | { 23 | return TypeNameSerializationBinderHelper.BindToType(assemblyName, typeName); 24 | } 25 | } 26 | } 27 | -------------------------------------------------------------------------------- /CDM/dotnet/SerializationHelpers/TypeNameSerializationBinderHelper.cs: -------------------------------------------------------------------------------- 1 | // Copyright (c) Microsoft Corporation. All rights reserved. 2 | // Licensed under the MIT License. 
3 | 4 | namespace Microsoft.CdmFolders.SampleLibraries.SerializationHelpers 5 | { 6 | using System; 7 | 8 | /// 9 | /// TypeNameSerializationBinderHelper 10 | /// 11 | public class TypeNameSerializationBinderHelper 12 | { 13 | /// 14 | /// BindToName 15 | /// 16 | /// serializedType 17 | /// assemblyName 18 | /// typeName 19 | public static void BindToName(Type serializedType, out string assemblyName, out string typeName) 20 | { 21 | assemblyName = null; 22 | typeName = serializedType.Name; 23 | } 24 | 25 | /// 26 | /// BindToType 27 | /// 28 | /// assemblyName 29 | /// typeName 30 | /// Type 31 | public static Type BindToType(string assemblyName, string typeName) 32 | { 33 | string resolvedTypeName = $"{typeof(MetadataObject).Namespace}.{typeName}, {typeof(MetadataObject).Assembly}"; 34 | return Type.GetType(resolvedTypeName, true); 35 | } 36 | } 37 | } 38 | -------------------------------------------------------------------------------- /CDM/python/CdmModel.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Microsoft Corporation. All rights reserved. 2 | # Licensed under the MIT License. 
3 | 4 | # Tested with Python 3.4 5 | import json 6 | import re 7 | from enum import Enum 8 | from collections import OrderedDict 9 | 10 | def getattrIgnoreCase(obj, attr, default=None): 11 | for i in dir(obj): 12 | if i.lower() == attr.lower(): 13 | return getattr(obj, i, default) # look up the attribute name that actually matched 14 | return default 15 | 16 | 17 | class SchemaEntry(object): 18 | __unassigned = object() 19 | 20 | def __init__(self, name, cls, defaultValue = None, verbose = False): 21 | self.name = name 22 | self.cls = cls 23 | if defaultValue is None and issubclass(cls, list): 24 | defaultValue = cls() 25 | self.defaultValue = defaultValue 26 | self.verbose = verbose 27 | 28 | def shouldSerialize(self, value): 29 | if self.verbose: 30 | return True 31 | if issubclass(self.cls, list): 32 | return len(value) > 0 33 | return self.defaultValue != value 34 | 35 | 36 | class PolymorphicMeta(type): 37 | classes = {} 38 | 39 | def __new__(cls, name, bases, attrs): 40 | cls = type.__new__(cls, name, bases, attrs) 41 | cls.classes[cls] = {cls.__name__ : cls} # TODO: abstract? 
42 | cls.__appendBases(bases, cls) 43 | return cls 44 | 45 | @staticmethod 46 | def __appendBases(bases, cls): 47 | for base in bases: 48 | basemap = cls.classes.get(base, None) 49 | if basemap is not None: 50 | basemap[cls.__name__] = cls 51 | cls.__appendBases(base.__bases__, cls) 52 | 53 | class Polymorphic(metaclass=PolymorphicMeta): 54 | @classmethod 55 | def fromJson(cls, value): 56 | actualClass = PolymorphicMeta.classes[cls][value["$type"]] 57 | return super(Polymorphic, actualClass).fromJson(value) 58 | 59 | class Base(object): 60 | __ctors = {} 61 | schema = () 62 | 63 | def __init__(self): 64 | for entry in self.schema: 65 | setattr(self, entry.name, entry.defaultValue) 66 | 67 | @classmethod 68 | def fromJson(cls, value): 69 | result = cls() 70 | for entry in cls.schema: 71 | element = value.pop(entry.name, result) 72 | if element != result: 73 | setattr(result, entry.name, cls.__getCtor(entry.cls)(element)) 74 | result.customProperties = value 75 | return result 76 | 77 | @classmethod 78 | def __getCtor(cls, type): 79 | ctor = cls.__ctors.get(type, None) 80 | if not ctor: 81 | ctor = getattr(type, "fromJson", type) 82 | cls.__ctors[type] = ctor 83 | return ctor 84 | 85 | def validate(self): 86 | tmp = object() 87 | className = self.__class__.__name__ 88 | for entry in self.schema: 89 | element = getattrIgnoreCase(self, entry.name, tmp) 90 | if element != tmp and element is not None: 91 | if not isinstance(element, entry.cls): 92 | raise TypeError("%s.%s must be of type %s" % (className, entry.name, entry.cls)) 93 | getattr(element, "validate", lambda: None)() 94 | 95 | def toJson(self): 96 | result = OrderedDict() 97 | if isinstance(self, Polymorphic): 98 | result["$type"] = self.__class__.__name__ 99 | for entry in self.schema: 100 | element = getattrIgnoreCase(self, entry.name, result) 101 | if element != result and entry.shouldSerialize(element): 102 | result[entry.name] = getattr(element, "toJson", lambda: element)() 103 | 
result.update(getattrIgnoreCase(self, "customProperties", {})) 104 | return result 105 | 106 | class ObjectCollection(list, Base): 107 | def append(self, item): 108 | if not isinstance(item, self.itemType): 109 | raise TypeError("item is not of type %s" % self.itemType) 110 | super(ObjectCollection, self).append(item) 111 | 112 | @classmethod 113 | def fromJson(cls, value): 114 | result = cls() 115 | ctor = getattr(cls.itemType, "fromJson", cls.itemType) 116 | for item in value: 117 | super(ObjectCollection, result).append(ctor(item)) 118 | return result 119 | 120 | def toJson(self): 121 | result = [] 122 | for item in self: 123 | result.append(getattr(item, "toJson", lambda: item)()) 124 | return result 125 | 126 | def validate(self): 127 | for item in self: 128 | item.validate() 129 | 130 | 131 | String = str 132 | Uri = str 133 | DateTimeOffset = str 134 | 135 | class JsonEnum(Enum): 136 | def toJson(self): 137 | return self.value 138 | 139 | class CsvQuoteStyle(JsonEnum): 140 | Csv = "QuoteStyle.Csv" 141 | None_ = "QuoteStyle.None" 142 | 143 | class CsvStyle(JsonEnum): 144 | QuoteAlways = "CsvStyle.QuoteAlways" 145 | QuoteAfterDelimiter = "CsvStyle.QuoteAfterDelimiter" 146 | 147 | class DataType(JsonEnum): 148 | # TODO: Fix autogeneration 149 | Unclassified = "unclassified" 150 | String = "string" 151 | Int64 = "int64" 152 | Double = "double" 153 | DateTime = "dateTime" 154 | DateTimeOffset = "dateTimeOffset" 155 | Decimal = "decimal" 156 | Boolean = "boolean" 157 | Guid = "guid" 158 | Json = "json" 159 | 160 | class Annotation(Base): 161 | schema = Base.schema + ( 162 | SchemaEntry("name", String), 163 | SchemaEntry("value", String) 164 | ) 165 | 166 | def validate(self): 167 | super().validate() 168 | className = self.__class__.__name__ 169 | if not self.name: 170 | raise ValueError("%s.name is not set." 
% (className, )) 171 | 172 | class AnnotationCollection(ObjectCollection): 173 | itemType = Annotation 174 | 175 | class MetadataObject(Base): 176 | schema = Base.schema + ( 177 | SchemaEntry("name", String), 178 | SchemaEntry("description", String), 179 | SchemaEntry("annotations", AnnotationCollection) 180 | ) 181 | 182 | nameLengthMin = 1 183 | nameLengthMax = 256 184 | invalidNameRegex = re.compile("^\\s|\\s$") 185 | descriptionLengthMax = 4000 186 | 187 | def __repr__(self): 188 | name = getattr(self, "name", None) 189 | className = self.__class__.__name__ 190 | if name: 191 | return "<%s '%s'>" % (className, name) 192 | else: 193 | return "<%s>" % (className, ) 194 | 195 | def validate(self): 196 | super().validate() 197 | className = self.__class__.__name__ 198 | if self.name is not None: 199 | if len(self.name) > self.nameLengthMax or len(self.name) < self.nameLengthMin: 200 | raise ValueError("Length of %s.name (%d) is not between %d and %d." % (className, len(self.name), self.nameLengthMin, self.nameLengthMax)) 201 | if self.invalidNameRegex.search(self.name): 202 | raise ValueError("%s.name cannot contain leading or trailing blank spaces or consist only of whitespace." % (className, )) 203 | if self.description is not None and len(self.description) > self.descriptionLengthMax: 204 | raise ValueError("Length of %s.description (%d) may not exceed %d." 
% (className, len(self.description), self.descriptionLengthMax)) 205 | 206 | class MetadataObjectCollection(ObjectCollection): 207 | def __getitem__(self, index): 208 | if type(index) == str: 209 | index = next((i for i,item in enumerate(self) if item.name.lower() == index.lower()), None) 210 | if index is None: 211 | return None 212 | return super(MetadataObjectCollection, self).__getitem__(index) 213 | 214 | def validate(self): 215 | super().validate() 216 | className = self.__class__.__name__ 217 | s = set() 218 | for item in self: 219 | if item.name != None and item.name in s: 220 | raise ValueError("%s contains non-unique item name '%s'" % (className, item.name)) 221 | s.add(item.name) 222 | 223 | class DataObject(MetadataObject): 224 | schema = MetadataObject.schema + ( 225 | SchemaEntry("isHidden", bool, False), 226 | ) 227 | 228 | def validate(self): 229 | super().validate() 230 | className = self.__class__.__name__ 231 | if self.name is None: 232 | raise ValueError("%s.name is not set" % (className, )) 233 | 234 | class SchemaCollection(ObjectCollection): 235 | itemType = Uri 236 | 237 | class Reference(Base): 238 | schema = Base.schema + ( 239 | SchemaEntry("id", String), 240 | SchemaEntry("location", Uri) 241 | ) 242 | 243 | class ReferenceCollection(ObjectCollection): 244 | itemType = Reference 245 | 246 | class AttributeReference(Base): 247 | schema = Base.schema + ( 248 | SchemaEntry("entityName", String), 249 | SchemaEntry("attributeName", String) 250 | ) 251 | 252 | def __eq__(self, other): 253 | return isinstance(other, self.__class__) and self.entityName == other.entityName and self.attributeName == other.attributeName 254 | 255 | def __ne__(self, other): 256 | return not self.__eq__(other) 257 | 258 | def validate(self): 259 | super().validate() 260 | className = self.__class__.__name__ 261 | if not self.entityName: 262 | raise ValueError("%s.entityName is not set" % (className, )) 263 | if not self.attributeName: 264 | raise 
ValueError("%s.attributeName is not set" % (className, )) 265 | 266 | class Relationship(Polymorphic, Base): 267 | pass 268 | 269 | class SingleKeyRelationship(Relationship): 270 | schema = Relationship.schema + ( 271 | SchemaEntry("fromAttribute", AttributeReference), 272 | SchemaEntry("toAttribute", AttributeReference) 273 | ) 274 | 275 | def validate(self): 276 | super().validate() 277 | className = self.__class__.__name__ 278 | if self.fromAttribute is None: 279 | raise ValueError("%s.fromAttribute is not set" % (className, )) 280 | if self.toAttribute is None: 281 | raise ValueError("%s.toAttribute is not set" % (className, )) 282 | if self.fromAttribute == self.toAttribute: 283 | raise ValueError("%s must exist between different attribute references" % (className, )) 284 | 285 | class RelationshipCollection(ObjectCollection): 286 | itemType = Relationship 287 | 288 | class FileFormatSettings(Polymorphic, Base): 289 | pass 290 | 291 | class CsvFormatSettings(FileFormatSettings): 292 | schema = FileFormatSettings.schema + ( 293 | SchemaEntry("columnHeaders", bool, False), 294 | SchemaEntry("delimiter", String, ","), 295 | SchemaEntry("quoteStyle", CsvQuoteStyle, CsvQuoteStyle.Csv), 296 | SchemaEntry("csvStyle", CsvStyle, CsvStyle.QuoteAlways), 297 | ) 298 | 299 | class Partition(DataObject): 300 | schema = DataObject.schema + ( 301 | SchemaEntry("refreshTime", DateTimeOffset), 302 | SchemaEntry("location", Uri), 303 | SchemaEntry("fileFormatSettings", FileFormatSettings) 304 | ) 305 | 306 | class PartitionCollection(MetadataObjectCollection): 307 | itemType = Partition 308 | 309 | class Attribute(MetadataObject): 310 | schema = MetadataObject.schema + ( 311 | SchemaEntry("dataType", DataType), 312 | ) 313 | 314 | def __repr__(self): 315 | return "<[%s]>" % (getattr(self, "name", "(unnamed)"), ) 316 | 317 | class AttributeCollection(MetadataObjectCollection): 318 | itemType = Attribute 319 | 320 | class Entity(Polymorphic, DataObject): 321 | 
invalidEntityNameRegex = re.compile("\\.|\"") 322 | 323 | def validate(self): 324 | super().validate() 325 | if self.invalidEntityNameRegex.search(self.name): 326 | raise ValueError("%s.name cannot contain dot or quotation mark." % (self.__class__.__name__, )) 327 | 328 | class LocalEntity(Entity): 329 | schema = Entity.schema + ( 330 | SchemaEntry("schemas", SchemaCollection), 331 | SchemaEntry("attributes", AttributeCollection), 332 | SchemaEntry("partitions", PartitionCollection) 333 | ) 334 | 335 | class ReferenceEntity(Entity): 336 | schema = Entity.schema + ( 337 | SchemaEntry("refreshTime", DateTimeOffset), 338 | SchemaEntry("source", String), 339 | SchemaEntry("modelId", String) 340 | ) 341 | 342 | def validate(self): 343 | super().validate() 344 | className = self.__class__.__name__ 345 | if not self.source: 346 | raise ValueError("%s.source is not set." % (className, )) 347 | if not self.modelId: 348 | raise ValueError("%s.modelId is not set." % (className, )) 349 | 350 | # TODO: Validate model references 351 | 352 | class EntityCollection(MetadataObjectCollection): 353 | itemType = Entity 354 | 355 | class Model(DataObject): 356 | schema = DataObject.schema + ( 357 | SchemaEntry("application", String), 358 | SchemaEntry("version", String), 359 | SchemaEntry("modifiedTime", DateTimeOffset), 360 | SchemaEntry("culture", String), 361 | SchemaEntry("referenceModels", ReferenceCollection), 362 | SchemaEntry("entities", EntityCollection, verbose=True), 363 | SchemaEntry("relationships", RelationshipCollection) 364 | ) 365 | 366 | currentSchemaVersion = "1.0" 367 | 368 | def __init__(self, name = None): 369 | super().__init__() 370 | self.name = name 371 | self.version = self.currentSchemaVersion 372 | 373 | @classmethod 374 | def fromJson(cls, value): 375 | if isinstance(value, str): 376 | value = json.loads(value) 377 | elif not isinstance(value, dict): 378 | value = json.load(value) 379 | return super(Model, cls).fromJson(value) 380 | 381 | def toJson(self): 
382 | return json.dumps(super().toJson()) 383 | 384 | def validate(self, allowUnresolvedModelReferences = True): 385 | super().validate() 386 | if self.version != self.currentSchemaVersion: 387 | raise ValueError("Invalid model version '%s'" % (self.version, )) 388 | if not allowUnresolvedModelReferences: 389 | for entity in self.entities: 390 | if isinstance(entity, ReferenceEntity): 391 | found = next((model for model in self.referenceModels if model.id == entity.modelId), None) 392 | if found is None: 393 | raise ValueError("ReferenceEntity '%s' doesn't have a reference model" % (entity.name, )) 394 | -------------------------------------------------------------------------------- /CDM/python/README.md: -------------------------------------------------------------------------------- 1 | # CDM Folders sample - Python libraries 2 | 3 | This folder contains sample Python libraries to read and write a model.json file in a CDM folder. This was tested with Python 3.4. 4 | 5 | We'll continue to extend the capabilities of the libraries; please submit your suggestions as issues in the repo. -------------------------------------------------------------------------------- /CDM/schema/README.md: -------------------------------------------------------------------------------- 1 | # CDM Folders sample - model.json schema 2 | 3 | This folder includes the JSON schema for the model.json file. The detailed documentation can be found here: 4 | https://docs.microsoft.com/en-us/common-data-model/model-json 5 | 6 | There are also a few examples of model.json files created by Power BI dataflows. We'll continue to add more examples; please submit your suggestions as issues in the repo. 
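As a rough illustration of how a model.json file is laid out, the sketch below parses a trimmed-down document with Python's standard `json` module and lists each entity with its `$type` and attribute names. The sample content, the `summarize_model` helper, and the `example-model-id` value are hypothetical and exist only for this illustration; they are not part of the sample libraries.

```python
import json

# Hypothetical, trimmed-down model.json content used only for illustration.
SAMPLE_MODEL_JSON = """
{
  "name": "OrdersProducts",
  "version": "1.0",
  "entities": [
    {
      "$type": "LocalEntity",
      "name": "Orders",
      "attributes": [
        {"name": "OrderID", "dataType": "int64"},
        {"name": "CustomerID", "dataType": "string"}
      ]
    },
    {
      "$type": "ReferenceEntity",
      "name": "Products",
      "source": "Products",
      "modelId": "example-model-id"
    }
  ]
}
"""


def summarize_model(model):
    """Map each entity name to (entity $type, list of attribute names)."""
    summary = {}
    for entity in model.get("entities", []):
        # Reference entities carry no attributes of their own, so default to [].
        attributes = [a["name"] for a in entity.get("attributes", [])]
        summary[entity["name"]] = (entity.get("$type"), attributes)
    return summary


model = json.loads(SAMPLE_MODEL_JSON)
for name, (entity_type, attributes) in summarize_model(model).items():
    print(name, entity_type, attributes)
```

A plain dictionary walk like this is enough for quick inspection; the Python sample library in this repo adds typed classes and validation on top of the same structure.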
7 | -------------------------------------------------------------------------------- /CDM/schema/examples/OrdersProducts/model.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "OrdersProductsV3", 3 | "description": "", 4 | "version": "1.0", 5 | "pbi:mashup": { 6 | "fastCombine": true, 7 | "allowNativeQueries": false, 8 | "queriesMetadata": { 9 | "Orders": { 10 | "queryId": "fa43f66e-e3c7-4e30-ad38-3eaf410267ae", 11 | "queryName": "Orders", 12 | "loadEnabled": true 13 | }, 14 | "Products": { 15 | "queryId": "51638cb0-67d6-4b52-9aad-07641462e007", 16 | "queryName": "Products", 17 | "loadEnabled": true 18 | } 19 | }, 20 | "document": "section Section1;\r\nshared Orders = let\r\n Source = OData.Feed(\"https://services.odata.org/Northwind/Northwind.svc/\"),\r\n #\"Navigation 1\" = Source{[Name = \"Orders\", Signature = \"table\"]}[Data],\r\n #\"Remove columns\" = Table.RemoveColumns(#\"Navigation 1\", Table.ColumnsOfType(#\"Navigation 1\", {type table, type record, type list, type nullable binary, type binary, type function}))\r\nin\r\n #\"Remove columns\";\r\nshared Products = let\r\n Source = OData.Feed(\"https://services.odata.org/Northwind/Northwind.svc/\"),\r\n #\"Navigation 1\" = Source{[Name = \"Products\", Signature = \"table\"]}[Data],\r\n #\"Mapped to standard\" = Cdm.MapToEntity(#\"Navigation 1\", {{\"productId\", \"ProductID\", type text}, {\"name\", \"ProductName\", type text}, {\"supplierName\", \"SupplierID\", type text}, {\"CategoryID\", \"CategoryID\"}, {\"size\", \"QuantityPerUnit\", type text}, {\"price\", \"UnitPrice\", type number}, {\"UnitsInStock\", \"UnitsInStock\"}, {\"UnitsOnOrder\", \"UnitsOnOrder\"}, {\"ReorderLevel\", \"ReorderLevel\"}, {\"Discontinued\", \"Discontinued\"}, {\"Category\", \"Category\"}, {\"Order_Details\", \"Order_Details\"}, {\"Supplier\", \"Supplier\"}, {\"createdOn\", null, type datetime}, {\"createdBy\", null, type text}, {\"modifiedOn\", null, type datetime}, 
{\"modifiedBy\", null, type text}, {\"createdOnBehalfBy\", null, type text}, {\"modifiedOnBehalfBy\", null, type text}, {\"organizationId\", null, type text}, {\"versionNumber\", null, Int64.Type}, {\"importSequenceNumber\", null, Int64.Type}, {\"overriddenCreatedOn\", null, type datetime}, {\"timeZoneRuleVersionNumber\", null, Int64.Type}, {\"UTCConversionTimeZoneCode\", null, Int64.Type}, {\"processId\", null, type text}, {\"stageId\", null, type text}, {\"traversedPath\", null, type text}, {\"vendorID\", null, type text}, {\"validFromDate\", null, type datetime}, {\"validToDate\", null, type datetime}, {\"currentCost\", null, type number}, {\"transactionCurrencyId\", null, type text}, {\"exchangeRate\", null, type number}, {\"currentCostBase\", null, type number}, {\"defaultUoMId\", null, type text}, {\"defaultUoMScheduleId\", null, type text}, {\"description\", null, type text}, {\"isKit\", null, type logical}, {\"isStockItem\", null, type logical}, {\"parentProductId\", null, type text}, {\"priceBase\", null, type number}, {\"productStructure\", null, Int64.Type}, {\"productStructure_display\", null, type text}, {\"productNumber\", null, type text}, {\"productTypeCode\", null, Int64.Type}, {\"productTypeCode_display\", null, type text}, {\"productUrl\", null, type text}, {\"quantityDecimal\", null, Int64.Type}, {\"quantityOnHand\", null, type number}, {\"standardCost\", null, type number}, {\"standardCostBase\", null, type number}, {\"stateCode\", null, Int64.Type}, {\"stateCode_display\", null, type text}, {\"statusCode\", null, Int64.Type}, {\"statusCode_display\", null, type text}, {\"stockVolume\", null, type number}, {\"stockWeight\", null, type number}, {\"vendorName\", null, type text}, {\"vendorPartNumber\", null, type text}, {\"hierarchyPath\", null, type text}, {\"priceLevelId\", null, type text}, {\"subjectId\", null, type text}, {\"entityImageId\", null, type text}, {\"createdByExternalParty\", null, type text}, {\"modifiedByExternalParty\", null, 
type text}, {\"form\", null, type text}, {\"isBrand\", null, type logical}, {\"isOvertheCounter\", null, type logical}, {\"medicationCode\", null, type text}, {\"packageContainer\", null, type text}}, null, \"Product\"),\r\n #\"Remove columns\" = Table.RemoveColumns(#\"Mapped to standard\", Table.ColumnsOfType(#\"Mapped to standard\", {type table, type record, type list, type nullable binary, type binary, type function}))\r\nin\r\n #\"Remove columns\";\r\n" 21 | }, 22 | "entities": [ 23 | { 24 | "$type": "LocalEntity", 25 | "name": "Orders", 26 | "description": "", 27 | "pbi:refreshPolicy": { 28 | "$type": "FullRefreshPolicy", 29 | "location": "Orders.csv" 30 | }, 31 | "attributes": [ 32 | { 33 | "name": "OrderID", 34 | "dataType": "int64" 35 | }, 36 | { 37 | "name": "CustomerID", 38 | "dataType": "string" 39 | }, 40 | { 41 | "name": "EmployeeID", 42 | "dataType": "int64" 43 | }, 44 | { 45 | "name": "OrderDate", 46 | "dataType": "dateTime" 47 | }, 48 | { 49 | "name": "RequiredDate", 50 | "dataType": "dateTime" 51 | }, 52 | { 53 | "name": "ShippedDate", 54 | "dataType": "dateTime" 55 | }, 56 | { 57 | "name": "ShipVia", 58 | "dataType": "int64" 59 | }, 60 | { 61 | "name": "Freight", 62 | "dataType": "decimal" 63 | }, 64 | { 65 | "name": "ShipName", 66 | "dataType": "string" 67 | }, 68 | { 69 | "name": "ShipAddress", 70 | "dataType": "string" 71 | }, 72 | { 73 | "name": "ShipCity", 74 | "dataType": "string" 75 | }, 76 | { 77 | "name": "ShipRegion", 78 | "dataType": "string" 79 | }, 80 | { 81 | "name": "ShipPostalCode", 82 | "dataType": "string" 83 | }, 84 | { 85 | "name": "ShipCountry", 86 | "dataType": "string" 87 | } 88 | ], 89 | "partitions": [ 90 | { 91 | "name": "Part001", 92 | "refreshTime": "2018-11-14T19:37:51.756186+00:00", 93 | "location": "https://dfmsitscuscdsa.blob.core.windows.net/0682aad0-37c2-4eac-a4d3-e243f7e0afc4/Orders.csv?snapshot=2018-11-14T19:37:51.7526830Z" 94 | } 95 | ] 96 | }, 97 | { 98 | "$type": "LocalEntity", 99 | "name": "Products", 100 | 
"description": "", 101 | "pbi:refreshPolicy": { 102 | "$type": "FullRefreshPolicy", 103 | "location": "Products.csv" 104 | }, 105 | "annotations": [ 106 | { 107 | "name": "pbi:MappingDisplayHint", 108 | "value": "Product" 109 | } 110 | ], 111 | "attributes": [ 112 | { 113 | "name": "productId", 114 | "dataType": "string" 115 | }, 116 | { 117 | "name": "name", 118 | "dataType": "string" 119 | }, 120 | { 121 | "name": "supplierName", 122 | "dataType": "string" 123 | }, 124 | { 125 | "name": "CategoryID", 126 | "dataType": "int64" 127 | }, 128 | { 129 | "name": "size", 130 | "dataType": "string" 131 | }, 132 | { 133 | "name": "price", 134 | "dataType": "double" 135 | }, 136 | { 137 | "name": "UnitsInStock", 138 | "dataType": "int64" 139 | }, 140 | { 141 | "name": "UnitsOnOrder", 142 | "dataType": "int64" 143 | }, 144 | { 145 | "name": "ReorderLevel", 146 | "dataType": "int64" 147 | }, 148 | { 149 | "name": "Discontinued", 150 | "dataType": "boolean" 151 | }, 152 | { 153 | "name": "createdOn", 154 | "dataType": "dateTime" 155 | }, 156 | { 157 | "name": "createdBy", 158 | "dataType": "string" 159 | }, 160 | { 161 | "name": "modifiedOn", 162 | "dataType": "dateTime" 163 | }, 164 | { 165 | "name": "modifiedBy", 166 | "dataType": "string" 167 | }, 168 | { 169 | "name": "createdOnBehalfBy", 170 | "dataType": "string" 171 | }, 172 | { 173 | "name": "modifiedOnBehalfBy", 174 | "dataType": "string" 175 | }, 176 | { 177 | "name": "organizationId", 178 | "dataType": "string" 179 | }, 180 | { 181 | "name": "versionNumber", 182 | "dataType": "int64" 183 | }, 184 | { 185 | "name": "importSequenceNumber", 186 | "dataType": "int64" 187 | }, 188 | { 189 | "name": "overriddenCreatedOn", 190 | "dataType": "dateTime" 191 | }, 192 | { 193 | "name": "timeZoneRuleVersionNumber", 194 | "dataType": "int64" 195 | }, 196 | { 197 | "name": "UTCConversionTimeZoneCode", 198 | "dataType": "int64" 199 | }, 200 | { 201 | "name": "processId", 202 | "dataType": "string" 203 | }, 204 | { 205 | "name": 
"stageId", 206 | "dataType": "string" 207 | }, 208 | { 209 | "name": "traversedPath", 210 | "dataType": "string" 211 | }, 212 | { 213 | "name": "vendorID", 214 | "dataType": "string" 215 | }, 216 | { 217 | "name": "validFromDate", 218 | "dataType": "dateTime" 219 | }, 220 | { 221 | "name": "validToDate", 222 | "dataType": "dateTime" 223 | }, 224 | { 225 | "name": "currentCost", 226 | "dataType": "double" 227 | }, 228 | { 229 | "name": "transactionCurrencyId", 230 | "dataType": "string" 231 | }, 232 | { 233 | "name": "exchangeRate", 234 | "dataType": "double" 235 | }, 236 | { 237 | "name": "currentCostBase", 238 | "dataType": "double" 239 | }, 240 | { 241 | "name": "defaultUoMId", 242 | "dataType": "string" 243 | }, 244 | { 245 | "name": "defaultUoMScheduleId", 246 | "dataType": "string" 247 | }, 248 | { 249 | "name": "description", 250 | "dataType": "string" 251 | }, 252 | { 253 | "name": "isKit", 254 | "dataType": "boolean" 255 | }, 256 | { 257 | "name": "isStockItem", 258 | "dataType": "boolean" 259 | }, 260 | { 261 | "name": "parentProductId", 262 | "dataType": "string" 263 | }, 264 | { 265 | "name": "priceBase", 266 | "dataType": "double" 267 | }, 268 | { 269 | "name": "productStructure", 270 | "dataType": "int64" 271 | }, 272 | { 273 | "name": "productStructure_display", 274 | "dataType": "string" 275 | }, 276 | { 277 | "name": "productNumber", 278 | "dataType": "string" 279 | }, 280 | { 281 | "name": "productTypeCode", 282 | "dataType": "int64" 283 | }, 284 | { 285 | "name": "productTypeCode_display", 286 | "dataType": "string" 287 | }, 288 | { 289 | "name": "productUrl", 290 | "dataType": "string" 291 | }, 292 | { 293 | "name": "quantityDecimal", 294 | "dataType": "int64" 295 | }, 296 | { 297 | "name": "quantityOnHand", 298 | "dataType": "double" 299 | }, 300 | { 301 | "name": "standardCost", 302 | "dataType": "double" 303 | }, 304 | { 305 | "name": "standardCostBase", 306 | "dataType": "double" 307 | }, 308 | { 309 | "name": "stateCode", 310 | "dataType": 
"int64" 311 | }, 312 | { 313 | "name": "stateCode_display", 314 | "dataType": "string" 315 | }, 316 | { 317 | "name": "statusCode", 318 | "dataType": "int64" 319 | }, 320 | { 321 | "name": "statusCode_display", 322 | "dataType": "string" 323 | }, 324 | { 325 | "name": "stockVolume", 326 | "dataType": "double" 327 | }, 328 | { 329 | "name": "stockWeight", 330 | "dataType": "double" 331 | }, 332 | { 333 | "name": "vendorName", 334 | "dataType": "string" 335 | }, 336 | { 337 | "name": "vendorPartNumber", 338 | "dataType": "string" 339 | }, 340 | { 341 | "name": "hierarchyPath", 342 | "dataType": "string" 343 | }, 344 | { 345 | "name": "priceLevelId", 346 | "dataType": "string" 347 | }, 348 | { 349 | "name": "subjectId", 350 | "dataType": "string" 351 | }, 352 | { 353 | "name": "entityImageId", 354 | "dataType": "string" 355 | }, 356 | { 357 | "name": "createdByExternalParty", 358 | "dataType": "string" 359 | }, 360 | { 361 | "name": "modifiedByExternalParty", 362 | "dataType": "string" 363 | }, 364 | { 365 | "name": "form", 366 | "dataType": "string" 367 | }, 368 | { 369 | "name": "isBrand", 370 | "dataType": "boolean" 371 | }, 372 | { 373 | "name": "isOvertheCounter", 374 | "dataType": "boolean" 375 | }, 376 | { 377 | "name": "medicationCode", 378 | "dataType": "string" 379 | }, 380 | { 381 | "name": "packageContainer", 382 | "dataType": "string" 383 | } 384 | ], 385 | "partitions": [ 386 | { 387 | "name": "Part001", 388 | "refreshTime": "2018-11-14T19:37:55.4375154+00:00", 389 | "location": "https://dfmsitscuscdsa.blob.core.windows.net/0682aad0-37c2-4eac-a4d3-e243f7e0afc4/Products.csv?snapshot=2018-11-14T19:37:55.4342726Z" 390 | } 391 | ], 392 | "schemas": [ 393 | "https://raw.githubusercontent.com/Microsoft/CDM/master/schemaDocuments/core/applicationCommon/foundationCommon/Product.0.7.cdm.json", 394 | 
"https://raw.githubusercontent.com/Microsoft/CDM/master/schemaDocuments/core/applicationCommon/foundationCommon/crmCommon/accelerators/healthCare/electronicMedicalRecords/Product.0.7.cdm.json" 395 | ] 396 | } 397 | ] 398 | } -------------------------------------------------------------------------------- /CDM/schema/examples/OrdersProductsCustomersLinked/model.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "OrdersProductsCustomersLinked", 3 | "description": "", 4 | "version": "1.0", 5 | "pbi:mashup": { 6 | "fastCombine": true, 7 | "allowNativeQueries": false, 8 | "queriesMetadata": { 9 | "Orders": { 10 | "queryId": "88bc570c-047d-460f-8eed-c0ef17649afd", 11 | "queryName": "Orders", 12 | "loadEnabled": true 13 | }, 14 | "Products": { 15 | "queryId": "f33e2de9-69f0-42db-adc9-a244aa544b0a", 16 | "queryName": "Products", 17 | "loadEnabled": true 18 | }, 19 | "Customers": { 20 | "queryId": "77ea2dfa-b868-4527-a6ee-c1da01a3b4c3", 21 | "queryName": "Customers", 22 | "loadEnabled": true 23 | } 24 | }, 25 | "document": "section Section1;\r\nshared Customers = let\r\n Source = OData.Feed(\"https://services.odata.org/V2/Northwind/Northwind.svc\"),\r\n Navigation = Source{[Name = \"Customers\", Signature = \"table\"]}[Data],\r\n #\"Remove columns\" = Table.RemoveColumns(Navigation, Table.ColumnsOfType(Navigation, {type table, type record, type list, type nullable binary, type binary, type function}))\r\nin\r\n #\"Remove columns\";\r\n" 26 | }, 27 | "entities": [ 28 | { 29 | "$type": "ReferenceEntity", 30 | "name": "Orders", 31 | "description": "", 32 | "source": "Orders", 33 | "modelId": "f19bbb97-c031-441a-8bd1-61b9181c0b83/1a7ef9c8-c7e8-45f8-9d8a-b80f8ffe4612", 34 | "annotations": [ 35 | { 36 | "name": "pbi:EntityTypeDisplayHint", 37 | "value": "LinkedEntity" 38 | } 39 | ] 40 | }, 41 | { 42 | "$type": "ReferenceEntity", 43 | "name": "Products", 44 | "description": "", 45 | "source": "Products", 46 | "modelId": 
"f19bbb97-c031-441a-8bd1-61b9181c0b83/1a7ef9c8-c7e8-45f8-9d8a-b80f8ffe4612", 47 | "annotations": [ 48 | { 49 | "name": "pbi:EntityTypeDisplayHint", 50 | "value": "LinkedEntity" 51 | } 52 | ] 53 | }, 54 | { 55 | "$type": "LocalEntity", 56 | "name": "Customers", 57 | "description": "", 58 | "pbi:refreshPolicy": { 59 | "$type": "FullRefreshPolicy", 60 | "location": "Customers.csv" 61 | }, 62 | "attributes": [ 63 | { 64 | "name": "CustomerID", 65 | "dataType": "string" 66 | }, 67 | { 68 | "name": "CompanyName", 69 | "dataType": "string" 70 | }, 71 | { 72 | "name": "ContactName", 73 | "dataType": "string" 74 | }, 75 | { 76 | "name": "ContactTitle", 77 | "dataType": "string" 78 | }, 79 | { 80 | "name": "Address", 81 | "dataType": "string" 82 | }, 83 | { 84 | "name": "City", 85 | "dataType": "string" 86 | }, 87 | { 88 | "name": "Region", 89 | "dataType": "string" 90 | }, 91 | { 92 | "name": "PostalCode", 93 | "dataType": "string" 94 | }, 95 | { 96 | "name": "Country", 97 | "dataType": "string" 98 | }, 99 | { 100 | "name": "Phone", 101 | "dataType": "string" 102 | }, 103 | { 104 | "name": "Fax", 105 | "dataType": "string" 106 | } 107 | ] 108 | } 109 | ], 110 | "referenceModels": [ 111 | { 112 | "id": "f19bbb97-c031-441a-8bd1-61b9181c0b83/1a7ef9c8-c7e8-45f8-9d8a-b80f8ffe4612", 113 | "location": "https://dfmsitscuscdsa.blob.core.windows.net/0682aad0-37c2-4eac-a4d3-e243f7e0afc4/model.json?snapshot=2018-11-14T19:37:55.5393471Z" 114 | } 115 | ] 116 | } -------------------------------------------------------------------------------- /CDM/schema/modeljsonschema.json: -------------------------------------------------------------------------------- 1 | { 2 | "definitions": { 3 | "annotation": { 4 | "type": "object", 5 | "properties": { 6 | "name": { 7 | "type": "string", 8 | "minLength": 1, 9 | "maxLength": 256, 10 | "pattern": "^[^\\s](.*[^\\s])?$" 11 | }, 12 | "value": { 13 | "type": "string" 14 | } 15 | }, 16 | "required": [ 17 | "name" 18 | ] 19 | }, 20 | "referenceModel": { 21 
| "type": "object", 22 | "properties": { 23 | "id": { 24 | "type": "string" 25 | }, 26 | "location": { 27 | "type": "string", 28 | "format": "uri" 29 | } 30 | }, 31 | "required": [ 32 | "id", 33 | "location" 34 | ] 35 | }, 36 | "entity": { 37 | "type": "object", 38 | "properties": { 39 | "$type": { 40 | "type": "string", 41 | "enum": [ 42 | "LocalEntity", 43 | "ReferenceEntity" 44 | ] 45 | }, 46 | "name": { 47 | "type": "string", 48 | "minLength": 1, 49 | "maxLength": 256, 50 | "pattern": "^[^\\s]([^.\"]*[^\\s])?$" 51 | }, 52 | "description": { 53 | "type": "string", 54 | "maxLength": 4000 55 | }, 56 | "annotations": { 57 | "type": "array", 58 | "items": { 59 | "$ref": "#/definitions/annotation" 60 | } 61 | }, 62 | "isHidden": { 63 | "$ref": "#/definitions/isHidden" 64 | } 65 | }, 66 | "required": [ 67 | "$type", 68 | "name" 69 | ] 70 | }, 71 | "localEntity": { 72 | "allOf": [ 73 | { 74 | "$ref": "#/definitions/entity" 75 | }, 76 | { 77 | "properties": { 78 | "$type": { 79 | "type": "string", 80 | "const": "LocalEntity" 81 | }, 82 | "attributes": { 83 | "$id": "#/properties/entities/items/properties/attributes", 84 | "type": "array", 85 | "items": { 86 | "$ref": "#/definitions/attribute" 87 | } 88 | }, 89 | "partitions": { 90 | "$id": "#/properties/entities/items/properties/partitions", 91 | "type": "array", 92 | "items": { 93 | "$ref": "#/definitions/partition" 94 | } 95 | }, 96 | "schemas": { 97 | "$id": "#/properties/entities/items/properties/schemas", 98 | "type": "array", 99 | "items": { 100 | "$id": "#/properties/entities/items/properties/schemas/items", 101 | "type": "string", 102 | "pattern": "^https://raw\\.githubusercontent\\.com/Microsoft/CDM/master/schemaDocuments/core/([a-zA-Z]+/?)*[a-zA-Z0-9.]+\\.[0-9.]+\\.cdm\\.json$", 103 | "examples": [ 104 | "https://raw.githubusercontent.com/Microsoft/CDM/master/schemaDocuments/core/applicationCommon/foundationCommon/Product.0.7.cdm.json", 105 | 
"https://raw.githubusercontent.com/Microsoft/CDM/master/schemaDocuments/core/applicationCommon/foundationCommon/crmCommon/accelerators/healthCare/electronicMedicalRecords/Product.0.7.cdm.json" 106 | ] 107 | } 108 | } 109 | }, 110 | "required": [ 111 | "$type", 112 | "attributes" 113 | ] 114 | } 115 | ] 116 | }, 117 | "referenceEntity": { 118 | "allOf": [ 119 | { 120 | "$ref": "#/definitions/entity" 121 | }, 122 | { 123 | "properties": { 124 | "$type": { 125 | "type": "string", 126 | "const": "ReferenceEntity" 127 | }, 128 | "source": { 129 | "type": "string" 130 | }, 131 | "modelId": { 132 | "type": "string" 133 | } 134 | }, 135 | "required": [ 136 | "$type", 137 | "source", 138 | "modelId" 139 | ] 140 | } 141 | ] 142 | }, 143 | "attribute": { 144 | "type": "object", 145 | "properties": { 146 | "name": { 147 | "type": "string", 148 | "minLength": 1, 149 | "maxLength": 256, 150 | "pattern": "^[^\\s](.*[^\\s])?$" 151 | }, 152 | "description": { 153 | "type": "string", 154 | "maxLength": 4000 155 | }, 156 | "dataType": { 157 | "type": "string", 158 | "enum": [ 159 | "unclassified", 160 | "string", 161 | "int64", 162 | "double", 163 | "dateTime", 164 | "dateTimeOffset", 165 | "decimal", 166 | "boolean", 167 | "guid", 168 | "json" 169 | ] 170 | }, 171 | "annotations": { 172 | "type": "array", 173 | "items": { 174 | "$ref": "#/definitions/annotation" 175 | } 176 | } 177 | }, 178 | "required": [ 179 | "name", 180 | "dataType" 181 | ] 182 | }, 183 | "partition": { 184 | "type": "object", 185 | "properties": { 186 | "name": { 187 | "type": "string", 188 | "minLength": 1, 189 | "maxLength": 256, 190 | "pattern": "^[^\\s](.*[^\\s])?$" 191 | }, 192 | "description": { 193 | "type": "string", 194 | "maxLength": 4000 195 | }, 196 | "refreshTime": { 197 | "type": "string" 198 | }, 199 | "location": { 200 | "type": "string", 201 | "format": "uri" 202 | }, 203 | "fileFormatSettings": { 204 | "$ref": "#/definitions/fileFormatSettings" 205 | }, 206 | "isHidden": { 207 | "$ref": 
"#/definitions/isHidden" 208 | } 209 | }, 210 | "required": [ 211 | "name" 212 | ] 213 | }, 214 | "isHidden": { 215 | "type": "boolean" 216 | }, 217 | "fileFormatSettings": { 218 | "oneOf": [ 219 | { 220 | "$ref": "#/definitions/CsvFormatSettings" 221 | } 222 | ], 223 | "required": [ 224 | "$type" 225 | ] 226 | }, 227 | "CsvFormatSettings": { 228 | "type": "object", 229 | "properties": { 230 | "$type": { 231 | "type": "string", 232 | "const": "CsvFormatSettings" 233 | }, 234 | "columnHeaders": { 235 | "type": "boolean", 236 | "default": false 237 | }, 238 | "delimiter": { 239 | "type": "string", 240 | "default": "," 241 | }, 242 | "quoteStyle": { 243 | "type": "string", 244 | "enum": [ 245 | "QuoteStyle.Csv", 246 | "QuoteStyle.None" 247 | ], 248 | "default": "QuoteStyle.Csv" 249 | }, 250 | "csvStyle": { 251 | "type": "string", 252 | "enum": [ 253 | "CsvStyle.QuoteAlways", 254 | "CsvStyle.QuoteAfterDelimiter" 255 | ], 256 | "default": "CsvStyle.QuoteAlways" 257 | } 258 | }, 259 | "required": [ 260 | "$type" 261 | ] 262 | }, 263 | "SingleKeyRelationship": { 264 | "type": "object", 265 | "properties": { 266 | "$type": { 267 | "type": "string", 268 | "const": "SingleKeyRelationship" 269 | }, 270 | "fromAttribute": { 271 | "$ref": "#/definitions/referenceAttribute" 272 | }, 273 | "toAttribute": { 274 | "$ref": "#/definitions/referenceAttribute" 275 | } 276 | }, 277 | "required": [ 278 | "$type", 279 | "fromAttribute", 280 | "toAttribute" 281 | ] 282 | }, 283 | "referenceAttribute": { 284 | "type": "object", 285 | "properties": { 286 | "entityName": { 287 | "type": "string", 288 | "minLength": 1, 289 | "maxLength": 1024, 290 | "pattern": "^[^\\s](.*[^\\s])?$" 291 | }, 292 | "attributeName": { 293 | "type": "string", 294 | "minLength": 1, 295 | "maxLength": 256, 296 | "pattern": "^[^\\s](.*[^\\s])?$" 297 | } 298 | }, 299 | "required": [ 300 | "entityName", 301 | "attributeName" 302 | ] 303 | } 304 | }, 305 | "$schema": "http://json-schema.org/draft-07/schema#", 306 | 
"$id": "http://example.com/root.json", 307 | "type": "object", 308 | "title": "The Root Schema", 309 | "required": [ 310 | "name", 311 | "version", 312 | "entities" 313 | ], 314 | "properties": { 315 | "application": { 316 | "$id": "#/properties/application", 317 | "type": "string" 318 | }, 319 | "name": { 320 | "$id": "#/properties/name", 321 | "type": "string", 322 | "minLength": 1, 323 | "maxLength": 256, 324 | "pattern": "^[^\\s](.*[^\\s])?$" 325 | }, 326 | "description": { 327 | "$id": "#/properties/description", 328 | "type": "string", 329 | "maxLength": 4000 330 | }, 331 | "version": { 332 | "$id": "#/properties/version", 333 | "type": "string", 334 | "enum": [ 335 | "1.0" 336 | ] 337 | }, 338 | "culture": { 339 | "$id": "#/properties/culture", 340 | "type": "string" 341 | }, 342 | "modifiedTime": { 343 | "$id": "#/properties/modifiedTime", 344 | "type": "string" 345 | }, 346 | "isHidden": { 347 | "$id": "#/properties/isHidden", 348 | "$ref": "#/definitions/isHidden" 349 | }, 350 | "annotations": { 351 | "$id": "#/properties/annotations", 352 | "type": "array", 353 | "items": { 354 | "$ref": "#/definitions/annotation" 355 | } 356 | }, 357 | "entities": { 358 | "$id": "#/properties/entities", 359 | "type": "array", 360 | "items": { 361 | "oneOf": [ 362 | { 363 | "$ref": "#/definitions/localEntity" 364 | }, 365 | { 366 | "$ref": "#/definitions/referenceEntity" 367 | } 368 | ] 369 | } 370 | }, 371 | "referenceModels": { 372 | "$id": "#/properties/referenceModels", 373 | "type": "array", 374 | "items": { 375 | "$ref": "#/definitions/referenceModel" 376 | } 377 | }, 378 | "relationships": { 379 | "$id": "#/properties/relationships", 380 | "type": "array", 381 | "items": { 382 | "oneOf": [ 383 | { 384 | "$ref": "#/definitions/SingleKeyRelationship" 385 | } 386 | ] 387 | } 388 | } 389 | } 390 | } -------------------------------------------------------------------------------- /CHANGELOG.md: 
-------------------------------------------------------------------------------- 1 | ## [project-title] Changelog 2 | 3 | 4 | # x.y.z (yyyy-mm-dd) 5 | 6 | *Features* 7 | * ... 8 | 9 | *Bug Fixes* 10 | * ... 11 | 12 | *Breaking Changes* 13 | * ... 14 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing to [project-title] 2 | 3 | This project welcomes contributions and suggestions. Most contributions require you to agree to a 4 | Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us 5 | the rights to use your contribution. For details, visit https://cla.microsoft.com. 6 | 7 | When you submit a pull request, a CLA-bot will automatically determine whether you need to provide 8 | a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions 9 | provided by the bot. You will only need to do this once across all repos using our CLA. 10 | 11 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 12 | For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or 13 | contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. 14 | 15 | - [Code of Conduct](#coc) 16 | - [Issues and Bugs](#issue) 17 | - [Feature Requests](#feature) 18 | - [Submission Guidelines](#submit) 19 | 20 | ## Code of Conduct 21 | Help us keep this project open and inclusive. Please read and follow our [Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 22 | 23 | ## Found an Issue? 24 | If you find a bug in the source code or a mistake in the documentation, you can help us by 25 | [submitting an issue](#submit-issue) to the GitHub Repository. 
Even better, you can 26 | [submit a Pull Request](#submit-pr) with a fix. 27 | 28 | ## Want a Feature? 29 | You can *request* a new feature by [submitting an issue](#submit-issue) to the GitHub 30 | Repository. If you would like to *implement* a new feature, please submit an issue with 31 | a proposal for your work first, to be sure that we can use it. 32 | 33 | * **Small Features** can be crafted and directly [submitted as a Pull Request](#submit-pr). 34 | 35 | ## Submission Guidelines 36 | 37 | ### Submitting an Issue 38 | Before you submit an issue, search the archive; your question may already have been answered. 39 | 40 | If your issue appears to be a bug and hasn't been reported, open a new issue. 41 | Help us maximize the effort we can spend fixing issues and adding new 42 | features by not reporting duplicate issues. Providing the following information will increase the 43 | chances of your issue being dealt with quickly: 44 | 45 | * **Overview of the Issue** - if an error is being thrown, a non-minified stack trace helps 46 | * **Version** - what version is affected (e.g. 0.1.2) 47 | * **Motivation for or Use Case** - explain what you are trying to do and why the current behavior is a bug for you 48 | * **Browsers and Operating System** - is this a problem with all browsers? 49 | * **Reproduce the Error** - provide a live example or an unambiguous set of steps 50 | * **Related Issues** - has a similar issue been reported before? 51 | * **Suggest a Fix** - if you can't fix the bug yourself, perhaps you can point to what might be 52 | causing the problem (line of code or commit) 53 | 54 | You can file new issues by providing the above information at the corresponding repository's issues link: https://github.com/[organization-name]/[repository-name]/issues/new.
55 | 56 | ### Submitting a Pull Request (PR) 57 | Before you submit your Pull Request (PR) consider the following guidelines: 58 | 59 | * Search the repository (https://github.com/[organization-name]/[repository-name]/pulls) for an open or closed PR 60 | that relates to your submission. You don't want to duplicate effort. 61 | 62 | * Make your changes in a new git fork: 63 | 64 | * Commit your changes using a descriptive commit message 65 | * Push your fork to GitHub: 66 | * In GitHub, create a pull request 67 | * If we suggest changes then: 68 | * Make the required updates. 69 | * Rebase your fork and force push to your GitHub repository (this will update your Pull Request): 70 | 71 | ```shell 72 | git rebase master -i 73 | git push -f 74 | ``` 75 | 76 | That's it! Thank you for your contribution! 77 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) Microsoft Corporation. All rights reserved. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | --- 2 | page_type: sample 3 | languages: 4 | - csharp 5 | - python 6 | - ruby 7 | - powershell 8 | products: 9 | - azure 10 | description: "Tutorial and sample code for integrating Power BI dataflows and Azure Data Services using Common Data Model folders in Azure Data Lake." 11 | --- 12 | 13 | # OBSOLETE 14 | 15 | The technology used in this tutorial is now obsolete. 16 | 17 | For information on using Azure Data Factory mapping data flows to read and write CDM entity data, see this [blog post](https://techcommunity.microsoft.com/t5/azure-data-factory/adf-adds-support-for-inline-datasets-and-common-data-model-to/bc-p/1469606), which describes the overall solution, with links to an article describing how [CDM support uses inline datasets](https://docs.microsoft.com/en-us/azure/data-factory/data-flow-source), and an article providing details of the [source and sink properties](https://docs.microsoft.com/en-us/azure/data-factory/format-common-data-model). 18 | 19 | For information on the new Spark CDM Connector for use in Azure Databricks and Synapse to read and write CDM entity data, see [https://github.com/Azure/spark-cdm-connector](https://github.com/Azure/spark-cdm-connector) 20 | 21 | # OBSOLETE 22 | 23 | # CDM folders and Azure Data Services integration 24 | 25 | Tutorial and sample code for integrating Power BI dataflows and Azure Data Services using Common Data Model (CDM) folders in Azure Data Lake Storage Gen2. 
For more information on the scenario, see this [blog post](https://azure.microsoft.com/en-us/blog/power-bi-and-azure-data-services-dismantle-data-silos-and-unlock-insights). 26 | 27 | ## Features 28 | 29 | The [tutorial](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/Tutorial/CDM-Azure-Data-Services-Integration-Tutorial.md) walks through the use of CDM folders in a modern data warehouse scenario. In it you will: 30 | - Configure your Power BI account to save Power BI dataflows as CDM folders in ADLS Gen2; 31 | - Create a Power BI dataflow by ingesting order data from the Wide World Importers sample database and save it as a CDM folder; 32 | - Use an Azure Databricks notebook that prepares and cleanses the data in the CDM folder, and then writes the updated data to a new CDM folder in ADLS Gen2; 33 | - Use Azure Machine Learning to train and publish a model using data from the CDM folder; 34 | - Use an Azure Data Factory pipeline to load data from the CDM folder into staging tables in Azure SQL Data Warehouse and then invoke stored procedures that transform the data into a dimensional model; 35 | - Use Azure Data Factory to orchestrate the overall process and monitor execution. 36 | 37 | Each step uses the metadata contained in the CDM folder to simplify the task. 38 | 39 | The provided samples include code, libraries, and Azure resource templates that you can use with CDM folders you create from your own data. 40 | 41 | IMPORTANT: the sample code is provided as-is with no warranties and is intended for learning purposes only. 42 | 43 | ## Getting Started 44 | 45 | See the [tutorial](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/Tutorial/CDM-Azure-Data-Services-Integration-Tutorial.md) for prerequisites and installation details. 46 | 47 | ## License 48 | The sample code and tutorials in this project are licensed under the MIT license.
See the [LICENSE](https://github.com/Azure-Samples/cdm-azure-data-services-integration/blob/master/LICENSE.md) file for more details. 49 | 50 | ## Contributing 51 | 52 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. 53 | -------------------------------------------------------------------------------- /Tutorial/README.md: -------------------------------------------------------------------------------- 1 | # CDM and Azure Data Services Integration 2 | 3 | [Tutorial for integrating Power BI Dataflows and Azure Data Services using CDM folders in Azure Data Lake Storage Gen 2.](CDM-Azure-Data-Services-Integration-Tutorial.md) -------------------------------------------------------------------------------- /Tutorial/media/adfauthor.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/adfauthor.png -------------------------------------------------------------------------------- /Tutorial/media/adfpipeline.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/adfpipeline.png -------------------------------------------------------------------------------- /Tutorial/media/authormonitor.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/authormonitor.png 
-------------------------------------------------------------------------------- /Tutorial/media/azuresqldb.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/azuresqldb.png -------------------------------------------------------------------------------- /Tutorial/media/cdmparserresources.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/cdmparserresources.png -------------------------------------------------------------------------------- /Tutorial/media/createdataflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/createdataflow.png -------------------------------------------------------------------------------- /Tutorial/media/dataflowase.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/dataflowase.png -------------------------------------------------------------------------------- /Tutorial/media/dataflowase2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/dataflowase2.png -------------------------------------------------------------------------------- /Tutorial/media/dataflowdone.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/dataflowdone.png -------------------------------------------------------------------------------- /Tutorial/media/dataflowsettings.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/dataflowsettings.png -------------------------------------------------------------------------------- /Tutorial/media/folderlocation.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/folderlocation.png -------------------------------------------------------------------------------- /Tutorial/media/mountcdmpipeline.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/mountcdmpipeline.png -------------------------------------------------------------------------------- /Tutorial/media/overview.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/overview.png -------------------------------------------------------------------------------- /Tutorial/media/pipelineparameters.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/pipelineparameters.png 
-------------------------------------------------------------------------------- /Tutorial/media/refreshdataflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/refreshdataflow.png -------------------------------------------------------------------------------- /Tutorial/media/refresheddataflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/refresheddataflow.png -------------------------------------------------------------------------------- /Tutorial/media/savedataflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/savedataflow.png -------------------------------------------------------------------------------- /Tutorial/media/selecttables.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/selecttables.png -------------------------------------------------------------------------------- /Tutorial/media/ssmsWWI.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Tutorial/media/ssmsWWI.png -------------------------------------------------------------------------------- /Using ADF Mapping Data Flows with CDM.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Azure-Samples/cdm-azure-data-services-integration/d1f890da7625c05afbc4c7eae9578c5909d4ce3c/Using ADF Mapping Data Flows with CDM.pdf --------------------------------------------------------------------------------
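The structural rules in `CDM/schema/modeljsonschema.json` (required root properties, and what each entity `$type` must carry) can be illustrated with a small hand-written check. The following is only a minimal sketch, not the repository's own validator: the function name `check_model` and the trimmed `SAMPLE` document are hypothetical, and the final check that a `modelId` appears in `referenceModels` is an assumed consistency rule rather than something the JSON schema itself enforces.

```python
import json

# Trimmed, hypothetical sample modeled on the OrdersProductsCustomersLinked example.
SAMPLE = json.loads("""
{
  "name": "OrdersProductsCustomersLinked",
  "version": "1.0",
  "entities": [
    {"$type": "ReferenceEntity", "name": "Orders", "source": "Orders",
     "modelId": "f19bbb97-c031-441a-8bd1-61b9181c0b83/1a7ef9c8-c7e8-45f8-9d8a-b80f8ffe4612"},
    {"$type": "LocalEntity", "name": "Customers",
     "attributes": [{"name": "CustomerID", "dataType": "string"}]}
  ],
  "referenceModels": [
    {"id": "f19bbb97-c031-441a-8bd1-61b9181c0b83/1a7ef9c8-c7e8-45f8-9d8a-b80f8ffe4612",
     "location": "https://example.invalid/model.json"}
  ]
}
""")


def check_model(model):
    """Return a list of human-readable problems found in a parsed model.json."""
    problems = []

    # The schema's root "required" list: name, version, entities.
    for key in ("name", "version", "entities"):
        if key not in model:
            problems.append(f"missing required root property: {key}")

    known_models = {m.get("id") for m in model.get("referenceModels", [])}

    for entity in model.get("entities", []):
        name = entity.get("name", "<unnamed>")
        kind = entity.get("$type")
        if kind == "LocalEntity":
            # localEntity requires "attributes".
            if "attributes" not in entity:
                problems.append(f"{name}: LocalEntity requires 'attributes'")
        elif kind == "ReferenceEntity":
            # referenceEntity requires "source" and "modelId".
            for key in ("source", "modelId"):
                if key not in entity:
                    problems.append(f"{name}: ReferenceEntity requires '{key}'")
            # Assumed extra rule (not in the schema): modelId should resolve.
            if "modelId" in entity and entity["modelId"] not in known_models:
                problems.append(f"{name}: modelId not listed in referenceModels")
        else:
            problems.append(f"{name}: unknown $type {kind!r}")

    return problems


print(check_model(SAMPLE))  # prints []
```

For full validation (string patterns, length limits, the `dataType` enum), a draft-07 JSON Schema validator run against `modeljsonschema.json` would be the more complete route; the sketch above only mirrors the required-property rules.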