├── .github
│   ├── ISSUE_TEMPLATE.md
│   └── PULL_REQUEST_TEMPLATE.md
├── .gitignore
├── CHANGELOG.md
├── CONTRIBUTING.md
├── CleanTrigger1
│   ├── __init__.py
│   ├── clean.py
│   ├── function.json
│   └── sample.dat
├── CleanTrigger2
│   ├── __init__.py
│   ├── clean.py
│   ├── function.json
│   └── sample.dat
├── LICENSE.md
├── README.md
├── Reconcile
│   ├── __init__.py
│   ├── clean.py
│   ├── fetch_blob.py
│   ├── function.json
│   └── sample.dat
├── azure-deploy-event-grid-subscription.json
├── azure-deploy-linux-app-plan.json
├── azuredeploy.parameters.json
├── blob_to_smart_contract
│   ├── __init__.py
│   ├── clean.py
│   ├── fetch_blob.py
│   ├── function.json
│   └── sample.dat
├── dataset
│   ├── config.ini
│   ├── randomcsvgenerator.py
│   ├── s1_raw.csv
│   └── s2_raw.csv
├── host.json
├── local.settings.json
├── requirements.txt
└── tests
    ├── host.json
    ├── subvalidation.json
    └── test_eventgrid.py
/.github/ISSUE_TEMPLATE.md:
--------------------------------------------------------------------------------
1 |
4 | > Please provide us with the following information:
5 | > ---------------------------------------------------------------
6 |
7 | ### This issue is for a: (mark with an `x`)
8 | ```
9 | - [ ] bug report -> please search issues before submitting
10 | - [ ] feature request
11 | - [ ] documentation issue or request
12 | - [ ] regression (a behavior that used to work and stopped in a new release)
13 | ```
14 |
15 | ### Minimal steps to reproduce
16 | >
17 |
18 | ### Any log messages given by the failure
19 | >
20 |
21 | ### Expected/desired behavior
22 | >
23 |
24 | ### OS and Version?
25 | > Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)
26 |
27 | ### Versions
28 | >
29 |
30 | ### Mention any other details that might be useful
31 |
32 | > ---------------------------------------------------------------
33 | > Thanks! We'll be in touch soon.
34 |
--------------------------------------------------------------------------------
/.github/PULL_REQUEST_TEMPLATE.md:
--------------------------------------------------------------------------------
1 | ## Purpose
2 |
3 | * ...
4 |
5 | ## Does this introduce a breaking change?
6 |
7 | ```
8 | [ ] Yes
9 | [ ] No
10 | ```
11 |
12 | ## Pull Request Type
13 | What kind of change does this Pull Request introduce?
14 |
15 |
16 | ```
17 | [ ] Bugfix
18 | [ ] Feature
19 | [ ] Code style update (formatting, local variables)
20 | [ ] Refactoring (no functional changes, no api changes)
21 | [ ] Documentation content changes
22 | [ ] Other... Please describe:
23 | ```
24 |
25 | ## How to Test
26 | * Get the code
27 |
28 | ```
29 | git clone [repo-address]
30 | cd [repo-name]
31 | git checkout [branch-name]
32 | npm install
33 | ```
34 |
35 | * Test the code
36 |
37 | ```
38 | ```
39 |
40 | ## What to Check
41 | Verify that the following are valid
42 | * ...
43 |
44 | ## Other Information
45 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | *.egg-info/
24 | .installed.cfg
25 | *.egg
26 | MANIFEST
27 |
28 | # PyInstaller
29 | # Usually these files are written by a python script from a template
30 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
31 | *.manifest
32 | *.spec
33 |
34 | # Installer logs
35 | pip-log.txt
36 | pip-delete-this-directory.txt
37 |
38 | # Unit test / coverage reports
39 | htmlcov/
40 | .tox/
41 | .coverage
42 | .coverage.*
43 | .cache
44 | nosetests.xml
45 | coverage.xml
46 | *.cover
47 | .hypothesis/
48 | .pytest_cache/
49 |
50 | # Translations
51 | *.mo
52 | *.pot
53 |
54 | # Django stuff:
55 | *.log
56 | local_settings.py
57 | db.sqlite3
58 |
59 | # Flask stuff:
60 | instance/
61 | .webassets-cache
62 |
63 | # Scrapy stuff:
64 | .scrapy
65 |
66 | # Sphinx documentation
67 | docs/_build/
68 |
69 | # PyBuilder
70 | target/
71 |
72 | # Jupyter Notebook
73 | .ipynb_checkpoints
74 |
75 | # pyenv
76 | .python-version
77 |
78 | # celery beat schedule file
79 | celerybeat-schedule
80 |
81 | # SageMath parsed files
82 | *.sage.py
83 |
84 | # Environments
85 | .env
86 | .venv
87 | env/
88 | venv/
89 | ENV/
90 | env.bak/
91 | venv.bak/
92 |
93 | # Spyder project settings
94 | .spyderproject
95 | .spyproject
96 |
97 | # Rope project settings
98 | .ropeproject
99 |
100 | # mkdocs documentation
101 | /site
102 |
103 | # mypy
104 | .mypy_cache/
105 |
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
1 | ## [project-title] Changelog
2 |
3 |
4 | # x.y.z (yyyy-mm-dd)
5 |
6 | *Features*
7 | * ...
8 |
9 | *Bug Fixes*
10 | * ...
11 |
12 | *Breaking Changes*
13 | * ...
14 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing to [project-title]
2 |
3 | This project welcomes contributions and suggestions. Most contributions require you to agree to a
4 | Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
5 | the rights to use your contribution. For details, visit https://cla.microsoft.com.
6 |
7 | When you submit a pull request, a CLA-bot will automatically determine whether you need to provide
8 | a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions
9 | provided by the bot. You will only need to do this once across all repos using our CLA.
10 |
11 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
12 | For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
13 | contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
14 |
15 | - [Code of Conduct](#coc)
16 | - [Issues and Bugs](#issue)
17 | - [Feature Requests](#feature)
18 | - [Submission Guidelines](#submit)
19 |
20 | ## Code of Conduct
21 | Help us keep this project open and inclusive. Please read and follow our [Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
22 |
23 | ## Found an Issue?
24 | If you find a bug in the source code or a mistake in the documentation, you can help us by
25 | [submitting an issue](#submit-issue) to the GitHub Repository. Even better, you can
26 | [submit a Pull Request](#submit-pr) with a fix.
27 |
28 | ## Want a Feature?
29 | You can *request* a new feature by [submitting an issue](#submit-issue) to the GitHub
30 | Repository. If you would like to *implement* a new feature, please submit an issue with
31 | a proposal for your work first, to be sure that we can use it.
32 |
33 | * **Small Features** can be crafted and directly [submitted as a Pull Request](#submit-pr).
34 |
35 | ## Submission Guidelines
36 |
37 | ### Submitting an Issue
38 | Before you submit an issue, search the archive; your question may already have been answered.
39 |
40 | If your issue appears to be a bug, and hasn't been reported, open a new issue.
41 | Help us to maximize the effort we can spend fixing issues and adding new
42 | features, by not reporting duplicate issues. Providing the following information will increase the
43 | chances of your issue being dealt with quickly:
44 |
45 | * **Overview of the Issue** - if an error is being thrown a non-minified stack trace helps
46 | * **Version** - what version is affected (e.g. 0.1.2)
47 | * **Motivation for or Use Case** - explain what you are trying to do and why the current behavior is a bug for you
48 | * **Browsers and Operating System** - is this a problem with all browsers?
49 | * **Reproduce the Error** - provide a live example or an unambiguous set of steps
50 | * **Related Issues** - has a similar issue been reported before?
51 | * **Suggest a Fix** - if you can't fix the bug yourself, perhaps you can point to what might be
52 | causing the problem (line of code or commit)
53 |
54 | You can file new issues by providing the above information at the corresponding repository's issues link: https://github.com/[organization-name]/[repository-name]/issues/new.
55 |
56 | ### Submitting a Pull Request (PR)
57 | Before you submit your Pull Request (PR) consider the following guidelines:
58 |
59 | * Search the repository (https://github.com/[organization-name]/[repository-name]/pulls) for an open or closed PR
60 | that relates to your submission. You don't want to duplicate effort.
61 |
62 | * Make your changes in a new git fork:
63 |
64 | * Commit your changes using a descriptive commit message
65 | * Push your fork to GitHub:
66 | * In GitHub, create a pull request
67 | * If we suggest changes then:
68 | * Make the required updates.
69 | * Rebase your fork and force push to your GitHub repository (this will update your Pull Request):
70 |
71 | ```shell
72 | git rebase master -i
73 | git push -f
74 | ```
75 |
76 | That's it! Thank you for your contribution!
77 |
--------------------------------------------------------------------------------
/CleanTrigger1/__init__.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import json
3 | import azure.functions as func
4 | from . import clean as cleaning_service
5 | 
6 | def main(req: func.HttpRequest) -> func.HttpResponse:
7 |     # This will appear in the function's log output (visible e.g. when testing with Postman)
8 |     logging.info('Python HTTP trigger function processed a request.')
9 |     req_body = req.get_json()
10 | 
11 |     if is_validation_event(req_body):
12 |         return func.HttpResponse(validate_eg(req_body))
13 | 
14 |     elif is_blob_created_event(req_body):
15 |         result = cleaning_service.clean(req_body)
16 | 
17 |         if result == "Success":
18 |             return func.HttpResponse("Successfully cleaned data", status_code=200)
19 |         else:
20 |             return func.HttpResponse("Bad Request", status_code=400)
21 | 
22 |     else:
23 |         # Ignore other event types
24 |         return func.HttpResponse("Ignored event", status_code=200)
25 | 
26 | # Check for validation event from event grid
27 | def is_validation_event(req_body):
28 |     return bool(req_body) and req_body[0].get('eventType') == "Microsoft.EventGrid.SubscriptionValidationEvent"
29 | 
30 | # If blob created event, then true
31 | def is_blob_created_event(req_body):
32 |     return bool(req_body) and req_body[0].get('eventType') == "Microsoft.Storage.BlobCreated"
33 | 
34 | # Respond to event grid webhook validation event
35 | def validate_eg(req_body):
36 |     result = {}
37 |     result['validationResponse'] = req_body[0]['data']['validationCode']
38 |     return json.dumps(result)
--------------------------------------------------------------------------------
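The subscription-validation handshake that `CleanTrigger1/__init__.py` handles can be exercised without Azure at all, since `validate_eg` is pure. A minimal sketch with a hand-built Event Grid payload (the validation code is an illustrative value, not one issued by Event Grid):

```python
import json

# Hand-built Event Grid subscription-validation payload; the validationCode
# value is illustrative, not a real code issued by Event Grid.
payload = [{
    "eventType": "Microsoft.EventGrid.SubscriptionValidationEvent",
    "data": {"validationCode": "512d38b6-c7b8-40c8-89fe-f46f9e9622b6"},
}]

def validate_eg(req_body):
    # Echo the validation code back, completing the webhook handshake
    return json.dumps({"validationResponse": req_body[0]["data"]["validationCode"]})

response_body = validate_eg(payload)
```

Event Grid only starts delivering events to a webhook endpoint after it receives this echoed code back.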
/CleanTrigger1/clean.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import os
3 | import pandas as pd
4 | from azure.storage.blob import ContentSettings
5 | from azure.storage.blob import BlockBlobService
6 | from io import StringIO
7 | 
8 | blob_account_name = os.getenv("BlobAccountName")
9 | blob_account_key = os.getenv("BlobAccountKey")
10 | block_blob_service = BlockBlobService(account_name=blob_account_name,
11 |                                      account_key=blob_account_key)
12 | out_blob_container_name = os.getenv("C1")
13 | 
14 | def clean(req_body):
15 |     blob_obj, filename = extract_blob_props(req_body[0]['data']['url'])
16 |     df = pd.read_csv(StringIO(blob_obj.content))
17 |     result = clean_blob(df, filename)
18 |     return result
19 | 
20 | # Extract the container name and blob file name from the blob URL,
21 | # then download the blob as text
22 | def extract_blob_props(url):
23 |     blob_file_name = url.rsplit('/', 1)[-1]
24 |     in_container_name = url.rsplit('/', 2)[-2]
25 |     readblob = block_blob_service.get_blob_to_text(in_container_name, blob_file_name)
26 |     return readblob, blob_file_name
27 | 
28 | def clean_blob(df, blob_file_name):
29 |     # group by names and region and sum the units and price
30 |     df1 = df.groupby(["names", "region"], as_index=False)[["units", "price"]].sum().reset_index()
31 | 
32 |     # keep only one region
33 |     df2 = df1[df1["region"] == 'east']
34 |     outcsv = df2.to_csv(index=False)
35 | 
36 |     cleaned_blob_file_name = "cleaned_" + blob_file_name
37 |     block_blob_service.create_blob_from_text(out_blob_container_name, cleaned_blob_file_name, outcsv)
38 |     return "Success"
--------------------------------------------------------------------------------
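`extract_blob_props` above recovers the container and blob names purely by slicing the event's blob URL. The `rsplit` logic can be checked standalone; the storage account and file names here are illustrative:

```python
# Illustrative blob URL in the shape Event Grid reports for BlobCreated events
url = "https://mystorage.blob.core.windows.net/c1raw/s1_raw.csv"

blob_file_name = url.rsplit('/', 1)[-1]     # last path segment: the blob name
in_container_name = url.rsplit('/', 2)[-2]  # second-to-last segment: the container
```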
/CleanTrigger1/function.json:
--------------------------------------------------------------------------------
1 | {
2 |   "scriptFile": "__init__.py",
3 |   "bindings": [
4 |     {
5 |       "authLevel": "anonymous",
6 |       "type": "httpTrigger",
7 |       "direction": "in",
8 |       "name": "req",
9 |       "methods": [
10 |         "get",
11 |         "post"
12 |       ]
13 |     },
14 |     {
15 |       "type": "http",
16 |       "direction": "out",
17 |       "name": "$return"
18 |     }
19 |   ]
20 | }
--------------------------------------------------------------------------------
/CleanTrigger1/sample.dat:
--------------------------------------------------------------------------------
1 | {
2 |   "name": "Azure"
3 | }
--------------------------------------------------------------------------------
/CleanTrigger2/__init__.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import json
3 | import azure.functions as func
4 | from . import clean as cleaning_service
5 | 
6 | def main(req: func.HttpRequest) -> func.HttpResponse:
7 |     req_body = req.get_json()
8 | 
9 |     if is_validation_event(req_body):
10 |         return func.HttpResponse(validate_eg(req_body))
11 | 
12 |     elif is_blob_created_event(req_body):
13 |         result = cleaning_service.clean(req_body)
14 | 
15 |         if result == "Success":
16 |             return func.HttpResponse("Successfully cleaned data", status_code=200)
17 |         else:
18 |             return func.HttpResponse("Bad Request", status_code=400)
19 | 
20 |     else:
21 |         # Don't care about other events
22 |         return func.HttpResponse("Ignored event", status_code=200)
23 | 
24 | # Check for validation event from event grid
25 | def is_validation_event(req_body):
26 |     return bool(req_body) and req_body[0].get('eventType') == "Microsoft.EventGrid.SubscriptionValidationEvent"
27 | 
28 | # If blob created event, then true
29 | def is_blob_created_event(req_body):
30 |     return bool(req_body) and req_body[0].get('eventType') == "Microsoft.Storage.BlobCreated"
31 | 
32 | # Respond to event grid webhook validation event
33 | def validate_eg(req_body):
34 |     result = {}
35 |     result['validationResponse'] = req_body[0]['data']['validationCode']
36 |     return json.dumps(result)
--------------------------------------------------------------------------------
/CleanTrigger2/clean.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import os
3 | import pandas as pd
4 | from azure.storage.blob import ContentSettings
5 | from azure.storage.blob import BlockBlobService
6 | from io import StringIO
7 | 
8 | blob_account_name = os.getenv("BlobAccountName")
9 | blob_account_key = os.getenv("BlobAccountKey")
10 | block_blob_service = BlockBlobService(account_name=blob_account_name,
11 |                                      account_key=blob_account_key)
12 | out_blob_container_name = os.getenv("C2")
13 | 
14 | def clean(req_body):
15 |     blob_obj, filename = extract_blob_props(req_body[0]['data']['url'])
16 |     df = pd.read_csv(StringIO(blob_obj.content))
17 |     result = clean_blob(df, filename)
18 |     return result
19 | 
20 | # Extract the container name and blob file name from the blob URL,
21 | # then download the blob as text
22 | def extract_blob_props(url):
23 |     blob_file_name = url.rsplit('/', 1)[-1]
24 |     in_container_name = url.rsplit('/', 2)[-2]
25 |     readblob = block_blob_service.get_blob_to_text(in_container_name, blob_file_name)
26 |     return readblob, blob_file_name
27 | 
28 | def clean_blob(df, blob_file_name):
29 |     # group by names and item and sum the units and price
30 |     df1 = df.groupby(["names", "item"], as_index=False)[["units", "price"]].sum().reset_index()
31 | 
32 |     # keep only one item
33 |     df2 = df1[df1["item"] == 'binder']
34 |     outcsv = df2.to_csv(index=False)
35 | 
36 |     cleaned_blob_file_name = "cleaned_" + blob_file_name
37 |     block_blob_service.create_blob_from_text(out_blob_container_name, cleaned_blob_file_name, outcsv)
38 |     return "Success"
--------------------------------------------------------------------------------
/CleanTrigger2/function.json:
--------------------------------------------------------------------------------
1 | {
2 |   "scriptFile": "__init__.py",
3 |   "bindings": [
4 |     {
5 |       "authLevel": "anonymous",
6 |       "type": "httpTrigger",
7 |       "direction": "in",
8 |       "name": "req",
9 |       "methods": [
10 |         "get",
11 |         "post"
12 |       ]
13 |     },
14 |     {
15 |       "type": "http",
16 |       "direction": "out",
17 |       "name": "$return"
18 |     }
19 |   ]
20 | }
--------------------------------------------------------------------------------
/CleanTrigger2/sample.dat:
--------------------------------------------------------------------------------
1 | {
2 |   "name": "Azure"
3 | }
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) Microsoft Corporation. All rights reserved.
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | page_type: sample
3 | description: "This sample demonstrates a data cleaning pipeline with Azure Functions written in Python."
4 | languages:
5 | - python
6 | products:
7 | - azure-functions
8 | - azure-storage
9 | ---
10 |
11 | # Data Cleaning Pipeline
12 |
13 | This sample demonstrates a data cleaning pipeline built with Azure Functions written in Python, triggered by HTTP events from Event Grid, that performs pandas-based cleaning and reconciliation of CSV files.
14 | The sample reflects a real-world use case in which such cleaning tasks are performed.
15 |
16 | ## Getting Started
17 |
18 | ### Deploy to Azure
19 |
20 | #### Prerequisites
21 |
22 | - Install Python 3.6+
23 | - Install [Functions Core Tools](https://docs.microsoft.com/en-us/azure/azure-functions/functions-run-local#v2)
24 | - Install Docker
25 |   - Note: If running on Windows, use Ubuntu on WSL to run the deploy script
26 |
27 | #### Steps
28 |
29 | - Deploy through Azure CLI
30 |   - Open AZ CLI and run `az group create -l [region] -n [resourceGroupName]` to create a resource group in your Azure subscription (e.g. [region] could be westus2, eastus, etc.)
31 |   - Run `az group deployment create --name [deploymentName] --resource-group [resourceGroupName] --template-file azuredeploy.json`
32 | 
33 | - Deploy Function App
34 |   - [Create/Activate virtual environment](https://docs.microsoft.com/en-us/azure/azure-functions/functions-create-first-function-python#create-and-activate-a-virtual-environment)
35 |   - Run `func azure functionapp publish [functionAppName] --build-native-deps`
36 |
37 | ### Test
38 |
39 | - Upload the s1_raw.csv file into the c1raw container
40 | - Watch Event Grid trigger the CleanTrigger1 function and produce "cleaned_s1_raw.csv"
41 | - Repeat the same for s2_raw.csv into the c2raw container
42 | - Now send the following HTTP request to the Reconcile function to merge
43 |
44 | ```
45 | {
46 |   "file_1_url" : "https://{storagename}.blob.core.windows.net/c1raw/cleaned_s1_raw.csv",
47 |   "file_2_url" : "https://{storagename}.blob.core.windows.net/c2raw/cleaned_s2_raw.csv",
48 |   "batchId" : "1122"
49 | }
50 | ```
52 | - Watch it produce the reconciled output file in the final container
53 | - You can use a Logic App to call the Reconcile function with batch IDs
54 |
55 | ## References
56 |
57 | - [Create your first Python Function](https://docs.microsoft.com/en-us/azure/azure-functions/functions-create-first-function-python)
58 |
--------------------------------------------------------------------------------
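The Reconcile request shown in the README's Test section can be sent with a short script. This is a sketch using only the standard library; the function app URL and storage account name are placeholders you must replace with your own:

```python
import json
import urllib.request

# Placeholder values: substitute your deployed function app URL and storage account name
reconcile_url = "https://<function-app>.azurewebsites.net/api/Reconcile"
payload = {
    "file_1_url": "https://<storagename>.blob.core.windows.net/c1raw/cleaned_s1_raw.csv",
    "file_2_url": "https://<storagename>.blob.core.windows.net/c2raw/cleaned_s2_raw.csv",
    "batchId": "1122",
}

def send_reconcile(url, body):
    # POST the JSON body to the Reconcile HTTP trigger
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.read().decode()
```

Calling `send_reconcile(reconcile_url, payload)` would return the function's status code and body; since the URL above is a placeholder, it is not invoked here.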
/Reconcile/__init__.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import json
3 | import azure.functions as func
4 | from . import clean as cleaning_service
5 | 
6 | def main(req: func.HttpRequest) -> func.HttpResponse:
7 |     # This will appear in the function's log output (visible e.g. when testing with Postman)
8 |     logging.info('Python HTTP trigger function processed a request.')
9 |     try:
10 |         req_body = req.get_json()
11 |         f1_url = req_body.get('file_1_url')
12 |         f2_url = req_body.get('file_2_url')
13 |         batch_id = req_body.get('batchId')
14 |     except (ValueError, AttributeError):
15 |         # get_json raises ValueError for invalid JSON; .get raises
16 |         # AttributeError if the body is not a JSON object
17 |         return func.HttpResponse("Bad Request", status_code=400)
18 | 
19 |     result = cleaning_service.clean(f1_url, f2_url, batch_id)
20 |     return func.HttpResponse(result, status_code=200)
--------------------------------------------------------------------------------
/Reconcile/clean.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import os
3 | import pandas as pd
4 | from azure.storage.blob import ContentSettings
5 | from azure.storage.blob import BlockBlobService
6 | from io import StringIO
7 | from . import fetch_blob as fetching_service
8 | 
9 | blob_account_name = os.getenv("BlobAccountName")
10 | blob_account_key = os.getenv("BlobAccountKey")
11 | block_blob_service = BlockBlobService(account_name=blob_account_name,
12 |                                      account_key=blob_account_key)
13 | out_blob_container_name = os.getenv("FINAL")
14 | 
15 | # Reconciliation flow, called from the HTTP-triggered Reconcile function.
16 | # This function drives the other helpers in clean.py.
17 | def clean(file_1_url, file_2_url, batch_id):
18 |     f1_container = file_1_url.rsplit('/', 2)[-2]
19 |     f2_container = file_2_url.rsplit('/', 2)[-2]
20 |     f2_df, f1_df = fetch_blobs(batch_id, f2_container, f1_container)
21 |     return final_reconciliation(f2_df, f1_df, batch_id)
22 | 
23 | def fetch_blobs(batch_id, file_2_container_name, file_1_container_name):
24 |     # Create container & blob dictionary with helper function
25 |     blob_dict = fetching_service.blob_to_dict(batch_id, file_2_container_name, file_1_container_name)
26 | 
27 |     # Create F1 DF
28 |     f1_df = fetching_service.blob_dict_to_df(blob_dict, 'c1')
29 | 
30 |     # Create F2 DF
31 |     f2_df = fetching_service.blob_dict_to_df(blob_dict, 'c2')
32 |     return f2_df, f1_df
33 | 
34 | def final_reconciliation(f2_df, f1_df, batch_id):
35 |     # Note: currently only f2_df is written out; f1_df is unused here
36 |     outcsv = f2_df.to_csv(index=False)
37 |     cleaned_blob_file_name = "reconciled_" + batch_id
38 |     block_blob_service.create_blob_from_text(out_blob_container_name, cleaned_blob_file_name, outcsv)
39 |     return "Success"
--------------------------------------------------------------------------------
/Reconcile/fetch_blob.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import os
3 | import pandas as pd
4 | from azure.storage.blob import ContentSettings
5 | from azure.storage.blob import BlockBlobService
6 | from io import StringIO
7 | 
8 | blob_account_name = os.getenv("BlobAccountName")
9 | blob_account_key = os.getenv("BlobAccountKey")
10 | block_blob_service = BlockBlobService(account_name=blob_account_name,
11 |                                      account_key=blob_account_key)
12 | 
13 | def blob_dict_to_df(my_ordered_dict, filter_string):
14 |     # Keep only the container whose name contains the filter string
15 |     filtered_dict = {k: v for k, v in my_ordered_dict.items() if filter_string in k}
16 |     container_key = list(filtered_dict.keys())[0]
17 |     latest_file = list(filtered_dict.values())[0]
18 |     blobstring = block_blob_service.get_blob_to_text(container_key, latest_file).content
19 |     df = pd.read_csv(StringIO(blobstring), dtype=str)
20 |     return df
21 | 
22 | def blob_to_dict(batchId, *args):
23 |     # Containers to scan
24 |     container_list = list(args)
25 |     logging.info(container_list)
26 | 
27 |     # Get "cleaned" blob names from each container; the Azure SDK returns a
28 |     # generator, so it must not be consumed (e.g. by logging list(generator))
29 |     # before the loop iterates over it
30 |     file_names = []
31 |     for container in container_list:
32 |         generator = block_blob_service.list_blobs(container)
33 |         for blob in generator:
34 |             if "cleaned" in blob.name:
35 |                 file_names.append(blob.name)
36 | 
37 |     # Pair each container with its blob for this batch
38 |     c1_list = [f for f in file_names if batchId + "_c1" in f]
39 |     c2_list = [f for f in file_names if batchId + "_c2" in f]
40 | 
41 |     for c in container_list:
42 |         if "c1" in c:
43 |             c1_name = c
44 |         else:
45 |             c2_name = c
46 | 
47 |     container_file_dict = {}
48 |     container_file_dict[c1_name] = c1_list[0]
49 |     container_file_dict[c2_name] = c2_list[0]
50 |     return container_file_dict
--------------------------------------------------------------------------------
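`blob_to_dict` above pairs each container with the "cleaned" blob belonging to a batch by substring-matching the batch id against blob names. The filtering itself is plain list work and can be sketched with illustrative filenames (note the blob names here assume the batch id is embedded in the name, which is what the `batchId + "_c1"` filter expects):

```python
batch_id = "1122"
# Illustrative blob names; the filter assumes names embed the batch id
file_names = [
    "cleaned_1122_c1_s1_raw.csv",
    "cleaned_1122_c2_s2_raw.csv",
    "cleaned_9999_c1_old.csv",   # different batch: should be filtered out
]

# Keep only the blobs whose names match this batch and channel
c1_list = [f for f in file_names if batch_id + "_c1" in f]
c2_list = [f for f in file_names if batch_id + "_c2" in f]

# Map each container to its matching blob (container names are illustrative)
container_file_dict = {"c1raw": c1_list[0], "c2raw": c2_list[0]}
```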
/Reconcile/function.json:
--------------------------------------------------------------------------------
1 | {
2 |   "scriptFile": "__init__.py",
3 |   "bindings": [
4 |     {
5 |       "authLevel": "anonymous",
6 |       "type": "httpTrigger",
7 |       "direction": "in",
8 |       "name": "req",
9 |       "methods": [
10 |         "get",
11 |         "post"
12 |       ]
13 |     },
14 |     {
15 |       "type": "http",
16 |       "direction": "out",
17 |       "name": "$return"
18 |     }
19 |   ]
20 | }
--------------------------------------------------------------------------------
/Reconcile/sample.dat:
--------------------------------------------------------------------------------
1 | {
2 |   "name": "Azure"
3 | }
--------------------------------------------------------------------------------
/azure-deploy-event-grid-subscription.json:
--------------------------------------------------------------------------------
1 | {
2 |   "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
3 |   "contentVersion": "1.0.0.0",
4 |   "parameters": {
5 |     "eventSubName1": {
6 |       "type": "string",
7 |       "defaultValue": "subToStorage1",
8 |       "metadata": {
9 |         "description": "Provide a name for the Event Grid subscription."
10 |       }
11 |     },
12 |     "eventSubName2": {
13 |       "type": "string",
14 |       "defaultValue": "subToStorage2",
15 |       "metadata": {
16 |         "description": "Provide a name for the Event Grid subscription."
17 |       }
18 |     },
19 |     "endpoint1": {
20 |       "type": "string",
21 |       "metadata": {
22 |         "description": "Provide the URL for the WebHook to receive events. Create your own endpoint for events."
23 |       }
24 |     },
25 |     "storageName": {
26 |       "type": "string",
27 |       "defaultValue": "203014767teststorage",
28 |       "metadata": {
29 |         "description": "Provide the name of the storage account to subscribe to."
30 |       }
31 |     }
32 |   },
33 |   "resources": [
34 |     {
35 |       "type": "Microsoft.Storage/storageAccounts/providers/eventSubscriptions",
36 |       "name": "[concat(parameters('storageName'), '/Microsoft.EventGrid/', parameters('eventSubName1'))]",
37 |       "apiVersion": "2018-01-01",
38 |       "properties": {
39 |         "destination": {
40 |           "endpointType": "WebHook",
41 |           "properties": {
42 |             "endpointUrl": "[parameters('endpoint1')]"
43 |           }
44 |         },
45 |         "filter": {
46 |           "subjectBeginsWith": "",
47 |           "subjectEndsWith": "",
48 |           "isSubjectCaseSensitive": false,
49 |           "includedEventTypes": [
50 |             "All"
51 |           ],
52 |           "advancedFilters": [
53 |             {
54 |               "operatorType": "StringContains",
55 |               "key": "Subject",
56 |               "values": ["raw"]
57 |             }
58 |           ]
59 |         }
60 |       }
61 |     }
62 |   ]
63 | }
--------------------------------------------------------------------------------
/azure-deploy-linux-app-plan.json:
--------------------------------------------------------------------------------
1 | {
2 |   "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
3 |   "contentVersion": "1.0.0.0",
4 |   "parameters": {
5 |     "functionapp1": {
6 |       "defaultValue": "customerendpointdh1",
7 |       "type": "String"
8 |     },
9 |     "functionapp2": {
10 |       "defaultValue": "customerendpointdh2",
11 |       "type": "String"
12 |     },
13 |     "config_web_name": {
14 |       "defaultValue": "web",
15 |       "type": "String"
16 |     },
17 |     "storageName": {
18 |       "defaultValue": "203014767teststorage",
19 |       "type": "String"
20 |     },
21 |     "linuxConsumptionAppName": {
22 |       "defaultValue": "WestUSLinuxDynamicPlan",
23 |       "type": "String"
24 |     },
25 |     "siteName1": {
26 |       "defaultValue": "customerendpointdh1.azurewebsites.net",
27 |       "type": "String"
28 |     },
29 |     "siteName2": {
30 |       "defaultValue": "customerendpointdh2.azurewebsites.net",
31 |       "type": "String"
32 |     },
33 |     "outputBlobContainerName": {
34 |       "defaultValue": "cleaned",
35 |       "type": "String"
36 |     }
37 |   },
38 |   "variables": {
39 |     "storageAccountid": "[concat(resourceGroup().id,'/providers/','Microsoft.Storage/storageAccounts/', parameters('storageName'))]",
40 |     "container1": "raw",
41 |     "container2": "cleaned"
42 |   },
43 |   "resources": [
44 |     {
45 |       "name": "[parameters('storageName')]",
46 |       "type": "Microsoft.Storage/storageAccounts",
47 |       "apiVersion": "2017-10-01",
48 |       "sku": {
49 |         "name": "Standard_LRS"
50 |       },
51 |       "kind": "StorageV2",
52 |       "location": "West US",
53 |       "tags": {},
54 |       "properties": {
55 |         "accessTier": "Hot"
56 |       },
57 |       "resources": [
58 |         {
59 |           "name": "[concat('default/', variables('container1'))]",
60 |           "type": "blobServices/containers",
61 |           "apiVersion": "2018-03-01-preview",
62 |           "dependsOn": [
63 |             "[parameters('storageName')]"
64 |           ]
65 |         },
66 |         {
67 |           "name": "[concat('default/', variables('container2'))]",
68 |           "type": "blobServices/containers",
69 |           "apiVersion": "2018-03-01-preview",
70 |           "dependsOn": [
71 |             "[parameters('storageName')]"
72 |           ]
73 |         }
74 |       ]
75 |     },
76 |     {
77 |       "type": "Microsoft.Web/serverfarms",
78 |       "sku": {
79 |         "name": "Y1",
80 |         "tier": "Dynamic",
81 |         "size": "Y1",
82 |         "family": "Y",
83 |         "capacity": 0
84 |       },
85 |       "kind": "functionapp",
86 |       "name": "[parameters('linuxConsumptionAppName')]",
87 |       "apiVersion": "2016-09-01",
88 |       "location": "West US",
89 |       "properties": {
90 |         "name": "[parameters('linuxConsumptionAppName')]",
91 |         "perSiteScaling": false,
92 |         "reserved": true
93 |       },
94 |       "dependsOn": []
95 |     },
96 |     {
97 |       "type": "Microsoft.Web/sites",
98 |       "kind": "functionapp,linux",
99 |       "name": "[parameters('functionapp1')]",
100 |       "apiVersion": "2016-08-01",
101 |       "location": "West US",
102 |       "properties": {
103 |         "enabled": true,
104 |         "hostNameSslStates": [
105 |           {
106 |             "name": "[concat(parameters('functionapp1'),'.azurewebsites.net')]",
107 |             "sslState": "Disabled",
108 |             "hostType": "Standard"
109 |           }
110 |         ],
111 |         "serverFarmId": "[resourceId('Microsoft.Web/serverfarms', parameters('linuxConsumptionAppName'))]",
112 |         "reserved": true,
113 |         "siteConfig": {
114 |           "appSettings": [
115 |             {
116 |               "name": "AzureWebJobsDashboard",
117 |               "value": "[concat('DefaultEndpointsProtocol=https;AccountName=', parameters('storageName'), ';AccountKey=', listKeys(variables('storageAccountid'),'2015-05-01-preview').key1)]"
118 |             },
119 |             {
120 |               "name": "AzureWebJobsStorage",
121 |               "value": "[concat('DefaultEndpointsProtocol=https;AccountName=', parameters('storageName'), ';AccountKey=', listKeys(variables('storageAccountid'),'2015-05-01-preview').key1)]"
122 |             },
123 |             {
124 |               "name": "WEBSITE_CONTENTAZUREFILECONNECTIONSTRING",
125 |               "value": "[concat('DefaultEndpointsProtocol=https;AccountName=', parameters('storageName'), ';AccountKey=', listKeys(variables('storageAccountid'),'2015-05-01-preview').key1)]"
126 |             },
127 |             {
128 |               "name": "WEBSITE_CONTENTSHARE",
129 |               "value": "[parameters('functionapp1')]"
130 |             },
131 |             {
132 |               "name": "FUNCTIONS_EXTENSION_VERSION",
133 |               "value": "~2"
134 |             },
135 |             {
136 |               "name": "WEBSITE_NODE_DEFAULT_VERSION",
137 |               "value": "8.11.1"
138 |             },
139 |             {
140 |               "name": "FUNCTIONS_WORKER_RUNTIME",
141 |               "value": "python"
142 |             },
143 |             {
144 |               "name": "BlobAccountName",
145 |               "value": "[parameters('storageName')]"
146 |             },
147 |             {
148 |               "name": "BlobAccountKey",
149 |               "value": "[listKeys(variables('storageAccountid'),'2015-05-01-preview').key1]"
150 |             },
151 |             {
152 |               "name": "OutBlobContainerName",
153 |               "value": "[parameters('outputBlobContainerName')]"
154 |             }
155 |           ]
156 |         }
157 |       },
158 |       "dependsOn": [
159 |         "[resourceId('Microsoft.Web/serverfarms', parameters('linuxConsumptionAppName'))]"
160 |       ]
161 |     },
162 |     {
163 |       "type": "Microsoft.Web/sites/config",
164 |       "name": "[concat(parameters('functionapp1'), '/', parameters('config_web_name'))]",
165 |       "apiVersion": "2016-08-01",
166 |       "location": "West US",
167 |       "properties": {
168 |         "netFrameworkVersion": "v4.0",
169 |         "scmType": "None",
170 |         "use32BitWorkerProcess": true,
171 |         "webSocketsEnabled": false,
172 | "alwaysOn": false,
173 | "appCommandLine": "",
174 | "managedPipelineMode": "Integrated",
175 | "virtualApplications": [
176 | {
177 | "virtualPath": "/",
178 | "physicalPath": "site\\wwwroot",
179 |                             "preloadEnabled": false
180 |                         }
180 | ],
181 | "customAppPoolIdentityAdminState": false,
182 | "customAppPoolIdentityTenantState": false,
183 | "loadBalancing": "LeastRequests",
184 | "routingRules": [],
185 | "experiments": {
186 | "rampUpRules": []
187 | },
188 | "autoHealEnabled": false,
189 | "vnetName": "",
190 | "cors": {
191 | "allowedOrigins": [
192 | "https://functions.azure.com",
193 | "https://functions-staging.azure.com",
194 | "https://functions-next.azure.com"
195 | ],
196 | "supportCredentials": false
197 | }
198 | },
199 | "dependsOn": [
200 | "[resourceId('Microsoft.Web/sites', parameters('functionapp1'))]"
201 | ]
202 | },
203 | {
204 | "type": "Microsoft.Web/sites/hostNameBindings",
205 | "name": "[concat(parameters('functionapp1'), '/', parameters('siteName1'))]",
206 | "apiVersion": "2016-08-01",
207 | "location": "West US",
208 | "properties": {
209 | "siteName": "customerendpointdh1",
210 | "hostNameType": "Verified"
211 | },
212 | "dependsOn": [
213 | "[resourceId('Microsoft.Web/sites', parameters('functionapp1'))]"
214 | ]
215 | }
216 | ]
217 | }
--------------------------------------------------------------------------------
/azuredeploy.parameters.json:
--------------------------------------------------------------------------------
1 | {
2 | "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json",
3 | "contentVersion": "1.0.0.0",
4 | "parameters": {
5 |     "functionapp1": {
6 |       "value": "203014767-func-app1"
7 |     },
8 |     "functionapp2": {
9 |       "value": "203014767-func-app2"
10 |     },
11 |     "storageName": {
12 |       "value": "203014767teststorage"
13 |     },
14 |     "outputBlobContainerName": {
15 |       "value": "203014767-test-blob"
16 |     },
17 |     "eventSubName1": {
18 |       "value": "203014767-event1"
19 |     },
20 |     "eventSubName2": {
21 |       "value": "203014767-event2"
22 |     },
23 |     "endpoint1": {
24 |       "value": "203014767-endpoint1"
25 |     }
26 | }
27 | }
28 |
--------------------------------------------------------------------------------
/blob_to_smart_contract/__init__.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import json
3 | import azure.functions as func
4 | from . import clean as cleaning_service
5 |
6 | def main(req: func.HttpRequest) -> func.HttpResponse:
7 |     # Logged output is visible in the local Functions host console
8 |     logging.info('Python HTTP trigger function processed a request.')
9 |     req_body = req.get_json()
10 | 
11 |     if is_validation_event(req_body):
12 |         return func.HttpResponse(validate_eg(req_body))
13 | 
14 |     elif is_blob_created_event(req_body):
15 |         result = cleaning_service.clean(req_body)
16 |         # `==` compares values; `is` tests object identity and is unreliable here
17 |         if result == "Success":
18 |             return func.HttpResponse("Successfully cleaned data", status_code=200)
19 |         else:
20 |             return func.HttpResponse("Bad Request", status_code=400)
21 | 
22 |     # Ignore all other event types
23 |     return func.HttpResponse("Ignored", status_code=200)
24 |
25 | # True when the request is an Event Grid subscription validation event
26 | def is_validation_event(req_body):
27 |     return bool(req_body) and req_body[0].get('eventType') == "Microsoft.EventGrid.SubscriptionValidationEvent"
28 | 
29 | # True when the request is a blob-created event
30 | def is_blob_created_event(req_body):
31 |     return bool(req_body) and req_body[0].get('eventType') == "Microsoft.Storage.BlobCreated"
32 |
33 | # Respond to event grid webhook validation event
34 | def validate_eg(req_body):
35 | result = {}
36 | result['validationResponse'] = req_body[0]['data']['validationCode']
37 | return json.dumps(result)
--------------------------------------------------------------------------------
/blob_to_smart_contract/clean.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import logging
3 | import requests
4 | import json
5 | import numpy as np
6 | import os
7 | import pandas as pd
8 | from azure.storage.blob import ContentSettings
9 | from azure.storage.blob import BlockBlobService
10 | from io import StringIO
11 | from adal import AuthenticationContext
12 | from . import fetch_blob as fetching_service
13 | # Local development workflow:
14 | #   python3.6 -m venv funcenv          # create the virtual environment
15 | #   source funcenv/bin/activate        # activate it
16 | #   pip install -r requirements.txt    # install dependencies
17 | #   func host start                    # restart the host after each change
18 | #%%
26 | blob_account_name = os.getenv("BlobAccountName")
27 | blob_account_key = os.getenv("BlobAccountKey")
28 | block_blob_service = BlockBlobService(account_name=blob_account_name,
29 | account_key=blob_account_key)
30 | out_blob_final = os.getenv("OutBlobFinal")
31 | #%%
32 | AUTHORITY = 'https://login.microsoftonline.com/gemtudev.onmicrosoft.com'
33 |
34 | # Workbench web API base URL (see its Swagger API reference for endpoints)
35 | WORKBENCH_API_URL = 'https://gemtu-ws5arp-api.azurewebsites.net'
38 |
39 | # This is the application ID of the blockchain workbench web API
40 | # Login to the directory of the tenant -> App registrations -> 'Azure Blockchain Workbench *****-***** ->
41 | # copy the Application ID
42 | RESOURCE = 'a33cc4fb-e3f2-4c23-a005-b46819f58f07'
43 |
44 | # Service principal app id & secret/key. Read these from app settings rather
45 | # than committing credentials to source control (the setting names used here
46 | # must match the ones configured on the function app).
47 | CLIENT_APP_Id = os.getenv('ClientAppId')
48 | CLIENT_SECRET = os.getenv('ClientSecret')
47 | #%%
48 | auth_context = AuthenticationContext(AUTHORITY)
49 | #%%
50 | def clean(req_body):
51 |     # Fetch the latest blob into a DataFrame, build the Workbench payloads,
52 |     # and write the resulting JSON array back to blob storage.
53 |     dfCreate = fetch_blobs(out_blob_final)
54 |     json_array = populate_workbench(dfCreate)
55 |     return create_json_blob(json_array)
56 | #%%
57 | # Read/process CSV into pandas df
58 | def fetch_blobs(out_blob_final):
59 | # Create container & blob dictionary with helper function
60 | blob_dict = fetching_service.blob_to_dict(out_blob_final)
61 | # create DF
62 | filter_string = "final"
63 | df = fetching_service.blob_dict_to_df(blob_dict, filter_string)
64 | logging.info(df.dtypes)
65 | return df
66 | #%%
67 | def populate_workbench(dfCreate):
68 |     json_array = []
69 |     for index, row in dfCreate.iterrows():
70 |         try:
71 |             payload = make_create_payload(dfCreate, index)
72 |             if payload is not None:
73 |                 json_array.append(payload)
74 |         except Exception:
75 |             logging.exception('contract payload creation failed for row %s', index)
76 |             continue
77 |     return json_array
83 | #%%
84 | def make_create_payload(df, index):
85 |     # Generate the payload JSON from one row of the pandas DataFrame
86 |     workflowFunctionId = 93  # update this to match the target workflow function
87 |     try:
91 | payload = {
92 | "workflowFunctionId": workflowFunctionId,
93 | "workflowActionParameters": [
94 | {
95 | "name": "po",
96 | "value": df['po'][index]
97 | }, {
98 | "name": "itemno",
99 | "value": df['itemno'][index]
100 | }, {
101 | "name": "invno",
102 | "value": df['invno'][index]
103 | }, {
104 | "name": "signedinvval",
105 | "value": df['signedinval'][index]
106 | }, {
107 | "name": "invdate",
108 | "value": df['invdate'][index]
109 | }, {
110 | "name": "poformat",
111 | "value": df['poformat'][index]
112 | }, {
113 | "name": "popricematch",
114 | "value": df['popricematch'][index]
115 | }, {
116 | "name": "poinvpricematch",
117 | "value": df['poinvpricematch'][index]
118 | }, {
119 | "name": "initstate",
120 | "value": df['initstate'][index]
121 | }, {
122 | "name": "finalpo",
123 | "value": df['finalpo'][index]
124 | }, {
125 | "name": "finalresult",
126 | "value": df['finalresult'][index]
127 | }
128 | ]
129 | }
130 |         return payload
131 |     except KeyError as error:
132 |         logging.warning('Missing column while building payload: %s', error)
133 |         return None
134 | 
137 | def create_json_blob(json_array):
138 |     # Serialize the payload list to JSON and upload it to the output container
139 |     myarray = pd.Series(json_array).to_json(orient='values')
140 |     blob_file_name = "df_to_json.json"
141 |     block_blob_service.create_blob_from_text(out_blob_final, blob_file_name, myarray)
142 |     return 'Success'
144 | #%%
145 | createdContracts = []
148 | def create_contract(workflowId, contractCodeId, connectionId, payload):
149 |     try:
150 |         # Acquire an AAD token for the Workbench API
151 |         token = auth_context.acquire_token_with_client_credentials(
152 |             RESOURCE, CLIENT_APP_Id, CLIENT_SECRET)
153 | 
154 |         url = WORKBENCH_API_URL + '/api/v2/contracts'
155 | 
156 |         headers = {'Authorization': 'Bearer ' + token['accessToken'],
157 |                    'Content-Type': 'application/json'}
158 | 
159 |         params = {'workflowId': workflowId, 'contractCodeId': contractCodeId, 'connectionId': connectionId}
160 | 
161 |         # Call the Workbench API to create the contract
162 |         response = requests.post(url=url, data=payload, headers=headers, params=params)
163 | 
164 |         logging.info('Status code: %s', response.status_code)
165 |         logging.info('Created contractId: %s', response.text)
166 |         return response
167 |     except Exception as error:
168 |         logging.error(error)
169 |         return error
172 |
173 |
174 |
175 |
176 |
--------------------------------------------------------------------------------
/blob_to_smart_contract/fetch_blob.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import os
3 | import collections
4 | import pandas as pd
5 | import numpy as np
6 | from azure.storage.blob import ContentSettings
7 | from azure.storage.blob import BlockBlobService
8 | from io import StringIO
9 | #kill $(lsof -t -i :7071)
10 |
11 | blob_account_name = os.getenv("BlobAccountName")
12 | blob_account_key = os.getenv("BlobAccountKey")
13 | block_blob_service = BlockBlobService(account_name=blob_account_name,
14 | account_key=blob_account_key)
15 |
16 | def blob_dict_to_df(my_ordered_dict, filter_string):
17 |     # Keep only containers whose name contains filter_string
18 |     filtered_dict = {k: v for k, v in my_ordered_dict.items() if filter_string in k}
19 |     if not filtered_dict:
20 |         raise ValueError('no container name contains "%s"' % filter_string)
21 |     container_key = next(iter(filtered_dict))
22 |     latest_file = filtered_dict[container_key]
23 |     blobstring = block_blob_service.get_blob_to_text(container_key, latest_file).content
24 |     df = pd.read_csv(StringIO(blobstring), dtype=str)
25 |     df = df.replace(np.nan, '', regex=True)
26 |     # Map the reconciliation result onto an initial contract state
27 |     df["initstate"] = df["finalresult"].map(lambda x: "0" if "no" in x else "2")
28 |     return df
30 |
31 | def blob_to_dict(*args):
32 |     # Map each container name to the name of its most recently modified blob.
33 |     # (Zipping the container list against a flat list of all blob names would
34 |     # mispair them whenever a container holds more than one blob.)
35 |     container_list = list(args)
36 |     logging.info(container_list)
37 |     container_file_dict = collections.OrderedDict()
38 |     for container in container_list:
39 |         # list_blobs returns a generator; materialize it so it can be inspected
40 |         blobs = list(block_blob_service.list_blobs(container))
41 |         if not blobs:
42 |             logging.warning('no blobs found in container %s', container)
43 |             continue
44 |         latest = max(blobs, key=lambda b: b.properties.last_modified)
45 |         container_file_dict[container] = latest.name
46 |     logging.info(container_file_dict)
47 |     return container_file_dict
58 |
--------------------------------------------------------------------------------
/blob_to_smart_contract/function.json:
--------------------------------------------------------------------------------
1 | {
2 | "scriptFile": "__init__.py",
3 | "bindings": [
4 | {
5 | "authLevel": "anonymous",
6 | "type": "httpTrigger",
7 | "direction": "in",
8 | "name": "req",
9 | "methods": [
10 | "get",
11 | "post"
12 | ]
13 | },
14 | {
15 | "type": "http",
16 | "direction": "out",
17 | "name": "$return"
18 | }
19 | ]
20 | }
--------------------------------------------------------------------------------
/blob_to_smart_contract/sample.dat:
--------------------------------------------------------------------------------
1 | {
2 | "name": "Azure"
3 | }
--------------------------------------------------------------------------------
/dataset/config.ini:
--------------------------------------------------------------------------------
1 | ; config.ini
2 | [Columns]
3 | customer=highrandom
4 | order=highrandom
5 | names=Richard,Ben,Nick,Aaron,John
6 | region=east,west,central
7 | item=pens,binder,paper
8 | units=lowrandom
9 | price=lowrandom
--------------------------------------------------------------------------------
/dataset/randomcsvgenerator.py:
--------------------------------------------------------------------------------
1 | import configparser
2 | import random
3 |
4 | # nr of rows
5 | rows=100
6 |
7 | # read from config.ini
8 | config = configparser.ConfigParser()
9 | config.read('config.ini')
10 | section = config.sections()[0]
11 | col_names = config.options(section)
12 |
13 | with open('generated.csv', 'w') as f:
14 |     f.write(','.join(col_names) + "\n")
15 |     for i in range(rows):
16 |         line = []
17 |         for col in col_names:
18 |             item = config.get(section, col)
19 |             # define as many conditions as you like...
20 | 
21 |             # a large random number
22 |             if item == "highrandom":
23 |                 line.append(str(random.randrange(1000000, 9999999)))
24 |             # a medium random number
25 |             elif item == "medrandom":
26 |                 line.append(str(random.randrange(10000, 99999)))
27 |             # a small random number
28 |             elif item == "lowrandom":
29 |                 line.append(str(random.randrange(100, 999)))
30 |             # a random choice from a comma-separated set of values
31 |             elif "," in item:
32 |                 choice = random.choice(item.split(","))
33 |                 line.append(str(choice))
34 |         f.write(','.join(line) + "\n")
35 | # the with-block closes the file, so no explicit f.close() is needed
37 |
38 |
--------------------------------------------------------------------------------
/dataset/s1_raw.csv:
--------------------------------------------------------------------------------
1 | customer,order,names,region,item,units,price
2 | 7262165,9703508,Aaron,east,paper,747,997
3 | 4616455,8069744,Ben,west,paper,606,185
4 | 5971611,9145486,Ben,west,pens,271,403
5 | 2338105,3958052,Ben,central,pens,119,318
6 | 6058281,5713029,Aaron,east,pens,111,588
7 | 7799747,1935441,John,central,pens,494,541
8 | 4268894,9609672,John,central,pens,569,269
9 | 3904926,4793823,John,east,pens,480,895
10 | 7136420,4103116,John,central,paper,286,671
11 | 7615742,7936821,Aaron,east,paper,826,688
12 | 5579191,4938850,Richard,central,binder,300,498
13 | 8766316,7885362,Richard,west,paper,212,940
14 | 6829476,2759171,Ben,east,pens,116,984
15 | 9356622,6821948,John,west,paper,411,117
16 | 6120661,1749213,Aaron,central,pens,385,823
17 | 7694333,4818021,Ben,west,paper,239,753
18 | 8973305,3604550,Aaron,west,binder,428,977
19 | 7742689,2042955,Ben,east,pens,280,716
20 | 4876091,2342131,Aaron,east,pens,570,213
21 | 8678810,8595134,Nick,west,paper,666,553
22 | 4761317,2309400,Nick,east,binder,177,374
23 | 8485140,8385257,Ben,east,binder,928,787
24 | 1111334,5531601,Ben,east,paper,155,920
25 | 1587478,6966827,John,west,binder,208,257
26 | 8917514,9208473,John,central,pens,788,610
27 | 6285660,6064145,Nick,east,paper,644,589
28 | 3522700,5650262,Richard,west,pens,599,362
29 | 6392383,4018601,Nick,east,paper,412,186
30 | 7411390,8728047,Richard,west,paper,149,745
31 | 5157191,8207924,Richard,east,paper,949,554
32 | 2381505,2881712,Nick,east,paper,273,231
33 | 3682238,2436487,Ben,west,paper,944,429
34 | 9325361,2922302,Nick,central,binder,649,413
35 | 8921453,2474892,Ben,east,pens,615,826
36 | 9056577,8816645,Nick,west,paper,377,977
37 | 2568631,1440723,Aaron,east,pens,399,129
38 | 5860753,8780527,John,central,binder,791,621
39 | 2079112,5606513,Nick,west,binder,179,422
40 | 5900635,7963477,Aaron,east,binder,773,324
41 | 3195785,6243244,Richard,central,binder,458,611
42 | 5588816,2812958,Nick,east,paper,925,969
43 | 7480228,5361920,Nick,east,paper,493,222
44 | 5240415,2853334,Richard,east,paper,220,108
45 | 2741925,2509574,Aaron,central,binder,950,984
46 | 2889267,2096300,Ben,east,pens,319,215
47 | 2636111,5627275,Nick,central,binder,207,319
48 | 8504974,5010213,Ben,east,paper,578,243
49 | 2840278,3860160,Nick,central,paper,881,489
50 | 3034359,2075511,Ben,west,binder,684,413
51 | 5132194,6559888,Nick,central,pens,502,988
52 | 8058886,5301513,Richard,west,pens,403,356
53 | 4317202,5401933,Ben,west,binder,996,441
54 | 5995744,1889420,Aaron,west,pens,856,393
55 | 7335577,8629612,Aaron,central,paper,658,751
56 | 7574670,8912546,Ben,central,pens,366,530
57 | 5252527,4418895,Ben,central,pens,106,686
58 | 3376538,2151894,Nick,east,binder,336,748
59 | 5126705,4964040,Nick,west,pens,695,166
60 | 7476692,7811601,John,west,pens,714,831
61 | 6407071,5205813,John,east,pens,247,520
62 | 4590122,4003835,Richard,east,binder,996,481
63 | 8088663,4112730,Nick,east,binder,748,257
64 | 4453747,7728857,Ben,west,binder,971,433
65 | 6013003,2347973,John,east,pens,799,423
66 | 9244644,1181002,Ben,central,paper,293,434
67 | 2497717,9072391,Nick,west,pens,902,180
68 | 3343840,8678453,Richard,east,paper,617,228
69 | 1477311,9194058,John,east,paper,476,735
70 | 5865196,7676539,Ben,central,paper,624,111
71 | 4977880,4045629,Ben,central,binder,382,642
72 | 1149541,3004955,Nick,east,binder,455,136
73 | 9546677,7430616,Ben,central,paper,994,323
74 | 6475847,6794001,Nick,west,pens,211,146
75 | 1123534,9223169,John,central,paper,634,773
76 | 8339182,7183324,John,west,paper,808,461
77 | 9022922,3870153,Aaron,central,pens,416,173
78 | 4277498,8617843,Richard,central,paper,494,152
79 | 7259430,3632115,Nick,east,binder,215,536
80 | 6714318,3847473,John,central,paper,231,835
81 | 4133799,5878001,Richard,east,pens,680,722
82 | 8560303,7350110,John,east,binder,702,245
83 | 7310662,5376060,Richard,west,binder,894,315
84 | 5520029,2769000,Ben,west,paper,199,618
85 | 9296388,5402422,Nick,central,binder,936,532
86 | 2174535,5536311,Ben,west,binder,660,275
87 | 9897466,3653221,Nick,west,binder,858,656
88 | 7310133,8262752,Richard,central,pens,655,677
89 | 6768863,4916288,Richard,west,binder,350,380
90 | 9090185,2833327,John,east,binder,353,216
91 | 8453475,4107163,Ben,west,paper,735,318
92 | 4019264,9935008,Nick,east,pens,358,514
93 | 3638572,9898492,Richard,east,paper,872,390
94 | 9594055,5740416,Aaron,central,binder,932,562
95 | 9749755,3613121,Ben,central,pens,265,589
96 | 2081996,4848737,John,central,paper,113,920
97 | 1122957,8130323,Nick,west,pens,111,557
98 | 4199051,6375160,Ben,central,pens,169,849
99 | 9823036,1553562,John,west,binder,225,839
100 | 2866071,6919487,John,west,paper,714,601
101 |
--------------------------------------------------------------------------------
/dataset/s2_raw.csv:
--------------------------------------------------------------------------------
1 | customer,order,names,region,item,units,price
2 | 2630120,6615957,John,west,paper,236,820
3 | 1928229,1631195,Nick,west,paper,186,450
4 | 8703733,3332001,Ben,central,binder,789,650
5 | 4508368,9651099,Richard,central,pens,948,551
6 | 6895053,6691904,Richard,east,paper,453,548
7 | 1018357,1437828,Ben,central,pens,954,640
8 | 2697132,9866065,Nick,central,paper,696,353
9 | 3027058,9130952,Aaron,east,binder,138,223
10 | 2073981,9141578,Ben,west,paper,325,422
11 | 6076983,4238099,Nick,central,paper,134,344
12 | 5316851,8121173,John,west,pens,825,954
13 | 9962221,1977268,Aaron,central,paper,557,557
14 | 5398147,6367649,John,east,paper,532,566
15 | 3864861,4066176,Aaron,west,binder,381,199
16 | 9821733,9512218,Aaron,east,pens,466,583
17 | 2940832,3210755,John,central,binder,119,616
18 | 1799654,4468679,John,west,paper,622,300
19 | 6729716,9309020,Ben,west,binder,948,623
20 | 7280784,8332358,Ben,west,paper,279,225
21 | 6674887,1613599,Ben,west,paper,221,427
22 | 7863449,7505176,John,east,pens,218,890
23 | 3609656,4698495,Ben,west,pens,196,563
24 | 7592925,7749241,Richard,central,pens,339,498
25 | 8875502,3067891,Nick,west,binder,927,260
26 | 5286002,2341849,Richard,east,paper,801,965
27 | 5051433,5163955,Richard,east,pens,393,798
28 | 5699284,9868416,Ben,west,paper,280,750
29 | 7043309,2474609,Nick,east,paper,147,353
30 | 7151204,8237679,Nick,west,binder,664,620
31 | 9170699,9080335,John,east,binder,221,626
32 | 6321713,5514052,John,east,pens,306,410
33 | 7448969,6503473,Aaron,central,binder,293,274
34 | 4549509,6654647,Richard,west,paper,918,868
35 | 9453682,1636058,Aaron,central,pens,976,280
36 | 5335378,5838107,Richard,west,binder,699,220
37 | 1392828,7208028,Richard,west,binder,182,786
38 | 9697042,1346679,John,west,pens,238,212
39 | 1451047,4435497,Richard,west,binder,365,718
40 | 3594849,8554543,Ben,west,paper,945,127
41 | 3867317,2521725,John,east,pens,842,270
42 | 4558641,1050934,Nick,west,paper,605,286
43 | 4619372,7948476,Ben,east,pens,512,682
44 | 2026276,4485732,Ben,east,binder,795,857
45 | 2719065,5068010,Richard,east,pens,289,436
46 | 1391907,2041945,John,west,pens,917,627
47 | 1868539,2325194,Ben,west,binder,579,190
48 | 4108552,8039195,John,west,paper,271,808
49 | 1046194,5168931,Nick,central,pens,513,693
50 | 8301946,2956675,Richard,east,binder,567,761
51 | 9055248,5868755,Richard,west,binder,232,219
52 | 3874847,1563078,Ben,east,paper,706,611
53 | 8616293,5952825,Ben,east,paper,665,683
54 | 4657692,8199620,Nick,west,binder,570,961
55 | 2937477,1920961,Nick,central,paper,121,799
56 | 2902393,7232627,Richard,central,paper,873,145
57 | 2801703,7954307,Nick,central,pens,581,550
58 | 1579315,4808019,John,central,pens,646,138
59 | 5104644,4471392,Aaron,central,paper,839,524
60 | 8117338,8816269,Aaron,east,binder,795,481
61 | 4292715,5144317,Nick,central,binder,639,451
62 | 9574437,7149165,Nick,east,pens,474,510
63 | 1286942,6788174,Ben,west,binder,836,742
64 | 7914658,1253557,John,central,binder,586,662
65 | 3610539,8287938,John,west,paper,170,416
66 | 9691626,7703325,Ben,west,binder,292,794
67 | 7773380,3324706,Ben,central,paper,372,558
68 | 9560196,7923059,Nick,east,pens,727,181
69 | 8331616,6920131,Ben,east,pens,262,530
70 | 2169243,8424174,Ben,east,paper,988,668
71 | 9149901,1420867,Richard,east,pens,310,693
72 | 1952779,4474360,Aaron,east,paper,333,782
73 | 9247917,9273201,Nick,east,pens,790,519
74 | 1899785,2109114,Ben,east,pens,476,668
75 | 7856754,2280721,Ben,east,paper,124,746
76 | 2105839,5509421,Nick,central,pens,379,716
77 | 3515994,1988786,Aaron,central,pens,599,712
78 | 6461676,7340276,John,central,pens,920,190
79 | 7276182,1076975,Aaron,central,binder,927,840
80 | 2152277,2696815,Aaron,east,binder,685,812
81 | 5527535,5810406,Aaron,west,pens,542,337
82 | 8463126,2974927,Richard,central,paper,678,133
83 | 7173049,8681162,Nick,central,paper,506,847
84 | 8719679,5690117,Nick,central,pens,230,578
85 | 9617614,9591048,Richard,east,paper,913,301
86 | 3377423,3798798,Ben,east,paper,769,947
87 | 8451040,1070835,Richard,east,pens,418,508
88 | 8332099,8158160,Ben,west,binder,657,577
89 | 6570058,4061390,Aaron,west,binder,406,426
90 | 4080314,8616824,Aaron,central,pens,797,221
91 | 4375686,4191217,John,west,binder,734,550
92 | 2386825,6043101,Nick,east,binder,524,491
93 | 3272868,2159803,Ben,central,paper,523,601
94 | 5116646,6224073,Richard,central,binder,531,529
95 | 9699564,8448356,Richard,west,paper,706,241
96 | 8484139,1132006,Aaron,central,binder,241,516
97 | 3612660,7468263,Richard,east,binder,437,107
98 | 5732816,1145162,John,central,paper,289,420
99 | 3484603,6390384,Nick,west,binder,112,242
100 | 2040436,2378152,Richard,west,binder,990,590
101 |
--------------------------------------------------------------------------------
/host.json:
--------------------------------------------------------------------------------
1 | {
2 | "version": "2.0"
3 | }
--------------------------------------------------------------------------------
/local.settings.json:
--------------------------------------------------------------------------------
1 | {
2 | "IsEncrypted": false,
3 | "Values": {
4 | "FUNCTIONS_WORKER_RUNTIME": "python",
5 | "BlobAccountName": "",
6 | "BlobAccountKey": "",
7 | "C1": "c1raw",
8 | "C2": "c2raw",
9 | "FINAL" : "reconciled"
10 | },
11 | "ConnectionStrings": {}
12 | }
13 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | asn1crypto==0.24.0
2 | astroid==2.1.0
3 | azure-common==1.1.18
4 | azure-functions==1.0.0a5
5 | azure-functions-worker==1.0.0a6
6 | azure-storage-blob==1.4.0
7 | azure-storage-common==1.4.0
8 | certifi==2018.11.29
9 | cffi==1.11.5
10 | chardet==3.0.4
11 | cryptography==2.5
12 | cycler==0.10.0
13 | grpcio==1.14.2
14 | grpcio-tools==1.14.2
15 | idna==2.8
16 | isort==4.3.4
17 | kiwisolver==1.0.1
18 | lazy-object-proxy==1.3.1
19 | matplotlib==3.0.2
20 | mccabe==0.6.1
21 | numpy==1.16.1
22 | pandas==0.24.1
23 | protobuf==3.6.1
24 | ptvsd==4.2.3
25 | pycparser==2.19
26 | pylint==2.2.2
27 | pyparsing==2.3.1
28 | python-dateutil==2.7.5
29 | pytz==2018.9
30 | requests==2.21.0
31 | scikit-learn==0.20.2
32 | scipy==1.2.0
33 | six==1.12.0
35 | typed-ast==1.3.0
36 | urllib3==1.24.1
37 | wrapt==1.11.1
38 |
--------------------------------------------------------------------------------
/tests/host.json:
--------------------------------------------------------------------------------
1 | {
2 | "version": "2.0"
3 | }
--------------------------------------------------------------------------------
/tests/subvalidation.json:
--------------------------------------------------------------------------------
1 | [{
2 | "id": "2d1781af-3a4c-4d7c-bd0c-e34b19da4e66",
3 | "topic": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
4 | "subject": "",
5 | "data": {
6 | "validationCode": "512d38b6-c7b8-40c8-89fe-f46f9e9622b6",
7 | "validationUrl": "https://rp-eastus2.eventgrid.azure.net:553/eventsubscriptions/estest/validate?id=B2E34264-7D71-453A-B5FB-B62D0FDC85EE&t=2018-04-26T20:30:54.4538837Z&apiVersion=2018-05-01-preview&token=1BNqCxBBSSE9OnNSfZM4%2b5H9zDegKMY6uJ%2fO2DFRkwQ%3d"
8 | },
9 | "eventType": "Microsoft.EventGrid.SubscriptionValidationEvent",
10 | "eventTime": "2018-01-25T22:12:19.4556811Z",
11 | "metadataVersion": "1",
12 | "dataVersion": "1"
13 | }]
--------------------------------------------------------------------------------
/tests/test_eventgrid.py:
--------------------------------------------------------------------------------
1 | import pytest
2 | import subprocess
3 | import os
4 | import signal
5 | import requests
6 | import json
7 | import time
9 | ### Travis: http://luisquintanilla.me/2018/02/18/testing-deploying-python-projects-travisci/
10 | #func host start and then open a new terminal and cd into the test folder
11 | # pytest -v test_eventgrid.py
12 | ## https://learning.oreilly.com/library/view/python-testing-with/9781680502848/f_0011.xhtml#ch.pytest
13 | pro = None  # handle for the functions host subprocess, if started by the fixture
20 |
21 |
22 | @pytest.fixture
23 | def init_func():
24 | pass
25 | #subprocess.Popen("func host start",shell=True)
26 | # The os.setsid() is passed in the argument preexec_fn so
27 | # it's run after the fork() and before exec() to run the shell.
28 | #pro = subprocess.Popen(['func','host','start'],stdout=subprocess.PIPE, shell=True, preexec_fn=os.setsid)
29 | #yield
30 | #print("tearing down functions host...")
31 | #os.killpg(os.getpgid(pro.pid), signal.SIGTERM)
32 |
33 |
34 | # https://docs.pytest.org/en/latest/fixture.html
35 | # https://docs.pytest.org/en/latest/parametrize.html
36 | # https://learning.oreilly.com/library/view/Python+Testing+with+pytest/9781680502848/f_0026.xhtml#parametrized_testing
37 | # Use @pytest.mark.parametrize(argnames, argvalues) to pass lots of data through the same test, like this:
38 | @pytest.mark.parametrize('web', ['http://localhost:7071/api/GE_Clean_Trigger',
39 | 'http://localhost:7071/api/MTU_Clean_Trigger',
40 | 'http://localhost:7071/api/PO_Match'])
45 |
46 | def test_eg_validation(init_func, web):
47 | with open('subvalidation.json') as f:
48 | payload = json.load(f)
49 | r = requests.post(web, json = payload)
50 | print(r.status_code,r.json())
51 | assert 'validationResponse' in str(r.json())
52 |
--------------------------------------------------------------------------------