├── .gitignore ├── LICENSE ├── README.md └── sample.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | env/ 12 | build/ 13 | develop-eggs/ 14 | dist/ 15 | downloads/ 16 | eggs/ 17 | .eggs/ 18 | lib/ 19 | lib64/ 20 | parts/ 21 | sdist/ 22 | var/ 23 | wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | 28 | # PyInstaller 29 | # Usually these files are written by a python script from a template 30 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 31 | *.manifest 32 | *.spec 33 | 34 | # Installer logs 35 | pip-log.txt 36 | pip-delete-this-directory.txt 37 | 38 | # Unit test / coverage reports 39 | htmlcov/ 40 | .tox/ 41 | .coverage 42 | .coverage.* 43 | .cache 44 | nosetests.xml 45 | coverage.xml 46 | *.cover 47 | .hypothesis/ 48 | 49 | # Translations 50 | *.mo 51 | *.pot 52 | 53 | # Django stuff: 54 | *.log 55 | local_settings.py 56 | 57 | # Flask stuff: 58 | instance/ 59 | .webassets-cache 60 | 61 | # Scrapy stuff: 62 | .scrapy 63 | 64 | # Sphinx documentation 65 | docs/_build/ 66 | 67 | # PyBuilder 68 | target/ 69 | 70 | # Jupyter Notebook 71 | .ipynb_checkpoints 72 | 73 | # pyenv 74 | .python-version 75 | 76 | # celery beat schedule file 77 | celerybeat-schedule 78 | 79 | # SageMath parsed files 80 | *.sage.py 81 | 82 | # dotenv 83 | .env 84 | 85 | # virtualenv 86 | .venv 87 | venv/ 88 | ENV/ 89 | 90 | # Spyder project settings 91 | .spyderproject 92 | .spyproject 93 | 94 | # Rope project settings 95 | .ropeproject 96 | 97 | # mkdocs documentation 98 | /site 99 | 100 | # mypy 101 | .mypy_cache/ 102 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 
Microsoft Corporation. All rights reserved. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | --- 2 | languages: 3 | - python 4 | products: 5 | - azure 6 | - azure-active-directory 7 | page_type: sample 8 | description: "When building an application with Python SDK for Data Lake Analytics, you need to pick how your application will sign in to Azure Active Directory." 9 | --- 10 | 11 | # Authenticating your Python application against Azure Active Directory 12 | 13 | ## Overview 14 | 15 | When building an application that uses the Python SDK for Data Lake Analytics (ADLA), you need to pick how your application will sign in to Azure Active Directory (AAD). 
16 | 17 | There are two fundamental ways to have your application sign in: 18 | * **Interactive** - Use this method when your application has a user directly using your application and your app needs to perform operations in the context of that user. 19 | * **Non-interactive** - Use this method when your application is not meant to interact with ADLA as a specific user. This is useful for long-running services. 20 | 21 | ## Required Python packages 22 | 23 | * [adal](https://pypi.python.org/pypi/adal/0.4.7) - v0.4.7 24 | * [azure-mgmt-datalake-analytics](https://pypi.python.org/pypi/azure-mgmt-datalake-analytics/0.2.0) - v0.2.0 25 | 26 | 27 | If you have Python installed, you can install these packages via the command line with the following commands: 28 | 29 | ``` 30 | pip install adal==0.4.7 31 | pip install azure-mgmt-datalake-analytics==0.2.0 32 | ``` 33 | 34 | ## Required imports 35 | 36 | To simplify the code samples, ensure you have the following `import` statements at the top of your code. 37 | 38 | ```python 39 | ## AADTokenCredentials for multi-factor authentication 40 | from msrestazure.azure_active_directory import AADTokenCredentials 41 | 42 | ## Required for Azure Data Lake Analytics job management 43 | from azure.mgmt.datalake.analytics.job import DataLakeAnalyticsJobManagementClient 44 | from azure.mgmt.datalake.analytics.job.models import JobInformation, JobState, USqlJobProperties 45 | 46 | ## Other required imports 47 | import adal, uuid, time 48 | ``` 49 | 50 | ## Basic authentication workflow 51 | 52 | For a given domain (tenant), your code needs to get credentials (tokens) for each Azure REST endpoint (token audience) that you intend to use. Once the credentials are retrieved, REST clients are built using those credentials. 53 | 54 | #### Token Audiences 55 | These are the Azure REST endpoints (token audiences) that are used in the samples: 56 | * Azure Resource Manager management operations: ``https://management.core.windows.net/``. 
57 | * Azure Data Lake data plane operations: ``https://datalake.azure.net/``. 58 | 59 | #### Domains and Tenant 60 | 61 | You can retrieve your AAD domain / tenant ID by going to the [Azure portal](https://portal.azure.com/) and clicking 'Azure Active Directory'. An example domain is "contoso.onmicrosoft.com". 62 | 63 | #### Client ID 64 | 65 | All clients must have a "Client ID" that is known by the domain you are connecting to. 66 | 67 | #### Sample code 68 | 69 | ```python 70 | if __name__ == '__main__': 71 | creds = authenticate_device_code() 72 | ``` 73 | 74 | The `authenticate_*TYPE*` placeholder represents one of the three helper methods used in the samples. The helper methods are shown below. 75 | 76 | ## Interactive login 77 | 78 | There are two ways to use interactive login: 79 | * **Interactive Pop-up** - The user sees a prompt appear on the device they are using and signs in through that prompt. This document does not cover this case yet. 80 | * **Interactive Device code** - The user does NOT see a prompt on the device they are using. This is useful when, for example, it is not possible to show a prompt. 81 | 82 | ### Authenticate interactively with a pop-up 83 | 84 | This option is used when you want to have a browser popup appear when the user signs in to your application, showing an AAD login form. From this interactive popup, your application will receive the tokens necessary to use the Data Lake Analytics Python SDK on behalf of the user. 85 | 86 | This is not supported yet. 87 | 88 | ### Authenticate interactively with a device code 89 | 90 | Azure Active Directory also supports a form of authentication called "device code" authentication. Using this, you can direct your end-user to a browser window, where they will complete their sign-in process before returning to your application. 91 | 92 | ```python 93 | def authenticate_device_code(): 94 | """ 95 | Authenticate the end-user using device auth. 
96 | """ 97 | authority_host_uri = 'https://login.microsoftonline.com' 98 | tenant = '' 99 | authority_uri = authority_host_uri + '/' + tenant 100 | resource_uri = 'https://management.core.windows.net/' 101 | client_id = '04b07795-8ddb-461a-bbee-02f9e1bf7b46' 102 | 103 | context = adal.AuthenticationContext(authority_uri, api_version=None) 104 | code = context.acquire_user_code(resource_uri, client_id) 105 | print(code['message']) 106 | mgmt_token = context.acquire_token_with_device_code(resource_uri, code, client_id) 107 | credentials = AADTokenCredentials(mgmt_token, client_id) 108 | 109 | return credentials 110 | ``` 111 | 112 | > NOTE: The client ID used above is a well-known ID that already exists for all Azure services. While it makes the sample code easy to use, for production code you should generate your own client IDs for your application. 113 | 114 | ### Non-interactive - Service principal - Authentication 115 | 116 | Use this option if you want to have your application authenticate against AAD using its own credentials, rather than those of a user. Using this process, your application will receive the tokens necessary to use the Data Lake Analytics Python SDK as a service principal, 117 | which represents your application in AAD. 118 | 119 | Non-interactive - Service principal / application 120 | * Using a secret key 121 | * Using a certificate 122 | 123 | #### Service principals 124 | To create a service principal, [follow the steps in this article](https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-authenticate-service-principal). 125 | 126 | #### Authenticate non-interactively with a secret key 127 | 128 | ```python 129 | def authenticate_client_key(): 130 | """ 131 | Authenticate using service principal w/ key. 
132 | """ 133 | authority_host_uri = 'https://login.microsoftonline.com' 134 | tenant = '' 135 | authority_uri = authority_host_uri + '/' + tenant 136 | resource_uri = 'https://management.core.windows.net/' 137 | client_id = '' 138 | client_secret = '' 139 | 140 | context = adal.AuthenticationContext(authority_uri, api_version=None) 141 | mgmt_token = context.acquire_token_with_client_credentials(resource_uri, client_id, client_secret) 142 | credentials = AADTokenCredentials(mgmt_token, client_id) 143 | 144 | return credentials 145 | ``` 146 | 147 | #### Authenticate non-interactively with a certificate 148 | 149 | ```python 150 | def authenticate_client_cert(): 151 | """ 152 | Authenticate using service principal w/ cert. 153 | """ 154 | authority_host_uri = 'https://login.microsoftonline.com' 155 | tenant = '' 156 | authority_uri = authority_host_uri + '/' + tenant 157 | resource_uri = 'https://management.core.windows.net/' 158 | client_id = '' 159 | client_cert = '' 160 | client_cert_thumbprint = '' 161 | 162 | context = adal.AuthenticationContext(authority_uri, api_version=None) 163 | 164 | mgmt_token = context.acquire_token_with_client_certificate(resource_uri, client_id, client_cert, client_cert_thumbprint) 165 | credentials = AADTokenCredentials(mgmt_token, client_id) 166 | 167 | return credentials 168 | ``` 169 | 170 | #### Setting up and using Data Lake SDKs 171 | Once you have followed one of the approaches for authentication, you're ready to set up your ADLA Python SDK client objects, which you'll use to perform various actions with the service. Remember to use the right tokens/credentials with the right clients: use the ADL credentials for data plane operations, and use the ARM credentials for resource- and account-related operations. 
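
To make the "one credential per token audience" idea concrete, here is a minimal sketch that requests one token for each of the two audiences listed under "Token Audiences". The names `AUDIENCES`, `build_authority_uri`, `acquire_tokens_per_audience`, and `fake_acquire_token` are hypothetical helpers introduced for illustration; they are not part of adal or the Azure SDK. The stand-in token function makes no network calls, so the shape of its return value is an assumption and not what AAD returns.

```python
# Token audiences listed under "Token Audiences" in this README.
AUDIENCES = {
    'arm': 'https://management.core.windows.net/',  # resource- and account-related operations
    'adl': 'https://datalake.azure.net/',           # Data Lake data plane operations
}


def build_authority_uri(tenant, authority_host_uri='https://login.microsoftonline.com'):
    """Join the authority host and the tenant the same way the samples do."""
    return authority_host_uri.rstrip('/') + '/' + tenant


def acquire_tokens_per_audience(acquire_token, audiences=AUDIENCES):
    """Call acquire_token(resource_uri) once per audience.

    acquire_token is a caller-supplied callable (hypothetical interface) that
    takes a resource URI and returns a token for that audience.
    Returns a dict mapping the audience name to its token.
    """
    return {name: acquire_token(resource_uri)
            for name, resource_uri in audiences.items()}


def fake_acquire_token(resource_uri):
    # Stand-in for a real token request; it only records the audience it was asked for.
    return {'resource': resource_uri, 'token': 'placeholder'}


if __name__ == '__main__':
    print(build_authority_uri('contoso.onmicrosoft.com'))
    tokens = acquire_tokens_per_audience(fake_acquire_token)
    for name in sorted(tokens):
        print(name, '->', tokens[name]['resource'])
```

In real code, the callable could wrap one of the adal calls shown in the helper methods, for example `lambda resource_uri: context.acquire_token_with_client_credentials(resource_uri, client_id, client_secret)`, and each returned token could then be wrapped with `AADTokenCredentials(token, client_id)` as the samples do.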
172 | 173 | You can then perform actions using the clients, like so: 174 | 175 | ```python 176 | adla_account = '' 177 | resource_group = '' 178 | sub_id = '' 179 | 180 | job_client = DataLakeAnalyticsJobManagementClient(creds, 'azuredatalakeanalytics.net') 181 | 182 | script = '@a = SELECT * FROM (VALUES ("Hello, World!")) AS T(message); OUTPUT @a TO "/Samples/Output/HelloWorld.csv" USING Outputters.Csv();' 183 | 184 | job_id = str(uuid.uuid4()) 185 | 186 | job_client.job.create(adla_account, job_id, JobInformation( 187 | name='HelloWorld', 188 | type='USql', 189 | properties=USqlJobProperties(script=script) 190 | )) 191 | 192 | job_result = job_client.job.get(adla_account, job_id) 193 | 194 | while job_result.state != JobState.ended: 195 | print('Job is not yet done. Waiting for 3 seconds. Current state: ' + job_result.state.value) 196 | time.sleep(3) 197 | job_result = job_client.job.get(adla_account, job_id) 198 | 199 | print('Job finished with result: ' + job_result.result.value) 200 | ``` 201 | 202 | ## Contributing 203 | 204 | This project welcomes contributions and suggestions. Most contributions require you to agree to a 205 | Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us 206 | the rights to use your contribution. For details, visit https://cla.microsoft.com. 207 | 208 | When you submit a pull request, a CLA-bot will automatically determine whether you need to provide 209 | a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions 210 | provided by the bot. You will only need to do this once across all repos using our CLA. 211 | 212 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 
213 | For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or 214 | contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. 215 | -------------------------------------------------------------------------------- /sample.py: -------------------------------------------------------------------------------- 1 | ## AADTokenCredentials for multi-factor authentication 2 | from msrestazure.azure_active_directory import AADTokenCredentials 3 | 4 | ## Required for Azure Data Lake Analytics job management 5 | from azure.mgmt.datalake.analytics.job import DataLakeAnalyticsJobManagementClient 6 | from azure.mgmt.datalake.analytics.job.models import JobInformation, JobState, USqlJobProperties 7 | 8 | ## Other required imports 9 | import adal, uuid, time 10 | 11 | def authenticate_device_code(): 12 | """ 13 | Authenticate the end-user using device auth. 14 | """ 15 | authority_host_uri = 'https://login.microsoftonline.com' 16 | tenant = '' 17 | authority_uri = authority_host_uri + '/' + tenant 18 | resource_uri = 'https://management.core.windows.net/' 19 | client_id = '04b07795-8ddb-461a-bbee-02f9e1bf7b46' 20 | 21 | context = adal.AuthenticationContext(authority_uri, api_version=None) 22 | code = context.acquire_user_code(resource_uri, client_id) 23 | print(code['message']) 24 | mgmt_token = context.acquire_token_with_device_code(resource_uri, code, client_id) 25 | credentials = AADTokenCredentials(mgmt_token, client_id) 26 | 27 | return credentials 28 | 29 | def authenticate_username_password(): 30 | """ 31 | Authenticate using user w/ username + password. 32 | This doesn't work for users or tenants that have multi-factor authentication required. 
33 | """ 34 | authority_host_uri = 'https://login.microsoftonline.com' 35 | tenant = '' 36 | authority_uri = authority_host_uri + '/' + tenant 37 | resource_uri = 'https://management.core.windows.net/' 38 | username = '' 39 | password = '' 40 | client_id = '' 41 | 42 | context = adal.AuthenticationContext(authority_uri, api_version=None) 43 | mgmt_token = context.acquire_token_with_username_password(resource_uri, username, password, client_id) 44 | credentials = AADTokenCredentials(mgmt_token, client_id) 45 | 46 | return credentials 47 | 48 | def authenticate_client_key(): 49 | """ 50 | Authenticate using service principal w/ key. 51 | """ 52 | authority_host_uri = 'https://login.microsoftonline.com' 53 | tenant = '' 54 | authority_uri = authority_host_uri + '/' + tenant 55 | resource_uri = 'https://management.core.windows.net/' 56 | client_id = '' 57 | client_secret = '' 58 | 59 | context = adal.AuthenticationContext(authority_uri, api_version=None) 60 | mgmt_token = context.acquire_token_with_client_credentials(resource_uri, client_id, client_secret) 61 | credentials = AADTokenCredentials(mgmt_token, client_id) 62 | 63 | return credentials 64 | 65 | def authenticate_client_cert(): 66 | """ 67 | Authenticate using service principal w/ cert. 
68 | """ 69 | authority_host_uri = 'https://login.microsoftonline.com' 70 | tenant = '' 71 | authority_uri = authority_host_uri + '/' + tenant 72 | resource_uri = 'https://management.core.windows.net/' 73 | client_id = '' 74 | client_cert = '' 75 | client_cert_thumbprint = '' 76 | 77 | context = adal.AuthenticationContext(authority_uri, api_version=None) 78 | 79 | mgmt_token = context.acquire_token_with_client_certificate(resource_uri, client_id, client_cert, client_cert_thumbprint) 80 | credentials = AADTokenCredentials(mgmt_token, client_id) 81 | 82 | return credentials 83 | 84 | if __name__ == '__main__': 85 | creds = authenticate_device_code() 86 | 87 | adla_account = '' 88 | resource_group = '' 89 | sub_id = '' 90 | 91 | job_client = DataLakeAnalyticsJobManagementClient(creds, 'azuredatalakeanalytics.net') 92 | 93 | script = '@a = SELECT * FROM (VALUES ("Hello, World!")) AS T(message); OUTPUT @a TO "/Samples/Output/HelloWorld.csv" USING Outputters.Csv();' 94 | 95 | job_id = str(uuid.uuid4()) 96 | 97 | job_client.job.create(adla_account, job_id, JobInformation( 98 | name='HelloWorld', 99 | type='USql', 100 | properties=USqlJobProperties(script=script) 101 | )) 102 | 103 | job_result = job_client.job.get(adla_account, job_id) 104 | 105 | while job_result.state != JobState.ended: 106 | print('Job is not yet done. Waiting for 3 seconds. Current state: ' + job_result.state.value) 107 | time.sleep(3) 108 | job_result = job_client.job.get(adla_account, job_id) 109 | 110 | print('Job finished with result: ' + job_result.result.value) 111 | --------------------------------------------------------------------------------