├── .github ├── ISSUE_TEMPLATE.md └── PULL_REQUEST_TEMPLATE.md ├── .gitignore ├── CONTRIBUTING.md ├── LICENSE.md ├── README.md └── src ├── config.py ├── python_quickstart_client.py ├── requirements.txt ├── taskdata0.txt ├── taskdata1.txt └── taskdata2.txt /.github/ISSUE_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | 4 | > Please provide us with the following information: 5 | > --------------------------------------------------------------- 6 | 7 | ### This issue is for a: (mark with an `x`) 8 | ``` 9 | - [ ] bug report -> please search issues before submitting 10 | - [ ] feature request 11 | - [ ] documentation issue or request 12 | - [ ] regression (a behavior that used to work and stopped in a new release) 13 | ``` 14 | 15 | ### Minimal steps to reproduce 16 | > 17 | 18 | ### Any log messages given by the failure 19 | > 20 | 21 | ### Expected/desired behavior 22 | > 23 | 24 | ### OS and Version? 25 | > Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?) 26 | 27 | ### Versions 28 | > 29 | 30 | ### Mention any other details that might be useful 31 | 32 | > --------------------------------------------------------------- 33 | > Thanks! We'll be in touch soon. 34 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | ## Purpose 2 | 3 | * ... 4 | 5 | ## Does this introduce a breaking change? 6 | 7 | ``` 8 | [ ] Yes 9 | [ ] No 10 | ``` 11 | 12 | ## Pull Request Type 13 | What kind of change does this Pull Request introduce? 14 | 15 | 16 | ``` 17 | [ ] Bugfix 18 | [ ] Feature 19 | [ ] Code style update (formatting, local variables) 20 | [ ] Refactoring (no functional changes, no api changes) 21 | [ ] Documentation content changes 22 | [ ] Other... 
Please describe: 23 | ``` 24 | 25 | ## How to Test 26 | * Get the code 27 | 28 | ``` 29 | git clone [repo-address] 30 | cd [repo-name] 31 | git checkout [branch-name] 32 | npm install 33 | ``` 34 | 35 | * Test the code 36 | 37 | ``` 38 | ``` 39 | 40 | ## What to Check 41 | Verify that the following are valid 42 | * ... 43 | 44 | ## Other Information 45 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | ## Ignore Visual Studio temporary files, build results, and 2 | ## files generated by popular Visual Studio add-ons. 3 | ## 4 | ## Get latest from https://github.com/github/gitignore/blob/master/VisualStudio.gitignore 5 | 6 | # User-specific files 7 | *.suo 8 | *.user 9 | *.userosscache 10 | *.sln.docstates 11 | 12 | # User-specific files (MonoDevelop/Xamarin Studio) 13 | *.userprefs 14 | 15 | # Build results 16 | [Dd]ebug/ 17 | [Dd]ebugPublic/ 18 | [Rr]elease/ 19 | [Rr]eleases/ 20 | x64/ 21 | x86/ 22 | bld/ 23 | [Bb]in/ 24 | [Oo]bj/ 25 | [Ll]og/ 26 | 27 | # Visual Studio 2015 cache/options directory 28 | .vs/ 29 | # Uncomment if you have tasks that create the project's static files in wwwroot 30 | #wwwroot/ 31 | 32 | # MSTest test Results 33 | [Tt]est[Rr]esult*/ 34 | [Bb]uild[Ll]og.* 35 | 36 | # NUNIT 37 | *.VisualState.xml 38 | TestResult.xml 39 | 40 | # Build Results of an ATL Project 41 | [Dd]ebugPS/ 42 | [Rr]eleasePS/ 43 | dlldata.c 44 | 45 | # .NET Core 46 | project.lock.json 47 | project.fragment.lock.json 48 | artifacts/ 49 | **/Properties/launchSettings.json 50 | 51 | *_i.c 52 | *_p.c 53 | *_i.h 54 | *.ilk 55 | *.meta 56 | *.obj 57 | *.pch 58 | *.pdb 59 | *.pgc 60 | *.pgd 61 | *.rsp 62 | *.sbr 63 | *.tlb 64 | *.tli 65 | *.tlh 66 | *.tmp 67 | *.tmp_proj 68 | *.log 69 | *.vspscc 70 | *.vssscc 71 | .builds 72 | *.pidb 73 | *.svclog 74 | *.scc 75 | 76 | # Chutzpah Test files 77 | _Chutzpah* 78 | 79 | # Visual C++ cache files 80 | 
ipch/ 81 | *.aps 82 | *.ncb 83 | *.opendb 84 | *.opensdf 85 | *.sdf 86 | *.cachefile 87 | *.VC.db 88 | *.VC.VC.opendb 89 | 90 | # Visual Studio profiler 91 | *.psess 92 | *.vsp 93 | *.vspx 94 | *.sap 95 | 96 | # TFS 2012 Local Workspace 97 | $tf/ 98 | 99 | # Guidance Automation Toolkit 100 | *.gpState 101 | 102 | # ReSharper is a .NET coding add-in 103 | _ReSharper*/ 104 | *.[Rr]e[Ss]harper 105 | *.DotSettings.user 106 | 107 | # JustCode is a .NET coding add-in 108 | .JustCode 109 | 110 | # TeamCity is a build add-in 111 | _TeamCity* 112 | 113 | # DotCover is a Code Coverage Tool 114 | *.dotCover 115 | 116 | # Visual Studio code coverage results 117 | *.coverage 118 | *.coveragexml 119 | 120 | # NCrunch 121 | _NCrunch_* 122 | .*crunch*.local.xml 123 | nCrunchTemp_* 124 | 125 | # MightyMoose 126 | *.mm.* 127 | AutoTest.Net/ 128 | 129 | # Web workbench (sass) 130 | .sass-cache/ 131 | 132 | # Installshield output folder 133 | [Ee]xpress/ 134 | 135 | # DocProject is a documentation generator add-in 136 | DocProject/buildhelp/ 137 | DocProject/Help/*.HxT 138 | DocProject/Help/*.HxC 139 | DocProject/Help/*.hhc 140 | DocProject/Help/*.hhk 141 | DocProject/Help/*.hhp 142 | DocProject/Help/Html2 143 | DocProject/Help/html 144 | 145 | # Click-Once directory 146 | publish/ 147 | 148 | # Publish Web Output 149 | *.[Pp]ublish.xml 150 | *.azurePubxml 151 | # TODO: Comment the next line if you want to checkin your web deploy settings 152 | # but database connection strings (with potential passwords) will be unencrypted 153 | *.pubxml 154 | *.publishproj 155 | 156 | # Microsoft Azure Web App publish settings. 
Comment the next line if you want to 157 | # checkin your Azure Web App publish settings, but sensitive information contained 158 | # in these scripts will be unencrypted 159 | PublishScripts/ 160 | 161 | # NuGet Packages 162 | *.nupkg 163 | # The packages folder can be ignored because of Package Restore 164 | **/packages/* 165 | # except build/, which is used as an MSBuild target. 166 | !**/packages/build/ 167 | # Uncomment if necessary however generally it will be regenerated when needed 168 | #!**/packages/repositories.config 169 | # NuGet v3's project.json files produces more ignorable files 170 | *.nuget.props 171 | *.nuget.targets 172 | 173 | # Microsoft Azure Build Output 174 | csx/ 175 | *.build.csdef 176 | 177 | # Microsoft Azure Emulator 178 | ecf/ 179 | rcf/ 180 | 181 | # Windows Store app package directories and files 182 | AppPackages/ 183 | BundleArtifacts/ 184 | Package.StoreAssociation.xml 185 | _pkginfo.txt 186 | 187 | # Visual Studio cache files 188 | # files ending in .cache can be ignored 189 | *.[Cc]ache 190 | # but keep track of directories ending in .cache 191 | !*.[Cc]ache/ 192 | 193 | # Others 194 | ClientBin/ 195 | ~$* 196 | *~ 197 | *.dbmdl 198 | *.dbproj.schemaview 199 | *.jfm 200 | *.pfx 201 | *.publishsettings 202 | orleans.codegen.cs 203 | 204 | # Since there are multiple workflows, uncomment next line to ignore bower_components 205 | # (https://github.com/github/gitignore/pull/1529#issuecomment-104372622) 206 | #bower_components/ 207 | 208 | # RIA/Silverlight projects 209 | Generated_Code/ 210 | 211 | # Backup & report files from converting an old project file 212 | # to a newer Visual Studio version. 
Backup files are not needed, 213 | # because we have git ;-) 214 | _UpgradeReport_Files/ 215 | Backup*/ 216 | UpgradeLog*.XML 217 | UpgradeLog*.htm 218 | 219 | # SQL Server files 220 | *.mdf 221 | *.ldf 222 | *.ndf 223 | 224 | # Business Intelligence projects 225 | *.rdl.data 226 | *.bim.layout 227 | *.bim_*.settings 228 | 229 | # Microsoft Fakes 230 | FakesAssemblies/ 231 | 232 | # GhostDoc plugin setting file 233 | *.GhostDoc.xml 234 | 235 | # Node.js Tools for Visual Studio 236 | .ntvs_analysis.dat 237 | node_modules/ 238 | 239 | # Typescript v1 declaration files 240 | typings/ 241 | 242 | # Visual Studio 6 build log 243 | *.plg 244 | 245 | # Visual Studio 6 workspace options file 246 | *.opt 247 | 248 | # Visual Studio 6 auto-generated workspace file (contains which files were open etc.) 249 | *.vbw 250 | 251 | # Visual Studio LightSwitch build output 252 | **/*.HTMLClient/GeneratedArtifacts 253 | **/*.DesktopClient/GeneratedArtifacts 254 | **/*.DesktopClient/ModelManifest.xml 255 | **/*.Server/GeneratedArtifacts 256 | **/*.Server/ModelManifest.xml 257 | _Pvt_Extensions 258 | 259 | # Paket dependency manager 260 | .paket/paket.exe 261 | paket-files/ 262 | 263 | # FAKE - F# Make 264 | .fake/ 265 | 266 | # JetBrains Rider 267 | .idea/ 268 | *.sln.iml 269 | 270 | # CodeRush 271 | .cr/ 272 | 273 | # Python Tools for Visual Studio (PTVS) 274 | __pycache__/ 275 | *.pyc 276 | 277 | # Cake - Uncomment if you are using it 278 | # tools/** 279 | # !tools/packages.config 280 | 281 | # Telerik's JustMock configuration file 282 | *.jmconfig 283 | 284 | # BizTalk build output 285 | *.btp.cs 286 | *.btm.cs 287 | *.odx.cs 288 | *.xsd.cs 289 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing to Azure Samples 2 | 3 | This project welcomes contributions and suggestions. 
Most contributions require you to agree to a 4 | Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us 5 | the rights to use your contribution. For details, visit https://cla.microsoft.com. 6 | 7 | When you submit a pull request, a CLA-bot will automatically determine whether you need to provide 8 | a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions 9 | provided by the bot. You will only need to do this once across all repos using our CLA. 10 | 11 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 12 | For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or 13 | contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. 14 | 15 | - [Code of Conduct](#coc) 16 | - [Issues and Bugs](#issue) 17 | - [Feature Requests](#feature) 18 | - [Submission Guidelines](#submit) 19 | 20 | ## Code of Conduct 21 | Help us keep this project open and inclusive. Please read and follow our [Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 22 | 23 | ## Found an Issue? 24 | If you find a bug in the source code or a mistake in the documentation, you can help us by 25 | [submitting an issue](#submit-issue) to the GitHub Repository. Even better, you can 26 | [submit a Pull Request](#submit-pr) with a fix. 27 | 28 | ## Want a Feature? 29 | You can *request* a new feature by [submitting an issue](#submit-issue) to the GitHub 30 | Repository. If you would like to *implement* a new feature, please submit an issue with 31 | a proposal for your work first, to be sure that we can use it. 32 | 33 | * **Small Features** can be crafted and directly [submitted as a Pull Request](#submit-pr). 
34 | 35 | ## Submission Guidelines 36 | 37 | ### Submitting an Issue 38 | Before you submit an issue, search the archive; your question may already have been answered. 39 | 40 | If your issue appears to be a bug, and hasn't been reported, open a new issue. 41 | Help us to maximize the effort we can spend fixing issues and adding new 42 | features, by not reporting duplicate issues. Providing the following information will increase the 43 | chances of your issue being dealt with quickly: 44 | 45 | * **Overview of the Issue** - if an error is being thrown, a non-minified stack trace helps 46 | * **Version** - what version is affected (e.g. 0.1.2) 47 | * **Motivation for or Use Case** - explain what you are trying to do and why the current behavior is a bug for you 48 | * **Browsers and Operating System** - is this a problem with all browsers? 49 | * **Reproduce the Error** - provide a live example or an unambiguous set of steps 50 | * **Related Issues** - has a similar issue been reported before? 51 | * **Suggest a Fix** - if you can't fix the bug yourself, perhaps you can point to what might be 52 | causing the problem (line of code or commit) 53 | 54 | You can file new issues by providing the above information at the corresponding repository's issues link: https://github.com/[organization-name]/[repository-name]/issues/new. 55 | 56 | ### Submitting a Pull Request (PR) 57 | Before you submit your Pull Request (PR) consider the following guidelines: 58 | 59 | * Search the repository (https://github.com/[organization-name]/[repository-name]/pulls) for an open or closed PR 60 | that relates to your submission. You don't want to duplicate effort. 61 | 62 | * Make your changes in a new git fork: 63 | 64 | * Commit your changes using a descriptive commit message 65 | * Push your fork to GitHub: 66 | * In GitHub, create a pull request 67 | * If we suggest changes then: 68 | * Make the required updates. 
69 | * Rebase your fork and force push to your GitHub repository (this will update your Pull Request): 70 | 71 | ```shell 72 | git rebase master -i 73 | git push -f 74 | ``` 75 | 76 | That's it! Thank you for your contribution! 77 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) Microsoft Corporation. All rights reserved. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | --- 2 | page_type: sample 3 | description: "A basic Python application that introduces Batch features such as pools, nodes, jobs, tasks, and interaction with Storage." 
4 | languages: 5 | - python 6 | products: 7 | - azure 8 | --- 9 | 10 | # Azure Batch Python Quickstart 11 | 12 | A basic Python application that introduces Batch features such as pools, nodes, jobs, tasks, and interaction with Storage. Each task writes the contents of a text file to standard output. 13 | 14 | For details and explanation, see the accompanying article [Run your first Batch job with the Python API](https://docs.microsoft.com/azure/batch/quick-run-python). 15 | 16 | ## Prerequisites 17 | 18 | - Azure Batch account and linked general-purpose Azure Storage account 19 | - Python 3.6 or later, including pip (the sample code uses f-strings) 20 | 21 | ## Resources 22 | 23 | - [Azure Batch documentation](https://docs.microsoft.com/azure/batch/) 24 | - [Azure Batch code samples](https://github.com/Azure/azure-batch-samples) 25 | 26 | ## Project code of conduct 27 | 28 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. 29 | -------------------------------------------------------------------------------- /src/config.py: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------- 2 | # 3 | # THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, 4 | # EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES 5 | # OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE. 6 | # ---------------------------------------------------------------------------------- 7 | # The example companies, organizations, products, domain names, 8 | # e-mail addresses, logos, people, places, and events depicted 9 | # herein are fictitious. 
No association with any real company, 10 | # organization, product, domain name, email address, logo, person, 11 | # places, or events is intended or should be inferred. 12 | # -------------------------------------------------------------------------- 13 | 14 | # Global constant variables (Azure Storage account/Batch details) 15 | 16 | # import "config.py" in "python_quickstart_client.py " 17 | # Please note that storing the batch and storage account keys in Azure Key Vault 18 | # is a better practice for Production usage. 19 | 20 | """ 21 | Configure Batch and Storage Account credentials 22 | """ 23 | 24 | BATCH_ACCOUNT_NAME = '' # Your batch account name 25 | BATCH_ACCOUNT_KEY = '' # Your batch account key 26 | BATCH_ACCOUNT_URL = '' # Your batch account URL 27 | STORAGE_ACCOUNT_NAME = '' 28 | STORAGE_ACCOUNT_KEY = '' 29 | STORAGE_ACCOUNT_DOMAIN = 'blob.core.windows.net' # Your storage account blob service domain 30 | 31 | POOL_ID = 'PythonQuickstartPool' # Your Pool ID 32 | POOL_NODE_COUNT = 2 # Pool node count 33 | POOL_VM_SIZE = 'STANDARD_DS1_V2' # VM Type/Size 34 | JOB_ID = 'PythonQuickstartJob' # Job ID 35 | STANDARD_OUT_FILE_NAME = 'stdout.txt' # Standard Output file 36 | -------------------------------------------------------------------------------- /src/python_quickstart_client.py: -------------------------------------------------------------------------------- 1 | # python quickstart client Code Sample 2 | # 3 | # Copyright (c) Microsoft Corporation 4 | # 5 | # All rights reserved. 
6 | # 7 | # MIT License 8 | # 9 | # Permission is hereby granted, free of charge, to any person obtaining a 10 | # copy of this software and associated documentation files (the "Software"), 11 | # to deal in the Software without restriction, including without limitation 12 | # the rights to use, copy, modify, merge, publish, distribute, sublicense, 13 | # and/or sell copies of the Software, and to permit persons to whom the 14 | # Software is furnished to do so, subject to the following conditions: 15 | # 16 | # The above copyright notice and this permission notice shall be included in 17 | # all copies or substantial portions of the Software. 18 | # 19 | # THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 20 | # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 21 | # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 22 | # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 23 | # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 24 | # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 25 | # DEALINGS IN THE SOFTWARE. 26 | 27 | """ 28 | Create a pool of nodes to output text files from azure blob storage. 29 | """ 30 | 31 | import datetime 32 | import io 33 | import os 34 | import sys 35 | import time 36 | 37 | from azure.storage.blob import ( 38 | BlobServiceClient, 39 | BlobSasPermissions, 40 | generate_blob_sas 41 | ) 42 | from azure.batch import BatchServiceClient 43 | from azure.batch.batch_auth import SharedKeyCredentials 44 | import azure.batch.models as batchmodels 45 | from azure.core.exceptions import ResourceExistsError 46 | 47 | import config 48 | 49 | DEFAULT_ENCODING = "utf-8" 50 | 51 | 52 | # Update the Batch and Storage account credential strings in config.py with values 53 | # unique to your accounts. These are used when constructing connection strings 54 | # for the Batch and Storage client objects. 
55 | 56 | def query_yes_no(question: str, default: str = "yes") -> str: 57 | """ 58 | Prompts the user for yes/no input, displaying the specified question text. 59 | 60 | :param str question: The text of the prompt for input. 61 | :param str default: The default if the user hits <ENTER>. Acceptable values 62 | are 'yes', 'no', and None. 63 | :return: 'yes' or 'no' 64 | """ 65 | valid = {'y': 'yes', 'n': 'no'} 66 | if default is None: 67 | prompt = ' [y/n] ' 68 | elif default == 'yes': 69 | prompt = ' [Y/n] ' 70 | elif default == 'no': 71 | prompt = ' [y/N] ' 72 | else: 73 | raise ValueError(f"Invalid default answer: '{default}'") 74 | 75 | choice = default 76 | 77 | while True: 78 | user_input = input(question + prompt).lower() 79 | if not user_input and default is not None: 80 | break 81 | try: 82 | choice = valid[user_input[0]] 83 | break 84 | except (KeyError, IndexError): 85 | print("Please respond with 'yes' or 'no' (or 'y' or 'n').\n") 86 | 87 | return choice 88 | 89 | 90 | def print_batch_exception(batch_exception: batchmodels.BatchErrorException): 91 | """ 92 | Prints the contents of the specified Batch exception. 93 | 94 | :param batch_exception: The Batch exception to print. 95 | """ 96 | print('-------------------------------------------') 97 | print('Exception encountered:') 98 | if batch_exception.error and \ 99 | batch_exception.error.message and \ 100 | batch_exception.error.message.value: 101 | print(batch_exception.error.message.value) 102 | if batch_exception.error.values: 103 | print() 104 | for mesg in batch_exception.error.values: 105 | print(f'{mesg.key}:\t{mesg.value}') 106 | print('-------------------------------------------') 107 | 108 | 109 | def upload_file_to_container(blob_storage_service_client: BlobServiceClient, 110 | container_name: str, file_path: str) -> batchmodels.ResourceFile: 111 | """ 112 | Uploads a local file to an Azure Blob storage container. 113 | 114 | :param blob_storage_service_client: A blob service client. 
115 | :param str container_name: The name of the Azure Blob storage container. 116 | :param str file_path: The local path to the file. 117 | :return: A ResourceFile initialized with a SAS URL appropriate for Batch 118 | tasks. 119 | """ 120 | blob_name = os.path.basename(file_path) 121 | blob_client = blob_storage_service_client.get_blob_client(container_name, blob_name) 122 | 123 | print(f'Uploading file {file_path} to container [{container_name}]...') 124 | 125 | with open(file_path, "rb") as data: 126 | blob_client.upload_blob(data, overwrite=True) 127 | 128 | sas_token = generate_blob_sas( 129 | config.STORAGE_ACCOUNT_NAME, 130 | container_name, 131 | blob_name, 132 | account_key=config.STORAGE_ACCOUNT_KEY, 133 | permission=BlobSasPermissions(read=True), 134 | expiry=datetime.datetime.utcnow() + datetime.timedelta(hours=2) 135 | ) 136 | 137 | sas_url = generate_sas_url( 138 | config.STORAGE_ACCOUNT_NAME, 139 | config.STORAGE_ACCOUNT_DOMAIN, 140 | container_name, 141 | blob_name, 142 | sas_token 143 | ) 144 | 145 | return batchmodels.ResourceFile( 146 | http_url=sas_url, 147 | file_path=blob_name 148 | ) 149 | 150 | 151 | def generate_sas_url( 152 | account_name: str, 153 | account_domain: str, 154 | container_name: str, 155 | blob_name: str, 156 | sas_token: str 157 | ) -> str: 158 | """ 159 | Generates and returns a sas url for accessing blob storage 160 | """ 161 | return f"https://{account_name}.{account_domain}/{container_name}/{blob_name}?{sas_token}" 162 | 163 | 164 | def create_pool(batch_service_client: BatchServiceClient, pool_id: str): 165 | """ 166 | Creates a pool of compute nodes with the specified OS settings. 167 | 168 | :param batch_service_client: A Batch service client. 169 | :param str pool_id: An ID for the new pool. 
170 | 171 | The Marketplace image (publisher, offer, sku) is hard-coded below; 172 | the VM size and node count are read from config.py. 173 | """ 174 | print(f'Creating pool [{pool_id}]...') 175 | 176 | # Create a new pool of Linux compute nodes using an Azure Virtual Machines 177 | # Marketplace image. For more information about creating pools of Linux 178 | # nodes, see: 179 | # https://azure.microsoft.com/documentation/articles/batch-linux-nodes/ 180 | new_pool = batchmodels.PoolAddParameter( 181 | id=pool_id, 182 | virtual_machine_configuration=batchmodels.VirtualMachineConfiguration( 183 | image_reference=batchmodels.ImageReference( 184 | publisher="canonical", 185 | offer="0001-com-ubuntu-server-focal", 186 | sku="20_04-lts", 187 | version="latest" 188 | ), 189 | node_agent_sku_id="batch.node.ubuntu 20.04"), 190 | vm_size=config.POOL_VM_SIZE, 191 | target_dedicated_nodes=config.POOL_NODE_COUNT 192 | ) 193 | batch_service_client.pool.add(new_pool) 194 | 195 | 196 | def create_job(batch_service_client: BatchServiceClient, job_id: str, pool_id: str): 197 | """ 198 | Creates a job with the specified ID, associated with the specified pool. 199 | 200 | :param batch_service_client: A Batch service client. 201 | :param str job_id: The ID for the job. 202 | :param str pool_id: The ID for the pool. 203 | """ 204 | print(f'Creating job [{job_id}]...') 205 | 206 | job = batchmodels.JobAddParameter( 207 | id=job_id, 208 | pool_info=batchmodels.PoolInformation(pool_id=pool_id)) 209 | 210 | batch_service_client.job.add(job) 211 | 212 | 213 | def add_tasks(batch_service_client: BatchServiceClient, job_id: str, resource_input_files: list): 214 | """ 215 | Adds a task for each input file in the collection to the specified job. 216 | 217 | :param batch_service_client: A Batch service client. 218 | :param str job_id: The ID of the job to which to add the tasks. 219 | :param list resource_input_files: A collection of input files. 
One task will be 220 | created for each input file. 221 | """ 222 | 223 | print(f'Adding {len(resource_input_files)} tasks to job [{job_id}]...') 224 | 225 | tasks = [] 226 | 227 | for idx, input_file in enumerate(resource_input_files): 228 | 229 | command = f"/bin/bash -c \"cat {input_file.file_path}\"" 230 | tasks.append(batchmodels.TaskAddParameter( 231 | id=f'Task{idx}', 232 | command_line=command, 233 | resource_files=[input_file] 234 | ) 235 | ) 236 | 237 | batch_service_client.task.add_collection(job_id, tasks) 238 | 239 | 240 | def wait_for_tasks_to_complete(batch_service_client: BatchServiceClient, job_id: str, 241 | timeout: datetime.timedelta): 242 | """ 243 | Returns when all tasks in the specified job reach the Completed state. 244 | 245 | :param batch_service_client: A Batch service client. 246 | :param job_id: The id of the job whose tasks should be monitored. 247 | :param timeout: The duration to wait for task completion. If all 248 | tasks in the specified job do not reach Completed state within this time 249 | period, an exception will be raised. 250 | """ 251 | timeout_expiration = datetime.datetime.now() + timeout 252 | 253 | print(f"Monitoring all tasks for 'Completed' state, timeout in {timeout}...", end='') 254 | 255 | while datetime.datetime.now() < timeout_expiration: 256 | print('.', end='') 257 | sys.stdout.flush() 258 | tasks = batch_service_client.task.list(job_id) 259 | 260 | incomplete_tasks = [task for task in tasks if 261 | task.state != batchmodels.TaskState.completed] 262 | if not incomplete_tasks: 263 | print() 264 | return True 265 | 266 | time.sleep(1) 267 | 268 | print() 269 | raise RuntimeError("ERROR: Tasks did not reach 'Completed' state within " 270 | "timeout period of " + str(timeout)) 271 | 272 | 273 | def print_task_output(batch_service_client: BatchServiceClient, job_id: str, 274 | text_encoding: str = None): 275 | """ 276 | Prints the stdout.txt file for each task in the job. 
277 | 278 | :param batch_service_client: The Batch service client to use. 279 | :param str job_id: The id of the job with task output files to print. 280 | """ 281 | 282 | print('Printing task output...') 283 | 284 | tasks = batch_service_client.task.list(job_id) 285 | 286 | for task in tasks: 287 | 288 | node_id = batch_service_client.task.get( 289 | job_id, task.id).node_info.node_id 290 | print(f"Task: {task.id}") 291 | print(f"Node: {node_id}") 292 | 293 | stream = batch_service_client.file.get_from_task( 294 | job_id, task.id, config.STANDARD_OUT_FILE_NAME) 295 | 296 | file_text = _read_stream_as_string( 297 | stream, 298 | text_encoding) 299 | 300 | if text_encoding is None: 301 | text_encoding = DEFAULT_ENCODING 302 | 303 | sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding=text_encoding) 304 | sys.stderr = io.TextIOWrapper(sys.stderr.detach(), encoding=text_encoding) 305 | 306 | print("Standard output:") 307 | print(file_text) 308 | 309 | 310 | def _read_stream_as_string(stream, encoding) -> str: 311 | """ 312 | Read stream as string 313 | 314 | :param stream: input stream generator 315 | :param str encoding: The encoding of the file. The default is utf-8. 316 | :return: The file content. 317 | """ 318 | output = io.BytesIO() 319 | try: 320 | for data in stream: 321 | output.write(data) 322 | if encoding is None: 323 | encoding = DEFAULT_ENCODING 324 | return output.getvalue().decode(encoding) 325 | finally: 326 | output.close() 327 | 328 | 329 | if __name__ == '__main__': 330 | 331 | start_time = datetime.datetime.now().replace(microsecond=0) 332 | print(f'Sample start: {start_time}') 333 | print() 334 | 335 | # Create the blob client, for use in obtaining references to 336 | # blob storage containers and uploading files to containers. 
337 | blob_service_client = BlobServiceClient( 338 | account_url=f"https://{config.STORAGE_ACCOUNT_NAME}.{config.STORAGE_ACCOUNT_DOMAIN}/", 339 | credential=config.STORAGE_ACCOUNT_KEY 340 | ) 341 | 342 | # Use the blob client to create the containers in Azure Storage if they 343 | # don't yet exist. 344 | input_container_name = 'input' # pylint: disable=invalid-name 345 | try: 346 | blob_service_client.create_container(input_container_name) 347 | except ResourceExistsError: 348 | pass 349 | 350 | # The collection of data files that are to be processed by the tasks. 351 | input_file_paths = [os.path.join(sys.path[0], 'taskdata0.txt'), 352 | os.path.join(sys.path[0], 'taskdata1.txt'), 353 | os.path.join(sys.path[0], 'taskdata2.txt')] 354 | 355 | # Upload the data files. 356 | input_files = [ 357 | upload_file_to_container(blob_service_client, input_container_name, file_path) 358 | for file_path in input_file_paths] 359 | 360 | # Create a Batch service client. We'll now be interacting with the Batch 361 | # service in addition to Storage 362 | credentials = SharedKeyCredentials(config.BATCH_ACCOUNT_NAME, 363 | config.BATCH_ACCOUNT_KEY) 364 | 365 | batch_client = BatchServiceClient( 366 | credentials, 367 | batch_url=config.BATCH_ACCOUNT_URL) 368 | 369 | try: 370 | # Create the pool that will contain the compute nodes that will execute the 371 | # tasks. 372 | create_pool(batch_client, config.POOL_ID) 373 | 374 | # Create the job that will run the tasks. 375 | create_job(batch_client, config.JOB_ID, config.POOL_ID) 376 | 377 | # Add the tasks to the job. 378 | add_tasks(batch_client, config.JOB_ID, input_files) 379 | 380 | # Pause execution until tasks reach Completed state. 381 | wait_for_tasks_to_complete(batch_client, 382 | config.JOB_ID, 383 | datetime.timedelta(minutes=30)) 384 | 385 | print(" Success! 
All tasks reached the 'Completed' state within the " 386 | "specified timeout period.") 387 | 388 | # Print the stdout.txt file for each task to the console 389 | print_task_output(batch_client, config.JOB_ID) 390 | 391 | # Print out some timing info 392 | end_time = datetime.datetime.now().replace(microsecond=0) 393 | print() 394 | print(f'Sample end: {end_time}') 395 | elapsed_time = end_time - start_time 396 | print(f'Elapsed time: {elapsed_time}') 397 | print() 398 | input('Press ENTER to exit...') 399 | 400 | except batchmodels.BatchErrorException as err: 401 | print_batch_exception(err) 402 | raise 403 | 404 | finally: 405 | # Clean up storage resources 406 | print(f'Deleting container [{input_container_name}]...') 407 | blob_service_client.delete_container(input_container_name) 408 | 409 | # Clean up Batch resources (if the user so chooses). 410 | if query_yes_no('Delete job?') == 'yes': 411 | batch_client.job.delete(config.JOB_ID) 412 | 413 | if query_yes_no('Delete pool?') == 'yes': 414 | batch_client.pool.delete(config.POOL_ID) 415 | -------------------------------------------------------------------------------- /src/requirements.txt: -------------------------------------------------------------------------------- 1 | azure-batch==11.0.0 2 | azure-storage-blob==12.8.1 -------------------------------------------------------------------------------- /src/taskdata0.txt: -------------------------------------------------------------------------------- 1 | With support for Linux, Windows Server, SQL Server, Oracle, IBM, and SAP, Azure Virtual Machines gives you the flexibility of virtualization for a wide range of computing solutions—development and testing, running applications, and extending your datacenter. It’s the freedom of open-source software configured the way you need it. It’s as if it was another rack in your datacenter, giving you the power to deploy an application in minutes instead of weeks. 
2 | 3 | It’s all about choice for your virtual machines. Choose Linux or Windows. Choose to be on-premises, in the cloud, or both. Choose your own virtual machine image or download a certified pre-configured image in our marketplace. With Virtual Machines, you’re in control. 4 | 5 | Combine the performance of a world-class supercomputer with the scalability of the cloud. Scale from one to thousands of virtual machine instances. Plus, with the growing number of regional Azure datacenters, easily scale globally so you’re closer to where your customers are. 6 | 7 | Keep your budget in check with low-cost, per-minute billing. You only pay for the compute time you use. 8 | 9 | We’ll help you encrypt sensitive data, protect virtual machines from viruses and malware, secure network traffic, and meet regulatory and compliance requirements. -------------------------------------------------------------------------------- /src/taskdata1.txt: -------------------------------------------------------------------------------- 1 | Batch processing began with mainframe computers and punch cards. Today it still plays a central role in business, engineering, science, and other pursuits that require running lots of automated tasks—processing bills and payroll, calculating portfolio risk, designing new products, rendering animated films, testing software, searching for energy, predicting the weather, and finding new cures for disease. Previously only a few had access to the computing power for these scenarios. With Azure Batch, that power is available to you when you need it, without any capital investment. 2 | 3 | Choose the operating system and development tools you need to run your large-scale jobs on Batch. Batch provides a consistent job scheduling and management experience whether you select Windows Server or Linux compute nodes, but lets you take advantage of the unique features of each environment. 
With Windows, use your existing Windows-based code, including .NET, to run large-scale compute jobs in Azure. With Linux, choose from popular distributions including CentOS, Ubuntu, and SUSE Linux Enterprise Server to run your compute jobs, or use Docker containers to lift and shift your applications. Batch provides SDKs and supports a range of development tools including Python and Java. 4 | 5 | Batch runs the applications that you use on workstations and clusters today. It’s easy to cloud-enable your executables and scripts to scale out. Batch provides a queue to receive the work that you want to run and executes your applications. Describe the data that need to be moved to the cloud for processing, how the data should be distributed, what parameters to use for each task, and the command to start the process. Think about this like an assembly line with multiple applications. Batch makes it easy to share data between steps and manage the execution as a whole. 6 | 7 | You use a workstation today, maybe a small cluster, or you wait in a queue to run your jobs. What if you had access to 16 cores, 100 cores, 10,000 cores, or even 100,000 cores when you needed them, and only had to pay for what you used? With Batch you can. Avoid the bottlenecks and waiting that limit your imagination. What could you do on Azure that you can’t do today? -------------------------------------------------------------------------------- /src/taskdata2.txt: -------------------------------------------------------------------------------- 1 | Azure Storage offers a set of storage services for all your business needs. Choose from Blob Storage (Object Storage) for unstructured data, File Storage for SMB-based cloud file shares, Table Storage for NoSQL data, Queue Storage to reliably store messages, and Premium Storage for high-performance, low-latency block storage for I/O-intensive workloads running in Azure Virtual Machines. 
2 | 3 | Storage keeps pace with your growing data needs, delivering petabytes of storage for the largest scenarios. Whether you're building modern applications or a high-scale big data application, Storage can handle it. 4 | 5 | Storage is available in more regions than any other public cloud offering, letting you store your data where it makes the most business sense. Scale up or across data centers as needed, and be closer to your customers for faster access and better performance. 6 | 7 | Storage automatically replicates your data and maintains multiple copies—either in a single region or globally with geo-redundancy—to help guard against unexpected hardware failures. 8 | --------------------------------------------------------------------------------