├── .gitignore ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── aws_batch ├── README.md ├── __init__.py ├── createrole.py ├── register_sample_job.py ├── requirements.txt ├── template_access_policy.py └── template_job_definition.py ├── buildspec.yaml ├── codebuild_cloudformation.json ├── source ├── Dockerfile ├── __init__.py ├── awsS3Io.py ├── main.py ├── requirements.txt └── sampleProcess.py └── tests ├── IT_test_sampleProcess.py ├── __init__.py ├── requirements.txt └── test_awsS3Io.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | *.egg-info/ 24 | .installed.cfg 25 | *.egg 26 | MANIFEST 27 | 28 | # PyInstaller 29 | # Usually these files are written by a python script from a template 30 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 31 | *.manifest 32 | *.spec 33 | 34 | # Installer logs 35 | pip-log.txt 36 | pip-delete-this-directory.txt 37 | 38 | # Unit test / coverage reports 39 | htmlcov/ 40 | .tox/ 41 | .coverage 42 | .coverage.* 43 | .cache 44 | nosetests.xml 45 | coverage.xml 46 | *.cover 47 | .hypothesis/ 48 | .pytest_cache/ 49 | 50 | # Translations 51 | *.mo 52 | *.pot 53 | 54 | # Django stuff: 55 | *.log 56 | local_settings.py 57 | db.sqlite3 58 | 59 | # Flask stuff: 60 | instance/ 61 | .webassets-cache 62 | 63 | # Scrapy stuff: 64 | .scrapy 65 | 66 | # Sphinx documentation 67 | docs/_build/ 68 | 69 | # PyBuilder 70 | target/ 71 | 72 | # Jupyter Notebook 73 | .ipynb_checkpoints 74 | 75 | # pyenv 76 | .python-version 77 | 78 | # celery beat schedule file 79 | celerybeat-schedule 80 | 81 | # SageMath parsed files 82 | *.sage.py 83 | 84 | # Environments 85 | .env 86 | .venv 87 | env/ 88 | venv/ 89 | ENV/ 90 | env.bak/ 91 | venv.bak/ 92 | 93 | # Spyder project settings 94 | .spyderproject 95 | .spyproject 96 | 97 | # Rope project settings 98 | .ropeproject 99 | 100 | # mkdocs documentation 101 | /site 102 | 103 | # mypy 104 | .mypy_cache/ 105 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | ## Code of Conduct 2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 4 | opensource-codeofconduct@amazon.com with any additional questions or comments. 5 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines 2 | 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 4 | documentation, we greatly value feedback and contributions from our community. 5 | 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 7 | information to effectively respond to your bug report or contribution. 
8 | 9 | 10 | ## Reporting Bugs/Feature Requests 11 | 12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features. 13 | 14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already 15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful: 16 | 17 | * A reproducible test case or series of steps 18 | * The version of our code being used 19 | * Any modifications you've made relevant to the bug 20 | * Anything unusual about your environment or deployment 21 | 22 | 23 | ## Contributing via Pull Requests 24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that: 25 | 26 | 1. You are working against the latest source on the *master* branch. 27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already. 28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted. 29 | 30 | To send us a pull request, please: 31 | 32 | 1. Fork the repository. 33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change. 34 | 3. Ensure local tests pass. 35 | 4. Commit to your fork using clear commit messages. 36 | 5. Send us a pull request, answering any default questions in the pull request interface. 37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation. 38 | 39 | GitHub provides additional document on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/). 41 | 42 | 43 | ## Finding contributions to work on 44 | Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start. 45 | 46 | 47 | ## Code of Conduct 48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 50 | opensource-codeofconduct@amazon.com with any additional questions or comments. 51 | 52 | 53 | ## Security issue notifications 54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public github issue. 55 | 56 | 57 | ## Licensing 58 | 59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. 60 | 61 | We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes. 62 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved. 
2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of 4 | this software and associated documentation files (the "Software"), to deal in 5 | the Software without restriction, including without limitation the rights to 6 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 7 | the Software, and to permit persons to whom the Software is furnished to do so. 8 | 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 10 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 11 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 12 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 13 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 14 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 15 | 16 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # AWS Batch Python sample template 2 | A simple Python quick-start template for AWS Batch that helps you build a Docker image through CI/CD. 3 | 4 | This demo batch job downloads a sample JSON file and uploads it to an S3 destination. 5 | 6 | 7 | ## Prerequisites 8 | 1. Install Python 3.6 9 | 2. Optional: Install virtual environment https://virtualenv.pypa.io/en/latest/installation/ or conda https://conda.io/docs/installation.html 10 | 11 | ## Set up 12 | 1. Install the Python dependencies 13 | ```bash 14 | pip install -r source/requirements.txt 15 | ``` 16 | 17 | ## Run the sample locally 18 | 19 | ```bash 20 | export PYTHONPATH=./source 21 | 22 | # To get help 23 | python ./source/main.py -h 24 | 25 | # Download the sample data to the current directory 26 | python ./source/main.py . 27 | 28 | # Download the sample data and upload it to the s3 path s3://mybucket/mydir/ 29 | python ./source/main.py . --s3uri s3://mybucket/mydir/ 30 | ``` 31 | 32 | 33 | 34 | ## Run on AWS Batch 35 | 1. Create a repository in the ECR registry called "aws-batch-sample-python", as detailed here https://docs.aws.amazon.com/AmazonECR/latest/userguide/repository-create.html 36 | 37 | 2. Set up AWS CodeBuild to build the Docker container using the CloudFormation stack [codebuild_cloudformation.json](codebuild_cloudformation.json), which uses [buildspec.yaml](buildspec.yaml). For more details on CodeBuild see https://docs.aws.amazon.com/codebuild/latest/userguide/sample-docker.html. 38 | *Note*: The ECR repository name to use is specified in the [buildspec.yaml](buildspec.yaml) file. 39 | 40 | 3. Start a build in CodeBuild to push a new image into the ECR repository "aws-batch-sample-python". When the build succeeds, you will see an image in the repository. 41 | 42 | 4. Register a job with AWS Batch as detailed in [aws_batch/README.md](aws_batch/README.md) 43 | 44 | 45 | 46 | ## License 47 | 48 | This library is licensed under the MIT-0 License. See the LICENSE file. 49 | 50 | -------------------------------------------------------------------------------- /aws_batch/README.md: -------------------------------------------------------------------------------- 1 | # Register an AWS Batch job 2 | 3 | ## Prerequisites 4 | 1. Python 3.5+, https://www.python.org/downloads/release/python-350/ 5 | 2. Install pip, see https://pip.pypa.io/en/stable/installing/ 6 | 7 | ## Setup 8 | 1.
Install the dependencies for this project 9 | ```bash 10 | pip install -r aws_batch/requirements.txt 11 | ``` 12 | 2. Make sure you have built and pushed the Docker image to the ECR repository as detailed in the main [README.md](../README.md) 13 | 14 | 15 | ## How to run 16 | 17 | 1. Register an AWS Batch job 18 | ```bash 19 | export PYTHONPATH=./aws_batch 20 | 21 | python aws_batch/register_sample_job.py <account-id>.dkr.ecr.<region>.amazonaws.com/aws-batch-sample-python:latest "<s3-destination-uri>" 22 | 23 | # For full details 24 | python aws_batch/register_sample_job.py -h 25 | 26 | ``` 27 | 28 | 2. If you go to the AWS Batch console -> Job definitions, you will see the new job definition called aws_batch_python_sample. 29 | 30 | 3. You can then trigger a new job through the AWS Batch console. Pass the S3 destination URI in as the s3destination parameter of the job. 31 | 32 | 33 | -------------------------------------------------------------------------------- /aws_batch/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-batch-python-sample/673439d319f2c1cddb1a3cefb5fbdeae27b80930/aws_batch/__init__.py -------------------------------------------------------------------------------- /aws_batch/createrole.py: -------------------------------------------------------------------------------- 1 | import json 2 | import logging 3 | 4 | import boto3 5 | 6 | 7 | def create_role(role_name, assumed_role_policy, policy, managed_policy_arns=None): 8 | logger = logging.getLogger(__name__) 9 | client = boto3.client('iam') 10 | managed_policy_arns = managed_policy_arns or [] 11 | logger.info( 12 | "Creating role {} with access policy \n {}".format(role_name, json.dumps(policy, sort_keys=False, indent=4))) 13 | 14 | try: 15 | client.create_role( 16 | 17 | RoleName=role_name, 18 | AssumeRolePolicyDocument=json.dumps(assumed_role_policy), 19 | Description='This is the role for a batch task' 20 | ) 21 | except Exception as e: 22 | logger.warning( 23 | "Could not create role {}: {}. If the role already exists, the managed policies and the custom policy will still be attached to it.".format( 24 | role_name, e)) 25 | 26 | # Attach the managed policies 27 | for p in managed_policy_arns: 28 | client.attach_role_policy( 29 | RoleName=role_name, PolicyArn=p) 30 | 31 | # Attach the custom (inline) policy 32 | role_policy = boto3.resource('iam').RolePolicy(role_name, 'custom_policy') 33 | role_policy.put( 34 | PolicyDocument=json.dumps(policy) 35 | ) 36 | -------------------------------------------------------------------------------- /aws_batch/register_sample_job.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import json 3 | import logging 4 | import sys 5 | 6 | import boto3 7 | from createrole import create_role 8 | from template_access_policy import create_access_policy 9 | from template_job_definition import get_job_definition 10 | 11 | """ 12 | Registers the sample job with AWS Batch 13 | """ 14 | 15 | 16 | class RegisterJob: 17 | 18 | def __init__(self, client=None, account=None, aws_region=None): 19 | self.client = client or boto3.client('batch') 20 | self.account = account or boto3.client('sts').get_caller_identity().get('Account') 21 | self.region = aws_region or boto3.session.Session().region_name 22 | 23 | def run(self, container_name: str, s3uri_destination: str, job_def_name: str, ncpus: int, memoryInMB): 24 | """ 25 | Registers a job definition with AWS Batch.
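Example call (illustrative values only; the account ID, region, image tag and bucket are assumptions, and the remaining arguments mirror this script's command-line defaults): RegisterJob().run(container_name='123456789012.dkr.ecr.us-east-1.amazonaws.com/aws-batch-sample-python:latest', s3uri_destination='s3://mybucket/aws-batch-sample-python/', job_def_name='aws_batch_python_sample', ncpus=4, memoryInMB=2000)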
26 | :param s3uri_destination: the name of the s3 bucket that will hold the data 27 | :param container_name: The name of the container to use e.g 324346001917.dkr.ecr.us-east-2.amazonaws.com/awscomprehend-sentiment-demo:latest 28 | """ 29 | role_name = "AWSBatchECSRole_{}".format(job_def_name) 30 | logger = logging.getLogger(__name__) 31 | 32 | ##This is mandatory for aws batch 33 | assume_role_policy = { 34 | "Version": "2012-10-17", 35 | "Statement": [ 36 | { 37 | "Sid": "", 38 | "Effect": "Allow", 39 | "Principal": { 40 | "Service": "ecs-tasks.amazonaws.com" 41 | }, 42 | "Action": "sts:AssumeRole" 43 | } 44 | ] 45 | } 46 | 47 | access_policy = create_access_policy(s3uri_destination) 48 | 49 | managed_policy_arns = ["arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"] 50 | 51 | create_role(role_name, assume_role_policy, access_policy, managed_policy_arns) 52 | 53 | job_definition = get_job_definition(self.account, self.region, container_name, job_def_name, s3uri_destination, 54 | memoryInMB, ncpus, 55 | role_name) 56 | 57 | logger.info( 58 | "Creating a job with parameters \n {}".format(json.dumps(job_definition, sort_keys=False, indent=4))) 59 | response = self.client.register_job_definition(**job_definition) 60 | return response 61 | 62 | 63 | if __name__ == '__main__': 64 | parser = argparse.ArgumentParser() 65 | logging.basicConfig(level=logging.INFO, handlers=[logging.StreamHandler(sys.stdout)], 66 | format='%(asctime)s - %(name)s - %(levelname)s - %(message)s') 67 | 68 | logger = logging.getLogger(__name__) 69 | 70 | parser.add_argument("containerimage", 71 | help="Container image, e.g 346001917.dkr.ecr.us-east-2.amazonaws.com/aws-batch-sample-python") 72 | 73 | parser.add_argument("s3uri", 74 | help="The s3 uri path that will contain the input/output data. e.g s3://mybucket/aws-batch-sample-python/") 75 | 76 | parser.add_argument("--job-name", 77 | help="The name of the job", default="aws_batch_python_sample") 78 | 79 | parser.add_argument("--cpus", 80 | help="The number of cpus", default=4, type=int) 81 | 82 | parser.add_argument("--memoryMb", 83 | help="The memory in MB", default=2000, type=int) 84 | 85 | args = parser.parse_args() 86 | 87 | # Register job 88 | job = RegisterJob() 89 | result = job.run(args.containerimage, args.s3uri, args.job_name, args.cpus, args.memoryMb) 90 | 91 | logger.info("Completed\n{}".format(json.dumps(result, indent=4))) 92 | -------------------------------------------------------------------------------- /aws_batch/requirements.txt: -------------------------------------------------------------------------------- 1 | boto3==1.7.62 -------------------------------------------------------------------------------- /aws_batch/template_access_policy.py: -------------------------------------------------------------------------------- 1 | from urllib.parse import urlparse 2 | 3 | 4 | def create_access_policy(s3_bucket_uri): 5 | """ 6 | YOU WOULD CUSTOMISE THIS BASED ON YOUR ACCESS REQUIREMENTS 7 | 8 | This is the policy required to run this sample job. 
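For example (illustrative), passing s3://mybucket/mydir/ grants object access on arn:aws:s3:::mybucket/mydir/* and bucket-level list access on arn:aws:s3:::mybucket, matching how the bucket and key are parsed below.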
9 | :param s3_bucket_uri: The s3 bucket path to put the results to 10 | :return: The access policy json 11 | """ 12 | 13 | parsed_url = urlparse(s3_bucket_uri) 14 | 15 | bucket_name = parsed_url.netloc 16 | key = parsed_url.path 17 | 18 | # This is custom for the batch 19 | access_policy = { 20 | "Version": "2012-10-17", 21 | "Statement": [ 22 | { 23 | "Sid": "BucketKeyAccess", 24 | "Effect": "Allow", 25 | "Action": [ 26 | "s3:PutObject", 27 | "s3:GetObject", 28 | "s3:DeleteObject", 29 | 30 | ], 31 | "Resource": [ 32 | "arn:aws:s3:::{}{}*".format(bucket_name, key), 33 | ] 34 | }, 35 | { 36 | "Sid": "BuckeyAccess", 37 | "Effect": "Allow", 38 | "Action": [ 39 | "s3:ListBucket", 40 | "s3:HeadBucket" 41 | ], 42 | "Resource": [ 43 | "arn:aws:s3:::{}".format(bucket_name) 44 | ] 45 | } 46 | ] 47 | } 48 | return access_policy 49 | -------------------------------------------------------------------------------- /aws_batch/template_job_definition.py: -------------------------------------------------------------------------------- 1 | def get_job_definition(account, region, container_name, job_def_name, job_param_s3uri_destination, memoryInMB, ncpus, 2 | role_name): 3 | """ 4 | YOU WOULD CUSTOMISE THIS BASED ON YOUR ACCESS REQUIREMENTS 5 | 6 | This is the job definition for this sample job. 7 | :param account: 8 | :param region: 9 | :param container_name: 10 | :param job_def_name: 11 | :param memoryInMB: 12 | :param ncpus: 13 | :param role_name: 14 | :return: 15 | """ 16 | return { 17 | "jobDefinitionName": job_def_name, 18 | "type": "container", 19 | # These are the arguments for the job 20 | "parameters": { 21 | "outputdir": "/data", 22 | "s3destination": job_param_s3uri_destination, 23 | "log_level": "INFO" 24 | 25 | }, 26 | # Specify container & jobs properties include entry point and job args that are referred to in parameters 27 | "containerProperties": { 28 | "image": container_name, 29 | "vcpus": ncpus, 30 | "memory": memoryInMB, 31 | "command": [ 32 | "main.py", 33 | "Ref::outputdir", 34 | "--s3uri", 35 | "Ref::s3destination", 36 | "--log-level", 37 | "Ref::log_level" 38 | 39 | ], 40 | "jobRoleArn": "arn:aws:iam::{}:role/{}".format(account, role_name), 41 | "volumes": [ 42 | { 43 | "host": { 44 | "sourcePath": job_def_name 45 | }, 46 | "name": "/dev/shm" 47 | } 48 | ], 49 | "environment": [ 50 | { 51 | "name": "AWS_DEFAULT_REGION", 52 | "value": region 53 | } 54 | ], 55 | "mountPoints": [ 56 | { 57 | "containerPath": "/data", 58 | "readOnly": False, 59 | "sourceVolume": "data" 60 | } 61 | ], 62 | "readonlyRootFilesystem": False, 63 | "privileged": True, 64 | "ulimits": [], 65 | "user": "" 66 | }, 67 | "retryStrategy": { 68 | "attempts": 1 69 | } 70 | } 71 | -------------------------------------------------------------------------------- /buildspec.yaml: -------------------------------------------------------------------------------- 1 | version: 0.2 2 | 3 | phases: 4 | install: 5 | commands: 6 | - apt-get update 7 | - apt-get install zip 8 | pre_build: 9 | commands: 10 | - pip install pytest==3.6.3 11 | - pip install pyflakes==2.0.0 12 | ## CONFIGURE THIS: Repo name, please make sure this repo exists in ECR 13 | - export IMAGE_REPO_NAME=aws-batch-sample-python 14 | # AWS cli version to login into ecr. 
This needs to be compatible with the version of boto3 in the requirements file 15 | - export awscli_version=1.16.35 16 | 17 | build: 18 | commands: 19 | # Run Tests 20 | - pip install -r source/requirements.txt 21 | - pip install -r tests/requirements.txt 22 | - pyflakes ./**/*.py 23 | - export PYTHONPATH=./source 24 | - pytest 25 | ## Tests passed, so build docker 26 | - echo Building the Docker image... 27 | - cd source 28 | 29 | ## Automate version tagging based on datetime for now, ideally should be tied to release tags 30 | - export LATEST_TAG=latest 31 | - export VERSION_TAG=$(date '+%Y%m%d%H%M') 32 | # Get AWS Account Id 33 | - export AWS_ACCOUNT_ID=$(echo $CODEBUILD_BUILD_ARN | cut -d':' -f 5) 34 | # Build docker image 35 | - docker build -t $IMAGE_REPO_NAME:$LATEST_TAG . 36 | - docker tag $IMAGE_REPO_NAME:$LATEST_TAG $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$LATEST_TAG 37 | - docker tag $IMAGE_REPO_NAME:$LATEST_TAG $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$VERSION_TAG 38 | 39 | post_build: 40 | commands: 41 | - echo creating package 42 | - echo $CODEBUILD_SRC_DIR 43 | ## This is optional, packaging the solution without docker 44 | - mkdir $CODEBUILD_SRC_DIR/buildoutput 45 | - pip install -r $CODEBUILD_SRC_DIR/source/requirements.txt -t $CODEBUILD_SRC_DIR/source 46 | - cd $CODEBUILD_SRC_DIR/source && zip -r ../buildoutput/source.zip . 47 | # Login to to ECR, this means code build has this role 48 | # fix awscli version so nothing breaks... 49 | - pip install awscli==$awscli_version 50 | - echo Logging in to Amazon ECR... 51 | - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION) 52 | #Push Docker Image 53 | - echo Pushing the Docker image... 54 | - docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$LATEST_TAG 55 | - docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$VERSION_TAG 56 | 57 | artifacts: 58 | files: 59 | - '**/*' 60 | base-directory: buildoutput -------------------------------------------------------------------------------- /codebuild_cloudformation.json: -------------------------------------------------------------------------------- 1 | { 2 | "AWSTemplateFormatVersion": "2010-09-09", 3 | "Description": "Deploys code pipeline to build aws batch", 4 | "Metadata": {}, 5 | "Parameters": { 6 | "codeBuildProjectName": { 7 | "Description": "The code build project name", 8 | "Type": "String" 9 | }, 10 | "gitSourcePublicRepository": { 11 | "Description": "The github source public repository clone url", 12 | "Type": "String" 13 | }, 14 | "buildImage": { 15 | "Description": "The python build image", 16 | "Type": "String", 17 | "Default": "aws/codebuild/python:3.6.5" 18 | }, 19 | "buildArtifactsS3Bucket": { 20 | "Description": "The s3 bucket where artifacts will be placed post build", 21 | "Type": "String" 22 | }, 23 | "buildArtifactsS3Key": { 24 | "Description": "The s3 key within the bucket where artifacts will be placed post build", 25 | "Type": "String" 26 | }, 27 | "dockerImageRepository": { 28 | "Description": "The docker image repo to push the image to", 29 | "Type": "String" 30 | } 31 | }, 32 | "Mappings": {}, 33 | "Conditions": {}, 34 | "Resources": { 35 | "CodeBuild": { 36 | "Type": "AWS::CodeBuild::Project", 37 | "Properties": { 38 | "Artifacts": { 39 | "Location": { 40 | "Ref": "buildArtifactsS3Bucket" 41 | }, 42 | "Name": { 43 | "Ref": "codeBuildProjectName" 44 | }, 45 | "Path": { 46 | "Ref": 
"buildArtifactsS3Key" 47 | }, 48 | "Type": "S3" 49 | }, 50 | "Description": "Builds aws batch docker image", 51 | "Environment": { 52 | "Type": "LINUX_CONTAINER", 53 | "ComputeType": "BUILD_GENERAL1_SMALL", 54 | "Image": { 55 | "Ref": "buildImage" 56 | }, 57 | "PrivilegedMode": "True" 58 | }, 59 | "Name": { 60 | "Ref": "codeBuildProjectName" 61 | }, 62 | "ServiceRole": { 63 | "Ref": "codeBuildIamRole" 64 | }, 65 | "Source": { 66 | "Location": { 67 | "Ref": "gitSourcePublicRepository" 68 | }, 69 | "Type": "GITHUB", 70 | "GitCloneDepth": 1 71 | }, 72 | "Tags": [ 73 | { 74 | "Key": "StackName", 75 | "Value": "AWS::StackName" 76 | } 77 | ], 78 | "TimeoutInMinutes": 20 79 | } 80 | }, 81 | "codeBuildIamRole": { 82 | "Type": "AWS::IAM::Role", 83 | "Properties": { 84 | "AssumeRolePolicyDocument": { 85 | "Version": "2012-10-17", 86 | "Statement": [ 87 | { 88 | "Effect": "Allow", 89 | "Principal": { 90 | "Service": [ 91 | "codebuild.amazonaws.com" 92 | ] 93 | }, 94 | "Action": [ 95 | "sts:AssumeRole" 96 | ] 97 | } 98 | ] 99 | }, 100 | "Policies": [ 101 | { 102 | "PolicyName": "S3PutArtifactsPolicy", 103 | "PolicyDocument": { 104 | "Version": "2012-10-17", 105 | "Statement": [ 106 | { 107 | "Effect": "Allow", 108 | "Action": [ 109 | "s3:PutObject", 110 | "s3:ReadObject" 111 | ], 112 | "Resource": [ 113 | { 114 | "Fn::Join": [ 115 | "", 116 | [ 117 | "arn:aws:s3:::", 118 | { 119 | "Ref": "buildArtifactsS3Bucket" 120 | }, 121 | "/", 122 | { 123 | "Ref": "buildArtifactsS3Key" 124 | }, 125 | "/*" 126 | ] 127 | ] 128 | } 129 | ] 130 | }, 131 | { 132 | "Effect": "Allow", 133 | "Action": [ 134 | "s3:HeadBucket", 135 | "s3:ListBucket" 136 | ], 137 | "Resource": [ 138 | { 139 | "Fn::Join": [ 140 | "", 141 | [ 142 | "arn:aws:s3:::", 143 | { 144 | "Ref": "buildArtifactsS3Bucket" 145 | } 146 | ] 147 | ] 148 | } 149 | ] 150 | } 151 | ] 152 | } 153 | }, 154 | { 155 | "PolicyName": "CloudWatchLogsFullAccess", 156 | "PolicyDocument": { 157 | "Version": "2012-10-17", 158 | "Statement": [ 159 | { 160 | "Action": [ 161 | "logs:*" 162 | ], 163 | "Effect": "Allow", 164 | "Resource": "*" 165 | } 166 | ] 167 | } 168 | }, 169 | { 170 | "PolicyName": "ECRPutImage", 171 | "PolicyDocument": { 172 | "Version": "2012-10-17", 173 | "Statement": [ 174 | { 175 | "Effect": "Allow", 176 | "Action": [ 177 | "ecr:PutImage", 178 | "ecr:InitiateLayerUpload", 179 | "ecr:UploadLayerPart", 180 | "ecr:CompleteLayerUpload", 181 | "ecr:BatchCheckLayerAvailability" 182 | ], 183 | "Resource": [ 184 | { 185 | "Fn::Join": [ 186 | "", 187 | [ 188 | "arn:aws:ecr:", 189 | { 190 | "Ref": "AWS::Region" 191 | }, 192 | ":", 193 | { 194 | "Ref": "AWS::AccountId" 195 | }, 196 | ":repository/", 197 | { 198 | "Ref": "dockerImageRepository" 199 | } 200 | ] 201 | ] 202 | } 203 | ] 204 | }, 205 | { 206 | "Effect": "Allow", 207 | "Action": [ 208 | "ecr:GetAuthorizationToken" 209 | ], 210 | "Resource": "*" 211 | } 212 | ] 213 | } 214 | } 215 | ], 216 | "RoleName": { 217 | "Fn::Join": [ 218 | "", 219 | [ 220 | { 221 | "Ref": "AWS::StackName" 222 | }, 223 | "_CodeBuildIamRole" 224 | ] 225 | ] 226 | } 227 | } 228 | } 229 | }, 230 | "Outputs": {} 231 | } -------------------------------------------------------------------------------- /source/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM python:3.6 2 | 3 | ENV source_path /opt/program/source 4 | RUN mkdir -p ${source_path} 5 | 6 | 7 | 8 | #Set up source 9 | COPY ./ ${source_path} 10 | RUN pip install -r ${source_path}/requirements.txt -t ${source_path} 11 | RUN 
pip install awscli --upgrade 12 | 13 | #Set up working directory 14 | WORKDIR ${source_path} 15 | ENTRYPOINT ["python"] 16 | 17 | #Default arguments to run test 18 | CMD ["main.py", "."] 19 | -------------------------------------------------------------------------------- /source/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-batch-python-sample/673439d319f2c1cddb1a3cefb5fbdeae27b80930/source/__init__.py -------------------------------------------------------------------------------- /source/awsS3Io.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import os 3 | 4 | import boto3 5 | 6 | """ 7 | S3 upload download operations 8 | """ 9 | 10 | 11 | class AwsS3Io: 12 | 13 | def __init__(self): 14 | self.client = None 15 | 16 | @property 17 | def logger(self): 18 | return logging.getLogger(__name__) 19 | 20 | @property 21 | def client(self): 22 | self.__client__ = self.__client__ or boto3.resource('s3').meta.client 23 | return self.__client__ 24 | 25 | @client.setter 26 | def client(self, value): 27 | self.__client__ = value 28 | 29 | def uploadfile(self, localpath, s3path): 30 | """ 31 | Uploads a file to s3 32 | :param localpath: The local path 33 | :param s3path: The s3 path in format s3://mybucket/mydir/mysample.txt 34 | """ 35 | self.logger.info("Upload file {} to s3 {}".format(localpath, s3path)) 36 | 37 | bucket, key = self.get_bucketname_key(s3path) 38 | 39 | if key.endswith("/"): 40 | key = "{}{}".format(key, os.path.basename(localpath)) 41 | 42 | self.client.upload_file(localpath, bucket, key) 43 | 44 | def get_bucketname_key(self, uripath): 45 | assert uripath.startswith("s3://") 46 | 47 | path_without_scheme = uripath[5:] 48 | bucket_end_index = path_without_scheme.find("/") 49 | 50 | bucket_name = path_without_scheme 51 | key = "/" 52 | if bucket_end_index > -1: 53 | bucket_name = path_without_scheme[0:bucket_end_index] 54 | key = path_without_scheme[bucket_end_index + 1:] 55 | 56 | return bucket_name, key 57 | -------------------------------------------------------------------------------- /source/main.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import logging 3 | import sys 4 | 5 | from awsS3Io import AwsS3Io 6 | from sampleProcess import SampleProcess 7 | 8 | 9 | def run(output_dir, s3destination): 10 | downloaded_file = SampleProcess().run(output_dir) 11 | # If s3 uri is present upload to s3 12 | if s3destination is not None: 13 | AwsS3Io().uploadfile(downloaded_file, s3destination) 14 | 15 | 16 | if __name__ == '__main__': 17 | parser = argparse.ArgumentParser() 18 | parser.add_argument("output_dir", 19 | help="The output location to save the data to") 20 | 21 | parser.add_argument("--s3uri", 22 | help="This is optional, provide the path if you want to upload the data to s3", default=None) 23 | 24 | parser.add_argument("--log-level", help="Log level", default="INFO", choices={"INFO", "WARN", "DEBUG", "ERROR"}) 25 | 26 | args = parser.parse_args() 27 | 28 | # Set up logging 29 | logging.basicConfig(level=logging.getLevelName(args.log_level), handlers=[logging.StreamHandler(sys.stdout)], 30 | format='%(asctime)s - %(name)s - %(levelname)s - %(message)s') 31 | logger = logging.getLogger(__name__) 32 | 33 | # Start process 34 | logger.info("Starting run with arguments...\n{}".format(args.__dict__)) 35 | 36 | run(args.output_dir, args.s3uri) 37 | 38 | 
logger.info("Completed run...") 39 | -------------------------------------------------------------------------------- /source/requirements.txt: -------------------------------------------------------------------------------- 1 | boto3==1.7.62 -------------------------------------------------------------------------------- /source/sampleProcess.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import os 3 | import urllib.request 4 | 5 | """ 6 | This is a simple sample that downloads a JSON-formatted address and writes the output to a directory 7 | """ 8 | 9 | 10 | class SampleProcess: 11 | def __init__(self, uri="http://maps.googleapis.com/maps/api/geocode/json?address=google"): 12 | self.uri = uri 13 | 14 | @property 15 | def logger(self): 16 | return logging.getLogger(__name__) 17 | 18 | def run(self, output_dir): 19 | output_filename = os.path.join(output_dir, "sample.json") 20 | self.logger.info("Downloading from {} to {}".format(self.uri, output_filename)) 21 | 22 | with urllib.request.urlopen(self.uri) as url: 23 | data = url.read().decode() 24 | 25 | self.logger.debug("Writing {} to {}".format(data, output_filename)) 26 | with open(output_filename, "w") as out: 27 | out.write(data) 28 | 29 | self.logger.info("Download complete.") 30 | return output_filename 31 | -------------------------------------------------------------------------------- /tests/IT_test_sampleProcess.py: -------------------------------------------------------------------------------- 1 | import os 2 | import tempfile 3 | from unittest import TestCase 4 | 5 | from sampleProcess import SampleProcess 6 | 7 | """ 8 | This is an integration test 9 | """ 10 | 11 | 12 | class ITTestSampleProcess(TestCase): 13 | 14 | def test_run(self): 15 | # Arrange 16 | sut = SampleProcess() 17 | tmpout = tempfile.mkdtemp() 18 | 19 | # Act 20 | actual = sut.run(output_dir=tmpout) 21 | 22 | # Assert 23 | # Check that the downloaded file exists and is larger than zero bytes 24 | self.assertTrue(os.path.getsize(actual) > 0) 25 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/aws-batch-python-sample/673439d319f2c1cddb1a3cefb5fbdeae27b80930/tests/__init__.py -------------------------------------------------------------------------------- /tests/requirements.txt: -------------------------------------------------------------------------------- 1 | ddt==1.1.3 -------------------------------------------------------------------------------- /tests/test_awsS3Io.py: -------------------------------------------------------------------------------- 1 | from unittest import TestCase 2 | from unittest.mock import Mock 3 | 4 | from awsS3Io import AwsS3Io 5 | from ddt import ddt, data, unpack 6 | 7 | """ 8 | This is a unit test that mocks the boto3 client 9 | """ 10 | 11 | 12 | @ddt 13 | class TestAwsS3Io(TestCase): 14 | 15 | @data(("mydummyfile", "s3://mockbucket/path", "mockbucket", "path") 16 | , ("/user/mydummyfile", "s3://mockbucket/path/", "mockbucket", "path/mydummyfile")) 17 | @unpack 18 | def test_uploadfile(self, localfile, s3, expected_bucket, expected_key): 19 | # Arrange 20 | sut = AwsS3Io() 21 | mocks3client = Mock() 22 | sut.client = mocks3client 23 | 24 | # Act 25 | sut.uploadfile(localfile, s3) 26 | 27 | # Assert the s3 client was called with the expected bucket and key 28 | mocks3client.upload_file.assert_called_with(localfile, expected_bucket, expected_key) 29
| --------------------------------------------------------------------------------
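As a footnote to the run instructions in aws_batch/README.md above: besides triggering runs from the AWS Batch console, the registered job definition can also be submitted programmatically with boto3. The sketch below is illustrative only (the job name, job queue and S3 destination are assumptions); the s3destination parameter name matches the one declared in template_job_definition.py, and the job definition name matches the default used by register_sample_job.py.

```python
import boto3

# Submit one run of the sample job definition (sketch; queue name and bucket are placeholders).
batch = boto3.client("batch")

response = batch.submit_job(
    jobName="aws-batch-sample-python-run",            # any descriptive name for this run
    jobQueue="my-job-queue",                           # an existing AWS Batch job queue (assumption)
    jobDefinition="aws_batch_python_sample",           # name registered by register_sample_job.py
    parameters={
        # Overrides the Ref::s3destination parameter from template_job_definition.py
        "s3destination": "s3://mybucket/aws-batch-sample-python/",
    },
)
print("Submitted job", response["jobId"])
```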