├── .gitignore
├── README.md
├── api-gateway
├── code
│ └── app.py
└── template.yaml
├── autopilot-job-lambda
├── check-autopilot-status
│ ├── app.py
│ └── requirements.txt
├── create-autopilot
│ ├── app.py
│ └── requirements.txt
├── template.yaml
└── workflow-trigger
│ ├── app.py
│ └── requirements.txt
├── buildspec.yaml
├── buildspec_deploy_api.yaml
├── img
├── Machine-Learning-Recipe-with-Olalekan.jpg
├── stepfunctions_graph_horitontal.png
└── stepfunctions_graph_vertical.png
├── requirements.txt
└── workflow
├── __init__.py
├── main.py
└── utils.py
/.gitignore:
--------------------------------------------------------------------------------
1 | .idea
2 | venv
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Automate the end-to-end AutoML lifecycle with Amazon SageMaker Autopilot on Amazon Step Functions
2 |
(CD4AutoML)
3 |
4 |
5 | 
6 |
7 | 
8 |
9 | This repository is a getting started/ready to use kit for deploying and running repeatable machine learning pipelines
10 | with Amazon SageMaker Autopilot. The project leverages with Amazon CodeBuild service integration with Amazon Step Functions
11 | to deploy an Amazon API Gateway for serving predictions. With this project, you can easily jump start your continuous
12 | integration and continuous deployment automated machine learning workflow with Amazon SageMaker Autopilot.
13 |
14 |
15 | ## Architecture
16 | 
17 |
18 | This project is designed to get up and running with CD4AutoML (I coined this), much [CD4ML](https://martinfowler.com/articles/cd4ml.html)
19 | from [Martin Fowler's blogpost](https://martinfowler.com/articles/cd4ml.html).
20 |
21 | ### Technologies:
22 | - Amazon Cloudformation
23 | - Amazon Step Functions
24 | - Amazon CodeBuild
25 | - AWS Step Functions Data Science SDK
26 | - AWS Serverless Application Model
27 | - Amazon Lambda
28 | - Amazon API Gateway
29 | - Amazon SSM Parameter Store
30 |
31 |
32 | This project gets you out of the play/lab mode with Amazon SageMaker Autopilot into running real-life applications with Amazon SageMaker Autopilot.
33 |
34 | ### State machine Workflow
35 | The entire workflow is managed with AWS Step Functions Data Science SDK.
36 | Amazon Step Functions does not have service integration with Amazon SageMaker Autopilot out of the box. To manage this, I leveraged
37 | Amazon Lambda integration with Step Functions to periodically poll for Amazon SageMaker Autopilot job status.
38 |
39 | Once the AutoML job is completed, a model is created using the Amazon SageMaker Autopilot Inference Containers, and an Amazon
40 | SageMaker Endpoint is deployed. But there is more...
41 |
42 | On completion of the deployment of the Amazon SageMaker Endpoint, an Amazon CodeBuild Project state machine task is triggered
43 | which deploys our Amazon API Gateway with AWS Serverless Application Model.
44 |
45 | See workflow image below:
46 |
47 | CD4AutoML
48 |
49 | 
50 |
51 |
52 | ## Future Work
53 | I have plans to abstract away all deployment details and convert this into a Python Module or better put AutoML-as-a-Service.
54 | Users can either provide their Pandas DataFrame or local CSV/JSON data, and the service takes care of the rest. Users will
55 | get a secure REST API which they can make predictions in their applications.
56 |
57 | If you're interested in working on this together, feel free to reach out. Also feel free to extend this project as it suites
58 | you. Experiencing any challenges getting started, create an issue and I will have a look as soon as I can.
59 |
60 |
61 | ### TODO
62 | - [X] Add CloudWatch Schedule event trigger
63 | - [ ] Python Module
64 | - [ ] Convert project to AutoML-as-a-Service
--------------------------------------------------------------------------------
/api-gateway/code/app.py:
--------------------------------------------------------------------------------
1 | from boto3 import client
2 | from os import getenv
3 | from json import loads, dumps
4 |
5 |
6 | SAGEMAKER_ENDPOINT = getenv('SAGEMAKER_ENDPOINT')
7 | SAGEMAKER_AUTOPILOT_TARGET_MODEL = getenv('SAGEMAKER_AUTOPILOT_TARGET_MODEL')
8 | sm_runtime = client('sagemaker-runtime')
9 |
10 |
11 | def respond(data, status=501):
12 | return {
13 | "headers": {
14 | "Access-Control-Allow-Headers": "Content-Type,Authorization,X-Amz-Date,X-Api-Key,X-Amz-Security-Token",
15 | "Access-Control-Allow-Methods": "OPTIONS,POST,PUT",
16 | "Access-Control-Allow-Origin": "*"
17 | },
18 | "statusCode": status,
19 | "body": dumps(data)
20 | }
21 |
22 |
23 | def lambda_handler(event, context):
24 | """
25 | :param event:
26 | :param context:
27 | :return:
28 | """
29 | request_method = event['httpMethod']
30 | request_body = loads(event['body'])
31 |
32 | if request_method == 'OPTIONS':
33 | return respond("This is an empty OPTIONS Request", 200)
34 |
35 | response = sm_runtime.invoke_endpoint(
36 | EndpointName=SAGEMAKER_ENDPOINT,
37 | ContentType='text/csv',
38 | Accept='text/csv',
39 | Body=request_body,
40 | TargetModel=SAGEMAKER_AUTOPILOT_TARGET_MODEL
41 | )
42 | payload = {
43 | 'Prediction': response['Body'].read().decode("utf-8"),
44 | 'SageMakerEndpointName': SAGEMAKER_ENDPOINT
45 | }
46 | return respond(payload, 200)
47 |
--------------------------------------------------------------------------------
/api-gateway/template.yaml:
--------------------------------------------------------------------------------
1 | AWSTemplateFormatVersion: '2010-09-09'
2 | Transform: AWS::Serverless-2016-10-31
3 | Description: >
4 | SAM Template for gram2shop-backend
5 |
6 | # More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
7 | Globals:
8 | Function:
9 | Timeout: 30
10 |
11 | Parameters:
12 | SageMakerEndpoint:
13 | Type: String
14 | SageMakerAutopilotTargetModel:
15 | Type: String
16 | Description: 'Model to invoke in multi-model endpoint'
17 |
18 |
19 | Resources:
20 | SageMakerInferenceApi:
21 | Type: AWS::Serverless::Function
22 | Properties:
23 | CodeUri: code/
24 | Handler: app.lambda_handler
25 | Runtime: python3.7
26 | Environment:
27 | Variables:
28 | SAGEMAKER_ENDPOINT: !Ref SageMakerEndpoint
29 | SAGEMAKER_AUTOPILOT_TARGET_MODEL: !Ref SageMakerAutopilotTargetModel
30 | Events:
31 | Request:
32 | Type: Api
33 | Properties:
34 | Path: /predictions
35 | Method: POST
36 | Options:
37 | Type: Api
38 | Properties:
39 | Path: /predictions
40 | Method: OPTIONS
41 | Policies:
42 | Version: '2012-10-17'
43 | Statement:
44 | - Effect: Allow
45 | Action:
46 | - sagemaker:InvokeEndpoint
47 | Resource: !Sub 'arn:aws:sagemaker:${AWS::Region}:${AWS::AccountId}:endpoint/${SageMakerEndpoint}'
48 |
49 |
50 | Outputs:
51 | SageMakerInferenceApi:
52 | Description: "API Gateway endpoint URL for Prod stage for ExtractRichMediaFunctionApi function"
53 | Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/predictions"
--------------------------------------------------------------------------------
/autopilot-job-lambda/check-autopilot-status/app.py:
--------------------------------------------------------------------------------
1 | """
2 | Amazon Step Functions Lambda Resource to check Amazon SageMaker AutoPilot job status
3 | """
4 |
5 | from boto3 import client
6 |
7 | sm_client = client('sagemaker')
8 |
9 |
10 | def lambda_handler(event, context):
11 | """
12 | :param event:
13 | :param context:
14 | :return:
15 | """
16 | autopilot_job_name = event['AutopilotJobName']
17 | print(f'Autopilot JOb Name: {autopilot_job_name}')
18 | response = sm_client.describe_auto_ml_job(AutoMLJobName=autopilot_job_name)
19 | job_status = response['AutoMLJobStatus']
20 | job_sec_status = response['AutoMLJobSecondaryStatus']
21 | print(f'Autopilot Job {autopilot_job_name} is currently in {job_status}')
22 | result = {
23 | 'AutopilotJobName': autopilot_job_name,
24 | 'AutopilotJobStatus': job_status,
25 | 'AutopilotSecondaryJobStatus': job_sec_status,
26 | 'FailureReason': response.get('FailureReason', None),
27 | 'MachineLearningTaskType': response.get('ProblemType', None)
28 | }
29 | if job_status == 'Completed':
30 | best_candidate = response['BestCandidate']
31 | inference_containers = best_candidate['InferenceContainers']
32 | multi_model_inference_containers = list(map(_set_multimodel_mode, inference_containers))
33 | result['InferenceContainers'] = multi_model_inference_containers
34 | result['BestCandidateName'] = best_candidate['CandidateName']
35 | return result
36 |
37 |
38 | def _set_multimodel_mode(inference_container: dict) -> dict:
39 | """
40 | :param inference_container:
41 | :return:
42 | """
43 | inference_container['Mode'] = 'MultiModel'
44 | return inference_container
45 |
--------------------------------------------------------------------------------
/autopilot-job-lambda/check-autopilot-status/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OElesin/sagemaker-autopilot-step-functions/38e7679f1aff35973f9ce403795cab332c0f3dbd/autopilot-job-lambda/check-autopilot-status/requirements.txt
--------------------------------------------------------------------------------
/autopilot-job-lambda/create-autopilot/app.py:
--------------------------------------------------------------------------------
1 | """
2 | Amazon Step Functions Lambda Resource to create Amazon SageMaker AutoPilot job
3 | """
4 | from boto3 import client
5 | from time import gmtime, strftime
6 |
7 | sm_client = client('sagemaker')
8 |
9 |
10 | def lambda_handler(event, context):
11 | """
12 | :param event:
13 | :param context:
14 | :return:
15 | """
16 | print(event)
17 | timestamp_suffix = strftime('%d-%H-%M-%S', gmtime())
18 | default_autopilot_job_name = f'aws-samples-autopilot-workflow-{timestamp_suffix}'
19 | configuration: dict = event['Configuration']
20 | autopilot_job_name = configuration.get('AutoMLJobName', default_autopilot_job_name)
21 | input_data = configuration.get('S3InputData')
22 | job_execution_role = configuration.get('IamRole')
23 | target_column = configuration.get('TargetColumnName')
24 | output_path = configuration.get('S3OutputData')
25 | tags = configuration.get('Tags')
26 | autopilot_job_tags = generate_job_tags(tags)
27 | autopilot_job_config: dict = {
28 | 'CompletionCriteria': {
29 | 'MaxRuntimePerTrainingJobInSeconds': 600,
30 | 'MaxCandidates': 5,
31 | 'MaxAutoMLJobRuntimeInSeconds': 5400
32 | }
33 | }
34 |
35 | autopilot_input_data_config = [
36 | {
37 | 'DataSource': {
38 | 'S3DataSource': {
39 | 'S3DataType': 'S3Prefix',
40 | 'S3Uri': input_data
41 | }
42 | },
43 | 'TargetAttributeName': target_column
44 | }
45 | ]
46 |
47 | autopilot_output_data_config = {
48 | 'S3OutputPath': output_path
49 | }
50 |
51 | response = sm_client.create_auto_ml_job(
52 | AutoMLJobName=autopilot_job_name,
53 | InputDataConfig=autopilot_input_data_config,
54 | OutputDataConfig=autopilot_output_data_config,
55 | AutoMLJobConfig=autopilot_job_config,
56 | RoleArn=job_execution_role,
57 | Tags=autopilot_job_tags
58 | )
59 | return {
60 | 'AutopilotJobName': autopilot_job_name,
61 | 'AutopilotJobArn': response['AutoMLJobArn']
62 | }
63 |
64 |
65 | def generate_job_tags(raw_tags):
66 | """
67 | :param raw_tags:
68 | :return:
69 | """
70 | base_tags = [
71 | {
72 | 'Key': 'provider',
73 | 'Value': 'elesin.olalekan@gmail.com'
74 | },
75 | ]
76 | if raw_tags is None:
77 | return base_tags
78 | input_tags = [{'Key': key, 'Value': value} for key, value in raw_tags]
79 | return base_tags + input_tags
80 |
--------------------------------------------------------------------------------
/autopilot-job-lambda/create-autopilot/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OElesin/sagemaker-autopilot-step-functions/38e7679f1aff35973f9ce403795cab332c0f3dbd/autopilot-job-lambda/create-autopilot/requirements.txt
--------------------------------------------------------------------------------
/autopilot-job-lambda/template.yaml:
--------------------------------------------------------------------------------
1 | AWSTemplateFormatVersion : '2010-09-09'
2 | Transform: AWS::Serverless-2016-10-31
3 | Description: A sample SAM template for deploying Lambda functions.
4 |
5 | Parameters:
6 | ServiceBuildRoleArn:
7 | Type: String
8 | Description: 'IAM Role to be assumed by CodeBuild to run jobs'
9 | S3InputDataPath:
10 | Type: String
11 | Description: 'AWS S3 Input data path for SageMaker Autopilot model training'
12 | Default: 'Unknown'
13 | TargetColumnName:
14 | Type: String
15 | Description: 'Name of the target column to predict with SageMaker Autopilot'
16 | Default: 'Unknown'
17 | S3OutputPath:
18 | Type: String
19 | Description: 'S3 Path to save SageMaker Autopilot model artifacts'
20 | Default: 'Unknown'
21 | DeployedModelName:
22 | Type: String
23 | Description: |
24 | SageMaker Autopilot model name. This should be unique per account. The name is used for multi-model endpoint
25 | deployment
26 | Default: 'default-autopilot-workflow-model'
27 |
28 | Resources:
29 | CreateAutopilotJobFunction:
30 | Type: AWS::Serverless::Function
31 | Properties:
32 | Handler: app.lambda_handler
33 | FunctionName: 'CreateAutopilotJob'
34 | Description: 'Create SageMaker Autopilot Job'
35 | CodeUri: create-autopilot/
36 | Runtime: python3.7
37 | Policies:
38 | - Version: '2012-10-17'
39 | Statement:
40 | - Effect: Allow
41 | Action:
42 | - lambda:InvokeFunction
43 | - lambda:InvokeAsync
44 | Resource: '*'
45 | - Effect: Allow
46 | Action:
47 | - sagemaker:CreateAutoMLJob
48 | - sagemaker:CreateTrainingJob
49 | - iam:PassRole
50 | Resource: '*'
51 | - Effect: Allow
52 | Action:
53 | - s3:CreateBucket
54 | - s3:PutObject
55 | Resource: arn:aws:s3:::sagemaker-*
56 |
57 | CheckAutopilotJobStatusFunction:
58 | Type: AWS::Serverless::Function
59 | Properties:
60 | Handler: app.lambda_handler
61 | CodeUri: check-autopilot-status/
62 | Runtime: python3.7
63 | FunctionName: 'CheckAutopilotJobStatus'
64 | Description: 'Checks SageMaker Autopilot Job Status'
65 | Policies:
66 | Version: '2012-10-17'
67 | Statement:
68 | - Effect: Allow
69 | Action:
70 | - lambda:InvokeFunction
71 | - lambda:InvokeAsync
72 | Resource: '*'
73 | - Effect: Allow
74 | Action:
75 | - sagemaker:DescribeAutoMLJob
76 | - sagemaker:DescribeTrainingJob
77 | Resource: '*'
78 |
79 | StateMachineWorkflowTriggerFunction:
80 | Type: AWS::Serverless::Function
81 | Properties:
82 | Handler: app.lambda_handler
83 | CodeUri: workflow-trigger/
84 | Runtime: python3.7
85 | FunctionName: 'StateMachineWorkflowTrigger'
86 | Description: 'Lambda function to trigger workflow on schedule'
87 | Environment:
88 | Variables:
89 | SAGEMAKER_EXECUTION_ROLE: !GetAtt SageMakerExecutionRole.Arn
90 | S3_INPUT_DATA_PATH: !Ref S3InputDataPath
91 | S3_OUTPUT_PATH: !Ref S3OutputPath
92 | TARGET_COLUMN_NAME: !Ref TargetColumnName
93 | DEPLOYED_MODEL_NAME: !Ref DeployedModelName
94 | Policies:
95 | Version: '2012-10-17'
96 | Statement:
97 | - Effect: Allow
98 | Action:
99 | - lambda:InvokeFunction
100 | - lambda:InvokeAsync
101 | Resource: '*'
102 | - Effect: Allow
103 | Action:
104 | - states:StartExecution
105 | Resource: '*'
106 | # Events: # uncomment for production deployment
107 | # JobSchedule:
108 | # Type: Schedule
109 | # Properties:
110 | # Schedule: 'rate(14 days)'
111 | # Name: CD4AutoMLSchedule
112 | # Description: test schedule
113 | # Enabled: False
114 |
115 | # AutopilotWorkflowBuildProject:
116 | # Type: AWS::CodeBuild::Project
117 | # Properties:
118 | # Name: AutopilotWorkflowBuild
119 | # ServiceRole: !Ref ServiceBuildRoleArn
120 | # Artifacts:
121 | # Type: NO_ARTIFACTS
122 | # Environment:
123 | # Type: LINUX_CONTAINER
124 | # ComputeType: BUILD_GENERAL1_SMALL
125 | # Image: aws/codebuild/standard:1.0
126 | # PrivilegedMode: true
127 | # Source:
128 | # Type: GITHUB
129 | # Location: https://github.com/OElesin/sagemaker-autopilot-step-functions.git
130 | # Triggers:
131 | # Webhook: true
132 | # FilterGroups:
133 | # - - Type: EVENT
134 | # Pattern: PUSH
135 | # - Type: HEAD_REF
136 | # Pattern: '^refs/heads/master$'
137 | # - Type: FILE_PATH
138 | # Pattern: 'README.md'
139 | # ExcludeMatchedPattern: true
140 | # Tags:
141 | # - Key: 'provider'
142 | # Value: 'elesin.olalekan@gmail.com'
143 |
144 | RestApiBuildProject:
145 | Type: AWS::CodeBuild::Project
146 | Properties:
147 | Name: RestApiBuildProject
148 | ServiceRole: !Ref ServiceBuildRoleArn
149 | BadgeEnabled: true
150 | Artifacts:
151 | Type: NO_ARTIFACTS
152 | Environment:
153 | Type: LINUX_CONTAINER
154 | ComputeType: BUILD_GENERAL1_SMALL
155 | Image: aws/codebuild/standard:1.0
156 | PrivilegedMode: true
157 | Source:
158 | Type: GITHUB
159 | Location: https://github.com/OElesin/sagemaker-autopilot-step-functions.git
160 | BuildSpec: buildspec_deploy_api.yaml
161 | TimeoutInMinutes: 10
162 | Tags:
163 | - Key: 'provider'
164 | Value: 'elesin.olalekan@gmail.com'
165 |
166 | WorkflowExecRole:
167 | Type: AWS::IAM::Role
168 | Properties:
169 | RoleName: WorkflowExecRole
170 | Path: '/'
171 | AssumeRolePolicyDocument:
172 | Version: '2012-10-17'
173 | Statement:
174 | - Action:
175 | - sts:AssumeRole
176 | Effect: Allow
177 | Principal:
178 | Service:
179 | - states.amazonaws.com
180 | ManagedPolicyArns:
181 | - arn:aws:iam::aws:policy/CloudWatchEventsFullAccess
182 |
183 | SageMakerExecutionRole:
184 | Type: AWS::IAM::Role
185 | Properties:
186 | RoleName: SageMakerExecutionRole
187 | Path: '/service-role/'
188 | AssumeRolePolicyDocument:
189 | Version: '2012-10-17'
190 | Statement:
191 | - Action:
192 | - sts:AssumeRole
193 | Effect: Allow
194 | Principal:
195 | Service:
196 | - sagemaker.amazonaws.com
197 | ManagedPolicyArns:
198 | - arn:aws:iam::aws:policy/AmazonS3FullAccess
199 | - arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
200 |
201 | WorkflowExecPolicy:
202 | Type: AWS::IAM::Policy
203 | Properties:
204 | PolicyName: 'WorkflowExecPolicy'
205 | PolicyDocument:
206 | Statement:
207 | - Effect: Allow
208 | Action: iam:PassRole
209 | Resource: '*'
210 | Condition:
211 | StringEquals:
212 | iam:PassedToService: sagemaker.amazonaws.com
213 | - Effect: Allow
214 | Action:
215 | - events:DescribeRule
216 | - events:PutRule
217 | - events:PutTargets
218 | Resource:
219 | - !Sub arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/StepFunctionsGetEventsForSageMakerTrainingJobsRule
220 | - Effect: Allow
221 | Action:
222 | - sagemaker:CreateModel
223 | - sagemaker:DeleteEndpointConfig
224 | - sagemaker:DescribeTrainingJob
225 | - sagemaker:CreateEndpoint
226 | - sagemaker:StopTrainingJob
227 | - sagemaker:CreateTrainingJob
228 | - sagemaker:UpdateEndpoint
229 | - sagemaker:CreateEndpointConfig
230 | - sagemaker:DeleteEndpoint
231 | Resource:
232 | - arn:aws:sagemaker:*:*:*
233 | - Effect: Allow
234 | Action:
235 | - lambda:InvokeFunction
236 | - lambda:InvokeAsync
237 | Resource:
238 | - !GetAtt CreateAutopilotJobFunction.Arn
239 | - !GetAtt CheckAutopilotJobStatusFunction.Arn
240 | - Effect: Allow
241 | Action:
242 | - codebuild:StartBuild
243 | - codebuild:StopBuild
244 | - codebuild:BatchGetBuilds
245 | - codebuild:BatchGetReports
246 | Resource:
247 | - !GetAtt RestApiBuildProject.Arn
248 | Roles:
249 | - !Ref WorkflowExecRole
250 |
251 | WorkflowRoleParameter:
252 | Type: AWS::SSM::Parameter
253 | Properties:
254 | Name: AutopilotWorkflowExecRole
255 | Type: String
256 | Value: !GetAtt WorkflowExecRole.Arn
257 | Description: 'AWS SageMaker Autopilot Step Functions workflow execution role arn.'
258 |
259 | RestApiBuildProjectParameter:
260 | Type: AWS::SSM::Parameter
261 | Properties:
262 | Name: RestApiBuildProject
263 | Type: String
264 | Value: !Ref RestApiBuildProject
265 | Description: 'CodeBuild Project Name for deploying REST API.'
266 |
267 | SageMakerExecutionRoleParameter:
268 | Type: AWS::SSM::Parameter
269 | Properties:
270 | Name: SageMakerExecutionRole
271 | Type: String
272 | Value: !GetAtt SageMakerExecutionRole.Arn
273 | Description: 'AWS SageMaker execution role arn.'
274 |
275 | SageMakerAutopilotModelName:
276 | Type: AWS::SSM::Parameter
277 | Properties:
278 | Name: SageMakerAutopilotModelName
279 | Type: String
280 | Value: !Ref DeployedModelName
281 | Description: 'SageMaker Autopilot model name. This should be unique per account. The name is used for multi-model endpoint deployment'
282 |
283 | Outputs:
284 | WorkflowExecArn:
285 | Value: !GetAtt WorkflowExecRole.Arn
286 | Export:
287 | Name: AutopilotWorkflowExecRole
--------------------------------------------------------------------------------
/autopilot-job-lambda/workflow-trigger/app.py:
--------------------------------------------------------------------------------
1 | from boto3 import client
2 | from time import gmtime, strftime
3 | from os import getenv
4 | from json import dumps
5 |
6 |
7 | ssm_client = client('ssm')
8 | sfn_client = client('stepfunctions')
9 | sagemaker_exec_role = getenv('SAGEMAKER_EXECUTION_ROLE')
10 | s3_input_data_path = getenv('S3_INPUT_DATA_PATH')
11 | deployed_model_name = getenv('DEPLOYED_MODEL_NAME')
12 | s3_output_path = getenv('S3_OUTPUT_PATH')
13 | target_column_name = getenv('TARGET_COLUMN_NAME')
14 |
15 | state_machine_arn = ssm_client.get_parameter(
16 | Name='AutopilotStateMachineWorkflowArn',
17 | )['Parameter']['Value']
18 |
19 |
20 | def lambda_handler(event, context):
21 | """
22 | :param event:
23 | :param context:
24 | :return:
25 | """
26 | timestamp_suffix = strftime('%d-%H-%M-%S', gmtime())
27 | execution_input_dict = {
28 | 'AutoMLJobName': f'autopilot-workflow-job-{timestamp_suffix}',
29 | 'ModelName': deployed_model_name,
30 | 'EndpointName': f'autopilot-workflow-endpoint',
31 | 'S3InputData': s3_input_data_path,
32 | 'TargetColumnName': target_column_name,
33 | 'S3OutputData': s3_output_path,
34 | 'IamRole': sagemaker_exec_role,
35 | }
36 | execution_input = dumps(execution_input_dict)
37 | sfn_client.start_execution(
38 | stateMachineArn=state_machine_arn,
39 | input=execution_input
40 | )
41 |
--------------------------------------------------------------------------------
/autopilot-job-lambda/workflow-trigger/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OElesin/sagemaker-autopilot-step-functions/38e7679f1aff35973f9ce403795cab332c0f3dbd/autopilot-job-lambda/workflow-trigger/requirements.txt
--------------------------------------------------------------------------------
/buildspec.yaml:
--------------------------------------------------------------------------------
1 | version: 0.2
2 |
3 | env:
4 | variables:
5 | ARTIFACTS_BUCKET: galudy-visual-search-artifacts
6 | S3_WORKFLOW_PREFIX: sagemaker-autopilot-step-functions-workflow
7 | S3_API_PREFIX: sagemaker-autopilot-api-gateway
8 | WORKFLOW_STACK_NAME: sagemaker-autopilot-ml-workflow
9 | API_STACK_NAME: sagemaker-autopilot-api-gateway
10 |
11 | phases:
12 | install:
13 | runtime-versions:
14 | python: 3.7
15 | commands:
16 | - apt-get update
17 | - apt-get install libblas-dev liblapack-dev -y
18 | - pip install --upgrade pip boto3
19 | - pip install stepfunctions aws-sam-cli
20 | pre_build:
21 | commands:
22 | - echo "Run AWS SAM Build on workflow stack and API stack"
23 | - export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
24 | - export SERVICE_BUILD_ROLE="arn:aws:iam::${AWS_ACCOUNT_ID}:role/service-role/codebuild-Attires24-App-service-role"
25 | - echo "SAM package workflow template"
26 | - |
27 | sh -c """
28 | cd autopilot-job-lambda
29 | sam build -b deploy/
30 | sam package --template-file deploy/template.yaml \
31 | --output-template-file packaged.template.yaml \
32 | --s3-bucket ${ARTIFACTS_BUCKET} \
33 | --s3-prefix ${S3_WORKFLOW_PREFIX}
34 | """
35 | - echo "SAM package REST API template"
36 | build:
37 | commands:
38 | - echo Build started on `date`
39 | - |
40 | sh -c """
41 | cd autopilot-job-lambda
42 | sam deploy --stack-name ${WORKFLOW_STACK_NAME} \
43 | --template-file packaged.template.yaml \
44 | --parameter-overrides S3ApiCodePath="${ARTIFACTS_BUCKET}/${S3_API_PREFIX}" ServiceBuildRoleArn=${SERVICE_BUILD_ROLE} \
45 | --capabilities CAPABILITY_NAMED_IAM \
46 | --no-fail-on-empty-changeset
47 | """
48 | - echo Done Deploying Stack
49 | - echo "Deploying Step Functions State Machine"
50 | - |
51 | sh -c """
52 | cd workflow/
53 | python main.py
54 | """
55 | - echo "Upload REST API Code to AWS S3"
56 | - aws s3 cp api-gateway "s3://${ARTIFACTS_BUCKET}/${S3_API_PREFIX}" --recursive
57 | post_build:
58 | commands:
59 | - echo "Build Completed"
--------------------------------------------------------------------------------
/buildspec_deploy_api.yaml:
--------------------------------------------------------------------------------
1 | version: 0.2
2 |
3 | env:
4 | variables:
5 | LC_ALL: en_US.UTF-8
6 | LANG: en_US.UTF-8
7 | ARTIFACTS_BUCKET: galudy-visual-search-artifacts
8 | S3_API_PREFIX: sagemaker-autopilot-api-gateway
9 | API_STACK_NAME: sagemaker-autopilot-api-gateway
10 |
11 | phases:
12 | install:
13 | runtime-versions:
14 | python: 3.8
15 | commands:
16 | - pip install -r requirements.txt
17 | pre_build:
18 | commands:
19 | - echo "Run AWS SAM Build on workflow stack and API stack"
20 | - export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
21 | - echo "SAM package workflow template"
22 | - |
23 | sh -c """
24 | cd api-gateway
25 | sam build -b deploy/
26 | sam package --template-file deploy/template.yaml \
27 | --output-template-file packaged.template.yaml \
28 | --s3-bucket ${ARTIFACTS_BUCKET} \
29 | --s3-prefix ${S3_API_PREFIX}
30 | """
31 | - echo "SAM package REST API template"
32 | build:
33 | commands:
34 | - echo Build started on `date`
35 | - |
36 | sh -c """
37 | cd api-gateway
38 | sam deploy --stack-name ${API_STACK_NAME} \
39 | --template-file packaged.template.yaml \
40 | --parameter-overrides SageMakerEndpoint=${SAGEMAKER_ENDPOINT} SageMakerAutopilotTargetModel=${SAGEMAKER_AUTOPILOT_TARGET_MODEL} \
41 | --capabilities CAPABILITY_IAM \
42 | --no-fail-on-empty-changeset
43 | """
44 | post_build:
45 | commands:
46 | - echo "Build Completed"
--------------------------------------------------------------------------------
/img/Machine-Learning-Recipe-with-Olalekan.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OElesin/sagemaker-autopilot-step-functions/38e7679f1aff35973f9ce403795cab332c0f3dbd/img/Machine-Learning-Recipe-with-Olalekan.jpg
--------------------------------------------------------------------------------
/img/stepfunctions_graph_horitontal.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OElesin/sagemaker-autopilot-step-functions/38e7679f1aff35973f9ce403795cab332c0f3dbd/img/stepfunctions_graph_horitontal.png
--------------------------------------------------------------------------------
/img/stepfunctions_graph_vertical.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OElesin/sagemaker-autopilot-step-functions/38e7679f1aff35973f9ce403795cab332c0f3dbd/img/stepfunctions_graph_vertical.png
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | boto3
2 | stepfunctions
3 | aws-sam-cli
4 | sagemaker
--------------------------------------------------------------------------------
/workflow/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/OElesin/sagemaker-autopilot-step-functions/38e7679f1aff35973f9ce403795cab332c0f3dbd/workflow/__init__.py
--------------------------------------------------------------------------------
/workflow/main.py:
--------------------------------------------------------------------------------
1 | import sagemaker
2 | from stepfunctions.steps import LambdaStep, Wait, Choice, Task, Chain, ChoiceRule, \
3 | Catch, Retry, Fail, EndpointConfigStep, EndpointStep
4 | from boto3 import client
5 | from stepfunctions.inputs import ExecutionInput
6 | from stepfunctions.workflow import Workflow
7 | from time import gmtime, strftime
8 | from sagemaker.model_monitor import DataCaptureConfig
9 | import utils
10 |
11 | sagemaker_session = sagemaker.Session()
12 | sagemaker_exec_role = utils.get_sagemaker_execution_role()
13 | sfn_client = client('stepfunctions')
14 | # define execution input
15 | execution_input = ExecutionInput(schema={
16 | 'AutoMLJobName': str,
17 | 'ModelName': str,
18 | 'S3InputData': str,
19 | 'IamRole': str,
20 | 'TargetColumnName': str,
21 | 'S3OutputData': str,
22 | 'Tags': dict,
23 | 'EndpointName': str,
24 | 'EndpointConfigName': str
25 |
26 | })
27 |
28 | # TODO: make this a notification
29 | workflow_failure = Fail(
30 | 'WorkflowFailed'
31 | )
32 |
33 | # create autopilot lambda step
34 | create_autopilot_job_step = LambdaStep(
35 | 'StartAutopilotJob',
36 | parameters={
37 | 'FunctionName': 'CreateAutopilotJob',
38 | 'Payload': {
39 | 'Configuration': {
40 | 'AutoMLJobName': execution_input['AutoMLJobName'],
41 | 'S3InputData': execution_input['S3InputData'],
42 | 'IamRole': execution_input['IamRole'],
43 | 'TargetColumnName': execution_input['TargetColumnName'],
44 | 'S3OutputData': execution_input['S3OutputData'],
45 | 'Tags': execution_input['Tags']
46 | }
47 | }
48 | }
49 | )
50 |
51 | create_autopilot_job_step.add_retry(Retry(
52 | error_equals=["States.TaskFailed"],
53 | interval_seconds=15,
54 | max_attempts=2,
55 | backoff_rate=4.0
56 | ))
57 |
58 | create_autopilot_job_step.add_catch(Catch(
59 | error_equals=["States.TaskFailed"],
60 | next_step=workflow_failure
61 | ))
62 |
63 | check_autopilot_job_status = LambdaStep(
64 | 'CheckAutopilotJobStatus',
65 | parameters={
66 | 'FunctionName': 'CheckAutopilotJobStatus',
67 | 'Payload': {
68 | 'AutopilotJobName': create_autopilot_job_step.output()['Payload']['AutopilotJobName']
69 | }
70 | }
71 | )
72 |
73 | check_job_wait_state = Wait(
74 | state_id="Wait",
75 | seconds=360
76 | )
77 |
78 | check_job_choice = Choice(
79 | state_id="IsAutopilotJobComplete"
80 | )
81 |
82 |
83 | model_step = Task(
84 | 'CreateAutopilotModel',
85 | resource='arn:aws:states:::sagemaker:createModel',
86 | parameters={
87 | 'Containers': check_autopilot_job_status.output()['Payload']['InferenceContainers'],
88 | 'ModelName': execution_input['ModelName'],
89 | 'ExecutionRoleArn': sagemaker_exec_role
90 | }
91 | )
92 |
93 | endpoint_config_step = EndpointConfigStep(
94 | 'CreateModelEndpointConfig',
95 | endpoint_config_name=execution_input['EndpointConfigName'],
96 | model_name=execution_input['ModelName'],
97 | initial_instance_count=1,
98 | instance_type='ml.m4.xlarge',
99 | data_capture_config=DataCaptureConfig(
100 | enable_capture=True,
101 | sampling_percentage=100,
102 | )
103 | )
104 |
105 |
106 | endpoint_step = EndpointStep(
107 | 'UpdateModelEndpoint',
108 | endpoint_name=execution_input['EndpointName'],
109 | endpoint_config_name=execution_input['EndpointConfigName'],
110 | update=False
111 | )
112 |
113 | # define Amazon CodeBuild Step Functions Task
114 | deploy_rest_api_task = Task(
115 | 'DeployRestAPI',
116 | resource='arn:aws:states:::codebuild:startBuild.sync',
117 | parameters={
118 | 'ProjectName': utils.get_api_codebuild_project(),
119 | 'EnvironmentVariablesOverride': [
120 | {
121 | 'Name': 'SAGEMAKER_ENDPOINT',
122 | 'Type': 'PLAIN_TEXT',
123 | 'Value': execution_input['EndpointName']
124 | },
125 | {
126 | 'Name': 'SAGEMAKER_AUTOPILOT_TARGET_MODEL',
127 | 'Type': 'PLAIN_TEXT',
128 | 'Value': '{}.tar.gz'.format(execution_input['ModelName'])
129 | }
130 | ]
131 | }
132 | )
133 |
134 | # happy path
135 | model_and_endpoint_step = Chain([
136 | model_step,
137 | endpoint_config_step,
138 | endpoint_step,
139 | deploy_rest_api_task
140 | ])
141 |
142 |
143 | # define choice
144 | check_job_choice.add_choice(
145 | ChoiceRule.StringEquals(
146 | variable=check_autopilot_job_status.output()['Payload']['AutopilotJobStatus'],
147 | value='InProgress'
148 | ),
149 | next_step=check_autopilot_job_status
150 | )
151 |
152 | check_job_choice.add_choice(
153 | ChoiceRule.StringEquals(
154 | variable=check_autopilot_job_status.output()['Payload']['AutopilotJobStatus'],
155 | value='Stopping'
156 | ),
157 | next_step=check_autopilot_job_status
158 | )
159 |
160 | check_job_choice.add_choice(
161 | ChoiceRule.StringEquals(
162 | variable=check_autopilot_job_status.output()['Payload']['AutopilotJobStatus'],
163 | value='Failed'
164 | ),
165 | next_step=workflow_failure
166 | )
167 |
168 | check_job_choice.add_choice(
169 | ChoiceRule.StringEquals(
170 | variable=check_autopilot_job_status.output()['Payload']['AutopilotJobStatus'],
171 | value='Stopped'
172 | ),
173 | next_step=workflow_failure
174 | )
175 |
176 | check_job_choice.add_choice(
177 | ChoiceRule.StringEquals(
178 | variable=check_autopilot_job_status.output()['Payload']['AutopilotJobStatus'],
179 | value='Completed'
180 | ),
181 | next_step=model_and_endpoint_step
182 | )
183 |
184 | workflow_definition = Chain([
185 | create_autopilot_job_step,
186 | check_autopilot_job_status,
187 | check_job_wait_state,
188 | check_job_choice
189 | ])
190 |
191 | autopilot_ml_workflow = Workflow(
192 | name="AutopilotStateMachineWorkflow",
193 | definition=workflow_definition,
194 | role=utils.get_workflow_role()
195 | )
196 |
197 | try:
198 | state_machine_arn = autopilot_ml_workflow.create()
199 | except sfn_client.exceptions.StateMachineAlreadyExists as e:
200 | print(e.message)
201 | else:
202 | print("Updating workflow definition")
203 | state_machine_arn = autopilot_ml_workflow.update(workflow_definition)
204 |
205 |
206 | utils.save_state_machine_arn(state_machine_arn)
207 |
208 | timestamp_suffix = strftime('%d-%H-%M-%S', gmtime())
209 |
210 | # Uncomment below when you're ready to execute workflow on deployment
211 | # autopilot_ml_workflow.execute(
212 | # inputs={
213 | # 'AutoMLJobName': f'autopilot-workflow-job-{timestamp_suffix}',
214 | # 'ModelName': f'autopilot-workflow-{timestamp_suffix}-model',
215 | # 'EndpointConfigName': f'autopilot-workflow-{timestamp_suffix}-endpoint-config',
216 | # 'EndpointName': f'autopilot-workflow-{timestamp_suffix}-endpoint',
217 | # 'S3InputData': '',
218 | # 'TargetColumnName': '',
219 | # 'S3OutputData': '',
220 | # 'IamRole': sagemaker_exec_role,
221 | # }
222 | # )
223 |
224 |
--------------------------------------------------------------------------------
/workflow/utils.py:
--------------------------------------------------------------------------------
1 | import boto3
2 |
3 | ssm_client = boto3.client('ssm')
4 |
5 |
6 | def get_workflow_role() -> str:
7 | """
8 | :return:
9 | """
10 | response = ssm_client.get_parameter(
11 | Name='AutopilotWorkflowExecRole',
12 | )
13 | return response['Parameter']['Value']
14 |
15 |
16 | def get_api_codebuild_project() -> str:
17 | """
18 | :return:
19 | """
20 | response = ssm_client.get_parameter(
21 | Name='RestApiBuildProject',
22 | )
23 | return response['Parameter']['Value']
24 |
25 |
26 | def get_sagemaker_execution_role():
27 | """
28 | Convert SageMaker Autopilot Inference Containers to PipelineModel
29 | :return:
30 | """
31 | response = ssm_client.get_parameter(
32 | Name='SageMakerExecutionRole',
33 | )
34 | return response['Parameter']['Value']
35 |
36 |
37 | def save_state_machine_arn(state_machine_arn: str):
38 | """
39 | Save state machine ARN to Amazon SSM Parameter Store
40 | :param state_machine_arn:
41 | :return:
42 | """
43 | response = ssm_client.put_parameter(
44 | Name='AutopilotStateMachineWorkflowArn',
45 | Description='SageMaker Autopilot Step Function State machine ARN',
46 | Value=state_machine_arn,
47 | Type='String',
48 | Overwrite=True
49 | )
50 | print(response)
51 | return None
52 |
53 |
--------------------------------------------------------------------------------