├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── customer-stack ├── create-resources.py ├── customer-stack.yml └── generate-layer.sh ├── images ├── image1.png ├── image2.png └── image3.png ├── policies.zip ├── sample-templates ├── private-link-setup.yml └── vpc-setup.yml └── snowflake-integration-overview.md /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | ## Code of Conduct 2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 4 | opensource-codeofconduct@amazon.com with any additional questions or comments. 5 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines 2 | 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 4 | documentation, we greatly value feedback and contributions from our community. 5 | 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 7 | information to effectively respond to your bug report or contribution. 8 | 9 | 10 | ## Reporting Bugs/Feature Requests 11 | 12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features. 13 | 14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already 15 | reported the issue. Please try to include as much information as you can. 
Details like these are incredibly useful: 16 | 17 | * A reproducible test case or series of steps 18 | * The version of our code being used 19 | * Any modifications you've made relevant to the bug 20 | * Anything unusual about your environment or deployment 21 | 22 | 23 | ## Contributing via Pull Requests 24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that: 25 | 26 | 1. You are working against the latest source on the *main* branch. 27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already. 28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted. 29 | 30 | To send us a pull request, please: 31 | 32 | 1. Fork the repository. 33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change. 34 | 3. Ensure local tests pass. 35 | 4. Commit to your fork using clear commit messages. 36 | 5. Send us a pull request, answering any default questions in the pull request interface. 37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation. 38 | 39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/). 41 | 42 | 43 | ## Finding contributions to work on 44 | Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start. 45 | 46 | 47 | ## Code of Conduct 48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 50 | opensource-codeofconduct@amazon.com with any additional questions or comments. 51 | 52 | 53 | ## Security issue notifications 54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public github issue. 55 | 56 | 57 | ## Licensing 58 | 59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. 60 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of 4 | this software and associated documentation files (the "Software"), to deal in 5 | the Software without restriction, including without limitation the rights to 6 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 7 | the Software, and to permit persons to whom the Software is furnished to do so. 8 | 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 10 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 11 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 12 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 13 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 14 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
15 | 16 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Amazon SageMaker Integration with Snowflake 2 | 3 | You can use the CloudFormation template provided in this repository to add machine-learning capabilities to your Snowflake account using Amazon SageMaker. 4 | 5 | In order to use this package, you need a Snowflake account and an AWS account. 6 | After a few manual steps, the CloudFormation template can be deployed to your AWS account in order to create all the AWS resources (API Gateway, Lambda) and Snowflake resources (external functions) required. 7 | 8 | The instructions that follow allow you to set up and deploy the CloudFormation template for development/debugging/testing purposes. For a quick start user guide on how to set up your Snowflake account with Amazon SageMaker, please refer to the [Snowflake Integration Overview](snowflake-integration-overview.md) article. 9 | 10 | # Preparation 11 | 12 | These are the steps to prepare the CloudFormation template (customer-stack.yml) for deployment. 13 | 14 | The repository includes a minimal set of AWS policies needed to run the CloudFormation template for the integration with Snowflake. They are packaged in the policies.zip file, which contains the individual policies as JSON files. 15 | 16 | ## Snowflake Resources needed 17 | 18 | Load a tabular dataset (e.g. a CSV file) into a Snowflake table. For instance, you can use the Abalone data, originally from the UCI data repository (https://archive.ics.uci.edu/ml/datasets/abalone). 19 | 20 | ## Create an AWS secret containing the Credentials to access your Snowflake account 21 | 22 | The credentials to access Snowflake must be stored in a secret in AWS Secrets Manager. To set that up: 23 | 24 | 1. Go to the Secrets Manager console. 25 | 2. Click on *Store a new Secret* 26 | 3. 
Select *Other type of secrets* 27 | 4. On the *Secret key/value* tab, fill in three key/value rows: 28 | 29 | * accountid (this contains your Snowflake account id) 30 | * username (this contains your Snowflake username) 31 | * password (this contains your Snowflake password) 32 | 33 | If you click *Plaintext* you should see something like this: 34 | 35 | ``` 36 | { 37 | "accountid": "your_account_id", 38 | "username": "your_username", 39 | "password": "your_password" 40 | } 41 | ``` 42 | 43 | 5. Leave the default encryption key selected and click next. 44 | 6. Give a name to your Secret and click next (for example: mySecret). 45 | 46 | 47 | ## Private network setup 48 | The setup below is needed only if the integration infrastructure is to be deployed inside a VPC. 49 | 50 | ### VPC setup 51 | Sample VPC setup CloudFormation template: [vpc-setup.yml](sample-templates/vpc-setup.yml) 52 | 53 | 54 | ### Snowflake Privatelink config and VPC id 55 | Get the Snowflake PrivateLink configuration using the following command: 56 | 57 | ``` 58 | select SYSTEM$GET_PRIVATELINK_CONFIG(); 59 | ``` 60 | 61 | The response output will have the following details: 62 | * `privatelink-account-name` 63 | * `privatelink-internal-stage` 64 | * `privatelink-account-url` 65 | * `privatelink-ocsp-url` 66 | * `privatelink-vpce-id` 67 | 68 | The values above are needed for the PrivateLink setup. See [Private Link setup](#private-link-setup). 69 | 70 | 71 | Get the Snowflake VPC ID using the following command: 72 | ``` 73 | select SYSTEM$GET_SNOWFLAKE_PLATFORM_INFO(); 74 | ``` 75 | 76 | The response output will include: 77 | * `snowflake-vpc-id` 78 | 79 | This attribute is named `snowflakeVpcId` in the CloudFormation template parameters. 
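
Both system functions return their result as a JSON string, so the values listed above can be pulled out with a few lines of Python. The following is a minimal sketch, not part of this repository: the helper name `extract_private_network_settings`, the sample JSON strings, and the output keys other than `snowflakeVpcId` are illustrative assumptions. It also assumes that `snowflake-vpc-id` may be returned either as a single string or as a list, and handles both.

```python
import json


def extract_private_network_settings(privatelink_config_json, platform_info_json):
    """Collect the values needed for the VPC / PrivateLink setup.

    Both arguments are the raw JSON strings returned by
    SYSTEM$GET_PRIVATELINK_CONFIG() and SYSTEM$GET_SNOWFLAKE_PLATFORM_INFO().
    """
    privatelink = json.loads(privatelink_config_json)
    platform = json.loads(platform_info_json)

    vpc_id = platform["snowflake-vpc-id"]
    # Assumption: the VPC id may arrive as a list; use the first entry if so.
    if isinstance(vpc_id, list):
        vpc_id = vpc_id[0]

    return {
        # Name of the CloudFormation parameter in customer-stack.yml.
        "snowflakeVpcId": vpc_id,
        # Hypothetical key names, chosen only for this example.
        "privatelinkVpceId": privatelink["privatelink-vpce-id"],
        "privatelinkAccountUrl": privatelink["privatelink-account-url"],
    }


# Made-up sample outputs, for illustration only.
sample_privatelink = json.dumps({
    "privatelink-account-name": "example.eu-west-1.privatelink",
    "privatelink-account-url": "example.eu-west-1.privatelink.snowflakecomputing.com",
    "privatelink-vpce-id": "com.amazonaws.vpce.eu-west-1.vpce-svc-0123456789abcdef0",
})
sample_platform = json.dumps({"snowflake-vpc-id": ["vpc-0123456789abcdef0"]})

settings = extract_private_network_settings(sample_privatelink, sample_platform)
print(settings["snowflakeVpcId"])
```

The resulting `snowflakeVpcId` value is what the stack expects for the parameter of the same name; the other two values are the kind of input a PrivateLink setup needs.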
80 | 81 | ### Private Link setup 82 | Sample PrivateLink setup CloudFormation template: [private-link-setup.yml](sample-templates/private-link-setup.yml) 83 | 84 | **Note**: The [vpc-setup.yml](sample-templates/vpc-setup.yml) and [private-link-setup.yml](sample-templates/private-link-setup.yml) templates are representative samples. They may not cover everything your use case requires. 85 | 86 | # Creation of the stack 87 | 88 | ## CloudFormation Parameters 89 | 90 | These parameters are needed to create the stack. 91 | 92 | * s3BucketName: "Name of the S3 bucket to be created to store the training data and artifacts produced by the SageMaker AutoML jobs" 93 | * snowflakeSecretArn: "ARN of the AWS Secret containing the Snowflake login information" 94 | * kmsKeyArn (Optional): "ARN of the AWS Key Management Service key that Amazon SageMaker uses to encrypt job outputs. The KmsKeyId is applied to all outputs." 95 | * snowflakeRole (Optional): "Snowflake Role with permissions to create Storage and API Integrations" 96 | * snowflakeDatabaseName: "Snowflake Database in which external functions will be created" 97 | * snowflakeSchemaName: "Snowflake Database Schema in which external functions will be created" 98 | * apiGatewayName (Optional): "API Gateway name" 99 | * apiGatewayStageName (Optional): "API Gateway stage name" 100 | * apiGatewayType (Optional): "API Gateway type, it can be PRIVATE or REGIONAL. If not provided, then it defaults to REGIONAL" 101 | * snowflakeResourceSuffix (Optional): "Suffix for resources created in Snowflake. This suffix will be added to all function names created in the database schema." 102 | 103 | The following parameters are required if the setup needs to be inside a VPC. 104 | * snowflakeVpcId: "Snowflake VPC ID. 
Required if setup is to be done inside VPC" 105 | * vpcSecurityGroupIds: "List of security group ids" 106 | * vpcSubnetIds: "List of VPC subnet ids" 107 | 108 | ## Create the stack via the CLI 109 | 110 | You can create the stack via CLI by using these commands: 111 | 112 | ``` 113 | aws cloudformation create-stack \ 114 | --region YOUR_REGION \ 115 | --stack-name myteststack \ 116 | --template-body file://path/to/customer-stack.yml \ 117 | --capabilities CAPABILITY_NAMED_IAM \ 118 | --parameters ParameterKey=s3BucketName,ParameterValue=S3_BUCKET_NAME \ 119 | ParameterKey=snowflakeSecretArn,ParameterValue=CREDENTIALS_SECRET_ARN \ 120 | ParameterKey=kmsKeyArn,ParameterValue=KMS_KEY_ARN \ 121 | ParameterKey=snowflakeRole,ParameterValue=SNOWFLAKE_ROLE \ 122 | ParameterKey=snowflakeDatabaseName,ParameterValue=SNOWFLAKE_DATABASE_NAME \ 123 | ParameterKey=snowflakeSchemaName,ParameterValue=SNOWFLAKE_SCHEMA_NAME \ 124 | ParameterKey=apiGatewayName,ParameterValue=API_GW_NAME \ 125 | ParameterKey=apiGatewayStageName,ParameterValue=API_GW_STAGE_NAME \ 126 | ParameterKey=snowflakeResourceSuffix,ParameterValue=SUFFIX 127 | ``` 128 | 129 | If you want to create the setup in a VPC, use the below command: 130 | 131 | ``` 132 | aws cloudformation create-stack \ 133 | --region YOUR_REGION \ 134 | --stack-name myteststack \ 135 | --template-body file://path/to/customer-stack.yml \ 136 | --capabilities CAPABILITY_NAMED_IAM \ 137 | --parameters ParameterKey=s3BucketName,ParameterValue=S3_BUCKET_NAME \ 138 | ParameterKey=snowflakeSecretArn,ParameterValue=CREDENTIALS_SECRET_ARN \ 139 | ParameterKey=kmsKeyArn,ParameterValue=KMS_KEY_ARN \ 140 | ParameterKey=snowflakeRole,ParameterValue=SNOWFLAKE_ROLE \ 141 | ParameterKey=snowflakeDatabaseName,ParameterValue=SNOWFLAKE_DATABASE_NAME \ 142 | ParameterKey=snowflakeSchemaName,ParameterValue=SNOWFLAKE_SCHEMA_NAME \ 143 | ParameterKey=vpcSecurityGroupIds,ParameterValue=SECURITY_GROUPS \ 144 | ParameterKey=vpcSubnetIds,ParameterValue=VPC_SUBNETS 
\ 145 | ParameterKey=apiGatewayName,ParameterValue=API_GW_NAME \ 146 | ParameterKey=apiGatewayStageName,ParameterValue=API_GW_STAGE_NAME \ 147 | ParameterKey=apiGatewayType,ParameterValue=API_GW_TYPE \ 148 | ParameterKey=snowflakeVpcId,ParameterValue=SNOWFLAKE_VPC_ID \ 149 | ParameterKey=snowflakeResourceSuffix,ParameterValue=SUFFIX 150 | ``` 151 | 152 | **Note:** If the stack already exists, you can update it by replacing *create-stack* with *update-stack* in the previous command. 153 | 154 | ## Create the stack via the Console 155 | 156 | If you want to do it via the console: 157 | 158 | * Go to CloudFormation 159 | * Create stack with new resources 160 | * Upload template file 161 | * Set the parameters 162 | * Click Next 163 | * Write the template name 164 | * Click next to create the resources 165 | 166 | 167 | ## Generate and upload Layer and Lambda code to an existing S3 bucket 168 | 169 | The Snowflake Python connector is not part of the AWS Lambda runtime. In order to load the Snowflake Python connector into Lambda, we need to use a Lambda layer. 170 | 171 | A Lambda layer takes a ZIP file with the libraries (formatted according to the language used https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html) from S3 and loads them into the Lambda runtime environment. 172 | 173 | The Lambda layer ZIP file is hosted in a publicly accessible S3 bucket (`sagemaker-sample-files`) that this CloudFormation template refers to. In case you wish to generate the layer manually (for development/testing), please follow the instructions below. 174 | 175 | ### Generate ZIP file containing the Layer code 176 | 177 | The script *generate-layer.sh* located in the *customer-stack/* directory is responsible for downloading the needed files. 
178 | 179 | In order to execute it, from a Linux terminal run: 180 | 181 | ``` 182 | % cd customer-stack/ 183 | % bash generate-layer.sh 184 | % cd layer/snowflake-connector-python/ 185 | % zip -r snowflake-connector-python-.zip . 186 | ``` 187 | 188 | These commands will generate a file called *snowflake-connector-python-.zip* containing the libraries for the Lambda. 189 | 190 | You can then upload the generated file to your S3 bucket and use the corresponding S3 URL as a reference for your Lambda layer. 191 | 192 | ### Generate ZIP file containing the Lambda code 193 | 194 | In order to load the libraries, the Lambda function can't be specified inline in the CloudFormation template (inline code would also be visible to and editable by customers once the stack is created). 195 | 196 | As such, we need to ZIP the Lambda Python code and upload it to the same S3 bucket where we uploaded the layer ZIP file in the previous step. 197 | 198 | From a Linux terminal, run: 199 | 200 | ``` 201 | % cd customer-stack/ 202 | % zip -r create-resources-.zip create-resources.py 203 | ``` 204 | 205 | These commands will generate a file called *create-resources-.zip* containing the Lambda code. 206 | 207 | You can then upload the generated file to your S3 bucket and use the corresponding S3 URL as a reference for your Lambda function code. 208 | 209 | # APIs 210 | 211 | For detailed documentation about the APIs provided by the stack, please refer to the [Snowflake Integration Overview](snowflake-integration-overview.md) article. 
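
Each external function created by the stack is paired with a request translator and a response translator, written as JavaScript functions inside Snowflake. As a rough illustration of the data shapes involved, the sketch below mirrors the describe-model translators from `create-resources.py` in Python. The Python function names and the sample model name are assumptions made for this example; only the field names and the `-job` suffix come from the translators themselves. One difference to keep in mind: where the JavaScript version leaves a missing field undefined, this sketch stores `None`.

```python
import json


def describe_model_request(event):
    """Mirror of the describe-model request translator.

    Snowflake passes rows as event["body"]["data"] = [[row_number, arg1, ...]].
    """
    model_name = event["body"]["data"][0][1]
    payload = {"AutoMLJobName": model_name + "-job"}
    return {"body": json.dumps(payload)}


def describe_model_response(event):
    """Mirror of the describe-model response translator."""
    body = event["body"]
    status = body.get("AutoMLJobStatus")
    response = {
        "JobStatus": status,
        "JobStatusDetails": body.get("AutoMLJobSecondaryStatus"),
    }
    if status == "Completed":
        best = body.get("BestCandidate")
        if best:
            metric = best["FinalAutoMLJobObjectiveMetric"]
            response["ObjectiveMetric"] = metric["MetricName"]
            response["BestObjectiveMetric"] = metric["Value"]
    elif status == "Failed":
        response["FailureReason"] = body.get("FailureReason")
    response["PartialFailureReasons"] = body.get("PartialFailureReasons")
    # Snowflake expects the result wrapped as a single row: [[0, value]].
    return {"body": {"data": [[0, response]]}}


# Hypothetical input row: row number 0, model name "abalone-model".
request = describe_model_request({"body": {"data": [[0, "abalone-model"]]}})
failed = describe_model_response(
    {"body": {"AutoMLJobStatus": "Failed", "FailureReason": "example reason"}}
)
print(request["body"])
print(failed["body"]["data"][0][1]["JobStatus"])
```

The request side unwraps the single input row and renames it into the payload the backend expects; the response side flattens the job description into a small object and wraps it back into Snowflake's row format.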
212 | -------------------------------------------------------------------------------- /customer-stack/create-resources.py: -------------------------------------------------------------------------------- 1 | import json 2 | import boto3 3 | import os 4 | import logging 5 | from botocore.exceptions import ClientError 6 | import requests 7 | 8 | import snowflake.connector 9 | 10 | SUCCESS = 'SUCCESS' 11 | FAILED = 'FAILED' 12 | EMPTY_RESPONSE_DATA = {} 13 | 14 | EXTERNAL_ID = "external_id" 15 | SERVICE = "service" 16 | USER_ARN = "user_arn" 17 | 18 | logger = logging.getLogger(__name__) 19 | logger.setLevel(logging.INFO) 20 | 21 | def lambda_handler(event, context): 22 | 23 | # Get variables from os 24 | api_gateway_url = os.environ['ApiGatewayURL'] 25 | api_gateway_role_arn = os.environ['ApiGatewayRoleARN'] 26 | api_gateway_role_name = os.environ['ApiGatewayRoleName'] 27 | auto_ml_role_arn = os.environ['AutoMLRoleARN'] 28 | auto_ml_role_name = os.environ['AutoMLRoleName'] 29 | region_name = os.environ['Region'] 30 | s3_bucket_name = os.environ['S3BucketName'] 31 | secret_name = os.environ['SecretArn'] 32 | kms_key_arn = os.environ['KmsKeyArn'] 33 | vpc_security_group_ids = os.environ['VpcSecurityGroupIds'] 34 | vpc_subnet_ids = os.environ['VpcSubnetIds'] 35 | snowflake_role_name = os.environ['SnowflakeRole'] 36 | stack_name = os.environ['StackName'] 37 | database_name = os.environ['DatabaseName'] 38 | schema_name = os.environ['SchemaName'] 39 | apigw_type = os.environ['ApiGatewayType'] 40 | 41 | logger.info("api_gateway_url: " + api_gateway_url) 42 | logger.info("api_gateway_role_arn: " + api_gateway_role_arn) 43 | logger.info("api_gateway_role_name: " + api_gateway_role_name) 44 | logger.info("auto_ml_role_arn: " + auto_ml_role_arn) 45 | logger.info("auto_ml_role_name: " + auto_ml_role_name) 46 | logger.info("region_name: " + region_name) 47 | logger.info("s3_bucket_name: " + s3_bucket_name) 48 | logger.info("secret_name: " + secret_name) 49 | 
logger.info("kms_key_arn: " + kms_key_arn) 50 | logger.info("vpc_security_group_ids: " + vpc_security_group_ids) 51 | logger.info("vpc_subnet_ids: " + vpc_subnet_ids) 52 | logger.info("snowflake_role_name: " + snowflake_role_name) 53 | logger.info("stack_name: " + stack_name) 54 | logger.info("database_name: " + database_name) 55 | logger.info("schema_name: " + schema_name) 56 | logger.info("Snowflake resource suffix: " + os.environ['SnowflakeResourceSuffix']) 57 | 58 | # Delete 59 | if event['RequestType'] == 'Delete': 60 | logger.info("No action for Delete. Exiting.") 61 | sendResponse(event, context, SUCCESS, EMPTY_RESPONSE_DATA) 62 | return 63 | 64 | # Get the information connection from Secrets Manager 65 | try: 66 | get_secret_value_response = get_secret_information(region_name, secret_name) 67 | except: 68 | sendResponse(event, context, FAILED, EMPTY_RESPONSE_DATA) 69 | return 70 | 71 | # Decrypted secret using the associated KMS CMK 72 | # Ensure the Secret is in String mode 73 | if 'SecretString' not in get_secret_value_response: 74 | logger.error("The Secret is not in String mode") 75 | sendResponse(event, context, FAILED, EMPTY_RESPONSE_DATA) 76 | return 77 | 78 | # Create Snowflake resource 79 | try: 80 | snowflake_connection = connect_to_snowflake(get_secret_value_response, snowflake_role_name) 81 | snowflake_cursor = snowflake_connection.cursor() 82 | 83 | snowflake_cursor.execute(("use database %s;") % (database_name)) 84 | 85 | snowflake_cursor.execute(("use schema %s;") % (schema_name)) 86 | 87 | storage_integration_name = "AWS_AUTOPILOT_STORAGE_INTEGRATION" + "_" + stack_name 88 | api_integration_name = "AWS_AUTOPILOT_API_INTEGRATION" + "_" + stack_name 89 | 90 | # Create Snowflake Integrations 91 | create_storage_integration(snowflake_cursor, storage_integration_name, auto_ml_role_arn, s3_bucket_name) 92 | create_api_integration(snowflake_cursor, api_integration_name, api_gateway_role_arn, api_gateway_url, apigw_type) 93 | 
create_external_functions(snowflake_cursor, api_integration_name, auto_ml_role_arn, api_gateway_url, 94 | s3_bucket_name, secret_name, storage_integration_name, snowflake_role_name, 95 | kms_key_arn, vpc_security_group_ids, vpc_subnet_ids) 96 | 97 | # Describe Snowflake integrations 98 | storage_integration_info = get_storage_integration_info_for_policy(snowflake_cursor, storage_integration_name) 99 | api_integration_info = get_api_integration_info_for_policy(snowflake_cursor, api_integration_name) 100 | except Exception as e: 101 | logger.exception('Problem running SQL statements: ' + str(e)) 102 | responseData = {'Failed': 'Unable to execute SQL statements in Snowflake'} 103 | sendResponse(event, context, FAILED, responseData) 104 | return 105 | finally: 106 | if 'snowflake_cursor' in vars(): 107 | snowflake_cursor.close() 108 | if 'snowflake_connection' in vars(): 109 | snowflake_connection.close() 110 | 111 | # Update IAM role to add Snowflake information 112 | logger.info("Updating IAM Role") 113 | storage_integration_policy_str = create_policy_string(storage_integration_info) 114 | api_integration_policy_str = create_policy_string(api_integration_info) 115 | 116 | try: 117 | update_assume_role_policy(storage_integration_policy_str, auto_ml_role_name) 118 | update_assume_role_policy(api_integration_policy_str, api_gateway_role_name) 119 | except Exception as e: 120 | logger.exception('Problem updating assume role policy: ' + str(e)) 121 | responseData = {'Failed': 'There was a problem updating the assume role policies'} 122 | sendResponse(event, context, FAILED, responseData) 123 | return 124 | 125 | responseData = {'Success': 'Snowflake resources created.'} 126 | sendResponse(event, context, SUCCESS, responseData) 127 | logger.info("Success") 128 | 129 | def get_secret_information(region_name, secret_name): 130 | logger.info("Getting secret information") 131 | try: 132 | secretsmanager = boto3.client('secretsmanager') 133 | 134 | return 
secretsmanager.get_secret_value( 135 | SecretId=secret_name 136 | ) 137 | except ClientError as e: 138 | if e.response['Error']['Code'] == 'ResourceNotFoundException': 139 | logger.exception("The requested secret " + secret_name + " was not found") 140 | else: 141 | logger.exception(e) 142 | raise e 143 | 144 | def connect_to_snowflake(get_secret_value_response, snowflake_role_name): 145 | secret_string = get_secret_value_response['SecretString'] 146 | 147 | secret = json.loads(secret_string) 148 | snowflake_account = secret['accountid'] 149 | snowflake_password = secret['password'] 150 | snowflake_userName = secret['username'] 151 | 152 | # Connect to Snowflake 153 | logger.info("Connecting to Snowflake") 154 | snowflake_connection = snowflake.connector.connect( 155 | user=snowflake_userName, 156 | password=snowflake_password, 157 | account=snowflake_account, 158 | role=snowflake_role_name 159 | ) 160 | 161 | return snowflake_connection 162 | 163 | def sendResponse(event, context, responseStatus, responseData): 164 | responseBody = {'Status': responseStatus, 165 | 'Reason': 'See the details in CloudWatch Log Stream: ' + context.log_stream_name, 166 | 'PhysicalResourceId': context.log_stream_name, 167 | 'StackId': event['StackId'], 168 | 'RequestId': event['RequestId'], 169 | 'LogicalResourceId': event['LogicalResourceId'], 170 | 'Data': responseData} 171 | req = requests.put(event['ResponseURL'], data=json.dumps(responseBody)) 172 | if req.status_code != 200: 173 | raise Exception('Received a non-200 HTTP response while sending response to CloudFormation.') 174 | return 175 | 176 | def create_storage_integration(snowflake_cursor, storage_integration_name, auto_ml_role_arn, s3_bucket_name): 177 | logger.info("Creating Storage Integration [storage_integration_name=%s, auto_ml_role_arn=%s, s3_bucket_name=%s]", 178 | storage_integration_name, auto_ml_role_arn, s3_bucket_name) 179 | 180 | storage_integration_str = ("create or replace storage integration \"%s\" \ 181 | 
type = external_stage \ 182 | storage_provider = s3 \ 183 | enabled = true \ 184 | storage_aws_role_arn = '%s' \ 185 | storage_allowed_locations = ('s3://%s')") % (storage_integration_name, auto_ml_role_arn, s3_bucket_name) 186 | 187 | snowflake_cursor.execute(storage_integration_str) 188 | 189 | def create_api_integration(snowflake_cursor, api_integration_name, api_gateway_role_arn, api_gateway_url, apigw_type): 190 | logger.info("Creating API Integration [api_integration_name=%s, api_gateway_role_arn=%s, api_gateway_url=%s]", 191 | api_integration_name, api_gateway_role_arn, api_gateway_url) 192 | 193 | apigw_provider = "aws_api_gateway" 194 | if apigw_type == 'PRIVATE': 195 | apigw_provider = "aws_private_api_gateway" 196 | 197 | api_integration_str = ("create or replace api integration \"%s\" \ 198 | api_provider = '%s' \ 199 | api_aws_role_arn = '%s' \ 200 | api_allowed_prefixes = ('%s') \ 201 | enabled = true \ 202 | ") % (api_integration_name, apigw_provider, api_gateway_role_arn, api_gateway_url) 203 | 204 | snowflake_cursor.execute(api_integration_str) 205 | 206 | 207 | def create_external_functions(snowflake_cursor, api_integration_name, auto_ml_role_arn, api_gateway_url, s3_bucket_name, 208 | secret_arn, storage_integration_name, snowflake_role_name, 209 | kms_key_arn, vpc_security_group_ids, vpc_subnet_ids): 210 | create_describemodel_ef(snowflake_cursor, api_integration_name, api_gateway_url) 211 | create_createendpoint_ef(snowflake_cursor, api_integration_name, api_gateway_url) 212 | create_createendpointconfig_ef(snowflake_cursor, api_integration_name, api_gateway_url) 213 | create_describeendpoint_ef(snowflake_cursor, api_integration_name, api_gateway_url) 214 | create_deleteendpoint_ef(snowflake_cursor, api_integration_name, api_gateway_url) 215 | create_predictoutcome_ef(snowflake_cursor, api_integration_name, api_gateway_url) 216 | create_createmodel_ef(snowflake_cursor, api_integration_name, api_gateway_url, secret_arn, s3_bucket_name, 217 | 
storage_integration_name, auto_ml_role_arn, snowflake_role_name, 218 | kms_key_arn, vpc_security_group_ids, vpc_subnet_ids) 219 | create_deleteendpointconfig_ef(snowflake_cursor, api_integration_name, api_gateway_url) 220 | create_describeendpointconfig_ef(snowflake_cursor, api_integration_name, api_gateway_url) 221 | 222 | 223 | def create_describemodel_ef(snowflake_cursor, api_integration_name, api_gateway_url): 224 | logger.info("Creating External function: AWS_AUTOPILOT_DESCRIBE_MODEL [api_integration_name=%s, api_gateway_url=%s]", api_integration_name, api_gateway_url) 225 | 226 | describemodel_request_translator_str = ("""create or replace function %s(EVENT OBJECT) 227 | returns OBJECT LANGUAGE JAVASCRIPT AS 228 | $$ 229 | let item = EVENT.body.data[0][1]; 230 | let payload = { 231 | \"AutoMLJobName\" : item + \"-job\" 232 | }; 233 | return {\"body\": JSON.stringify(payload)}; 234 | $$""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DESCRIBE_MODEL_REQUEST_TRANSLATOR")) 235 | 236 | snowflake_cursor.execute(describemodel_request_translator_str) 237 | 238 | describemodel_response_translator_str = ("""create or replace function %s(EVENT OBJECT) 239 | returns OBJECT LANGUAGE JAVASCRIPT AS 240 | $$ 241 | let responseBody = EVENT.body; 242 | let response ={}; 243 | response[\"JobStatus\"] = responseBody.AutoMLJobStatus; 244 | response[\"JobStatusDetails\"] = responseBody.AutoMLJobSecondaryStatus; 245 | if (responseBody.AutoMLJobStatus === \"Completed\") 246 | { 247 | if (responseBody.BestCandidate) { 248 | response[\"ObjectiveMetric\"] = responseBody.BestCandidate.FinalAutoMLJobObjectiveMetric.MetricName; 249 | response[\"BestObjectiveMetric\"] = responseBody.BestCandidate.FinalAutoMLJobObjectiveMetric.Value; 250 | } 251 | } else if (responseBody.AutoMLJobStatus === \"Failed\") 252 | { 253 | response[\"FailureReason\"] = responseBody.FailureReason; 254 | } 255 | 256 | response[\"PartialFailureReasons\"] = responseBody.PartialFailureReasons; 257 | 258 | return 
{\"body\":{ \"data\" : [[0,response]] }}; 259 | $$;""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DESCRIBE_MODEL_RESPONSE_TRANSLATOR")) 260 | 261 | snowflake_cursor.execute(describemodel_response_translator_str) 262 | 263 | create_describemodel_ef_str = ("""create or replace external function %s(modelname varchar) 264 | returns variant 265 | api_integration = \"%s\" 266 | request_translator =%s 267 | response_translator=%s 268 | max_batch_rows=1 269 | as '%s/describemodel';""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DESCRIBE_MODEL"), api_integration_name, get_full_resource_name_with_suffix("AWS_AUTOPILOT_DESCRIBE_MODEL_REQUEST_TRANSLATOR"), get_full_resource_name_with_suffix("AWS_AUTOPILOT_DESCRIBE_MODEL_RESPONSE_TRANSLATOR"), api_gateway_url) 270 | 271 | snowflake_cursor.execute(create_describemodel_ef_str) 272 | 273 | 274 | def create_createendpoint_ef(snowflake_cursor, api_integration_name, api_gateway_url): 275 | logger.info("Creating External function: AWS_AUTOPILOT_CREATE_ENDPOINT [api_integration_name=%s, api_gateway_url=%s]", api_integration_name, api_gateway_url) 276 | 277 | createendpoint_request_translator_str = ("""create or replace function %s(EVENT OBJECT) 278 | returns OBJECT LANGUAGE JAVASCRIPT AS 279 | $$ 280 | let endpointName = EVENT.body.data[0][1]; 281 | let endpointConfigName = EVENT.body.data[0][2]; 282 | let endpointTTL = 7*24*60*60; 283 | 284 | if (EVENT.body.data[0][3] != undefined) { 285 | endpointTTL = EVENT.body.data[0][3]; 286 | } 287 | 288 | let payload = { 289 | \"EndpointName\" : endpointName, 290 | \"EndpointConfigName\" : endpointConfigName, 291 | \"DeletionCondition\": { 292 | \"MaxRuntimeInSeconds\": endpointTTL 293 | } 294 | }; 295 | return {\"body\": payload}; 296 | $$""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_CREATE_ENDPOINT_REQUEST_TRANSLATOR")) 297 | 298 | snowflake_cursor.execute(createendpoint_request_translator_str) 299 | 300 | createendpoint_response_translator_str = ("""create or replace 
function %s(EVENT OBJECT) 301 | returns OBJECT LANGUAGE JAVASCRIPT AS 302 | $$ 303 | return {\"body\": { \"data\" : [[0, EVENT.body]] }} 304 | $$;""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_CREATE_ENDPOINT_RESPONSE_TRANSLATOR")) 305 | 306 | snowflake_cursor.execute(createendpoint_response_translator_str) 307 | 308 | create_createendpoint_ef_str = ("""create or replace external function %s(endpointName varchar, endpointConfigName varchar, endpointTTL integer) 309 | returns variant 310 | api_integration = \"%s\" 311 | request_translator = %s 312 | response_translator=%s 313 | max_batch_rows=1 314 | as '%s/createendpoint';""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_CREATE_ENDPOINT"), api_integration_name, get_full_resource_name_with_suffix("AWS_AUTOPILOT_CREATE_ENDPOINT_REQUEST_TRANSLATOR"), get_full_resource_name_with_suffix("AWS_AUTOPILOT_CREATE_ENDPOINT_RESPONSE_TRANSLATOR"), api_gateway_url) 315 | 316 | snowflake_cursor.execute(create_createendpoint_ef_str) 317 | 318 | 319 | def create_createendpointconfig_ef(snowflake_cursor, api_integration_name, api_gateway_url): 320 | logger.info("Creating External function: AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG [api_integration_name=%s, api_gateway_url=%s]", api_integration_name, api_gateway_url) 321 | 322 | createendpointconfig_request_translator_str = ("""create or replace function %s(EVENT OBJECT) 323 | returns OBJECT LANGUAGE JAVASCRIPT AS 324 | $$ 325 | let endpointConfigName = EVENT.body.data[0][1]; 326 | let modelName = EVENT.body.data[0][2]; 327 | let instanceType = EVENT.body.data[0][3]; 328 | let instanceCount = EVENT.body.data[0][4]; 329 | let payload = { 330 | \"EndpointConfigName\": endpointConfigName, 331 | \"ProductionVariants\" : [ 332 | { 333 | \"InstanceType\": instanceType, 334 | \"ModelName\": modelName + \"-job-best-model\", 335 | \"InitialInstanceCount\": instanceCount, 336 | \"VariantName\" : \"AllTrafficVariant\" 337 | }] 338 | }; 339 | return {\"body\": payload}; 340 | $$""") % 
(add_snowflake_resource_suffix("AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG_REQUEST_TRANSLATOR")) 341 | 342 | snowflake_cursor.execute(createendpointconfig_request_translator_str) 343 | 344 | createendpointconfig_response_translator_str = ("""create or replace function %s(EVENT OBJECT) 345 | returns OBJECT LANGUAGE JAVASCRIPT AS 346 | $$ 347 | return {\"body\": { \"data\" : [[0, EVENT.body]] }}; 348 | $$;""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG_RESPONSE_TRANSLATOR")) 349 | 350 | snowflake_cursor.execute(createendpointconfig_response_translator_str) 351 | 352 | create_createendpointconfig_ef_str = ("""create or replace external function %s(endpointConfigName varchar, modelName varchar, instanceType varchar, instanceCount int) 353 | returns variant 354 | api_integration = \"%s\" 355 | request_translator = %s 356 | response_translator=%s 357 | max_batch_rows=1 358 | as '%s/createendpointconfig';""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG"), api_integration_name, get_full_resource_name_with_suffix("AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG_REQUEST_TRANSLATOR"), get_full_resource_name_with_suffix("AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG_RESPONSE_TRANSLATOR"), api_gateway_url) 359 | 360 | snowflake_cursor.execute(create_createendpointconfig_ef_str) 361 | 362 | def create_describeendpointconfig_ef(snowflake_cursor, api_integration_name, api_gateway_url): 363 | logger.info("Creating External function: AWS_AUTOPILOT_DESCRIBE_ENDPOINT_CONFIG [api_integration_name=%s, api_gateway_url=%s]", api_integration_name, api_gateway_url) 364 | 365 | describeendpointconfig_request_translator_str = ("""create or replace function %s(EVENT OBJECT) 366 | returns OBJECT LANGUAGE JAVASCRIPT AS 367 | $$ 368 | let endpointConfigName = EVENT.body.data[0][1]; 369 | let payload = { 370 | \"EndpointConfigName\": endpointConfigName 371 | }; 372 | return {\"body\": payload}; 373 | $$""") % 
(add_snowflake_resource_suffix("AWS_AUTOPILOT_DESCRIBE_ENDPOINT_CONFIG_REQUEST_TRANSLATOR")) 374 | 375 | snowflake_cursor.execute(describeendpointconfig_request_translator_str) 376 | 377 | describeendpointconfig_response_translator_str = ("""create or replace function %s(EVENT OBJECT) 378 | returns OBJECT LANGUAGE JAVASCRIPT AS 379 | $$ 380 | return {\"body\": { \"data\" : [[0, EVENT.body]] }}; 381 | $$;""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DESCRIBE_ENDPOINT_CONFIG_RESPONSE_TRANSLATOR")) 382 | 383 | snowflake_cursor.execute(describeendpointconfig_response_translator_str) 384 | 385 | create_describeendpointconfig_ef_str = ("""create or replace external function %s(endpointConfigName varchar) 386 | returns variant 387 | api_integration = \"%s\" 388 | request_translator = %s 389 | response_translator=%s 390 | max_batch_rows=1 391 | as '%s/describeendpointconfig';""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DESCRIBE_ENDPOINT_CONFIG"), api_integration_name, get_full_resource_name_with_suffix("AWS_AUTOPILOT_DESCRIBE_ENDPOINT_CONFIG_REQUEST_TRANSLATOR"), get_full_resource_name_with_suffix("AWS_AUTOPILOT_DESCRIBE_ENDPOINT_CONFIG_RESPONSE_TRANSLATOR"), api_gateway_url) 392 | 393 | snowflake_cursor.execute(create_describeendpointconfig_ef_str) 394 | 395 | def create_deleteendpointconfig_ef(snowflake_cursor, api_integration_name, api_gateway_url): 396 | logger.info("Creating External function: AWS_AUTOPILOT_DELETE_ENDPOINT_CONFIG [api_integration_name=%s, api_gateway_url=%s]", api_integration_name, api_gateway_url) 397 | 398 | deleteendpointconfig_request_translator_str = ("""create or replace function %s(EVENT OBJECT) 399 | returns OBJECT LANGUAGE JAVASCRIPT AS 400 | $$ 401 | let endpointConfigName = EVENT.body.data[0][1]; 402 | let payload = { 403 | \"EndpointConfigName\": endpointConfigName 404 | }; 405 | return {\"body\": payload}; 406 | $$""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DELETE_ENDPOINT_CONFIG_REQUEST_TRANSLATOR")) 407 | 408 | 
snowflake_cursor.execute(deleteendpointconfig_request_translator_str) 409 | 410 | deleteendpointconfig_response_translator_str = ("""create or replace function %s(EVENT OBJECT) 411 | returns OBJECT LANGUAGE JAVASCRIPT AS 412 | $$ 413 | return {\"body\": { \"data\" : [[0, EVENT.body]] }}; 414 | $$;""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DELETE_ENDPOINT_CONFIG_RESPONSE_TRANSLATOR")) 415 | 416 | snowflake_cursor.execute(deleteendpointconfig_response_translator_str) 417 | 418 | create_deleteendpointconfig_ef_str = ("""create or replace external function %s(endpointConfigName varchar) 419 | returns variant 420 | api_integration = \"%s\" 421 | request_translator = %s 422 | response_translator=%s 423 | max_batch_rows=1 424 | as '%s/deleteendpointconfig';""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DELETE_ENDPOINT_CONFIG"), api_integration_name, get_full_resource_name_with_suffix("AWS_AUTOPILOT_DELETE_ENDPOINT_CONFIG_REQUEST_TRANSLATOR"), get_full_resource_name_with_suffix("AWS_AUTOPILOT_DELETE_ENDPOINT_CONFIG_RESPONSE_TRANSLATOR"), api_gateway_url) 425 | 426 | snowflake_cursor.execute(create_deleteendpointconfig_ef_str) 427 | 428 | def create_describeendpoint_ef(snowflake_cursor, api_integration_name, api_gateway_url): 429 | logger.info("Creating External function: AWS_AUTOPILOT_DESCRIBE_ENDPOINT [api_integration_name=%s, api_gateway_url=%s]", api_integration_name, api_gateway_url) 430 | 431 | describeendpoint_request_translator_str = ("""create or replace function %s(EVENT OBJECT) 432 | returns OBJECT LANGUAGE JAVASCRIPT AS 433 | $$ 434 | let endpointName = EVENT.body.data[0][1]; 435 | let payload = { 436 | \"EndpointName\" : endpointName 437 | }; 438 | return {\"body\": JSON.stringify(payload)}; 439 | $$""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DESCRIBE_ENDPOINT_REQUEST_TRANSLATOR")) 440 | 441 | snowflake_cursor.execute(describeendpoint_request_translator_str) 442 | 443 | describeendpoint_response_translator_str = ("""create or replace 
function %s(EVENT OBJECT) 444 | returns OBJECT LANGUAGE JAVASCRIPT AS 445 | $$ 446 | return {\"body\": { \"data\" : [[0, EVENT.body]] }} 447 | $$;""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DESCRIBE_ENDPOINT_RESPONSE_TRANSLATOR")) 448 | 449 | snowflake_cursor.execute(describeendpoint_response_translator_str) 450 | 451 | create_describeendpoint_ef_str = ("""create or replace external function %s(endpointName varchar) 452 | returns variant 453 | api_integration = \"%s\" 454 | request_translator = %s 455 | response_translator=%s 456 | max_batch_rows=1 457 | as '%s/describeendpoint';""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DESCRIBE_ENDPOINT"), api_integration_name, get_full_resource_name_with_suffix("AWS_AUTOPILOT_DESCRIBE_ENDPOINT_REQUEST_TRANSLATOR"), get_full_resource_name_with_suffix("AWS_AUTOPILOT_DESCRIBE_ENDPOINT_RESPONSE_TRANSLATOR"), api_gateway_url) 458 | 459 | snowflake_cursor.execute(create_describeendpoint_ef_str) 460 | 461 | 462 | def create_deleteendpoint_ef(snowflake_cursor, api_integration_name, api_gateway_url): 463 | logger.info("Creating External function: AWS_AUTOPILOT_DELETE_ENDPOINT [api_integration_name=%s, api_gateway_url=%s]", api_integration_name, api_gateway_url) 464 | 465 | deleteendpoint_request_translator_str = ("""create or replace function %s(EVENT OBJECT) 466 | returns OBJECT LANGUAGE JAVASCRIPT AS 467 | $$ 468 | let endpointName = EVENT.body.data[0][1]; 469 | let payload = { 470 | \"EndpointName\" : endpointName 471 | }; 472 | return {\"body\": JSON.stringify(payload)}; 473 | $$""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DELETE_ENDPOINT_REQUEST_TRANSLATOR")) 474 | 475 | snowflake_cursor.execute(deleteendpoint_request_translator_str) 476 | 477 | deleteendpoint_response_translator_str = ("""create or replace function %s(EVENT OBJECT) 478 | returns OBJECT LANGUAGE JAVASCRIPT AS 479 | $$ 480 | return {\"body\": { \"data\" : [[0, EVENT.body]] }} 481 | $$;""") % 
(add_snowflake_resource_suffix("AWS_AUTOPILOT_DELETE_ENDPOINT_RESPONSE_TRANSLATOR")) 482 | 483 | snowflake_cursor.execute(deleteendpoint_response_translator_str) 484 | 485 | create_deleteendpoint_ef_str = ("""create or replace external function %s(endpointName varchar) 486 | returns variant 487 | api_integration = \"%s\" 488 | request_translator = %s 489 | response_translator=%s 490 | max_batch_rows=1 491 | as '%s/deleteendpoint';""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_DELETE_ENDPOINT"), api_integration_name, get_full_resource_name_with_suffix("AWS_AUTOPILOT_DELETE_ENDPOINT_REQUEST_TRANSLATOR"), get_full_resource_name_with_suffix("AWS_AUTOPILOT_DELETE_ENDPOINT_RESPONSE_TRANSLATOR"), api_gateway_url) 492 | 493 | snowflake_cursor.execute(create_deleteendpoint_ef_str) 494 | 495 | 496 | def create_predictoutcome_ef(snowflake_cursor, api_integration_name, api_gateway_url): 497 | logger.info("Creating External function: AWS_AUTOPILOT_PREDICT_OUTCOME [api_integration_name=%s, api_gateway_url=%s]", api_integration_name, api_gateway_url) 498 | 499 | predictoutcome_request_translator_str = ("""create or replace function %s(EVENT OBJECT) 500 | returns OBJECT LANGUAGE JAVASCRIPT AS 501 | $$ 502 | let endpointName = \"/\" + encodeURIComponent(EVENT.body.data[0][1]); 503 | var payload = []; 504 | for(i = 0; i < EVENT.body.data.length; i++) { 505 | var row = EVENT.body.data[i]; 506 | payload[i] = row[2]; 507 | } 508 | payloadBody = payload.map(e => e.join(',')).join('\\n'); 509 | return {\"body\": payloadBody, \"urlSuffix\" : endpointName}; 510 | $$""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_PREDICT_OUTCOME_REQUEST_TRANSLATOR")) 511 | 512 | snowflake_cursor.execute(predictoutcome_request_translator_str) 513 | 514 | predictoutcome_response_translator_str = ("""create or replace function %s(EVENT OBJECT) 515 | returns OBJECT LANGUAGE JAVASCRIPT AS 516 | $$ 517 | let array_of_rows_to_return = []; 518 | let rows = EVENT.body.predictions; 519 | for (let i = 0; i 
< rows.length; i++) { 520 | let row_to_return = [i, rows[i]]; 521 | array_of_rows_to_return.push(row_to_return); 522 | } 523 | return {\"body\": {\"data\": array_of_rows_to_return}}; 524 | $$;""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_PREDICT_OUTCOME_RESPONSE_TRANSLATOR")) 525 | 526 | snowflake_cursor.execute(predictoutcome_response_translator_str) 527 | 528 | create_predictoutcome_ef_str = ("""create or replace external function %s(endpointName varchar, columns array) 529 | returns variant 530 | api_integration = \"%s\" 531 | request_translator = %s 532 | response_translator=%s 533 | max_batch_rows=100 534 | as '%s/predictoutcome';""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_PREDICT_OUTCOME"), api_integration_name, get_full_resource_name_with_suffix("AWS_AUTOPILOT_PREDICT_OUTCOME_REQUEST_TRANSLATOR"), get_full_resource_name_with_suffix("AWS_AUTOPILOT_PREDICT_OUTCOME_RESPONSE_TRANSLATOR"), api_gateway_url) 535 | 536 | snowflake_cursor.execute(create_predictoutcome_ef_str) 537 | 538 | 539 | def create_createmodel_ef(snowflake_cursor, api_integration_name, api_gateway_url, secret_arn, s3_bucket_name, 540 | storage_integration_name, auto_ml_role_arn, snowflake_role_name, 541 | kms_key_arn, vpc_security_group_ids, vpc_subnet_ids): 542 | logger.info( 543 | "Creating External function: AWS_AUTOPILOT_CREATE_MODEL [api_integration_name=%s, api_gateway_url=%s, secret_arn=%s, s3_bucket_name=%s, storage_integration_name=%s, auto_ml_role_arn=%s, snowflake_role_name=%s, kms_key_arn=%s, vpc_security_group_ids=%s, vpc_subnet_ids=%s]", 544 | api_integration_name, api_gateway_url, secret_arn, s3_bucket_name, storage_integration_name, auto_ml_role_arn, 545 | snowflake_role_name, kms_key_arn, vpc_security_group_ids, vpc_subnet_ids) 546 | 547 | vpc_security_group_ids_with_quotes = add_quotes_to_comma_delimited_list_items(vpc_security_group_ids) 548 | vpc_subnet_ids_with_quotes = add_quotes_to_comma_delimited_list_items(vpc_subnet_ids) 549 | 
logger.info("vpc_security_group_ids_with_quotes = %s, vpc_subnet_ids_with_quotes = %s", vpc_security_group_ids_with_quotes, vpc_subnet_ids_with_quotes) 550 | 551 | createmodel_request_translator_str = ("""create or replace function %s(EVENT OBJECT) 552 | returns OBJECT LANGUAGE JAVASCRIPT AS 553 | $$ 554 | let modelname = EVENT.body.data[0][1]; 555 | let targetTable = EVENT.body.data[0][2]; 556 | let targetCol = EVENT.body.data[0][3]; 557 | let maxRunningTime = 24*60*60; 558 | let deployModel = true; 559 | let modelEndpointTTL = 7*24*60*60; 560 | let problemType; 561 | let objectiveMetric; 562 | let maxCandidates; 563 | 564 | if (EVENT.body.data[0].length == 10) { 565 | if (EVENT.body.data[0][4] != undefined) { 566 | objectiveMetric = EVENT.body.data[0][4]; 567 | } 568 | 569 | if (EVENT.body.data[0][5] != undefined) { 570 | problemType = EVENT.body.data[0][5]; 571 | } 572 | 573 | if (EVENT.body.data[0][6] != undefined) { 574 | maxCandidates = EVENT.body.data[0][6]; 575 | } 576 | 577 | if (EVENT.body.data[0][7] != undefined) { 578 | maxRunningTime = EVENT.body.data[0][7]; 579 | } 580 | 581 | if (EVENT.body.data[0][8] != undefined) { 582 | deployModel = EVENT.body.data[0][8]; 583 | } 584 | 585 | if (EVENT.body.data[0][9] != undefined) { 586 | modelEndpointTTL = EVENT.body.data[0][9]; 587 | } 588 | } 589 | 590 | let contextHeaders = EVENT.contextHeaders; 591 | let jobDatasetsPath = modelname + \"-job/datasets/\" ; 592 | let databaseName = contextHeaders[\"sf-context-current-database\"]; 593 | let schemaName = contextHeaders[\"sf-context-current-schema\"]; 594 | let tableNameComponents = targetTable.split(\".\"); 595 | let s3OutputUri = \"s3://%s/output/\"; 596 | let kmsKeyArn = \"%s\"; 597 | let vpcSecurityGroupIds = [%s]; 598 | let vpcSubnetIds = [%s]; 599 | if (tableNameComponents.length === 3) 600 | { 601 | databaseName = tableNameComponents[0]; 602 | schemaName = tableNameComponents[1]; 603 | targetTable = tableNameComponents[2]; 604 | 605 | } else if 
(tableNameComponents.length === 2) 606 | { 607 | schemaName = tableNameComponents[0]; 608 | targetTable = tableNameComponents[1]; 609 | } 610 | 611 | let payload = { 612 | \"AutoMLJobConfig\": { 613 | \"CompletionCriteria\": { 614 | \"MaxAutoMLJobRuntimeInSeconds\": maxRunningTime 615 | } 616 | }, 617 | \"AutoMLJobName\": modelname + \"-job\", 618 | \"InputDataConfig\": [ 619 | { 620 | \"TargetAttributeName\": targetCol.toUpperCase(), 621 | \"AutoMLDatasetDefinition\": { 622 | \"AutoMLSnowflakeDatasetDefinition\": { 623 | \"Warehouse\": contextHeaders[\"sf-context-current-warehouse\"], 624 | \"Database\": databaseName, 625 | \"Schema\": schemaName, 626 | \"TableName\": targetTable, 627 | \"SnowflakeRole\": \"%s\", 628 | \"SecretArn\": \"%s\", 629 | \"OutputS3Uri\": s3OutputUri + jobDatasetsPath, 630 | \"StorageIntegration\": \"%s\" 631 | } 632 | } 633 | } 634 | ], 635 | \"OutputDataConfig\": { 636 | \"S3OutputPath\": s3OutputUri 637 | }, 638 | \"RoleArn\": \"%s\" 639 | }; 640 | 641 | if (objectiveMetric) { 642 | payload[\"AutoMLJobObjective\"] = { 643 | \"MetricName\": objectiveMetric 644 | }; 645 | } 646 | if (problemType) { 647 | payload[\"ProblemType\"] = problemType; 648 | } 649 | if (kmsKeyArn) { 650 | payload[\"OutputDataConfig\"][\"KmsKeyId\"] = kmsKeyArn; 651 | payload[\"InputDataConfig\"][\"AutoMLSnowflakeDatasetDefinition\"] = { 652 | \"KmsKeyId\" : kmsKeyArn 653 | }; 654 | payload[\"AutoMLJobConfig\"][\"SecurityConfig\"] = payload[\"AutoMLJobConfig\"][\"SecurityConfig\"] || {}; 655 | payload[\"AutoMLJobConfig\"][\"SecurityConfig\"] = { 656 | \"VolumeKmsKeyId\": kmsKeyArn, 657 | \"EnableInterContainerTrafficEncryption\": true 658 | }; 659 | } 660 | if (vpcSecurityGroupIds.length > 0) { 661 | payload[\"AutoMLJobConfig\"][\"SecurityConfig\"] = payload[\"AutoMLJobConfig\"][\"SecurityConfig\"] || {}; 662 | payload[\"AutoMLJobConfig\"][\"SecurityConfig\"][\"VpcConfig\"] = payload[\"AutoMLJobConfig\"][\"SecurityConfig\"][\"VpcConfig\"] || {}; 663 | 
payload[\"AutoMLJobConfig\"][\"SecurityConfig\"][\"VpcConfig\"][\"SecurityGroupIds\"] = vpcSecurityGroupIds; 664 | } 665 | if (vpcSubnetIds.length > 0) { 666 | payload[\"AutoMLJobConfig\"][\"SecurityConfig\"] = payload[\"AutoMLJobConfig\"][\"SecurityConfig\"] || {}; 667 | payload[\"AutoMLJobConfig\"][\"SecurityConfig\"][\"VpcConfig\"] = payload[\"AutoMLJobConfig\"][\"SecurityConfig\"][\"VpcConfig\"] || {}; 668 | payload[\"AutoMLJobConfig\"][\"SecurityConfig\"][\"VpcConfig\"][\"Subnets\"] = vpcSubnetIds; 669 | } 670 | 671 | if(deployModel) { 672 | payload[\"ModelDeployConfig\"] ={ 673 | \"ModelDeployMode\": \"Endpoint\", 674 | \"EndpointConfigDefinitions\": [ 675 | { 676 | \"EndpointConfigName\": modelname + \"-m5-4xl-2\", 677 | \"InitialInstanceCount\": 2, 678 | \"InstanceType\": \"ml.m5.4xlarge\" 679 | } 680 | ], 681 | \"EndpointDefinitions\": [ 682 | { 683 | \"EndpointName\": modelname, 684 | \"EndpointConfigName\": modelname + \"-m5-4xl-2\", 685 | \"DeletionCondition\": { 686 | \"MaxRuntimeInSeconds\": modelEndpointTTL 687 | } 688 | } 689 | ] 690 | }; 691 | } 692 | 693 | if (maxCandidates) { 694 | payload[\"AutoMLJobConfig\"][\"CompletionCriteria\"][\"MaxCandidates\"] = maxCandidates; 695 | } 696 | 697 | return {\"body\": JSON.stringify(payload)}; 698 | $$;""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_CREATE_MODEL_REQUEST_TRANSLATOR"), s3_bucket_name, kms_key_arn, vpc_security_group_ids_with_quotes, vpc_subnet_ids_with_quotes, snowflake_role_name, secret_arn, storage_integration_name, auto_ml_role_arn) 699 | 700 | snowflake_cursor.execute(createmodel_request_translator_str) 701 | 702 | createmodel_response_translator_str = ("""create or replace function %s(EVENT OBJECT) 703 | returns OBJECT LANGUAGE JAVASCRIPT AS 704 | $$ 705 | let arn = EVENT.body.AutoMLJobArn; 706 | let message = \"Model creation in progress. 
Job ARN = \" + arn; 707 | return {\"body\": { \"data\" : [[0, message]] }} 708 | $$;""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_CREATE_MODEL_RESPONSE_TRANSLATOR")) 709 | 710 | snowflake_cursor.execute(createmodel_response_translator_str) 711 | 712 | create_createmodel_ef_str = ("""create or replace external function %s(modelname varchar, targettable varchar, targetcol varchar) 713 | returns variant 714 | api_integration = \"%s\" 715 | context_headers = (CURRENT_DATABASE, CURRENT_SCHEMA, CURRENT_WAREHOUSE) 716 | request_translator = %s 717 | response_translator=%s 718 | max_batch_rows=1 719 | as '%s/createmodel';""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_CREATE_MODEL"), api_integration_name, get_full_resource_name_with_suffix("AWS_AUTOPILOT_CREATE_MODEL_REQUEST_TRANSLATOR"), get_full_resource_name_with_suffix("AWS_AUTOPILOT_CREATE_MODEL_RESPONSE_TRANSLATOR"), api_gateway_url) 720 | 721 | snowflake_cursor.execute(create_createmodel_ef_str) 722 | 723 | create_createmodel_ef_str2 = ("""create or replace external function %s(modelname varchar, targettable varchar, 724 | targetcol varchar, objectiveMetric varchar, problemType varchar, maxCandidates integer, maxRunningTime integer, deployModel boolean, modelEndpointTTL integer) 725 | returns variant 726 | api_integration = \"%s\" 727 | context_headers = (CURRENT_DATABASE, CURRENT_SCHEMA, CURRENT_WAREHOUSE) 728 | request_translator = %s 729 | response_translator=%s 730 | max_batch_rows=1 731 | as '%s/createmodel';""") % (add_snowflake_resource_suffix("AWS_AUTOPILOT_CREATE_MODEL"), api_integration_name, get_full_resource_name_with_suffix("AWS_AUTOPILOT_CREATE_MODEL_REQUEST_TRANSLATOR"), get_full_resource_name_with_suffix("AWS_AUTOPILOT_CREATE_MODEL_RESPONSE_TRANSLATOR"), api_gateway_url) 732 | 733 | snowflake_cursor.execute(create_createmodel_ef_str2) 734 | 735 | 736 | def get_storage_integration_info_for_policy(snowflake_cursor, storage_integration_name): 737 | logger.info("Describing Storage 
Integration") 738 | storage_user_arn = '' 739 | storage_external_id = '' 740 | 741 | snowflake_cursor.execute(("describe integration \"%s\"") % (storage_integration_name)) 742 | rows = snowflake_cursor.fetchall() 743 | for row in rows: 744 | value = list(row) 745 | if (value[0] == "STORAGE_AWS_IAM_USER_ARN"): 746 | storage_user_arn = value[2] 747 | if (value[0] == "STORAGE_AWS_EXTERNAL_ID"): 748 | storage_external_id = value[2] 749 | return { 750 | SERVICE: "sagemaker.amazonaws.com", 751 | USER_ARN: storage_user_arn, 752 | EXTERNAL_ID: storage_external_id 753 | } 754 | 755 | def get_api_integration_info_for_policy(snowflake_cursor, api_integration_name): 756 | logger.info("Describing API Integration") 757 | api_user_arn = '' 758 | api_external_id = '' 759 | 760 | snowflake_cursor.execute(("describe integration \"%s\"") % (api_integration_name)) 761 | rows = snowflake_cursor.fetchall() 762 | for row in rows: 763 | value = list(row) 764 | if (value[0] == "API_AWS_IAM_USER_ARN"): 765 | api_user_arn = value[2] 766 | if (value[0] == "API_AWS_EXTERNAL_ID"): 767 | api_external_id = value[2] 768 | return { 769 | SERVICE: "apigateway.amazonaws.com", 770 | USER_ARN: api_user_arn, 771 | EXTERNAL_ID: api_external_id 772 | } 773 | 774 | def create_policy_string(integration_info): 775 | policy_json = { 776 | "Version": "2012-10-17", 777 | "Statement":[ 778 | { 779 | "Effect": "Allow", 780 | "Principal": {"Service":[integration_info[SERVICE]]}, 781 | "Action": "sts:AssumeRole" 782 | }, 783 | { 784 | "Effect": "Allow", 785 | "Principal": { 786 | "AWS":[integration_info[USER_ARN]] 787 | }, 788 | "Action": "sts:AssumeRole", 789 | "Condition": { 790 | "StringEquals": { 791 | "sts:ExternalId": integration_info[EXTERNAL_ID] 792 | } 793 | } 794 | } 795 | ] 796 | } 797 | return json.dumps(policy_json) 798 | 799 | def update_assume_role_policy(policy_str, role_name): 800 | logger.info('Updating assume role policy for role: ' + role_name) 801 | logger.info('Policy used: ' + 
policy_str) 802 | iam = boto3.client('iam') 803 | iam.update_assume_role_policy( 804 | PolicyDocument=policy_str, 805 | RoleName=role_name 806 | ) 807 | 808 | def add_quotes_to_comma_delimited_list_items(comma_delimited_list: str): 809 | if comma_delimited_list: 810 | items = comma_delimited_list.replace(" ", "").split(",") 811 | comma_delimited_list_with_quotes = ', '.join('"' + item + '"' for item in items) 812 | else: 813 | comma_delimited_list_with_quotes = '' 814 | return comma_delimited_list_with_quotes 815 | 816 | def add_snowflake_resource_suffix(resource_name: str): 817 | suffix = os.environ['SnowflakeResourceSuffix'] 818 | 819 | if suffix and suffix.strip() : 820 | return resource_name + "_" + suffix 821 | 822 | return resource_name 823 | 824 | 825 | def get_full_resource_name_with_suffix(resource_name: str): 826 | database_name = os.environ['DatabaseName'] 827 | schema_name = os.environ['SchemaName'] 828 | resource_name_with_suffix = add_snowflake_resource_suffix(resource_name) 829 | return database_name + "." + schema_name + "." + resource_name_with_suffix 830 | -------------------------------------------------------------------------------- /customer-stack/customer-stack.yml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: "2010-09-09" 2 | Parameters: 3 | s3BucketName: 4 | Type: String 5 | Description: "Name of the S3 bucket to be created" 6 | MinLength: 1 7 | snowflakeSecretArn: 8 | Type: String 9 | Description: "ARN of the AWS Secret containing the Snowflake login information" 10 | MinLength: 1 11 | kmsKeyArn: 12 | Type: String 13 | AllowedPattern: "^(arn:aws[a-z-]*:kms:[a-z0-9-]*:[0-9]{12}:key\\/.+)?$" 14 | Default: "" 15 | Description: "(Optional) ARN of the AWS Key Management Service key that Amazon SageMaker uses to encrypt job outputs. The KmsKeyId is applied to all outputs." 
16 | vpcSecurityGroupIds: 17 | Type: "String" 18 | Default: "" 19 | Description: "(Optional) Comma-delimited list of security group IDs for VPC configuration" 20 | AllowedPattern: "^(sg\\-[a-zA-Z0-9]+(\\,\\s*sg\\-[a-zA-Z0-9]+)*)?$" 21 | vpcSubnetIds: 22 | Type: "String" 23 | Default: "" 24 | Description: "(Optional) Comma-delimited list of subnet IDs for VPC configuration" 25 | AllowedPattern: "^(subnet\\-[a-zA-Z0-9]+(\\,\\s*subnet\\-[a-zA-Z0-9]+)*)?$" 26 | snowflakeRole: 27 | Type: String 28 | Description: "Snowflake Role with permissions to create Storage Integrations, API Integrations and Functions" 29 | Default: "ACCOUNTADMIN" 30 | MinLength: 1 31 | snowflakeDatabaseName: 32 | Type: String 33 | Description: "Snowflake Database in which external functions will be created" 34 | MinLength: 1 35 | snowflakeSchemaName: 36 | Type: String 37 | Description: "Snowflake Database Schema in which external functions will be created" 38 | MinLength: 1 39 | apiGatewayName: 40 | Type: "String" 41 | AllowedPattern: "^[a-zA-Z0-9]+[-a-zA-Z0-9-]+[-a-zA-Z0-9]+$" 42 | Default: "snowflake-autopilot-api" 43 | Description: "API Gateway name" 44 | apiGatewayStageName: 45 | Type: "String" 46 | AllowedPattern: "^[-a-zA-Z0-9]+$" 47 | Default: "main" 48 | Description: "API deployment stage" 49 | MinLength: 1 50 | apiGatewayType: 51 | Type: "String" 52 | Default: "REGIONAL" 53 | AllowedValues: 54 | - "REGIONAL" 55 | - "PRIVATE" 56 | Description: "API Gateway type to create" 57 | snowflakeResourceSuffix: 58 | Type: String 59 | Description: "(Optional) Suffix for resources created in Snowflake. This suffix will be added to all function names created in the database schema." 60 | Default: "" 61 | snowflakeVpcId: 62 | Type: "String" 63 | Default: "" 64 | Description: "Snowflake VPC that has access to private API Gateway. 
Used only when creating a private API Gateway" 65 | AllowedPattern: "^(vpc\\-[a-zA-Z0-9]+)?$" 66 | Mappings: 67 | Package: 68 | Attributes: 69 | Identifier: "'SagemakerProxy/1.0'" 70 | Locations: 71 | CodeBucket: "sagemaker-sample-files" 72 | PathToLayerCode: "libraries/snowflake-connector-python-1.0.zip" 73 | PathToLambdaCode: "libraries/create-resources-1.0.zip" 74 | Conditions: 75 | KMSKeyArnProvided: !Not 76 | - !Equals 77 | - !Ref kmsKeyArn 78 | - "" 79 | shouldCreateRegionalGateway: 80 | !Equals [!Ref apiGatewayType, "REGIONAL"] 81 | isVPCConfigNotPresent: !Or 82 | - !Equals [!Ref "vpcSubnetIds", ""] 83 | - !Equals [!Ref "vpcSecurityGroupIds", ""] 84 | Metadata: 85 | AWS::CloudFormation::Interface: 86 | ParameterGroups: 87 | - 88 | Label: "" 89 | Parameters: 90 | - apiGatewayName 91 | - apiGatewayStageName 92 | - s3BucketName 93 | - kmsKeyArn 94 | - snowflakeDatabaseName 95 | - snowflakeSchemaName 96 | - snowflakeResourceSuffix 97 | - snowflakeRole 98 | - snowflakeSecretArn 99 | Resources: 100 | S3Bucket: 101 | Type: 'AWS::S3::Bucket' 102 | DeletionPolicy: Delete 103 | Properties: 104 | BucketName: !Ref s3BucketName 105 | SnowflakeAutoMLExecutionRole: 106 | Type: 'AWS::IAM::Role' 107 | Properties: 108 | Description: IAM Role used to execute the AutoML jobs from Snowflake 109 | AssumeRolePolicyDocument: 110 | Version: '2012-10-17' 111 | Statement: 112 | - Effect: Allow 113 | Principal: 114 | Service: 115 | - sagemaker.amazonaws.com 116 | Action: 117 | - 'sts:AssumeRole' 118 | Path: / 119 | ManagedPolicyArns: 120 | - !Sub 'arn:${AWS::Partition}:iam::aws:policy/AmazonSageMakerFullAccess' 121 | Policies: 122 | - PolicyName: s3-permissions 123 | PolicyDocument: 124 | Version: 2012-10-17 125 | Statement: 126 | - Effect: Allow 127 | Action: 128 | - 's3:GetObject' 129 | - 's3:PutObject' 130 | - 's3:DeleteObject' 131 | - 's3:ListBucket' 132 | Resource: 133 | - !Join 134 | - '' 135 | - - !GetAtt S3Bucket.Arn 136 | - '/*' 137 | - PolicyName: kms-permissions 138 | 
PolicyDocument: 139 | Version: 2012-10-17 140 | Statement: 141 | - Effect: Allow 142 | Action: 143 | - 'kms:CreateGrant' 144 | - "kms:Decrypt" 145 | - "kms:DescribeKey" 146 | - "kms:Encrypt" 147 | - "kms:GenerateDataKey*" 148 | - "kms:ReEncrypt*" 149 | Resource: 150 | - !Join 151 | - ":" 152 | - - "arn" 153 | - !Ref AWS::Partition 154 | - "kms" 155 | - !Ref AWS::Region 156 | - !Ref AWS::AccountId 157 | - "alias/aws/secretsmanager" 158 | - !If 159 | - KMSKeyArnProvided 160 | - !Ref kmsKeyArn 161 | - !Ref AWS::NoValue 162 | - PolicyName: secrets-permissions 163 | PolicyDocument: 164 | Version: 2012-10-17 165 | Statement: 166 | - Effect: Allow 167 | Action: 168 | - 'secretsmanager:GetSecretValue' 169 | Resource: !Ref snowflakeSecretArn 170 | SnowflakeAPIGatewayExecutionRole: 171 | Type: 'AWS::IAM::Role' 172 | Properties: 173 | Description: IAM Role used to call SageMaker from API Gateway for Snowflake 174 | AssumeRolePolicyDocument: 175 | Version: '2012-10-17' 176 | Statement: 177 | - Effect: Allow 178 | Principal: 179 | Service: 180 | - apigateway.amazonaws.com 181 | Action: 182 | - 'sts:AssumeRole' 183 | Path: / 184 | Policies: 185 | - PolicyName: root 186 | PolicyDocument: 187 | Version: 2012-10-17 188 | Statement: 189 | - Effect: Allow 190 | Action: 191 | - 'sagemaker:CreateAutoMLJob' 192 | - 'sagemaker:DescribeAutoMLJob' 193 | - 'sagemaker:CreateEndpointConfig' 194 | - 'sagemaker:DescribeEndpointConfig' 195 | - 'sagemaker:DeleteEndpointConfig' 196 | - 'sagemaker:CreateEndpoint' 197 | - 'sagemaker:DescribeEndpoint' 198 | - 'sagemaker:InvokeEndpoint' 199 | - 'sagemaker:DeleteEndpoint' 200 | Resource: '*' 201 | - PolicyName: passRoleToExecute 202 | PolicyDocument: 203 | Version: 2012-10-17 204 | Statement: 205 | - Effect: Allow 206 | Action: 207 | - 'iam:PassRole' 208 | Resource: !GetAtt "SnowflakeAutoMLExecutionRole.Arn" 209 | - PolicyName: kms-permissions 210 | PolicyDocument: 211 | Version: 2012-10-17 212 | Statement: 213 | - Effect: Allow 214 | Action: 215 | - 
'kms:CreateGrant' 216 | - "kms:Decrypt" 217 | - "kms:DescribeKey" 218 | - "kms:Encrypt" 219 | - "kms:GenerateDataKey*" 220 | - "kms:ReEncrypt*" 221 | Resource: 222 | - !Join 223 | - ":" 224 | - - "arn" 225 | - !Ref AWS::Partition 226 | - "kms" 227 | - !Ref AWS::Region 228 | - !Ref AWS::AccountId 229 | - "alias/aws/secretsmanager" 230 | - !If 231 | - KMSKeyArnProvided 232 | - !Ref kmsKeyArn 233 | - !Ref AWS::NoValue 234 | CopyZipsRole: 235 | Type: AWS::IAM::Role 236 | Properties: 237 | Description: IAM Role used to copy Snowflake libraries from the shared repository 238 | AssumeRolePolicyDocument: 239 | Version: '2012-10-17' 240 | Statement: 241 | - Effect: Allow 242 | Principal: 243 | Service: 244 | - lambda.amazonaws.com 245 | Action: 246 | - sts:AssumeRole 247 | Path: '/' 248 | ManagedPolicyArns: 249 | - !Sub 'arn:${AWS::Partition}:iam::aws:policy/CloudWatchLogsFullAccess' 250 | - !Sub 'arn:${AWS::Partition}:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole' 251 | Policies: 252 | - PolicyName: s3-dest-permissions 253 | PolicyDocument: 254 | Version: 2012-10-17 255 | Statement: 256 | - Effect: Allow 257 | Action: 258 | - 's3:PutObject' 259 | - 's3:DeleteObject' 260 | Resource: 261 | - !Join 262 | - '' 263 | - - !GetAtt S3Bucket.Arn 264 | - '/*' 265 | - PolicyName: s3-src-permissions 266 | PolicyDocument: 267 | Version: 2012-10-17 268 | Statement: 269 | - Effect: Allow 270 | Action: 271 | - 's3:GetObject' 272 | - 's3:ListBucket' 273 | Resource: '*' 274 | CreateSnowflakeResourcesExecutionRole: 275 | Type: AWS::IAM::Role 276 | Properties: 277 | Description: IAM Role used to create Snowflake resources from the CloudFormation template 278 | AssumeRolePolicyDocument: 279 | Version: '2012-10-17' 280 | Statement: 281 | - Effect: Allow 282 | Principal: 283 | Service: 284 | - lambda.amazonaws.com 285 | Action: 286 | - sts:AssumeRole 287 | Path: '/' 288 | ManagedPolicyArns: 289 | - !Sub 'arn:${AWS::Partition}:iam::aws:policy/CloudWatchLogsFullAccess' 290 | - !Sub 
'arn:${AWS::Partition}:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole' 291 | Policies: 292 | - PolicyName: secrets-permissions 293 | PolicyDocument: 294 | Version: 2012-10-17 295 | Statement: 296 | - Effect: Allow 297 | Action: 298 | - 'secretsmanager:GetSecretValue' 299 | Resource: !Ref snowflakeSecretArn 300 | - PolicyName: update-iam-role 301 | PolicyDocument: 302 | Version: 2012-10-17 303 | Statement: 304 | - Effect: Allow 305 | Action: 306 | - 'iam:UpdateAssumeRolePolicy' 307 | Resource: 308 | - !GetAtt SnowflakeAPIGatewayExecutionRole.Arn 309 | - !GetAtt SnowflakeAutoMLExecutionRole.Arn 310 | SnowflakeApiGateway: 311 | Type: "AWS::ApiGateway::RestApi" 312 | DependsOn: SnowflakeAPIGatewayExecutionRole 313 | Properties: 314 | Name: !Ref apiGatewayName 315 | Description: "Snowflake external functions Gateway" 316 | Policy: !Sub 317 | - '{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "AWS": "arn:${AWS::Partition}:sts::${AWS::AccountId}:assumed-role/${SnowflakeAPIGatewayExecutionRole}/snowflake" }, "Action": "execute-api:Invoke", "Resource": "${resourceArn}", "Condition": { ${vpcCondition} } }]}' 318 | - resourceArn: !Join [ "", [ "execute-api:/", "*" ] ] 319 | vpcCondition: !If 320 | - shouldCreateRegionalGateway 321 | - "" 322 | - !Sub '"StringEquals": { "aws:sourceVpc": "${snowflakeVpcId}"}' 323 | EndpointConfiguration: 324 | Types: 325 | - !Ref apiGatewayType 326 | SnowflakeApiGatewayDeployment: 327 | Type: "AWS::ApiGateway::Deployment" 328 | DependsOn: 329 | - "CreateModelPostMethod" 330 | - "PredictOutcomePostMethod" 331 | - "DeleteEndpointPostMethod" 332 | - "CreateEndpointPostMethod" 333 | - "DescribeModelPostMethod" 334 | - "DescribeEndpointPostMethod" 335 | - "CreateEndpointConfigPostMethod" 336 | - "DescribeEndpointConfigPostMethod" 337 | - "DeleteEndpointConfigPostMethod" 338 | Properties: 339 | RestApiId: !Ref "SnowflakeApiGateway" 340 | StageName: !Ref apiGatewayStageName 341 | RootApiResource: 342 | Type: 
'AWS::ApiGateway::Resource' 343 | Properties: 344 | RestApiId: !Ref SnowflakeApiGateway 345 | ParentId: !GetAtt 346 | - SnowflakeApiGateway 347 | - RootResourceId 348 | PathPart: sagemaker 349 | CreateModelApiResource: 350 | Type: 'AWS::ApiGateway::Resource' 351 | Properties: 352 | RestApiId: !Ref SnowflakeApiGateway 353 | ParentId: !Ref RootApiResource 354 | PathPart: createmodel 355 | PredictOutcomeApiResource: 356 | Type: 'AWS::ApiGateway::Resource' 357 | Properties: 358 | RestApiId: !Ref SnowflakeApiGateway 359 | ParentId: !Ref RootApiResource 360 | PathPart: predictoutcome 361 | PredictOutcomeEndpointNameApiResource: 362 | Type: 'AWS::ApiGateway::Resource' 363 | Properties: 364 | RestApiId: !Ref SnowflakeApiGateway 365 | ParentId: !Ref PredictOutcomeApiResource 366 | PathPart: "{endpointName}" 367 | DeleteEndpointApiResource: 368 | Type: 'AWS::ApiGateway::Resource' 369 | Properties: 370 | RestApiId: !Ref SnowflakeApiGateway 371 | ParentId: !Ref RootApiResource 372 | PathPart: deleteendpoint 373 | DeleteEndpointConfigApiResource: 374 | Type: 'AWS::ApiGateway::Resource' 375 | Properties: 376 | RestApiId: !Ref SnowflakeApiGateway 377 | ParentId: !Ref RootApiResource 378 | PathPart: deleteendpointconfig 379 | CreateEndpointApiResource: 380 | Type: 'AWS::ApiGateway::Resource' 381 | Properties: 382 | RestApiId: !Ref SnowflakeApiGateway 383 | ParentId: !Ref RootApiResource 384 | PathPart: createendpoint 385 | CreateEndpointConfigApiResource: 386 | Type: 'AWS::ApiGateway::Resource' 387 | Properties: 388 | RestApiId: !Ref SnowflakeApiGateway 389 | ParentId: !Ref RootApiResource 390 | PathPart: createendpointconfig 391 | DescribeModelApiResource: 392 | Type: 'AWS::ApiGateway::Resource' 393 | Properties: 394 | RestApiId: !Ref SnowflakeApiGateway 395 | ParentId: !Ref RootApiResource 396 | PathPart: describemodel 397 | DescribeEndpointApiResource: 398 | Type: 'AWS::ApiGateway::Resource' 399 | Properties: 400 | RestApiId: !Ref SnowflakeApiGateway 401 | ParentId: !Ref 
RootApiResource 402 | PathPart: describeendpoint 403 | DescribeEndpointConfigApiResource: 404 | Type: 'AWS::ApiGateway::Resource' 405 | Properties: 406 | RestApiId: !Ref SnowflakeApiGateway 407 | ParentId: !Ref RootApiResource 408 | PathPart: describeendpointconfig 409 | CreateModelPostMethod: 410 | Type: "AWS::ApiGateway::Method" 411 | Properties: 412 | AuthorizationType: "AWS_IAM" 413 | HttpMethod: "POST" 414 | Integration: 415 | IntegrationHttpMethod: "POST" 416 | Type: "AWS" 417 | Credentials: !GetAtt SnowflakeAPIGatewayExecutionRole.Arn 418 | Uri: 419 | Fn::Join: 420 | - ":" 421 | - - "arn" 422 | - Ref: AWS::Partition 423 | - "apigateway" 424 | - Ref: AWS::Region 425 | - "sagemaker:action/CreateAutoMLJob" 426 | RequestParameters: 427 | integration.request.header.X-Amz-Target: "'SageMaker.CreateAutoMLJob'" 428 | integration.request.header.Content-Type: "'application/x-amz-json-1.1'" 429 | integration.request.header.X-Proxy-Agent: !FindInMap [Package, Attributes, Identifier] 430 | PassthroughBehavior: WHEN_NO_MATCH 431 | IntegrationResponses: 432 | - StatusCode: 200 433 | SelectionPattern: '2..' 434 | - StatusCode: 400 435 | SelectionPattern: '4..' 436 | - StatusCode: 500 437 | SelectionPattern: '5..' 
438 | MethodResponses: 439 | - StatusCode: 200 440 | - StatusCode: 400 441 | - StatusCode: 500 442 | ResourceId: !Ref "CreateModelApiResource" 443 | RestApiId: !Ref "SnowflakeApiGateway" 444 | PredictOutcomePostMethod: 445 | Type: "AWS::ApiGateway::Method" 446 | Properties: 447 | AuthorizationType: "AWS_IAM" 448 | HttpMethod: "POST" 449 | RequestParameters: 450 | method.request.path.endpointName: true 451 | Integration: 452 | IntegrationHttpMethod: "POST" 453 | Type: "AWS" 454 | RequestParameters: 455 | integration.request.path.endpointName: method.request.path.endpointName 456 | integration.request.header.Content-Type: "'text/csv'" 457 | integration.request.header.Accept: "'application/json'" 458 | integration.request.header.X-Proxy-Agent: !FindInMap [Package, Attributes, Identifier] 459 | Credentials: !GetAtt SnowflakeAPIGatewayExecutionRole.Arn 460 | PassthroughBehavior: WHEN_NO_MATCH 461 | Uri: 462 | Fn::Join: 463 | - "" 464 | - - "arn:" 465 | - Ref: AWS::Partition 466 | - ":apigateway:" 467 | - Ref: AWS::Region 468 | - :runtime.sagemaker:path/endpoints/{endpointName} 469 | - /invocations 470 | IntegrationResponses: 471 | - StatusCode: 200 472 | SelectionPattern: '2..' 473 | - StatusCode: 400 474 | SelectionPattern: '4..' 475 | - StatusCode: 500 476 | SelectionPattern: '5..' 
477 | MethodResponses: 478 | - StatusCode: 200 479 | - StatusCode: 400 480 | - StatusCode: 500 481 | ResourceId: !Ref "PredictOutcomeEndpointNameApiResource" 482 | RestApiId: !Ref "SnowflakeApiGateway" 483 | DeleteEndpointPostMethod: 484 | Type: "AWS::ApiGateway::Method" 485 | Properties: 486 | AuthorizationType: "AWS_IAM" 487 | HttpMethod: "POST" 488 | Integration: 489 | IntegrationHttpMethod: "POST" 490 | Type: "AWS" 491 | Credentials: !GetAtt SnowflakeAPIGatewayExecutionRole.Arn 492 | Uri: 493 | Fn::Join: 494 | - ":" 495 | - - "arn" 496 | - Ref: AWS::Partition 497 | - "apigateway" 498 | - Ref: AWS::Region 499 | - "sagemaker:action/DeleteEndpoint" 500 | RequestParameters: 501 | integration.request.header.X-Amz-Target: "'SageMaker.DeleteEndpoint'" 502 | integration.request.header.Content-Type: "'application/x-amz-json-1.1'" 503 | integration.request.header.X-Proxy-Agent: !FindInMap [Package, Attributes, Identifier] 504 | PassthroughBehavior: WHEN_NO_MATCH 505 | IntegrationResponses: 506 | - StatusCode: 200 507 | SelectionPattern: '2..' 508 | - StatusCode: 400 509 | SelectionPattern: '4..' 510 | - StatusCode: 500 511 | SelectionPattern: '5..' 
512 | MethodResponses: 513 | - StatusCode: 200 514 | - StatusCode: 400 515 | - StatusCode: 500 516 | ResourceId: !Ref "DeleteEndpointApiResource" 517 | RestApiId: !Ref "SnowflakeApiGateway" 518 | CreateEndpointPostMethod: 519 | Type: "AWS::ApiGateway::Method" 520 | Properties: 521 | AuthorizationType: "AWS_IAM" 522 | HttpMethod: "POST" 523 | Integration: 524 | IntegrationHttpMethod: "POST" 525 | Type: "AWS" 526 | Credentials: !GetAtt SnowflakeAPIGatewayExecutionRole.Arn 527 | Uri: 528 | Fn::Join: 529 | - ":" 530 | - - "arn" 531 | - Ref: AWS::Partition 532 | - "apigateway" 533 | - Ref: AWS::Region 534 | - "sagemaker:action/CreateEndpoint" 535 | RequestParameters: 536 | integration.request.header.X-Amz-Target: "'SageMaker.CreateEndpoint'" 537 | integration.request.header.Content-Type: "'application/x-amz-json-1.1'" 538 | integration.request.header.X-Proxy-Agent: !FindInMap [Package, Attributes, Identifier] 539 | PassthroughBehavior: WHEN_NO_MATCH 540 | IntegrationResponses: 541 | - StatusCode: 200 542 | SelectionPattern: '2..' 543 | - StatusCode: 400 544 | SelectionPattern: '4..' 545 | - StatusCode: 500 546 | SelectionPattern: '5..' 
547 | MethodResponses: 548 | - StatusCode: 200 549 | - StatusCode: 400 550 | - StatusCode: 500 551 | ResourceId: !Ref "CreateEndpointApiResource" 552 | RestApiId: !Ref "SnowflakeApiGateway" 553 | DescribeModelPostMethod: 554 | Type: "AWS::ApiGateway::Method" 555 | Properties: 556 | AuthorizationType: "AWS_IAM" 557 | HttpMethod: "POST" 558 | Integration: 559 | IntegrationHttpMethod: "POST" 560 | Type: "AWS" 561 | Credentials: !GetAtt SnowflakeAPIGatewayExecutionRole.Arn 562 | Uri: 563 | Fn::Join: 564 | - ":" 565 | - - "arn" 566 | - Ref: AWS::Partition 567 | - "apigateway" 568 | - Ref: AWS::Region 569 | - "sagemaker:action/DescribeAutoMLJob" 570 | RequestParameters: 571 | integration.request.header.X-Amz-Target: "'SageMaker.DescribeAutoMLJob'" 572 | integration.request.header.Content-Type: "'application/x-amz-json-1.1'" 573 | integration.request.header.X-Proxy-Agent: !FindInMap [Package, Attributes, Identifier] 574 | PassthroughBehavior: WHEN_NO_MATCH 575 | IntegrationResponses: 576 | - StatusCode: 200 577 | SelectionPattern: '2..' 578 | - StatusCode: 400 579 | SelectionPattern: '4..' 580 | - StatusCode: 500 581 | SelectionPattern: '5..' 
582 | MethodResponses: 583 | - StatusCode: 200 584 | - StatusCode: 400 585 | - StatusCode: 500 586 | ResourceId: !Ref "DescribeModelApiResource" 587 | RestApiId: !Ref "SnowflakeApiGateway" 588 | DescribeEndpointPostMethod: 589 | Type: "AWS::ApiGateway::Method" 590 | Properties: 591 | AuthorizationType: "AWS_IAM" 592 | HttpMethod: "POST" 593 | Integration: 594 | IntegrationHttpMethod: "POST" 595 | Type: "AWS" 596 | Credentials: !GetAtt SnowflakeAPIGatewayExecutionRole.Arn 597 | Uri: 598 | Fn::Join: 599 | - ":" 600 | - - "arn" 601 | - Ref: AWS::Partition 602 | - "apigateway" 603 | - Ref: AWS::Region 604 | - "sagemaker:action/DescribeEndpoint" 605 | RequestParameters: 606 | integration.request.header.X-Amz-Target: "'SageMaker.DescribeEndpoint'" 607 | integration.request.header.Content-Type: "'application/x-amz-json-1.1'" 608 | integration.request.header.X-Proxy-Agent: !FindInMap [Package, Attributes, Identifier] 609 | PassthroughBehavior: WHEN_NO_MATCH 610 | IntegrationResponses: 611 | - StatusCode: 200 612 | SelectionPattern: '2..' 613 | - StatusCode: 400 614 | SelectionPattern: '4..' 615 | - StatusCode: 500 616 | SelectionPattern: '5..' 
617 | MethodResponses: 618 | - StatusCode: 200 619 | - StatusCode: 400 620 | - StatusCode: 500 621 | ResourceId: !Ref "DescribeEndpointApiResource" 622 | RestApiId: !Ref "SnowflakeApiGateway" 623 | DescribeEndpointConfigPostMethod: 624 | Type: "AWS::ApiGateway::Method" 625 | Properties: 626 | AuthorizationType: "AWS_IAM" 627 | HttpMethod: "POST" 628 | Integration: 629 | IntegrationHttpMethod: "POST" 630 | Type: "AWS" 631 | Credentials: !GetAtt SnowflakeAPIGatewayExecutionRole.Arn 632 | Uri: 633 | Fn::Join: 634 | - ":" 635 | - - "arn" 636 | - Ref: AWS::Partition 637 | - "apigateway" 638 | - Ref: AWS::Region 639 | - "sagemaker:action/DescribeEndpointConfig" 640 | RequestParameters: 641 | integration.request.header.X-Amz-Target: "'SageMaker.DescribeEndpointConfig'" 642 | integration.request.header.Content-Type: "'application/x-amz-json-1.1'" 643 | integration.request.header.X-Proxy-Agent: !FindInMap [Package, Attributes, Identifier] 644 | PassthroughBehavior: WHEN_NO_MATCH 645 | IntegrationResponses: 646 | - StatusCode: 200 647 | SelectionPattern: '2..' 648 | - StatusCode: 400 649 | SelectionPattern: '4..' 650 | - StatusCode: 500 651 | SelectionPattern: '5..' 
652 | MethodResponses: 653 | - StatusCode: 200 654 | - StatusCode: 400 655 | - StatusCode: 500 656 | ResourceId: !Ref "DescribeEndpointConfigApiResource" 657 | RestApiId: !Ref "SnowflakeApiGateway" 658 | CreateEndpointConfigPostMethod: 659 | Type: "AWS::ApiGateway::Method" 660 | Properties: 661 | AuthorizationType: "AWS_IAM" 662 | HttpMethod: "POST" 663 | Integration: 664 | IntegrationHttpMethod: "POST" 665 | Type: "AWS" 666 | Credentials: !GetAtt SnowflakeAPIGatewayExecutionRole.Arn 667 | Uri: 668 | Fn::Join: 669 | - ":" 670 | - - "arn" 671 | - Ref: AWS::Partition 672 | - "apigateway" 673 | - Ref: AWS::Region 674 | - "sagemaker:action/CreateEndpointConfig" 675 | RequestParameters: 676 | integration.request.header.X-Amz-Target: "'SageMaker.CreateEndpointConfig'" 677 | integration.request.header.Content-Type: "'application/x-amz-json-1.1'" 678 | integration.request.header.X-Proxy-Agent: !FindInMap [Package, Attributes, Identifier] 679 | PassthroughBehavior: WHEN_NO_MATCH 680 | IntegrationResponses: 681 | - StatusCode: 200 682 | SelectionPattern: '2..' 683 | - StatusCode: 400 684 | SelectionPattern: '4..' 685 | - StatusCode: 500 686 | SelectionPattern: '5..' 
687 | MethodResponses: 688 | - StatusCode: 200 689 | - StatusCode: 400 690 | - StatusCode: 500 691 | ResourceId: !Ref "CreateEndpointConfigApiResource" 692 | RestApiId: !Ref "SnowflakeApiGateway" 693 | DeleteEndpointConfigPostMethod: 694 | Type: "AWS::ApiGateway::Method" 695 | Properties: 696 | AuthorizationType: "AWS_IAM" 697 | HttpMethod: "POST" 698 | Integration: 699 | IntegrationHttpMethod: "POST" 700 | Type: "AWS" 701 | Credentials: !GetAtt SnowflakeAPIGatewayExecutionRole.Arn 702 | Uri: 703 | Fn::Join: 704 | - ":" 705 | - - "arn" 706 | - Ref: AWS::Partition 707 | - "apigateway" 708 | - Ref: AWS::Region 709 | - "sagemaker:action/DeleteEndpointConfig" 710 | RequestParameters: 711 | integration.request.header.X-Amz-Target: "'SageMaker.DeleteEndpointConfig'" 712 | integration.request.header.Content-Type: "'application/x-amz-json-1.1'" 713 | integration.request.header.X-Proxy-Agent: !FindInMap [Package, Attributes, Identifier] 714 | PassthroughBehavior: WHEN_NO_MATCH 715 | IntegrationResponses: 716 | - StatusCode: 200 717 | SelectionPattern: '2..' 718 | - StatusCode: 400 719 | SelectionPattern: '4..' 720 | - StatusCode: 500 721 | SelectionPattern: '5..' 
722 | MethodResponses: 723 | - StatusCode: 200 724 | - StatusCode: 400 725 | - StatusCode: 500 726 | ResourceId: !Ref "DeleteEndpointConfigApiResource" 727 | RestApiId: !Ref "SnowflakeApiGateway" 728 | CopyZipsLambda: 729 | Type: AWS::Lambda::Function 730 | Properties: 731 | Code: 732 | ZipFile: | 733 | # Inspired by https://aws.amazon.com/blogs/infrastructure-and-automation/deploying-aws-lambda-functions-using-aws-cloudformation-the-portable-way/ 734 | import boto3 735 | import json 736 | import logging 737 | import os 738 | import requests 739 | import time 740 | 741 | EMPTY_RESPONSE_DATA = {} 742 | FAILED = 'FAILED' 743 | SUCCESS = 'SUCCESS' 744 | 745 | logger = logging.getLogger(__name__) 746 | logger.setLevel(logging.INFO) 747 | 748 | def lambda_handler(event, context): 749 | logger.info('Starting CopyZipsLambda') 750 | 751 | try: 752 | s3_destination_bucket_name = event['ResourceProperties']['DestBucket'] 753 | s3_source_bucket_name = event['ResourceProperties']['SourceBucket'] 754 | object_keys = event['ResourceProperties']['ObjectKeys'] 755 | 756 | if event['RequestType'] != 'Delete': 757 | copy_objects(s3_source_bucket_name, s3_destination_bucket_name, object_keys) 758 | logger.info("Files copied successfully") 759 | else: 760 | delete_objects(s3_destination_bucket_name, object_keys) 761 | logger.info("Files deleted successfully") 762 | 763 | sendResponse(event, context, SUCCESS, EMPTY_RESPONSE_DATA) 764 | logger.info('CopyZipsLambda finished') 765 | except: 766 | logger.exception("There was a problem running CopyZipsLambda") 767 | sendResponse(event, context, FAILED, EMPTY_RESPONSE_DATA) 768 | return 769 | 770 | def copy_objects(s3_source_bucket_name, s3_destination_bucket_name, object_keys): 771 | s3 = boto3.resource('s3') 772 | destination_bucket = s3.Bucket(s3_destination_bucket_name) 773 | 774 | for object_key in object_keys: 775 | copy_object(s3_source_bucket_name, destination_bucket, object_key) 776 | 777 | def copy_object(s3_source_bucket_name, 
destination_bucket, object_key): 778 | logger.info('Copying object key: ' + object_key) 779 | copy_source = { 780 | 'Bucket': s3_source_bucket_name, 781 | 'Key': object_key 782 | } 783 | destination_bucket.copy(copy_source, object_key) 784 | 785 | def delete_objects(s3_destination_bucket_name, object_keys): 786 | s3 = boto3.client('s3') 787 | 788 | for object_key in object_keys: 789 | delete_object(s3, s3_destination_bucket_name, object_key) 790 | 791 | def delete_object(s3, s3_destination_bucket_name, object_key): 792 | logger.info('Deleting object key: ' + object_key) 793 | s3.delete_object(Bucket=s3_destination_bucket_name, Key=object_key) 794 | 795 | def sendResponse(event, context, responseStatus, responseData): 796 | responseBody = {'Status': responseStatus, 797 | 'Reason': 'See the details in CloudWatch Log Stream: ' + context.log_stream_name, 798 | 'PhysicalResourceId': context.log_stream_name, 799 | 'StackId': event['StackId'], 800 | 'RequestId': event['RequestId'], 801 | 'LogicalResourceId': event['LogicalResourceId'], 802 | 'Data': responseData} 803 | req = requests.put(event['ResponseURL'], data=json.dumps(responseBody)) 804 | if req.status_code != 200: 805 | raise Exception('Received a non-200 HTTP response while sending response to CloudFormation.') 806 | return 807 | Handler: index.lambda_handler 808 | Role: !GetAtt CopyZipsRole.Arn 809 | Runtime: python3.7 810 | Timeout: 600 811 | CopyZips: 812 | Type: Custom::CopyZips 813 | DependsOn: 814 | - S3Bucket 815 | - CopyZipsRole 816 | Properties: 817 | ServiceToken: !GetAtt CopyZipsLambda.Arn 818 | DestBucket: !Ref s3BucketName 819 | SourceBucket: !FindInMap [Package, Locations, CodeBucket] 820 | ObjectKeys: 821 | - !FindInMap [Package, Locations, PathToLayerCode] 822 | - !FindInMap [Package, Locations, PathToLambdaCode] 823 | CreateSnowflakeResourcesLambdaLayer: 824 | Type: AWS::Lambda::LayerVersion 825 | DependsOn: 826 | - CopyZips 827 | Properties: 828 | CompatibleRuntimes: 829 | - python3.7 830 | 
Content: 831 | S3Bucket: !Ref s3BucketName 832 | S3Key: !FindInMap [Package, Locations, PathToLayerCode] 833 | Description: 'Layer to download Snowflake driver' 834 | CreateSnowflakeResourcesLambda: 835 | Type: AWS::Lambda::Function 836 | Properties: 837 | Code: 838 | S3Bucket: !Ref s3BucketName 839 | S3Key: !FindInMap [Package, Locations, PathToLambdaCode] 840 | Layers: 841 | - Ref: CreateSnowflakeResourcesLambdaLayer 842 | Handler: create-resources.lambda_handler 843 | Role: !GetAtt CreateSnowflakeResourcesExecutionRole.Arn 844 | Runtime: python3.7 845 | Timeout: 600 846 | Environment: 847 | Variables: 848 | ApiGatewayURL: !Sub "https://${SnowflakeApiGateway}.execute-api.${AWS::Region}.amazonaws.com/${apiGatewayStageName}/sagemaker" 849 | ApiGatewayRoleARN: !GetAtt SnowflakeAPIGatewayExecutionRole.Arn 850 | ApiGatewayRoleName: !Ref SnowflakeAPIGatewayExecutionRole 851 | AutoMLRoleARN: !GetAtt SnowflakeAutoMLExecutionRole.Arn 852 | AutoMLRoleName: !Ref SnowflakeAutoMLExecutionRole 853 | Region: !Sub "${AWS::Region}" 854 | S3BucketName: !Ref s3BucketName 855 | SecretArn: !Ref snowflakeSecretArn 856 | KmsKeyArn: !Ref kmsKeyArn 857 | VpcSecurityGroupIds: !Ref vpcSecurityGroupIds 858 | VpcSubnetIds: !Ref vpcSubnetIds 859 | SnowflakeRole: !Ref snowflakeRole 860 | StackName: !Sub "${AWS::StackName}" 861 | DatabaseName: !Ref snowflakeDatabaseName 862 | SchemaName: !Ref snowflakeSchemaName 863 | SnowflakeResourceSuffix: !Ref snowflakeResourceSuffix 864 | ApiGatewayType: !Ref apiGatewayType 865 | VpcConfig: 866 | Fn::If: 867 | - isVPCConfigNotPresent 868 | - { Ref: "AWS::NoValue" } 869 | - SecurityGroupIds: !Split [",", !Ref vpcSecurityGroupIds] 870 | SubnetIds: !Split [",", !Ref vpcSubnetIds] 871 | SnowflakeResources: 872 | Type: Custom::SnowflakeResources 873 | DependsOn: 874 | - SnowflakeAPIGatewayExecutionRole 875 | - SnowflakeAutoMLExecutionRole 876 | Properties: 877 | ServiceToken: !Sub 878 | - "${lambdaArn}" 879 | - lambdaArn: !GetAtt 
CreateSnowflakeResourcesLambda.Arn 880 | PackageIdentifier: !FindInMap [Package, Attributes, Identifier] 881 | -------------------------------------------------------------------------------- /customer-stack/generate-layer.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | mkdir layer 4 | 5 | mkdir -p layer/snowflake-connector-python/python/lib/python3.7/site-packages 6 | python3.7 -m venv layer/.temp 7 | . layer/.temp/bin/activate 8 | pip3 install snowflake-connector-python 9 | deactivate 10 | mv layer/.temp/lib/python3.7/site-packages/* layer/snowflake-connector-python/python/lib/python3.7/site-packages 11 | rm -rf layer/.temp 12 | -------------------------------------------------------------------------------- /images/image1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/amazon-sagemaker-integration-with-snowflake/95951108d2846358e86af35be04e32f91960f281/images/image1.png -------------------------------------------------------------------------------- /images/image2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/amazon-sagemaker-integration-with-snowflake/95951108d2846358e86af35be04e32f91960f281/images/image2.png -------------------------------------------------------------------------------- /images/image3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/amazon-sagemaker-integration-with-snowflake/95951108d2846358e86af35be04e32f91960f281/images/image3.png -------------------------------------------------------------------------------- /policies.zip: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/aws-samples/amazon-sagemaker-integration-with-snowflake/95951108d2846358e86af35be04e32f91960f281/policies.zip -------------------------------------------------------------------------------- /sample-templates/private-link-setup.yml: -------------------------------------------------------------------------------- 1 | AWSTemplateFormatVersion: 2010-09-09 2 | Description: Private links for snowflake and Autopilot integration 3 | Parameters: 4 | VpcId: 5 | Type: String 6 | PrivateSubnetIds: 7 | Type: String 8 | SecurityGroupId: 9 | Type: String 10 | SnowflakeTestAccountId: 11 | Type: String 12 | SnowflakePrivateLink: 13 | Type: String 14 | SecurityGroupIngressRuleToPort: 15 | Type: String 16 | Default: 65535 17 | SecurityGroupIngressRuleFromPort: 18 | Type: String 19 | Default: 0 20 | S3RouteTableIds: 21 | Type: CommaDelimitedList 22 | Conditions: 23 | IsSnowflakePrivateLinkSupported: !Not 24 | - !Equals 25 | - placeholder 26 | - !Ref SnowflakePrivateLink 27 | Resources: 28 | SnowflakeVpce: 29 | Condition: IsSnowflakePrivateLinkSupported 30 | Type: 'AWS::EC2::VPCEndpoint' 31 | Properties: 32 | ServiceName: !Ref SnowflakePrivateLink 33 | VpcId: !Ref VpcId 34 | SecurityGroupIds: 35 | - !Ref SecurityGroupId 36 | SubnetIds: !Split 37 | - ',' 38 | - !Ref PrivateSubnetIds 39 | VpcEndpointType: Interface 40 | SnowflakePrivateLinkPrivateHostedZone: 41 | Condition: IsSnowflakePrivateLinkSupported 42 | Type: 'AWS::Route53::HostedZone' 43 | Properties: 44 | Name: privatelink.snowflakecomputing.com 45 | VPCs: 46 | - VpcId: !Ref VpcId 47 | VPCRegion: !Ref 'AWS::Region' 48 | SnowflakePrivateLinkAccountRecordSets: 49 | Condition: IsSnowflakePrivateLinkSupported 50 | DependsOn: 51 | - SnowflakeVpce 52 | - SnowflakePrivateLinkPrivateHostedZone 53 | Type: 'AWS::Route53::RecordSet' 54 | Properties: 55 | Name: !Join 56 | - . 
57 | - - !Ref SnowflakeTestAccountId 58 | - !Ref 'AWS::Region' 59 | - privatelink.snowflakecomputing.com 60 | Type: CNAME 61 | TTL: 300 62 | ResourceRecords: 63 | - 'Fn::Select': 64 | - 1 65 | - 'Fn::Split': 66 | - ':' 67 | - 'Fn::Select': 68 | - 0 69 | - !GetAtt SnowflakeVpce.DnsEntries 70 | HostedZoneId: !Ref SnowflakePrivateLinkPrivateHostedZone 71 | SnowflakePrivateLinkAppRecordSets: 72 | Condition: IsSnowflakePrivateLinkSupported 73 | DependsOn: 74 | - SnowflakeVpce 75 | - SnowflakePrivateLinkPrivateHostedZone 76 | Type: 'AWS::Route53::RecordSet' 77 | Properties: 78 | Name: !Join 79 | - . 80 | - - app 81 | - !Ref 'AWS::Region' 82 | - privatelink.snowflakecomputing.com 83 | Type: CNAME 84 | TTL: 300 85 | ResourceRecords: 86 | - 'Fn::Select': 87 | - 1 88 | - 'Fn::Split': 89 | - ':' 90 | - 'Fn::Select': 91 | - 0 92 | - !GetAtt SnowflakeVpce.DnsEntries 93 | HostedZoneId: !Ref SnowflakePrivateLinkPrivateHostedZone 94 | SnowflakePrivateLinkOcspRecordSets: 95 | Condition: IsSnowflakePrivateLinkSupported 96 | DependsOn: 97 | - SnowflakeVpce 98 | - SnowflakePrivateLinkPrivateHostedZone 99 | Type: 'AWS::Route53::RecordSet' 100 | Properties: 101 | Name: !Join 102 | - . 103 | - - ocsp 104 | - !Ref SnowflakeTestAccountId 105 | - !Ref 'AWS::Region' 106 | - privatelink.snowflakecomputing.com 107 | Type: CNAME 108 | TTL: 300 109 | ResourceRecords: 110 | - 'Fn::Select': 111 | - 1 112 | - 'Fn::Split': 113 | - ':' 114 | - 'Fn::Select': 115 | - 0 116 | - !GetAtt SnowflakeVpce.DnsEntries 117 | HostedZoneId: !Ref SnowflakePrivateLinkPrivateHostedZone 118 | SecretManagerVpce: 119 | Condition: IsSnowflakePrivateLinkSupported 120 | Type: 'AWS::EC2::VPCEndpoint' 121 | Properties: 122 | ServiceName: !Join 123 | - '' 124 | - - com.amazonaws. 
125 | - !Ref 'AWS::Region' 126 | - .secretsmanager 127 | VpcId: !Ref VpcId 128 | SecurityGroupIds: 129 | - !Ref SecurityGroupId 130 | SubnetIds: !Split 131 | - ',' 132 | - !Ref PrivateSubnetIds 133 | VpcEndpointType: Interface 134 | PrivateDnsEnabled: true 135 | SagemakerAPIVpce: 136 | Condition: IsSnowflakePrivateLinkSupported 137 | Type: 'AWS::EC2::VPCEndpoint' 138 | Properties: 139 | ServiceName: !Join 140 | - '' 141 | - - com.amazonaws. 142 | - !Ref 'AWS::Region' 143 | - .sagemaker.api 144 | VpcId: !Ref VpcId 145 | SecurityGroupIds: 146 | - !Ref SecurityGroupId 147 | SubnetIds: !Split 148 | - ',' 149 | - !Ref PrivateSubnetIds 150 | VpcEndpointType: Interface 151 | PrivateDnsEnabled: true 152 | SagemakerRuntimeVpce: 153 | Condition: IsSnowflakePrivateLinkSupported 154 | Type: 'AWS::EC2::VPCEndpoint' 155 | Properties: 156 | ServiceName: !Join 157 | - '' 158 | - - com.amazonaws. 159 | - !Ref 'AWS::Region' 160 | - .sagemaker.runtime 161 | VpcId: !Ref VpcId 162 | SecurityGroupIds: 163 | - !Ref SecurityGroupId 164 | SubnetIds: !Split 165 | - ',' 166 | - !Ref PrivateSubnetIds 167 | VpcEndpointType: Interface 168 | PrivateDnsEnabled: true 169 | S3Vpce: 170 | Type: AWS::EC2::VPCEndpoint 171 | Properties: 172 | PolicyDocument: 173 | Version: '2012-10-17' 174 | Statement: 175 | - Action: 176 | - s3:GetObject 177 | - s3:PutObject 178 | - s3:ListBucket 179 | - s3:GetBucketLocation 180 | - s3:DeleteObject 181 | - s3:ListMultipartUploadParts 182 | - s3:AbortMultipartUpload 183 | Effect: Allow 184 | Resource: 185 | - '*' 186 | Principal: '*' 187 | RouteTableIds: !Ref S3RouteTableIds 188 | VpcEndpointType: Gateway 189 | ServiceName: !Join 190 | - '' 191 | - - com.amazonaws. 
192 | - !Ref 'AWS::Region' 193 | - .s3 194 | VpcId: !Ref 'VpcId' 195 | -------------------------------------------------------------------------------- /sample-templates/vpc-setup.yml: -------------------------------------------------------------------------------- 1 | Description: This template deploys a VPC, with a pair of public and private subnets spread 2 | across two Availability Zones. It deploys an internet gateway, with a default 3 | route on the public subnets. It deploys a pair of NAT gateways (one in each AZ), 4 | and default routes for them in the private subnets. 5 | 6 | Parameters: 7 | EnvironmentName: 8 | Description: An environment name that is prefixed to resource names 9 | Type: String 10 | 11 | VpcCIDR: 12 | Description: Please enter the IP range (CIDR notation) for this VPC 13 | Type: String 14 | Default: 10.192.0.0/16 15 | 16 | PublicSubnet1CIDR: 17 | Description: Please enter the IP range (CIDR notation) for the public subnet in the first Availability Zone 18 | Type: String 19 | Default: 10.192.10.0/24 20 | 21 | PublicSubnet2CIDR: 22 | Description: Please enter the IP range (CIDR notation) for the public subnet in the second Availability Zone 23 | Type: String 24 | Default: 10.192.11.0/24 25 | 26 | PrivateSubnet1CIDR: 27 | Description: Please enter the IP range (CIDR notation) for the private subnet in the first Availability Zone 28 | Type: String 29 | Default: 10.192.20.0/24 30 | 31 | PrivateSubnet2CIDR: 32 | Description: Please enter the IP range (CIDR notation) for the private subnet in the second Availability Zone 33 | Type: String 34 | Default: 10.192.21.0/24 35 | 36 | Resources: 37 | VPC: 38 | Type: AWS::EC2::VPC 39 | Properties: 40 | CidrBlock: !Ref VpcCIDR 41 | EnableDnsSupport: true 42 | EnableDnsHostnames: true 43 | Tags: 44 | - Key: Name 45 | Value: !Ref EnvironmentName 46 | 47 | InternetGateway: 48 | Type: AWS::EC2::InternetGateway 49 | Properties: 50 | Tags: 51 | - Key: Name 52 | Value: !Ref EnvironmentName 53 | 54 | 
InternetGatewayAttachment: 55 | Type: AWS::EC2::VPCGatewayAttachment 56 | Properties: 57 | InternetGatewayId: !Ref InternetGateway 58 | VpcId: !Ref VPC 59 | 60 | PublicSubnet1: 61 | Type: AWS::EC2::Subnet 62 | Properties: 63 | VpcId: !Ref VPC 64 | AvailabilityZone: !Select [ 0, !GetAZs '' ] 65 | CidrBlock: !Ref PublicSubnet1CIDR 66 | MapPublicIpOnLaunch: true 67 | Tags: 68 | - Key: Name 69 | Value: !Sub ${EnvironmentName} Public Subnet (AZ1) 70 | 71 | PublicSubnet2: 72 | Type: AWS::EC2::Subnet 73 | Properties: 74 | VpcId: !Ref VPC 75 | AvailabilityZone: !Select [ 1, !GetAZs '' ] 76 | CidrBlock: !Ref PublicSubnet2CIDR 77 | MapPublicIpOnLaunch: true 78 | Tags: 79 | - Key: Name 80 | Value: !Sub ${EnvironmentName} Public Subnet (AZ2) 81 | 82 | PrivateSubnet1: 83 | Type: AWS::EC2::Subnet 84 | Properties: 85 | VpcId: !Ref VPC 86 | AvailabilityZone: !Select [ 0, !GetAZs '' ] 87 | CidrBlock: !Ref PrivateSubnet1CIDR 88 | MapPublicIpOnLaunch: false 89 | Tags: 90 | - Key: Name 91 | Value: !Sub ${EnvironmentName} Private Subnet (AZ1) 92 | 93 | PrivateSubnet2: 94 | Type: AWS::EC2::Subnet 95 | Properties: 96 | VpcId: !Ref VPC 97 | AvailabilityZone: !Select [ 1, !GetAZs '' ] 98 | CidrBlock: !Ref PrivateSubnet2CIDR 99 | MapPublicIpOnLaunch: false 100 | Tags: 101 | - Key: Name 102 | Value: !Sub ${EnvironmentName} Private Subnet (AZ2) 103 | 104 | NatGateway1EIP: 105 | Type: AWS::EC2::EIP 106 | DependsOn: InternetGatewayAttachment 107 | Properties: 108 | Domain: vpc 109 | 110 | NatGateway2EIP: 111 | Type: AWS::EC2::EIP 112 | DependsOn: InternetGatewayAttachment 113 | Properties: 114 | Domain: vpc 115 | 116 | NatGateway1: 117 | Type: AWS::EC2::NatGateway 118 | Properties: 119 | AllocationId: !GetAtt NatGateway1EIP.AllocationId 120 | SubnetId: !Ref PublicSubnet1 121 | 122 | NatGateway2: 123 | Type: AWS::EC2::NatGateway 124 | Properties: 125 | AllocationId: !GetAtt NatGateway2EIP.AllocationId 126 | SubnetId: !Ref PublicSubnet2 127 | 128 | PublicRouteTable: 129 | Type: 
AWS::EC2::RouteTable 130 | Properties: 131 | VpcId: !Ref VPC 132 | Tags: 133 | - Key: Name 134 | Value: !Sub ${EnvironmentName} Public Routes 135 | 136 | DefaultPublicRoute: 137 | Type: AWS::EC2::Route 138 | DependsOn: InternetGatewayAttachment 139 | Properties: 140 | RouteTableId: !Ref PublicRouteTable 141 | DestinationCidrBlock: 0.0.0.0/0 142 | GatewayId: !Ref InternetGateway 143 | 144 | PublicSubnet1RouteTableAssociation: 145 | Type: AWS::EC2::SubnetRouteTableAssociation 146 | Properties: 147 | RouteTableId: !Ref PublicRouteTable 148 | SubnetId: !Ref PublicSubnet1 149 | 150 | PublicSubnet2RouteTableAssociation: 151 | Type: AWS::EC2::SubnetRouteTableAssociation 152 | Properties: 153 | RouteTableId: !Ref PublicRouteTable 154 | SubnetId: !Ref PublicSubnet2 155 | 156 | 157 | PrivateRouteTable1: 158 | Type: AWS::EC2::RouteTable 159 | Properties: 160 | VpcId: !Ref VPC 161 | Tags: 162 | - Key: Name 163 | Value: !Sub ${EnvironmentName} Private Routes (AZ1) 164 | 165 | DefaultPrivateRoute1: 166 | Type: AWS::EC2::Route 167 | Properties: 168 | RouteTableId: !Ref PrivateRouteTable1 169 | DestinationCidrBlock: 0.0.0.0/0 170 | NatGatewayId: !Ref NatGateway1 171 | 172 | PrivateSubnet1RouteTableAssociation: 173 | Type: AWS::EC2::SubnetRouteTableAssociation 174 | Properties: 175 | RouteTableId: !Ref PrivateRouteTable1 176 | SubnetId: !Ref PrivateSubnet1 177 | 178 | PrivateRouteTable2: 179 | Type: AWS::EC2::RouteTable 180 | Properties: 181 | VpcId: !Ref VPC 182 | Tags: 183 | - Key: Name 184 | Value: !Sub ${EnvironmentName} Private Routes (AZ2) 185 | 186 | DefaultPrivateRoute2: 187 | Type: AWS::EC2::Route 188 | Properties: 189 | RouteTableId: !Ref PrivateRouteTable2 190 | DestinationCidrBlock: 0.0.0.0/0 191 | NatGatewayId: !Ref NatGateway2 192 | 193 | PrivateSubnet2RouteTableAssociation: 194 | Type: AWS::EC2::SubnetRouteTableAssociation 195 | Properties: 196 | RouteTableId: !Ref PrivateRouteTable2 197 | SubnetId: !Ref PrivateSubnet2 198 | 199 | NoIngressSecurityGroup: 200 | Type: 
AWS::EC2::SecurityGroup 201 | Properties: 202 | GroupName: "no-ingress-sg" 203 | GroupDescription: "Security group with no ingress rule" 204 | VpcId: !Ref VPC 205 | 206 | Outputs: 207 | VPC: 208 | Description: A reference to the created VPC 209 | Value: !Ref VPC 210 | 211 | PublicSubnets: 212 | Description: A list of the public subnets 213 | Value: !Join [ ",", [ !Ref PublicSubnet1, !Ref PublicSubnet2 ]] 214 | 215 | PrivateSubnets: 216 | Description: A list of the private subnets 217 | Value: !Join [ ",", [ !Ref PrivateSubnet1, !Ref PrivateSubnet2 ]] 218 | 219 | PublicSubnet1: 220 | Description: A reference to the public subnet in the 1st Availability Zone 221 | Value: !Ref PublicSubnet1 222 | 223 | PublicSubnet2: 224 | Description: A reference to the public subnet in the 2nd Availability Zone 225 | Value: !Ref PublicSubnet2 226 | 227 | PrivateSubnet1: 228 | Description: A reference to the private subnet in the 1st Availability Zone 229 | Value: !Ref PrivateSubnet1 230 | 231 | PrivateSubnet2: 232 | Description: A reference to the private subnet in the 2nd Availability Zone 233 | Value: !Ref PrivateSubnet2 234 | 235 | NoIngressSecurityGroup: 236 | Description: Security group with no ingress rule 237 | Value: !Ref NoIngressSecurityGroup 238 | -------------------------------------------------------------------------------- /snowflake-integration-overview.md: -------------------------------------------------------------------------------- 1 | # Snowflake + Amazon SageMaker Autopilot Integration Overview 2 | 3 | Organizations are increasingly using Snowflake to unify, integrate, 4 | analyze, and share previously fragmented data, and want to use state of 5 | the art machine learning (ML) to glean business insights. However, 6 | development of ML models based on large datasets requires extensive 7 | programming expertise and knowledge of ML frameworks. 
Meanwhile, most 8 | organizations have teams of analysts with the domain knowledge necessary 9 | to build ML models but lack the machine learning expertise required to 10 | train and deploy them. To address this, Snowflake is now integrated with 11 | Amazon SageMaker Autopilot to enable analysts and other SQL users to 12 | automatically build and deploy state-of-the-art machine learning models. 13 | 14 | Snowflake + Amazon SageMaker Autopilot Integration enables users to: 15 | 16 | - **Create and manage ML models**: Use standard SQL queries in 17 | Snowflake to access Autopilot APIs and automatically create the 18 | best machine learning model for your data in Snowflake. Autopilot 19 | does all the heavy lifting by automatically exploring, training, 20 | and tuning different ML algorithms, and providing the model that 21 | best fits your data. 22 | 23 | - **Make predictions**: Use standard SQL queries to deploy, invoke and 24 | manage ML models to SageMaker endpoints and make predictions from 25 | within Snowflake. 26 | 27 | ## Solution Architecture 28 | 29 | ### Solution Overview 30 | 31 | Snowflake + Amazon SageMaker Autopilot Integration sets up a reference 32 | architecture that allows you to directly access Amazon SageMaker machine 33 | learning (ML) APIs in Snowflake. The application it deploys is powered 34 | by Snowflake's [external 35 | functions](https://docs.snowflake.com/en/sql-reference/external-functions-introduction.html) 36 | and [request 37 | translators](https://docs.snowflake.com/en/LIMITEDACCESS/external-functions-serializers.html) 38 | features, which allow you to directly create, use, and make predictions 39 | from SageMaker machine learning models using simple SQL commands. 40 | 41 | | ![Snowflake + Amazon SageMaker Autopilot Solution Architecture](images/image1.png) | 42 | |:--:| 43 | | *Fig 1. Snowflake + Amazon SageMaker Autopilot Solution Architecture* | 44 | 45 | 1. 
When a supported `AWS_AUTOPILOT` SQL command is executed, the UI 46 | client program passes Snowflake a SQL statement that calls an 47 | external function. 48 | 49 | As part of query execution, Snowflake reads the external function 50 | definition, which contains the URL of the API Gateway service and the 51 | name of the API integration that contains authentication information 52 | for that proxy service. It also passes the data for formatting through 53 | any request translators and response translators associated with the 54 | external function. 55 | 56 | 2. Snowflake then reads information from the API integration and 57 | composes an HTTP POST request that contains the headers, data to 58 | be sent and authentication information and forwards the requests 59 | to the API Gateway. 60 | 61 | 3. API Gateway then forwards the call to the respective SageMaker API. 62 | 63 | ### Setup 64 | 65 | The integration provides a reference AWS 66 | [CloudFormation](https://aws.amazon.com/cloudformation/resources/templates/) 67 | template that sets up the required resources on AWS and Snowflake. The 68 | template aims to automate as much of the setup and act as a starting 69 | point and can be extended as needed. Deploying the CloudFormation 70 | template using the default parameters builds the following serverless 71 | environment: 72 | 73 | | ![AWS Cloudformation Template Setup](images/image2.png) | 74 | |:--:| 75 | | *Fig 2. AWS Cloudformation Template Setup* | 76 | 77 | 78 | 79 | 80 | | ![AWS Cloudformation Template Setup](images/image3.png) | 81 | |:--:| 82 | | *Fig 3. AWS Cloudformation Template Setup with VPC* | 83 | 84 | The CloudFormation template transparently and automatically creates the 85 | following 86 | 87 | **AWS Resources:** 88 | 89 | - **Amazon API Gateway** REST API with endpoints to facilitate 90 | connection between Snowflake external functions and SageMaker 91 | API's. 
See [Amazon API Gateway 92 | documentation](https://docs.aws.amazon.com/apigateway/index.html) 93 | to learn more about the service. 94 | 95 | - **S3 bucket** to store the training data and model artifacts created 96 | by Autopilot. See [S3 97 | documentation](https://aws.amazon.com/s3/getting-started/) 98 | to learn more about the service. 99 | 100 | - **AWS Lambda** which acts as a setup Lambda function that uses the 101 | Snowflake Python connector and credentials stored in the AWS 102 | Secrets manager to connect to and setup resources in Snowflake. 103 | See [AWS Lambda 104 | documentation](https://docs.aws.amazon.com/lambda/index.html) 105 | to learn more about the service. 106 | 107 | - **IAM roles** to access the resources and set up trust relationships 108 | between Snowflake and the Amazon API Gateway. See [IAM roles 109 | documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html) 110 | to learn more about the service. See Snowflake documentation to 111 | learn more on [linking the API integration object in Snowflake to 112 | Amazon API Gateway using IAM 113 | roles](https://docs.snowflake.com/en/sql-reference/external-functions-creating-aws-common-api-integration-proxy-link.html). 114 | 115 | **Snowflake Resources:** 116 | 117 | - **Storage Integration** required to copy data from a Snowflake table 118 | to an Amazon S3 bucket for training. See Snowflake's documentation 119 | on [Storage 120 | Integrations](https://docs.snowflake.com/en/sql-reference/sql/create-storage-integration.html) 121 | to learn more. 122 | 123 | - **API Integration** required by the Snowflake external functions to 124 | talk to Amazon API Gateway. See Snowflake's documentation on [API 125 | Integrations](https://docs.snowflake.com/en/sql-reference/sql/create-api-integration.html) 126 | to learn more. 127 | 128 | - **External functions and associated request translators and response 129 | translators** that correspond to various SageMaker calls. 
See 130 | Snowflake's documentation on [External 131 | Functions](https://docs.snowflake.com/en/sql-reference/external-functions-introduction.html) 132 | and [Request 133 | Translators](https://docs.snowflake.com/en/LIMITEDACCESS/external-functions-serializers.html) 134 | to learn more. 135 | 136 | ## Getting Started 137 | 138 | ### Planning the Deployment 139 | 140 | Before you deploy the CloudFormation template, review the following 141 | information and ensure that your AWS and Snowflake accounts are properly 142 | configured and you have the right set of permissions. Otherwise, 143 | deployment might fail. 144 | 145 | **Snowflake account** - If you don't already have a Snowflake account, 146 | create one at 147 | [https://signup.snowflake.com/](https://signup.snowflake.com/). 148 | As SageMaker runs on the AWS cloud, for best performance it is 149 | recommended to use a Snowflake AWS deployment. 150 | 151 | **AWS account** - If you don't already have an AWS account, create one 152 | at [https://aws.amazon.com](https://aws.amazon.com). Your AWS 153 | account is automatically signed up for all AWS services. You are charged 154 | only for the services you use. 155 | 156 | #### AWS Services Quotas 157 | 158 | The resources created by the CloudFormation template provided should 159 | not exceed any service quota for your AWS account.\ 160 | Should any service quota exceed the limit, you can verify your limits 161 | and ask for quota increases in the [Service Quotas 162 | console](https://console.aws.amazon.com/servicequotas/home?region=us-east-2#!/). 163 | 164 | When creating models and performing predictions, Snowflake will create 165 | AutoML jobs and SageMaker Endpoints in your AWS account.\ 166 | This can result in reaching the [SageMaker service 167 | quotas](https://docs.aws.amazon.com/general/latest/gr/sagemaker.html#limits_sagemaker) 168 | for your AWS account. 
If you encounter error messages that you\'ve 169 | exceeded your quota, use [AWS 170 | Support](https://console.aws.amazon.com/support/) to request a 171 | service limit increase for the SageMaker resources you want to scale up. 172 | 173 | #### Permissions 174 | 175 | **AWS IAM permissions:** Before deploying the CloudFormation template, 176 | you must sign in to the AWS Management Console with IAM permissions for 177 | the resources that the templates deploy. The AdministratorAccess managed 178 | policy within IAM provides sufficient permissions, although your 179 | organization may choose to use a custom policy with more restrictions. 180 | For more information, see [AWS managed policies for job 181 | functions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_job-functions.html). 182 | 183 | **Snowflake permissions:** In order for the template to create the 184 | required Snowflake resources, you will need to have a Snowflake role 185 | with permissions to create Storage Integrations, API Integrations and 186 | Functions. This could be the Account Administrator role or a custom role 187 | with the above privileges. See Snowflake 188 | [roles](https://docs.snowflake.com/en/user-guide/security-access-control-overview.html#roles) 189 | and 190 | [privileges](https://docs.snowflake.com/en/user-guide/security-access-control-overview.html#privileges) 191 | for more information. 192 | 193 | **Storing Snowflake credentials in AWS Secrets Manager:** The 194 | CloudFormation template takes as an input an ARN to an AWS Secret that 195 | has the Snowflake account details and credentials to securely connect to 196 | and create Snowflake resources required by the integration. To save your 197 | credentials: 198 | 199 | - Go to the AWS Management Console. 200 | 201 | - From the top right corner select the AWS region, you plan to deploy 202 | the template in. 
**Note:** It is required that you store your 203 | secret in the same region you will be deploying the template in. 204 | 205 | - In the top search bar, search for **Secrets Manager**. 206 | 207 | - Click on **Store a new secret.** 208 | 209 | - Select **Other type of secrets.** 210 | 211 | - On the **Secret key/value** tab fill 3 key/value rows:\ 212 | username (this contains your Snowflake username)\ 213 | password (this contains your Snowflake password)\ 214 | accountid (this contains your Snowflake [account 215 | identifier](https://docs.snowflake.com/en/user-guide/admin-account-identifier.html)) 216 | 217 | - If you click on the Plaintext tab you should see something like 218 | this: 219 | ``` 220 | { 221 | "accountid": "snowflake_account_id", 222 | "username": "snowflake_user", 223 | "password": "snowflake_password" 224 | } 225 | ``` 226 | 227 | - Leave the default encryption key selected and click next. 228 | 229 | - Give a name to your Secret and click next. 230 | 231 | - You can leave the remaining options unchanged and click **Store** on 232 | the final screen. 233 | 234 | ### Deploying the CloudFormation Template 235 | 236 | Sign in to your AWS account, and from the upper-right corner of the 237 | navigation bar choose the Region you want the resources created by the 238 | CloudFormation Template to be set up in. It is recommended to deploy the 239 | AWS resources to the same region the Snowflake deployment runs on. 240 | 241 | #### Upload the Template 242 | 243 | 1. Go to the AWS Management Console. 244 | 245 | 2. In the top search bar, search for **CloudFormation**. 246 | 247 | 3. Under Services, click on **CloudFormation**. 248 | 249 | 4. Click on **Create stack**.\ 250 | If given a choice between **With new resources (standard)** or 251 | **With existing resources (import resources)**, then choose **With 252 | new resources (standard)**. 253 | 254 | 5. On the **Create stack** page, under **Prepare template**, select 255 | **Template is ready**. 
256 | 257 | 6. Select **Upload a template file**. 258 | 259 | 7. Select **Choose file**. 260 | 261 | 8. Navigate to the directory that contains your copy of the template, 262 | then select that template. 263 | 264 | 9. Click **Next** to reach the page on which you enter names for 265 | resources, etc. 266 | 267 | #### Configure Your Options 268 | 269 | The template contains default values for most fields. However, you 270 | need to enter a few values, such as the names for the resources and the 271 | ARN to the AWS Secret Manager. 272 | 273 | 1. Enter a name for the stack. 274 | 275 | 2. **apiGatewayName** - Enter the name of the API Gateway to be 276 | created. Default name will be snowflake-autopilot-api. 277 | 278 | 3. **apiGatewayStageName** - Enter the name of the API deployment stage 279 | to be created. Default name will be snowflake-autopilot-stage. 280 | 281 | 4. **s3BucketName** - Enter the name of the S3 bucket to be created to 282 | store the training data and artifacts produced by the AutoML jobs. 283 | 284 | 5. **kmsKeyArn** - Optional parameter. Enter ARN of the AWS Key 285 | Management Service key that Amazon SageMaker can use to encrypt 286 | job outputs. The KmsKeyId is applied to all outputs. 287 | 288 | 6. **snowflakeDatabaseName** - Enter the name of the Snowflake Database 289 | in which to create the external functions and request translators. 290 | 291 | 7. **snowflakeSchemaName** - Enter the name of the Snowflake Database 292 | Schema in which to create the external functions and request 293 | translators. 294 | 295 | 8. **snowflakeResourceSuffix** - Optional parameter. Enter a unique 296 | suffix that can be appended to the Snowflake resources created. 297 | This suffix will be added to all the functions created in the 298 | provided Snowflake database schema. 
299 | 300 | ***Note:** If you have multiple users deploying the template to the 301 | same Snowflake account and using the same Snowflake database and 302 | schemas it's recommended to provide the snowflakeResourceSuffix in 303 | order to prevent overriding of any existing resources deployed by 304 | other users.* 305 | 306 | 9. **snowflakeRole** - Enter the name of the Snowflake Role with 307 | permissions to create storage integrations, API integrations and 308 | functions. Default value will be the ACCOUNTADMIN role. 309 | 310 | 10. **snowflakeSecretArn** - Enter the ARN of the secret from AWS 311 | Secrets Manager containing the Snowflake login information. 312 | 313 | 11. Click **Next**.\ 314 | This page has some advanced options for template deployment. 315 | 316 | 1. Optionally, set advanced options, such as stack policy. These 317 | are not needed when creating the sample function using the 318 | template supplied by Snowflake. 319 | 320 | 2. Click **Next**. 321 | 322 | 12. On the review page, scroll down to the end and acknowledge that the 323 | CloudFormation template might create IAM resources with custom 324 | names. This is needed because the template creates three IAM roles 325 | as part of the deployment. 326 | 327 | 13. Click on **Create stack**. 328 | 329 | The deployment will take a few seconds. After the deployment is 330 | complete, you should be on the **Events** tab for the newly created 331 | stack. The created resources will be listed under the **Resources** 332 | tab. 333 | 334 | If the deployment of the CloudFormation template was successful, you 335 | now have all the required resources created on the AWS and Snowflake 336 | side required for the integration. 337 | 338 | ## Working with SageMaker APIs from Snowflake 339 | 340 | 1. Login to your Snowflake account in which the resources have been 341 | created by the CloudFormation template. 342 | 343 | 2. The template should have set up: 344 | 345 | a. 
Storage Integration with the name: `AWS_AUTOPILOT_STORAGE_INTEGRATION_YOURSTACKNAME` 346 | 347 | b. API Integration with the name: `AWS_AUTOPILOT_API_INTEGRATION_YOURSTACKNAME` 348 | 349 | You can use the SQL command `SHOW INTEGRATIONS LIKE '%AWS_AUTOPILOT%'` to see the integrations created and use the [DESCRIBE 350 | INTEGRATION](https://docs.snowflake.com/en/sql-reference/sql/desc-integration.html) 351 | command to get details on properties of a particular integration. 352 | 353 | **Note:** Since API and storage integrations are account-level 354 | objects, in order to avoid overriding existing integrations, the names 355 | are appended with the stack name provided as input during cloud 356 | formation template deployment. 357 | 358 | c. The following external functions and translators (JavaScript 359 | functions) are displayed: 360 | 361 | - `AWS_AUTOPILOT_CREATE_MODEL` 362 | - `AWS_AUTOPILOT_CREATE_MODEL_REQUEST_TRANSLATOR` 363 | - `AWS_AUTOPILOT_CREATE_MODEL_RESPONSE_TRANSLATOR` 364 | - `AWS_AUTOPILOT_DESCRIBE_MODEL` 365 | - `AWS_AUTOPILOT_DESCRIBE_MODEL_REQUEST_TRANSLATOR` 366 | - `AWS_AUTOPILOT_DESCRIBE_MODEL_RESPONSE_TRANSLATOR` 367 | - `AWS_AUTOPILOT_PREDICT_OUTCOME` 368 | - `AWS_AUTOPILOT_PREDICT_OUTCOME_REQUEST_TRANSLATOR` 369 | - `AWS_AUTOPILOT_PREDICT_OUTCOME_RESPONSE_TRANSLATOR` 370 | - `AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG` 371 | - `AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG_REQUEST_TRANSLATOR` 372 | - `AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG_RESPONSE_TRANSLATOR` 373 | - `AWS_AUTOPILOT_DESCRIBE_ENDPOINT_CONFIG` 374 | - `AWS_AUTOPILOT_DESCRIBE_ENDPOINT_CONFIG_REQUEST_TRANSLATOR` 375 | - `AWS_AUTOPILOT_DESCRIBE_ENDPOINT_CONFIG_RESPONSE_TRANSLATOR` 376 | - `AWS_AUTOPILOT_DELETE_ENDPOINT_CONFIG` 377 | - `AWS_AUTOPILOT_DELETE_ENDPOINT_CONFIG_REQUEST_TRANSLATOR` 378 | - `AWS_AUTOPILOT_DELETE_ENDPOINT_CONFIG_RESPONSE_TRANSLATOR` 379 | - `AWS_AUTOPILOT_CREATE_ENDPOINT` 380 | - `AWS_AUTOPILOT_CREATE_ENDPOINT_REQUEST_TRANSLATOR` 381 | - 
`AWS_AUTOPILOT_CREATE_ENDPOINT_RESPONSE_TRANSLATOR` 382 | - `AWS_AUTOPILOT_DESCRIBE_ENDPOINT` 383 | - `AWS_AUTOPILOT_DESCRIBE_ENDPOINT_REQUEST_TRANSLATOR` 384 | - `AWS_AUTOPILOT_DESCRIBE_ENDPOINT_RESPONSE_TRANSLATOR` 385 | - `AWS_AUTOPILOT_DELETE_ENDPOINT` 386 | - `AWS_AUTOPILOT_DELETE_ENDPOINT_REQUEST_TRANSLATOR` 387 | - `AWS_AUTOPILOT_DELETE_ENDPOINT_RESPONSE_TRANSLATOR` 388 | 389 | You can use the SQL command `SHOW FUNCTIONS LIKE '%AWS_AUTOPILOT%'` to see 390 | all the functions created and use the [DESCRIBE 391 | FUNCTION](https://docs.snowflake.com/en/sql-reference/sql/desc-function.html) 392 | command to get details on the specified function, including the 393 | signature (i.e. arguments), return value, language, and body (i.e. 394 | definition). 395 | 396 | **Note:** Since API and Storage integrations are account level 397 | objects, in order to avoid overriding existing integrations, the names 398 | are appended with the stack name provided as input during cloud 399 | formation template deployment. 400 | 401 | ### Create Model 402 | 403 | Use the `AWS_AUTOPILOT_CREATE_MODEL` external functions below to 404 | kick-off model creation on your data in a Snowflake table. 405 | 406 | #### Option 1 407 | 408 | **Syntax:** 409 | 410 | ``` 411 | AWS_AUTOPILOT_CREATE_MODEL(MODELNAME VARCHAR, TRAINING_TABLE_NAME VARCHAR, TARGET_COL VARCHAR) 412 | ``` 413 | 414 | **Arguments (all are required parameters):** 415 | 416 | `MODELNAME` - Name that will be used to refer to the best model found by Autopilot. Allowed Pattern: `^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}` 417 | 418 | `TRAINING_TABLE_NAME` - Name of the table from which to create the model. All rows will be considered to train the model. 419 | 420 | `TARGET_COL` - The name of the target column that we want the model to predict. 
421 | 422 | **Usage:** 423 | 424 | ``` 425 | select aws_autopilot_create_model ('abalonemodel', 'abalone_training_dataset', 'rings') 426 | ``` 427 | 428 | **Expected output on success:** 429 | 430 | ``` 431 | "Model creation in progress. Model ARN = 432 | arn:aws:sagemaker:us-west-2:631484165566:automl-job/abalonemodel-job." 433 | ``` 434 | 435 | 436 | - The command above kicks off an AutoML job. 437 | - The [Problem 438 | type](https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-problem-types.html) 439 | and [Objective 440 | metric](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLJobObjective.html#sagemaker-Type-AutoMLJobObjective-MetricName) 441 | are inferred automatically. 442 | 443 | - Depending on the size of the data, model creation can take 444 | anywhere from a few minutes for small datasets to 2-3 hours for 445 | large datasets (e.g., 5 GB). The default maximum run time of the AutoML 446 | job is 86400 seconds. If you want more control over the model 447 | creation time, you can use the advanced `AWS_AUTOPILOT_CREATE_MODEL` 448 | option and set the `MAX_RUNNING_TIME` field. **Note:** The parameter 449 | sets a timeout on the length of the training job; 450 | if the job has not finished within the specified limit, it is 451 | forcefully stopped and a model will NOT be created. If you would 452 | like to optimize for speed and have a model successfully created 453 | in a shorter duration, consider using the `MAX_CANDIDATES` parameter. 454 | 455 | - Use the [AWS_AUTOPILOT_DESCRIBE_MODEL](#describe-model) 456 | function to check the status of the job. 457 | 458 | - When the best model is found, Autopilot transparently deploys the 459 | model to a SageMaker Endpoint of the same name as the model. 
460 | 461 | - The `aws_autopilot_create_model` call creates a default endpoint 462 | configuration with the name `yourmodelname-m5-4xl-2`, with 463 | the following parameters: `"InitialInstanceCount": 2, 464 | "InstanceType": "ml.m5.4xlarge"`. Advanced users can go 465 | lower or higher depending on their dataset sizes and 466 | performance needs. See 467 | [AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG](#create-endpoint-config) 468 | for more details on specifying a custom endpoint 469 | configuration. (In the above example, the name of the endpoint 470 | configuration created would be `abalonemodel-m5-4xl-2`.) 471 | 472 | - Using the above endpoint config, the model will be deployed to 473 | an endpoint with the same name as the model. (In the above 474 | example, the endpoint name would be `abalonemodel`.) The time 475 | to live of the endpoint will be 604800 seconds (7 days), after 476 | which it is automatically deleted. 477 | 478 | - If you would like to redeploy the model after the endpoint has been 479 | deleted, use the 480 | [AWS_AUTOPILOT_CREATE_ENDPOINT](#create-endpoint) 481 | command. You can specify either the default endpoint 482 | configuration created or a custom endpoint 483 | configuration. 484 | 485 | **Note**: See 486 | [https://aws.amazon.com/sagemaker/pricing/](https://aws.amazon.com/sagemaker/pricing/) 487 | for details on instance pricing and to estimate costs. 488 | 489 | #### Option 2 490 | 491 | Advanced users who would like to specify different default values for 492 | the various optional parameters can use this variation of the 493 | `AWS_AUTOPILOT_CREATE_MODEL` call. 
494 | 495 | **Syntax:** 496 | 497 | ``` 498 | AWS_AUTOPILOT_CREATE_MODEL(MODELNAME VARCHAR, TRAINING_TABLE_NAME 499 | VARCHAR, TARGET_COL VARCHAR, OBJECTIVE_METRIC VARCHAR, PROBLEM_TYPE 500 | VARCHAR,MAX_CANDIDATES INTEGER, MAX_RUNNING_TIME INTEGER, 501 | DEPLOY_MODEL BOOLEAN, MODEL_ENDPOINT_TTL INTEGER) 502 | ``` 503 | 504 | **Arguments:** 505 | 506 | `MODELNAME` (required) - Name that will be used to refer to the best model found by Autopilot. Allowed Pattern: `^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}` 507 | 508 | `TRAINING_TABLE_NAME` (required) - Name of the table from which to create the model. All rows will be used to train the model. 509 | 510 | `TARGET_COL` (required) - The name of the target column that we want the model to predict. 511 | 512 | `OBJECTIVE_METRIC` (optional) - One of "Accuracy", "MSE", "AUC", "F1", or "F1macro". If NULL, Autopilot will infer this information automatically. 513 | 514 | `PROBLEM_TYPE` (optional) - Type of problem: "Regression", "BinaryClassification", "MulticlassClassification" or "Auto". If NULL, the default value will be set to "Auto". 515 | 516 | `MAX_CANDIDATES` (optional) - Maximum number of times a training job is allowed to run. Valid values are integers 1 and higher. Can be used to optimize for speed and have the create model call complete sooner by limiting the number of candidates explored. If NULL, Autopilot will infer this information automatically. **Note:** When optimizing for `OBJECTIVE_METRIC`, we suggest leaving this field unset so that the AutoML job can explore all possible candidates and pick the best one. 517 | 518 | `MAX_RUNNING_TIME` (optional) - Maximum runtime, in seconds, an AutoML job has to complete. If NULL, the default value will be set to 86400 seconds. **Note:** The parameter sets a timeout on the length of the training job; if the job has not finished within the specified limit, it is forcefully stopped and a model will NOT be created. 
If you would like to optimize for speed and have a model successfully created in a shorter duration, consider using the `MAX_CANDIDATES` parameter. 519 | 520 | `DEPLOY_MODEL` (optional) - TRUE or FALSE. If NULL, the default value will be TRUE and the best model will be transparently deployed to a SageMaker Endpoint. The default endpoint configuration used is as follows: `"InitialInstanceCount": 2, "InstanceType": "ml.m5.4xlarge"`. Advanced users can go lower or higher depending on their dataset sizes and performance needs. See [AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG](#create-endpoint-config) for more details on specifying a custom endpoint configuration. 521 | 522 | `MODEL_ENDPOINT_TTL` (optional) - Time to live of the model endpoint in seconds. If NULL, the default value will be 7 days. 523 | 524 | **Note:** See [https://aws.amazon.com/sagemaker/pricing/](https://aws.amazon.com/sagemaker/pricing/) 525 | for details on instance pricing and to estimate costs. 526 | 527 | **Usage:** 528 | 529 | ``` 530 | select aws_autopilot_create_model ('abalonemodel', 'abalone_training_dataset', 'rings', 'Accuracy', 'MulticlassClassification', NULL, 20000, 'True', 86400 ) 531 | ``` 532 | 533 | **Note:** External functions do not support optional parameters. Optional arguments that you want to skip must be specified as NULL. 534 | 535 | **Expected output on success:** 536 | 537 | ``` 538 | "Model creation in progress. Model ARN = 539 | arn:aws:sagemaker:us-west-2:631484165566:automl-job/abalonemodel-job." 540 | ``` 541 | 542 | ### Describe Model 543 | 544 | Use the `AWS_AUTOPILOT_DESCRIBE_MODEL` external function in a SQL query to check the status and track progress of your Autopilot training job and the model. 545 | 546 | **Syntax:** 547 | 548 | ``` 549 | AWS_AUTOPILOT_DESCRIBE_MODEL(MODELNAME VARCHAR) 550 | ``` 551 | 552 | **Arguments:** 553 | 554 | `MODELNAME` (required) - Name of the model. 
555 | 556 | **Usage:** 557 | 558 | ``` 559 | select aws_autopilot_describe_model ('abalonemodel') 560 | ``` 561 | 562 | **The response includes the following information:** 563 | 564 | **Job status**: "Completed", "InProgress", "Failed", "Stopped", "Stopping" 565 | 566 | **Job status detail**: Starting, AnalyzingData, FeatureEngineering, ModelTuning, MaxCandidatesReached, Failed, Stopped, MaxAutoMLJobRuntimeReached, Stopping, DeployingModel, CandidateDefinitionsGenerated 567 | 568 | **Problem type:** "Regression", "BinaryClassification" or "MulticlassClassification". 569 | 570 | **Objective metric:** "Accuracy", "MSE", "AUC", "F1", or "F1macro". 571 | 572 | **Best Objective Metric Value:** Value of the objective metric for the best model found so far. 573 | 574 | **Failure reason:** Returns the reason for failure, if the status was "Failed". 575 | 576 | ### Predict Outcome 577 | 578 | Use the `AWS_AUTOPILOT_PREDICT_OUTCOME` external function in a SQL query to make predictions using the ML model produced by Autopilot. 579 | 580 | **Syntax:** 581 | 582 | ``` 583 | AWS_AUTOPILOT_PREDICT_OUTCOME(MODEL_ENDPOINT_NAME VARCHAR,COLUMNS ARRAY) 584 | ``` 585 | 586 | **Arguments:** 587 | 588 | `MODEL_ENDPOINT_NAME` (required) - Name of the endpoint the model is deployed to. Note: Unless the model was manually deployed to a custom endpoint, this will be the same as the model name. 589 | 590 | `COLUMNS` (required) - Array of values or feature columns to pass as inputs for model prediction. The ordering should match that of the training dataset, minus the target column. 
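The ordering rule for `COLUMNS` can be sketched in Python. This is a minimal sketch and not part of the integration: `predict_outcome_sql` is a hypothetical helper, and the column list is assumed to be the training table's columns in table order (here the abalone columns used in this document's examples, with `rings` as the target). Names are interpolated directly into the SQL text, so the sketch assumes trusted identifiers.

```python
from typing import Sequence


def predict_outcome_sql(
    endpoint_name: str,
    training_columns: Sequence[str],
    target_col: str,
    source_table: str,
) -> str:
    """Build a predict call whose feature order mirrors the training table.

    Hypothetical helper: identifiers are assumed to be trusted, because they
    are interpolated into the SQL text as-is.
    """
    # Keep the training order, dropping only the target column.
    features = [c for c in training_columns if c.lower() != target_col.lower()]
    if len(features) == len(training_columns):
        raise ValueError(f"target column {target_col!r} not in training columns")
    return (
        f"select aws_autopilot_predict_outcome('{endpoint_name}', "
        f"array_construct({', '.join(features)})) as prediction "
        f"from {source_table}"
    )


# Assumed column order of the abalone training table used in this document.
ABALONE_COLUMNS = [
    "sex", "length", "diameter", "height", "whole_weight",
    "shucked_weight", "viscera_weight", "shell_weight", "rings",
]

print(predict_outcome_sql("abalonemodel", ABALONE_COLUMNS, "rings", "abalone_test_dataset"))
```

Deriving the feature list from the training column order, rather than typing it by hand, keeps the prediction inputs aligned with what the model saw during training if the table definition changes.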
591 | 592 | **Usage:** 593 | 594 | ``` 595 | select aws_autopilot_predict_outcome ('abalonemodel', array_construct('M',0.455, 0.365, 0.095, 0.514, 0.2245, 0.101, 0.15)); 596 | 597 | select aws_autopilot_predict_outcome ('abalonemodel', array_construct(sex, length, diameter, height, whole_weight, shucked_weight, viscera_weight, shell_weight) 598 | 599 | ) as prediction 600 | 601 | from abalone_test_dataset; 602 | ``` 603 | 604 | **Response**: 605 | 606 | Returns the predicted target value for each row of attributes. 607 | 608 | ### Create Endpoint Config 609 | 610 | Use the `AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG` external function in a 611 | SQL query to create an endpoint configuration that Amazon SageMaker 612 | hosting services use to deploy models. 613 | 614 | This allows advanced users to pick a custom endpoint configuration to 615 | go lower or higher depending on their dataset sizes and performance 616 | needs compared to the default endpoint configuration used by the 617 | create model call. 618 | 619 | **Syntax:** 620 | 621 | ``` 622 | AWS_AUTOPILOT_CREATE_ENDPOINT_CONFIG(ENDPOINTCONFIG_NAME 623 | VARCHAR,MODELNAME VARCHAR,INSTANCE_TYPE VARCHAR,INSTANCE_COUNT 624 | NUMBER) 625 | ``` 626 | 627 | **Arguments (all are required parameters):** 628 | 629 | `ENDPOINT_CONFIG_NAME`- The name of the endpoint configuration. You specify this name in a CreateEndpoint request. Allowed Pattern: `^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}` 630 | 631 | `MODELNAME` - The name of the model that you want to host. This is the name that you specified when creating the model. 632 | 633 | `INSTANCE_TYPE` - The ML compute instance type. See [SageMaker instance types](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ProductionVariant.html#sagemaker-Type-ProductionVariant-InstanceType) 634 | for more details. 635 | 636 | `INSTANCE_COUNT` - Number of instances to launch. 
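The allowed pattern for names can be checked on the client before issuing the SQL call. The following is a minimal Python sketch, not part of the integration: `is_valid_name` is a hypothetical helper, and requiring the whole string to match (via `fullmatch`) is an assumption, since the documented pattern has no end anchor.

```python
import re

# Pattern quoted in this document for model, endpoint, and endpoint config names.
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")


def is_valid_name(name: str) -> bool:
    """Return True if the whole name matches the documented pattern.

    Using fullmatch (rather than a prefix match) is this sketch's assumption.
    """
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ("abalone-endpoint-config", "abalone_endpoint_config", "-abalone"):
    print(candidate, is_valid_name(candidate))
```

Under this reading, a name may contain letters, digits, and hyphens, must start and end with a letter or digit, and may contain at most 63 alphanumeric characters.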
637 | 638 | **Usage:** 639 | 640 | ``` 641 | select aws_autopilot_create_endpoint_config ( 642 | 'abalone-endpoint-config','abalonemodel', 'ml.c5d.4xlarge', 3) 643 | ``` 644 | 645 | ### Describe Endpoint Config 646 | 647 | Use the `AWS_AUTOPILOT_DESCRIBE_ENDPOINT_CONFIG` external function in a SQL query to get the description of an endpoint configuration that was created using the Create Endpoint Config call. 648 | 649 | **Syntax:** 650 | 651 | ``` 652 | AWS_AUTOPILOT_DESCRIBE_ENDPOINT_CONFIG(ENDPOINTCONFIG_NAME) 653 | ``` 654 | 655 | **Arguments (all are required parameters):** 656 | 657 | `ENDPOINT_CONFIG_NAME`- The name of the endpoint configuration. 658 | 659 | **Usage:** 660 | 661 | ``` 662 | select aws_autopilot_describe_endpoint_config 663 | ('abalone-endpoint-config') 664 | ``` 665 | 666 | **Response**: 667 | 668 | `ModelName` - The name of the model to be hosted. 669 | 670 | `InstanceCount` - Number of instances to launch. 671 | 672 | `InstanceType` - The ML compute instance type. 673 | 674 | ### Delete Endpoint Config 675 | 676 | Use the `AWS_AUTOPILOT_DELETE_ENDPOINT_CONFIG` external function in a 677 | SQL query to delete an endpoint configuration. This command deletes 678 | only the specified configuration. It does not delete endpoints created 679 | using the configuration. 680 | 681 | **Syntax:** 682 | 683 | ``` 684 | AWS_AUTOPILOT_DELETE_ENDPOINT_CONFIG(ENDPOINTCONFIG_NAME) 685 | ``` 686 | 687 | **Arguments (all are required parameters):** 688 | 689 | `ENDPOINT_CONFIG_NAME`- The name of the endpoint configuration. 690 | 691 | **Usage:** 692 | 693 | ``` 694 | select aws_autopilot_delete_endpoint_config ('abalone-endpoint-config') 695 | ``` 696 | 697 | ### Create Endpoint 698 | 699 | Use the `AWS_AUTOPILOT_CREATE_ENDPOINT` external function in a SQL query 700 | to create an endpoint using the endpoint configuration specified in 701 | the request. Amazon SageMaker uses the endpoint to provision resources 702 | and deploy models. 
703 | 704 | 705 | **Syntax:** 706 | 707 | ``` 708 | AWS_AUTOPILOT_CREATE_ENDPOINT(ENDPOINT_NAME VARCHAR, ENDPOINT_CONFIG_NAME VARCHAR,MODEL_ENDPOINT_TTL INTEGER) 709 | ``` 710 | 711 | **Arguments:** 712 | 713 | `ENDPOINT_NAME` (required) - The name of the endpoint. The exact endpoint name must be provided during inference. Allowed Pattern: `^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}` 714 | 715 | `ENDPOINT_CONFIG_NAME` (required) - The name of the endpoint configuration. 716 | 717 | **Note:** If you would like to reuse the default endpoint config created during model creation, this would be `yourmodelname-m5-4xl-2`. 718 | 719 | `MODEL_ENDPOINT_TTL` (optional) - Time to live of the model endpoint in seconds. If NULL, the default value will be 7 days. 720 | 721 | **Usage:** 722 | 723 | ``` 724 | select aws_autopilot_create_endpoint ('abalone-endpoint', 'abalone-endpoint-config', 36000) 725 | ``` 726 | 727 | ### Describe Endpoint 728 | 729 | Use the `AWS_AUTOPILOT_DESCRIBE_ENDPOINT` external function in a SQL 730 | query to get the description of an endpoint. 731 | 732 | **Syntax:** 733 | 734 | ``` 735 | AWS_AUTOPILOT_DESCRIBE_ENDPOINT(ENDPOINT_NAME VARCHAR) 736 | ``` 737 | 738 | **Arguments (all are required parameters):** 739 | 740 | `ENDPOINT_NAME` - The name of the endpoint. 741 | 742 | **Usage:** 743 | 744 | ``` 745 | select aws_autopilot_describe_endpoint('abalone-endpoint') 746 | ``` 747 | 748 | **Response:** 749 | 750 | `CreationTime` - A timestamp that shows when the endpoint was created. 751 | 752 | `EndpointConfigName` - The name of the endpoint configuration associated with this endpoint. 753 | 754 | `EndpointStatus` - The status of the endpoint. (Valid values: OutOfService \| Creating \| Updating \| SystemUpdating \| RollingBack \| InService \| Deleting \| Failed) 755 | 756 | `FailureReason` - If the status of the endpoint is Failed, the reason why it failed. 
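Because endpoint creation is asynchronous, a client typically polls the describe call until `EndpointStatus` becomes `InService`. The following is a minimal Python sketch of that loop, not part of the integration. The `describe` callable is hypothetical: it is assumed to run the describe endpoint query (for example through the Snowflake Python connector) and return a mapping with the response fields listed above. Treating only `Failed` as an error status is also an assumption.

```python
import time
from typing import Callable, Mapping


def wait_for_endpoint(
    describe: Callable[[str], Mapping[str, str]],
    endpoint_name: str,
    max_polls: int = 60,
    interval_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll `describe` until the endpoint is InService or has failed.

    `describe` is a hypothetical callable returning a mapping that contains
    at least "EndpointStatus" and, on failure, "FailureReason".
    """
    for attempt in range(max_polls):
        info = describe(endpoint_name)
        status = info["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            reason = info.get("FailureReason", "no reason reported")
            raise RuntimeError(f"{endpoint_name} failed: {reason}")
        # Any other status (Creating, Updating, ...) means keep waiting,
        # but do not sleep after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"{endpoint_name} not InService after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays; in normal use the default `time.sleep` applies.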
757 | 758 | ### Delete Endpoint 759 | 760 | Use the `AWS_AUTOPILOT_DELETE_ENDPOINT` external function in a SQL query 761 | to delete an endpoint. Amazon SageMaker frees up all of the resources 762 | that were deployed when the endpoint was created. 763 | 764 | **Syntax:** 765 | ``` 766 | AWS_AUTOPILOT_DELETE_ENDPOINT(ENDPOINT_NAME VARCHAR) 767 | ``` 768 | 769 | **Arguments (all are required parameters):** 770 | 771 | `ENDPOINT_NAME` - The name of the endpoint. 772 | 773 | **Usage:** 774 | 775 | ``` 776 | select aws_autopilot_delete_endpoint('abalone-endpoint') 777 | ``` 778 | 779 | ## SageMaker Clarify and SageMaker Studio 780 | 781 | Amazon SageMaker Clarify provides machine learning developers with 782 | greater visibility into their training data and models so they can 783 | identify and limit bias and explain predictions. During the model 784 | training process, SageMaker Autopilot automatically creates a notebook 785 | (and PDF report) that displays the 10 features with the greatest feature 786 | attribution. The notebook is stored in: 787 | 788 | `/output//documentation/explainability/output/` 789 | 790 | Additional information about the generated model can be found in Amazon 791 | SageMaker Studio. 792 | 793 | ## Costs 794 | 795 | There is no additional cost for using the provided Snowflake + Amazon 796 | SageMaker Autopilot Integration. 797 | 798 | You are responsible for: 799 | 800 | - The cost of the AWS services and Snowflake compute and storage used 801 | while running this reference deployment. 802 | 803 | The AWS CloudFormation template includes configuration parameters that 804 | you can customize. Some of these settings, such as instance type, affect 805 | the cost of deployment. For cost estimates, see the pricing pages for 806 | each AWS service you use. Prices are subject to change. 
807 | 808 | **Tip:** After you deploy the template, [create AWS Cost and Usage 809 | Reports](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/billing-reports-gettingstarted-turnonreports.html) 810 | to track AWS costs associated with the integration. These reports 811 | deliver billing metrics to an Amazon Simple Storage Service (Amazon S3) 812 | bucket in your account. They provide cost estimates based on usage 813 | throughout each month and aggregate the data at the end of the month. 814 | For more information about the report, see [What are AWS Cost and Usage 815 | Reports?](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/billing-reports-costusage.html) 816 | 817 | ## Cleanup 818 | 819 | To clean up the resources created by the integration: 820 | 821 | - Delete any SageMaker endpoints that were provisioned while using the 822 | integration. You can do this by: 823 | 824 | - Using the [Delete Endpoint](#delete-endpoint) SQL command 825 | from Snowflake or 826 | 827 | - Opening the Amazon SageMaker console at 828 | [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/) 829 | and deleting the endpoints. Deleting the endpoints also 830 | deletes the ML compute instances that support them. 831 | 832 | - Under Inference, choose Endpoints. 833 | 834 | - Choose the endpoint that you created, choose Actions, and 835 | then choose Delete. 836 | 837 | - Delete any SageMaker endpoint configurations that were provisioned 838 | while using the integration. You can do this by: 839 | 840 | - Using the [Delete Endpoint 841 | Config](#delete-endpoint-config) SQL command from 842 | Snowflake or 843 | 844 | - Opening the Amazon SageMaker console at 845 | [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/) 846 | and: 847 | 848 | - Under Inference, choose Endpoint configurations.
849 | 850 | - Choose the endpoint configurations that you created, choose 851 | Actions, and then choose Delete. 852 | 853 | - Delete any SageMaker Autopilot models that were created. You can do 854 | this by: 855 | 856 | - Opening the Amazon SageMaker console at 857 | [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/) 858 | and: 859 | 860 | - Under Inference, choose Models. 861 | 862 | - Choose the model that you created, choose Actions, and then 863 | choose Delete. 864 | 865 | - Log in to the AWS console and navigate to the CloudFormation service. 866 | Select the stack that was created when you deployed the template 867 | and choose Delete. This deletes all the AWS resources 868 | provisioned by the template, except the S3 bucket. The S3 bucket is 869 | not automatically deleted because it might contain training data and 870 | outputs from the Autopilot jobs. 871 | 872 | - To delete the S3 bucket, navigate to the S3 service 873 | and manually delete the bucket. For more information, see 874 | [Deleting a 875 | bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/delete-bucket.html). 876 | 877 | - Clean up the Snowflake resources by logging in to the Snowflake 878 | console and doing the following: 879 | 880 | - Use the [DROP 881 | INTEGRATION](https://docs.snowflake.com/en/sql-reference/sql/drop-integration.html#drop-integration) 882 | SQL command to delete the API and storage integrations that were set up. 883 | 884 | Note: You can use the SQL command `SHOW INTEGRATIONS LIKE '%AWS_AUTOPILOT%'` to see the integrations. 885 | 886 | - Use the [DROP 887 | FUNCTION](https://docs.snowflake.com/en/sql-reference/sql/drop-function.html) 888 | SQL command to delete the user-defined functions that were set up. 889 | 890 | Note: You can use the SQL command `SHOW FUNCTIONS LIKE '%AWS_AUTOPILOT%'` to see all the functions. 891 | --------------------------------------------------------------------------------