├── .gitignore
├── README.md
├── client.html
├── create_secure_vw.sql
├── handler.py
├── images
│   ├── architecture.png
│   ├── aws_account.png
│   ├── client.png
│   ├── deployed.png
│   ├── environment.png
│   ├── key_arn.png
│   ├── private_key.png
│   ├── profile.png
│   └── response.png
├── keypair_auth.py
├── package-lock.json
├── package.json
├── requirements.txt
├── serverless.yml
└── state_machine.py
/.gitignore:
--------------------------------------------------------------------------------
1 | # Distribution / packaging
2 | .DS_Store
3 | .Python
4 | node_modules/
5 | env/
6 | build/
7 | develop-eggs/
8 | dist/
9 | downloads/
10 | eggs/
11 | .eggs/
12 | lib/
13 | lib64/
14 | parts/
15 | sdist/
16 | var/
17 | *.egg-info/
18 | .installed.cfg
19 | *.egg
20 |
21 | # Serverless directories
22 | .serverless
23 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Snowflake API for Analytics Applications
2 | ========================================
3 |
4 | This is an example of how to build a serverless API that uses Snowflake's elastic data warehouse as the DB engine for analytics applications. This example implements a read-only API that can run, and paginate through the results of, any view defined within a Snowflake database to which it has access. It also leverages advanced Snowflake features like Multi-cluster Warehouses and multiple caching layers to build a scalable and performant analytics API at a fraction of the cost of legacy systems.
5 |
6 | Requirements
7 | ============
8 |
9 | - Access to a Snowflake account. Get your free trial account at: https://trial.snowflake.com/
10 |
11 | - Administrative access to an AWS account: https://aws.amazon.com/
12 |
13 |
14 | Architecture
15 | ===============
16 |
17 | This API is implemented as a completely serverless solution leveraging
18 | various AWS services including Lambda, API Gateway and Step Functions.
19 |
20 | 
21 |
22 | 1. Typical REST endpoints time out after a few minutes, so we've
23 | implemented this analytics API on websockets, which lend themselves
24 | better to long-running analytical queries. The client initiates a
25 | connection and sends a message: `{"action": "run_view",
26 | "view_name": "trip_weather_vw"}`
27 |
28 | 2. The request is routed to a Lambda function which, in turn, triggers
29 | an AWS Step Functions workflow.
30 |
31 | 3. The Step Functions workflow initiates the query/view in Snowflake
32 | using the Snowflake connector for Python. The workflow then checks
33 | the status of the job in Snowflake every 5 seconds.
34 |
35 | 4. Once the query/view finishes execution, the response (either results
36 | or error) is sent back to the client asynchronously.
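
The polling behaviour in step 3 can be sketched in plain Python. This is only an illustration of the pattern, not the repo's code: in the deployed stack the waiting is done by the Step Functions workflow rather than by a loop inside a Lambda function. The function name `poll_until_done`, the status strings, and the `max_attempts` limit are all assumptions made for this sketch.

```python
import time
from typing import Callable


def poll_until_done(
    check_status: Callable[[], str],
    interval_seconds: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call check_status until it stops returning "RUNNING".

    check_status stands in for whatever asks Snowflake about the job;
    the status names used here are hypothetical.
    """
    for _ in range(max_attempts):
        status = check_status()
        if status != "RUNNING":
            # Either results or an error are ready to send to the client.
            return status
        # Wait before checking again (5 seconds, as in the workflow).
        sleep(interval_seconds)
    return "TIMED_OUT"


if __name__ == "__main__":
    # Simulated job that finishes on the third status check.
    statuses = iter(["RUNNING", "RUNNING", "SUCCESS"])
    print(poll_until_done(lambda: next(statuses), sleep=lambda s: None))
```

Passing `sleep` as a parameter keeps the sketch easy to try without actually waiting between checks.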
37 |
38 | Setup & Configuration
39 | ===============================
40 |
41 | 1. Install the latest NodeJS. You can check whether NodeJS is already installed by running:
42 | ```
43 | node --version
44 | ```
45 |
46 | 2. This lab uses the **Serverless** framework,
47 | which you can install globally using:
48 | ```
49 | sudo npm install -g serverless
50 | ```
51 | If installed successfully, you should be able to now check the installed version:
52 | ```
53 | serverless --version
54 | ```
55 |
56 | 3. The API is implemented using Python 3. Check to see if you have Python 3
57 | installed on your machine:
58 | ```
59 | python --version
60 | ```
61 | OR
62 | ```
63 | python3 --version
64 | ```
65 | If not installed, download and install Python 3.
66 |
67 |
68 | 4. If you don't already have it, install the AWS CLI using pip3:
69 | ```
70 | pip3 install awscli --upgrade --user
71 | ```
72 | You can use the `aws --version` command to verify that the AWS CLI was
73 | correctly installed. If it wasn't, see the AWS CLI documentation to troubleshoot.
74 |
75 | If installing for the first time, you will need to configure AWS CLI by using:
76 | ```
77 | aws configure
78 | ```
79 | Make sure you have your AWS credentials handy when you configure the AWS CLI.
80 |
81 | 5. You can check whether Git is already installed by running:
82 | ```
83 | git --version
84 | ```
85 | If not, install Git:
86 | https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
87 |
88 | 6. You will also need Docker to deploy the API. Download and install from here: https://www.docker.com/products/docker-desktop
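
The individual version checks above can be combined into one quick check. The following is a minimal sketch, assuming the tools are exposed on `PATH` under the command names used in this section (`python3` is assumed here, although some systems only provide `python`). The helper name `find_missing_tools` is hypothetical and not part of this repo.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Command names taken from the setup steps above.
REQUIRED_TOOLS = ("node", "serverless", "python3", "aws", "git", "docker")


def find_missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All prerequisite tools were found on PATH.")
```

This only confirms that each command exists; it does not check versions or whether the AWS CLI has been configured.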
89 |
90 | Snowflake Setup
91 | ===============
92 |
93 | Before we get into building the API, let's set up our backend Snowflake environment correctly
94 | so we have all the parameters ready when it comes time to edit the API code.
95 |
96 | 1. Create and save the RSA public and private keys using the procedure described here:
97 | https://docs.snowflake.net/manuals/user-guide/snowsql-start.html#using-key-pair-authentication
98 |
99 | Jot down the passphrase you used to encrypt the private key.
100 |
101 | 2. This example API is read-only and will get data by running a particular view within Snowflake. Let's go ahead and create a view that the API can run. Log in to your Snowflake account and run the following SQL statements:
102 |
103 | ```sql
104 | use role accountadmin;
105 | create role if not exists snowflake_api_role;
106 | grant usage on database <database> to role snowflake_api_role;
107 | grant usage on schema <database>.<schema> to role snowflake_api_role;
108 | grant select on all views in schema <database>.<schema> to role snowflake_api_role;
109 | grant select on future views in schema <database>.<schema> to role snowflake_api_role;
110 | grant usage on warehouse <warehouse> to role snowflake_api_role;
111 |
112 | create user snowflake_api_user password='Snowfl*ke' default_role = snowflake_api_role must_change_password = false;
113 | alter user snowflake_api_user set rsa_public_key='<public_key>'; --exclude the header and footer
114 | grant role snowflake_api_role to user snowflake_api_user;
115 | grant role snowflake_api_role to user <your_user>;
116 | ```
117 |
118 | 3. Create a test view with some test data, switch to using the new `snowflake_api_role` and try a simple select to see if the permissions work:
119 |
120 | ```sql
121 | use role snowflake_api_role;
122 | select * from <view_name> limit 10;
123 | ```
124 |
125 | Clone, Modify and Deploy Code
126 | ========================================
127 |
128 | 1. Clone this repo:
129 | ```
130 | git clone https://github.com/filanthropic/snowflake-api.git
131 | ```
132 |
133 | 2. Before the Serverless framework can deploy this code, it needs the `serverless-python-requirements` plugin, so let's install that (the dependency is declared in `package.json`):
134 | ```
135 | cd snowflake-api/
136 | npm install
137 | ```
138 |
139 | 3. Open the AWS Secrets Manager and create a new secret that will hold the private key. Select 'Other type of secret' and then select `plaintext` and use `p_key` as the key and your private key that you generated in the Snowflake setup step 1 as the value.
140 |
141 | 
142 |
143 | Hit `Next`, give the secret a name and description. Hit `Next` again twice and then hit `Store`. Note the name you gave to the secret.
144 |
145 | 4. From within the `snowflake-api` directory, open `keypair_auth.py` and update the following line with the passphrase that you used when you created the key pair in Snowflake Setup step \#1:
146 | ```
147 | passkey = ""
148 | ```
149 |
150 | 5. Also within `keypair_auth.py`, update the `secret_name` variable with the name (not the ARN) of the secret you just created within AWS Secrets Manager:
151 | ```
152 | secret_name = ""
153 | ```
154 |
155 | 6. Open `serverless.yml`. The top of this file contains the 'service' -> 'name'
156 | configuration. Go ahead and change the service name to whatever you want to name this project.
157 |
158 | 7. Change the AWS account number in `serverless.yml`:
159 |
160 | 
161 |
162 | 8. If using the default AWS CLI profile, remove the `profile` attribute in `serverless.yml`. If using a named profile, change it to match the AWS CLI profile you want to use to deploy:
163 |
164 | 
165 |
166 | 9. In serverless.yml, update the last part of the ARN (not the name) of the secret that holds the private key you previously created:
167 |
168 | 
169 |
170 | 10. Update the rest of the environment variables to match your Snowflake account, warehouse name, database and schema name within `serverless.yml`.
171 |
172 | 
173 |
174 | 11. Now we are ready to deploy the API to AWS. Go to the 'snowflake-api'
175 | folder and deploy the serverless stack:
176 | ```
177 | serverless deploy
178 | ```
179 | The command above will take all the code you cloned, package it up as
180 | AWS Lambda functions and deploy them. It also creates the AWS API
181 | Gateway endpoint with websockets and the AWS Step Functions state
182 | machine that orchestrates the Lambda functions.
183 |
184 | 12. Go ahead and make note of the API endpoint that you just created.
185 | 
186 |
187 | Using the API
188 | =============
189 |
190 | The API is based on websockets because of the long-running nature of analytics queries. The best way to understand how the client interacts with the API is to first install the "wscat" tool.
191 |
192 | 1. Install "wscat":
193 | ```
194 | sudo npm install -g wscat
195 | ```
196 |
197 | 2. Connect to the API endpoint you noted in step \#12 of the previous section:
198 | ```
199 | wscat -c wss://
200 | ```
201 |
202 | 3. In the API code, we have implemented two websocket "routes", or types
203 | of actions, that the API supports. The first one, called
204 | "**run_view**", is used to run a particular view. The other one,
205 | called "**fetch_results**", is used to fetch cached results of an
206 | already run query and helps the client paginate through the results
207 | in an efficient manner.
208 |
209 | 4. Once connected, you can run the secure view you created previously
210 | by running:
211 | ```
212 | {"action": "run_view", "view_name":""}
213 | ```
214 |
215 | -------------------------------------------------------------------
216 |
217 | The response should look something like this:
218 |
219 | 
220 |
221 | 5. The response of the previous command should give you a `query_id`
222 | which you can use to paginate through the results:
223 | ```
224 | {"action": "fetch_results","query_id": "", "offset": "100"}
225 | ```
226 |
227 | 6. Open up the 'client.html' file in a browser to see how a simple HTML
228 | client can interact with our Snowflake API.
229 |
230 | 
231 |
232 |
233 |
246 |
--------------------------------------------------------------------------------
/client.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 | Snowflake API Client
12 |
13 |
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 |
27 |
28 |
29 |
Snowflake WebSocket API - Sample Client
30 |
31 |
32 |
33 |
34 |
35 |
36 |
37 |
38 |
39 |
40 | e.g. wss://xxxyyzz.execute-api.us-east-1.amazonaws.com/dev
41 |