├── README.md
└── swurl
/README.md:
--------------------------------------------------------------------------------
1 |
2 | # swurl
3 | A Python tool intended to provide very basic `curl`-like functionality, allowing you to make signed HTTP requests to AWS service endpoints over `socks5`.
4 |
5 |
6 |
7 | ## Requirements
8 | - awscli configured locally
9 | - python3 with the `boto3`, `botocore` and `requests[socks]` pip modules
10 | - SSH access to an EC2 instance in the same VPC, with port forwarding allowed
11 |
12 |
13 | ## Info
14 |
15 | This tool only exists so I could provide a way to communicate with an endpoint in a private VPC over a SOCKS proxy. It started as a simple request to help access a DB instance and soon turned into both a learning exercise and a guide. I've documented it below in case others find it useful.
16 |
17 | **NOTE:** There are other tools out there that already provide `curl`-like functionality that are probably better managed and supported.
18 |
19 | **Some thoughts after working on this:**
20 | - It would be cool if there were consistency between service names and service endpoints. That would let us derive both the `region` and the `service` from the endpoint itself (see the sketch after this list). For example, the service name required when signing requests for Neptune is `neptune-db`, yet all cluster and instance endpoints are in the format `{identifier}.{region}.neptune.amazonaws.com`
21 | - `botocore` should allow something other than `HTTPS_PROXY` as a way to provide proxy config, e.g. a dedicated environment variable such as `AWS_PROXY` or a configuration option. Not saying it should ignore the standard env var, but what if I only want to proxy `botocore` calls without affecting other applications?
22 | - `socks5` is awesome. Dynamic port forwarding over SSH is a powerful way to reach AWS resources and services when you don't have a VPN. With a single `ssh` command I have access to my VPC and multiple services which are not publicly exposed, e.g. `neptune` and `es`. This can be combined with AWS SSM to achieve access without exposing any public resources.
23 |
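To make that first point concrete, here is a rough sketch (not part of `swurl`; the helper is hypothetical) of what consistent naming would allow. The label we can pull out of the hostname is `neptune`, while signing actually requires `neptune-db`:
```
from urllib.parse import urlparse

def guess_region_and_service(url):
    # e.g. name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com
    labels = urlparse(url).hostname.split('.')
    return labels[-4], labels[-3]   # '.amazonaws.com' occupies the last two labels

print(guess_region_and_service(
    'https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/status'))
# ('ap-southeast-2', 'neptune')  -- close, but the signing name is actually 'neptune-db'
```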
24 |
25 | ## Accessing private resources inside a VPC
26 | So, you've launched a new AWS Neptune cluster and now need to query it remotely via the REST API. A few things worth noting:
27 |
28 | - Neptune must be provisioned inside a VPC
29 | - It will only respond to local requests from within the VPC
30 | - Neptune does not expose any public cluster or instance endpoints
31 | - All connections must be over TLS (enforced in `ap-southeast-2`)
32 | - Username/password auth is not an option for Neptune, so IAM authentication it is
33 |
34 |
35 | ## General strategies to provide access
36 | - Expose a public NLB that terminates SSL and proxies requests to the cluster endpoint
37 | - Expose a public ALB that forwards requests to `haproxy` running on EC2, which then proxies requests to Neptune
38 | - Set up API Gateway and expose an endpoint that triggers a Lambda function with permission to query Neptune and return the results. This could be set up to accept either IAM auth or a custom Lambda authoriser
39 | - A dedicated VPN connection to the VPC
40 |
41 | See some samples from AWS -> https://github.com/aws-samples/aws-dbs-refarch-graph/tree/master/src/connecting-using-a-load-balancer
42 |
43 |
44 | ## Access using SSH (and SSM?)
45 |
46 | Here we will go over some of the ways you can use SSH to access private resources in your VPC.
47 |
48 | The most common way is via a public-facing bastion you have access to:
49 | ```
50 | +--------------------------------------------+
51 | | +-----------------+ VPC |
52 | +----------------+ | | public subnet | +-----------------+ |
53 | | laptop/pc | | | +-----------+ | | private subnet | |
54 | | | | | | | | | +----------+ | |
55 | | +-------+ +-------------------+ |<-------->| neptune | | |
56 | | | app |<--->| SSH tunnel | instance | | | +----------+ | |
57 | | +-------+ +-------------------+ (EC2) | | | +--------+ | |
58 | | | | | | |<--------->| ES | | |
59 | | | | | +-----------+ | | +--------+ | |
60 | +----------------+ | +-----------------+ +-----------------+ |
61 | +--------------------------------------------+
62 | ```
63 |
64 |
65 | But by making use of AWS SSM we can access our Neptune cluster without exposing any public resources at all. Like below:
66 | ```
67 | +---------------------------------+
68 | +----------------+ | PRIVATE SUBNET IN VPC |
69 | | laptop/pc | | +-----------+ |
70 | | | | | | +----------+ |
71 | | +--------+ +-------------------+ |<---->| neptune | |
72 | | | app |<-->| SSH tunnel | instance | +----------+ |
73 | | +--------+ +-------------------+ (EC2) | +--------+ |
74 | | | | | |<---->| ES | |
75 | +----------------+ | +-----------+ +--------+ |
76 | +---------------------------------+
77 | ```
78 |
79 | For more information on SSH and SSM, see the official documentation here -> https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-getting-started-enable-ssh-connections.html
80 |
81 | I won't be going over how to set up SSM in this guide but continue reading for details on SSH.
82 |
83 |
84 | ## How to forward all the ports using SSH
85 |
86 | ### Local port forwarding
87 |
88 | One way of achieving this is to set up local port forwarding via SSH. For example:
89 | ```
90 | $ ssh -f -NT user@bastion.host.com -L 8182:name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182
91 | ```
92 | This tells SSH to bind to local port `8182` and forward all TCP traffic to the neptune endpoint on port `8182` via the `bastion`. This can be used to securely access unencrypted services (e.g. `HTTP` or `SMTP`) by using SSH to encrypt communication between your machine and the remote server.
93 |
94 | In our case we're required to communicate with `neptune` over TLS. One of the challenges for developers (and anyone else accessing these endpoints this way) is that local applications will complain about the SSL certificate presented by the endpoint not matching the hostname we're connecting to:
95 | ```
96 | $ curl https://localhost:8182/status -H 'Host: name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182'
97 | curl: (60) SSL: no alternative certificate subject name matches target host name 'localhost'
98 | More details here: https://curl.haxx.se/docs/sslcerts.html
99 | ```
100 | From `curl`'s verbose output:
101 | ```
102 | * Trying ::1:8182...
103 | * TCP_NODELAY set
104 | * Connected to localhost (::1) port 8182 (#0)
105 | * ALPN, offering h2
106 | * ALPN, offering http/1.1
107 | * successfully set certificate verify locations:
108 | * CAfile: /etc/pki/tls/certs/ca-bundle.crt
109 | CApath: none
110 | * TLSv1.3 (OUT), TLS handshake, Client hello (1):
111 | * TLSv1.3 (IN), TLS handshake, Server hello (2):
112 | * TLSv1.2 (IN), TLS handshake, Certificate (11):
113 | * TLSv1.2 (IN), TLS handshake, Server key exchange (12):
114 | * TLSv1.2 (IN), TLS handshake, Server finished (14):
115 | * TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
116 | * TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
117 | * TLSv1.2 (OUT), TLS handshake, Finished (20):
118 | * TLSv1.2 (IN), TLS handshake, Finished (20):
119 | * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
120 | * ALPN, server did not agree to a protocol
121 | * Server certificate:
122 | * subject: CN=*.identifier.ap-southeast-2.neptune.amazonaws.com
123 | * start date: Jan 15 00:00:00 2020 GMT
124 | * expire date: Feb 15 12:00:00 2021 GMT
125 | * subjectAltName does not match localhost
126 | * SSL: no alternative certificate subject name matches target host name 'localhost'
127 | * Closing connection 0
128 | * TLSv1.2 (OUT), TLS alert, close notify (256):
129 | curl: (60) SSL: no alternative certificate subject name matches target host name 'localhost'
130 | More details here: https://curl.haxx.se/docs/sslcerts.html
131 |
132 | curl failed to verify the legitimacy of the server and therefore could not
133 | establish a secure connection to it. To learn more about this situation and
134 | how to fix it, please visit the web page mentioned above.
135 | ```
136 | So, a quick Google search tells us we can "fix" this using the `--insecure` option (and now we're getting output!):
137 | ```
138 | $ curl -k https://localhost:8182/status -H 'Host: name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182'
139 | {"requestId":"494c6472-b1b5-42ce-80d0-d7d2a46266cd","code":"AccessDeniedException","detailedMessage":"Missing Authentication Token"}
140 | ```
141 | **NOTE**: This is not a real solution and will not always work. Applications are not required to offer a way to ignore or bypass SSL hostname verification. If you need local forwarding to work properly, consider having `ssh` bind to an alternate loopback address such as `127.0.1.10` and pointing the endpoint hostname at it in your `hosts` file.
142 |
143 | It's still annoying, though, having to manually specify the `host` header every time we make a request, and we potentially need alternate logic in our code to account for when local port forwarding is in use.
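For illustration, the `curl -k` call above translates to something like this in python `requests` (a sketch only; it carries the same certificate trade-off):
```
import requests
import urllib3

urllib3.disable_warnings()   # we skip verification below, same trade-off as curl --insecure

endpoint = 'name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com'
resp = requests.get(
    'https://localhost:8182/status',        # the locally forwarded port
    headers={'Host': f'{endpoint}:8182'},   # pretend we asked for the real endpoint
    verify=False,
)
print(resp.text)
```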
144 |
145 |
146 | ### Dynamic port forwarding (SOCKS)
147 |
148 | As I've come to discover, most if not all AWS service endpoints can be reached using dynamic port forwarding over SOCKS. This means we can tell `ssh` to bind to a local port and act as a SOCKS proxy server. When we connect to the local port, the request is forwarded over the secure tunnel to the bastion and on to the relevant endpoint, based on the destination `hostname` and `port` in our request. We can now send HTTPS requests without having to worry about specifying the `host` header each time!
149 |
150 | Example of setting up dynamic forwarding on local port `8888`:
151 | ```
152 | $ ssh -f -NT -D 8888 user@bastion.host.com
153 | ```
154 | We've told `ssh` to go to the background with `-f` and can use `ss` to verify `ssh` is listening on port `8888`:
155 | ```
156 | $ ss -lntp sport :8888
157 | State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
158 | LISTEN 0 128 127.0.0.1:8888 0.0.0.0:* users:(("ssh",pid=544833,fd=8))
159 | LISTEN 0 128 [::1]:8888 [::]:* users:(("ssh",pid=544833,fd=5))
160 | ```
161 |
162 | So now we can query the cluster endpoint by specifying the `socks5` proxy with `curl`:
163 | ```
164 | $ curl -x socks5://localhost:8888 https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/status
165 | curl: (6) Could not resolve host: name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com
166 | ```
167 | Well, that's annoying. With plain `socks5`, `curl` tries to resolve the hostname locally, and DNS for the cluster endpoints resolves to private IP addresses:
168 | ```
169 | $ dig name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com @1.1.1.1 +short
170 | instance20200521052459345800000002.identifier.ap-southeast-2.neptune.amazonaws.com.
171 | 10.0.40.33
172 | ```
173 |
174 |
175 | ### Using socks5h
176 |
177 | A solution to our problem, implemented by `libcurl`, is `socks5h` (`CURLPROXY_SOCKS5_HOSTNAME`) [0]. The difference from regular `socks5` is that the proxy (in our case, SSH via the bastion) takes care of DNS resolution. Now we can query the endpoint directly even though we can't resolve its name locally:
178 | ```
179 | $ curl -x socks5h://localhost:8888 https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/status
180 | {"requestId":"494c6472-b1b5-42ce-80d0-d7d2a46266cd","code":"AccessDeniedException","detailedMessage":"Missing Authentication Token"}
181 | ```
182 | FYI, the python `requests` library supports the `socks5h` implementation (via the PySocks extra). Many applications make use of `libcurl`, and hopefully more libraries will support the `hostname` implementation of `socks5` in future.
183 | [0] https://curl.haxx.se/libcurl/c/CURLOPT_SOCKS_PROXY.html
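For example, a minimal `requests` sketch of the same call (assuming the `ssh -D 8888` tunnel from earlier and the PySocks extra installed):
```
import requests

# assumes the `ssh -D 8888` tunnel from earlier and `pip install requests[socks]`
proxies = {'https': 'socks5h://localhost:8888'}   # socks5h = let the proxy resolve DNS

resp = requests.get(
    'https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/status',
    proxies=proxies,
    timeout=10,
)
print(resp.status_code, resp.text)
```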
184 |
185 |
186 | ## IAM Authentication
187 |
188 | And now, the last piece of the puzzle: how do we authenticate using our IAM credentials when making HTTP `GET` or `POST` requests?
189 |
190 | Basically, we need to sign each request by attaching authentication information to its `headers`. The signature is calculated from the HTTP request itself along with our AWS `credentials`, the `service` we're querying, and the `region` that service is in.
191 |
192 | See here for more information -> https://docs.aws.amazon.com/general/latest/gr/sigv4_signing.html
193 |
194 | After trying out a couple of tools I ran into issues when specifying non-standard URLs, e.g. while using port forwarding with custom `host` headers. Being able to specify a `socks5h` proxy was also a requirement in my case, so I ended up putting together `swurl`, which makes use of `botocore`'s signing functions.
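For reference, the signing step boils down to something like the sketch below (the profile, service and region values are only examples):
```
import requests
from boto3.session import Session
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

url = 'https://sts.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15'
creds = Session(profile_name='sandpit').get_credentials()

# build an equivalent botocore request, sign it, then reuse the generated headers
aws_req = AWSRequest(method='GET', url=url)
SigV4Auth(creds, 'sts', 'us-east-1').add_auth(aws_req)

resp = requests.get(url, headers=dict(aws_req.headers.items()))
print(resp.text)
```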
195 |
196 |
197 | ## Usage examples
198 |
199 | Query `GetCallerIdentity` using a `GET` request to the AWS STS endpoint:
200 | ```
201 | [elpy@testbox ~]$ swurl --profile sandpit --service sts --region us-east-1 'https://sts.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15'
202 | <GetCallerIdentityResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
203 |   <GetCallerIdentityResult>
204 |     <Arn>arn:aws:sts::123456789012:assumed-role/elpy-admin/botocore-session-1590482957</Arn>
205 |     <UserId>AROAUBLYXXXXXXACIWXS2:botocore-session-1590482957</UserId>
206 |     <Account>123456789012</Account>
207 |   </GetCallerIdentityResult>
208 |   <ResponseMetadata>
209 |     <RequestId>26374e3f-9b12-4526-8649-d19eaf366e02</RequestId>
210 |   </ResponseMetadata>
211 | </GetCallerIdentityResponse>
212 | ```
213 |
214 | Query `GetUser` using a `GET` request to the IAM endpoint:
215 | ```
216 | [elpy@testbox ~]$ swurl --profile sandpit --service iam --region us-east-1 'https://iam.amazonaws.com/?Action=GetUser&UserName=elpy&Version=2010-05-08'
217 | <GetUserResponse xmlns="https://iam.amazonaws.com/doc/2010-05-08/">
218 |   <GetUserResult>
219 |     <User>
220 |       <Path>/</Path>
221 |       <Arn>arn:aws:iam::0123456789012:user/elpy</Arn>
222 |       <UserName>elpy</UserName>
223 |       <UserId>AIDAUBXXXXXX47Z2RDQSM</UserId>
224 |       <CreateDate>2020-04-27T11:27:42Z</CreateDate>
225 |     </User>
226 |   </GetUserResult>
227 |   <ResponseMetadata>
228 |     <RequestId>69790a00-0484-49b0-b04f-ff466594807b</RequestId>
229 |   </ResponseMetadata>
230 | </GetUserResponse>
231 | ```
232 | Querying our `neptune` cluster via the `socks5h` proxy:
233 | ```
234 | [elpy@testbox ~]$ swurl --socks localhost:8888 --profile sandpit --service neptune-db --region ap-southeast-2 https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/status
235 | {"status":"healthy","startTime":"Thu May 21 05:30:27 UTC 2020","dbEngineVersion":"1.0.2.2.R2","role":"writer","gremlin":{"version":"tinkerpop-3.4.3"},"sparql":{"version":"sparql-1.1"},"labMode":{"ObjectIndex":"disabled","ReadWriteConflictDetection":"enabled"}}
236 | ```
237 |
238 | Use the `--env` option to print out the required environment variables so we don't need to keep specifying CLI arguments:
239 | ```
240 | $ swurl --socks localhost:8888 --profile sandpit --service neptune-db --region ap-southeast-2 https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/status --env
241 | export AWS_PROFILE="sandpit"
242 | export AWS_SERVICE="neptune-db"
243 | export AWS_REGION="ap-southeast-2"
244 | export SWURL_SOCKS="localhost:8888"
245 | ```
246 |
247 | Copy and paste them in, and you can continue without all the arguments:
248 | ```
249 | [elpy@testbox ~]$ export AWS_PROFILE="sandpit"
250 | [elpy@testbox ~]$ export AWS_SERVICE="neptune-db"
251 | [elpy@testbox ~]$ export AWS_REGION="ap-southeast-2"
252 | [elpy@testbox ~]$ export SWURL_SOCKS="localhost:8888"
253 | [elpy@testbox ~]$ swurl https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/gremlin/status
254 | {
255 | "acceptedQueryCount" : 0,
256 | "runningQueryCount" : 0,
257 | "queries" : [ ]
258 | }
259 | ```
260 | Querying a non-public Elasticsearch cluster:
261 | ```
262 | [elpy@testbox ~]$ swurl --profile sandpit --socks localhost:8888 --region ap-southeast-2 --service es 'https://elpydev-identifier.ap-southeast-2.es.amazonaws.com/_cluster/health?wait_for_status=yellow&timeout=50s&pretty'
263 | {
264 | "cluster_name" : "824439210008:elpydev",
265 | "status" : "green",
266 | "timed_out" : false,
267 | "number_of_nodes" : 1,
268 | "number_of_data_nodes" : 1,
269 | "discovered_master" : true,
270 | "active_primary_shards" : 1,
271 | "active_shards" : 1,
272 | "relocating_shards" : 0,
273 | "initializing_shards" : 0,
274 | "unassigned_shards" : 0,
275 | "delayed_unassigned_shards" : 0,
276 | "number_of_pending_tasks" : 0,
277 | "number_of_in_flight_fetch" : 0,
278 | "task_max_waiting_in_queue_millis" : 0,
279 | "active_shards_percent_as_number" : 100.0
280 | }
281 | ```
282 | Make a `POST` request to the IAM service to create a new group:
283 | ```
284 | [elpy@testbox ~]$ swurl --profile sandpit -X POST -d 'Action=CreateGroup&GroupName=Testing111&Version=2010-05-08' --service iam --region us-east-1 'https://iam.amazonaws.com/'
285 | <CreateGroupResponse xmlns="https://iam.amazonaws.com/doc/2010-05-08/">
286 |   <CreateGroupResult>
287 |     <Group>
288 |       <Path>/</Path>
289 |       <GroupName>Testing111</GroupName>
290 |       <GroupId>AGPAUXXXXJQSRHH7KYEP4</GroupId>
291 |       <Arn>arn:aws:iam::012345678901:group/Testing111</Arn>
292 |       <CreateDate>2020-05-27T13:24:48Z</CreateDate>
293 |     </Group>
294 |   </CreateGroupResult>
295 |   <ResponseMetadata>
296 |     <RequestId>71f717e8-7d3c-4818-a2c5-4abd2f7991f2</RequestId>
297 |   </ResponseMetadata>
298 | </CreateGroupResponse>
299 | ```
300 |
301 |
302 |
--------------------------------------------------------------------------------
/swurl:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | import argparse
3 | import json
4 | import logging
5 | import os
6 | import requests
7 |
8 | from boto3.session import Session
9 | from urllib.parse import urlparse
10 | from requests import PreparedRequest
11 | from requests.auth import AuthBase
12 | from botocore.auth import SigV4Auth
13 | from botocore.awsrequest import AWSRequest
14 |
15 | class AWSAuth(AuthBase):
16 | def __init__(self, credentials, service, region):
17 | self.credentials = credentials
18 | self.region = region
19 | self.service = service
20 |
21 |     def __call__(self, r: PreparedRequest):
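        """Sign the outgoing request with SigV4 and return it.

        requests invokes this hook while preparing the request; we normalise
        the URL / Host header, rebuild it as a botocore AWSRequest, sign that,
        and copy the resulting auth headers back onto the real request.
        """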
22 | u = urlparse(r.url)
23 | if not u.scheme.startswith('http'):
24 | raise ValueError('invalid uri scheme')
25 | if r.headers.get('host'):
26 | u = u._replace(netloc=r.headers['host'])
27 | if u.port in (80, 443):
28 | u = u._replace(netloc=u.hostname)
29 | if r.headers['host'].startswith(u.hostname):
30 | r.headers['host'] = u.hostname
31 | if not u.path:
32 | u = u._replace(path='/')
33 |
34 | a = AWSRequest(method=r.method.upper(), url=u.geturl(), data=r.body)
35 | SigV4Auth(self.credentials, self.service, self.region).add_auth(a)
36 | r.headers.update(a.headers)
37 | return r
38 |
39 |
40 | def process_args():
41 | parser = argparse.ArgumentParser(description='aws curl', prog='swurl')
42 | parser.add_argument('--profile', default=os.getenv('AWS_PROFILE'),
43 | metavar='profile', choices=Session().available_profiles)
44 | parser.add_argument('--service', default=os.getenv('AWS_SERVICE'))
45 | parser.add_argument('--region', default=os.getenv('AWS_REGION'))
46 | parser.add_argument('--socks', default=os.getenv('SWURL_SOCKS'))
47 | parser.add_argument('--request', '-X', dest='method', default='GET',
48 | choices=['GET', 'POST'])
49 | parser.add_argument('--data', '-d')
50 | parser.add_argument('--header', '-H', action='append')
51 | parser.add_argument('--insecure', '-k', action='store_true')
52 | parser.add_argument('--env', action='store_true')
53 | parser.add_argument('url')
54 | args = vars(parser.parse_args())
55 |
56 | args['headers'] = {}
57 | if args.get('header'):
58 | for header in args['header']:
59 |             if ':' not in header:
60 |                 raise SystemExit('swurl: error: invalid header, expected "Name: value"')
61 |             k, v = header.split(':', 1)
62 |             args['headers'][k.strip().lower()] = v.strip()  # lowercase the name only, keep the value intact
63 |     if args['method'] == 'POST' and 'content-type' not in args['headers']:
64 |         args['headers']['content-type'] = 'application/x-www-form-urlencoded'
65 | return args
66 |
67 |
68 | def printenv(args):
69 | for x in ['profile', 'service', 'region']:
70 | if args.get(x):
71 | print(f'export AWS_{x.upper()}="{args[x]}"')
72 | if args.get('socks'):
73 | print(f'export SWURL_SOCKS="{args["socks"]}"')
74 | exit(0)
75 |
76 |
77 | def main():
78 | args = process_args()
79 | session = Session(profile_name=args['profile'])
80 | creds = session.get_credentials()
81 | method = args['method']
82 | service = args['service']
83 | region = args['region'] or session.region_name
84 | data = args['data']
85 | proxies = None
86 |     verify = not args['insecure']  # requests verifies TLS by default
87 |
88 | if args.get('data') and args['data'].startswith('@'):
89 |         with open(args['data'][1:]) as f:  # curl-style @file: read the request body from a file
90 |             data = ''.join(line.strip() for line in f)
91 |
92 | if args['socks']:
93 | proxies = dict(https=f'socks5h://{args["socks"]}')
94 |
95 | if args.get('env'): printenv(args)
96 |
97 | try:
98 | awsauth = AWSAuth(creds, service, region)
99 | resp = requests.request(
100 | method=method,
101 | url=args['url'],
102 | data=data,
103 | proxies=proxies,
104 | headers=args.get('headers'),
105 | auth=awsauth,
106 | verify=verify)
107 | except Exception as e:
108 | raise SystemExit(f'error: swurl: {e.__class__.__name__}: {e}')
109 | else:
110 | print(resp.text)
111 |
112 |
113 | main()
114 |
--------------------------------------------------------------------------------