├── README.md
└── swurl

/README.md:
--------------------------------------------------------------------------------

# swurl
A Python tool intended to provide very basic functionality similar to `curl`, allowing you to make signed HTTP requests to AWS service endpoints over `socks5`.


## Requirements
- awscli configured locally
- python3 with the `boto3`, `botocore` and `requests[socks]` pip modules
- SSH access to an EC2 instance in the same VPC, with port forwarding allowed


## Info

This tool only exists so I could provide a way to communicate with an endpoint in a private VPC over a SOCKS proxy. It started as a simple request to help access a DB instance and soon turned into both a learning exercise and a guide. I've documented it below in case others find it useful.

**NOTE:** There are other tools out there that already provide `curl`-like functionality and are probably better managed and supported.

**Some thoughts after working on this:**
- It would be cool if there was consistency between service names and service endpoints. This would allow us to easily derive both the `region` and `service` from the endpoint itself. For example, the service name required when signing requests for Neptune is `neptune-db`, but all cluster and instance endpoints are in the format `{identifier}.{region}.neptune.amazonaws.com` (see the sketch after this list).
- `botocore` should allow something other than `HTTPS_PROXY` as an option to provide proxy config. Maybe a unique environment variable such as `AWS_PROXY`, or a configuration option. Not saying they should ignore the standard env var, but what if I only want to proxy `botocore` functions without affecting other applications?
- `socks5` is awesome. Dynamic port forwarding over SSH in general is a powerful tool for accessing AWS resources and services if you don't have a VPN. With a single `ssh` command I have access to my VPC and multiple services which are not publicly exposed, e.g. `neptune` and `es`. This can be combined with AWS SSM to achieve access without exposing any public resources.
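To illustrate that first point: the `region` is easy enough to pull out of an endpoint hostname today, but the signing `service` name still has to be supplied by hand. A rough sketch, for illustration only (the `endpoint_region` helper and the example hostname below are hypothetical, not part of `swurl`):
```
# Hypothetical helper: derive the region from a Neptune endpoint hostname.
# The signing service name ('neptune-db') still has to be provided separately,
# because the hostname only contains 'neptune'.
def endpoint_region(hostname: str) -> str:
    # {identifier}.{region}.neptune.amazonaws.com -> region is 4th from the end
    return hostname.split('.')[-4]

print(endpoint_region('name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com'))
# ap-southeast-2
```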
## Accessing private resources inside a VPC
So, you've launched a new AWS Neptune cluster and now need to query it remotely via the REST API. A few things worth noting:

- Neptune must be provisioned inside a VPC
- It will only respond to requests from within the VPC
- Neptune does not expose any public cluster or instance endpoints
- All connections must be over TLS (enforced in `ap-southeast-2`)
- Username/password auth is not an option for Neptune, so IAM authentication it is


## General strategies to provide access
- Expose a public NLB that terminates SSL and proxies requests to the cluster endpoint
- Expose a public ALB that forwards requests to `haproxy` running on EC2, which then proxies requests to Neptune
- Set up API Gateway and expose an endpoint that triggers a Lambda function with permission to query Neptune and return the results. This could be set up to accept either IAM auth or a custom Lambda authoriser
- A dedicated VPN to the VPC

See some samples from AWS -> https://github.com/aws-samples/aws-dbs-refarch-graph/tree/master/src/connecting-using-a-load-balancer


## Access using SSH (and SSM?)

Here we will go over some of the ways you can use SSH to access private resources in your VPC.

The most common way is via a public-facing bastion you have access to:
```
+------------+                    +------------------------------------------------+
| laptop/pc  |                    | VPC                                            |
|            |                    |  +------------------+    +------------------+  |
|  +------+  |     SSH tunnel     |  |  public subnet   |    |  private subnet  |  |
|  | app  |<-+--------------------+->|                  |    |                  |  |
|  +------+  |                    |  |     instance     |<-->|     neptune      |  |
|            |                    |  |      (EC2)       |<-->|        ES        |  |
+------------+                    |  +------------------+    +------------------+  |
                                  +------------------------------------------------+
```


But, by making use of AWS SSM, we can access our Neptune cluster without exposing any public resources at all:
```
+------------+                    +------------------------------------------------+
| laptop/pc  |                    | PRIVATE SUBNET IN VPC                          |
|            |                    |  +------------------+    +------------------+  |
|  +------+  |     SSH tunnel     |  |                  |    |                  |  |
|  | app  |<-+--------------------+->|     instance     |<-->|     neptune      |  |
|  +------+  |                    |  |      (EC2)       |<-->|        ES        |  |
|            |                    |  +------------------+    +------------------+  |
+------------+                    +------------------------------------------------+
```

For more information on SSH and SSM, see the official documentation here -> https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-getting-started-enable-ssh-connections.html

I won't be going over how to set up SSM in this guide, but continue reading for details on SSH.
## How to forward all the ports using SSH

### Local port forwarding

One way of achieving this is to set up local port forwarding via SSH. For example:
```
$ ssh -f -NT user@bastion.host.com -L 8182:name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182
```
This tells SSH to bind to local port `8182` and forward all TCP traffic to the Neptune endpoint on port `8182` via the `bastion`. This can be used to securely access unencrypted services (e.g. `HTTP` or `SMTP`) by using SSH to encrypt communication between your machine and the remote server.

In our case we are required to communicate with `neptune` via TLS. One of the challenges for developers and/or others accessing these endpoints is that local applications might complain about the SSL certificate (provided by the endpoint) not matching the hostname we're connecting to:
```
$ curl https://localhost:8182/status -H 'Host: name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182'
curl: (60) SSL: no alternative certificate subject name matches target host name 'localhost'
More details here: https://curl.haxx.se/docs/sslcerts.html
```
From `curl`'s verbose output:
```
* Trying ::1:8182...
* TCP_NODELAY set
* Connected to localhost (::1) port 8182 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
    CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: CN=*.identifier.ap-southeast-2.neptune.amazonaws.com
*  start date: Jan 15 00:00:00 2020 GMT
*  expire date: Feb 15 12:00:00 2021 GMT
*  subjectAltName does not match localhost
* SSL: no alternative certificate subject name matches target host name 'localhost'
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, close notify (256):
curl: (60) SSL: no alternative certificate subject name matches target host name 'localhost'
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
```
So, a quick Google search tells us we can "fix" this using the `--insecure` option (and now we're getting output!):
```
$ curl -k https://localhost:8182/status -H 'Host: name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182'
{"requestId":"494c6472-b1b5-42ce-80d0-d7d2a46266cd","code":"AccessDeniedException","detailedMessage":"Missing Authentication Token"}
```
**NOTE**: This is not a solution and will not always work. Applications are not required to have functionality allowing you to ignore or bypass SSL hostname verification. Consider having `ssh` bind to a local port on something like `127.0.1.10` and using your `hosts` file if you need to make this work.

It's still annoying, though, having to manually specify the `Host` header every time we make a request. We also potentially need to implement alternate logic in our code to account for when we use local port forwarding.
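For example, with the Python `requests` library that "alternate logic" ends up looking something like the sketch below: point the client at the forwarded local port, override the `Host` header, and (reluctantly) relax certificate verification, just like `curl -k`. The endpoint name is illustrative.
```
# Rough sketch of client code talking to Neptune through a local port forward
# (ssh -L 8182:<cluster-endpoint>:8182). Endpoint name is illustrative.
import requests

neptune_host = 'name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com'

resp = requests.get(
    'https://localhost:8182/status',
    headers={'Host': f'{neptune_host}:8182'},  # the host we actually want to reach
    verify=False,                              # cert won't match 'localhost' (same as curl -k)
)
print(resp.text)
```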
### Dynamic port forwarding (SOCKS)

As I've come to discover, most if not all AWS services can be reached using dynamic port forwarding over SOCKS. This means we can tell `ssh` to bind to a local port and act as a SOCKS proxy server. When we connect to the local port, the request is forwarded over the secure tunnel to the bastion and then on to the relevant endpoint (determined by the `hostname` and `port` in our request). We can now send HTTPS requests without having to worry about specifying the `Host` header each time!

Example of setting up dynamic forwarding on local port `8888`:
```
$ ssh -f -NT -D 8888 user@bastion.host.com
```
We've told `ssh` to go to the background with `-f`, and can use `ss` to verify `ssh` is listening on port `8888`:
```
$ ss -lntp sport :8888
State    Recv-Q   Send-Q     Local Address:Port      Peer Address:Port    Process
LISTEN   0        128            127.0.0.1:8888           0.0.0.0:*       users:(("ssh",pid=544833,fd=8))
LISTEN   0        128                [::1]:8888              [::]:*       users:(("ssh",pid=544833,fd=5))
```

So now we can query the cluster endpoint by specifying the `socks5` proxy with `curl`:
```
$ curl -x socks5://localhost:8888 https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/status
curl: (6) Could not resolve host: name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com
```
Well, that's annoying. DNS for the cluster endpoints resolves to private IP addresses:
```
$ dig name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com @1.1.1.1 +short
instance20200521052459345800000002.identifier.ap-southeast-2.neptune.amazonaws.com.
10.0.40.33
```


### Using socks5h

A solution to our problem, implemented by `libcurl`, is `socks5h` (`CURLPROXY_SOCKS5_HOSTNAME`) [0]. The difference between this and regular `socks5` is that DNS resolution is handled by the proxy (in our case, on the bastion at the far end of the SSH tunnel). Now we can query the endpoint directly even though we can't resolve its DNS locally:
```
$ curl -x socks5h://localhost:8888 https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/status
{"requestId":"494c6472-b1b5-42ce-80d0-d7d2a46266cd","code":"AccessDeniedException","detailedMessage":"Missing Authentication Token"}
```
FYI, the Python `requests` library supports the `socks5h` implementation. Many applications make use of `libcurl`, and hopefully more libraries will support the `hostname` implementation of `socks5` in future.

[0] https://curl.haxx.se/libcurl/c/CURLOPT_SOCKS_PROXY.html
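As a quick illustration of that `requests` support, here is a minimal sketch. It assumes the PySocks extra is installed (`pip install requests[socks]`); the endpoint is illustrative:
```
# Minimal sketch: requests + socks5h so DNS is resolved at the bastion end.
# Requires PySocks (pip install requests[socks]); endpoint is illustrative.
import requests

proxies = {'https': 'socks5h://localhost:8888'}
url = 'https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/status'

resp = requests.get(url, proxies=proxies)
print(resp.status_code, resp.text)
```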
## IAM Authentication

And now, the last piece of the puzzle: how do we authenticate using our IAM credentials when making HTTP `GET` or `POST` requests?

Basically, we need to sign each request by attaching authentication information to its `headers`. The signature is calculated from the HTTP request itself along with our AWS `credentials`, the `service` we're querying and the `region` the service is in.

See here for more information -> https://docs.aws.amazon.com/general/latest/gr/sigv4_signing.html

After trying out a couple of tools I ran into some issues when specifying non-standard URLs, e.g. while using port forwarding with custom `Host` headers. Being able to specify a `socks5h` proxy was also a requirement in my case. I ended up putting together `swurl`, which makes use of `botocore`'s signing functions.
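For the curious, the signing flow boils down to something like the sketch below: calculate a SigV4 signature with `botocore` and attach the resulting headers to the outgoing request. The profile, service, region and endpoint values are illustrative; the `swurl` source further down implements the same idea as a `requests` auth handler.
```
# Minimal sketch of SigV4-signing a request with botocore before sending it
# with requests. Profile/service/region/endpoint values are illustrative.
import requests
from boto3.session import Session
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

creds = Session(profile_name='sandpit').get_credentials()
url = 'https://sts.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15'

aws_req = AWSRequest(method='GET', url=url)
SigV4Auth(creds, 'sts', 'us-east-1').add_auth(aws_req)  # adds Authorization, X-Amz-Date, ...

resp = requests.get(url, headers=dict(aws_req.headers.items()))
print(resp.text)
```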
## Usage examples

Query `GetCallerIdentity` using a `GET` request to the AWS STS endpoint:
```
[elpy@testbox ~]$ swurl --profile sandpit --service sts --region us-east-1 'https://sts.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15'
<GetCallerIdentityResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <GetCallerIdentityResult>
    <Arn>arn:aws:sts::123456789012:assumed-role/elpy-admin/botocore-session-1590482957</Arn>
    <UserId>AROAUBLYXXXXXXACIWXS2:botocore-session-1590482957</UserId>
    <Account>123456789012</Account>
  </GetCallerIdentityResult>
  <ResponseMetadata>
    <RequestId>26374e3f-9b12-4526-8649-d19eaf366e02</RequestId>
  </ResponseMetadata>
</GetCallerIdentityResponse>
```

Query `GetUser` using a `GET` request to the IAM endpoint:
```
[elpy@testbox ~]$ swurl --profile sandpit --service iam --region us-east-1 'https://iam.amazonaws.com/?Action=GetUser&UserName=elpy&Version=2010-05-08'
<GetUserResponse xmlns="https://iam.amazonaws.com/doc/2010-05-08/">
  <GetUserResult>
    <User>
      <Path>/</Path>
      <Arn>arn:aws:iam::0123456789012:user/elpy</Arn>
      <UserName>elpy</UserName>
      <UserId>AIDAUBXXXXXX47Z2RDQSM</UserId>
      <CreateDate>2020-04-27T11:27:42Z</CreateDate>
    </User>
  </GetUserResult>
  <ResponseMetadata>
    <RequestId>69790a00-0484-49b0-b04f-ff466594807b</RequestId>
  </ResponseMetadata>
</GetUserResponse>
```
Querying our `neptune` cluster via the `socks5h` proxy:
```
[elpy@testbox ~]$ swurl --socks localhost:8888 --profile sandpit --service neptune-db --region ap-southeast-2 https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/status
{"status":"healthy","startTime":"Thu May 21 05:30:27 UTC 2020","dbEngineVersion":"1.0.2.2.R2","role":"writer","gremlin":{"version":"tinkerpop-3.4.3"},"sparql":{"version":"sparql-1.1"},"labMode":{"ObjectIndex":"disabled","ReadWriteConflictDetection":"enabled"}}
```

Use the `--env` option to print out the required environment variables so we don't need to keep specifying CLI arguments:
```
$ swurl --socks localhost:8888 --profile sandpit --service neptune-db --region ap-southeast-2 https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/status --env
export AWS_PROFILE="sandpit"
export AWS_SERVICE="neptune-db"
export AWS_REGION="ap-southeast-2"
export SWURL_SOCKS="localhost:8888"
```

Copy and paste them in, and you can continue without all the arguments:
```
[elpy@testbox ~]$ export AWS_PROFILE="sandpit"
[elpy@testbox ~]$ export AWS_SERVICE="neptune-db"
[elpy@testbox ~]$ export AWS_REGION="ap-southeast-2"
[elpy@testbox ~]$ export SWURL_SOCKS="localhost:8888"
[elpy@testbox ~]$ swurl https://name.cluster-identifier.ap-southeast-2.neptune.amazonaws.com:8182/gremlin/status
{
  "acceptedQueryCount" : 0,
  "runningQueryCount" : 0,
  "queries" : [ ]
}
```
Querying a non-public Elasticsearch cluster:
```
[elpy@testbox ~]$ swurl --profile sandpit --socks localhost:8888 --region ap-southeast-2 --service es 'https://elpydev-identifier.ap-southeast-2.es.amazonaws.com/_cluster/health?wait_for_status=yellow&timeout=50s&pretty'
{
  "cluster_name" : "824439210008:elpydev",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "discovered_master" : true,
  "active_primary_shards" : 1,
  "active_shards" : 1,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
```
Make a `POST` request to the IAM service to create a new group:
```
[elpy@testbox ~]$ swurl --profile sandpit -X POST -d 'Action=CreateGroup&GroupName=Testing111&Version=2010-05-08' --service iam --region us-east-1 'https://iam.amazonaws.com/'
<CreateGroupResponse xmlns="https://iam.amazonaws.com/doc/2010-05-08/">
  <CreateGroupResult>
    <Group>
      <Path>/</Path>
      <GroupName>Testing111</GroupName>
      <GroupId>AGPAUXXXXJQSRHH7KYEP4</GroupId>
      <Arn>arn:aws:iam::012345678901:group/Testing111</Arn>
      <CreateDate>2020-05-27T13:24:48Z</CreateDate>
    </Group>
  </CreateGroupResult>
  <ResponseMetadata>
    <RequestId>71f717e8-7d3c-4818-a2c5-4abd2f7991f2</RequestId>
  </ResponseMetadata>
</CreateGroupResponse>
```

--------------------------------------------------------------------------------
/swurl:
--------------------------------------------------------------------------------

#!/usr/bin/env python3
import argparse
import json
import logging
import os
import requests

from boto3.session import Session
from urllib.parse import urlparse
from requests import Request
from requests.auth import AuthBase
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest


class AWSAuth(AuthBase):
    def __init__(self, credentials, service, region):
        self.credentials = credentials
        self.region = region
        self.service = service

    def __call__(self, r: Request):
        u = urlparse(r.url)
        if not u.scheme.startswith('http'):
            raise ValueError('invalid uri scheme')
        # If a custom Host header was supplied (e.g. when port forwarding),
        # sign against that host rather than the host in the URL.
        if r.headers.get('host'):
            u = u._replace(netloc=r.headers['host'])
            if u.port in (80, 443):
                u = u._replace(netloc=u.hostname)
            if r.headers['host'].startswith(u.hostname):
                r.headers['host'] = u.hostname
        if not u.path:
            u = u._replace(path='/')

        a = AWSRequest(method=r.method.upper(), url=u.geturl(), data=r.body)
        SigV4Auth(self.credentials, self.service, self.region).add_auth(a)
        r.headers.update(a.headers)
        return r


def process_args():
    parser = argparse.ArgumentParser(description='aws curl', prog='swurl')
    parser.add_argument('--profile', default=os.getenv('AWS_PROFILE'),
                        metavar='profile', choices=Session().available_profiles)
    parser.add_argument('--service', default=os.getenv('AWS_SERVICE'))
    parser.add_argument('--region', default=os.getenv('AWS_REGION'))
    parser.add_argument('--socks', default=os.getenv('SWURL_SOCKS'))
    parser.add_argument('--request', '-X', dest='method', default='GET',
                        choices=['GET', 'POST'])
    parser.add_argument('--data', '-d')
    parser.add_argument('--header', '-H', action='append')
    parser.add_argument('--insecure', '-k', action='store_true')
    parser.add_argument('--env', action='store_true')
    parser.add_argument('url')
    args = vars(parser.parse_args())

    args['headers'] = {}
    if args.get('header'):
        for header in args['header']:
            if not (len(header.split()) == 2 and ':' in header):
                raise SystemExit('swurl: error: invalid header')
            k, v = header.lower().split(':', 1)
            args['headers'][k.strip()] = v.strip()
    if args['method'] == 'POST':
        args['headers']['Content-Type'] = 'application/x-www-form-urlencoded'
    return args


def printenv(args):
    for x in ['profile', 'service', 'region']:
        if args.get(x):
            print(f'export AWS_{x.upper()}="{args[x]}"')
    if args.get('socks'):
        print(f'export SWURL_SOCKS="{args["socks"]}"')
    exit(0)


def main():
    args = process_args()
    session = Session(profile_name=args['profile'])
    creds = session.get_credentials()
    method = args['method']
    service = args['service']
    region = args['region'] or session.region_name
    data = args['data']
    proxies = None
    verify = False if args.get('insecure') else None

    # Support curl-style @file syntax for request data.
    if args.get('data') and args['data'].startswith('@'):
        f = ''.join(args['data'].split('@'))
        data = ''.join([line.strip() for line in open(f)])

    if args['socks']:
        proxies = dict(https=f'socks5h://{args["socks"]}')

    if args.get('env'):
        printenv(args)

    try:
        awsauth = AWSAuth(creds, service, region)
        resp = requests.request(
            method=method,
            url=args['url'],
            data=data,
            proxies=proxies,
            headers=args.get('headers'),
            auth=awsauth,
            verify=verify)
    except Exception as e:
        raise SystemExit(f'error: swurl: {e.__class__.__name__}: {e}')
    else:
        print(resp.text)


main()

--------------------------------------------------------------------------------