├── .gitignore ├── LICENSE ├── README.md ├── decorator ├── event.json ├── geocode.js ├── index.js └── package.json ├── ingestor ├── index.js └── package.json ├── template.yaml └── vpc-flow-log-appender.png /.gitignore: -------------------------------------------------------------------------------- 1 | 2 | .DS_Store 3 | packaged.yaml 4 | node_modules 5 | package-lock.json 6 | parameters.properties 7 | 8 | **/node_modules -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 
29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 
61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "{}" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright {yyyy} {name of copyright owner} 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # aws-vpc-flow-log-appender 2 | 3 | aws-vpc-flow-log-appender is a sample project that enriches AWS [VPC Flow Log](http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/flow-logs.html) data with additional information, primarily the Security Groups associated with the instances to which requests are flowing. 4 | 5 | This project makes use of several AWS services, including Elasticsearch, Lambda, and Kinesis Firehose. These **must** be setup and configured in the proper sequence for the sample to work as expected. Here, we describe deployment of the Lambda components only. For details on deploying and configuring other services, please see the accompanying [blog post](https://aws.amazon.com/blogs/security/how-to-visualize-and-refine-your-networks-security-by-adding-security-group-ids-to-your-vpc-flow-logs/). 
6 | 7 | The following diagram is a representation of the AWS services and components involved in this sample: 8 | 9 | ![VPC Flow Log Appender Services](vpc-flow-log-appender.png) 10 | 11 | **NOTE:** This project makes use of a free tier of the [ipstack](http://ipstack.com/) geolocation service that enforces a monthly limit of 10,000 requests. It is *not intended for use in a production environment*. We recommend using one of ipstack's paid plans or another commercial source of IP geolocation data if you wish to run this code in such an environment. 12 | 13 | ## Getting Started 14 | 15 | To get started, clone this repository locally: 16 | 17 | ``` 18 | $ git clone https://github.com/awslabs/aws-vpc-flow-log-appender 19 | ``` 20 | 21 | The repository contains [CloudFormation](https://aws.amazon.com/cloudformation/) templates and source code to deploy and run the sample application. 22 | 23 | ### Prerequisites 24 | 25 | To run the vpc-flow-log-appender sample, you will need to: 26 | 27 | 1. Select an AWS Region into which you will deploy services. Be sure that all required services (AWS Lambda, Amazon Elasticsearch Service, Amazon CloudWatch, and Amazon Kinesis Firehose) are available in the Region you select. 28 | 2. Confirm your [installation of the latest AWS CLI](http://docs.aws.amazon.com/cli/latest/userguide/installing.html) and that [it is properly configured](http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html#cli-quick-configuration) with credentials that have appropriate access to your account. 29 | 3. [Install aws-sam-cli](https://github.com/awslabs/aws-sam-cli). 30 | 4. [Install Node.js and NPM](https://docs.npmjs.com/getting-started/installing-node). 31 | 32 | ## Configure Geolocation 33 | 34 | If you would like to geolocate the source IP address of traffic in your VPC flow logs, you can configure a free account at ipstack.com. Note that the free tier of this service is *not* intended for production use. 
35 | 36 | To sign up for a free account and obtain an API key, visit https://ipstack.com/signup/free. 37 | 38 | Once you have obtained your API key, store it in EC2 Systems Manager Parameter Store as follows (replace MY_API_KEY with your own): 39 | 40 | ``` bash 41 | $ aws ssm put-parameter \ 42 | --name ipstack-api-key \ 43 | --value MY_API_KEY \ 44 | --type SecureString 45 | ``` 46 | 47 | ## Preparing to Deploy Lambda 48 | 49 | Before deploying the sample, install the dependencies of both functions using NPM: 50 | 51 | ``` bash 52 | $ cd decorator && npm install 53 | $ cd ../ingestor && npm install && cd .. 54 | ``` 55 | 56 | ## Deploy Lambda Functions 57 | 58 | The deployment of our AWS resources is managed by the [AWS SAM CLI](https://github.com/awslabs/aws-sam-cli) using the [AWS Serverless Application Model](https://github.com/awslabs/serverless-application-model) (SAM). 59 | 60 | 1. Create a new S3 bucket from which to deploy our source code (ensure that the bucket is created in the same AWS Region in which your network and services will be deployed): 61 | 62 | ``` bash 63 | $ aws s3 mb s3:// 64 | ``` 65 | 66 | 2. Using the Serverless Application Model, package your source code and serverless stack: 67 | 68 | ``` bash 69 | $ sam package --template-file template.yaml \ 70 | --s3-bucket \ 71 | --output-template-file packaged.yaml 72 | ``` 73 | 74 | 3. Once packaging is complete, deploy the stack: 75 | 76 | ``` bash 77 | $ sam deploy --template-file packaged.yaml \ 78 | --stack-name vpc-flow-log-appender \ 79 | --capabilities CAPABILITY_IAM 80 | ``` 81 | 82 | Or to deploy with the geolocation feature turned on: 83 | 84 | ``` bash 85 | $ sam deploy --template-file packaged.yaml \ 86 | --stack-name vpc-flow-log-appender \ 87 | --capabilities CAPABILITY_IAM \ 88 | --parameter-overrides GeolocationEnabled=true 89 | ``` 90 | 91 | 4. 
Once we have deployed our Lambda functions, configure CloudWatch Logs to stream VPC Flow Logs to Elasticsearch as described [here](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_ES_Stream.html). 92 | 93 | ## Testing 94 | 95 | In addition to running aws-vpc-flow-log-appender using live VPC Flow Log data from your own environment, we can also leverage the [Kinesis Data Generator](https://awslabs.github.io/amazon-kinesis-data-generator/web/producer.html) to send mock flow log data to our Kinesis Firehose instance. 96 | 97 | To get started, review the [Kinesis Data Generator Help](https://awslabs.github.io/amazon-kinesis-data-generator/web/help.html) and use the included CloudFormation template to create necessary resources. 98 | 99 | When ready: 100 | 101 | 1. Navigate to your Kinesis Data Generator and log in. 102 | 103 | 2. Select the Region to which you deployed aws-vpc-flow-log-appender and select the appropriate Stream (e.g. "VPCFlowLogsToElasticSearch"). Set Records per Second to 50. 104 | 105 | 3. Next, we will use the AWS CLI to retrieve several values specific to your AWS Account to generate feasible VPC Flow Log data: 106 | 107 | ``` 108 | # ACCOUNT_ID 109 | $ aws sts get-caller-identity --query 'Account' 110 | 111 | # ENI_ID (e.g. "eni-1a2b3c4d") 112 | $ aws ec2 describe-instances \ 113 | --query 'Reservations[0].Instances[0].NetworkInterfaces[0].NetworkInterfaceId' 114 | ``` 115 | 116 | 4. Finally, we can build a template for KDG using the following. Be sure to replace `<>` and `<>` with the values you captured in step 3 (do not include quotes). 117 | 118 | ``` 119 | 2 <> <> {{internet.ip}} 10.100.2.48 45928 6379 6 {{random.number(1)}} {{random.number(600)}} 1493070293 1493070332 ACCEPT OK 120 | ``` 121 | 122 | 5. Returning to KDG, copy and paste the mock VPC Flow Log data into Template 1. Then click the "Send data" button. 123 | 124 | 6. Stop KDG after a few seconds by clicking "Stop" in the popup. 125 | 126 | 7. 
After a few minutes, check CloudWatch Logs and your Elasticsearch cluster for data. 127 | 128 | A few notes on the above test procedure: 129 | 130 | * While our example utilizes the ENI ID of an EC2 instance, you may use any ENI available in the AWS Region in which you deployed the sample code. 131 | * Feel free to tweak the mock data template if needed; it is only intended to be an example. 132 | * Do not modify values in double curly braces; these are part of the KDG template and will be filled in automatically. 133 | 134 | ## Cleaning Up 135 | 136 | To clean up the Lambda functions when you are finished with this sample, delete the stack using the name you passed to `sam deploy`: 137 | 138 | ``` 139 | $ aws cloudformation delete-stack --stack-name vpc-flow-log-appender 140 | ``` 141 | 142 | ## Updates 143 | 144 | * Aug 2 2018 - Updated decorator function and geocode module to use ipstack, as the previous service is now defunct. Amended README to include new instructions on using ipstack. 145 | * Jun 9 2017 - Fixed issue in which decorator did not return all records to Firehose when the geocoder was over its limit of 15,000 requests per hour. It now returns blank geo data instead. Added Test methodology. 
146 | 147 | ## Authors 148 | 149 | * **Josh Kahn** - *Initial work* 150 | -------------------------------------------------------------------------------- /decorator/event.json: -------------------------------------------------------------------------------- 1 | { 2 | "invocationId": "invoked123", 3 | "deliveryStreamArn": "aws:lambda:events", 4 | "region": "us-east-2", 5 | "records": [ 6 | { 7 | "data": "MiAxMjM0NTY3ODkwMTAgZW5pLTE4NTRmOTQ5IDcyLjIxLjE5Ni42NSAxNzIuMzEuMTYuMjEgMjA2NDEgMjIgNiAyMCA0MjQ5IDE0MTg1MzAwMTAgMTQxODUzMDA3MCBBQ0NFUFQgT0s=", 8 | "recordId": "record1", 9 | "approximateArrivalTimestamp": 1510772160000, 10 | "kinesisRecordMetadata": { 11 | "shardId": "shardId-000000000000", 12 | "partitionKey": "4d1ad2b9-24f8-4b9d-a088-76e9947c317a", 13 | "approximateArrivalTimestamp": "2012-04-23T18:25:43.511Z", 14 | "sequenceNumber": "49546986683135544286507457936321625675700192471156785154", 15 | "subsequenceNumber": "" 16 | } 17 | }, 18 | { 19 | "data": "MiAxMjM0NTY3ODkwMTAgZW5pLTE4NTRmOTQ5IDEwLjMuNDUuNDQgMTcyLjMxLjE2LjIxIDIwNjQxIDIyIDYgMjAgNDI0OSAxNDE4NTMwMDEwIDE0MTg1MzAwNzAgQUNDRVBUIE9L", 20 | "recordId": "record2", 21 | "approximateArrivalTimestamp": 1510772160000, 22 | "kinesisRecordMetadata": { 23 | "shardId": "shardId-000000000000", 24 | "partitionKey": "4d1ad2b9-24f8-4b9d-a088-76e9947c317a", 25 | "approximateArrivalTimestamp": "2012-04-23T18:25:43.511Z", 26 | "sequenceNumber": "49546986683135544286507457936321625675700192471156785154", 27 | "subsequenceNumber": "" 28 | } 29 | } 30 | ] 31 | } -------------------------------------------------------------------------------- /decorator/geocode.js: -------------------------------------------------------------------------------- 1 | /** 2 | 3 | Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. 4 | 5 | Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file 6 | except in compliance with the License. 
A copy of the License is located at 7 | 8 | http://aws.amazon.com/apache2.0/ 9 | 10 | or in the "license" file accompanying this file. This file is distributed on an "AS IS" 11 | BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 12 | License for the specific language governing permissions and limitations under the License. 13 | 14 | */ 15 | 16 | 'use strict' 17 | 18 | /** 19 | * Simple module to help geocode IP addresses using ipstack (api.ipstack.com). 20 | * 21 | * NOTE: this is intended for demo purposes only. 22 | */ 23 | 24 | const http = require('http'); 25 | const SSM = require('aws-sdk/clients/ssm'); 26 | const axios = require('axios'); 27 | 28 | const serviceHost = 'api.ipstack.com'; 29 | 30 | let ssm = null; 31 | let apiKey = null; 32 | 33 | /** 34 | * Retrieves the ipstack API key from SSM Parameter Store. 35 | */ 36 | const getApiKey = async() => { 37 | if (!ssm) { ssm = new SSM({ region: process.env.AWS_REGION }) } 38 | 39 | const params = { 40 | Name: process.env.GEOLOCATION_API_KEY_NAME, 41 | WithDecryption: true 42 | } 43 | 44 | let result = await ssm.getParameter(params).promise() 45 | if (result && result.Parameter) { 46 | return result.Parameter.Value 47 | } else { 48 | throw Error(`API key not found in SSM (${process.env.GEOLOCATION_API_KEY_NAME})`) 49 | } 50 | } 51 | 52 | /** 53 | * Geocodes the passed IP address using ipstack. 54 | * @param {String} ipAddress 55 | */ 56 | module.exports = async (ipAddress) => { 57 | if (!apiKey) { apiKey = await getApiKey() } 58 | 59 | let response = await axios.get(`http://${serviceHost}/${ipAddress}?access_key=${apiKey}`) 60 | console.log(JSON.stringify(response.data)) 61 | console.log(response.data.success) 62 | if (response.status !== 200 || response.data.hasOwnProperty('error')) { 63 | console.warn('[geocode] received bad response: ' +response.statusText); 64 | return Promise.reject(`ipstack - ${response.statusText} - ${JSON.stringify(response.data.error)}`); 65 | } else { 66 | return Promise.resolve(response.data); 67 | } 68 | } 
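The response check at the end of `geocode.js` can be isolated as a pure function, which makes it possible to exercise the logic without calling ipstack. The following is a minimal sketch and is not part of the repository: `isGeocodeFailure` is a hypothetical helper name, and the sample payloads are made-up shapes based only on the `status` and `data.error` fields that the module already inspects.

```javascript
'use strict'

/**
 * Hypothetical helper (not in the repository): returns true when a
 * response should be treated as a failed lookup. Mirrors the condition
 * used in geocode.js (non-200 status, or an `error` property in the body).
 */
const isGeocodeFailure = (response) => {
  if (!response || response.status !== 200) { return true }
  const data = response.data
  // Assumption: a missing or non-object body is also treated as a failure.
  if (data === null || typeof data !== 'object') { return true }
  return Object.prototype.hasOwnProperty.call(data, 'error')
}

// Made-up sample responses, for illustration only.
const ok = { status: 200, data: { country_code: 'US' } }
const quotaHit = { status: 200, data: { success: false, error: { info: 'limit reached' } } }

console.log(isGeocodeFailure(ok))       // false
console.log(isGeocodeFailure(quotaHit)) // true
```

Using `Object.prototype.hasOwnProperty.call` rather than `data.hasOwnProperty` keeps the check safe even if the body were created without a prototype; whether that matters for real ipstack responses is not something the source establishes.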
-------------------------------------------------------------------------------- /decorator/index.js: -------------------------------------------------------------------------------- 1 | /** 2 | 3 | Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. 4 | 5 | Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file 6 | except in compliance with the License. A copy of the License is located at 7 | 8 | http://aws.amazon.com/apache2.0/ 9 | 10 | or in the "license" file accompanying this file. This file is distributed on an "AS IS" 11 | BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 12 | License for the specific language governing permissions and limitations under the License. 13 | 14 | */ 15 | 16 | 'use strict' 17 | 18 | /** 19 | * This function receives flow log data from Kinesis Firehose and "decorates" 20 | * or enriches that data with additional information. For each flow log entry, 21 | * the function attempts to append the Security Group IDs associated with 22 | * the Elastic Network Interface (ENI) associated with the record as well 23 | * as the location (e.g. country, region) of the requestor. 24 | * 25 | * This function must be deployed before creating the Kinesis Firehose 26 | * instance as part of Elasticsearch Service setup. 27 | * 28 | * VPC --> CloudWatch --> Lambda#ingestor --> Kinesis --> Elasticsearch 29 | * (Flow Logs) Firehose 30 | * + 31 | * Lambda#decorator 32 | * 33 | */ 34 | 35 | const find = require('lodash.find'); 36 | const EC2 = require('aws-sdk/clients/ec2'); 37 | const jmespath = require('jmespath'); 38 | const geocode = require('./geocode'); 39 | 40 | /** 41 | * Regular expression to parse VPC Flow Log format. 
42 | */ 43 | const parser = /^(\d) (\d+) (eni-\w+) (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) (\d+) (\d+) (\d+) (\d+) (\d+) (\d+) (\d+) (ACCEPT|REJECT) (OK|NODATA|SKIPDATA)/ 44 | 45 | let ec2 = null; 46 | 47 | /** 48 | * Describes the Network Interfaces associated with this account. 49 | * 50 | * @return `Promise` for async processing 51 | */ 52 | const listNetworkInterfaces = async () => { 53 | if (!ec2) { ec2 = new EC2({ region: process.env.AWS_REGION }) } 54 | return ec2.describeNetworkInterfaces().promise(); 55 | }; 56 | 57 | /** 58 | * Builds a listing of Elastic Network Interfaces (ENI) associated with this account and 59 | * returns an array of Objects, each representing one ENI: its unique identifier, associated 60 | * security groups, and primary private IP address. 61 | * 62 | * Per AWS documentation, we only capture the primary, private IPv4 address of the ENI: 63 | * 64 | * - If your network interface has multiple IPv4 addresses and traffic is sent to a secondary private IPv4 65 | * address, the flow log displays the primary private IPv4 address in the destination IP address field. 66 | * - In the case of both `srcaddr` and `dstaddr` in VPC Flow Logs: the IPv4 address of the network interface 67 | * is always its private IPv4 address. 68 | * 69 | * @see http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/flow-logs.html 70 | * 71 | * Returns structure like: 72 | * [ 73 | * { interfaceId: 'eni-c1a7da8c', 74 | * securityGroupIds: [ 'sg-b2b454d4' ], 75 | * ipAddress: '10.0.1.24' }, 76 | * { interfaceId: 'eni-03cbb94e', 77 | * securityGroupIds: [ 'sg-a3b252c5' ], 78 | * ipAddress: '10.0.2.33' } 79 | * ... 
80 | * ] 81 | */ 82 | const buildEniToSecurityGroupMapping = async () => { 83 | let interfaces = await listNetworkInterfaces() 84 | 85 | let mapping = jmespath.search(interfaces, 86 | `NetworkInterfaces[].{ 87 | interfaceId: NetworkInterfaceId, 88 | securityGroupIds: Groups[].GroupId, 89 | ipAddress: PrivateIpAddresses[?Primary].PrivateIpAddress 90 | }`); 91 | 92 | return Promise.resolve(mapping); 93 | } 94 | 95 | /** 96 | * Extracts records from the VPC Flow Log entries passed to the function from 97 | * Kinesis Firehose. Records are matched against the expected format of Flow Log 98 | * data and wrapped in an object that indicates, for later use, whether 99 | * parsing of the record failed. 100 | * 101 | * @param records - records from Kinesis Firehose to be processed 102 | */ 103 | const extractRecords = async (records) => { 104 | let result = [] 105 | for(let record of records) { 106 | let flowLogData = Buffer.from(record.data, 'base64').toString('utf8') 107 | let match = parser.exec(flowLogData) 108 | if (match) { 109 | let matched = { 110 | // default vpc flow log data 111 | '@timestamp': new Date(), 112 | 'version': Number(match[1]), 113 | 'account-id': Number(match[2]), 114 | 'interface-id': match[3], 115 | 'srcaddr': match[4], 116 | 'destaddr': match[5], 117 | 'srcport': Number(match[6]), 118 | 'dstport': Number(match[7]), 119 | 'protocol': Number(match[8]), 120 | 'packets': Number(match[9]), 121 | 'bytes': Number(match[10]), 122 | 'start': Number(match[11]), 123 | 'end': Number(match[12]), 124 | 'action': match[13], 125 | 'log-status': match[14] 126 | } 127 | 128 | result.push({ 129 | id: record.recordId, 130 | data: matched, 131 | error: false 132 | }) 133 | } else { 134 | result.push({ 135 | id: record.recordId, 136 | data: record.data, 137 | error: true 138 | }) 139 | } 140 | } 141 | 142 | return Promise.resolve(result) 143 | } 144 | 145 | /** 146 | * Tests whether the passed IP address is a loopback (127.x) or RFC 1918 private address. 
147 | * @param {*} ipAddress 148 | */ 149 | const isRfc1918Address = (ipAddress) => { 150 | let re = /(^127\.)|(^10\.)|(^172\.1[6-9]\.)|(^172\.2[0-9]\.)|(^172\.3[0-1]\.)|(^192\.168\.)/; 151 | 152 | return (ipAddress.match(re) !== null); 153 | } 154 | 155 | /** 156 | * Decorates passed VPC Flow Log records with additional data, including security 157 | * group IDs and geolocation of source IP address. 158 | * 159 | * @param records - array of records to be processed 160 | * @param mapping - mapping of ENIs to additional data 161 | * @return `Promise` for async processing 162 | */ 163 | const decorateRecords = async (records, mapping) => { 164 | console.log(`Decorating ${records.length} records`) 165 | 166 | for(let record of records) { 167 | let eniData = find(mapping, { 'interfaceId': record.data['interface-id'] }); 168 | if (eniData) { 169 | record.data['security-group-ids'] = eniData.securityGroupIds; 170 | record.data['direction'] = (record.data['destaddr'] == eniData.ipAddress) ? 'inbound' : 'outbound'; 171 | } else { 172 | console.log(`No ENI data found for interface ${record.data['interface-id']}`); 173 | } 174 | 175 | let srcaddr = record.data['srcaddr']; 176 | let geo = process.env.GEOLOCATION_ENABLED === 'false' 177 | || isRfc1918Address(srcaddr) ? null : await geocode(srcaddr).catch((error) => { console.warn(`[geocode] lookup failed for ${srcaddr}: ${error}`); return null }) // fall back to blank geo data if the lookup fails 178 | 179 | if (geo) console.log(JSON.stringify(geo)) 180 | 181 | // append geo data to existing record 182 | record.data['source-country-code'] = geo ? geo.country_code : '' 183 | record.data['source-country-name'] = geo ? geo.country_name : '' 184 | record.data['source-region-code'] = geo ? geo.region_code : '' 185 | record.data['source-region-name'] = geo ? geo.region_name : '' 186 | record.data['source-city'] = geo ? geo.city : '' 187 | record.data['source-location'] = { 188 | lat: geo ? Number(geo.latitude) : 0, 189 | lon: geo ? 
Number(geo.longitude) : 0 190 | } 191 | 192 | console.log(JSON.stringify(record)) 193 | } 194 | 195 | console.log(`Finished with ${records.length} records`) 196 | return Promise.resolve(records) 197 | }; 198 | 199 | /** 200 | * Called after decoration is complete, packages the records to be passed 201 | * to Elasticsearch. Record payload is base64-encoded and tagged as appropriate 202 | * (Ok or ProcessingFailed) for Kinesis Firehose to complete its work. 203 | * 204 | * @param records - records to be packaged 205 | */ 206 | const packageRecords = async (records) => { 207 | let result = [] 208 | let success = 0 209 | let failure = 0 210 | 211 | console.log(`Packaging ${records.length} records`) 212 | 213 | for(let record of records) { 214 | if (record.error) { 215 | result.push({ 216 | recordId: record.id, 217 | result: 'ProcessingFailed', 218 | data: record.data 219 | }) 220 | failure++ 221 | } else { 222 | let payload = Buffer.from(JSON.stringify(record.data), 'utf8').toString('base64') 223 | result.push({ 224 | recordId: record.id, 225 | result: 'Ok', 226 | data: payload 227 | }) 228 | success++ 229 | } 230 | } 231 | 232 | console.log(`Processing completed. Successful records ${success}, Failed records ${failure}.`); 233 | return Promise.resolve(result) 234 | } 235 | 236 | 237 | /** 238 | * 239 | * Main Lambda handler -- builds the ENI mapping and then decorates VPC flow log 240 | * records passed from Firehose. 
241 | * 242 | */ 243 | exports.handler = (event, context, callback) => { 244 | console.log(`Received ${event.records.length} records for processing`); 245 | 246 | Promise.all([ buildEniToSecurityGroupMapping(), extractRecords(event.records) ]) 247 | .then( (results) => { 248 | console.log('Finished building ENI to Security Group Mapping and Extracting Records'); 249 | return decorateRecords(results[1], results[0]) 250 | }) 251 | .then( (records) => { 252 | return packageRecords(records) 253 | }) 254 | .then( (records) => { 255 | console.log(`Finished processing records, pushing ${records.length} records to Elasticsearch...`); 256 | callback(null, { records: records }); 257 | }) 258 | .catch( (error) => { 259 | console.error('[ERROR] ' + error); 260 | callback(error); 261 | }) 262 | }; -------------------------------------------------------------------------------- /decorator/package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "vpc-flow-log-appender", 3 | "version": "1.0.0", 4 | "description": "Appends security group and additional details for use by Elasticsearch and VPC flow logs", 5 | "main": "index.js", 6 | "author": "jkahn@", 7 | "license": "Apache-2.0", 8 | "repository": "https://github.com/awslabs/aws-vpc-flow-log-appender", 9 | "dependencies": { 10 | "aws-sdk": "^2.290.0", 11 | "axios": "^0.18.0", 12 | "jmespath": "^0.15.0", 13 | "lodash.find": "^4.6.0" 14 | } 15 | } 16 | -------------------------------------------------------------------------------- /ingestor/index.js: -------------------------------------------------------------------------------- 1 | /** 2 | 3 | Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. 4 | 5 | Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file 6 | except in compliance with the License.
A copy of the License is located at 7 | 8 | http://aws.amazon.com/apache2.0/ 9 | 10 | or in the "license" file accompanying this file. This file is distributed on an "AS IS" 11 | BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 12 | License for the specific language governing permissions and limitations under the License. 13 | 14 | */ 15 | 16 | 'use strict' 17 | 18 | /** 19 | * This function ingests VPC Flow Logs from the producer and passes them to 20 | * a Kinesis Firehose that will decorate the data and push it to Elasticsearch. 21 | * 22 | * After creating the VPC Flow Logs CloudWatch Log Group, you will need to 23 | * configure delivery of the logs to this function via CloudWatch. Note that 24 | * the function must be deployed first to Lambda. 25 | * 26 | * Adapted from 27 | * https://github.com/bsnively/aws-big-data-blog/blob/master/aws-blog-vpcflowlogs-athena-quicksight/CloudwatchLogsToFirehose/lambdacode.py 28 | * 29 | * 30 | * VPC --> CloudWatch --> Lambda#ingestor --> Kinesis --> Elasticsearch 31 | * (Flow Logs) Firehose 32 | * + 33 | * Lambda#decorator 34 | * 35 | */ 36 | 37 | const AWS = require('aws-sdk'); 38 | const zlib = require('zlib'); 39 | 40 | /** 41 | * Put records in Kinesis Firehose to be decorated and sent to Elasticsearch. 42 | * 43 | * @param records - JSON records to be pushed to Firehose 44 | */ 45 | const putRecords = (records) => { 46 | var params = { 47 | DeliveryStreamName: process.env.DELIVERY_STREAM_NAME, 48 | Records: records 49 | } 50 | 51 | var firehose = new AWS.Firehose(); 52 | firehose.putRecordBatch(params, (error, data) => { 53 | if (error) { 54 | console.error('[ERROR - putRecordBatch] ' + error); 55 | } 56 | else { 57 | console.log('[Firehose] putRecordBatch successful') 58 | } 59 | }) 60 | }; 61 | 62 | /** 63 | * Creates records to be consumed in Firehose from VPC Flow Log events.
64 | * 65 | * @param events - VPC Flow Log events 66 | * @return `Promise` for async processing 67 | */ 68 | const createRecordsFromEvents = (events) => { 69 | return new Promise( (resolve, reject) => { 70 | var records = []; 71 | 72 | events.forEach( (event) => { 73 | if (event.messageType === 'CONTROL_MESSAGE') { 74 | console.log('Skipping control message'); 75 | return; 76 | } 77 | 78 | var logEvent = { 79 | Data: `${event.message}\n` 80 | } 81 | records.push(logEvent); 82 | 83 | // catch at 500 records and push to firehose 84 | if (records.length > 499) { 85 | putRecords(records); 86 | records = []; 87 | } 88 | }, this); 89 | 90 | resolve(records); 91 | }); 92 | } 93 | 94 | /** 95 | * Asynchronously gunzips a buffer, returning a Promise. 96 | * 97 | * @param buffer - buffer to be gunzipped 98 | * @return `Promise` for async processing 99 | */ 100 | const gunzipPromise = (buffer) => { 101 | return new Promise( (resolve, reject) => { 102 | zlib.gunzip(buffer, (error, result) => { 103 | if (error) { 104 | reject(error); 105 | return; 106 | } 107 | resolve(result); 108 | }); 109 | }) 110 | } 111 | 112 | /** 113 | * 114 | * Main Lambda handler. 
VPC Flow Log events will enter the function in 115 | * the following format: 116 | * 117 | * [ 118 | * { Data: '2 eni-4ff3618a 2.178.18.24 10.100.5.78 23458 7547 6 1 40 1490365304 1490365358 ACCEPT OK\n' }, 119 | * { Data: '2 eni-4ff3618a 190.48.42.140 10.100.5.78 41965 2222 6 1 40 1490365421 1490365478 ACCEPT OK\n' }, 120 | * { Data: '2 eni-4ff3618a 121.217.240.138 10.100.5.78 52627 7547 6 1 40 1490365421 1490365478 ACCEPT OK\n' } 121 | * ] 122 | * 123 | */ 124 | exports.handler = (event, context) => { 125 | var zippedData = Buffer.from(event.awslogs.data, 'base64'); 126 | gunzipPromise(zippedData) 127 | .then( (data) => { 128 | let logData = JSON.parse(data.toString('utf8')); 129 | return createRecordsFromEvents(logData.logEvents); 130 | }) 131 | .then( (records) => { 132 | if (records.length > 0) { 133 | putRecords(records); 134 | } 135 | context.succeed(); 136 | }) 137 | .catch( (error) => { 138 | console.error('[ERROR] ' + error); 139 | context.fail(error); 140 | }) 141 | }; -------------------------------------------------------------------------------- /ingestor/package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "vpc-flow-log-appender", 3 | "version": "1.0.0", 4 | "description": "Appends security group and additional details for use by Elasticsearch and VPC flow logs", 5 | "main": "index.js", 6 | "author": "jkahn@", 7 | "license": "Apache-2.0", 8 | "repository": "https://github.com/awslabs/aws-vpc-flow-log-appender" 9 | } 10 | -------------------------------------------------------------------------------- /template.yaml: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file 5 | # except in compliance with the License.
A copy of the License is located at 6 | # 7 | # http://aws.amazon.com/apache2.0/ 8 | # 9 | # or in the "license" file accompanying this file. This file is distributed on an "AS IS" 10 | # BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 11 | # License for the specific language governing permissions and limitations under the License. 12 | # 13 | 14 | --- 15 | 16 | AWSTemplateFormatVersion: "2010-09-09" 17 | Transform: "AWS::Serverless-2016-10-31" 18 | 19 | Description: "AWS VPC Flow Log Appender services" 20 | 21 | Parameters: 22 | GeolocationEnabled: 23 | Type: String 24 | Default: false 25 | AllowedValues: 26 | - true 27 | - false 28 | 29 | Resources: 30 | # 31 | # -- Data Ingestion -- 32 | # 33 | # Data is received from CloudWatch Logs and pushed to Kinesis Firehose. 34 | # 35 | FlowLogIngestionFunction: 36 | Type: "AWS::Serverless::Function" 37 | Properties: 38 | Handler: "index.handler" 39 | Runtime: "nodejs10.x" 40 | CodeUri: "ingestor/" 41 | Policies: 42 | - Version: "2012-10-17" 43 | Statement: 44 | - 45 | Effect: Allow 46 | Action: 47 | - "firehose:PutRecordBatch" 48 | Resource: !Sub "arn:aws:firehose:${AWS::Region}:${AWS::AccountId}:deliverystream/*" 49 | Environment: 50 | Variables: 51 | DELIVERY_STREAM_NAME: "VPCFlowLogsToElasticSearch" 52 | 53 | # 54 | # -- Decorator -- 55 | # 56 | # Appends VPC Flow Logs with additional data, including instance security groups. 
57 | # 58 | FlowLogDecoratorFunction: 59 | Type: AWS::Serverless::Function 60 | Properties: 61 | Handler: index.handler 62 | Runtime: nodejs10.x 63 | Timeout: 120 64 | CodeUri: decorator/ 65 | Policies: 66 | - Version: "2012-10-17" 67 | Statement: 68 | - Effect: Allow 69 | Action: 70 | - "ssm:GetParameter" 71 | Resource: !Sub "arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:parameter/ipstack-api-key" 72 | - Effect: Allow 73 | Action: 74 | - "ec2:Describe*" 75 | Resource: "*" 76 | Environment: 77 | Variables: 78 | GEOLOCATION_ENABLED: !Ref GeolocationEnabled 79 | GEOLOCATION_API_KEY_NAME: ipstack-api-key 80 | 81 | Outputs: 82 | LambdaFlowLogDecorator: 83 | Value: !GetAtt FlowLogDecoratorFunction.Arn 84 | Export: 85 | Name: 'FlowLogDecoratorFunction' 86 | 87 | LambdaFlowLogIngestion: 88 | Value: !GetAtt FlowLogIngestionFunction.Arn 89 | Export: 90 | Name: 'FlowLogIngestionFunction' 91 | -------------------------------------------------------------------------------- /vpc-flow-log-appender.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/amazon-archives/aws-vpc-flow-log-appender/f5c50978d5bbe9f116c3928c96eccd34bd7c8c59/vpc-flow-log-appender.png --------------------------------------------------------------------------------
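
The ingestor in `ingestor/index.js` flushes to Firehose whenever 500 records have accumulated, but it does not wait for each `putRecordBatch` call to finish before signalling success. The sketch below shows one way the same batching could be written so that the caller can wait for every batch. It is an illustration only, not the repository's code: `chunk`, `putAllRecords`, and the injected `putBatch` function are hypothetical names, and `putBatch` is assumed to return a Promise (for example a thin wrapper around `firehose.putRecordBatch(params).promise()` from the AWS SDK for JavaScript v2).

```javascript
// Hypothetical helper (not part of the repository): split an array into
// consecutive chunks of at most `size` items.
const chunk = (items, size) => {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
};

// Hypothetical variant of the ingestor's batching. `putBatch` is an injected
// function that sends one batch and returns a Promise. Batches are sent one
// after another, and the returned Promise resolves with the number of
// batches sent only after the last batch has completed.
const putAllRecords = async (records, putBatch, batchSize = 500) => {
  const batches = chunk(records, batchSize);
  for (const batch of batches) {
    await putBatch(batch);
  }
  return batches.length;
};
```

With a helper like this, a handler could call `putAllRecords(records, putBatch)` and only report success in the following `.then()`, so that no batch is still in flight when the invocation ends. Injecting `putBatch` also keeps the batching logic testable without AWS credentials.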