├── .DS_Store ├── LICENSE ├── README.md ├── Setup-Instructions.md ├── images └── diag.png └── lambda_function └── lambda_python.py /.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pauld-splunk/aws-s3-sqs-lambda/0b48d815f34ae232b6293f20f0f984175749294e/.DS_Store -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 
29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 
61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | 179 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # SQS Lambda Function for Ingesting into Splunk using SQS Based S3 Input 2 | 3 | This repository contains a sample function and instructions on setting up a function to allow a single S3 Bucket to be "split" into multiple SQS notifications for ingest into Splunk based on object name. 
4 | 5 | Intended audience: Splunk admins and AWS admins who are setting up ingest from AWS S3 into Splunk using the Splunk Add-On for Amazon Web Services (https://splunkbase.splunk.com/app/1876/). 6 | 7 | **Overview** 8 | 9 | Many organisations have a centralised S3 bucket that collects logs from multiple sources and accounts. For example, all Config, CloudTrail and access logs may be routed into one central bucket for that organisation. The key prefix of these log objects generally makes it easy to navigate by account and log type; for example, the object keys generally have the format: 10 |
Bucket/AWSLogs/account number/logtype/region/year/month/day/log
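As a rough illustration of the key layout above, the following sketch splits an object key into named parts. The function name `parse_log_key`, the field names and the example key are assumptions made for illustration only; real keys written by AWS services do not always have exactly these segments.

```python
from typing import Optional


def parse_log_key(key: str) -> Optional[dict]:
    """Split 'AWSLogs/<account>/<logtype>/<region>/<yyyy>/<mm>/<dd>/<file>'."""
    parts = key.split("/")
    # Expect the literal 'AWSLogs' prefix followed by at least seven segments
    if len(parts) < 8 or parts[0] != "AWSLogs":
        return None
    return {
        "account": parts[1],
        "logtype": parts[2],
        "region": parts[3],
        "date": "-".join(parts[4:7]),
        "file": "/".join(parts[7:]),
    }


# Hypothetical key, shaped like the format shown above (bucket name omitted)
example = parse_log_key(
    "AWSLogs/123456789012/CloudTrail/eu-west-1/2020/05/17/example-log.json.gz"
)
print(example["account"], example["logtype"])
```

Note that the account number sits before the log type in the key, which is why a single prefix cannot select one log type across all accounts.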
11 | 12 | To collect these logs into Splunk, one of the best-practice approaches is to use the Splunk Add-On for Amazon Web Services (https://splunkbase.splunk.com/app/1876/) with the “SQS Based S3” input. This input uses an SNS notification on the bucket together with an SQS message, which the Add-On uses to identify new files in the bucket and then read them into Splunk. Please refer to the Splunk documentation on setting this up: 13 | https://docs.splunk.com/Documentation/AddOns/released/AWS/SQS-basedS3 14 | 15 | Although this is a very scalable solution, a challenge arises when more than one source of logs, such as CloudTrail and Config, is dropped into the same bucket. S3 event notifications can only be filtered by key prefix (and suffix), which in effect puts the wildcard at the tail end of the prefix, such as /bucket/account/\*. With a centralised logging bucket it is therefore not possible to set up one single notification for all CloudTrail logs, as this would require a filter such as bucket/AWSLogs/\*/CloudTrail/\*, which is not valid. 16 | 17 | One way around this is to set up a notification topic, a corresponding SQS queue and an Add-On input for each account and log type, which over time becomes complicated and difficult to maintain. For example, 100 accounts with 3 log types each would result in 300 SNS topics, 300 SQS queues (each with its own dead-letter queue) and 300 Add-On inputs. 18 | 19 | There is, however, a much easier approach using a Lambda function. Instead of having separate SNS notifications for each account, one SNS topic for the whole bucket triggers a Lambda function via an SQS queue. The function “routes” each notification into other SQS queues depending on the log source, and each of those queues is linked to an Add-On input of the correct “source type”. 
Using this approach, one bucket can hold logs from multiple accounts and sourcetypes without the need for a large setup of SNS topics, SQS queues and Add-On inputs. With the same example of 100 accounts and 3 log types, only 1 SNS topic is needed, with only 4 SQS queues (each with its own dead-letter queue). 20 | (It is also possible to go directly from SNS into a Lambda function, avoiding one SQS queue, but if the function fails there is no way to retrieve the SNS notification, whereas the queue would still contain it.) 21 | 22 | Instructions for setting this up are provided in Setup-Instructions.md. 23 | 24 | ![SQS Lambda Function overview](images/diag.png) 25 | 26 | The sample function covers a use case where 3 different sources may be present in an S3 bucket. It uses function environment variables to set the queue name for each of the sources, as well as a default queue for any other object that is put there. The function can also take an exclusion list environment variable to “ignore” objects that are copied into the bucket but do not need to be sent to Splunk. 27 | 28 | Other use cases may be added to the function, such as sending to different queues based on account numbers. This could enable logs from certain groups of accounts to be sent to different Splunk indexes for security or retention requirements. 29 | -------------------------------------------------------------------------------- /Setup-Instructions.md: -------------------------------------------------------------------------------- 1 | # Setup Instructions 2 | 3 | **Pre-Requisites** 4 | Set up the Splunk Add-On for Amazon Web Services as per the documentation at https://docs.splunk.com/Documentation/AddOns/released/AWS/Description. Make sure the AWS policy and roles have been created to give the Add-On at least the minimum access it needs to your SQS queues and S3 bucket. 
Instructions for this are here: 5 | https://docs.splunk.com/Documentation/AddOns/latest/AWS/ConfigureAWS 6 | https://docs.splunk.com/Documentation/AddOns/released/AWS/ConfigureAWSpermissions 7 | 8 | You will also need to ensure that the Add-On is ready to create new inputs, so please configure the settings as described here: 9 | https://docs.splunk.com/Documentation/AddOns/released/AWS/Setuptheadd-on 10 | 11 | 12 | **Stage 1: Initial S3 Bucket configuration** 13 | 14 | 1) Create a new S3 Bucket, or select an existing bucket where the logs will be written. (Make a note of the bucket name for step 3) 15 | 16 | 2) Create a new SNS topic (make a note of the ARN for the next step) 17 | 18 | 3) Edit the SNS Policy - replace it with the one below (changing "SNS-topic-ARN" and "bucket-name" to the relevant values from steps 1 & 2) 19 | 20 |
21 | {
22 | 	"Version": "2008-10-17",
23 | 	"Id": "example-ID",
24 | 	"Statement": [
25 | 				{"Sid": "example-statement-ID",
26 | 				 "Effect": "Allow",
27 | 				 "Principal": {"AWS":"*" },
28 | 				 "Action": ["SNS:Publish"],
29 | 				 "Resource": "SNS-topic-ARN",
30 | 				 "Condition": {"ArnLike": { "aws:SourceArn": "arn:aws:s3:*:*:bucket-name" }}
31 | 				}]
32 | }
33 | 
34 | 35 | 4) Return to S3, open the properties of the Bucket, and navigate to the "Events" options. Create a new notification. Select the action "All object create events", and send the notification via SNS to the topic created in step 2. 36 | 37 | 5) Create a new SQS Queue - give it a name ending in dlq (e.g. my-bucket-sqs-dlq) - use all other default settings 38 | 39 | 6) Create another SQS Queue with the same name as above but without the dlq ending. Use all defaults, but now select the checkbox "Use Redrive policy" and select the dlq created above. Set the retry count to 500. 40 | 41 | 7) Subscribe the new SQS Queue (not the DLQ one) to the SNS Topic created in step 2 42 | 43 | 44 | **Stage 2: Set up routing SQS queues** 45 | 46 | You will need to create an SQS queue for each of the sources of logs in the S3 Bucket. For example, if you have CloudTrail, Config, S3 Access logs and VPC Flow logs going into the bucket, you will need to repeat these steps for each source. You may also want to set up a default SQS queue to capture any logs that are not categorised. 47 | 48 | 1) Create a new SQS Queue - give it a name ending in dlq (e.g. my-CloudTrail-sqs-dlq) - use all other default settings for the queue 49 | 50 | 2) Create another SQS Queue with the same name as above but without the dlq ending. Use all defaults, but now select the checkbox "Use Redrive policy" and select the dlq created above. Set the retry count to 500. 
51 | 52 | Repeat steps 1 and 2 for each of the sources 53 | 54 | **Stage 3: Setup Lambda Function** 55 | 56 | 1) Create a new Lambda Function (Author from scratch), using Python 3.8 Runtime, default permissions 57 | 2) Copy the sample function into the inline editor 58 | 3) Navigate to the function permissions, and open the execution role / edit it 59 | 4) In addition to the policies already assigned by default, edit the policy and ensure that the role has permissions to do the following: 60 | - Required permissions for SQS: GetQueueUrl, ReceiveMessage, DeleteMessage, GetQueueAttributes, ListQueues, SendMessage, SendMessageBatch 61 | 5) On the Lambda configuration, click Add Trigger, and select SQS. In the drop down, select the SQS queue created in Stage 1, step 6 62 | 63 | **Stage 4: Setup Lambda Function Environment variables** 64 | 65 | These instructions are for the specific settings already defined in the function example. If you change the template, then these will of course change. 66 | 67 | 1) In the environment variables section for the function, add the following variables: 68 | 69 | 70 | 71 | 72 | 73 | 74 | 75 | 76 |
| Variable | Value |
| --- | --- |
| EXCLUDELIST | A list of object names or partial names that should not be read into Splunk. The format is ["object1","object2"], where object1 and object2 are the names or partial names of the object keys to exclude. Make sure that the names are in double quotes and enclosed in a list with square brackets. If no exclusion is needed, do not set this variable |
| CloudTrailQueueName | Name of the SQS queue set up in Stage 2 for CloudTrail logs |
| ConfigQueueName | Name of the SQS queue set up in Stage 2 for Config logs |
| vpcflowlogsQueueName | Name of the SQS queue set up in Stage 2 for VPC flow logs |
| defaultQueueName | Name of the SQS queue set up in Stage 2 for any other logs not matching any of the above |
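The variables in the table can be sanity-checked before deployment with a small helper along these lines. This is only a sketch: the helper name `load_routing_config`, the up-front validation and the example queue names are assumptions for illustration, whereas the sample function simply reads `os.environ` directly.

```python
import json

# Variable names as listed in the table; the queue names used below are hypothetical
QUEUE_VARS = ["CloudTrailQueueName", "ConfigQueueName",
              "vpcflowlogsQueueName", "defaultQueueName"]


def load_routing_config(env):
    """Return (queue_names, exclude_list) from a mapping such as os.environ."""
    missing = [name for name in QUEUE_VARS if not env.get(name)]
    if missing:
        raise ValueError(f"missing environment variables: {missing}")
    # EXCLUDELIST is optional and must be a JSON list of strings
    exclude = json.loads(env["EXCLUDELIST"]) if "EXCLUDELIST" in env else []
    if not isinstance(exclude, list) or not all(isinstance(x, str) for x in exclude):
        raise ValueError('EXCLUDELIST must look like ["object1","object2"]')
    return {name: env[name] for name in QUEUE_VARS}, exclude


queues, exclude = load_routing_config({
    "CloudTrailQueueName": "my-CloudTrail-sqs",
    "ConfigQueueName": "my-Config-sqs",
    "vpcflowlogsQueueName": "my-vpcflowlogs-sqs",
    "defaultQueueName": "my-default-sqs",
    "EXCLUDELIST": '["object1","object2"]',
})
print(queues["ConfigQueueName"], exclude)
```

Passing `os.environ` instead of the literal dictionary would check the values actually configured on the function.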
77 | 78 | **Stage 5: Setup the inputs on your Splunk Add-on for AWS** 79 | 80 | Follow the instructions in the Splunk documentation to add new inputs for the queues set up in Stage 2. Ensure that the sourcetype of each input matches the type of log being ingested from its queue. 81 | 82 | 83 | Your function should now be ready to execute. 84 | 85 | 86 | 87 |
--------------------------------------------------------------------------------
/images/diag.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pauld-splunk/aws-s3-sqs-lambda/0b48d815f34ae232b6293f20f0f984175749294e/images/diag.png
--------------------------------------------------------------------------------
/lambda_function/lambda_python.py:
--------------------------------------------------------------------------------
''' Copyright 2020 Splunk Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
'''


import json
import os

import boto3

sqs = boto3.client('sqs')


def lambda_handler(event, context):
    sqs_records = event["Records"]

    # EXCLUDELIST is optional: a JSON list of key fragments to ignore
    try:
        excludelist = json.loads(os.environ['EXCLUDELIST'])
    except (KeyError, ValueError):
        excludelist = []

    for payload_record in sqs_records:
        # Each SQS record body is an SNS envelope whose 'Message' is the S3 event
        body = json.loads(payload_record['body'])
        message = json.loads(body['Message'])

        # S3 test events (s3:TestEvent) carry no 'Records' key
        s3_records = message.get("Records", [])

        for s3_record in s3_records:
            mykey = s3_record["s3"]["object"]["key"]
            arn = s3_record["s3"]["bucket"]["arn"]

            # Skip objects whose key contains any excluded fragment
            if any(item in mykey for item in excludelist):
                print(f'Not forwarding object <{mykey}> from bucket <{arn}> as it is in exclude list')
                continue

            # Pick the destination queue from the log type in the object key
            if "CloudTrail" in mykey:
                queue_name = os.environ['CloudTrailQueueName']
            elif "Config" in mykey:
                queue_name = os.environ['ConfigQueueName']
            elif "vpcflowlogs" in mykey:
                queue_name = os.environ['vpcflowlogsQueueName']
            else:
                queue_name = os.environ['defaultQueueName']

            # Forward the original SNS envelope unchanged to the chosen queue
            forwardQueueUrl = sqs.get_queue_url(QueueName=queue_name)['QueueUrl']
            sqs.send_message(QueueUrl=forwardQueueUrl, MessageBody=json.dumps(body))

--------------------------------------------------------------------------------
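The routing decision in the sample function can be tried locally without AWS access. The sketch below is not part of the repository: `choose_queue_var`, `make_sqs_event` and `keys_in_event` are hypothetical helpers, and the event is a stripped-down assumption containing only the fields the sample function reads (a real S3 notification carries many more).

```python
import json


def choose_queue_var(key):
    """Return the environment variable naming the queue for this object key."""
    # Same substring checks, in the same order, as the sample function
    for marker in ("CloudTrail", "Config", "vpcflowlogs"):
        if marker in key:
            return marker + "QueueName"
    return "defaultQueueName"


def make_sqs_event(bucket, key):
    """Build a minimal SQS event wrapping an SNS envelope wrapping an S3 event."""
    s3_event = {"Records": [{"s3": {"bucket": {"arn": f"arn:aws:s3:::{bucket}"},
                                    "object": {"key": key}}}]}
    sns_envelope = {"Type": "Notification", "Message": json.dumps(s3_event)}
    return {"Records": [{"body": json.dumps(sns_envelope)}]}


def keys_in_event(event):
    """Unwrap the same three layers (SQS, SNS, S3) that the sample function reads."""
    for record in event["Records"]:
        message = json.loads(json.loads(record["body"])["Message"])
        for s3_record in message.get("Records", []):
            yield s3_record["s3"]["object"]["key"]


# Hypothetical bucket name and object key
event = make_sqs_event(
    "my-bucket", "AWSLogs/123456789012/CloudTrail/eu-west-1/2020/05/17/example-log.json.gz"
)
for key in keys_in_event(event):
    print(key, "->", choose_queue_var(key))
```

Because the checks are substring matches evaluated in order, a key that contains both `CloudTrail` and `Config` is routed to the CloudTrail queue.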