├── .gitignore
├── .travis.yml
├── CHANGELOG
├── LICENSE-2.0.txt
├── README.md
├── Vagrantfile
├── docs
│   └── dynamodb-table-image.png
├── project
│   ├── AwsLambdaScalaExampleProjectBuild.scala
│   ├── BuildSettings.scala
│   ├── Dependencies.scala
│   ├── build.properties
│   └── plugins.sbt
├── src
│   └── main
│       └── scala
│           └── com.snowplowanalytics.awslambda
│               ├── BucketStrategy.scala
│               ├── DynamoDBUtility.scala
│               ├── LambdaFunction.scala
│               └── SimpleEvent.scala
├── tasks.py
└── vagrant
    ├── .gitignore
    ├── ansible.hosts
    ├── peru.yaml
    ├── up.bash
    ├── up.guidance
    └── up.playbooks

/.gitignore:
--------------------------------------------------------------------------------
1 | *.class
2 | *.log
3 | 
4 | # sbt specific
5 | .cache
6 | .history
7 | .lib/
8 | dist/*
9 | target/
10 | lib_managed/
11 | src_managed/
12 | project/boot/
13 | project/plugins/project/
14 | 
15 | # Python
16 | __pycache__/
17 | *.py[cod]
18 | 
19 | # Vagrant
20 | .vagrant
21 | 
--------------------------------------------------------------------------------
/.travis.yml:
--------------------------------------------------------------------------------
1 | language: scala
2 | scala:
3 |   - 2.11.7
4 | jdk:
5 |   - oraclejdk8
6 | 
--------------------------------------------------------------------------------
/CHANGELOG:
--------------------------------------------------------------------------------
1 | Version 0.2.0 (2016-01-25)
2 | --------------------------
3 | Specify SBT version in build.properties (#8)
4 | Replace .sbt build with Scala build project (#9)
5 | Set build artifact file extension to '.jar' (#7)
6 | Fix travis builds, update to jdk8 (#5)
7 | 
8 | Version 0.1.0 (2015-08-20)
9 | --------------------------
10 | Initial release
--------------------------------------------------------------------------------
/LICENSE-2.0.txt:
--------------------------------------------------------------------------------
1 | 
2 |                                  Apache License
3 |                            Version 2.0, January 2004
4 |                         http://www.apache.org/licenses/
5 | 
6 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
7 | 
8 |    1. Definitions.
9 | 
10 |       "License" shall mean the terms and conditions for use, reproduction,
11 |       and distribution as defined by Sections 1 through 9 of this document.
12 | 
13 |       "Licensor" shall mean the copyright owner or entity authorized by
14 |       the copyright owner that is granting the License.
15 | 
16 |       "Legal Entity" shall mean the union of the acting entity and all
17 |       other entities that control, are controlled by, or are under common
18 |       control with that entity. For the purposes of this definition,
19 |       "control" means (i) the power, direct or indirect, to cause the
20 |       direction or management of such entity, whether by contract or
21 |       otherwise, or (ii) ownership of fifty percent (50%) or more of the
22 |       outstanding shares, or (iii) beneficial ownership of such entity.
23 | 
24 |       "You" (or "Your") shall mean an individual or Legal Entity
25 |       exercising permissions granted by this License.
26 | 
27 |       "Source" form shall mean the preferred form for making modifications,
28 |       including but not limited to software source code, documentation
29 |       source, and configuration files.
30 | 
31 |       "Object" form shall mean any form resulting from mechanical
32 |       transformation or translation of a Source form, including but
33 |       not limited to compiled object code, generated documentation,
34 |       and conversions to other media types.
35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # AWS Lambda Scala Example Project 2 | 3 | [ ![Build Status] [travis-image] ] [travis] [ ![Release] [release-image] ] [releases] [ ![License] [license-image] ] [license] 4 | 5 | ## Introduction 6 | 7 | This is an example [AWS Lambda] [aws-lambda] Scala application for processing a [Kinesis] [aws-kinesis] stream of events ([introductory blog post] [blog-post]). 
It reads the stream of simple JSON events generated by our event generator. Our AWS Lambda function aggregates and buckets events and stores them in [DynamoDB] [aws-dynamodb].
8 | 
9 | This was built by the Data Science team at [Snowplow Analytics] [snowplow], who use AWS Lambda in their projects.
10 | 
11 | **Running this requires an Amazon AWS account, and will incur charges.**
12 | 
13 | _See also:_ [AWS Lambda Node.js Project][aws-lambda-nodejs-example-project] | [Spark Streaming Example Project][spark-streaming-example-project]
14 | 
15 | ## Overview
16 | 
17 | We have implemented a super-simple analytics-on-write stream processing job using AWS Lambda. Our AWS Lambda function, written in Scala and running on the Java 8 JVM, reads a Kinesis stream containing events in JSON format:
18 | 
19 | ```json
20 | {
21 |   "timestamp": "2015-06-05T12:54:43.064528",
22 |   "eventType": "Green",
23 |   "id": "4ec80fb1-0963-4e35-8f54-ce760499d974"
24 | }
25 | ```
26 | 
27 | Our job counts the events by `eventType` and aggregates these counts into 1-minute buckets. The job then takes these aggregates and saves them into a table in DynamoDB:
28 | 
29 | ![dynamodb-table-image][dynamodb-table-image]
30 | 
31 | ## Developer Quickstart
32 | 
33 | Assuming git, [Vagrant] [vagrant-install] and [VirtualBox] [virtualbox-install] are installed:
34 | 
35 | ```bash
36 | host$ git clone https://github.com/snowplow/aws-lambda-scala-example-project.git
37 | host$ cd aws-lambda-scala-example-project
38 | host$ vagrant up && vagrant ssh
39 | guest$ cd /vagrant
40 | guest$ sbt assembly
41 | ```
42 | 
43 | ## Tutorial
44 | 
45 | You can follow along in [the release blog post] [blog-post] to get the project up and running yourself.
46 | 
47 | The following steps assume that you are running inside Vagrant, as per the Developer Quickstart above.
48 | 
49 | ### 1. Setting up AWS credentials
50 | 
51 | First we need to configure a default AWS profile:
52 | 
53 | ```bash
54 | $ aws configure
55 | AWS Access Key ID [None]: ...
56 | AWS Secret Access Key [None]: ...
57 | Default region name [None]: us-east-1
58 | Default output format [None]: json
59 | ```
60 | 
61 | ### 2. Set up Amazon Kinesis, DynamoDB, and IAM Role
62 | 
63 | Now we create our Kinesis event stream:
64 | 
65 | ```bash
66 | $ inv create_kinesis_stream my-stream
67 | Kinesis stream [my-stream] not active yet
68 | Kinesis stream [my-stream] not active yet
69 | Kinesis stream [my-stream] not active yet
70 | Kinesis successfully created.
71 | ```
72 | 
73 | Now create our DynamoDB table:
74 | 
75 | ```bash
76 | $ inv create_dynamodb_table default us-east-1 my-table
77 | ```
78 | 
79 | Now we can create our IAM role. We will be using [CloudFormation] [aws-cloudformation] to create our new role. Using `inv create_role`, we can create it like so:
80 | 
81 | ```bash
82 | $ inv create_role
83 | arn:aws:cloudformation:us-east-1:84412349716:stack/LambdaStack/23a341eb0-4162-11e5-9d4f-0150b34c7c
84 | Creating roles
85 | Still creating
86 | Giving Lambda proper permissions
87 | Trying...
88 | Created role
89 | ```
90 | 
91 | ### 3. Build the Scala project jar
92 | 
93 | Let's build our Scala project into a fully self-contained jar file:
94 | 
95 | ```bash
96 | $ sbt assembly
97 | [info] Loading project definition from /aws-lambda-scala-example-project/project
98 | [info] Set current project to aws-lambda-scala-example-project (in build file:/aws-lambda-scala-example-project/)
99 | [info] Including from cache: scala-reflect-2.11.4.jar
100 | ...
101 | [warn] Merging 'rootdoc.txt' with strategy 'first'
102 | [warn] Strategy 'discard' was applied to 62 files
103 | [warn] Strategy 'first' was applied to a file
104 | [info] SHA-1: 96401bbad71968267ccea4c479a7d39093ef8988
105 | [info] Packaging /Volumes/DataDrive/dev/aws-lambda-scala-example-project/target/scala-2.11/aws-lambda-scala-example-project-0.2.0.jar ...
106 | [info] Done packaging.
107 | [success] Total time: 59 s, completed 13-Aug-2015 10:40:05 AM
108 | ```
109 | 
110 | ### 4. Upload the project jar to Amazon S3
111 | 
112 | We will create an S3 bucket from which AWS Lambda can pick up the jar file, then upload the jar to Amazon S3 using our custom uploader `inv upload_s3`:
113 | 
114 | ```bash
115 | $ inv upload_s3
116 | Jar uploaded to S3 bucket aws_scala_lambda_bucket
117 | ```
118 | 
119 | ### 5. Configure AWS Lambda service
120 | 
121 | Now that we have built the project and uploaded the jar file to Amazon S3, we need to configure the Lambda service to watch for event traffic from our Amazon Kinesis stream named `my-stream`.
122 | 
123 | ```bash
124 | $ inv create_lambda
125 | Creating AWS Lambda function.
126 | {
127 |     "FunctionName": "ProcessingKinesisLambdaDynamoDB",
128 |     "CodeSize": 38042279,
129 |     "MemorySize": 1024,
130 |     "FunctionArn": "arn:aws:lambda:us-east-1:842349429716:function:ProcessingKinesisLambdaDynamoDB",
131 |     "Handler": "com.snowplowanalytics.awslambda.LambdaFunction::recordHandler",
132 |     "Role": "arn:aws:iam::842340234716:role/LambdaStack-LambdaExecRole-7G57P4M2VV5P",
133 |     "Timeout": 60,
134 |     "LastModified": "2015-08-13T19:39:46.730+0000",
135 |     "Runtime": "java8",
136 |     "Description": ""
137 | }
138 | ```
139 | 
140 | Now we can associate our Lambda with our Kinesis stream:
141 | 
142 | ```bash
143 | $ inv configure_lambda my-stream
144 | Configured AWS Lambda service.
145 | Added Kinesis as event source for Lambda function.
146 | ```
147 | 
148 | ### 6. Sending events to Kinesis
149 | 
150 | We need to start sending events to our new Kinesis stream. We have created a helper method to do this - run the command below and leave it running:
151 | 
152 | ```bash
153 | $ inv generate_events default us-east-1 my-stream
154 | Event sent to Kinesis: {"timestamp": "2015-06-05T12:54:43.064528", "eventType": "Green", "id": "4ec80fb1-0963-4e35-8f54-ce760499d974"}
155 | Event sent to Kinesis: {"timestamp": "2015-06-05T12:54:43.757797", "eventType": "Red", "id": "eb84b0d1-f793-4213-8a65-2fb09eab8c5c"}
156 | Event sent to Kinesis: {"timestamp": "2015-06-05T12:54:44.295972", "eventType": "Yellow", "id": "4654bdc8-86d4-44a3-9920-fee7939e2582"}
157 | ...
158 | ```
159 | 
160 | ### 7. Monitoring your job
161 | 
162 | First head over to the AWS Lambda service console, then review the logs in CloudWatch.
163 | 
164 | Finally, let's check the data in our DynamoDB table. Make sure you are in the correct AWS region, then click on `my-table` and hit the `Explore Table` button:
165 | 
166 | ![dynamodb-table-image][dynamodb-table-image]
167 | 
168 | For each **BucketStart** and **EventType** pair, we see a **Count**, plus some **CreatedAt** and **UpdatedAt** metadata for debugging purposes. Our bucket size is 1 minute, and we have 5 discrete event types, hence the matrix of rows that we see.
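If you would rather verify the table from the command line than in the console, a minimal sketch along the following lines should work, reusing the same boto `dynamodb2` calls that `tasks.py` is built on. This helper script is not part of the project; the `default` profile, `us-east-1` region and `my-table` name are simply the values used throughout this tutorial.

```python
import boto.dynamodb2
from boto.dynamodb2.table import Table

# Connect with the same profile and region used elsewhere in the tutorial
connection = boto.dynamodb2.connect_to_region("us-east-1", profile_name="default")
aggregates = Table("my-table", connection=connection)

# Scan the AggregateRecords matrix and print one line per BucketStart/EventType pair
for record in aggregates.scan():
    print("{} {}: {}".format(record["BucketStart"], record["EventType"], record["Count"]))
```

A full scan is fine here because the table only holds a small matrix of aggregate rows; for a larger table you would query on the **BucketStart** hash key instead.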
169 | 170 | ## Roadmap 171 | 172 | * Expanding our analytics-on-write thinking into our new [Icebucket] [icebucket] project 173 | 174 | ## Credits 175 | 176 | * Ian Meyers and his [Amazon Kinesis Aggregators][amazon-kinesis-aggregators] project, a true inspiration for streaming analytics-on-write 177 | 178 | ## Copyright and license 179 | 180 | AWS Lambda Scala Example Project is copyright 2015 Snowplow Analytics Ltd. 181 | 182 | Licensed under the [Apache License, Version 2.0] [license] (the "License"); 183 | you may not use this software except in compliance with the License. 184 | 185 | Unless required by applicable law or agreed to in writing, software 186 | distributed under the License is distributed on an "AS IS" BASIS, 187 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 188 | See the License for the specific language governing permissions and 189 | limitations under the License. 190 | 191 | [travis]: https://travis-ci.org/snowplow/aws-lambda-scala-example-project 192 | [travis-image]: https://travis-ci.org/snowplow/aws-lambda-scala-example-project.png?branch=master 193 | [license-image]: http://img.shields.io/badge/license-Apache--2-blue.svg?style=flat 194 | [license]: http://www.apache.org/licenses/LICENSE-2.0 195 | [release-image]: http://img.shields.io/badge/release-0.2.0-blue.svg?style=flat 196 | [releases]: https://github.com/snowplow/aws-lambda-scala-example-project/releases 197 | [grunt-image]: https://cdn.gruntjs.com/builtwith.png 198 | 199 | [spark-streaming-example-project]: https://github.com/snowplow/spark-streaming-example-project 200 | [aws-lambda-nodejs-example-project]: https://github.com/snowplow/aws-lambda-nodejs-example-project 201 | [vagrant-install]: http://docs.vagrantup.com/v2/installation/index.html 202 | [virtualbox-install]: https://www.virtualbox.org/wiki/Downloads 203 | 204 | [blog-post]: http://snowplowanalytics.com/blog/2015/08/20/aws-lambda-scala-example-project-0.1.0-released/ 205 | [dynamodb-table-image]: /docs/dynamodb-table-image.png?raw=true 206 | 207 | [aws-cloudformation]: http://aws.amazon.com/cloudformation 208 | [aws-lambda]: http://aws.amazon.com/lambda/ 209 | [aws-kinesis]: http://aws.amazon.com/kinesis/ 210 | [aws-dynamodb]: http://aws.amazon.com/dynamodb 211 | [amazon-kinesis-aggregators]: https://github.com/awslabs/amazon-kinesis-aggregators 212 | 213 | [snowplow]: http://snowplowanalytics.com 214 | [icebucket]: https://github.com/snowplow/icebucket 215 | -------------------------------------------------------------------------------- /Vagrantfile: -------------------------------------------------------------------------------- 1 | Vagrant.configure("2") do |config| 2 | 3 | config.vm.box = "ubuntu/trusty64" 4 | config.vm.hostname = "aws-lambda-scala-example-project" 5 | config.ssh.forward_agent = true 6 | 7 | config.vm.provider :virtualbox do |vb| 8 | vb.name = Dir.pwd().split("/")[-1] + "-" + Time.now.to_f.to_i.to_s 9 | vb.customize ["modifyvm", :id, "--natdnshostresolver1", "on"] 10 | vb.customize [ "guestproperty", "set", :id, "--timesync-threshold", 10000 ] 11 | # Scala is memory-hungry 12 | vb.memory = 6000 13 | end 14 | 15 | config.vm.provision :shell do |sh| 16 | sh.path = "vagrant/up.bash" 17 | end 18 | 19 | end 20 | -------------------------------------------------------------------------------- /docs/dynamodb-table-image.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/snowplow-archive/aws-lambda-scala-example-project/3e1323f5f3f7b3b31a375463edd0eb963a8a6a1e/docs/dynamodb-table-image.png -------------------------------------------------------------------------------- /project/AwsLambdaScalaExampleProjectBuild.scala: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2012-2014 Snowplow Analytics Ltd. All rights reserved. 3 | * 4 | * This program is licensed to you under the Apache License Version 2.0, and 5 | * you may not use this file except in compliance with the Apache License 6 | * Version 2.0. You may obtain a copy of the Apache License Version 2.0 at 7 | * http://www.apache.org/licenses/LICENSE-2.0. 8 | * 9 | * Unless required by applicable law or agreed to in writing, software 10 | * distributed under the Apache License Version 2.0 is distributed on an "AS 11 | * IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 12 | * implied. See the Apache License Version 2.0 for the specific language 13 | * governing permissions and limitations there under. 14 | */ 15 | 16 | import sbt.Keys._ 17 | import sbt._ 18 | 19 | object AwsLambdaScalaExampleProjectBuild extends Build { 20 | 21 | import BuildSettings._ 22 | import Dependencies._ 23 | 24 | // Configure prompt to show current project. 25 | override lazy val settings = super.settings :+ { 26 | shellPrompt := { s => Project.extract(s).currentProject.id + " > " } 27 | } 28 | 29 | // Define our project, with basic project information and library 30 | // dependencies. 31 | lazy val project = Project("aws-lambda-scala-example-project", file(".")) 32 | .settings(buildSettings: _*) 33 | .settings( 34 | libraryDependencies ++= Seq( 35 | Libraries.awsLambda, 36 | Libraries.awsLambdaEvents, 37 | Libraries.awsSdk, 38 | Libraries.awsSdkCore, 39 | Libraries.jackson, 40 | Libraries.json4s, 41 | Libraries.awsscala 42 | ) 43 | ) 44 | 45 | } 46 | -------------------------------------------------------------------------------- /project/BuildSettings.scala: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2012-2015 Snowplow Analytics Ltd. All rights reserved. 3 | * 4 | * This program is licensed to you under the Apache License Version 2.0, 5 | * and you may not use this file except in compliance with the 6 | * Apache License Version 2.0. 7 | * You may obtain a copy of the Apache License Version 2.0 at 8 | * http://www.apache.org/licenses/LICENSE-2.0. 9 | * 10 | * Unless required by applicable law or agreed to in writing, 11 | * software distributed under the Apache License Version 2.0 is distributed on 12 | * an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 13 | * express or implied. See the Apache License Version 2.0 for the specific 14 | * language governing permissions and limitations there under. 
15 | */ 16 | 17 | import sbt.Keys._ 18 | import sbt._ 19 | 20 | object BuildSettings { 21 | 22 | // Basic settings for our app 23 | lazy val basicSettings = Seq[Setting[_]]( 24 | organization := "com.snowplowanalytics", 25 | version := "0.2.0", 26 | retrieveManaged := true, 27 | description := "AWS Lambda Scala example project", 28 | scalaVersion := "2.11.7", 29 | javacOptions ++= Seq("-source", "1.8", "-target", "1.8", "-Xlint") 30 | ) 31 | 32 | import sbtassembly.Plugin._ 33 | import AssemblyKeys._ 34 | 35 | lazy val sbtAssemblySettings = assemblySettings ++ Seq( 36 | jarName in assembly := { 37 | name.value + "-" + version.value + ".jar" 38 | }, 39 | 40 | // META-INF discarding 41 | mergeStrategy in assembly := { 42 | case PathList("META-INF", xs@_*) => MergeStrategy.discard 43 | case x => MergeStrategy.first 44 | } 45 | ) 46 | 47 | lazy val buildSettings = basicSettings ++ sbtAssemblySettings 48 | } 49 | -------------------------------------------------------------------------------- /project/Dependencies.scala: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2012-2015 Snowplow Analytics Ltd. All rights reserved. 3 | * 4 | * This program is licensed to you under the Apache License Version 2.0, 5 | * and you may not use this file except in compliance with the Apache License Version 2.0. 6 | * You may obtain a copy of the Apache License Version 2.0 at http://www.apache.org/licenses/LICENSE-2.0. 7 | * 8 | * Unless required by applicable law or agreed to in writing, 9 | * software distributed under the Apache License Version 2.0 is distributed on an 10 | * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 11 | * See the Apache License Version 2.0 for the specific language governing permissions and limitations there under. 12 | */ 13 | 14 | import sbt._ 15 | 16 | object Dependencies { 17 | 18 | object V { 19 | val awsLambda = "1.0.0" 20 | val awsSdk = "1.9.34" 21 | val jackson = "2.5.2" 22 | val json4s = "3.2.11" 23 | val awscala = "0.5.+" 24 | } 25 | 26 | object Libraries { 27 | val awsLambda = "com.amazonaws" % "aws-lambda-java-core" % V.awsLambda 28 | val awsLambdaEvents = "com.amazonaws" % "aws-lambda-java-events" % V.awsLambda 29 | val awsSdk = "com.amazonaws" % "aws-java-sdk" % V.awsSdk % "provided" 30 | val awsSdkCore = "com.amazonaws" % "aws-java-sdk-core" % V.awsSdk % "provided" 31 | val jackson = "com.fasterxml.jackson.module" % "jackson-module-scala_2.11" % V.jackson 32 | val json4s = "org.json4s" %% "json4s-jackson" % V.json4s 33 | val awsscala = "com.github.seratch" %% "awscala" % V.awscala 34 | } 35 | 36 | } 37 | -------------------------------------------------------------------------------- /project/build.properties: -------------------------------------------------------------------------------- 1 | sbt.version=0.13.9 2 | -------------------------------------------------------------------------------- /project/plugins.sbt: -------------------------------------------------------------------------------- 1 | addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2") -------------------------------------------------------------------------------- /src/main/scala/com.snowplowanalytics.awslambda/BucketStrategy.scala: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2015 Snowplow Analytics Ltd. All rights reserved. 
3 |  *
4 |  * This program is licensed to you under the Apache License Version 2.0,
5 |  * and you may not use this file except in compliance with the Apache License Version 2.0.
6 |  * You may obtain a copy of the Apache License Version 2.0 at http://www.apache.org/licenses/LICENSE-2.0.
7 |  *
8 |  * Unless required by applicable law or agreed to in writing,
9 |  * software distributed under the Apache License Version 2.0 is distributed on an
10 |  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
11 |  * See the Apache License Version 2.0 for the specific language governing permissions and limitations there under.
12 |  */
13 | package com.snowplowanalytics.awslambda
14 | 
15 | // Java
16 | import java.util.Date
17 | import java.text.SimpleDateFormat
18 | 
19 | /**
20 |  * This object uses a downsampling method to create metadata from each
21 |  * EventType log record: parsing the ISO 8601 datetime stamp
22 |  * only down to the minute means downsampling, i.e. reducing
23 |  * precision.
24 |  *
25 |  * Bucketing:
26 |  * A family of aggregations that build buckets, where each bucket
27 |  * is associated with a key and an EventType criterion. When the
28 |  * aggregation is executed, all the buckets' criteria are evaluated
29 |  * on every EventType in the context, and when a criterion matches,
30 |  * the EventType is considered to "fall into" the relevant bucket.
31 |  * By the end of the aggregation process, we’ll end up with a
32 |  * list of buckets - each one with a set of EventTypes that
33 |  * "belong" to it.
34 |  *
35 |  */
36 | object BucketStrategy {
37 | 
38 |   private val BucketToMinuteFormatter = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:00.000")
39 | 
40 |   /**
41 |    * Buckets a date according to our
42 |    * bucketing strategy. Bucketing
43 |    * means downsampling, i.e. reducing
44 |    * precision - here, to the minute.
45 |    *
46 |    * @param date The Java Date to bucket
47 |    * @return the downsampled date in String
48 |    *         format
49 |    */
50 |   def bucket(date: Date): String =
51 |     BucketToMinuteFormatter.format(date)
52 | }
53 | 
--------------------------------------------------------------------------------
/src/main/scala/com.snowplowanalytics.awslambda/DynamoDBUtility.scala:
--------------------------------------------------------------------------------
1 | /*
2 |  * Copyright (c) 2015 Snowplow Analytics Ltd. All rights reserved.
3 |  *
4 |  * This program is licensed to you under the Apache License Version 2.0,
5 |  * and you may not use this file except in compliance with the Apache License Version 2.0.
6 |  * You may obtain a copy of the Apache License Version 2.0 at http://www.apache.org/licenses/LICENSE-2.0.
7 |  *
8 |  * Unless required by applicable law or agreed to in writing,
9 |  * software distributed under the Apache License Version 2.0 is distributed on an
10 |  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
11 |  * See the Apache License Version 2.0 for the specific language governing permissions and limitations there under.
12 |  */
13 | package com.snowplowanalytics.awslambda
14 | 
15 | // Java
16 | import java.util.Date
17 | import java.util.TimeZone
18 | import java.text.SimpleDateFormat
19 | 
20 | // AWS
21 | import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient
22 | import com.amazonaws.services.dynamodbv2.document.{DynamoDB, Item}
23 | 
24 | /**
25 |  * Singleton that sets up the DynamoDB client used to access the
26 |  * aggregation records table. The utility function below puts items into the
27 |  * "AggregateRecords" table.
28 |  *
29 |  *   val dynamoConnection = DynamoDBUtility.setupDynamoClientConnection()
30 |  */
31 | object DynamoDBUtility {
32 | 
33 |   val dateFormatter = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")
34 |   val timezone = TimeZone.getTimeZone("UTC")
35 | 
36 |   /**
37 |    * Helper returning the current time, formatted in UTC
38 |    */
39 |   def timeNow(): String = {
40 |     dateFormatter.setTimeZone(timezone)
41 |     dateFormatter.format(new Date())
42 |   }
43 | 
44 |   /**
45 |    * Wraps DynamoDB client setup; inside AWS Lambda the
46 |    * IAM execution role provides the client's credentials
47 |    */
48 |   def setupDynamoClientConnection(): DynamoDB = {
49 |     val dynamoDB = new DynamoDB(new AmazonDynamoDBClient())
50 |     dynamoDB
51 |   }
52 | 
53 |   /**
54 |    * Wraps the AWS Java putItem operation on a DynamoDB table
55 |    */
56 |   def putItem(dynamoDB: DynamoDB, tableName: String, bucketStart: String, eventType: String, createdAt: String, updatedAt: String, count: Int) {
57 | 
58 |     // AggregateRecords column names
59 |     val tablePrimaryKeyName = "BucketStart"
60 |     val tableEventTypeSecondaryKeyName = "EventType"
61 |     val tableCreatedAtColumnName = "CreatedAt"
62 |     val tableUpdatedAtColumnName = "UpdatedAt"
63 |     val tableCountColumnName = "Count"
64 | 
65 |     try {
66 |       val table = dynamoDB.getTable(tableName)
67 | 
68 |       val item = new Item().withPrimaryKey(tablePrimaryKeyName, bucketStart)
69 |         .withString(tableEventTypeSecondaryKeyName, eventType)
70 |         .withString(tableCreatedAtColumnName, createdAt)
71 |         .withString(tableUpdatedAtColumnName, updatedAt)
72 |         .withInt(tableCountColumnName, count)
73 | 
74 |       // Save the item to the DynamoDB AggregateRecords table
75 |       table.putItem(item)
76 |     } catch {
77 |       case e: Exception => {
78 |         System.err.println("Failed to create item in " + tableName)
79 |         System.err.println(e.getMessage)
80 |       }
81 |     }
82 |   }
83 | }
--------------------------------------------------------------------------------
/src/main/scala/com.snowplowanalytics.awslambda/LambdaFunction.scala:
--------------------------------------------------------------------------------
1 | /*
2 |  * Copyright (c) 2015 Snowplow Analytics Ltd. All rights reserved.
3 |  *
4 |  * This program is licensed to you under the Apache License Version 2.0,
5 |  * and you may not use this file except in compliance with the Apache License Version 2.0.
6 |  * You may obtain a copy of the Apache License Version 2.0 at http://www.apache.org/licenses/LICENSE-2.0.
7 |  *
8 |  * Unless required by applicable law or agreed to in writing,
9 |  * software distributed under the Apache License Version 2.0 is distributed on an
10 |  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
11 |  * See the Apache License Version 2.0 for the specific language governing permissions and limitations there under.
12 |  */
13 | package com.snowplowanalytics.awslambda
14 | 
15 | // AWS
16 | import com.amazonaws.services.lambda.runtime.events.KinesisEvent
17 | import awscala._
18 | import dynamodbv2._
19 | 
20 | // Scala
21 | import scala.collection.JavaConversions._
22 | import scala.collection.mutable.Buffer
23 | 
24 | class LambdaFunction {
25 | 
26 |   private val AwsRegion = Region.US_EAST_1
27 |   private val AwsTable = "my-table"
28 | 
29 |   import com.fasterxml.jackson.databind.ObjectMapper
30 |   import com.fasterxml.jackson.module.scala.DefaultScalaModule
31 | 
32 |   val scalaMapper = {
33 |     new ObjectMapper().registerModule(new DefaultScalaModule)
34 |   }
35 | 
36 |   def recordHandler(event: KinesisEvent) {
37 | 
38 |     // Convert each Kinesis record payload into a SimpleEvent
39 |     val convertedRecords =
40 |       for {
41 |         rec <- event.getRecords
42 |         json = new String(rec.getKinesis.getData.array())
43 |         simpleEvent = scalaMapper.readValue(json, classOf[SimpleEvent])
44 |       } yield simpleEvent
45 | 
46 |     def aggregateRecords(converted: Buffer[SimpleEvent]) {
47 |       // Group events by time bucket, then count occurrences of each event type
48 |       val eventArray = converted.groupBy(_.bucket).mapValues(_.map(x => x.eventType))
49 |       val counted = eventArray.mapValues(_.groupBy(identity).mapValues(_.size))
50 | 
51 |       implicit val dynamoDB = DynamoDB.at(AwsRegion)
52 |       val table: Table = dynamoDB.table(AwsTable).get
53 | 
54 |       // Atomic increments with addAttributes
55 |       for ((bucketStart, counts) <- counted; (eventType, count) <- counts)
56 |         table.addAttributes(bucketStart, eventType, Seq("Count" -> count))
57 |     }
58 |     aggregateRecords(convertedRecords)
59 |   }
60 | }
61 | 
--------------------------------------------------------------------------------
/src/main/scala/com.snowplowanalytics.awslambda/SimpleEvent.scala:
--------------------------------------------------------------------------------
1 | /*
2 |  * Copyright (c) 2015 Snowplow Analytics Ltd. All rights reserved.
3 |  *
4 |  * This program is licensed to you under the Apache License Version 2.0,
5 |  * and you may not use this file except in compliance with the Apache License Version 2.0.
6 |  * You may obtain a copy of the Apache License Version 2.0 at http://www.apache.org/licenses/LICENSE-2.0.
7 |  *
8 |  * Unless required by applicable law or agreed to in writing,
9 |  * software distributed under the Apache License Version 2.0 is distributed on an
10 |  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
11 |  * See the Apache License Version 2.0 for the specific language governing permissions and limitations there under.
12 |  */
13 | package com.snowplowanalytics.awslambda
14 | 
15 | // Java
16 | import java.text.SimpleDateFormat
17 | import java.util.Date
18 | 
19 | // json4s
20 | import org.json4s._
21 | import org.json4s.jackson.JsonMethods._
22 | 
23 | /**
24 |  * Companion object for creating a SimpleEvent
25 |  * from incoming JSON
26 |  */
27 | object SimpleEvent {
28 | 
29 |   private val format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss")
30 | 
31 |   /**
32 |    * Converts a date string into a Date object
33 |    */
34 |   def convertStringToDate(dateString: String): Date = format.parse(dateString)
35 | 
36 |   /**
37 |    * Converts a Kinesis ByteArray of JSON data into a SimpleEvent object
38 |    */
39 |   def fromJson(byteArray: Array[Byte]): SimpleEvent = {
40 |     implicit val formats = DefaultFormats
41 |     val newString = new String(byteArray, "UTF-8")
42 |     val parsed = parse(newString)
43 |     parsed.extract[SimpleEvent]
44 |   }
45 | 
46 | }
47 | 
48 | /**
49 |  * Simple class demonstrating an EventType log consisting of:
50 |  * 1. An ISO 8601 DateTime stamp that will be downsampled
51 |  *    (see the BucketStrategy.scala file for more details)
52 |  * 2. A simple model of colors for this EventType:
53 |  *    'Red','Orange','Yellow','Green', or 'Blue'
54 |  * example log: {"timestamp": "2015-06-05T13:00:22.540374", "eventType": "Orange", "id": "018dd633-f4c3-4599-9b44-ebf71a1c519f"}
55 |  */
56 | case class SimpleEvent(id: String, timestamp: String, eventType: String) {
57 | 
58 |   // Convert timestamp into Time Bucket using Bucketing Strategy
59 |   val bucket = BucketStrategy.bucket(SimpleEvent.convertStringToDate(timestamp))
60 | }
61 | 
--------------------------------------------------------------------------------
/tasks.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2015 Snowplow Analytics Ltd. All rights reserved.
2 | #
3 | # This program is licensed to you under the Apache License Version 2.0,
4 | # and you may not use this file except in compliance with the Apache License Version 2.0.
5 | # You may obtain a copy of the Apache License Version 2.0 at http://www.apache.org/licenses/LICENSE-2.0.
6 | #
7 | # Unless required by applicable law or agreed to in writing,
8 | # software distributed under the Apache License Version 2.0 is distributed on an
9 | # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
10 | # See the Apache License Version 2.0 for the specific language governing permissions and limitations there under.
11 | 
12 | import datetime, json, uuid, time
13 | from functools import partial
14 | from random import choice
15 | from invoke import run, task
16 | import boto
17 | from boto import kinesis
18 | import boto.dynamodb2
19 | from boto.dynamodb2.fields import HashKey, RangeKey, KeysOnlyIndex, GlobalAllIndex
20 | from boto.dynamodb2.table import Table
21 | from boto.dynamodb2.types import NUMBER
22 | import boto.cloudformation
23 | import time
24 | import math, os
25 | from filechunkio import FileChunkIO
26 | 
27 | # settings for AWS Lambda and Lambda-Exec-Role
28 | REGION = "us-east-1"
29 | IAM_ROLE_ARN = ""
30 | IAM_ROLE = ""
31 | POLICY = """{
32 |     "Statement":[{
33 |         "Effect":"Allow",
34 |         "Action":["*"],
35 |         "Resource":["*"]}]}"""
36 | POLICY_NAME = "AdministratorAccess"
37 | STACK_NAME = "LambdaStack"
38 | TEMPLATE_URL = "https://snowplow-hosted-assets.s3.amazonaws.com/third-party/aws-lambda/lambda-admin.template"
39 | CAPABILITIES = ["CAPABILITY_IAM"]
40 | JARFILE = "./target/scala-2.11/aws-lambda-scala-example-project-0.2.0.jar"
41 | S3_BUCKET = "aws_scala_lambda_bucket"
42 | S3_KEY = os.path.basename(JARFILE)
43 | FUNCTION_NAME = "ProcessingKinesisLambdaDynamoDB"
44 | # Selection of EventType values
45 | COLORS = ['Red','Orange','Yellow','Green','Blue']
46 | # DynamoDB settings
47 | THROUGHPUT_READ = 20
48 | THROUGHPUT_WRITE = 20
49 | 
50 | 
51 | # AWS Kinesis Data Generator
52 | def picker(seq):
53 |     """
54 |     Returns a new function that can be called without arguments
55 |     to select and return a random color
56 |     """
57 |     return partial(choice, seq)
58 | 
59 | def create_event():
60 |     """
61 |     Picks a random color and builds an event
62 |     """
63 |     event_id = str(uuid.uuid4())
64 |     color_choice = picker(COLORS)
65 | 
66 |     return (event_id, {
67 |         "id": event_id,
68 |         "timestamp": datetime.datetime.now().isoformat(),
69 |         "eventType": color_choice()
70 |     })
71 | 
72 | def write_event(conn, stream_name):
73 |     """
74 |     Writes a generated event to the Kinesis stream and returns its JSON payload
75 |     """
76 |     event_id, event_payload = create_event()
77 |     event_json = json.dumps(event_payload)
78 |     conn.put_record(stream_name, event_json, event_id)
79 |     return event_json
80 | 
81 | @task
82 | def upload_s3():
83 |     """
84 |     Upload the jar file to S3
85 |     """
86 |     source_path = JARFILE
87 |     source_size = os.stat(source_path).st_size
88 | 
89 |     # create the bucket if it does not already exist
90 |     conn = boto.connect_s3()
91 |     bucket = conn.create_bucket(S3_BUCKET)
92 | 
93 |     # upload, reusing the same S3 connection
94 |     b = conn.get_bucket(S3_BUCKET)
95 | 
96 |     # Create a multipart upload request; S3 assembles the
97 |     # parts into a single object on completion
98 |     mp = b.initiate_multipart_upload(os.path.basename(source_path))
99 | 
100 |     # Use a chunk size of 5 MiB
101 |     chunk_size = 5242880
102 |     chunk_count = int(math.ceil(source_size / float(chunk_size)))
103 | 
104 |     # Send the file parts, using FileChunkIO to create a file-like object
105 |     # that points to a certain byte range within the original file. We
106 |     # set bytes to never exceed the original file size.
107 |     for i in range(chunk_count):
108 |         offset = chunk_size * i
109 |         bytes = min(chunk_size, source_size - offset)
110 |         with FileChunkIO(source_path, 'r', offset=offset,
111 |                          bytes=bytes) as fp:
112 |             mp.upload_part_from_file(fp, part_num=i + 1)
113 | 
114 |     # Finish the upload
115 |     mp.complete_upload()
116 |     print("Jar uploaded to S3 bucket " + S3_BUCKET)
117 | 
118 | @task
119 | def create_role():
120 |     """
121 |     Creates the IAM role for the AWS Lambda service using CloudFormation
122 |     """
123 |     client_cf = boto.cloudformation.connect_to_region(REGION)
124 |     response = client_cf.create_stack(
125 |         stack_name=STACK_NAME,
126 |         template_url=TEMPLATE_URL,
127 |         capabilities=CAPABILITIES
128 |     )
129 |     print response
130 |     time.sleep(7)
131 |     print "Creating roles"
132 |     time.sleep(7)
133 |     print "Still creating"
134 |     time.sleep(7)
135 |     print "Giving Lambda proper permissions"
136 |     # get the name of the LambdaExecRole created by the stack
137 |     client_iam = boto.connect_iam()
138 |     roles = client_iam.list_roles()
139 |     list_roles = roles['list_roles_response']['list_roles_result']['roles']
140 |     for i in range(len(list_roles)):
141 |         if STACK_NAME+"-LambdaExecRole" in list_roles[i].arn:
142 |             IAM_ROLE = list_roles[i].role_name
143 |     print "Trying..."
144 |     # grants Admin access to LambdaExecRole to access Cloudwatch, DynamoDB, Kinesis
145 |     client_iam.put_role_policy(IAM_ROLE, POLICY_NAME, POLICY)
146 |     print "Created role"
147 | 
148 | 
149 | @task
150 | def generate_events(profile, region, stream):
151 |     """
152 |     Load demo data into Kinesis with the Python SimpleEvent generator
153 |     """
154 |     conn = kinesis.connect_to_region(region, profile_name=profile)
155 |     while True:
156 |         event_json = write_event(conn, stream)
157 |         print "Event sent to Kinesis: {}".format(event_json)
158 | 
159 | @task
160 | def create_lambda():
161 |     """
162 |     Create the aws-lambda-scala-example-project AWS Lambda function
163 |     """
164 |     # TODO: switch to use all boto
165 |     IAM_ROLE_ARN = get_iam_role_arn()
166 |     print("Creating AWS Lambda function.")
167 |     run("aws lambda create-function --region {} \
168 |         --function-name {} \
169 |         --code S3Bucket={},S3Key={} \
170 |         --role {} \
171 |         --handler com.snowplowanalytics.awslambda.LambdaFunction::recordHandler \
172 |         --runtime java8 --timeout 60 --memory-size 1024".format(REGION, FUNCTION_NAME, S3_BUCKET, S3_KEY, IAM_ROLE_ARN), pty=True)
173 | 
174 | def get_iam_role_arn():
175 |     client_iam = boto.connect_iam()
176 |     roles = client_iam.list_roles()
177 |     list_roles = roles['list_roles_response']['list_roles_result']['roles']
178 |     for i in range(len(list_roles)):
179 |         if STACK_NAME+"-LambdaExecRole" in list_roles[i].arn:
180 |             IAM_ROLE_ARN = list_roles[i].arn
181 |     return IAM_ROLE_ARN
182 | 
183 | @task
184 | def configure_lambda(stream):
185 |     """
186 |     Configure the Lambda function to use Kinesis as its event source
187 |     """
188 |     print("Configured AWS Lambda service")
189 |     IAM_ROLE_ARN = get_iam_role_arn()
190 |     aws_lambda = boto.connect_awslambda()
191 |     event_source = kinesis_stream(stream)
192 |     response_add_event_source = aws_lambda.add_event_source(event_source,
193 |                                                             FUNCTION_NAME,
194 |                                                             IAM_ROLE_ARN,
195 |                                                             batch_size=100,
196 |                                                             parameters=None)
197 |     event_source_id = response_add_event_source['UUID']
198 | 
199 |     while response_add_event_source['IsActive'] != 'true':
200 |         print('Waiting for the event source to become active')
201 |         time.sleep(5)
202 |         response_add_event_source = aws_lambda.get_event_source(event_source_id)
203 | 
204 |     print('Added Kinesis as event source for Lambda function')
205 | 
206 | 
207 | @task
208 | def create_dynamodb_table(profile, region, table):
209 |     """
210 |     DynamoDB table creation with the AWS boto library in Python
211 |     """
212 |     connection = boto.dynamodb2.connect_to_region(region, profile_name=profile)
213 |     aggregate = Table.create(table,
214 |         schema=[
215 |             HashKey("BucketStart"),
216 |             RangeKey("EventType"),
217 |         ],
218 |         throughput={
219 |             'read': THROUGHPUT_READ,
220 |             'write': THROUGHPUT_WRITE
221 |         },
222 |         connection=connection
223 |     )
224 | 
225 | @task
226 | def create_kinesis_stream(stream):
227 |     """
228 |     Creates our Kinesis stream
229 |     """
230 |     kinesis = boto.connect_kinesis()
231 |     response = kinesis.create_stream(stream, 1)
232 |     pause_until_kinesis_active(stream)
233 |     print("Kinesis successfully created")
234 | 
235 | def pause_until_kinesis_active(stream):
236 |     kinesis = boto.connect_kinesis()
237 |     # Wait for the Kinesis stream to become active
238 |     while kinesis.describe_stream(stream)['StreamDescription']['StreamStatus'] != 'ACTIVE':
239 |         print('Kinesis stream [' + stream + '] not active yet')
240 |         time.sleep(5)
241 | 
242 | def kinesis_stream(stream):
243 |     """
244 |     Returns the Kinesis stream ARN
245 |     """
246 |     kinesis = boto.connect_kinesis()
247 |     return
kinesis.describe_stream(stream)['StreamDescription']['StreamARN'] 248 | 249 | @task 250 | def describe_kinesis_stream(stream): 251 | """ 252 | Prints status Kinesis stream 253 | """ 254 | print("Created: ") 255 | print(kinesis_stream(stream)) 256 | -------------------------------------------------------------------------------- /vagrant/.gitignore: -------------------------------------------------------------------------------- 1 | .peru 2 | oss-playbooks 3 | ansible 4 | 5 | -------------------------------------------------------------------------------- /vagrant/ansible.hosts: -------------------------------------------------------------------------------- 1 | [vagrant] 2 | 127.0.0.1:2222 3 | -------------------------------------------------------------------------------- /vagrant/peru.yaml: -------------------------------------------------------------------------------- 1 | imports: 2 | ansible: ansible 3 | ansible_playbooks: oss-playbooks 4 | 5 | curl module ansible: 6 | # Equivalent of git cloning tags/v1.6.6 but much, much faster 7 | url: https://codeload.github.com/ansible/ansible/zip/69d85c22c7475ccf8169b6ec9dee3ee28c92a314 8 | unpack: zip 9 | export: ansible-69d85c22c7475ccf8169b6ec9dee3ee28c92a314 10 | 11 | git module ansible_playbooks: 12 | url: https://github.com/snowplow/ansible-playbooks.git 13 | # Comment out to fetch a specific rev instead of master: 14 | # rev: xxx 15 | -------------------------------------------------------------------------------- /vagrant/up.bash: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | set -e 3 | 4 | vagrant_dir=/vagrant/vagrant 5 | bashrc=/home/vagrant/.bashrc 6 | 7 | echo "========================================" 8 | echo "INSTALLING PERU AND ANSIBLE DEPENDENCIES" 9 | echo "----------------------------------------" 10 | apt-get update 11 | apt-get install -y language-pack-en git unzip libyaml-dev python3-pip python-yaml python-paramiko python-jinja2 12 | 13 | echo "===============" 14 | echo "INSTALLING PERU" 15 | echo "---------------" 16 | sudo pip3 install peru 17 | 18 | echo "=======================================" 19 | echo "CLONING ANSIBLE AND PLAYBOOKS WITH PERU" 20 | echo "---------------------------------------" 21 | cd ${vagrant_dir} && peru sync -v 22 | echo "... done" 23 | 24 | env_setup=${vagrant_dir}/ansible/hacking/env-setup 25 | hosts=${vagrant_dir}/ansible.hosts 26 | 27 | echo "===================" 28 | echo "CONFIGURING ANSIBLE" 29 | echo "-------------------" 30 | touch ${bashrc} 31 | echo "source ${env_setup}" >> ${bashrc} 32 | echo "export ANSIBLE_HOSTS=${hosts}" >> ${bashrc} 33 | echo "... 
done" 34 | 35 | echo "==========================================" 36 | echo "RUNNING PLAYBOOKS WITH ANSIBLE*" 37 | echo "* no output while each playbook is running" 38 | echo "------------------------------------------" 39 | while read pb; do 40 | su - -c "source ${env_setup} && ${vagrant_dir}/ansible/bin/ansible-playbook ${vagrant_dir}/${pb} --connection=local --inventory-file=${hosts}" vagrant 41 | done <${vagrant_dir}/up.playbooks 42 | 43 | guidance=${vagrant_dir}/up.guidance 44 | 45 | if [ -f ${guidance} ]; then 46 | echo "===========" 47 | echo "PLEASE READ" 48 | echo "-----------" 49 | cat $guidance 50 | fi 51 | -------------------------------------------------------------------------------- /vagrant/up.guidance: -------------------------------------------------------------------------------- 1 | To get started: 2 | vagrant ssh 3 | cd /vagrant 4 | sbt test 5 | -------------------------------------------------------------------------------- /vagrant/up.playbooks: -------------------------------------------------------------------------------- 1 | oss-playbooks/aws-cli-and-psql.yml 2 | oss-playbooks/java8.yml 3 | oss-playbooks/scala-2.11.6.yml 4 | oss-playbooks/sbt-13.8.yml 5 | oss-playbooks/invoke.yml 6 | oss-playbooks/python-filechunkio.yml 7 | --------------------------------------------------------------------------------