├── CODE_OF_CONDUCT.md
├── README.md
├── LICENSE
├── CONTRIBUTING.md
└── CloudFormation
    └── CSVToDynamo.template


/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
opensource-codeofconduct@amazon.com with any additional questions or comments.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
## Implementing bulk CSV ingestion to Amazon DynamoDB

This repository is used in conjunction with the following blog post: [Implementing bulk CSV ingestion to Amazon DynamoDB](https://aws.amazon.com/blogs/database/implementing-bulk-csv-ingestion-to-amazon-dynamodb/)

You can use your own CSV file or download the test file we provided in this repo.

Steps to download the CloudFormation template:
1. Navigate to the CloudFormation folder in this repo.
2. Click on CSVToDynamo.template.
3. Click on the Raw button.
4. Choose Save Page As, remove any extra file extension so that the file name reads "CSVToDynamo.template", and click Save.


## License

This library is licensed under the MIT-0 License. See the LICENSE file.


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
# Contributing Guidelines

Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
documentation, we greatly value feedback and contributions from our community.

Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
information to effectively respond to your bug report or contribution.


## Reporting Bugs/Feature Requests

We welcome you to use the GitHub issue tracker to report bugs or suggest features.

When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already
reported the issue. Please try to include as much information as you can.
Details like these are incredibly useful:

* A reproducible test case or series of steps
* The version of our code being used
* Any modifications you've made relevant to the bug
* Anything unusual about your environment or deployment


## Contributing via Pull Requests
Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:

1. You are working against the latest source on the *master* branch.
2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
3. You open an issue to discuss any significant work - we would hate for your time to be wasted.

To send us a pull request, please:

1. Fork the repository.
2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
3. Ensure local tests pass.
4. Commit to your fork using clear commit messages.
5. Send us a pull request, answering any default questions in the pull request interface.
6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.

GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
[creating a pull request](https://help.github.com/articles/creating-a-pull-request/).


## Finding contributions to work on
Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.


## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
opensource-codeofconduct@amazon.com with any additional questions or comments.


## Security issue notifications
If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue.


## Licensing

See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.

We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes.

--------------------------------------------------------------------------------
/CloudFormation/CSVToDynamo.template:
--------------------------------------------------------------------------------
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Metadata": {

  },
  "Parameters" : {
    "BucketName": {
      "Description": "Name of the S3 bucket you will deploy the CSV file to",
      "Type": "String",
      "ConstraintDescription": "must be a valid bucket name."
    },
    "FileName": {
      "Description": "Name of the S3 file (including suffix)",
      "Type": "String",
      "ConstraintDescription": "Valid S3 file name."
    },
    "DynamoDBTableName": {
      "Description": "Name of the DynamoDB table you will use",
      "Type": "String",
      "ConstraintDescription": "must be a valid DynamoDB name."
    }
  },
  "Resources": {
    "DynamoDBTable":{
      "Type": "AWS::DynamoDB::Table",
      "Properties":{
        "TableName": {"Ref" : "DynamoDBTableName"},
        "BillingMode": "PAY_PER_REQUEST",
        "AttributeDefinitions":[
          {
            "AttributeName": "uuid",
            "AttributeType": "S"
          }
        ],
        "KeySchema":[
          {
            "AttributeName": "uuid",
            "KeyType": "HASH"
          }
        ],
        "Tags":[
          {
            "Key": "Name",
            "Value": {"Ref" : "DynamoDBTableName"}
          }
        ]
      }
    },
    "LambdaRole" : {
      "Type" : "AWS::IAM::Role",
      "Properties" : {
        "AssumeRolePolicyDocument": {
          "Version" : "2012-10-17",
          "Statement" : [
            {
              "Effect" : "Allow",
              "Principal" : {
                "Service" : ["lambda.amazonaws.com","s3.amazonaws.com"]
              },
              "Action" : [
                "sts:AssumeRole"
              ]
            }
          ]
        },
        "Path" : "/",
        "ManagedPolicyArns":["arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole","arn:aws:iam::aws:policy/AWSLambdaInvocation-DynamoDB","arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"],
        "Policies": [{
          "PolicyName": "policyname",
          "PolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
              "Effect": "Allow",
              "Resource": "*",
              "Action": [
                "dynamodb:PutItem",
                "dynamodb:BatchWriteItem"
              ]
            }]
          }
        }]
      }
    },
    "CsvToDDBLambdaFunction": {
      "Type": "AWS::Lambda::Function",
      "Properties": {
        "Handler": "index.lambda_handler",
        "Role": {
          "Fn::GetAtt": [
            "LambdaRole",
            "Arn"
          ]
        },
        "Code": {
          "ZipFile": {
            "Fn::Join": [
              "\n",
              [
                "import json",
                "import boto3",
                "import os",
                "import csv",
                "import codecs",
                "import sys",
                "",
                "s3 = boto3.resource('s3')",
                "dynamodb = boto3.resource('dynamodb')",
                "",
                "bucket = os.environ['bucket']",
                "key = os.environ['key']",
                "tableName = os.environ['table']",
                "",
                "def lambda_handler(event, context):",
                "",
                "",
                "    # get() returns a streaming body; the object is not loaded into memory",
                "    try:",
                "        obj = s3.Object(bucket, key).get()['Body']",
                "    except Exception as error:",
                "        print(error)",
                "        print(\"S3 Object could not be opened. Check environment variable. \")",
                "        # Stop here: without the object there is nothing to read",
                "        raise",
                "    try:",
                "        table = dynamodb.Table(tableName)",
                "    except Exception as error:",
                "        print(error)",
                "        print(\"Error loading DynamoDB table. Check if table was created correctly and environment variable.\")",
                "",
                "    batch_size = 100",
                "    batch = []",
                "",
                "    # DictReader reads rows lazily; the file is not stored in memory",
                "    for row in csv.DictReader(codecs.getreader('utf-8-sig')(obj)):",
                "        if len(batch) >= batch_size:",
                "            write_to_dynamo(batch)",
                "            batch.clear()",
                "",
                "        batch.append(row)",
                "",
                "    if batch:",
                "        write_to_dynamo(batch)",
                "",
                "    return {",
                "        'statusCode': 200,",
                "        'body': json.dumps('Uploaded to DynamoDB Table')",
                "    }",
                "",
                "",
                "def write_to_dynamo(rows):",
                "    try:",
                "        table = dynamodb.Table(tableName)",
                "    except Exception as error:",
                "        print(error)",
                "        print(\"Error loading DynamoDB table. Check if table was created correctly and environment variable.\")",
                "",
                "    try:",
                "        with table.batch_writer() as batch:",
                "            for row in rows:",
                "                batch.put_item(",
                "                    Item=row",
                "                )",
                "    except Exception as error:",
                "        print(error)",
                "        print(\"Error executing batch_writer\")"
              ]
            ]
          }
        },
        "Runtime": "python3.7",
        "Timeout": 900,
        "MemorySize": 3008,
        "Environment" : {
          "Variables" : {"bucket" : { "Ref" : "BucketName" }, "key" : { "Ref" : "FileName" },"table" : { "Ref" : "DynamoDBTableName" }}
        }
      }
    },

    "S3Bucket": {
      "DependsOn" : ["CsvToDDBLambdaFunction","BucketPermission"],
      "Type": "AWS::S3::Bucket",
      "Properties": {

        "BucketName": {"Ref" : "BucketName"},
        "AccessControl": "BucketOwnerFullControl",
        "NotificationConfiguration":{
          "LambdaConfigurations":[
            {
              "Event":"s3:ObjectCreated:*",
              "Function":{
                "Fn::GetAtt": [
                  "CsvToDDBLambdaFunction",
                  "Arn"
                ]
              }
            }
          ]
        }
      }
    },
    "BucketPermission":{
      "Type": "AWS::Lambda::Permission",
      "Properties":{
        "Action": "lambda:InvokeFunction",
        "FunctionName":{"Ref" : "CsvToDDBLambdaFunction"},
        "Principal": "s3.amazonaws.com",
        "SourceAccount": {"Ref":"AWS::AccountId"}
      }
    }
  },
  "Outputs" : {

  }
}

--------------------------------------------------------------------------------
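
The Lambda function embedded in the template streams the CSV file, groups rows into batches of 100, and hands each batch to DynamoDB. The sketch below isolates that batching step so it can be tried locally without AWS credentials. It is an illustration under stated assumptions, not the repository's code: `iter_batches`, `load`, and the `write_batch` callback are hypothetical names, and the check for a `uuid` column is an added guard based on the table's `KeySchema` in the template.

```python
import codecs
import csv
import io
from typing import BinaryIO, Callable, Dict, Iterator, List

# Partition key declared in the template's KeySchema.
KEY_ATTRIBUTE = "uuid"

Row = Dict[str, str]


def iter_batches(stream: BinaryIO, batch_size: int = 100) -> Iterator[List[Row]]:
    """Yield lists of at most batch_size rows decoded from a binary CSV stream."""
    # 'utf-8-sig' drops a leading byte order mark, as the Lambda code does.
    reader = csv.DictReader(codecs.getreader("utf-8-sig")(stream))
    if KEY_ATTRIBUTE not in (reader.fieldnames or []):
        # Hypothetical guard: rows without the key attribute cannot be stored.
        raise ValueError(f"CSV header must contain a '{KEY_ATTRIBUTE}' column")
    batch: List[Row] = []
    for row in reader:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []  # start a fresh list so earlier batches are not mutated
    if batch:
        yield batch  # final, possibly shorter, batch


def load(stream: BinaryIO, write_batch: Callable[[List[Row]], None],
         batch_size: int = 100) -> int:
    """Pass every batch to write_batch and return the number of rows seen."""
    total = 0
    for batch in iter_batches(stream, batch_size):
        write_batch(batch)
        total += len(batch)
    return total


if __name__ == "__main__":
    sample = "\ufeffuuid,name\n1,a\n2,b\n3,c\n".encode("utf-8")
    written: List[List[Row]] = []
    count = load(io.BytesIO(sample), written.append, batch_size=2)
    print(count, [len(b) for b in written])
```

In a deployed function, `write_batch` would presumably wrap `table.batch_writer()` in the way `write_to_dynamo` does, and `stream` would be the S3 object body; here a list's `append` stands in for the writer so the batching can be inspected directly.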