├── CHANGELOG.rst ├── terraform ├── requirements.txt ├── SAMPLE-env.list ├── terraform-plugins │ └── terraform-plugins.tf ├── dynamodb │ ├── s3 │ │ └── s3.tf │ ├── securitygroup │ │ └── securitygroup.tf │ └── main.tf ├── Dockerfile ├── teardown_historical.sh ├── install_historical.sh └── infra │ ├── s3 │ └── s3.tf │ └── securitygroup │ └── securitygroup.tf ├── .coveragerc ├── .pylintrc ├── mkdocs ├── requirements-docs.txt ├── docs │ ├── troubleshooting.md │ ├── img │ │ ├── cw-events.png │ │ ├── historical.jpg │ │ ├── iam-setup.jpg │ │ ├── historical-s3.jpg │ │ └── historical-overview.jpg │ ├── extra.css │ ├── index.md │ ├── installation │ │ ├── index.md │ │ ├── iam.md │ │ ├── terraform.md │ │ └── configuration.md │ └── architecture.md ├── custom_theme │ └── img │ │ └── favicon.ico └── mkdocs.yml ├── historical ├── historical-cookiecutter │ ├── historical_{{cookiecutter.technology_slug}} │ │ ├── {{cookiecutter.technology_slug}} │ │ │ ├── __init__.py │ │ │ ├── differ.py │ │ │ ├── conftest.py │ │ │ ├── poller.py │ │ │ ├── models.py │ │ │ └── collector.py │ │ ├── requirements.txt │ │ ├── serverless_configs │ │ │ ├── prod.yml │ │ │ └── test.yml │ │ ├── package.json │ │ ├── README.md │ │ └── serverless.yaml │ └── cookiecutter.json ├── __init__.py ├── s3 │ ├── __init__.py │ ├── differ.py │ ├── poller.py │ ├── models.py │ └── collector.py ├── vpc │ ├── __init__.py │ ├── differ.py │ ├── models.py │ ├── poller.py │ └── collector.py ├── common │ ├── __init__.py │ ├── exceptions.py │ ├── extensions.py │ ├── accounts.py │ ├── util.py │ ├── cloudwatch.py │ ├── sqs.py │ └── proxy.py ├── tests │ ├── __init__.py │ ├── pynamodb_settings.py │ ├── test_cloudwatch.py │ ├── factories.py │ └── conftest.py ├── security_group │ ├── __init__.py │ ├── differ.py │ ├── models.py │ ├── poller.py │ └── collector.py ├── __about__.py ├── cli.py ├── mapping │ └── __init__.py ├── constants.py ├── attributes.py └── models.py ├── setup.cfg ├── .travis.yml ├── README.md ├── setup.py ├── tox.ini └── .gitignore /CHANGELOG.rst: -------------------------------------------------------------------------------- 1 | Changelog 2 | ========= -------------------------------------------------------------------------------- /terraform/requirements.txt: -------------------------------------------------------------------------------- 1 | historical>=0.4.10 2 | -------------------------------------------------------------------------------- /.coveragerc: -------------------------------------------------------------------------------- 1 | [report] 2 | include = historical/*.py 3 | -------------------------------------------------------------------------------- /.pylintrc: -------------------------------------------------------------------------------- 1 | [MESSAGES CONTROL] 2 | disable=C0301,R0913,W1202,W1203,R0903,R0201,R0801 3 | -------------------------------------------------------------------------------- /mkdocs/requirements-docs.txt: -------------------------------------------------------------------------------- 1 | mkdocs 2 | mkdocs-bootswatch 3 | pymdown-extensions 4 | -------------------------------------------------------------------------------- /mkdocs/docs/troubleshooting.md: -------------------------------------------------------------------------------- 1 | # Troubleshooting 2 | This doc will be updated in the future. 
3 | -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/{{cookiecutter.technology_slug}}/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /mkdocs/docs/img/cw-events.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Netflix-Skunkworks/historical/master/mkdocs/docs/img/cw-events.png -------------------------------------------------------------------------------- /mkdocs/docs/img/historical.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Netflix-Skunkworks/historical/master/mkdocs/docs/img/historical.jpg -------------------------------------------------------------------------------- /mkdocs/docs/img/iam-setup.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Netflix-Skunkworks/historical/master/mkdocs/docs/img/iam-setup.jpg -------------------------------------------------------------------------------- /mkdocs/docs/img/historical-s3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Netflix-Skunkworks/historical/master/mkdocs/docs/img/historical-s3.jpg -------------------------------------------------------------------------------- /mkdocs/custom_theme/img/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Netflix-Skunkworks/historical/master/mkdocs/custom_theme/img/favicon.ico -------------------------------------------------------------------------------- /mkdocs/docs/img/historical-overview.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Netflix-Skunkworks/historical/master/mkdocs/docs/img/historical-overview.jpg -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/requirements.txt: -------------------------------------------------------------------------------- 1 | git+https://github.com/Netflix-Skunkworks/historical.git#egg=historical -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [metadata] 2 | description-file = README.md 3 | 4 | [wheel] 5 | universal = 0 6 | 7 | [egg_info] 8 | tag_build = 9 | tag_date = 0 10 | tag_svn_revision = 0 11 | -------------------------------------------------------------------------------- /historical/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Mike Grima 7 | """ 8 | -------------------------------------------------------------------------------- /historical/s3/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.s3 3 | :platform: Unix 4 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. 
author:: Mike Grima 7 | """ 8 | -------------------------------------------------------------------------------- /terraform/SAMPLE-env.list: -------------------------------------------------------------------------------- 1 | AWS_ACCESS_KEY_ID=INSERTHERE 2 | AWS_SECRET_ACCESS_KEY=INSERTHERE 3 | AWS_SESSION_TOKEN=INSERTHERE 4 | TECH=s3|securitygroup 5 | TF_S3_BUCKET=INSERTHERE 6 | PRIMARY_REGION=INSERTHERE 7 | SECONDARY_REGIONS=INSERTHERE,INSERTHERE,INSERTHERE,INSERTHERE 8 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: python 2 | python: 3 | - "3.6" 4 | 5 | before_install: 6 | - sudo rm -f /etc/boto.cfg 7 | 8 | install: 9 | - pip install tox-travis 10 | 11 | 12 | matrix: 13 | include: 14 | - env: 15 | - env: TOXENV=linters 16 | 17 | script: 18 | - tox 19 | -------------------------------------------------------------------------------- /terraform/terraform-plugins/terraform-plugins.tf: -------------------------------------------------------------------------------- 1 | // Use this file to pin versions for Terraform plugns: 2 | provider "aws" { 3 | region = "us-west-2" 4 | 5 | version = "1.39" 6 | } 7 | 8 | provider "local" { 9 | version = "1.1" 10 | } 11 | 12 | provider "null" { 13 | version = "1.0" 14 | } 15 | -------------------------------------------------------------------------------- /historical/vpc/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.vpc 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | .. author:: Mike Grima 8 | """ 9 | -------------------------------------------------------------------------------- /historical/common/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.common 3 | :platform: Unix 4 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | .. author:: Mike Grima 8 | """ 9 | -------------------------------------------------------------------------------- /historical/tests/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.tests 3 | :platform: Unix 4 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Mike Grima 7 | .. 
author:: Kevin Glisson 8 | """ 9 | -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/serverless_configs/prod.yml: -------------------------------------------------------------------------------- 1 | accountId: {{cookiecutter.prod_account_id}} 2 | accountName: {{cookiecutter.prod_account_name}} 3 | pythonRequirements: 4 | dockerizePip: true 5 | invalidateCaches: true 6 | prune: 7 | automatic: true 8 | number: 3 -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/serverless_configs/test.yml: -------------------------------------------------------------------------------- 1 | accountId: {{cookiecutter.test_account_id}} 2 | accountName: {{cookiecutter.test_account_name}} 3 | pythonRequirements: 4 | dockerizePip: true 5 | invalidateCaches: true 6 | prune: 7 | automatic: true 8 | number: 3 -------------------------------------------------------------------------------- /historical/security_group/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.security_group 3 | :platform: Unix 4 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | .. author:: Mike Grima 8 | """ 9 | -------------------------------------------------------------------------------- /historical/historical-cookiecutter/cookiecutter.json: -------------------------------------------------------------------------------- 1 | { 2 | "technology_name": "ELB", 3 | "technology_slug": "{{cookiecutter.technology_name.lower().replace(' ', '_').replace('-', '_')}}", 4 | "email": "kevin@example.com", 5 | "author": "Kevin Glisson", 6 | "team": "Team Rocket", 7 | "version": "0.1.0", 8 | "test_account_name": "test", 9 | "test_account_id": "", 10 | "prod_account_name": "prod", 11 | "prod_account_id": "", 12 | "_extensions": ["historical.common.extensions.HistoricalExtension"] 13 | } -------------------------------------------------------------------------------- /terraform/dynamodb/s3/s3.tf: -------------------------------------------------------------------------------- 1 | // S3 SPECIFIC VARIABLES: 2 | // Set the default values for the Read and Write capacities to your environment's needs 3 | variable "CURRENT_TABLE" { 4 | default = "HistoricalS3CurrentTable" 5 | } 6 | 7 | variable "CURRENT_TABLE_READ_CAP" { 8 | default = 100 9 | } 10 | 11 | variable "CURRENT_TABLE_WRITE_CAP" { 12 | default = 100 13 | } 14 | 15 | variable "DURABLE_TABLE" { 16 | default = "HistoricalS3DurableTable" 17 | } 18 | 19 | variable "DURABLE_TABLE_READ_CAP" { 20 | default = 100 21 | } 22 | 23 | variable "DURABLE_TABLE_WRITE_CAP" { 24 | default = 100 25 | } 26 | -------------------------------------------------------------------------------- /historical/common/exceptions.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.common.exceptions 3 | :platform: Unix 4 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. 
author:: Mike Grima 7 | """ 8 | 9 | 10 | class DurableItemIsMissingException(Exception): 11 | """Exception for if a Durable Item is missing but should be found.""" 12 | 13 | pass 14 | 15 | 16 | class MissingProxyConfigurationException(Exception): 17 | """Exception if the Proxy is missing the proper configuration on how to operate.""" 18 | 19 | pass 20 | -------------------------------------------------------------------------------- /terraform/dynamodb/securitygroup/securitygroup.tf: -------------------------------------------------------------------------------- 1 | // SECURITY GROUP SPECIFIC VARIABLES: 2 | // Set the default values for the Read and Write capacities to your environment's needs 3 | variable "CURRENT_TABLE" { 4 | default = "HistoricalSecurityGroupCurrentTable" 5 | } 6 | 7 | variable "CURRENT_TABLE_READ_CAP" { 8 | default = 100 9 | } 10 | 11 | variable "CURRENT_TABLE_WRITE_CAP" { 12 | default = 100 13 | } 14 | 15 | variable "DURABLE_TABLE" { 16 | default = "HistoricalSecurityGroupDurableTable" 17 | } 18 | 19 | variable "DURABLE_TABLE_READ_CAP" { 20 | default = 100 21 | } 22 | 23 | variable "DURABLE_TABLE_WRITE_CAP" { 24 | default = 100 25 | } 26 | -------------------------------------------------------------------------------- /historical/tests/pynamodb_settings.py: -------------------------------------------------------------------------------- 1 | # pylint: disable=E0401,C0103 2 | """ 3 | .. module: historical.tests.pynamodb_settings.py 4 | :platform: Unix 5 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 6 | :license: Apache, see LICENSE for more details. 7 | .. author:: Mike Grima 8 | """ 9 | import requests 10 | 11 | 12 | # This is a temporary file that is present to make PynamoDB work properly on unit tests. 13 | # This issue has more details: https://github.com/pynamodb/PynamoDB/issues/558 14 | # and will be fixed when this PR is merged: https://github.com/pynamodb/PynamoDB/pull/559 15 | 16 | session_cls = requests.Session 17 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Historical 2 | [![Build Status](https://travis-ci.org/Netflix-Skunkworks/historical.svg?branch=master)](https://travis-ci.org/Netflix-Skunkworks/historical) 3 | [![Coverage Status](https://coveralls.io/repos/github/Netflix-Skunkworks/historical/badge.svg?branch=master)](https://coveralls.io/github/Netflix-Skunkworks/historical?branch=master) 4 | [![PyPI version](https://badge.fury.io/py/historical.svg)](https://badge.fury.io/py/historical) 5 | 6 | ## THIS PROJECT IS ARCHIVED AND NO LONGER IN DEVELOPMENT 7 | 8 | Please review the documentation that is hosted here: [https://netflix-skunkworks.github.io/historical](https://netflix-skunkworks.github.io/historical). 9 | 10 | [![Historical Logo](mkdocs/docs/img/historical.jpg)](https://netflix-skunkworks.github.io/historical) 11 | -------------------------------------------------------------------------------- /historical/__about__.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. 
author:: Mike Grima 7 | """ 8 | from __future__ import absolute_import, division, print_function 9 | 10 | __all__ = [ 11 | "__title__", "__summary__", "__uri__", "__version__", "__author__", 12 | "__email__", "__license__", "__copyright__", 13 | ] 14 | 15 | __title__ = "historical" 16 | __summary__ = ("Historical tracking of AWS resource configuration.") 17 | __uri__ = "https://github.com/Netflix-Skunkworks/historical" 18 | 19 | __version__ = "0.4.10" 20 | 21 | __author__ = "The Historical developers" 22 | __email__ = "security@netflix.com" 23 | 24 | __license__ = "Apache License, Version 2.0" 25 | __copyright__ = f"Copyright 2017 {__author__}" 26 | -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "historical-deploy", 3 | "version": "0.1.0", 4 | "description": "A collection of AWS Lambda functions for collecting and storing AWS configuration data.", 5 | "main": "index.js", 6 | "repository": { 7 | "type": "git", 8 | "url": "git+https://github.com/Netflix-Skunkworks/historical.git" 9 | }, 10 | "keywords": [ 11 | "python", 12 | "aws", 13 | "lambda", 14 | "serverless" 15 | ], 16 | "author": "{{cookiecutter.email}}", 17 | "license": "Apache", 18 | "bugs": { 19 | "url": "https://github.com/Netflix-Skunkworks/historical/issues" 20 | }, 21 | "homepage": "https://github.com/Netflix-Skunkworks/historical/#readme", 22 | "dependencies": { 23 | "serverless-prune-plugin": "^1.1.1", 24 | "serverless-python-requirements": "^2.2.1" 25 | } 26 | } 27 | -------------------------------------------------------------------------------- /historical/cli.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.cli 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. 
author:: Mike Grima 7 | """ 8 | import os 9 | import logging 10 | 11 | import click 12 | import click_log 13 | from cookiecutter.main import cookiecutter # pylint: disable=E0401 14 | 15 | from historical.__about__ import __version__ 16 | 17 | LOG = logging.getLogger('historical') 18 | click_log.basic_config(LOG) 19 | 20 | 21 | @click.group() 22 | @click_log.simple_verbosity_option(LOG) 23 | @click.version_option(version=__version__) 24 | def cli(): 25 | """Historical commandline for managing historical functions.""" 26 | pass 27 | 28 | 29 | @cli.command() 30 | def new(): 31 | """Creates a new historical technology.""" 32 | dir_path = os.path.dirname(os.path.realpath(__file__)) 33 | cookiecutter(os.path.join(dir_path, 'historical-cookiecutter/')) 34 | -------------------------------------------------------------------------------- /mkdocs/docs/extra.css: -------------------------------------------------------------------------------- 1 | .navbar .dropdown-menu>li>a, .navbar .dropdown-menu>li>a:focus { 2 | font-weight: 400; 3 | } 4 | 5 | .navbar-default { 6 | background-color: #00526E; 7 | font-weight: 400; 8 | } 9 | 10 | .navbar-default .navbar-nav>.active>a, .navbar-default .navbar-nav>.active>a:hover, .navbar-default .navbar-nav>.active>a:focus { 11 | background-color: #32748B; 12 | font-family: "Helvetica Neue",Helvetica,Arial,sans-serif; 13 | font-weight: 400; 14 | } 15 | 16 | .navbar-default .navbar-nav>li>a:hover, .navbar-default .navbar-nav>li>a:focus { 17 | background-color: #003142; 18 | font-family: "Helvetica Neue",Helvetica,Arial,sans-serif; 19 | font-weight: 400; 20 | } 21 | 22 | body { 23 | font-family: "Helvetica Neue",Helvetica,Arial,sans-serif; 24 | font-size: 1.65em; 25 | } 26 | 27 | h1, h2, h3, h4, h5, h6 { 28 | font-family: "Helvetica Neue", Helvetica, Arial, sans-serif; 29 | font-weight: 500; 30 | } 31 | 32 | table { 33 | font-size: 15px; 34 | } -------------------------------------------------------------------------------- /mkdocs/mkdocs.yml: -------------------------------------------------------------------------------- 1 | site_name: Historical 2 | repo_url: https://github.com/Netflix-Skunkworks/historical/ 3 | repo_name: GitHub 4 | edit_uri: "" 5 | nav: 6 | - Welcome: index.md 7 | - Architecture: architecture.md 8 | - Installation and Configuration: 9 | - Instructions: installation/ 10 | - Prerequisites: installation/#prerequisites 11 | - IAM Setup: installation/iam.md 12 | - Terraform: installation/terraform.md 13 | - Configuration Reference: installation/configuration.md 14 | - Prepare Docker Container: installation/#prepare-docker-container 15 | - Installation: installation/#installation 16 | - Uninstallation: installation/#uninstallation 17 | - Troubleshooting: troubleshooting.md 18 | theme: 19 | name: yeti 20 | custom_dir: custom_theme/ 21 | extra_css: 22 | - extra.css 23 | 24 | # There are many available formatting extensions available, please read: 25 | # https://facelessuser.github.io/pymdown-extensions/ 26 | markdown_extensions: 27 | - toc: 28 | permalink: True 29 | - pymdownx.tilde 30 | -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/README.md: -------------------------------------------------------------------------------- 1 | ## ⚡ Historical Deploy 2 | 3 | [![serverless](http://public.serverless.com/badges/v3.svg)](http://www.serverless.com) 4 | 5 | ## About 6 | These are the serverless configuration files needed to various pieces of 
historical infrastructure. These are configuration files only. Historical itself is located at: 7 | 8 | https://github.com/Netflix-Skunkworks/historical 9 | 10 | 11 | ## Monitoring 12 | 13 | All of the functions are wrapped with the `RavenLambdaWrapper`. This decorator forwards lambda 14 | telemetry to a [Sentry](https://sentry.io) instance. This will have no effect unless you specify `SENTRY_DSN` 15 | in the Lambda's environment variables. 16 | 17 | 18 | ### Deployment 19 | 20 | Install python requirements: 21 | 22 | pip install -r requirements.txt 23 | 24 | Run the tests: 25 | 26 | py.test 27 | 28 | Get the serverless package: 29 | 30 | npm install serverless 31 | 32 | Fetch AWS credentials. 33 | 34 | Deploy package 35 | 36 | sls deploy --region us-east-1 --stage | 37 | -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/{{cookiecutter.technology_slug}}/differ.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: {{cookiecutter.technology_slug}}.differ 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: {{cookiecutter.author}} <{{cookiecutter.email}}> 7 | """ 8 | import logging 9 | 10 | from raven_python_lambda import RavenLambdaWrapper 11 | 12 | from historical.common.dynamodb import process_dynamodb_differ_record 13 | from .models import Durable{{cookiecutter.technology_slug | titlecase}}Model 14 | 15 | logging.basicConfig() 16 | log = logging.getLogger('historical') 17 | log.setLevel(logging.WARNING) 18 | 19 | 20 | @RavenLambdaWrapper() 21 | def handler(event, context): 22 | """ 23 | Historical security group event differ. 24 | 25 | Listens to the Historical current table and determines if there are differences that need to be persisted in the 26 | historical record. 27 | """ 28 | for record in event['Records']: 29 | process_dynamodb_differ_record(record, Durable{{cookiecutter.technology_slug | titlecase}}Model) 30 | -------------------------------------------------------------------------------- /historical/mapping/__init__.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.mapping 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Mike Grima 7 | """ 8 | 9 | import os 10 | 11 | from historical.security_group.models import CurrentSecurityGroupModel, DurableSecurityGroupModel 12 | from historical.s3.models import CurrentS3Model, DurableS3Model 13 | from historical.vpc.models import CurrentVPCModel, DurableVPCModel 14 | 15 | # The HISTORICAL_TECHNOLOGY variable MUST be equal to that of an existing model's 'tech' Meta field. 
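# (Each model imported above defines a 'tech' value on its Meta class; the CURRENT_MAPPING and DURABLE_MAPPING dicts below use those Meta.tech values as keys to look up the matching PynamoDB model.)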
16 | HISTORICAL_TECHNOLOGY = os.environ.get('HISTORICAL_TECHNOLOGY') 17 | 18 | # Current Table Mapping: 19 | CURRENT_MAPPING = { 20 | CurrentSecurityGroupModel.Meta.tech: CurrentSecurityGroupModel, 21 | CurrentS3Model.Meta.tech: CurrentS3Model, 22 | CurrentVPCModel.Meta.tech: CurrentVPCModel 23 | } 24 | 25 | # Durable Table Mapping: 26 | DURABLE_MAPPING = { 27 | DurableSecurityGroupModel.Meta.tech: DurableSecurityGroupModel, 28 | DurableS3Model.Meta.tech: DurableS3Model, 29 | DurableVPCModel.Meta.tech: DurableVPCModel 30 | } 31 | -------------------------------------------------------------------------------- /historical/common/extensions.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.common.extensions 3 | :platform: Unix 4 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | """ 8 | from jinja2.ext import Extension 9 | 10 | 11 | def titlecase(input_str): 12 | """Transforms a string to titlecase.""" 13 | return "".join([x.title() for x in input_str.split('_')]) 14 | 15 | 16 | class HistoricalExtension(Extension): 17 | """Extension class for Cookiecutters.""" 18 | 19 | def __init__(self, environment): 20 | """Instantiates the Historical Extension 21 | 22 | :param environment: 23 | """ 24 | super(HistoricalExtension, self).__init__(environment) 25 | environment.filters['titlecase'] = titlecase 26 | 27 | def parse(self, parser): 28 | """If any of the :attr:`tags` matched this method is called with the 29 | parser as first argument. The token the parser stream is pointing at 30 | is the name token that matched. This method has to return one or a 31 | list of multiple nodes. 32 | """ 33 | raise NotImplementedError() 34 | -------------------------------------------------------------------------------- /historical/vpc/differ.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.vpc.differ 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | """ 8 | import logging 9 | 10 | from raven_python_lambda import RavenLambdaWrapper 11 | 12 | from historical.common.dynamodb import process_dynamodb_differ_record 13 | from historical.common.util import deserialize_records 14 | from historical.constants import LOGGING_LEVEL 15 | from historical.vpc.models import CurrentVPCModel, DurableVPCModel 16 | 17 | logging.basicConfig() 18 | LOG = logging.getLogger('historical') 19 | LOG.setLevel(LOGGING_LEVEL) 20 | 21 | 22 | @RavenLambdaWrapper() 23 | def handler(event, context): # pylint: disable=W0613 24 | """ 25 | Historical security group event differ. 26 | 27 | Listens to the Historical current table and determines if there are differences that need to be persisted in the 28 | historical record. 
29 | """ 30 | # De-serialize the records: 31 | records = deserialize_records(event['Records']) 32 | 33 | for record in records: 34 | process_dynamodb_differ_record(record, CurrentVPCModel, DurableVPCModel) 35 | -------------------------------------------------------------------------------- /terraform/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM amazonlinux:1 2 | 3 | MAINTAINER Netflix OSS 4 | 5 | COPY requirements.txt /installer/requirements.txt 6 | COPY terraform-plugins /installer/terraform-plugins 7 | 8 | ARG TERRAFORM_VERSION=0.11.10 9 | 10 | RUN \ 11 | yum install python36 python36-devel gcc-c++ make zip unzip git jq aws-cli -y \ 12 | && curl https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip -o terraform_installer.zip -s \ 13 | && unzip /terraform_installer.zip \ 14 | && cd /installer/terraform-plugins \ 15 | && /terraform init \ 16 | && mv .terraform/plugins/linux_amd64/* ./ \ 17 | && rm -Rf .terraform 18 | 19 | # ENVIRONMENT VARIABLES: 20 | ENV TECH="" 21 | ENV TF_S3_BUCKET="" 22 | ENV PRIMARY_REGION="" 23 | ENV SECONDARY_REGIONS="" 24 | 25 | # AWS CREDS: 26 | ENV AWS_ACCESS_KEY_ID="" 27 | ENV AWS_SECRET_ACCESS_KEY="" 28 | ENV AWS_SESSION_TOKEN="" 29 | 30 | # Do these later to help with caching: 31 | COPY install_historical.sh /installer/install_historical.sh 32 | COPY teardown_historical.sh /installer/teardown_historical.sh 33 | COPY dynamodb /installer/dynamodb 34 | COPY infra /installer/infra 35 | RUN chmod +x /installer/*.sh 36 | 37 | WORKDIR "/installer" 38 | ENTRYPOINT ["/installer/install_historical.sh"] 39 | -------------------------------------------------------------------------------- /historical/security_group/differ.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.security_group.differ 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | """ 8 | import logging 9 | 10 | from raven_python_lambda import RavenLambdaWrapper 11 | 12 | from historical.common.dynamodb import process_dynamodb_differ_record 13 | from historical.common.util import deserialize_records 14 | from historical.security_group.models import CurrentSecurityGroupModel, DurableSecurityGroupModel 15 | from historical.constants import LOGGING_LEVEL 16 | 17 | logging.basicConfig() 18 | LOG = logging.getLogger('historical') 19 | LOG.setLevel(LOGGING_LEVEL) 20 | 21 | 22 | @RavenLambdaWrapper() 23 | def handler(event, context): # pylint: disable=W0613 24 | """ 25 | Historical security group event differ. 26 | 27 | Listens to the Historical current table and determines if there are differences that need to be persisted in the 28 | historical record. 
29 | """ 30 | # De-serialize the records: 31 | records = deserialize_records(event['Records']) 32 | 33 | for record in records: 34 | process_dynamodb_differ_record(record, CurrentSecurityGroupModel, DurableSecurityGroupModel) 35 | -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/{{cookiecutter.technology_slug}}/conftest.py: -------------------------------------------------------------------------------- 1 | from historical.tests.conftest import * 2 | 3 | 4 | @pytest.fixture(scope='function') 5 | def {{cookiecutter.technology_slug}}s(ec2): 6 | """Creates {{cookiecutter.technology_slug}}s.""" 7 | # TODO create aws item 8 | # Example:: 9 | # yield ec2.create_vpc( 10 | # CidrBlock='192.168.1.1/32', 11 | # AmazonProvidedIpv6CidrBlock=True, 12 | # InstanceTenancy='default' 13 | # )['Vpc'] 14 | yield 15 | 16 | 17 | @pytest.fixture(scope='function') 18 | def current_{{cookiecutter.technology_slug}}_table(): 19 | from .models import Current{{cookiecutter.technology_slug | titlecase}}Model 20 | mock_dynamodb2().start() 21 | yield Current{{cookiecutter.technology_slug | titlecase}}Model.create_table(read_capacity_units=1, write_capacity_units=1, wait=True) 22 | mock_dynamodb2().stop() 23 | 24 | 25 | @pytest.fixture(scope='function') 26 | def durable_{{cookiecutter.technology_slug}}_table(): 27 | from .models import Durable{{cookiecutter.technology_slug | titlecase}}Model 28 | mock_dynamodb2().start() 29 | yield Durable{{cookiecutter.technology_slug | titlecase}}Model.create_table(read_capacity_units=1, write_capacity_units=1, wait=True) 30 | mock_dynamodb2().stop() -------------------------------------------------------------------------------- /historical/s3/differ.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.s3.differ 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Mike Grima 7 | """ 8 | import logging 9 | 10 | from raven_python_lambda import RavenLambdaWrapper 11 | 12 | from historical.common.util import deserialize_records 13 | from historical.constants import LOGGING_LEVEL 14 | from historical.s3.models import CurrentS3Model, DurableS3Model 15 | from historical.common.dynamodb import process_dynamodb_differ_record 16 | 17 | logging.basicConfig() 18 | LOG = logging.getLogger('historical') 19 | LOG.setLevel(LOGGING_LEVEL) 20 | 21 | # Path to where in the dict the ephemeral field is -- starting with "root['M'][PathInConfigDontForgetDataType]..." 22 | # EPHEMERAL_PATHS = [] 23 | 24 | 25 | @RavenLambdaWrapper() 26 | def handler(event, context): # pylint: disable=W0613 27 | """ 28 | Historical S3 event differ. 29 | 30 | Listens to the Historical current table and determines if there are differences that need to be persisted in the 31 | historical record. 32 | """ 33 | # De-serialize the records: 34 | records = deserialize_records(event['Records']) 35 | 36 | for record in records: 37 | process_dynamodb_differ_record(record, CurrentS3Model, DurableS3Model) 38 | -------------------------------------------------------------------------------- /historical/constants.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. 
module: historical.constants 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Mike Grima 7 | .. author:: Kevin Glisson 8 | """ 9 | import logging 10 | import os 11 | 12 | LOG_LEVELS = { 13 | 'CRITICAL': logging.CRITICAL, 14 | 'ERROR': logging.ERROR, 15 | 'WARNING': logging.WARNING, 16 | 'INFO': logging.INFO, 17 | 'DEBUG': logging.DEBUG 18 | } 19 | 20 | 21 | def extract_log_level_from_environment(k, default): 22 | """Gets the log level from the environment variable.""" 23 | return LOG_LEVELS.get(os.environ.get(k)) or int(os.environ.get(k, default)) 24 | 25 | 26 | # 24 hours in seconds is the default 27 | TTL_EXPIRY = int(os.environ.get('TTL_EXPIRY', 86400)) 28 | 29 | # By default, don't randomize the pollers (tasker or collector -- same env var): 30 | RANDOMIZE_POLLER = int(os.environ.get('RANDOMIZE_POLLER', 0)) 31 | 32 | CURRENT_REGION = os.environ.get('AWS_DEFAULT_REGION', 'us-east-1') 33 | HISTORICAL_ROLE = os.environ.get('HISTORICAL_ROLE', 'Historical') 34 | POLL_REGIONS = os.environ.get('POLL_REGIONS', 'us-east-1').split(",") 35 | PROXY_REGIONS = os.environ.get('PROXY_REGIONS', 'us-east-1').split(",") 36 | REGION_ATTR = os.environ.get('REGION_ATTR', 'Region') 37 | SIMPLE_DURABLE_PROXY = os.environ.get('SIMPLE_DURABLE_PROXY', False) 38 | LOGGING_LEVEL = extract_log_level_from_environment('LOGGING_LEVEL', logging.INFO) 39 | EVENT_TOO_BIG_FLAG = 'event_too_big' 40 | -------------------------------------------------------------------------------- /historical/common/accounts.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.common.accounts 3 | :platform: Unix 4 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | .. author:: Mike Grima 8 | """ 9 | import os 10 | 11 | from swag_client.backend import SWAGManager 12 | from swag_client.util import parse_swag_config_options 13 | 14 | 15 | def parse_boolean(value): 16 | """Simple function to get a boolean value from string.""" 17 | if not value: 18 | return False 19 | 20 | if str(value).lower() == 'true': 21 | return True 22 | 23 | return False 24 | 25 | 26 | def get_historical_accounts(): 27 | """Fetches valid accounts from SWAG if enabled or a list accounts.""" 28 | if os.environ.get('SWAG_BUCKET', False): 29 | swag_opts = { 30 | 'swag.type': 's3', 31 | 'swag.bucket_name': os.environ['SWAG_BUCKET'], 32 | 'swag.data_file': os.environ.get('SWAG_DATA_FILE', 'accounts.json'), 33 | 'swag.region': os.environ.get('SWAG_REGION', 'us-east-1') 34 | } 35 | swag = SWAGManager(**parse_swag_config_options(swag_opts)) 36 | search_filter = f"[?provider=='aws' && owner=='{os.environ['SWAG_OWNER']}' && account_status!='deleted'" 37 | 38 | if parse_boolean(os.environ.get('TEST_ACCOUNTS_ONLY')): 39 | search_filter += " && environment=='test'" 40 | 41 | search_filter += ']' 42 | 43 | accounts = swag.get_service_enabled('historical', search_filter=search_filter) 44 | else: 45 | accounts = [{'id': account_id} for account_id in os.environ['ENABLED_ACCOUNTS'].split(',')] 46 | 47 | return accounts 48 | -------------------------------------------------------------------------------- /historical/common/util.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. 
module: historical.common.util 3 | :platform: Unix 4 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Mike Grima 7 | """ 8 | import json 9 | 10 | 11 | def deserialize_records(records): 12 | """ 13 | This properly deserializes records depending on where they came from: 14 | - SQS 15 | - SNS 16 | """ 17 | native_records = [] 18 | for record in records: 19 | parsed = json.loads(record['body']) 20 | 21 | # Is this a DynamoDB stream event? 22 | if isinstance(parsed, str): 23 | native_records.append(json.loads(parsed)) 24 | 25 | # Is this a subscription message from SNS? If so, skip it: 26 | elif parsed.get('Type') == 'SubscriptionConfirmation': 27 | continue 28 | 29 | # Is this from SNS (cross-region request -- SNS messages wrapped in SQS message) -- or an SNS proxied message? 30 | elif parsed.get('Message'): 31 | native_records.append(json.loads(parsed['Message'])) 32 | 33 | else: 34 | native_records.append(parsed) 35 | 36 | return native_records 37 | 38 | 39 | def pull_tag_dict(data): 40 | """This will pull out a list of Tag Name-Value objects, and return it as a dictionary. 41 | 42 | :param data: The dict collected from the collector. 43 | :returns dict: A dict of the tag names and their corresponding values. 44 | """ 45 | # If there are tags, set them to a normal dict, vs. a list of dicts: 46 | tags = data.pop('Tags', {}) or {} 47 | if tags: 48 | proper_tags = {} 49 | for tag in tags: 50 | proper_tags[tag['Key']] = tag['Value'] 51 | 52 | tags = proper_tags 53 | 54 | return tags 55 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | """ 2 | Historical 3 | ========== 4 | 5 | Allows for the tracking of AWS configuration data across accounts/regions/technologies. 
6 | 7 | """ 8 | import os.path 9 | 10 | from setuptools import find_packages, setup 11 | 12 | ROOT = os.path.realpath(os.path.join(os.path.dirname(__file__))) 13 | 14 | about = {} 15 | with open(os.path.join(ROOT, "historical", "__about__.py")) as f: 16 | exec(f.read(), about) 17 | 18 | 19 | install_requires = [ 20 | 'boto3>=1.9.47', 21 | 'cloudaux>=1.4.14', 22 | 'click>=6.7', 23 | 'pynamodb>=3.3.1', 24 | 'deepdiff>=3.3.0', 25 | 'raven-python-lambda>=0.1.7', 26 | 'marshmallow>=2.13.5', 27 | 'swag-client==0.4.3', 28 | 'python-dateutil==2.6.1', 29 | 'Jinja2==2.10' 30 | ] 31 | 32 | tests_require = [ 33 | 'cookiecutter==1.6.0', 34 | 'pytest==3.1.3', 35 | 'pytest-cov>=2.5.1', 36 | 'mock==2.0.0', 37 | 'moto>=1.3.2', 38 | 'coveralls==1.1', 39 | 'factory-boy==2.9.2', 40 | 'tox==3.4.0', 41 | ] 42 | 43 | 44 | setup( 45 | name=about["__title__"], 46 | version=about["__version__"], 47 | author=about["__author__"], 48 | author_email=about["__email__"], 49 | url=about["__uri__"], 50 | description=about["__summary__"], 51 | long_description='See README.md', 52 | packages=find_packages(), 53 | include_package_data=True, 54 | zip_safe=False, 55 | install_requires=install_requires, 56 | extras_require={ 57 | 'tests': tests_require 58 | }, 59 | entry_points={ 60 | 'console_scripts': [ 61 | 'historical = historical.cli:cli', 62 | ] 63 | }, 64 | keywords=['aws', 'account_management'], 65 | classifiers=[ 66 | 'Programming Language :: Python', 67 | 'Programming Language :: Python :: 3', 68 | 'Programming Language :: Python :: 3.6', 69 | ], 70 | ) 71 | -------------------------------------------------------------------------------- /tox.ini: -------------------------------------------------------------------------------- 1 | [tox] 2 | envlist = py36,linters 3 | 4 | [testenv] 5 | usedevelop = True 6 | passenv = TRAVIS TRAVIS_* 7 | deps = 8 | git+https://github.com/mikegrima/moto.git@instanceprofiles#egg=moto 9 | .[tests] 10 | mock 11 | pytest 12 | coveralls 13 | 14 | setenv = 15 | COVERAGE_FILE = test-reports/{envname}/.coverage 16 | PYTEST_ADDOPTS = --junitxml=test-reports/{envname}/junit.xml -vv 17 | # Fix for PynamoDB Vendored Requests: 18 | PYNAMODB_CONFIG = historical/tests/pynamodb_settings.py 19 | commands = 20 | pytest {posargs} --ignore=historical/historical-cookiecutter historical 21 | coveralls 22 | 23 | [testenv:linters] 24 | basepython = python3 25 | usedevelop = true 26 | deps = 27 | {[testenv:flake8]deps} 28 | {[testenv:pylint]deps} 29 | {[testenv:setuppy]deps} 30 | {[testenv:bandit]deps} 31 | commands = 32 | {[testenv:flake8]commands} 33 | {[testenv:pylint]commands} 34 | {[testenv:setuppy]commands} 35 | {[testenv:bandit]commands} 36 | 37 | [testenv:flake8] 38 | basepython = python3 39 | skip_install = true 40 | deps = 41 | flake8 42 | flake8-docstrings>=0.2.7 43 | flake8-import-order>=0.9 44 | commands = 45 | flake8 historical setup.py test 46 | 47 | [testenv:pylint] 48 | basepython = python3 49 | skip_install = false 50 | deps = 51 | pyflakes 52 | pylint 53 | commands = 54 | pylint --rcfile={toxinidir}/.pylintrc historical 55 | 56 | [testenv:setuppy] 57 | basepython = python3 58 | skip_install = true 59 | deps = 60 | commands = 61 | python setup.py check -m -s 62 | 63 | [testenv:bandit] 64 | basepython = python3 65 | skip_install = true 66 | deps = 67 | bandit 68 | commands = 69 | bandit --ini tox.ini -r historical 70 | 71 | [bandit] 72 | skips = B101 73 | 74 | [flake8] 75 | ignore = E501,I100,D205,D400,D401,I202,R0913,C901 76 | exclude = 77 | *.egg-info, 78 | *.pyc, 79 | .cache, 80 | .coverage.*, 
81 | .gradle, 82 | .tox, 83 | build, 84 | dist, 85 | htmlcov.* 86 | *-cookiecutter 87 | historical/tests/factories.py 88 | max-complexity = 10 89 | import-order-style = google 90 | application-import-names = flake8 91 | 92 | [pytest] 93 | norecursedirs=.* 94 | -------------------------------------------------------------------------------- /mkdocs/docs/index.md: -------------------------------------------------------------------------------- 1 |
2 | ![Historical](img/historical.jpg) 3 | 4 | 5 | **THIS PROJECT IS NO LONGER IN DEVELOPMENT. THIS IS ONLY HERE FOR ARCHIVAL PURPOSES ONLY!!**
6 | 7 | Historical is a serverless application that tracks and reacts to AWS resource modifications anywhere in 8 | your environment. Historical achieves this by describing AWS resources when they are changed, and keeping the history of those changes along with the the CloudTrail context of those changes. 9 | 10 | Historical persists data in two places: 11 | 12 | - A "Current" DynamoDB table, which is a cache of the current state of AWS resources 13 | - A "Durable" DynamoDB table, which stores the change history of AWS resources 14 | 15 | Historical enables downstream consumers to react to changes in the AWS environment 16 | without the need to directly describe the resource. This greatly increases speed of reaction, reduces IAM permission complexity, and also avoids rate limiting. 17 | 18 | ## How it works 19 | Historical leverages AWS CloudWatch Events. Events trigger a "Collector" Lambda function to describe the AWS resource that changed, and saves the configuration into a DynamoDB table. From this, a "Differ" Lambda function checks if the resource has changed from what was previously known about that resource. If the item has changed, a new change record is saved, which then enables downstream consumers the ability to react to changes in the environment as the environment effectively changes over time. 20 | 21 | The CloudTrail context on the change is preserved in the change history. 22 | 23 | ## Current Technologies Implemented 24 | 25 | - ### S3 26 | - ### Security Groups 27 | - ### IAM (In active development -- Coming Soon!) 28 | 29 | ## Architecture 30 | Please review the [Architecture](architecture.md) documentation for an in-depth description of the components involved. 31 | 32 | ## Installation & Configuration 33 | Please review the [Installation & Configuration](installation/) documentation for details. 34 | 35 | ## Troubleshooting 36 | Please review the [Troubleshooting](troubleshooting.md) doc if you are experiencing issues. 37 | -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/{{cookiecutter.technology_slug}}/poller.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: {{cookiecutter.technology_slug}}.poller 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: {{cookiecutter.author}} <{{cookiecutter.email}}> 7 | """ 8 | import os 9 | import logging 10 | 11 | from botocore.exceptions import ClientError 12 | 13 | from raven_python_lambda import RavenLambdaWrapper 14 | # from cloudaux.aws.ec2 import describe_security_groups 15 | 16 | # from historical.constants import CURRENT_REGION, HISTORICAL_ROLE 17 | from .models import {{cookiecutter.technology_slug}}_polling_schema 18 | from historical.common.accounts import get_historical_accounts 19 | from historical.common.kinesis import produce_events 20 | 21 | logging.basicConfig() 22 | log = logging.getLogger("historical") 23 | log.setLevel(logging.INFO) 24 | 25 | 26 | @RavenLambdaWrapper() 27 | def handler(event, context): 28 | """ 29 | Historical {{cookiecutter.technology_name}} event poller. 30 | 31 | This poller is run at a set interval in order to ensure that changes do not go undetected by historical. 32 | 33 | Historical pollers generate `polling events` which simulate changes. 
These polling events contain configuration 34 | data such as the account/region defining where the collector should attempt to gather data from. 35 | """ 36 | log.debug('Running poller. Configuration: {}'.format(event)) 37 | 38 | for account in get_historical_accounts(): 39 | try: 40 | # TODO describe all items 41 | # Example:: 42 | # 43 | # groups = describe_security_groups( 44 | # account_number=account['id'], 45 | # assume_role=HISTORICAL_ROLE, 46 | # region=CURRENT_REGION 47 | # ) 48 | # events = [security_group_polling_schema.serialize(account['id'], g) for g in groups['SecurityGroups']] 49 | events = [] 50 | produce_events(events, os.environ.get('HISTORICAL_STREAM', 'Historical{{cookiecutter.technology_slug | titlecase }}PollerStream')) 51 | log.debug('Finished generating polling events. Account: {} Events Created: {}'.format(account['id'], len(events))) 52 | except ClientError as e: 53 | log.warning('Unable to generate events for account. AccountId: {account_id} Reason: {reason}'.format( 54 | account_id=account['id'], 55 | reason=e 56 | )) 57 | 58 | -------------------------------------------------------------------------------- /historical/tests/test_cloudwatch.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.tests.test_s3 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | """ 8 | 9 | import json 10 | 11 | from historical.tests.factories import ( 12 | CloudwatchEventFactory, 13 | DetailFactory, 14 | serialize 15 | ) 16 | 17 | 18 | def test_filter_request_parameters(): 19 | """Tests that specific elements can be pulled out of the Request Parameters in the CloudWatch Event.""" 20 | from historical.common.cloudwatch import filter_request_parameters 21 | event = CloudwatchEventFactory( 22 | detail=DetailFactory( 23 | requestParameters={'GroupId': 'sg-4e386e31'} 24 | ) 25 | ) 26 | data = json.loads(json.dumps(event, default=serialize)) 27 | assert filter_request_parameters('GroupId', data) == 'sg-4e386e31' 28 | 29 | 30 | def test_get_user_identity(): 31 | """Tests that the User Identity can be pulled out of the CloudWatch Event.""" 32 | from historical.common.cloudwatch import get_user_identity 33 | event = CloudwatchEventFactory() 34 | data = json.loads(json.dumps(event, default=serialize)) 35 | assert get_user_identity(data) 36 | 37 | 38 | def test_get_principal(): 39 | """Tests that the Principal object can be pulled out of the CloudWatch Event.""" 40 | from historical.common.cloudwatch import get_principal 41 | event = CloudwatchEventFactory() 42 | data = json.loads(json.dumps(event, default=serialize)) 43 | assert get_principal(data) == 'joe@example.com' 44 | 45 | 46 | def test_get_region(): 47 | """Tests that the Region can be pulled out of the CloudWatch Event.""" 48 | from historical.common.cloudwatch import get_region 49 | event = CloudwatchEventFactory() 50 | data = json.loads(json.dumps(event, default=serialize)) 51 | assert get_region(data) == 'us-east-1' 52 | 53 | 54 | def test_get_event_time(): 55 | """Tests that the Event Time can be pulled out of the CloudWatch Event.""" 56 | from historical.common.cloudwatch import get_event_time 57 | event = CloudwatchEventFactory() 58 | data = json.loads(json.dumps(event, default=serialize)) 59 | assert get_event_time(data) 60 | 61 | 62 | def test_get_account_id(): 63 | """Tests that the Account ID can be pulled out of the CloudWatch Event.""" 64 | 
from historical.common.cloudwatch import get_account_id 65 | event = CloudwatchEventFactory() 66 | data = json.loads(json.dumps(event, default=serialize)) 67 | assert get_account_id(data) == '123456789012' 68 | -------------------------------------------------------------------------------- /terraform/teardown_historical.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | [ -z "$TECH" ] && echo "Need to set TECH -- one of [s3, securitygroup]" && exit 1; 4 | [ -z "$TF_S3_BUCKET" ] && echo "Need to set TF_S3_BUCKET -- the S3 bucket to use for Terraform" && exit 1; 5 | [ -z "$PRIMARY_REGION" ] && echo "Need to set PRIMARY_REGION." && exit 1; 6 | 7 | # AWS ENV VARS: 8 | [ -z "$AWS_ACCESS_KEY_ID" ] && echo "Need to set the AWS_ACCESS_KEY_ID" && exit 1; 9 | [ -z "$AWS_SECRET_ACCESS_KEY" ] && echo "Need to set the AWS_SECRET_ACCESS_KEY" && exit 1; 10 | [ -z "$AWS_SESSION_TOKEN" ] && echo "Need to set the AWS_SESSION_TOKEN" && exit 1; 11 | 12 | # Copy the requirements.txt file over: 13 | WORKING_DIR=$( pwd ) 14 | 15 | # Make an empty file to make Terraform happy: 16 | touch ${WORKING_DIR}/infra/lambda.zip 17 | 18 | # Tear down the stacks first: 19 | cd ${WORKING_DIR}/infra 20 | cp ${TECH}/${TECH}.tf ./ 21 | 22 | # Start the Terraform work: 23 | echo "[@] Now tearing down the infrastructure for each region -- starting with the PRIMARY REGION: ${PRIMARY_REGION}..." 24 | IFS=',' 25 | ALL_REGIONS=$PRIMARY_REGION,$SECONDARY_REGIONS 26 | for region in $ALL_REGIONS; 27 | do 28 | echo "[-->] Initializing Terraform for ${region}..." 29 | /terraform init -plugin-dir=/installer/terraform-plugins -backend-config "bucket=$TF_S3_BUCKET" -backend-config "key=terraform/$TECH/INFRA/$region" 30 | if [ $? -ne 0 ]; then 31 | echo "[X] Terraform init has failed!!" 32 | exit 1 33 | fi 34 | 35 | echo "[-->] Tearing down the stack now..." 36 | TF_VAR_REGION=${region} /terraform destroy -auto-approve 37 | if [ $? -ne 0 ]; then 38 | echo "[X] Terraform stack destroy has failed!! -- Sometimes this needs be run multiple times due to eventual consistency." 39 | exit 1 40 | fi 41 | echo "[+] Completed tearing down stack in ${region}." 42 | 43 | # Clear out the existing Terraform data: 44 | rm -Rf .terraform/ 45 | done 46 | 47 | echo "[-->] Initializing Terraform for DynamoDB work..." 48 | cd ${WORKING_DIR}/dynamodb 49 | # Copy the tech template into the local directory for Terraform to tear down the tech's DynamoDB components: 50 | cp ${TECH}/${TECH}.tf ./ 51 | /terraform init -plugin-dir=/installer/terraform-plugins -backend-config "bucket=$TF_S3_BUCKET" -backend-config "key=terraform/$TECH/DYNAMODB" 52 | if [ $? -ne 0 ]; then 53 | echo "[X] Terraform init has failed!!" 54 | exit 1 55 | fi 56 | echo "[-->] Tearing down the DynamoDB stack..." 57 | /terraform destroy -auto-approve 58 | if [ $? -ne 0 ]; then 59 | echo "[X] Terraform stack destroy has failed!!" 60 | exit 1 61 | fi 62 | echo "[+] Completed tearing down DynamoDB." 63 | 64 | echo "[@] DONE" 65 | -------------------------------------------------------------------------------- /historical/common/cloudwatch.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.common.cloudwatch 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | .. 
author:: Mike Grima 8 | """ 9 | from datetime import datetime 10 | 11 | from historical.constants import CURRENT_REGION 12 | 13 | 14 | def filter_request_parameters(field_name, msg, look_in_response=False): 15 | """ 16 | From an event, extract the field name from the message. 17 | Different API calls put this information in different places, so check a few places. 18 | """ 19 | val = msg['detail'].get(field_name, None) 20 | try: 21 | if not val: 22 | val = msg['detail'].get('requestParameters', {}).get(field_name, None) 23 | 24 | # If we STILL didn't find it -- check if it's in the response element (default off) 25 | if not val and look_in_response: 26 | if msg['detail'].get('responseElements'): 27 | val = msg['detail']['responseElements'].get(field_name, None) 28 | 29 | # Just in case... We didn't find the value, so just make it None: 30 | except AttributeError: 31 | val = None 32 | 33 | return val 34 | 35 | 36 | def get_user_identity(event): 37 | """Gets event identity from event.""" 38 | return event['detail'].get('userIdentity', {}) 39 | 40 | 41 | def get_principal(event): 42 | """Gets principal id from the event""" 43 | user_identity = get_user_identity(event) 44 | return user_identity.get('principalId', '').split(':')[-1] 45 | 46 | 47 | def get_region(event): 48 | """Get region from event details.""" 49 | return event['detail'].get('awsRegion', CURRENT_REGION) 50 | 51 | 52 | def get_event_time(event): 53 | """Gets the event time from an event""" 54 | return datetime.strptime(event['detail']['eventTime'], "%Y-%m-%dT%H:%M:%SZ") 55 | 56 | 57 | def get_account_id(event): 58 | """Gets the account id from an event""" 59 | return event['account'] 60 | 61 | 62 | def get_collected_details(event): 63 | """Gets collected details if the technology's poller already described the given asset""" 64 | return event['detail'].get('collected') 65 | 66 | 67 | def get_historical_base_info(event): 68 | """Gets the base details from the CloudWatch Event.""" 69 | data = { 70 | 'principalId': get_principal(event), 71 | 'userIdentity': get_user_identity(event), 72 | 'accountId': event['account'], 73 | 'userAgent': event['detail'].get('userAgent'), 74 | 'sourceIpAddress': event['detail'].get('sourceIPAddress'), 75 | 'requestParameters': event['detail'].get('requestParameters') 76 | } 77 | 78 | if event['detail'].get('eventTime'): 79 | data['eventTime'] = event['detail']['eventTime'] 80 | 81 | if event['detail'].get('eventSource'): 82 | data['eventSource'] = event['detail']['eventSource'] 83 | 84 | if event['detail'].get('eventName'): 85 | data['eventName'] = event['detail']['eventName'] 86 | 87 | return data 88 | -------------------------------------------------------------------------------- /terraform/install_historical.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | [ -z "$TECH" ] && echo "Need to set TECH -- one of [s3, securitygroup]" && exit 1; 4 | [ -z "$TF_S3_BUCKET" ] && echo "Need to set TF_S3_BUCKET -- the S3 bucket to use for Terraform" && exit 1; 5 | [ -z "$PRIMARY_REGION" ] && echo "Need to set PRIMARY_REGION." 
&& exit 1; 6 | 7 | # AWS ENV VARS: 8 | [ -z "$AWS_ACCESS_KEY_ID" ] && echo "Need to set the AWS_ACCESS_KEY_ID" && exit 1; 9 | [ -z "$AWS_SECRET_ACCESS_KEY" ] && echo "Need to set the AWS_SECRET_ACCESS_KEY" && exit 1; 10 | [ -z "$AWS_SESSION_TOKEN" ] && echo "Need to set the AWS_SESSION_TOKEN" && exit 1; 11 | 12 | # Copy the requirements.txt file over: 13 | WORKING_DIR=$( pwd ) 14 | echo "[-->] Copying the requirements file over to the build dir..." 15 | mkdir build 16 | cp requirements.txt build/ 17 | 18 | # Navigate to the build dir: 19 | cd build/ 20 | BUILD_SOURCE_DIR=$( pwd ) 21 | 22 | # Make the venv: 23 | echo "[...] Building the venv..." 24 | python36 -m venv venv 25 | source venv/bin/activate 26 | 27 | # Packaging: 28 | ZIP_NAME="historical-${TECH}.zip" 29 | echo "[...] Building the Lambda..." 30 | pip install -r requirements.txt -t ./artifacts 31 | echo "[...] Zipping the Lambda..." 32 | cd artifacts 33 | zip -r ${ZIP_NAME} . 34 | cd ${BUILD_SOURCE_DIR} 35 | 36 | # Make a sym link and place it in the Terraform infra dir for later reference. 37 | cd ${WORKING_DIR} 38 | ln -s ${WORKING_DIR}/build/artifacts/${ZIP_NAME} ${WORKING_DIR}/infra/lambda.zip 39 | 40 | # Start the Terraform work: 41 | echo "[-->] Initializing Terraform for DynamoDB work..." 42 | cd ./dynamodb 43 | # Copy the tech template into the local directory for Terraform to set up the tech's DynamoDB components: 44 | cp ${TECH}/${TECH}.tf ./ 45 | /terraform init -plugin-dir=/installer/terraform-plugins -backend-config "bucket=$TF_S3_BUCKET" -backend-config "key=terraform/$TECH/DYNAMODB" 46 | if [ $? -ne 0 ]; then 47 | echo "[X] Terraform init has failed!!" 48 | exit 1 49 | fi 50 | echo "[-->] Applying the DynamoDB template..." 51 | /terraform apply -auto-approve 52 | if [ $? -ne 0 ]; then 53 | echo "[X] Terraform application has failed!!" 54 | exit 1 55 | fi 56 | echo "[+] Completed applying Terraform for DynamoDB" 57 | 58 | echo "[@] Now deploying the rest of the infrastructure for each region -- starting with the PRIMARY REGION: ${PRIMARY_REGION}..." 59 | # Copy the tech template into the local directory for Terraform to set up the tech's complete infrastructure components: 60 | cd ${WORKING_DIR}/infra 61 | cp ${TECH}/${TECH}.tf ./ 62 | 63 | IFS=',' 64 | ALL_REGIONS=$PRIMARY_REGION,$SECONDARY_REGIONS 65 | for region in $ALL_REGIONS; 66 | do 67 | echo "[-->] Initializing Terraform for ${region}..." 68 | /terraform init -plugin-dir=/installer/terraform-plugins -backend-config "bucket=$TF_S3_BUCKET" -backend-config "key=terraform/$TECH/INFRA/$region" 69 | if [ $? -ne 0 ]; then 70 | echo "[X] Terraform init has failed!!" 71 | exit 1 72 | fi 73 | 74 | echo "[-->] Applying the template now..." 75 | TF_VAR_REGION=${region} /terraform apply -auto-approve 76 | if [ $? -ne 0 ]; then 77 | echo "[X] Terraform application has failed!!" 78 | exit 1 79 | fi 80 | echo "[+] Completed applying template in ${region}." 81 | 82 | # Clear out the existing Terraform data: 83 | rm -Rf .terraform/ 84 | done 85 | 86 | echo "[@] DONE" 87 | -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/{{cookiecutter.technology_slug}}/models.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: {{cookiecutter.technology_slug}}.models 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. 
author:: {{cookiecutter.author}} <{{cookiecutter.email}}> 7 | """ 8 | from marshmallow import Schema, fields, post_dump 9 | 10 | from pynamodb.models import Model 11 | from pynamodb.indexes import GlobalSecondaryIndex, AllProjection 12 | from pynamodb.attributes import UnicodeAttribute, NumberAttribute, ListAttribute 13 | 14 | from historical.constants import CURRENT_REGION 15 | from historical.models import ( 16 | HistoricalPollingEventDetail, 17 | HistoricalPollingBaseModel, 18 | DurableHistoricalModel, 19 | CurrentHistoricalModel, 20 | AWSHistoricalMixin 21 | ) 22 | 23 | 24 | class {{cookiecutter.technology_slug | titlecase}}Model(object): 25 | # TODO add attributes specific to technology 26 | Tags = ListAttribute() 27 | 28 | 29 | class Durable{{cookiecutter.technology_slug | titlecase}}Model(Model, DurableHistoricalModel, AWSHistoricalMixin, {{cookiecutter.technology_slug | titlecase}}Model): 30 | class Meta: 31 | table_name = 'Historical{{cookiecutter.technology_slug | titlecase}}DurableTable' 32 | region = CURRENT_REGION 33 | 34 | 35 | class Current{{cookiecutter.technology_slug | titlecase}}Model(Model, CurrentHistoricalModel, AWSHistoricalMixin, {{cookiecutter.technology_slug | titlecase}}Model): 36 | class Meta: 37 | table_name = 'Historical{{cookiecutter.technology_slug | titlecase}}CurrentTable' 38 | region = CURRENT_REGION 39 | 40 | 41 | class ViewIndex(GlobalSecondaryIndex): 42 | class Meta: 43 | projection = AllProjection() 44 | region = CURRENT_REGION 45 | 46 | view = NumberAttribute(default=0, hash_key=True) 47 | 48 | 49 | class {{cookiecutter.technology_slug | titlecase}}PollingRequestParamsModel(Schema): 50 | # TODO add technology_slug validation fields 51 | owner_id = fields.Str(dump_to='ownerId', load_from='ownerId', required=True) 52 | 53 | 54 | class {{cookiecutter.technology_slug | titlecase}}PollingEventDetail(HistoricalPollingEventDetail): 55 | @post_dump 56 | def add_required_{{cookiecutter.technology_slug}}_polling_data(self, data): 57 | data['eventSource'] = 'historical.ec2.poller' 58 | data['eventName'] = 'HistoricalPoller' 59 | return data 60 | 61 | 62 | class {{cookiecutter.technology_slug | titlecase}}PollingEventModel(HistoricalPollingBaseModel): 63 | detail = fields.Nested({{cookiecutter.technology_slug | titlecase}}PollingEventDetail, required=True) 64 | 65 | @post_dump() 66 | def dump_security_group_polling_event_data(self, data): 67 | data['version'] = '1' 68 | return data 69 | 70 | # TODO add technology_slug specific fields 71 | def serialize(self, account, group): 72 | return self.dumps({ 73 | 'account': account, 74 | 'detail': { 75 | 'request_parameters': { 76 | 'groupId': group['GroupId'] 77 | } 78 | } 79 | }).data 80 | 81 | 82 | {{cookiecutter.technology_slug}}_polling_schema = {{cookiecutter.technology_slug | titlecase}}PollingEventModel(strict=True) 83 | -------------------------------------------------------------------------------- /historical/attributes.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.attributes 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | .. 
author:: Mike Grima 8 | """ 9 | import json 10 | import decimal 11 | 12 | from pynamodb.attributes import Attribute, BooleanAttribute, ListAttribute, MapAttribute, NumberAttribute 13 | 14 | import pynamodb 15 | from pynamodb.constants import NUMBER, STRING 16 | 17 | DATETIME_FORMAT = '%Y-%m-%dT%H:%M:%SZ' 18 | 19 | 20 | class HistoricalUnicodeAttribute(Attribute): 21 | """A Historical unicode attribute. 22 | Replaces '' with the '<empty>' placeholder during serialization and correctly deserializes '<empty>' back to '' 23 | """ 24 | 25 | attr_type = STRING 26 | 27 | def serialize(self, value): 28 | """Returns a unicode string""" 29 | if value is None or not len(value): # pylint: disable=C1801 30 | return '<empty>' 31 | return value 32 | 33 | def deserialize(self, value): 34 | """Replaces the `<empty>` placeholders with empty strings.""" 35 | if value == '<empty>': 36 | return '' 37 | return value 38 | 39 | 40 | class EventTimeAttribute(Attribute): 41 | """An attribute for storing a UTC Datetime or iso8601 string.""" 42 | 43 | attr_type = STRING 44 | 45 | def serialize(self, value): 46 | """Takes a datetime object and returns a string""" 47 | if isinstance(value, str): 48 | return value 49 | return value.strftime(DATETIME_FORMAT) 50 | 51 | 52 | def decimal_default(obj): 53 | """Properly parse out the Decimal datatypes into proper int/float types.""" 54 | if isinstance(obj, decimal.Decimal): 55 | if obj % 1: 56 | return float(obj) 57 | return int(obj) 58 | raise TypeError 59 | 60 | 61 | # pylint: disable=R1705,C0200 62 | def fix_decimals(obj): 63 | """Removes the stupid Decimals 64 | 65 | See: https://github.com/boto/boto3/issues/369#issuecomment-302137290 66 | """ 67 | if isinstance(obj, list): 68 | for i in range(len(obj)): 69 | obj[i] = fix_decimals(obj[i]) 70 | return obj 71 | 72 | elif isinstance(obj, dict): 73 | for key, value in obj.items(): 74 | obj[key] = fix_decimals(value) 75 | return obj 76 | 77 | elif isinstance(obj, decimal.Decimal): 78 | if obj % 1 == 0: 79 | return int(obj) 80 | else: 81 | return float(obj) 82 | 83 | else: 84 | return obj 85 | 86 | 87 | class HistoricalDecimalAttribute(Attribute): 88 | """A number attribute""" 89 | 90 | attr_type = NUMBER 91 | 92 | def serialize(self, value): 93 | """Encode numbers as JSON""" 94 | return json.dumps(value, default=decimal_default) 95 | 96 | def deserialize(self, value): 97 | """Decode numbers from JSON""" 98 | return json.loads(value) 99 | 100 | 101 | pynamodb.attributes.SERIALIZE_CLASS_MAP = { 102 | dict: MapAttribute(), 103 | list: ListAttribute(), 104 | set: ListAttribute(), 105 | bool: BooleanAttribute(), 106 | float: NumberAttribute(), 107 | int: NumberAttribute(), 108 | str: HistoricalUnicodeAttribute(), 109 | decimal.Decimal: HistoricalDecimalAttribute() 110 | } 111 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | 2 | # Created by https://www.gitignore.io/api/python,visualstudiocode,node,serverless 3 | 4 | .idea 5 | *.cert 6 | *.key 7 | 8 | ### Node ### 9 | # Logs 10 | logs 11 | *.log 12 | npm-debug.log* 13 | yarn-debug.log* 14 | yarn-error.log* 15 | 16 | # Runtime data 17 | pids 18 | *.pid 19 | *.seed 20 | *.pid.lock 21 | 22 | # Directory for instrumented libs generated by jscoverage/JSCover 23 | lib-cov 24 | 25 | # Coverage directory used by tools like istanbul 26 | coverage 27 | 28 | # nyc test coverage 29 | .nyc_output 30 | 31 | # Grunt intermediate storage (http://gruntjs.com/creating-plugins#storing-task-files) 32 | .grunt 33 | 34 | # Bower
dependency directory (https://bower.io/) 35 | bower_components 36 | 37 | # node-waf configuration 38 | .lock-wscript 39 | 40 | # Compiled binary addons (http://nodejs.org/api/addons.html) 41 | build/Release 42 | 43 | # Dependency directories 44 | node_modules/ 45 | jspm_packages/ 46 | 47 | # Typescript v1 declaration files 48 | typings/ 49 | 50 | # Optional npm cache directory 51 | .npm 52 | 53 | # Optional eslint cache 54 | .eslintcache 55 | 56 | # Optional REPL history 57 | .node_repl_history 58 | 59 | # Output of 'npm pack' 60 | *.tgz 61 | 62 | # Yarn Integrity file 63 | .yarn-integrity 64 | 65 | # dotenv environment variables file 66 | .env 67 | 68 | 69 | ### Python ### 70 | # Byte-compiled / optimized / DLL files 71 | __pycache__/ 72 | *.py[cod] 73 | *$py.class 74 | 75 | # C extensions 76 | *.so 77 | 78 | # Distribution / packaging 79 | .Python 80 | env/ 81 | build/ 82 | develop-eggs/ 83 | dist/ 84 | downloads/ 85 | eggs/ 86 | .eggs/ 87 | lib/ 88 | lib64/ 89 | parts/ 90 | sdist/ 91 | var/ 92 | wheels/ 93 | *.egg-info/ 94 | .installed.cfg 95 | *.egg 96 | 97 | # PyInstaller 98 | # Usually these files are written by a python script from a template 99 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 100 | *.manifest 101 | *.spec 102 | 103 | # Installer logs 104 | pip-log.txt 105 | pip-delete-this-directory.txt 106 | 107 | # Unit test / coverage reports 108 | htmlcov/ 109 | .tox/ 110 | .coverage 111 | .coverage.* 112 | .cache 113 | nosetests.xml 114 | coverage.xml 115 | *,cover 116 | .hypothesis/ 117 | 118 | # Translations 119 | *.mo 120 | *.pot 121 | 122 | # Django stuff: 123 | local_settings.py 124 | 125 | # Flask stuff: 126 | instance/ 127 | .webassets-cache 128 | 129 | # Scrapy stuff: 130 | .scrapy 131 | 132 | # Sphinx documentation 133 | docs/_build/ 134 | 135 | # PyBuilder 136 | target/ 137 | 138 | # Jupyter Notebook 139 | .ipynb_checkpoints 140 | 141 | # pyenv 142 | .python-version 143 | 144 | # celery beat schedule file 145 | celerybeat-schedule 146 | 147 | # SageMath parsed files 148 | *.sage.py 149 | 150 | # dotenv 151 | 152 | # virtualenv 153 | .venv 154 | venv/ 155 | ENV/ 156 | venv 157 | 158 | # Spyder project settings 159 | .spyderproject 160 | .spyproject 161 | 162 | # Rope project settings 163 | .ropeproject 164 | 165 | # mkdocs documentation 166 | /site 167 | 168 | ### Serverless ### 169 | # Ignore build directory 170 | .serverless 171 | .requirements 172 | 173 | *.test.yml 174 | serverless.yml 175 | serverless_configs/* 176 | test-reports/* 177 | 178 | ### VisualStudioCode ### 179 | .vscode 180 | 181 | # End of https://www.gitignore.io/api/python,visualstudiocode,node,serverless 182 | 183 | .DS_Store 184 | .DS_Store/ 185 | 186 | ### MKDOCS ### 187 | mkdocs/site/ 188 | 189 | # Terraform 190 | env.list 191 | -------------------------------------------------------------------------------- /historical/security_group/models.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.security_group.models 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. 
author:: Kevin Glisson 7 | """ 8 | from marshmallow import fields, post_dump 9 | 10 | from pynamodb.attributes import UnicodeAttribute 11 | 12 | from historical.constants import CURRENT_REGION 13 | from historical.models import AWSHistoricalMixin, CurrentHistoricalModel, DurableHistoricalModel,\ 14 | HistoricalPollingBaseModel, HistoricalPollingEventDetail 15 | 16 | VERSION = 1 17 | 18 | 19 | class SecurityGroupModel: 20 | """Security Group specific fields for DynamoDB.""" 21 | 22 | GroupId = UnicodeAttribute() 23 | GroupName = UnicodeAttribute() 24 | VpcId = UnicodeAttribute(null=True) 25 | Region = UnicodeAttribute() 26 | 27 | 28 | class DurableSecurityGroupModel(DurableHistoricalModel, AWSHistoricalMixin, SecurityGroupModel): 29 | """The Durable Table model for Security Groups.""" 30 | 31 | class Meta: 32 | """Table details""" 33 | 34 | table_name = 'HistoricalSecurityGroupDurableTable' 35 | region = CURRENT_REGION 36 | tech = 'securitygroup' 37 | 38 | 39 | class CurrentSecurityGroupModel(CurrentHistoricalModel, AWSHistoricalMixin, SecurityGroupModel): 40 | """The Current Table model for Security Groups.""" 41 | 42 | class Meta: 43 | """Table details""" 44 | 45 | table_name = 'HistoricalSecurityGroupCurrentTable' 46 | region = CURRENT_REGION 47 | tech = 'securitygroup' 48 | 49 | 50 | class SecurityGroupPollingEventDetail(HistoricalPollingEventDetail): 51 | """Schema that provides the required fields for mimicking the CloudWatch Event for Polling.""" 52 | 53 | region = fields.Str(required=True, load_from='awsRegion', dump_to='awsRegion') 54 | 55 | @post_dump 56 | def add_required_security_group_polling_data(self, data): 57 | """Adds the required data to the JSON. 58 | 59 | :param data: 60 | :return: 61 | """ 62 | data['eventSource'] = 'historical.ec2.poller' 63 | data['eventName'] = 'PollSecurityGroups' 64 | return data 65 | 66 | 67 | class SecurityGroupPollingEventModel(HistoricalPollingBaseModel): 68 | """This is the Marshmallow schema for a Polling event. This is made to look like a CloudWatch Event.""" 69 | 70 | detail = fields.Nested(SecurityGroupPollingEventDetail, required=True) 71 | 72 | @post_dump() 73 | def dump_security_group_polling_event_data(self, data): 74 | """Adds the required data to the JSON. 75 | 76 | :param data: 77 | :return: 78 | """ 79 | data['version'] = '1' 80 | return data 81 | 82 | def serialize(self, account, group, region): 83 | """Serializes the JSON for the Polling Event Model. 84 | 85 | :param account: 86 | :param group: 87 | :param region: 88 | :return: 89 | """ 90 | return self.dumps({ 91 | 'account': account, 92 | 'detail': { 93 | 'request_parameters': { 94 | 'groupId': group['GroupId'] 95 | }, 96 | 'region': region, 97 | 'collected': group 98 | } 99 | }).data 100 | 101 | 102 | SECURITY_GROUP_POLLING_SCHEMA = SecurityGroupPollingEventModel(strict=True) 103 | -------------------------------------------------------------------------------- /historical/common/sqs.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.common.sqs 3 | :platform: Unix 4 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. 
author:: Mike Grima 7 | """ 8 | import logging 9 | import uuid 10 | import random 11 | 12 | import boto3 13 | 14 | from historical.constants import CURRENT_REGION 15 | 16 | logging.basicConfig() 17 | LOG = logging.getLogger('historical') 18 | LOG.setLevel(logging.INFO) 19 | 20 | 21 | def chunks(event_list, chunk_size): 22 | """Yield successive n-sized chunks from the event list.""" 23 | for i in range(0, len(event_list), chunk_size): 24 | yield event_list[i:i + chunk_size] 25 | 26 | 27 | def get_queue_url(queue_name): 28 | """Get the URL of the SQS queue to send events to.""" 29 | client = boto3.client("sqs", CURRENT_REGION) 30 | queue = client.get_queue_url(QueueName=queue_name) 31 | 32 | return queue["QueueUrl"] 33 | 34 | 35 | def make_sqs_record(event, delay_seconds=0): 36 | """Get a dict with the components required for SQS""" 37 | return { 38 | "Id": uuid.uuid4().hex, 39 | "DelaySeconds": delay_seconds, 40 | "MessageBody": event 41 | } 42 | 43 | 44 | def get_random_delay(max_seconds): 45 | """Gets a randomized number between 0 and the max number in seconds for 46 | how long a message in SQS should be delayed. 47 | 48 | 900 seconds (15 min) is the maximum permitted by SQS. 49 | :param max_seconds: 50 | :return: 51 | """ 52 | return random.randint(0, max_seconds) # nosec 53 | 54 | 55 | def produce_events(events, queue_url, batch_size=10, randomize_delay=0): 56 | """ 57 | Efficiently sends events to the SQS event queue. 58 | 59 | Note: SQS has a max size of 10 items. Please be aware that this can make the messages go past size -- even 60 | with shrinking messages! 61 | 62 | Events can get randomized delays, maximum of 900 seconds. Set that in `randomize_delay` 63 | :param events: 64 | :param queue_url: 65 | :param batch_size: 66 | :param randomize_delay: 67 | """ 68 | client = boto3.client('sqs', region_name=CURRENT_REGION) 69 | 70 | for chunk in chunks(events, batch_size): 71 | records = [make_sqs_record(event, delay_seconds=get_random_delay(randomize_delay)) for event in chunk] 72 | 73 | client.send_message_batch(Entries=records, QueueUrl=queue_url) 74 | 75 | 76 | def group_records_by_type(records, update_events): 77 | """Break records into two lists; create/update events and delete events. 78 | 79 | :param records: 80 | :param update_events: 81 | :return update_records, delete_records: 82 | """ 83 | update_records, delete_records = [], [] 84 | for record in records: 85 | if record.get("detail-type", "") == "Scheduled Event": 86 | LOG.error("[X] Received a Scheduled Event in the Queue... Please check that your environment is set up" 87 | " correctly.") 88 | continue 89 | 90 | # Ignore SQS junk messages (like subscription notices and things): 91 | if not record.get("detail"): 92 | continue 93 | 94 | # Do not capture error events: 95 | if not record["detail"].get("errorCode"): 96 | if record['detail']['eventName'] in update_events: 97 | update_records.append(record) 98 | else: 99 | delete_records.append(record) 100 | 101 | return update_records, delete_records 102 | -------------------------------------------------------------------------------- /historical/vpc/models.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.vpc.models 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | .. 
author:: Mike Grima 8 | """ 9 | from marshmallow import fields, post_dump, Schema 10 | 11 | from pynamodb.attributes import BooleanAttribute, UnicodeAttribute 12 | 13 | from historical.constants import CURRENT_REGION 14 | from historical.models import ( 15 | AWSHistoricalMixin, 16 | CurrentHistoricalModel, 17 | DurableHistoricalModel, 18 | HistoricalPollingBaseModel, 19 | HistoricalPollingEventDetail, 20 | ) 21 | 22 | 23 | VERSION = 1 24 | 25 | 26 | class VPCModel: 27 | """VPC specific fields for DynamoDB.""" 28 | 29 | VpcId = UnicodeAttribute() 30 | State = UnicodeAttribute() 31 | CidrBlock = UnicodeAttribute() 32 | IsDefault = BooleanAttribute() 33 | Name = UnicodeAttribute(null=True) 34 | Region = UnicodeAttribute() 35 | 36 | 37 | class DurableVPCModel(DurableHistoricalModel, AWSHistoricalMixin, VPCModel): 38 | """The Durable Table model for VPC.""" 39 | 40 | class Meta: 41 | """Table details""" 42 | 43 | table_name = 'HistoricalVPCDurableTable' 44 | region = CURRENT_REGION 45 | tech = 'vpc' 46 | 47 | 48 | class CurrentVPCModel(CurrentHistoricalModel, AWSHistoricalMixin, VPCModel): 49 | """The Current Table model for VPC.""" 50 | 51 | class Meta: 52 | """Table details""" 53 | 54 | table_name = 'HistoricalVPCCurrentTable' 55 | region = CURRENT_REGION 56 | tech = 'vpc' 57 | 58 | 59 | class VPCPollingRequestParamsModel(Schema): 60 | """Schema with the required fields for the Poller to instruct the Collector to fetch VPC details.""" 61 | 62 | vpc_id = fields.Str(dump_to='vpcId', load_from='vpcId', required=True) 63 | owner_id = fields.Str(dump_to='ownerId', load_from='ownerId', required=True) 64 | 65 | 66 | class VPCPollingEventDetail(HistoricalPollingEventDetail): 67 | """Schema that provides the required fields for mimicking the CloudWatch Event for Polling.""" 68 | 69 | @post_dump 70 | def add_required_vpc_polling_data(self, data): 71 | """Adds the required data to the JSON. 72 | 73 | :param data: 74 | :return: 75 | """ 76 | data['eventSource'] = 'historical.ec2.poller' 77 | data['eventName'] = 'PollVpc' 78 | return data 79 | 80 | 81 | class VPCPollingEventModel(HistoricalPollingBaseModel): 82 | """This is the Marshmallow schema for a Polling event. This is made to look like a CloudWatch Event.""" 83 | 84 | detail = fields.Nested(VPCPollingEventDetail, required=True) 85 | 86 | @post_dump() 87 | def dump_vpc_polling_event_data(self, data): 88 | """Adds the required data to the JSON. 89 | 90 | :param data: 91 | :return: 92 | """ 93 | data['version'] = '1' 94 | return data 95 | 96 | def serialize(self, account, group): 97 | """Serializes the JSON for the Polling Event Model. 98 | 99 | :param account: 100 | :param group: 101 | :return: 102 | """ 103 | return self.dumps({ 104 | 'account': account, 105 | 'detail': { 106 | 'request_parameters': { 107 | 'vpcId': group['VpcId'] 108 | } 109 | } 110 | }).data 111 | 112 | 113 | VPC_POLLING_SCHEMA = VPCPollingEventModel(strict=True) 114 | -------------------------------------------------------------------------------- /historical/s3/poller.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.s3.poller 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. 
author:: Mike Grima 7 | """ 8 | import os 9 | import logging 10 | 11 | from botocore.exceptions import ClientError 12 | 13 | from cloudaux.aws.s3 import list_buckets 14 | 15 | from raven_python_lambda import RavenLambdaWrapper 16 | 17 | from historical.common.sqs import get_queue_url, produce_events 18 | from historical.common.util import deserialize_records 19 | from historical.constants import CURRENT_REGION, HISTORICAL_ROLE, LOGGING_LEVEL, RANDOMIZE_POLLER 20 | from historical.models import HistoricalPollerTaskEventModel 21 | from historical.s3.models import S3_POLLING_SCHEMA 22 | from historical.common.accounts import get_historical_accounts 23 | 24 | logging.basicConfig() 25 | LOG = logging.getLogger("historical") 26 | LOG.setLevel(LOGGING_LEVEL) 27 | 28 | 29 | @RavenLambdaWrapper() 30 | def poller_tasker_handler(event, context): # pylint: disable=W0613 31 | """ 32 | Historical S3 Poller Tasker. 33 | 34 | The Poller is run at a set interval in order to ensure that changes do not go undetected by Historical. 35 | 36 | Historical pollers generate `polling events` which simulate changes. These polling events contain configuration 37 | data such as the account/region defining where the collector should attempt to gather data from. 38 | 39 | This is the entry point. This will task subsequent Poller lambdas to list all of a given resource in a select few 40 | AWS accounts. 41 | """ 42 | LOG.debug('[@] Running Poller Tasker...') 43 | 44 | queue_url = get_queue_url(os.environ.get('POLLER_TASKER_QUEUE_NAME', 'HistoricalS3PollerTasker')) 45 | poller_task_schema = HistoricalPollerTaskEventModel() 46 | 47 | events = [poller_task_schema.serialize_me(account['id'], CURRENT_REGION) for account in get_historical_accounts()] 48 | 49 | try: 50 | produce_events(events, queue_url, randomize_delay=RANDOMIZE_POLLER) 51 | except ClientError as exc: 52 | LOG.error(f'[X] Unable to generate poller tasker events! Reason: {exc}') 53 | 54 | LOG.debug('[@] Finished tasking the pollers.') 55 | 56 | 57 | @RavenLambdaWrapper() 58 | def poller_processor_handler(event, context): # pylint: disable=W0613 59 | """ 60 | Historical S3 Poller Processor. 61 | 62 | This will receive events from the Poller Tasker, and will list all objects of a given technology for an 63 | account/region pair. This will generate `polling events` which simulate changes. These polling events contain 64 | configuration data such as the account/region defining where the collector should attempt to gather data from. 65 | """ 66 | LOG.debug('[@] Running Poller...') 67 | 68 | queue_url = get_queue_url(os.environ.get('POLLER_QUEUE_NAME', 'HistoricalS3Poller')) 69 | 70 | records = deserialize_records(event['Records']) 71 | 72 | for record in records: 73 | # Skip accounts that have role assumption errors: 74 | try: 75 | # List all buckets in the account: 76 | all_buckets = list_buckets(account_number=record['account_id'], 77 | assume_role=HISTORICAL_ROLE, 78 | session_name="historical-cloudwatch-s3list", 79 | region=record['region'])["Buckets"] 80 | 81 | events = [S3_POLLING_SCHEMA.serialize_me(record['account_id'], bucket) for bucket in all_buckets] 82 | produce_events(events, queue_url, randomize_delay=RANDOMIZE_POLLER) 83 | except ClientError as exc: 84 | LOG.error(f"[X] Unable to generate events for account. Account Id: {record['account_id']} Reason: {exc}") 85 | 86 | LOG.debug(f"[@] Finished generating polling events for account: {record['account_id']}. 
Events Created:" 87 | f" {len(record['account_id'])}") 88 | -------------------------------------------------------------------------------- /historical/vpc/poller.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.vpc.poller 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | .. author:: Mike Grima 8 | """ 9 | import os 10 | import logging 11 | 12 | from botocore.exceptions import ClientError 13 | 14 | from raven_python_lambda import RavenLambdaWrapper 15 | from cloudaux.aws.ec2 import describe_vpcs 16 | 17 | from historical.constants import HISTORICAL_ROLE, LOGGING_LEVEL, POLL_REGIONS, RANDOMIZE_POLLER 18 | from historical.common.util import deserialize_records 19 | from historical.vpc.models import VPC_POLLING_SCHEMA 20 | from historical.models import HistoricalPollerTaskEventModel 21 | from historical.common.accounts import get_historical_accounts 22 | from historical.common.sqs import get_queue_url, produce_events 23 | 24 | logging.basicConfig() 25 | LOG = logging.getLogger("historical") 26 | LOG.setLevel(LOGGING_LEVEL) 27 | 28 | 29 | @RavenLambdaWrapper() 30 | def poller_tasker_handler(event, context): # pylint: disable=W0613 31 | """ 32 | Historical VPC Poller Tasker. 33 | 34 | The Poller is run at a set interval in order to ensure that changes do not go undetected by Historical. 35 | 36 | Historical pollers generate `polling events` which simulate changes. These polling events contain configuration 37 | data such as the account/region defining where the collector should attempt to gather data from. 38 | 39 | This is the entry point. This will task subsequent Poller lambdas to list all of a given resource in a select few 40 | AWS accounts. 41 | """ 42 | LOG.debug('[@] Running Poller Tasker...') 43 | 44 | queue_url = get_queue_url(os.environ.get('POLLER_TASKER_QUEUE_NAME', 'HistoricalVPCPollerTasker')) 45 | poller_task_schema = HistoricalPollerTaskEventModel() 46 | 47 | events = [] 48 | for account in get_historical_accounts(): 49 | for region in POLL_REGIONS: 50 | events.append(poller_task_schema.serialize_me(account['id'], region)) 51 | 52 | try: 53 | produce_events(events, queue_url, randomize_delay=RANDOMIZE_POLLER) 54 | except ClientError as exc: 55 | LOG.error(f'[X] Unable to generate poller tasker events! Reason: {exc}') 56 | 57 | LOG.debug('[@] Finished tasking the pollers.') 58 | 59 | 60 | @RavenLambdaWrapper() 61 | def poller_processor_handler(event, context): # pylint: disable=W0613 62 | """ 63 | Historical Security Group Poller Processor. 64 | 65 | This will receive events from the Poller Tasker, and will list all objects of a given technology for an 66 | account/region pair. This will generate `polling events` which simulate changes. These polling events contain 67 | configuration data such as the account/region defining where the collector should attempt to gather data from. 
68 | """ 69 | LOG.debug('[@] Running Poller...') 70 | 71 | queue_url = get_queue_url(os.environ.get('POLLER_QUEUE_NAME', 'HistoricalVPCPoller')) 72 | 73 | records = deserialize_records(event['Records']) 74 | 75 | for record in records: 76 | # Skip accounts that have role assumption errors: 77 | try: 78 | vpcs = describe_vpcs( 79 | account_number=record['account_id'], 80 | assume_role=HISTORICAL_ROLE, 81 | region=record['region'] 82 | ) 83 | 84 | events = [VPC_POLLING_SCHEMA.serialize(record['account_id'], v) for v in vpcs] 85 | produce_events(events, queue_url, randomize_delay=RANDOMIZE_POLLER) 86 | LOG.debug(f"[@] Finished generating polling events. Account: {record['account_id']}/{record['region']} " 87 | f"Events Created: {len(events)}") 88 | except ClientError as exc: 89 | LOG.error(f"[X] Unable to generate events for account/region. Account Id/Region: {record['account_id']}" 90 | f"/{record['region']} Reason: {exc}") 91 | -------------------------------------------------------------------------------- /historical/s3/models.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.s3.models 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Mike Grima 7 | """ 8 | from marshmallow import fields, post_dump, Schema 9 | from pynamodb.attributes import UnicodeAttribute 10 | 11 | from historical.constants import CURRENT_REGION 12 | from historical.models import AWSHistoricalMixin, CurrentHistoricalModel, DurableHistoricalModel, \ 13 | HistoricalPollingBaseModel, HistoricalPollingEventDetail 14 | 15 | 16 | # The schema version -- TODO: Get this from CloudAux 17 | VERSION = 9 18 | 19 | 20 | class S3Model: 21 | """S3 specific fields for DynamoDB.""" 22 | 23 | BucketName = UnicodeAttribute() 24 | Region = UnicodeAttribute() 25 | 26 | 27 | class DurableS3Model(DurableHistoricalModel, AWSHistoricalMixin, S3Model): 28 | """The Durable Table model for S3.""" 29 | 30 | class Meta: 31 | """Table Details""" 32 | 33 | table_name = 'HistoricalS3DurableTable' 34 | region = CURRENT_REGION 35 | tech = 's3' 36 | 37 | 38 | class CurrentS3Model(CurrentHistoricalModel, AWSHistoricalMixin, S3Model): 39 | """The Current Table model for S3.""" 40 | 41 | class Meta: 42 | """Table Details""" 43 | 44 | table_name = 'HistoricalS3CurrentTable' 45 | region = CURRENT_REGION 46 | tech = 's3' 47 | 48 | 49 | class S3PollingRequestParamsModel(Schema): 50 | """Schema with the required fields for the Poller to instruct the Collector to fetch S3 details.""" 51 | 52 | bucket_name = fields.Str(dump_to="bucketName", load_from="bucketName", required=True) 53 | creation_date = fields.Str(dump_to="creationDate", load_from="creationDate", required=True) 54 | 55 | 56 | class S3PollingEventDetail(HistoricalPollingEventDetail): 57 | """Schema that provides the required fields for mimicking the CloudWatch Event for Polling.""" 58 | 59 | request_parameters = fields.Nested(S3PollingRequestParamsModel, dump_to="requestParameters", 60 | load_from="requestParameters", required=True) 61 | event_source = fields.Str(load_only=True, load_from="eventSource", required=True) 62 | event_name = fields.Str(load_only=True, load_from="eventName", required=True) 63 | 64 | @post_dump 65 | def add_required_s3_polling_data(self, data): 66 | """Adds the required data to the JSON. 
67 | 68 | :param data: 69 | :return: 70 | """ 71 | data["eventSource"] = "historical.s3.poller" 72 | data["eventName"] = "PollS3" 73 | 74 | return data 75 | 76 | 77 | class S3PollingEventModel(HistoricalPollingBaseModel): 78 | """This is the Marshmallow schema for a Polling event. This is made to look like a CloudWatch Event.""" 79 | 80 | detail = fields.Nested(S3PollingEventDetail, required=True) 81 | version = fields.Str(load_only=True, required=True) 82 | 83 | @post_dump() 84 | def dump_s3_polling_event_data(self, data): 85 | """Adds the required data to the JSON. 86 | 87 | :param data: 88 | :return: 89 | """ 90 | data["version"] = "1" 91 | 92 | return data 93 | 94 | def serialize_me(self, account, bucket_details): 95 | """Serializes the JSON for the Polling Event Model. 96 | 97 | :param account: 98 | :param bucket_details: 99 | :return: 100 | """ 101 | return self.dumps({ 102 | "account": account, 103 | "detail": { 104 | "request_parameters": { 105 | "bucket_name": bucket_details["Name"], 106 | "creation_date": bucket_details["CreationDate"].replace( 107 | tzinfo=None, microsecond=0).isoformat() + "Z" 108 | } 109 | } 110 | }).data 111 | 112 | 113 | S3_POLLING_SCHEMA = S3PollingEventModel(strict=True) 114 | -------------------------------------------------------------------------------- /mkdocs/docs/installation/index.md: -------------------------------------------------------------------------------- 1 | # Installation & Configuration 2 | **Note: Some assembly is required.** 3 | 4 | There are many components that make up Historical. Included is a Docker container that you can use to run Terraform for installation. 5 | 6 | Please review each section below in order to ensure that all aspects of the installation go smoothly. This is important because there are _many_ components that have to be configured correctly for Historical to operate properly. 7 | 8 | ## Architecture 9 | Before reading this installation guide, please become familiar with the Historical architecture. This will assist you in making the proper configuration for Historical. [You can review that here](../architecture.md). 10 | 11 | ## Prerequisites 12 | Historical requires the following prerequisites: 13 | 14 | 1. An AWS account that is dedicated for Historical (this is highly recommended). 15 | 1. CloudTrail must be enabled for **ALL** accounts and **ALL** regions. 16 | 1. CloudWatch Event Buses must be configured to route **ALL** CloudWatch Events to the Historical account. [Please review and follow the AWS documentation for sending and receiving events between AWS accounts before continuing](https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/CloudWatchEvents-CrossAccountEventDelivery.html). 17 | - This diagram outlines how CloudWatch Event Buses should be configured: 18 | 19 | 1. You will need to create IAM roles in all the accounts to monitor first. This requires your own orchestration to complete. See the IAM section below for details. 20 | 1. Historical makes use of [SWAG](https://github.com/Netflix-Skunkworks/swag-client) to define which AWS accounts Historical is enabled for. SWAG must be properly configured for Historical to operate. Alternatively, you can specify the AWS Account IDs that Historical will examine via an environment variable. However, it is _highly recommended_ that you make use of SWAG. 21 | 22 | ## IAM Setup 23 | Please review the [IAM Role setup guide here](iam.md) for instructions. 
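To make the role relationship concrete, here is a minimal sketch of the cross-account hop that the IAM guide describes: the Lambda functions (running as `HistoricalLambdaProfile`) assume the read-only `Historical` role in a monitored account and then describe resources there. The account ID, region, and session name below are placeholders, and Historical itself performs this step through CloudAux helpers (for example `describe_vpcs(account_number=..., assume_role=HISTORICAL_ROLE, region=...)`) rather than raw boto3:

    import boto3

    MONITORED_ACCOUNT_ID = '111111111111'  # placeholder -- one of the accounts Historical inventories

    # Assume the read-only 'Historical' role in the monitored account:
    sts = boto3.client('sts')
    creds = sts.assume_role(
        RoleArn=f'arn:aws:iam::{MONITORED_ACCOUNT_ID}:role/Historical',
        RoleSessionName='historical-example',  # placeholder session name
    )['Credentials']

    # Use the temporary credentials to describe resources in that account:
    ec2 = boto3.client(
        'ec2',
        region_name='us-west-2',  # placeholder region
        aws_access_key_id=creds['AccessKeyId'],
        aws_secret_access_key=creds['SecretAccessKey'],
        aws_session_token=creds['SessionToken'],
    )
    print(len(ec2.describe_security_groups()['SecurityGroups']))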
24 | 25 | ## Terraform 26 | A set of **sample** [Terraform](https://terraform.io) templates are included to assist with the roll-out of the infrastructure. This is intended to be run within a Docker container (code also included). The Docker container will package the Historical Lambda code and run the Terraform templates to provision all of the infrastructure. 27 | 28 | This is used for both installation and uninstallation. [Please review the documentation in detail here](terraform.md). 29 | 30 | ## Configuration and Environment Variables 31 | **IMPORTANT:** There are many environment variables and configuration details that are required to be set. [Please review this page for details on this](configuration.md). 32 | 33 | ## Prepare Docker Container 34 | Once you have made the necessary changes to your Terraform configuration files, you will need to build the Docker container. 35 | 36 | 1. Please [install Docker](https://www.docker.com/get-started) if you haven't already. 37 | 1. Navigate to the `historical/terraform` directory. 38 | 1. In a terminal, run `docker build . -t historical_installer` 39 | 40 | At this point, you now have a Docker container with all the required components to deploy Historical. _If you need to make any adjustments, you will need to re-build your container._ 41 | 42 | ## Installation 43 | Terraform requires broad permissions. You will need a highly privileged AWS administrative role to run the Docker container. 44 | 45 | 1. Get credentials from an IAM role with administrative permissions. 46 | 1. Copy `terraform/SAMPLE-env.list` to `terraform/env.list` 47 | 1. Open `terraform/env.list`, and fill in the values. ALL values must be supplied and correct. See the [configuration documentation](configuration.md#docker-installer-specific-fields) for reference. 48 | 1. In a terminal, navigate to `terraform/` 49 | 1. Run Docker! `docker run --env-file ./env.list -t historical_installer` 50 | 51 | Hopefully this works! 52 | 53 | ## Uninstallation 54 | As with installation, you will need a highly privileged AWS administrative role to run the Docker container. 55 | 56 | 1. Get credentials from an IAM role with administrative permissions. 57 | 1. Use the `terraform/env.list` values used for installation. 58 | 1. In a terminal, navigate to `terraform/` 59 | 1. Run Docker! `docker run --env-file ./env.list --entrypoint /installer/teardown_historical.sh -t historical_installer` 60 | 61 | This *might* fail the first time it runs. This is because Terraform doesn't wait long enough for all the resources to be deleted in the primary region. Try running it again if it fails the first time. 62 | 63 | If it's still failing, you may need to find the resources that are failing to delete and manually delete them. 64 | 65 | Please note: Depending on how active the Lambda functions are, the CloudWatch Logs log groups may still be present after stack deletion. You will need to manually delete these in each of the primary and secondary regions. 66 | 67 | Hopefully this works well for you! 68 | 69 | ## Troubleshooting 70 | Please review the [Troubleshooting](../troubleshooting.md) doc if you are experiencing issues. 71 | -------------------------------------------------------------------------------- /mkdocs/docs/installation/iam.md: -------------------------------------------------------------------------------- 1 | # Historical IAM Role Setup Guide 2 | 3 | IAM roles need to be configured for Historical to properly inventory all of your accounts.
The following must be created: 4 | 5 | 1. The `HistoricalLambdaProfile` role which is used to launch the Historical Lambda functions. 6 | 1. The `Historical` role which the `HistoricalLambdaProfile` will assume to describe and collect details from the account in question. 7 | 8 | The architecture for this looks like this: 9 | 10 | 11 | ## Instructions 12 | 13 | ### Lambda Role 14 | 15 | 1. In the Historical account, create the `HistoricalLambdaProfile` IAM Role. This role needs to permit the `lambda.amazonaws.com` Service Principal access to it. Here is an example: 16 | 17 | *Trust Policy*: 18 | 19 | { 20 | "Version": "2012-10-17", 21 | "Statement": [ 22 | { 23 | "Effect": "Allow", 24 | "Principal": { 25 | "Service": "lambda.amazonaws.com" 26 | }, 27 | "Action": "sts:AssumeRole" 28 | } 29 | ] 30 | } 31 | 32 | 1. This role is being executed by AWS Lambda and requires the `AWSLambdaBasicExecutionRole` _AWS managed policy_ attached to it. This managed policy gives the Lambda access to write to CloudWatch Logs. VPC permissions are not required because Historical does not make use of ENIs or Security Groups. 33 | 34 | 1. The role then needs a set of _Inline Policies_ to grant it access to the resources required for the Lambda function to access the Historical resources. Please make a new Inline Policy named `HistoricalLambdaPerms` as follows (substitute `HISTORICAL-ACCOUNT-NUMBER-HERE` with the AWS account ID of the Historical account): 35 | 36 | { 37 | "Version": "2012-10-17", 38 | "Statement": [ 39 | { 40 | "Sid": "SQS", 41 | "Effect": "Allow", 42 | "Action": [ 43 | "sqs:DeleteMessage", 44 | "sqs:GetQueueAttributes", 45 | "sqs:GetQueueUrl", 46 | "sqs:ReceiveMessage", 47 | "sqs:SendMessage" 48 | ], 49 | "Resource": "arn:aws:sqs:*:HISTORICAL-ACCOUNT-NUMBER-HERE:Historical*" 50 | }, 51 | { 52 | "Sid": "SNS", 53 | "Effect": "Allow", 54 | "Action": "sns:Publish", 55 | "Resource": "arn:aws:sns:*:HISTORICAL-ACCOUNT-NUMBER-HERE:Historical*" 56 | }, 57 | { 58 | "Sid": "STS", 59 | "Effect": "Allow", 60 | "Action": "sts:AssumeRole", 61 | "Resource": "arn:aws:iam::*:role/Historical" 62 | }, 63 | { 64 | "Sid": "DynamoDB", 65 | "Effect": "Allow", 66 | "Action": [ 67 | "dynamodb:BatchGetItem", 68 | "dynamodb:BatchWriteItem", 69 | "dynamodb:DeleteItem", 70 | "dynamodb:DescribeStream", 71 | "dynamodb:DescribeTable", 72 | "dynamodb:GetItem", 73 | "dynamodb:GetRecords", 74 | "dynamodb:GetShardIterator", 75 | "dynamodb:ListStreams", 76 | "dynamodb:PutItem", 77 | "dynamodb:Query", 78 | "dynamodb:Scan", 79 | "dynamodb:UpdateItem" 80 | ], 81 | "Resource": "arn:aws:dynamodb:*:HISTORICAL-ACCOUNT-NUMBER-HERE:table/Historical*" 82 | } 83 | ] 84 | } 85 | 86 | 87 | ### Destination Account Roles 88 | 89 | You will mostly likely need your own orchestration to roll this out. This will need to be rolled out to ALL accounts that you are inventorying with Historical. 90 | 91 | The role is named `Historical` and has the following configuration details: 92 | 93 | 1. Trust Policy (substitute `HISTORICAL-ACCOUNT-NUMBER-HERE` with the AWS account ID of the Historical account): 94 | 95 | { 96 | "Version": "2012-10-17", 97 | "Statement": [ 98 | { 99 | "Effect": "Allow", 100 | "Principal": { 101 | "AWS": "arn:aws:iam::HISTORICAL-ACCOUNT-NUMBER-HERE:role/HistoricalLambdaProfile" 102 | }, 103 | "Action": "sts:AssumeRole", 104 | "Condition": {} 105 | } 106 | ] 107 | } 108 | 109 | 1. The `Historical` role needs read access to your resources. Simply attach the `ReadOnlyAccess` _AWS managed policy_ to the role and that is all. 
110 | 111 | 1. Duplicate this role to all of your accounts via your own orchestration and automation. 112 | 113 | ## Next Steps 114 | [Please return to the Installation documentation](../). 115 | -------------------------------------------------------------------------------- /historical/security_group/poller.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.security_group.poller 3 | :platform: Unix 4 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | .. author:: Mike Grima 8 | """ 9 | import os 10 | import logging 11 | 12 | from botocore.exceptions import ClientError 13 | 14 | from raven_python_lambda import RavenLambdaWrapper 15 | from cloudaux.aws.ec2 import describe_security_groups 16 | 17 | from historical.common.sqs import get_queue_url, produce_events 18 | from historical.common.util import deserialize_records 19 | from historical.constants import HISTORICAL_ROLE, LOGGING_LEVEL, POLL_REGIONS, RANDOMIZE_POLLER 20 | from historical.models import HistoricalPollerTaskEventModel 21 | from historical.security_group.models import SECURITY_GROUP_POLLING_SCHEMA 22 | from historical.common.accounts import get_historical_accounts 23 | 24 | logging.basicConfig() 25 | LOG = logging.getLogger("historical") 26 | LOG.setLevel(LOGGING_LEVEL) 27 | 28 | 29 | @RavenLambdaWrapper() 30 | def poller_tasker_handler(event, context): # pylint: disable=W0613 31 | """ 32 | Historical Security Group Poller Tasker. 33 | 34 | The Poller is run at a set interval in order to ensure that changes do not go undetected by Historical. 35 | 36 | Historical pollers generate `polling events` which simulate changes. These polling events contain configuration 37 | data such as the account/region defining where the collector should attempt to gather data from. 38 | 39 | This is the entry point. This will task subsequent Poller lambdas to list all of a given resource in a select few 40 | AWS accounts. 41 | """ 42 | LOG.debug('[@] Running Poller Tasker...') 43 | 44 | queue_url = get_queue_url(os.environ.get('POLLER_TASKER_QUEUE_NAME', 'HistoricalSecurityGroupPollerTasker')) 45 | poller_task_schema = HistoricalPollerTaskEventModel() 46 | 47 | events = [] 48 | for account in get_historical_accounts(): 49 | for region in POLL_REGIONS: 50 | events.append(poller_task_schema.serialize_me(account['id'], region)) 51 | 52 | try: 53 | produce_events(events, queue_url, randomize_delay=RANDOMIZE_POLLER) 54 | except ClientError as exc: 55 | LOG.error(f'[X] Unable to generate poller tasker events! Reason: {exc}') 56 | 57 | LOG.debug('[@] Finished tasking the pollers.') 58 | 59 | 60 | @RavenLambdaWrapper() 61 | def poller_processor_handler(event, context): # pylint: disable=W0613 62 | """ 63 | Historical Security Group Poller Processor. 64 | 65 | This will receive events from the Poller Tasker, and will list all objects of a given technology for an 66 | account/region pair. This will generate `polling events` which simulate changes. These polling events contain 67 | configuration data such as the account/region defining where the collector should attempt to gather data from. 
68 | """ 69 | LOG.debug('[@] Running Poller...') 70 | 71 | collector_poller_queue_url = get_queue_url(os.environ.get('POLLER_QUEUE_NAME', 'HistoricalSecurityGroupPoller')) 72 | takser_queue_url = get_queue_url(os.environ.get('POLLER_TASKER_QUEUE_NAME', 'HistoricalSecurityGroupPollerTasker')) 73 | 74 | poller_task_schema = HistoricalPollerTaskEventModel() 75 | records = deserialize_records(event['Records']) 76 | 77 | for record in records: 78 | # Skip accounts that have role assumption errors: 79 | try: 80 | # Did we get a NextToken? 81 | if record.get('NextToken'): 82 | LOG.debug(f"[@] Received pagination token: {record['NextToken']}") 83 | groups = describe_security_groups( 84 | account_number=record['account_id'], 85 | assume_role=HISTORICAL_ROLE, 86 | region=record['region'], 87 | MaxResults=200, 88 | NextToken=record['NextToken'] 89 | ) 90 | else: 91 | groups = describe_security_groups( 92 | account_number=record['account_id'], 93 | assume_role=HISTORICAL_ROLE, 94 | region=record['region'], 95 | MaxResults=200 96 | ) 97 | 98 | # FIRST THINGS FIRST: Did we get a `NextToken`? If so, we need to enqueue that ASAP because 99 | # 'NextToken`s expire in 60 seconds! 100 | if groups.get('NextToken'): 101 | logging.debug(f"[-->] Pagination required {groups['NextToken']}. Tasking continuation.") 102 | produce_events( 103 | [poller_task_schema.serialize_me(record['account_id'], record['region'], 104 | next_token=groups['NextToken'])], 105 | takser_queue_url 106 | ) 107 | 108 | # Task the collector to perform all the DDB logic -- this will pass in the collected data to the 109 | # collector in very small batches. 110 | events = [SECURITY_GROUP_POLLING_SCHEMA.serialize(record['account_id'], g, record['region']) 111 | for g in groups['SecurityGroups']] 112 | produce_events(events, collector_poller_queue_url, batch_size=3) 113 | 114 | LOG.debug(f"[@] Finished generating polling events. Account: {record['account_id']}/{record['region']} " 115 | f"Events Created: {len(events)}") 116 | except ClientError as exc: 117 | LOG.error(f"[X] Unable to generate events for account/region. Account Id/Region: {record['account_id']}" 118 | f"/{record['region']} Reason: {exc}") 119 | -------------------------------------------------------------------------------- /mkdocs/docs/installation/terraform.md: -------------------------------------------------------------------------------- 1 | # Historical Terraform Setup 2 | 3 | A set of **sample** [Terraform](https://terraform.io) templates are included to assist with the roll-out of the infrastructure. This is intended to be run within a Docker container (code also included). The Docker container will: 4 | 5 | 1. Package the Historical Lambda code 6 | 1. Run the Terraform templates to provision all of the infrastructure 7 | 8 | This is all run within an [Amazon Linux](https://hub.docker.com/_/amazonlinux/) Docker container. Amazon Linux is required because Historical's dependencies make use of statically linked libraries, which will fail to run in the Lambda environment unless the binaries are built on Amazon Linux. 9 | 10 | You can also use this to uninstall Historical from your environment as well. 11 | 12 | **Please review each section below, as the details are very important:** 13 | 14 | ### Structure 15 | The Terraform templates are split into multiple components: 16 | 17 | 1. **Terraform Plugins** (located in terraform/terraform-plugins) 18 | 1. **DynamoDB** (located in terraform/dynamodb) 19 | 1. 
**Infrastructure** (located in terraform/infra) 20 | 21 | #### Terraform Backend Configuration 22 | We make the assumption that the Terraform backend is on S3. As such, you will need an S3 bucket that resides in the Historical AWS account. It is __highly recommended__ that you configure the Historical Terraform S3 bucket with versioning enabled. This is needed should there ever be an issue with the Terraform state. 23 | 24 | **NOTE:** For __ALL__ Terraform `main.tf` template files, at the top of the template file is a backend region configuration. It looks like this: 25 | 26 | terraform { 27 | backend "s3" { 28 | // Set this to where your Terraform S3 bucket is located (using us-west-2 as the example): 29 | region = "us-west-2" 30 | } 31 | } 32 | 33 | You will need to set the region to where your Terraform S3 bucket resides. In our examples, we are making use of `us-west-2`. 34 | 35 | #### Terraform Plugins 36 | This is a Terraform template that is executed in the Docker `build` step. This is done to pin the Terraform plugins to the Docker container so that they need not be re-downloaded later. It is important to keep the version numbers in this doc in sync with the rest of the templates. 37 | 38 | #### DynamoDB Templates 39 | This is used to construct the Global DynamoDB tables used by Historical. This is structured as follows: 40 | 41 | 1. `main.tf` - This is the main template with the components required to build out the Global DynamoDB tables for a given Historical stack. The sample included makes an **ASSUMPTION** that you will be utilizing `us-west-2` as your _PRIMARY REGION_, and `us-east-1` and `eu-west-1` as your _SECONDARY REGIONS_. 42 | - **You will need to modify this template accordingly to change the defaults set.** 43 | - This is used for ALL stacks. If you want to specify different primary and secondary regions for a given AWS resource type, then you will need to make your own modifications to the installation scripting to leverage different templates. 44 | 1. Per-resource type stack configurations. Included are details for S3 and Security Groups. There is a Terraform template for each resource type. This is where you can configure the read and write capacities for the tables. 45 | - **You will need to modify these templates accordingly to change the defaults set.** 46 | - By default the tables are configured with a read and write capacity of `100` units. Change this as necessary. 47 | 48 | When the installation scripts run, it copies over the resource type configuration to the same directory as the `main.tf` template. Terraform is then able to build out the infrastructure for a given resource type. 49 | 50 | #### Infrastructure 51 | This is organized similar to the DynamoDB templates. This must be executed _after_ the DynamoDB templates on installation and _before_ the DynamoDB templates on tear-down (for uninstallation should you need to tear down the stack). This is structured as follows: 52 | 53 | 1. `main.tf` - This is the main template with most of the infrastructure components identified. Very few (or no) changes need to be made here. 54 | - This is used for ALL stacks. 55 | 1. `off-regions.tf` - This outlines all of the off-region components that are required. This file has a duplicate of every region off-regions' components. Unfortunately, because Terraform lacks a great way to perform loops and iterations, we duplicate the configuration for each region. This makes the file very large and painful to edit. 
The sample included makes an **ASSUMPTION** that you will be utilizing `us-west-2` as your _PRIMARY REGION_, and `us-east-1` and `eu-west-1` as your _SECONDARY REGIONS_. Thus, all other regions are the off-regions in our sample. You will need to alter this should you want to change the regions for your deployment. 56 | - **You will need to modify this template accordingly to change the defaults set.** 57 | - This is used for ALL stacks. If you want to specify different primary, secondary, and off-regions for a given AWS resource type, then you will need to make your own modifications to the installation scripting to leverage different templates. 58 | 1. Per-resource type stack configurations. Included are details for S3 and Security Groups. There is a Terraform template for each resource type. This is where you need to configure a number of details. 59 | - _Most_ of the default values are fine and should not be changed. 60 | - You will need to set the `PRIMARY_REGION` and `POLLING_REGIONS` variables accordingly. With the exception of S3, the `POLLING_REGIONS` should include the primary and secondary regions in the list. 61 | - **You will need to review all of the variables and comments in the template to understand what they mean and how they should be set. If you change the defaults, you will need to make updates as necessary.** 62 | 63 | 64 | ## Configuration and Environment Variables 65 | **IMPORTANT:** There are many environment variables and configuration details that are required to be set. [Please review this page for details on this](configuration.md). 66 | 67 | 68 | ## Next Steps 69 | Once you have thoroughly reviewed this section, please return to the [installation documentation](../). 70 | -------------------------------------------------------------------------------- /terraform/infra/s3/s3.tf: -------------------------------------------------------------------------------- 1 | // Declare variables for the S3 Stack: 2 | variable "PRIMARY_REGION" { 3 | default = "us-west-2" // Change this for your infrastructure 4 | } 5 | 6 | // Define the regions to place the poller infrastructure here: 7 | variable "POLLING_REGIONS" { 8 | type = "list" 9 | 10 | default = ["us-west-2"] // Change this for your infrastructure 11 | } 12 | 13 | // Define the CloudWatch Event configuration: 14 | data "null_data_source" "cwe_config" { 15 | inputs = { 16 | off_regions_sns_name = "HistoricalS3CWEForwarder" 17 | 18 | rule_name = "HistoricalS3CloudWatchEventRule" 19 | rule_desc = "EventRule forwarding S3 Bucket changes." 20 | 21 | poller_rule_name = "HistoricalS3PollerEventRule" 22 | poller_rule_desc = "EventRule for Polling S3."
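    // The queue names below mirror the defaults that the Historical Lambda code looks up at runtime
    // (historical/s3/poller.py falls back to 'HistoricalS3PollerTasker' and 'HistoricalS3Poller' when the
    // POLLER_TASKER_QUEUE_NAME / POLLER_QUEUE_NAME environment variables are not set), so keep them in
    // sync with those variables if you rename the queues.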
23 | poller_rule_rate = "rate(6 hours)" 24 | 25 | sqs_poller_tasker_queue = "HistoricalS3PollerTasker" 26 | sqs_event_queue = "HistoricalS3Events" 27 | sqs_poller_collector_queue = "HistoricalS3Poller" 28 | differ_queue = "HistoricalS3Differ" 29 | 30 | rule_target_name = "HistoricalS3EventsToSQS" 31 | 32 | // Event Syntax: 33 | event_pattern = < 7 | """ 8 | import logging 9 | 10 | from botocore.exceptions import ClientError 11 | from pynamodb.exceptions import DeleteError 12 | 13 | from raven_python_lambda import RavenLambdaWrapper 14 | 15 | from cloudaux.aws.ec2 import describe_vpcs 16 | 17 | from historical.common.sqs import group_records_by_type 18 | from historical.constants import CURRENT_REGION, HISTORICAL_ROLE, LOGGING_LEVEL 19 | from historical.common import cloudwatch 20 | from historical.common.util import deserialize_records, pull_tag_dict 21 | from historical.vpc.models import CurrentVPCModel, VERSION 22 | 23 | logging.basicConfig() 24 | LOG = logging.getLogger('historical') 25 | LOG.setLevel(LOGGING_LEVEL) 26 | 27 | 28 | UPDATE_EVENTS = [ 29 | 'CreateVpc', 30 | 'ModifyVpcAttribute', 31 | 'PollVpc' 32 | ] 33 | 34 | DELETE_EVENTS = [ 35 | 'DeleteVpc' 36 | ] 37 | 38 | 39 | def get_arn(vpc_id, region, account_id): 40 | """Creates a vpc ARN.""" 41 | return f'arn:aws:ec2:{region}:{account_id}:vpc/{vpc_id}' 42 | 43 | 44 | def describe_vpc(record): 45 | """Attempts to describe vpc ids.""" 46 | account_id = record['account'] 47 | vpc_name = cloudwatch.filter_request_parameters('vpcName', record) 48 | vpc_id = cloudwatch.filter_request_parameters('vpcId', record) 49 | 50 | try: 51 | if vpc_id and vpc_name: # pylint: disable=R1705 52 | return describe_vpcs( 53 | account_number=account_id, 54 | assume_role=HISTORICAL_ROLE, 55 | region=CURRENT_REGION, 56 | Filters=[ 57 | { 58 | 'Name': 'vpc-id', 59 | 'Values': [vpc_id] 60 | } 61 | ] 62 | ) 63 | elif vpc_id: 64 | return describe_vpcs( 65 | account_number=account_id, 66 | assume_role=HISTORICAL_ROLE, 67 | region=CURRENT_REGION, 68 | VpcIds=[vpc_id] 69 | ) 70 | else: 71 | raise Exception('[X] Describe requires VpcId.') 72 | except ClientError as exc: 73 | if exc.response['Error']['Code'] == 'InvalidVpc.NotFound': 74 | return [] 75 | raise exc 76 | 77 | 78 | def create_delete_model(record): 79 | """Create a vpc model from a record.""" 80 | data = cloudwatch.get_historical_base_info(record) 81 | 82 | vpc_id = cloudwatch.filter_request_parameters('vpcId', record) 83 | 84 | arn = get_arn(vpc_id, cloudwatch.get_region(record), record['account']) 85 | 86 | LOG.debug(F'[-] Deleting Dynamodb Records. Hash Key: {arn}') 87 | 88 | # tombstone these records so that the deletion event time can be accurately tracked. 89 | data.update({ 90 | 'configuration': {} 91 | }) 92 | 93 | items = list(CurrentVPCModel.query(arn, limit=1)) 94 | 95 | if items: 96 | model_dict = items[0].__dict__['attribute_values'].copy() 97 | model_dict.update(data) 98 | model = CurrentVPCModel(**model_dict) 99 | model.save() 100 | return model 101 | 102 | return None 103 | 104 | 105 | def capture_delete_records(records): 106 | """Writes all of our delete events to DynamoDB.""" 107 | for record in records: 108 | model = create_delete_model(record) 109 | if model: 110 | try: 111 | model.delete(condition=(CurrentVPCModel.eventTime <= record['detail']['eventTime'])) 112 | except DeleteError: 113 | LOG.warning(f'[?] Unable to delete VPC. VPC does not exist. Record: {record}') 114 | else: 115 | LOG.warning(f'[?] Unable to delete VPC. VPC does not exist. 
Record: {record}') 116 | 117 | 118 | def get_vpc_name(vpc): 119 | """Fetches VPC Name (as tag) from VPC.""" 120 | for tag in vpc.get('Tags', []): 121 | if tag['Key'].lower() == 'name': 122 | return tag['Value'] 123 | 124 | return None 125 | 126 | 127 | def capture_update_records(records): 128 | """Writes all updated configuration info to DynamoDB""" 129 | for record in records: 130 | data = cloudwatch.get_historical_base_info(record) 131 | vpc = describe_vpc(record) 132 | 133 | if len(vpc) > 1: 134 | raise Exception(f'[X] Multiple vpcs found. Record: {record}') 135 | 136 | if not vpc: 137 | LOG.warning(f'[?] No vpc information found. Record: {record}') 138 | continue 139 | 140 | vpc = vpc[0] 141 | 142 | # determine event data for vpc 143 | LOG.debug(f'Processing vpc. VPC: {vpc}') 144 | data.update({ 145 | 'VpcId': vpc.get('VpcId'), 146 | 'arn': get_arn(vpc['VpcId'], cloudwatch.get_region(record), data['accountId']), 147 | 'configuration': vpc, 148 | 'State': vpc.get('State'), 149 | 'IsDefault': vpc.get('IsDefault'), 150 | 'CidrBlock': vpc.get('CidrBlock'), 151 | 'Name': get_vpc_name(vpc), 152 | 'Region': cloudwatch.get_region(record), 153 | 'version': VERSION 154 | }) 155 | 156 | data['Tags'] = pull_tag_dict(vpc) 157 | 158 | LOG.debug(f'[+] Writing DynamoDB Record. Records: {data}') 159 | 160 | current_revision = CurrentVPCModel(**data) 161 | current_revision.save() 162 | 163 | 164 | @RavenLambdaWrapper() 165 | def handler(event, context): # pylint: disable=W0613 166 | """ 167 | Historical vpc event collector. 168 | This collector is responsible for processing Cloudwatch events and polling events. 169 | """ 170 | records = deserialize_records(event['Records']) 171 | 172 | # Split records into two groups, update and delete. 173 | # We don't want to query for deleted records. 
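    # Anything whose eventName is in UPDATE_EVENTS is re-described below; everything else is treated as a deletion.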
174 | update_records, delete_records = group_records_by_type(records, UPDATE_EVENTS) 175 | capture_delete_records(delete_records) 176 | 177 | # filter out error events 178 | update_records = [e for e in update_records if not e['detail'].get('errorCode')] # pylint: disable=C0103 179 | 180 | # group records by account for more efficient processing 181 | LOG.debug(f'[@] Update Records: {records}') 182 | 183 | capture_update_records(update_records) 184 | -------------------------------------------------------------------------------- /terraform/dynamodb/main.tf: -------------------------------------------------------------------------------- 1 | terraform { 2 | backend "s3" { 3 | // Set this to where your Terraform S3 bucket is located (using us-west-2 as the example): 4 | region = "us-west-2" 5 | } 6 | } 7 | // ---------------------------- 8 | 9 | // ---------------------------- 10 | // Set up AWS for the primary region (this one is the main account where most API calls will be based from): 11 | provider "aws" { 12 | version = "1.39" 13 | 14 | // Set the region to where you need it: us-west-2 is the example: 15 | "region" = "us-west-2" 16 | } 17 | 18 | // Alias providers for the specifc tables: 19 | provider "aws" { 20 | version = "1.39" 21 | 22 | // This is the PRIMARY REGION ALIAS (us-west-2 is the example): 23 | "alias" = "us-west-2" 24 | "region" = "us-west-2" 25 | } 26 | 27 | // Create aliases for the SECONDARY REGIONS: 28 | provider "aws" { 29 | version = "1.39" 30 | 31 | "alias" = "us-east-1" 32 | "region" = "us-east-1" 33 | } 34 | 35 | provider "aws" { 36 | version = "1.39" 37 | 38 | "alias" = "eu-west-1" 39 | "region" = "eu-west-1" 40 | } 41 | 42 | // ------------ CURRENT TABLES ---------------- 43 | // Create the Current tables for all regions: 44 | resource "aws_dynamodb_table" "current_table_primary" { 45 | provider = "aws.us-west-2" // Set this to the alias pointed to for the PRIMARY REGION ALIAS 46 | 47 | name = "${var.CURRENT_TABLE}" 48 | read_capacity = "${var.CURRENT_TABLE_READ_CAP}" 49 | write_capacity = "${var.CURRENT_TABLE_WRITE_CAP}" 50 | hash_key = "arn" 51 | stream_enabled = true 52 | stream_view_type = "NEW_AND_OLD_IMAGES" 53 | 54 | attribute { 55 | name = "arn" 56 | type = "S" 57 | } 58 | 59 | ttl { 60 | attribute_name = "ttl" 61 | enabled = true 62 | } 63 | } 64 | 65 | // SET UP YOUR SECONDARY REGION TABLES HERE: 66 | resource "aws_dynamodb_table" "current_table_secondary_1" { 67 | provider = "aws.us-east-1" // Set this to the alias for your secondary table 68 | 69 | name = "${var.CURRENT_TABLE}" 70 | read_capacity = "${var.CURRENT_TABLE_READ_CAP}" 71 | write_capacity = "${var.CURRENT_TABLE_WRITE_CAP}" 72 | hash_key = "arn" 73 | stream_enabled = true 74 | stream_view_type = "NEW_AND_OLD_IMAGES" 75 | 76 | attribute { 77 | name = "arn" 78 | type = "S" 79 | } 80 | 81 | ttl { 82 | attribute_name = "ttl" 83 | enabled = true 84 | } 85 | } 86 | 87 | resource "aws_dynamodb_table" "current_table_secondary_2" { 88 | provider = "aws.eu-west-1" // Set this to the alias for your secondary table 89 | 90 | name = "${var.CURRENT_TABLE}" 91 | read_capacity = "${var.CURRENT_TABLE_READ_CAP}" 92 | write_capacity = "${var.CURRENT_TABLE_WRITE_CAP}" 93 | hash_key = "arn" 94 | stream_enabled = true 95 | stream_view_type = "NEW_AND_OLD_IMAGES" 96 | 97 | attribute { 98 | name = "arn" 99 | type = "S" 100 | } 101 | 102 | ttl { 103 | attribute_name = "ttl" 104 | enabled = true 105 | } 106 | } 107 | 108 | 109 | // GLOBAL DYNAMO TABLE: 110 | resource "aws_dynamodb_global_table" "current_table" { 
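  // This global table stitches the per-region Current tables above into a single replicated table.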
111 | // Set these to the proper tables above: 112 | depends_on = [ 113 | "aws_dynamodb_table.current_table_primary", 114 | "aws_dynamodb_table.current_table_secondary_1", 115 | "aws_dynamodb_table.current_table_secondary_2"] 116 | 117 | name = "${var.CURRENT_TABLE}" 118 | 119 | // Set the primary and secondary regions below: 120 | replica = { 121 | region_name = "us-west-2" 122 | } 123 | 124 | replica = { 125 | region_name = "us-east-1" 126 | } 127 | 128 | replica = { 129 | region_name = "eu-west-1" 130 | } 131 | } 132 | // ---------------------------- 133 | 134 | // ------------ DURABLE TABLES ---------------- 135 | 136 | // Create the Durable table: 137 | resource "aws_dynamodb_table" "durable_table_primary" { 138 | provider = "aws.us-west-2" // Set this to the alias pointed to for the PRIMARY REGION ALIAS 139 | 140 | name = "${var.DURABLE_TABLE}" 141 | read_capacity = "${var.DURABLE_TABLE_READ_CAP}" 142 | write_capacity = "${var.DURABLE_TABLE_WRITE_CAP}" 143 | hash_key = "arn" 144 | range_key = "eventTime" 145 | stream_enabled = true 146 | stream_view_type = "NEW_AND_OLD_IMAGES" 147 | 148 | attribute { 149 | name = "arn" 150 | type = "S" 151 | } 152 | 153 | attribute { 154 | name = "eventTime" 155 | type = "S" 156 | } 157 | } 158 | 159 | resource "aws_dynamodb_table" "durable_table_secondary_1" { 160 | provider = "aws.us-east-1" // Set this to the alias for your secondary table 161 | 162 | name = "${var.DURABLE_TABLE}" 163 | read_capacity = "${var.DURABLE_TABLE_READ_CAP}" 164 | write_capacity = "${var.DURABLE_TABLE_WRITE_CAP}" 165 | hash_key = "arn" 166 | range_key = "eventTime" 167 | stream_enabled = true 168 | stream_view_type = "NEW_AND_OLD_IMAGES" 169 | 170 | attribute { 171 | name = "arn" 172 | type = "S" 173 | } 174 | 175 | attribute { 176 | name = "eventTime" 177 | type = "S" 178 | } 179 | } 180 | 181 | resource "aws_dynamodb_table" "durable_table_secondary_2" { 182 | provider = "aws.eu-west-1" // Set this to the alias for your secondary table 183 | 184 | name = "${var.DURABLE_TABLE}" 185 | read_capacity = "${var.DURABLE_TABLE_READ_CAP}" 186 | write_capacity = "${var.DURABLE_TABLE_WRITE_CAP}" 187 | hash_key = "arn" 188 | range_key = "eventTime" 189 | stream_enabled = true 190 | stream_view_type = "NEW_AND_OLD_IMAGES" 191 | 192 | attribute { 193 | name = "arn" 194 | type = "S" 195 | } 196 | 197 | attribute { 198 | name = "eventTime" 199 | type = "S" 200 | } 201 | } 202 | 203 | // GLOBAL DYNAMO TABLE: 204 | resource "aws_dynamodb_global_table" "durable_table" { 205 | // Set these to the proper tables above: 206 | depends_on = [ 207 | "aws_dynamodb_table.durable_table_primary", 208 | "aws_dynamodb_table.durable_table_secondary_1", 209 | "aws_dynamodb_table.durable_table_secondary_2" 210 | ] 211 | 212 | name = "${var.DURABLE_TABLE}" 213 | 214 | // Set the primary and secondary regions below: 215 | replica = { 216 | region_name = "us-west-2" 217 | } 218 | 219 | replica = { 220 | region_name = "us-east-1" 221 | } 222 | 223 | replica = { 224 | region_name = "eu-west-1" 225 | } 226 | } 227 | // ----------------------------- 228 | -------------------------------------------------------------------------------- /mkdocs/docs/installation/configuration.md: -------------------------------------------------------------------------------- 1 | # Historical Environment Variables & Configuration 2 | 3 | Below is a reference of all of the environment variables that Historical makes use of, and the required/default status of them: 4 | 5 | Most of these variables are found in: 6 | 7 | - 
[`historical/constants.py`](https://github.com/Netflix-Skunkworks/historical/blob/master/historical/constants.py) 8 | - [`historical/mapping/__init__.py`](https://github.com/Netflix-Skunkworks/historical/blob/master/historical/mapping/__init__.py) 9 | 10 | **NOTE: All environment variables are Strings** 11 | 12 | ## Required Fields 13 | The fields below are required and **MUST** be configured by you in your Terraform templates: 14 | 15 | | Variable | Where to set | Sample Value | 16 | |:----------:|:-------------|:-------------| 17 | |`PRIMARY_REGION`|Per-stack Terraform template
`variable PRIMARY_REGION`|`us-west-2`| 18 | |`POLLING_REGIONS`|Per-stack Terraform template
`variable POLLING_REGIONS`|`["us-west-2", "us-east-1", "eu-west-1"]`
This should be set to the primary and secondary regions for most stacks.

S3 is the exception since it's a "global" namespace.
For S3, this is always set to the `PRIMARY_REGION`.

This populates the `POLL_REGIONS` env. var for the
Poller Lambdas.| 19 | |`REGION`|Infrastructure `main.tf`
This is a variable supplied
to Terraform when the
template is applied.|This value is used to determine if the current region
of the deployment is the primary region or a secondary region.| 20 | |`PROXY_REGIONS`|Per-stack Terraform template
`current_proxy_env_vars` and `durable_proxy_env_vars`|`us-east-1,eu-west-1,us-east-2,etc.`
This is a comma-separated string of regions.

The `current_proxy_env_vars` for the `PRIMARY_REGION` needs to be configured to contain the `PRIMARY_REGION` and all the "off-regions".

The `durable_proxy_env_vars` should contain ALL
the regions (default).| 21 | |`HISTORICAL_TECHNOLOGY`|Per-stack Terraform template
`durable_proxy_env_vars`|`s3` or `securitygroup`. This should be set in each sample stack properly.| 22 | |`SIMPLE_DURABLE_PROXY`|Per-stack Terraform template
`durable_proxy_env_vars`|`True` - This is the default value for the Durable Proxy.
Don't change this.

This value toggles whether the DynamoDB
stream events will be serialized nicely for downstream consumption or not.| 23 | |`ENABLED_ACCOUNTS`|Per-stack Terraform template
`env_vars`|`ACCOUNTID1,ACCOUNTID2,etc.`
If you are not making use of [SWAG](https://github.com/Netflix-Skunkworks/swag-client), then you need to set this.| 24 | |`SWAG_BUCKET`|Per-stack Terraform template
`env_vars`|`some-s3-bucket-name`
Required if you are making use of [SWAG](https://github.com/Netflix-Skunkworks/swag-client).| 25 | |`SWAG_DATA_FILE`|Per-stack Terraform template
`env_vars`|`v2/accounts.json`
Required if you are making use of [SWAG](https://github.com/Netflix-Skunkworks/swag-client).
Points to where the `accounts.json` file is located.| 26 | |`SWAG_OWNER`|Per-stack Terraform template
`env_vars`|`yourcompany`
Required if you are making use of [SWAG](https://github.com/Netflix-Skunkworks/swag-client).
The entity that owns the accounts you are monitoring.| 27 | |`SWAG_REGION`|Per-stack Terraform template
`env_vars`|`us-west-2`
Required if you are making use of [SWAG](https://github.com/Netflix-Skunkworks/swag-client).
The region where the `SWAG_BUCKET` is located.| 28 | 29 | ### Default Required Fields 30 | These are fields that are required, but the default values are sufficient. These are not set in the Terraform templates. 31 | 32 | | Variable | Description & Defaults | 33 | |:----------:|:-------------| 34 | |`CURRENT_REGION`|This is populated by the `AWS_DEFAULT_REGION` environment variable provided by Lambda. This will be set to the region that the Lambda function is running in.| 35 | |`TTL_EXPIRY`|Default: `86400` seconds. This is the TTL for an item in the Current Table. This is used to account for missing deletion events.| 36 | |`HISTORICAL_ROLE`|Default: `Historical`. Don't change this -- this is the name of the IAM role that Historical needs to assume to describe resources.| 37 | |`REGION_ATTR`|Default: `Region`. Don't change this -- this is the name of the region attribute in the DynamoDB table.| 38 | |`EVENT_TOO_BIG_FLAG`|Default: `event_too_big`. Don't change this -- this is a field name that informs Historical downstream functions if an event is too big to fit in SNS and SQS (>256KB).| 39 | 40 | ## Optional Fields 41 | 42 | | Variable | Where to set | Sample Value | 43 | |:----------:|:-------------|:-------------| 44 | |`RANDOMIZE_POLLER`|Per-stack Terraform template
`poller_env_vars`|0 <= value <= 900. Number of seconds to delay
Polling messages in SQS.

It is recommended you set this to `"900"` for the Poller.| 45 | |`LOGGING_LEVEL`|Per-stack Terraform template
`env_vars`|[Any one of these values](https://github.com/Netflix-Skunkworks/historical/blob/master/historical/constants.py#L13-L17). `DEBUG` is recommended.| 46 | |`TEST_ACCOUNTS_ONLY`|Per-stack Terraform template
`env_vars`|Default `False`. This is used if you are making use of [SWAG](https://github.com/Netflix-Skunkworks/swag-client).

Set this to `True` if you want your stack to _ONLY_ query
against "test" accounts. Useful for having
"test" and "prod" stacks.| 47 | |`PROXY_BATCH_SIZE`|Per-stack Terraform template
`current_proxy_env_vars`|Default: `10`. Set this if the batched event size is too
big (>256KB) to send to SQS. This should be refactored
in the future so that this is not necessary.| 48 | |`SENTRY_DSN`|Per-stack Terraform template
`env_vars`|If you make use of [Sentry](https://sentry.io/), then set this to your DSN.

Historical makes use of the [`raven-python-lambda`](https://github.com/Netflix-Skunkworks/raven-python-lambda) for Sentry.
You can also optionally use SQS as a transport layer for
Sentry messages via [`raven-sqs-proxy`](https://github.com/Netflix-Skunkworks/raven-sqs-proxy).| 49 | |Custom Tags|Per-stack Terraform template
`tags`|Add in a name-value pair of tags you want to affix
to your Lambda functions.| 50 | 51 | 52 | ## Docker Installer Specific Fields 53 | The fields below are specific for installation and uninstallation of Historical via the Docker container. These values are present in the [`terraform/SAMPLE-env.list`](https://github.com/Netflix-Skunkworks/historical/blob/master/terraform/SAMPLE-env.list) file. 54 | 55 | **ALL FIELDS BELOW ARE REQUIRED** 56 | 57 | | Variable | Sample Value | 58 | |:----------:|:-------------| 59 | |`AWS_ACCESS_KEY_ID`|The AWS Access Key ID for the credential that will be used to run Terraform. This is for a very powerful IAM Role.| 60 | |`AWS_SECRET_ACCESS_KEY`|The AWS Secret Access Key for the credential that will be used to run Terraform. This is for a very powerful IAM Role.| 61 | |`AWS_SESSION_TOKEN`|The AWS Session Token for the credential that will be used to run Terraform. This is for a very powerful IAM Role.| 62 | |`TECH`|The Historical resource type for the stack in question. Either `s3` or `securitygroup` (for now).| 63 | |`PRIMARY_REGION`|The Primary Region of your Historical Stack.| 64 | |`SECONDARY_REGIONS`|The Secondary Regions of your Historical Stack. This is a comma separated string.| 65 | 66 | 67 | ## Next Steps 68 | [Please return to the Installation documentation](../). 69 | -------------------------------------------------------------------------------- /historical/security_group/collector.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.security_group.collector 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | """ 8 | import logging 9 | 10 | from botocore.exceptions import ClientError 11 | from pynamodb.exceptions import DeleteError 12 | 13 | from raven_python_lambda import RavenLambdaWrapper 14 | 15 | from cloudaux.aws.ec2 import describe_security_groups 16 | 17 | from historical.common.sqs import group_records_by_type 18 | from historical.constants import HISTORICAL_ROLE, LOGGING_LEVEL 19 | from historical.common import cloudwatch 20 | from historical.common.util import deserialize_records, pull_tag_dict 21 | from historical.security_group.models import CurrentSecurityGroupModel, VERSION 22 | 23 | logging.basicConfig() 24 | LOG = logging.getLogger('historical') 25 | LOG.setLevel(LOGGING_LEVEL) 26 | 27 | 28 | UPDATE_EVENTS = [ 29 | 'AuthorizeSecurityGroupEgress', 30 | 'AuthorizeSecurityGroupIngress', 31 | 'RevokeSecurityGroupEgress', 32 | 'RevokeSecurityGroupIngress', 33 | 'CreateSecurityGroup', 34 | 'PollSecurityGroups' 35 | ] 36 | 37 | DELETE_EVENTS = [ 38 | 'DeleteSecurityGroup' 39 | ] 40 | 41 | 42 | def get_arn(group_id, region, account_id): 43 | """Creates a security group ARN.""" 44 | return f'arn:aws:ec2:{region}:{account_id}:security-group/{group_id}' 45 | 46 | 47 | def describe_group(record, region): 48 | """Attempts to describe group ids.""" 49 | account_id = record['account'] 50 | group_name = cloudwatch.filter_request_parameters('groupName', record) 51 | vpc_id = cloudwatch.filter_request_parameters('vpcId', record) 52 | group_id = cloudwatch.filter_request_parameters('groupId', record, look_in_response=True) 53 | 54 | # Did this get collected already by the poller? 
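    # If it did, the event's `detail.collected` field already carries the describe output, so no extra API call is needed.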
55 | if cloudwatch.get_collected_details(record): 56 | LOG.debug(f"[<--] Received already collected security group data: {record['detail']['collected']}") 57 | return [record['detail']['collected']] 58 | 59 | try: 60 | # Always depend on Group ID first: 61 | if group_id: # pylint: disable=R1705 62 | return describe_security_groups( 63 | account_number=account_id, 64 | assume_role=HISTORICAL_ROLE, 65 | region=region, 66 | GroupIds=[group_id] 67 | )['SecurityGroups'] 68 | 69 | elif vpc_id and group_name: 70 | return describe_security_groups( 71 | account_number=account_id, 72 | assume_role=HISTORICAL_ROLE, 73 | region=region, 74 | Filters=[ 75 | { 76 | 'Name': 'group-name', 77 | 'Values': [group_name] 78 | }, 79 | { 80 | 'Name': 'vpc-id', 81 | 'Values': [vpc_id] 82 | } 83 | ] 84 | )['SecurityGroups'] 85 | 86 | else: 87 | raise Exception('[X] Did not receive Group ID or VPC/Group Name pairs. ' 88 | f'We got: ID: {group_id} VPC/Name: {vpc_id}/{group_name}.') 89 | except ClientError as exc: 90 | if exc.response['Error']['Code'] == 'InvalidGroup.NotFound': 91 | return [] 92 | raise exc 93 | 94 | 95 | def create_delete_model(record): 96 | """Create a security group model from a record.""" 97 | data = cloudwatch.get_historical_base_info(record) 98 | 99 | group_id = cloudwatch.filter_request_parameters('groupId', record) 100 | # vpc_id = cloudwatch.filter_request_parameters('vpcId', record) 101 | # group_name = cloudwatch.filter_request_parameters('groupName', record) 102 | 103 | arn = get_arn(group_id, cloudwatch.get_region(record), record['account']) 104 | 105 | LOG.debug(f'[-] Deleting Dynamodb Records. Hash Key: {arn}') 106 | 107 | # Tombstone these records so that the deletion event time can be accurately tracked. 108 | data.update({'configuration': {}}) 109 | 110 | items = list(CurrentSecurityGroupModel.query(arn, limit=1)) 111 | 112 | if items: 113 | model_dict = items[0].__dict__['attribute_values'].copy() 114 | model_dict.update(data) 115 | model = CurrentSecurityGroupModel(**model_dict) 116 | model.save() 117 | return model 118 | 119 | return None 120 | 121 | 122 | def capture_delete_records(records): 123 | """Writes all of our delete events to DynamoDB.""" 124 | for rec in records: 125 | model = create_delete_model(rec) 126 | if model: 127 | try: 128 | model.delete(condition=(CurrentSecurityGroupModel.eventTime <= rec['detail']['eventTime'])) 129 | except DeleteError: 130 | LOG.warning(f'[X] Unable to delete security group. Security group does not exist. Record: {rec}') 131 | else: 132 | LOG.warning(f'[?] Unable to delete security group. Security group does not exist. Record: {rec}') 133 | 134 | 135 | def capture_update_records(records): 136 | """Writes all updated configuration info to DynamoDB""" 137 | for rec in records: 138 | data = cloudwatch.get_historical_base_info(rec) 139 | group = describe_group(rec, cloudwatch.get_region(rec)) 140 | 141 | if len(group) > 1: 142 | raise Exception(f'[X] Multiple groups found. Record: {rec}') 143 | 144 | if not group: 145 | LOG.warning(f'[?] No group information found. Record: {rec}') 146 | continue 147 | 148 | group = group[0] 149 | 150 | # Determine event data for group - and pop off items that are going to the top-level: 151 | LOG.debug(f'Processing group. 
Group: {group}') 152 | data.update({ 153 | 'GroupId': group['GroupId'], 154 | 'GroupName': group.pop('GroupName'), 155 | 'VpcId': group.pop('VpcId', None), 156 | 'arn': get_arn(group.pop('GroupId'), cloudwatch.get_region(rec), group.pop('OwnerId')), 157 | 'Region': cloudwatch.get_region(rec) 158 | }) 159 | 160 | data['Tags'] = pull_tag_dict(group) 161 | 162 | # Set the remaining items to the configuration: 163 | data['configuration'] = group 164 | 165 | # Set the version: 166 | data['version'] = VERSION 167 | 168 | LOG.debug(f'[+] Writing Dynamodb Record. Records: {data}') 169 | current_revision = CurrentSecurityGroupModel(**data) 170 | current_revision.save() 171 | 172 | 173 | @RavenLambdaWrapper() 174 | def handler(event, context): # pylint: disable=W0613 175 | """ 176 | Historical security group event collector. 177 | This collector is responsible for processing Cloudwatch events and polling events. 178 | """ 179 | records = deserialize_records(event['Records']) 180 | 181 | # Split records into two groups, update and delete. 182 | # We don't want to query for deleted records. 183 | update_records, delete_records = group_records_by_type(records, UPDATE_EVENTS) 184 | capture_delete_records(delete_records) 185 | 186 | # filter out error events 187 | update_records = [e for e in update_records if not e['detail'].get('errorCode')] 188 | 189 | # group records by account for more efficient processing 190 | LOG.debug(f'[@] Update Records: {records}') 191 | 192 | capture_update_records(update_records) 193 | -------------------------------------------------------------------------------- /historical/models.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.models 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Kevin Glisson 7 | .. author:: Mike Grima 8 | """ 9 | import time 10 | from datetime import datetime 11 | 12 | from marshmallow import fields, Schema 13 | from pynamodb.models import Model 14 | from pynamodb.attributes import ListAttribute, MapAttribute, NumberAttribute, UnicodeAttribute 15 | 16 | from historical.attributes import EventTimeAttribute, fix_decimals, HistoricalDecimalAttribute 17 | from historical.constants import TTL_EXPIRY 18 | 19 | 20 | EPHEMERAL_PATHS = [] 21 | 22 | 23 | def default_ttl(): 24 | """Return the default TTL as an int.""" 25 | return int(time.time() + TTL_EXPIRY) 26 | 27 | 28 | def default_event_time(): 29 | """Get the current time and format it for the event time.""" 30 | return datetime.utcnow().replace(tzinfo=None, microsecond=0).isoformat() + 'Z' 31 | 32 | 33 | class BaseHistoricalModel(Model): 34 | """This is the base Historical DynamoDB model. All Historical PynamoDB models should subclass this.""" 35 | 36 | # pylint: disable=R1701 37 | def __iter__(self): 38 | """Properly serialize the PynamoDB object as a `dict` via this function. 39 | Helper for serializing into a typical `dict`. 
See: https://github.com/pynamodb/PynamoDB/issues/152 40 | """ 41 | for name, attr in self.get_attributes().items(): 42 | try: 43 | if isinstance(attr, MapAttribute): 44 | name, obj = name, getattr(self, name).as_dict() 45 | yield name, fix_decimals(obj) # Don't forget to remove the stupid decimals :/ 46 | elif isinstance(attr, NumberAttribute) or isinstance(attr, HistoricalDecimalAttribute): 47 | yield name, int(attr.serialize(getattr(self, name))) 48 | elif isinstance(attr, ListAttribute): 49 | name, obj = name, [el.as_dict() for el in getattr(self, name)] 50 | yield name, fix_decimals(obj) # Don't forget to remove the stupid decimals :/ 51 | else: 52 | yield name, attr.serialize(getattr(self, name)) 53 | 54 | # For Nulls: 55 | except AttributeError: 56 | yield name, None 57 | 58 | 59 | class DurableHistoricalModel(BaseHistoricalModel): 60 | """The base Historical Durable (Differ) Table model base class.""" 61 | 62 | eventTime = EventTimeAttribute(range_key=True, default=default_event_time) 63 | 64 | 65 | class CurrentHistoricalModel(BaseHistoricalModel): 66 | """The base Historical Current Table model base class.""" 67 | 68 | eventTime = EventTimeAttribute(default=default_event_time) 69 | ttl = NumberAttribute(default=default_ttl()) 70 | eventSource = UnicodeAttribute() 71 | 72 | 73 | class AWSHistoricalMixin(BaseHistoricalModel): 74 | """This is the main Historical event mixin. All the major required (and optional) fields are here.""" 75 | 76 | arn = UnicodeAttribute(hash_key=True) 77 | accountId = UnicodeAttribute() 78 | configuration = MapAttribute() 79 | Tags = MapAttribute() 80 | version = HistoricalDecimalAttribute() 81 | userIdentity = MapAttribute(null=True) 82 | principalId = UnicodeAttribute(null=True) 83 | userAgent = UnicodeAttribute(null=True) 84 | sourceIpAddress = UnicodeAttribute(null=True) 85 | requestParameters = MapAttribute(null=True) 86 | eventName = UnicodeAttribute(null=True) 87 | 88 | 89 | class HistoricalPollingEventDetail(Schema): 90 | """This is the Marshmallow schema for a Polling event. This is made to look like a CloudWatch Event.""" 91 | 92 | # You must replace these: 93 | event_source = fields.Str(dump_to='eventSource', load_from='eventSource', required=True) 94 | event_name = fields.Str(dump_to='eventName', load_from='eventName', required=True) 95 | request_parameters = fields.Dict(dump_to='requestParameters', load_from='requestParameters', required=True) 96 | 97 | # This field is for technologies that lack a "list" method. For those technologies, the tasked poller 98 | # will perform all the describes and embed the major configuration details into this field: 99 | collected = fields.Dict(dump_to='collected', load_from='collected', required=False) 100 | # ^^ The collector will then need to look for this and figure out how to save it to DDB. 101 | 102 | event_time = fields.Str(dump_to='eventTime', load_from='eventTime', required=True, 103 | default=default_event_time, missing=default_event_time) 104 | 105 | 106 | class HistoricalPollingBaseModel(Schema): 107 | """This is a Marshmallow schema that holds objects that were described in the Poller. 108 | 109 | Data here will be passed onto the Collector so that the Collector need not fetch new 110 | data from AWS. 
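    The shape mirrors a CloudWatch Event so that the Collector can treat polled and CloudTrail-driven events the same way.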
111 | """ 112 | 113 | version = fields.Str(required=True) 114 | account = fields.Str(required=True) 115 | 116 | detail_type = fields.Str(load_from='detail-type', dump_to='detail-type', required=True, 117 | missing='Poller', default='Poller') 118 | source = fields.Str(required=True, missing='historical', default='historical') 119 | time = fields.Str(required=True, default=default_event_time, missing=default_event_time) 120 | 121 | # You must replace this: 122 | detail = fields.Nested(HistoricalPollingEventDetail, required=True) 123 | 124 | 125 | class HistoricalPollerTaskEventModel(Schema): 126 | """This is a Marshmallow schema that will trigger the Poller to perform the List/Describe AWS API calls. 127 | 128 | This informs the Poller which account and region to list/describe against. If a next_token is specified, then it 129 | will properly list/describe from from that pagination marker. 130 | """ 131 | 132 | account_id = fields.Str(required=True) 133 | region = fields.Str(required=True) 134 | next_token = fields.Str(load_from='NextToken', dump_to='NextToken') 135 | 136 | def serialize_me(self, account_id, region, next_token=None): 137 | """Dumps the proper JSON for the schema. 138 | 139 | :param account_id: 140 | :param region: 141 | :param next_token: 142 | :return: 143 | """ 144 | payload = { 145 | 'account_id': account_id, 146 | 'region': region 147 | } 148 | 149 | if next_token: 150 | payload['next_token'] = next_token 151 | 152 | return self.dumps(payload).data 153 | 154 | 155 | class SimpleDurableSchema(Schema): 156 | """This is a Marshmallow schema that represents a simplified serialized dict of the Durable Proxy events. 157 | 158 | This is so that downstream consumers of Historical events need-not worry too much about DynamoDB. This is a 159 | fully-outlined dict of all the data for representing a given technology. This will specify if the object was 160 | too big for SNS/SQS delivery. 161 | """ 162 | 163 | arn = fields.Str(required=True) 164 | event_time = fields.Str(required=True, default=default_event_time) 165 | tech = fields.Str(required=True) 166 | event_too_big = fields.Boolean(required=False) 167 | item = fields.Dict(required=False) 168 | 169 | def serialize_me(self, arn, event_time, tech, item=None): 170 | """Dumps the proper JSON for the schema. If the event is too big, then don't include the item. 171 | 172 | :param arn: 173 | :param event_time: 174 | :param tech: 175 | :param item: 176 | :return: 177 | """ 178 | payload = { 179 | 'arn': arn, 180 | 'event_time': event_time, 181 | 'tech': tech 182 | } 183 | 184 | if item: 185 | payload['item'] = item 186 | 187 | else: 188 | payload['event_too_big'] = True 189 | 190 | return self.dumps(payload).data.replace('', '') 191 | -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/{{cookiecutter.technology_slug}}/collector.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: {{cookiecutter.technology_slug}}.collector 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. 
author:: {{cookiecutter.author}} <{{cookiecutter.email}}> 7 | """ 8 | import os 9 | import logging 10 | 11 | from pynamodb.exceptions import DeleteError 12 | 13 | from raven_python_lambda import RavenLambdaWrapper 14 | 15 | from historical.common import cloudwatch 16 | from historical.common.kinesis import deserialize_records 17 | from .models import Current{{cookiecutter.technology_slug | titlecase}}Model 18 | 19 | logging.basicConfig() 20 | log = logging.getLogger('historical') 21 | level = logging.getLevelName(os.environ.get('HISTORICAL_LOGGING_LEVEL', 'WARNING')) 22 | log.setLevel(level) 23 | 24 | 25 | # TODO update with your events 26 | UPDATE_EVENTS = [ 27 | 'HistoricalPoller' 28 | ] 29 | 30 | DELETE_EVENTS = [ 31 | 32 | ] 33 | 34 | 35 | def get_arn(id, account): 36 | """Gets arn for {{cookiecutter.technology_name}}""" 37 | # TODO make ARN for technology 38 | # Example:: 39 | # return 'arn:aws:ec2:{region}:{account_id}:security-group/{group_id}'.format( 40 | # group_id=group_id, 41 | # region=CURRENT_REGION, 42 | # account_id=account_id 43 | # ) 44 | return 45 | 46 | 47 | def group_records_by_type(records): 48 | """Break records into two lists; create/update events and delete events.""" 49 | update_records, delete_records = [], [] 50 | for r in records: 51 | if isinstance(r, str): 52 | break 53 | 54 | if r['detail']['eventName'] in UPDATE_EVENTS: 55 | update_records.append(r) 56 | else: 57 | delete_records.append(r) 58 | return update_records, delete_records 59 | 60 | 61 | def describe_technology(record): 62 | """Attempts to describe {{cookiecutter.technology_name}} ids.""" 63 | account_id = record['account'] 64 | 65 | # TODO describe the technology item 66 | # Example:: 67 | # group_name = cloudwatch.filter_request_parameters('groupName', record) 68 | # vpc_id = cloudwatch.filter_request_parameters('vpcId', record) 69 | # group_id = cloudwatch.filter_request_parameters('groupId', record) 70 | # 71 | # try: 72 | # if vpc_id and group_name: 73 | # return describe_security_groups( 74 | # account_number=account_id, 75 | # assume_role=HISTORICAL_ROLE, 76 | # region=CURRENT_REGION, 77 | # Filters=[ 78 | # { 79 | # 'Name': 'group-name', 80 | # 'Values': [group_name] 81 | # }, 82 | # { 83 | # 'Name': 'vpc-id', 84 | # 'Values': [vpc_id] 85 | # } 86 | # ] 87 | # )['SecurityGroups'] 88 | # elif group_id: 89 | # return describe_security_groups( 90 | # account_number=account_id, 91 | # assume_role=HISTORICAL_ROLE, 92 | # region=CURRENT_REGION, 93 | # GroupIds=[group_id] 94 | # )['SecurityGroups'] 95 | # else: 96 | # raise Exception('Describe requires a groupId or a groupName and VpcId.') 97 | # except ClientError as e: 98 | # if e.response['Error']['Code'] == 'InvalidGroup.NotFound': 99 | # return [] 100 | # raise e 101 | 102 | return 103 | 104 | 105 | def create_delete_model(record): 106 | """Create a {{cookiecutter.technology_name}} model from a record.""" 107 | data = cloudwatch.get_historical_base_info(record) 108 | 109 | # TODO get tech ID 110 | # Example:: 111 | # group_id = cloudwatch.filter_request_parameters('groupId', record) 112 | # vpc_id = cloudwatch.filter_request_parameters('vpcId', record) 113 | # group_name = cloudwatch.filter_request_parameters('groupName', record) 114 | 115 | tech_id = None 116 | arn = get_arn(tech_id, record['account']) 117 | 118 | log.debug('Deleting Dynamodb Records. Hash Key: {arn}'.format(arn=arn)) 119 | 120 | # tombstone these records so that the deletion event time can be accurately tracked. 
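    # An empty 'configuration' acts as the tombstone; the remaining attributes keep their last-known values.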
121 | data.update({ 122 | 'configuration': {} 123 | }) 124 | 125 | items = list(Current{{cookiecutter.technology_slug | titlecase}}Model.query(arn, limit=1)) 126 | 127 | if items: 128 | model_dict = items[0].__dict__['attribute_values'].copy() 129 | model_dict.update(data) 130 | model = Current{{cookiecutter.technology_slug | titlecase }}Model(**model_dict) 131 | model.save() 132 | return model 133 | 134 | 135 | def capture_delete_records(records): 136 | """Writes all of our delete events to DynamoDB.""" 137 | for r in records: 138 | model = create_delete_model(r) 139 | if model: 140 | try: 141 | model.delete(eventTime__le=r['detail']['eventTime']) 142 | except DeleteError as e: 143 | log.warning('Unable to delete {{cookiecutter.technology_name}}. {{cookiecutter.technology_name}} does not exist. Record: {record}'.format( 144 | record=r 145 | )) 146 | else: 147 | log.warning('Unable to delete {{cookiecutter.technology_name}}. {{cookiecutter.technology_name}} does not exist. Record: {record}'.format( 148 | record=r 149 | )) 150 | 151 | 152 | def capture_update_records(records): 153 | """Writes all updated configuration info to DynamoDB""" 154 | for record in records: 155 | data = cloudwatch.get_historical_base_info(record) 156 | items = describe_technology(record) 157 | 158 | if len(items) > 1: 159 | raise Exception('Multiple items found. Record: {record}'.format(record=record)) 160 | 161 | if not items: 162 | log.warning('No technology information found. Record: {record}'.format(record=record)) 163 | continue 164 | 165 | item = items[0] 166 | 167 | # determine event data for group 168 | log.debug('Processing item. Group: {}'.format(item)) 169 | 170 | # TODO update data 171 | # Example:: 172 | # data.update({ 173 | # 'GroupId': item['GroupId'], 174 | # 'GroupName': item['GroupName'], 175 | # 'Description': item['Description'], 176 | # 'VpcId': item.get('VpcId'), 177 | # 'Tags': item.get('Tags', []), 178 | # 'arn': get_arn(item['GroupId'], item['OwnerId']), 179 | # 'OwnerId': item['OwnerId'], 180 | # 'configuration': item, 181 | # 'Region': cloudwatch.get_region(record) 182 | # }) 183 | 184 | log.debug('Writing Dynamodb Record. Records: {record}'.format(record=data)) 185 | 186 | current_revision = Current{{cookiecutter.technology_slug | titlecase}}Model(**data) 187 | current_revision.save() 188 | 189 | 190 | @RavenLambdaWrapper() 191 | def handler(event, context): 192 | """ 193 | Historical {{cookiecutter.technology_name}} event collector. 194 | This collector is responsible for processing Cloudwatch events and polling events. 195 | """ 196 | records = deserialize_records(event['Records']) 197 | 198 | # Split records into two groups, update and delete. 199 | # We don't want to query for deleted records. 200 | update_records, delete_records = group_records_by_type(records) 201 | capture_delete_records(delete_records) 202 | 203 | # filter out error events 204 | update_records = [e for e in update_records if not e['detail'].get('errorCode')] 205 | 206 | # group records by account for more efficient processing 207 | log.debug('Update Records: {records}'.format(records=records)) 208 | 209 | capture_update_records(update_records) 210 | -------------------------------------------------------------------------------- /historical/s3/collector.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. 
module: historical.s3.collector 3 | :platform: Unix 4 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. author:: Mike Grima 7 | """ 8 | import logging 9 | from itertools import groupby 10 | 11 | from botocore.exceptions import ClientError 12 | from pynamodb.exceptions import PynamoDBConnectionError 13 | from raven_python_lambda import RavenLambdaWrapper 14 | from cloudaux.orchestration.aws.s3 import get_bucket 15 | 16 | from historical.common.sqs import group_records_by_type 17 | from historical.constants import CURRENT_REGION, HISTORICAL_ROLE, LOGGING_LEVEL 18 | from historical.common import cloudwatch 19 | from historical.common.util import deserialize_records 20 | from historical.s3.models import CurrentS3Model, VERSION 21 | 22 | logging.basicConfig() 23 | LOG = logging.getLogger('historical') 24 | LOG.setLevel(LOGGING_LEVEL) 25 | 26 | 27 | UPDATE_EVENTS = [ 28 | 'PollS3', # Polling event 29 | 'DeleteBucketCors', 30 | 'DeleteBucketLifecycle', 31 | 'DeleteBucketPolicy', 32 | 'DeleteBucketReplication', 33 | 'DeleteBucketTagging', 34 | 'DeleteBucketWebsite', 35 | 'CreateBucket', 36 | 'PutBucketAcl', 37 | 'PutBucketCors', 38 | 'PutBucketLifecycle', 39 | 'PutBucketPolicy', 40 | 'PutBucketLogging', 41 | 'PutBucketNotification', 42 | 'PutBucketReplication', 43 | 'PutBucketTagging', 44 | 'PutBucketRequestPayment', 45 | 'PutBucketVersioning', 46 | 'PutBucketWebsite' 47 | ] 48 | 49 | 50 | DELETE_EVENTS = [ 51 | 'DeleteBucket', 52 | ] 53 | 54 | 55 | def create_delete_model(record): 56 | """Create an S3 model from a record.""" 57 | arn = f"arn:aws:s3:::{cloudwatch.filter_request_parameters('bucketName', record)}" 58 | LOG.debug(f'[-] Deleting Dynamodb Records. Hash Key: {arn}') 59 | 60 | data = { 61 | 'arn': arn, 62 | 'principalId': cloudwatch.get_principal(record), 63 | 'userIdentity': cloudwatch.get_user_identity(record), 64 | 'accountId': record['account'], 65 | 'eventTime': record['detail']['eventTime'], 66 | 'BucketName': cloudwatch.filter_request_parameters('bucketName', record), 67 | 'Region': cloudwatch.get_region(record), 68 | 'Tags': {}, 69 | 'configuration': {}, 70 | 'eventSource': record['detail']['eventSource'], 71 | 'version': VERSION 72 | } 73 | 74 | return CurrentS3Model(**data) 75 | 76 | 77 | def process_delete_records(delete_records): 78 | """Process the requests for S3 bucket deletions""" 79 | for rec in delete_records: 80 | arn = f"arn:aws:s3:::{rec['detail']['requestParameters']['bucketName']}" 81 | 82 | # Need to check if the event is NEWER than the previous event in case 83 | # events are out of order. This could *possibly* happen if something 84 | # was deleted, and then quickly re-created. It could be *possible* for the 85 | # deletion event to arrive after the creation event. Thus, this will check 86 | # if the current event timestamp is newer and will only delete if the deletion 87 | # event is newer. 88 | try: 89 | LOG.debug(f'[-] Deleting bucket: {arn}') 90 | model = create_delete_model(rec) 91 | model.save(condition=(CurrentS3Model.eventTime <= rec['detail']['eventTime'])) 92 | model.delete() 93 | 94 | except PynamoDBConnectionError as pdce: 95 | LOG.warning(f"[?] Unable to delete bucket: {arn}. Either it doesn't exist, or this deletion event is stale " 96 | f"(arrived before a NEWER creation/update). 
The specific exception is: {pdce}") 97 | 98 | 99 | def process_update_records(update_records): 100 | """Process the requests for S3 bucket update requests""" 101 | events = sorted(update_records, key=lambda x: x['account']) 102 | 103 | # Group records by account for more efficient processing 104 | for account_id, events in groupby(events, lambda x: x['account']): 105 | events = list(events) 106 | 107 | # Grab the bucket names (de-dupe events): 108 | buckets = {} 109 | for event in events: 110 | # If the creation date is present, then use it: 111 | bucket_event = buckets.get(event['detail']['requestParameters']['bucketName'], { 112 | 'creationDate': event['detail']['requestParameters'].get('creationDate') 113 | }) 114 | bucket_event.update(event['detail']['requestParameters']) 115 | 116 | buckets[event['detail']['requestParameters']['bucketName']] = bucket_event 117 | buckets[event['detail']['requestParameters']['bucketName']]['eventDetails'] = event 118 | 119 | # Query AWS for current configuration 120 | for b_name, item in buckets.items(): 121 | LOG.debug(f'[~] Processing Create/Update for: {b_name}') 122 | # If the bucket does not exist, then simply drop the request -- 123 | # If this happens, there is likely a Delete event that has occurred and will be processed soon. 124 | try: 125 | bucket_details = get_bucket(b_name, 126 | account_number=account_id, 127 | include_created=(item.get('creationDate') is None), 128 | assume_role=HISTORICAL_ROLE, 129 | region=CURRENT_REGION) 130 | if bucket_details.get('Error'): 131 | LOG.error(f"[X] Unable to fetch details about bucket: {b_name}. " 132 | f"The error details are: {bucket_details['Error']}") 133 | continue 134 | 135 | except ClientError as cerr: 136 | if cerr.response['Error']['Code'] == 'NoSuchBucket': 137 | LOG.warning(f'[?] Received update request for bucket: {b_name} that does not ' 138 | 'currently exist. Skipping.') 139 | continue 140 | 141 | # Catch Access Denied exceptions as well: 142 | if cerr.response['Error']['Code'] == 'AccessDenied': 143 | LOG.error(f'[X] Unable to fetch details for S3 Bucket: {b_name} in {account_id}. Access is Denied. 
' 144 | 'Skipping...') 145 | continue 146 | raise Exception(cerr) 147 | 148 | # Pull out the fields we want: 149 | data = { 150 | 'arn': f'arn:aws:s3:::{b_name}', 151 | 'principalId': cloudwatch.get_principal(item['eventDetails']), 152 | 'userIdentity': cloudwatch.get_user_identity(item['eventDetails']), 153 | 'userAgent': item['eventDetails']['detail'].get('userAgent'), 154 | 'sourceIpAddress': item['eventDetails']['detail'].get('sourceIPAddress'), 155 | 'requestParameters': item['eventDetails']['detail'].get('requestParameters'), 156 | 'accountId': account_id, 157 | 'eventTime': item['eventDetails']['detail']['eventTime'], 158 | 'BucketName': b_name, 159 | 'Region': bucket_details.pop('Region'), 160 | # Duplicated in top level and configuration for secondary index 161 | 'Tags': bucket_details.pop('Tags', {}) or {}, 162 | 'eventSource': item['eventDetails']['detail']['eventSource'], 163 | 'eventName': item['eventDetails']['detail']['eventName'], 164 | 'version': VERSION 165 | } 166 | 167 | # Remove the fields we don't care about: 168 | del bucket_details['Arn'] 169 | del bucket_details['GrantReferences'] 170 | del bucket_details['_version'] 171 | del bucket_details['Name'] 172 | 173 | if not bucket_details.get('CreationDate'): 174 | bucket_details['CreationDate'] = item['creationDate'] 175 | 176 | data['configuration'] = bucket_details 177 | 178 | current_revision = CurrentS3Model(**data) 179 | current_revision.save() 180 | 181 | 182 | @RavenLambdaWrapper() 183 | def handler(event, context): # pylint: disable=W0613 184 | """ 185 | Historical S3 event collector. 186 | 187 | This collector is responsible for processing CloudWatch events and polling events. 188 | """ 189 | records = deserialize_records(event['Records']) 190 | 191 | # Split records into two groups, update and delete. 192 | # We don't want to query for deleted records. 193 | update_records, delete_records = group_records_by_type(records, UPDATE_EVENTS) 194 | 195 | LOG.debug('[@] Processing update records...') 196 | process_update_records(update_records) 197 | LOG.debug('[@] Completed processing of update records.') 198 | 199 | LOG.debug('[@] Processing delete records...') 200 | process_delete_records(delete_records) 201 | LOG.debug('[@] Completed processing of delete records.') 202 | 203 | LOG.debug('[@] Successfully updated current Historical table') 204 | -------------------------------------------------------------------------------- /historical/common/proxy.py: -------------------------------------------------------------------------------- 1 | """ 2 | .. module: historical.common.proxy 3 | :platform: Unix 4 | :copyright: (c) 2018 by Netflix Inc., see AUTHORS for more 5 | :license: Apache, see LICENSE for more details. 6 | .. 
author:: Mike Grima 7 | """ 8 | import logging 9 | import json 10 | import math 11 | import os 12 | import sys 13 | 14 | import boto3 15 | from retrying import retry 16 | 17 | from raven_python_lambda import RavenLambdaWrapper 18 | 19 | from historical.common.dynamodb import DESER, remove_global_dynamo_specific_fields 20 | from historical.common.exceptions import MissingProxyConfigurationException 21 | from historical.common.sqs import produce_events 22 | from historical.constants import CURRENT_REGION, EVENT_TOO_BIG_FLAG, PROXY_REGIONS, REGION_ATTR, SIMPLE_DURABLE_PROXY 23 | 24 | from historical.mapping import DURABLE_MAPPING, HISTORICAL_TECHNOLOGY 25 | 26 | LOG = logging.getLogger('historical') 27 | 28 | 29 | @retry(stop_max_attempt_number=4, wait_exponential_multiplier=1000, wait_exponential_max=1000) 30 | def _publish_sns_message(client, blob, topic_arn): 31 | client.publish(TopicArn=topic_arn, Message=blob) 32 | 33 | 34 | def shrink_blob(record, deletion): 35 | """ 36 | Makes a shrunken blob to be sent to SNS/SQS (due to the 256KB size limitations of SNS/SQS messages). 37 | This will essentially remove the "configuration" field such that the size of the SNS/SQS message remains under 38 | 256KB. 39 | :param record: 40 | :return: 41 | """ 42 | item = { 43 | "eventName": record["eventName"], 44 | EVENT_TOO_BIG_FLAG: (not deletion) 45 | } 46 | 47 | # To handle TTLs (if they happen) 48 | if record.get("userIdentity"): 49 | item["userIdentity"] = record["userIdentity"] 50 | 51 | # Remove the 'configuration' and 'requestParameters' fields from new and old images if applicable: 52 | if not deletion: 53 | # Only remove it from non-deletions: 54 | if record['dynamodb'].get('NewImage'): 55 | record['dynamodb']['NewImage'].pop('configuration', None) 56 | record['dynamodb']['NewImage'].pop('requestParameters', None) 57 | 58 | if record['dynamodb'].get('OldImage'): 59 | record['dynamodb']['OldImage'].pop('configuration', None) 60 | record['dynamodb']['OldImage'].pop('requestParameters', None) 61 | 62 | item['dynamodb'] = record['dynamodb'] 63 | 64 | return item 65 | 66 | 67 | @RavenLambdaWrapper() 68 | def handler(event, context): # pylint: disable=W0613 69 | """Historical S3 DynamoDB Stream Forwarder (the 'Proxy'). 70 | 71 | Passes events from the Historical DynamoDB stream and passes it to SNS or SQS for additional events to trigger. 72 | 73 | You can optionally use SNS or SQS. It is preferable to use SNS -> SQS, but in some cases, such as the Current stream 74 | to the Differ, this will make use of SQS to directly feed into the differ for performance purposes. 75 | """ 76 | queue_url = os.environ.get('PROXY_QUEUE_URL') 77 | topic_arn = os.environ.get('PROXY_TOPIC_ARN') 78 | 79 | if not queue_url and not topic_arn: 80 | raise MissingProxyConfigurationException('[X] Must set the `PROXY_QUEUE_URL` or the `PROXY_TOPIC_ARN` vars.') 81 | 82 | items_to_ship = [] 83 | 84 | # Must ALWAYS shrink for SQS because of 256KB limit of sending batched messages 85 | force_shrink = True if queue_url else False 86 | 87 | # Is this a "Simple Durable Proxy" -- that is -- are we stripping out all of the DynamoDB data from 88 | # the Differ? 
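    # If so, ship simplified (arn / event_time / tech / item) records; otherwise, forward the raw DynamoDB stream records.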
89 | record_maker = make_proper_simple_record if SIMPLE_DURABLE_PROXY else make_proper_dynamodb_record 90 | 91 | for record in event['Records']: 92 | # We should NOT be processing this if the item in question does not 93 | # reside in the PROXY_REGIONS 94 | correct_region = True 95 | for img in ['NewImage', 'OldImage']: 96 | if record['dynamodb'].get(img): 97 | if record['dynamodb'][img][REGION_ATTR]['S'] not in PROXY_REGIONS: 98 | LOG.debug(f"[/] Not processing record -- record event took place in:" 99 | f" {record['dynamodb'][img][REGION_ATTR]['S']}") 100 | correct_region = False 101 | break 102 | 103 | if not correct_region: 104 | continue 105 | 106 | # Global DynamoDB tables will update a record with the global table specific fields. This creates 2 events 107 | # whenever there is an update. The second update, which is a MODIFY event is not relevant and noise. This 108 | # needs to be skipped over to prevent duplicated events. This is a "gotcha" in Global DynamoDB tables. 109 | if detect_global_table_updates(record): 110 | continue 111 | 112 | items_to_ship.append(record_maker(record, force_shrink=force_shrink)) 113 | 114 | if items_to_ship: 115 | # SQS: 116 | if queue_url: 117 | produce_events(items_to_ship, queue_url, batch_size=int(os.environ.get('PROXY_BATCH_SIZE', 10))) 118 | 119 | # SNS: 120 | else: 121 | client = boto3.client("sns", region_name=CURRENT_REGION) 122 | for i in items_to_ship: 123 | _publish_sns_message(client, i, topic_arn) 124 | 125 | 126 | def detect_global_table_updates(record): 127 | """This will detect DDB Global Table updates that are not relevant to application data updates. These need to be 128 | skipped over as they are pure noise. 129 | 130 | :param record: 131 | :return: 132 | """ 133 | # This only affects MODIFY events. 134 | if record['eventName'] == 'MODIFY': 135 | # Need to compare the old and new images to check for GT specific changes only (just pop off the GT fields) 136 | old_image = remove_global_dynamo_specific_fields(record['dynamodb']['OldImage']) 137 | new_image = remove_global_dynamo_specific_fields(record['dynamodb']['NewImage']) 138 | 139 | if json.dumps(old_image, sort_keys=True) == json.dumps(new_image, sort_keys=True): 140 | return True 141 | 142 | return False 143 | 144 | 145 | def make_proper_dynamodb_record(record, force_shrink=False): 146 | """Prepares and ships an individual DynamoDB record over to SNS/SQS for future processing. 147 | 148 | :param record: 149 | :param force_shrink: 150 | :return: 151 | """ 152 | # Get the initial blob and determine if it is too big for SNS/SQS: 153 | blob = json.dumps(record) 154 | size = math.ceil(sys.getsizeof(blob) / 1024) 155 | 156 | # If it is too big, then we need to send over a smaller blob to inform the recipient that it needs to go out and 157 | # fetch the item from the Historical table! 158 | if size >= 200 or force_shrink: 159 | deletion = False 160 | # ^^ However -- deletions need to be handled differently, because the Differ won't be able to find a 161 | # deleted record. For deletions, we will only shrink the 'OldImage', but preserve the 'NewImage' since that is 162 | # "already" shrunken. 
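        # A deletion is recognized by the empty 'configuration' tombstone in the NewImage (written by the Collector).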
163 | if record['dynamodb'].get('NewImage'): 164 | # Config will be empty if there was a deletion: 165 | if not (record['dynamodb']['NewImage'].get('configuration', {}) or {}).get('M'): 166 | deletion = True 167 | 168 | blob = json.dumps(shrink_blob(record, deletion)) 169 | 170 | return blob 171 | 172 | 173 | def _get_durable_pynamo_obj(record_data, durable_model): 174 | image = remove_global_dynamo_specific_fields(record_data) 175 | data = {} 176 | 177 | for item, value in image.items(): 178 | # This could end up as loss of precision 179 | data[item] = DESER.deserialize(value) 180 | 181 | return durable_model(**data) 182 | 183 | 184 | def make_proper_simple_record(record, force_shrink=False): 185 | """Prepares and ships an individual simplified durable table record over to SNS/SQS for future processing. 186 | 187 | :param record: 188 | :param force_shrink: 189 | :return: 190 | """ 191 | # Convert to a simple object 192 | item = { 193 | 'arn': record['dynamodb']['Keys']['arn']['S'], 194 | 'event_time': record['dynamodb']['NewImage']['eventTime']['S'], 195 | 'tech': HISTORICAL_TECHNOLOGY 196 | } 197 | 198 | # We need to de-serialize the raw DynamoDB object into the proper PynamoDB obj: 199 | prepped_new_record = _get_durable_pynamo_obj(record['dynamodb']['NewImage'], 200 | DURABLE_MAPPING.get(HISTORICAL_TECHNOLOGY)) 201 | 202 | item['item'] = dict(prepped_new_record) 203 | 204 | # Get the initial blob and determine if it is too big for SNS/SQS: 205 | blob = json.dumps(item) 206 | size = math.ceil(sys.getsizeof(blob) / 1024) 207 | 208 | # If it is too big, then we need to send over a smaller blob to inform the recipient that it needs to go out and 209 | # fetch the item from the Historical table! 210 | if size >= 200 or force_shrink: 211 | del item['item'] 212 | 213 | item[EVENT_TOO_BIG_FLAG] = True 214 | 215 | blob = json.dumps(item) 216 | 217 | return blob.replace('', '') 218 | -------------------------------------------------------------------------------- /historical/historical-cookiecutter/historical_{{cookiecutter.technology_slug}}/serverless.yaml: -------------------------------------------------------------------------------- 1 | service: "historical-{{cookiecutter.technology_slug}}" 2 | 3 | provider: 4 | name: aws 5 | runtime: python3.6 6 | memorySize: 1024 7 | timeout: 300 8 | deploymentBucket: 9 | name: ${opt:region}-${self:custom.accountName}-{{cookiecutter.team}} 10 | 11 | custom: ${file(serverless_configs/${opt:stage}.yml)} 12 | 13 | functions: 14 | Collector: 15 | handler: historical.security_group.collector.handler 16 | description: Processes polling and cloudwatch events. 17 | tags: 18 | owner: {{cookiecutter.email}} 19 | role: arn:aws:iam::${self:custom.accountId}:role/HistoricalLambdaProfile 20 | events: 21 | - stream: 22 | type: kinesis 23 | arn: 24 | Fn::GetAtt: 25 | - Historical{{cookiecutter.technology_slug | titlecase}}Stream 26 | - Arn 27 | batchSize: 100 28 | startingPosition: LATEST 29 | 30 | - stream: 31 | type: kinesis 32 | arn: 33 | Fn::GetAtt: 34 | - Historical{{cookiecutter.technology_slug | titlecase}}PollerStream 35 | - Arn 36 | batchSize: 100 37 | startingPosition: LATEST 38 | environment: 39 | SENTRY_DSN: ${self:custom.sentryDSN} 40 | 41 | Poller: 42 | handler: historical.{{cookiecutter.technology_slug}}.poller.handler 43 | description: Scheduled event that describes {{cookiecutter.technology_name}}. 
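    # Invoked on a schedule by the PollerScheduledRule defined under resources below.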
44 | tags: 45 | owner: {{cookiecutter.email}} 46 | role: arn:aws:iam::${self:custom.accountId}:role/HistoricalLambdaProfile 47 | 48 | Differ: 49 | handler: historical.{{cookiecutter.technology_slug}}.differ.handler 50 | description: Stream based function that is resposible for finding differences. 51 | tags: 52 | owner: {{cookiecutter.email}} 53 | role: arn:aws:iam::${self:custom.accountId}:role/HistoricalLambdaProfile 54 | events: 55 | - stream: 56 | type: dynamodb 57 | arn: 58 | Fn::GetAtt: 59 | - Historical{{cookiecutter.technology_slug | titlecase}}CurrentTable 60 | - StreamArn 61 | resources: 62 | Resources: 63 | # The Kinesis Stream -- Where the events will go: 64 | Historical{{cookiecutter.technology_slug | titlecase}}Stream: 65 | Type: AWS::Kinesis::Stream 66 | Properties: 67 | Name: Historical{{cookiecutter.technology_slug | titlecase}}Stream 68 | ShardCount: 1 69 | 70 | # The Kinesis Polling Stream -- Where the polling events will go: 71 | Historical{{cookiecutter.technology_slug | titlecase}}PollerStream: 72 | Type: AWS::Kinesis::Stream 73 | Properties: 74 | Name: Historical{{cookiecutter.technology_slug | titlecase}}PollerStream 75 | ShardCount: 1 76 | 77 | # The events -- these will be placed on the Kinesis stream: 78 | CloudWatchEventRule: 79 | Type: AWS::Events::Rule 80 | DependsOn: 81 | - Historical{{cookiecutter.technology_slug | titlecase}}Stream 82 | Properties: 83 | Description: EventRule forwarding security group changes. 84 | EventPattern: 85 | source: 86 | - aws.ec2 87 | detail-type: 88 | - AWS API Call via CloudTrail 89 | detail: 90 | eventSource: 91 | - ec2.amazonaws.com 92 | eventName: 93 | # TODO Update with your events 94 | State: ENABLED 95 | Targets: 96 | - 97 | Arn: 98 | Fn::GetAtt: 99 | - Historical{{cookiecutter.technology_slug | titlecase}}Stream 100 | - Arn 101 | Id: EventStream 102 | RoleArn: arn:aws:iam::${self:custom.accountId}:role/service-role/AwsEventsInvokeKinesis 103 | 104 | # The "Current" DynamoDB table: 105 | Historical{{cookiecutter.technology_slug | titlecase}}CurrentTable: 106 | Type: AWS::DynamoDB::Table 107 | Properties: 108 | TableName: Historical{{cookiecutter.technology_slug | titlecase}}CurrentTable 109 | TimeToLiveSpecification: 110 | AttributeName: ttl 111 | Enabled: true 112 | AttributeDefinitions: 113 | - AttributeName: arn 114 | AttributeType: S 115 | KeySchema: 116 | - AttributeName: arn 117 | KeyType: HASH 118 | ProvisionedThroughput: 119 | ReadCapacityUnits: 100 120 | WriteCapacityUnits: 100 121 | StreamSpecification: 122 | StreamViewType: NEW_AND_OLD_IMAGES 123 | 124 | # The Durable (Historical) change DynamoDB table: 125 | Historical{{cookiecutter.technology_slug | titlecase}}DurableTable: 126 | Type: AWS::DynamoDB::Table 127 | Properties: 128 | TableName: Historical{{cookiecutter.technology_slug | titlecase}}DurableTable 129 | AttributeDefinitions: 130 | - AttributeName: arn 131 | AttributeType: S 132 | - AttributeName: eventTime 133 | AttributeType: S 134 | KeySchema: 135 | - AttributeName: arn 136 | KeyType: HASH 137 | - AttributeName: eventTime 138 | KeyType: RANGE 139 | ProvisionedThroughput: 140 | ReadCapacityUnits: 100 141 | WriteCapacityUnits: 100 142 | StreamSpecification: 143 | StreamViewType: NEW_AND_OLD_IMAGES 144 | 145 | # Lambdas 146 | CollectorLambdaFunction: 147 | Type: AWS::Lambda::Function 148 | DependsOn: 149 | - Historical{{cookiecutter.technology_slug | titlecase}}Stream 150 | 151 | DifferLambdaFunction: 152 | Type: AWS::Lambda::Function 153 | DependsOn: 154 | - Historical{{cookiecutter.technology_slug | 
titlecase}}CurrentTable 155 | 156 | PollerScheduledRule: 157 | Type: AWS::Events::Rule 158 | Properties: 159 | Description: ScheduledRule 160 | ScheduleExpression: rate(60 minutes) 161 | State: ENABLED 162 | Targets: 163 | - 164 | Arn: 165 | Fn::GetAtt: 166 | - PollerLambdaFunction 167 | - Arn 168 | Id: TargetFunctionV1 169 | 170 | PermissionForEventsToInvokeLambda: 171 | Type: AWS::Lambda::Permission 172 | Properties: 173 | FunctionName: 174 | Ref: PollerLambdaFunction 175 | Action: lambda:InvokeFunction 176 | Principal: events.amazonaws.com 177 | SourceArn: 178 | Fn::GetAtt: 179 | - PollerScheduledRule 180 | - Arn 181 | 182 | # Log group -- 1 for each function... 183 | CollectorLogGroup: 184 | Properties: 185 | RetentionInDays: "3" 186 | 187 | PollerLogGroup: 188 | Properties: 189 | RetentionInDays: "3" 190 | 191 | DifferLogGroup: 192 | Properties: 193 | RetentionInDays: "3" 194 | 195 | # Outputs -- for use in other dependent Historical deployments: 196 | Outputs: 197 | Historical{{cookiecutter.technology_slug | titlecase}}StreamArn: 198 | Description: Historical Security Group Event Kinesis Stream ARN 199 | Value: 200 | Fn::GetAtt: 201 | - Historical{{cookiecutter.technology_slug | titlecase}}Stream 202 | - Arn 203 | Export: 204 | Name: Historical{{cookiecutter.technology_slug | titlecase}}StreamArn 205 | 206 | Historical{{cookiecutter.technology_slug | titlecase}}PollerStreamArn: 207 | Description: Historical Security Group Poller Kinesis Stream ARN 208 | Value: 209 | Fn::GetAtt: 210 | - Historical{{cookiecutter.technology_slug | titlecase}}PollerStream 211 | - Arn 212 | Export: 213 | Name: Historical{{cookiecutter.technology_slug | titlecase}}PollerStreamArn 214 | 215 | Historical{{cookiecutter.technology_slug | titlecase}}CurrentTableArn: 216 | Description: Historical Security Group Current DynamoDB Table ARN 217 | Value: 218 | Fn::GetAtt: 219 | - Historical{{cookiecutter.technology_slug | titlecase}}CurrentTable 220 | - Arn 221 | Export: 222 | Name: Historical{{cookiecutter.technology_slug | titlecase}}CurrentTableArn 223 | 224 | Historical{{cookiecutter.technology_slug | titlecase}}CurrentTableStreamArn: 225 | Description: Historical Security Group Current DynamoDB Table Stream ARN 226 | Value: 227 | Fn::GetAtt: 228 | - Historical{{cookiecutter.technology_slug | titlecase}}CurrentTable 229 | - StreamArn 230 | Export: 231 | Name: Historical{{cookiecutter.technology_slug | titlecase}}CurrentTableStreamArn 232 | 233 | Historical{{cookiecutter.technology_slug | titlecase}}DurableTableArn: 234 | Description: Historical Security Group Durable DynamoDB Table ARN 235 | Value: 236 | Fn::GetAtt: 237 | - Historical{{cookiecutter.technology_slug | titlecase}}DurableTable 238 | - Arn 239 | Export: 240 | Name: Historical{{cookiecutter.technology_slug | titlecase}}DurableTableArn 241 | 242 | Historical{{cookiecutter.technology_slug | titlecase}}DurableTableStreamArn: 243 | Description: Historical Security Group Durable DynamoDB Table Stream ARN 244 | Value: 245 | Fn::GetAtt: 246 | - Historical{{cookiecutter.technology_slug | titlecase}}DurableTable 247 | - StreamArn 248 | Export: 249 | Name: Historical{{cookiecutter.technology_slug | titlecase}}DurableTableStreamArn 250 | 251 | plugins: 252 | - serverless-python-requirements 253 | - serverless-prune-plugin 254 | -------------------------------------------------------------------------------- /historical/tests/factories.py: -------------------------------------------------------------------------------- 1 | # pylint: 
disable=R0205,E1101,C0103,W0622,W0613 2 | """ 3 | .. module: historical.tests.factories 4 | :platform: Unix 5 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 6 | :license: Apache, see LICENSE for more details. 7 | .. author:: Kevin Glisson 8 | .. author:: Mike Grima 9 | """ 10 | import datetime 11 | 12 | from boto3.dynamodb.types import TypeSerializer 13 | from factory import SubFactory, Factory, post_generation # pylint: disable=E0401 14 | from factory.fuzzy import FuzzyDateTime, FuzzyText # pylint: disable=E0401 15 | import pytz # pylint: disable=E0401 16 | 17 | SERIA = TypeSerializer() 18 | 19 | 20 | def serialize(obj): 21 | """JSON serializer for objects not serializable by default json code""" 22 | 23 | if isinstance(obj, datetime.datetime): 24 | serial = obj.replace(microsecond=0).replace(tzinfo=None).isoformat() + "Z" 25 | return serial 26 | 27 | if isinstance(obj, bytes): 28 | return obj.decode('utf-8') 29 | 30 | return obj.__dict__ 31 | 32 | 33 | class SessionIssuer(object): 34 | """Model for the Session Issuer in the CloudWatch Event""" 35 | 36 | def __init__(self, userName, type, arn, principalId, accountId): 37 | self.userName = userName 38 | self.type = type 39 | self.arn = arn 40 | self.principalId = principalId 41 | self.accountId = accountId 42 | 43 | 44 | class SessionIssuerFactory(Factory): 45 | """Generates the Session Issuer component of the CloudWatch Event""" 46 | 47 | class Meta: 48 | """Defines the Model""" 49 | 50 | model = SessionIssuer 51 | 52 | userName = FuzzyText() 53 | type = 'Role' 54 | arn = 'arn:aws:iam::123456789012:role/historical_poller' 55 | principalId = 'AROAIKELBS2RNWG7KASDF' 56 | accountId = '123456789012' 57 | 58 | 59 | class UserIdentity(object): 60 | """Model for the User Identity component of the CloudWatch Event""" 61 | 62 | def __init__(self, sessionContext, principalId, type): 63 | self.sessionContext = sessionContext 64 | self.principalId = principalId 65 | self.type = type 66 | 67 | 68 | class UserIdentityFactory(Factory): 69 | """Generates the User Identity component of the CloudWatch Event""" 70 | 71 | class Meta: 72 | """Defines the Model""" 73 | 74 | model = UserIdentity 75 | 76 | sessionContext = SubFactory(SessionIssuerFactory) 77 | principalId = 'AROAIKELBS2RNWG7KASDF:joe@example.com' 78 | type = 'Service' 79 | 80 | 81 | class SQSData(object): 82 | """Model for an SQS Event Message""" 83 | 84 | def __init__(self, messageId, receiptHandle, body): 85 | self.messageId = messageId 86 | self.receiptHandle = receiptHandle 87 | self.body = body 88 | self.eventSource = "aws:sqs" 89 | 90 | 91 | class SQSDataFactory(Factory): 92 | """Generates the SQS Event Message""" 93 | 94 | class Meta: 95 | """Defines the Model""" 96 | 97 | model = SQSData 98 | 99 | body = FuzzyText() 100 | messageId = FuzzyText() 101 | receiptHandle = FuzzyText() 102 | 103 | 104 | class SQSRecord(object): 105 | """Model for an individual SQS Event Record""" 106 | 107 | def __init__(self, sqs): 108 | self.sqs = sqs 109 | 110 | 111 | class Records(object): 112 | """Generic Model for multiple Records for an event source (DynamoDB, SQS, SNS, etc.)""" 113 | 114 | def __init__(self, records): 115 | self.Records = records 116 | 117 | 118 | class RecordsFactory(Factory): 119 | """Factory for generating multiple Event (SNS, CloudWatch, Kinesis, DynamoDB, SQS) records.""" 120 | 121 | class Meta: 122 | """Defines the Model""" 123 | 124 | model = Records 125 | 126 | @post_generation 127 | def Records(self, create, extracted, **kwargs): 128 | """Generates the Records""" 
129 | if not create: 130 | # Simple build, do nothing. 131 | return 132 | 133 | if extracted: 134 | # A list of groups were passed in, use them 135 | for record in extracted: 136 | self.Records.append(record) 137 | 138 | 139 | class DynamoDBData(object): 140 | """Model for the DynamoDB Stream data itself""" 141 | 142 | def __init__(self, NewImage, OldImage, Keys): 143 | self.OldImage = {k: SERIA.serialize(v) for k, v in OldImage.items()} 144 | self.NewImage = {k: SERIA.serialize(v) for k, v in NewImage.items()} 145 | self.Keys = {k: SERIA.serialize(v) for k, v in Keys.items()} 146 | 147 | 148 | class DynamoDBDataFactory(Factory): 149 | """DynamoDB Stream Data Component Model""" 150 | 151 | class Meta: 152 | """Defines the Model""" 153 | 154 | model = DynamoDBData 155 | 156 | NewImage = {} 157 | Keys = {} 158 | OldImage = {} 159 | 160 | 161 | class DynamoDBRecord(object): 162 | """DynamoDB Stream Model""" 163 | 164 | def __init__(self, dynamodb, eventName, userIdentity): 165 | self.dynamodb = dynamodb 166 | self.eventName = eventName 167 | self.userIdentity = userIdentity 168 | 169 | 170 | class DynamoDBRecordFactory(Factory): 171 | """Factory generating a DynamoDBRecord""" 172 | 173 | class Meta: 174 | """Defines the Model""" 175 | 176 | model = DynamoDBRecord 177 | 178 | dynamodb = SubFactory(DynamoDBDataFactory) 179 | eventName = 'INSERT' 180 | userIdentity = SubFactory(UserIdentityFactory) 181 | 182 | 183 | class DynamoDBRecordsFactory(Factory): 184 | """Factory to generate DynamoDB Stream Events""" 185 | 186 | class Meta: 187 | """Defines the Model""" 188 | 189 | model = Records 190 | 191 | @post_generation 192 | def Records(self, create, extracted, **kwargs): 193 | """Generates the proper records""" 194 | if not create: 195 | # Simple build, do nothing. 
196 | return 197 | 198 | if extracted: 199 | # A list of groups were passed in, use them 200 | for record in extracted: 201 | self.Records.append(record) 202 | 203 | 204 | class Event(object): 205 | """The base of the Event Model""" 206 | 207 | def __init__(self, account, region, time): 208 | self.account = account 209 | self.region = region 210 | self.time = time 211 | 212 | 213 | class EventFactory(Factory): 214 | """Parent class for all event factories.""" 215 | 216 | class Meta: 217 | """Defines the Model""" 218 | 219 | model = Event 220 | 221 | account = '123456789012' 222 | region = 'us-east-1' 223 | time = FuzzyDateTime(datetime.datetime.utcnow().replace(tzinfo=pytz.utc)) 224 | 225 | 226 | class Detail(object): 227 | """The CloudWatch Event `detail` Model""" 228 | 229 | # pylint: disable=W0622,R0902 230 | def __init__(self, eventTime, awsEventType, awsRegion, eventName, userIdentity, id, eventSource, 231 | requestParameters, responseElements, collected=None): 232 | self.eventTime = eventTime 233 | self.awsRegion = awsRegion 234 | self.awsEventType = awsEventType 235 | self.userIdentity = userIdentity 236 | self.id = id 237 | self.eventSource = eventSource 238 | self.requestParameters = requestParameters 239 | self.responseElements = responseElements 240 | self.eventName = eventName 241 | self.collected = collected 242 | 243 | 244 | class DetailFactory(Factory): 245 | """Factory for making the CloudWatch Event `detail` component""" 246 | 247 | class Meta: 248 | """Defines the Model""" 249 | 250 | model = Detail 251 | 252 | eventTime = FuzzyDateTime(datetime.datetime.utcnow().replace(tzinfo=pytz.utc, microsecond=0)) 253 | awsEventType = 'AwsApiCall' 254 | userIdentity = SubFactory(UserIdentityFactory) 255 | id = FuzzyText() 256 | eventName = '' 257 | requestParameters = dict() 258 | responseElements = dict() 259 | eventSource = 'aws.ec2' 260 | awsRegion = 'us-east-1' 261 | collected = None 262 | 263 | 264 | class CloudwatchEvent(Event): 265 | """The CloudWatch Event Model""" 266 | 267 | def __init__(self, detail, account, region, time): 268 | self.detail = detail 269 | super().__init__(account, region, time) 270 | 271 | 272 | class CloudwatchEventFactory(EventFactory): 273 | """Factory for generating CloudWatch Events""" 274 | 275 | class Meta: 276 | """Defines the Model""" 277 | 278 | model = CloudwatchEvent 279 | 280 | detail = SubFactory(DetailFactory) 281 | 282 | 283 | class HistoricalPollingEvent(Event): 284 | """Polling Event Model""" 285 | 286 | def __init__(self, detail, account, region, time): 287 | self.detail = detail 288 | super().__init__(account, region, time) 289 | 290 | 291 | class HistoricalPollingEventFactory(CloudwatchEventFactory): 292 | """Factory for generating historical polling events""" 293 | 294 | class Meta: 295 | """Defines the model""" 296 | 297 | model = HistoricalPollingEvent 298 | 299 | detail = SubFactory(DetailFactory) 300 | 301 | 302 | class SnsData: 303 | """SNS Event model""" 304 | 305 | def __init__(self, Message, EventSource, EventVersion, EventSubscriptionArn): 306 | self.Message = Message 307 | self.EventSource = EventSource 308 | self.EventVersion = EventVersion 309 | self.EventSubscriptionArn = EventSubscriptionArn 310 | 311 | 312 | class SnsDataFactory(Factory): 313 | """SNS Event Model Factory""" 314 | 315 | class Meta: 316 | """Defines the model""" 317 | 318 | model = SnsData 319 | 320 | Message = FuzzyText() 321 | EventVersion = FuzzyText() 322 | EventSource = "aws:sns" 323 | EventSubscriptionArn = FuzzyText() 324 | 
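# ---------------------------------------------------------------------------
# Illustrative usage sketch (editor's addition; not part of the original
# module). It shows how the factories above compose into a full fake
# CloudWatch event for tests; the event name chosen here is an arbitrary
# assumption. Guarded so it never runs on import.
# ---------------------------------------------------------------------------
if __name__ == '__main__':
    import json

    example_event = CloudwatchEventFactory(
        detail=DetailFactory(eventName='CreateSecurityGroup'),
    )

    # `serialize` (defined above) handles datetimes and nested factory objects:
    payload = json.loads(json.dumps(example_event, default=serialize))
    assert payload['detail']['eventName'] == 'CreateSecurityGroup'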
-------------------------------------------------------------------------------- /historical/tests/conftest.py: -------------------------------------------------------------------------------- 1 | # pylint: disable=E0401,C0103 2 | """ 3 | .. module: historical.tests.test_s3 4 | :platform: Unix 5 | :copyright: (c) 2017 by Netflix Inc., see AUTHORS for more 6 | :license: Apache, see LICENSE for more details. 7 | .. author:: Kevin Glisson 8 | .. author:: Mike Grima 9 | """ 10 | import os 11 | 12 | import boto3 13 | from mock import patch 14 | from moto import mock_sqs 15 | from moto.dynamodb2 import mock_dynamodb2 16 | from moto.s3 import mock_s3 17 | from moto.iam import mock_iam 18 | from moto.sts import mock_sts 19 | from moto.ec2 import mock_ec2 20 | import pytest 21 | 22 | 23 | @pytest.fixture(scope='function') 24 | def s3(): 25 | """Mocked S3 Fixture.""" 26 | with mock_s3(): 27 | yield boto3.client('s3', region_name='us-east-1') 28 | 29 | 30 | @pytest.fixture(scope='function') 31 | def ec2(): 32 | """Mocked EC2 Fixture.""" 33 | with mock_ec2(): 34 | yield boto3.client('ec2', region_name='us-east-1') 35 | 36 | 37 | @pytest.fixture(scope='function') 38 | def sts(): 39 | """Mocked STS Fixture.""" 40 | with mock_sts(): 41 | yield boto3.client('sts', region_name='us-east-1') 42 | 43 | 44 | @pytest.fixture(scope='function') 45 | def iam(): 46 | """Mocked IAM Fixture.""" 47 | with mock_iam(): 48 | yield boto3.client('iam', region_name='us-east-1') 49 | 50 | 51 | @pytest.fixture(scope='function') 52 | def dynamodb(): 53 | """Mocked DynamoDB Fixture.""" 54 | with mock_dynamodb2(): 55 | yield boto3.client('dynamodb', region_name='us-east-1') 56 | 57 | 58 | # pylint: disable=W0621,W0613 59 | @pytest.fixture(scope='function') 60 | def retry(): 61 | """Mock the retry library so that it doesn't retry.""" 62 | def mock_retry_decorator(*args, **kwargs): 63 | def retry(func): 64 | return func 65 | return retry 66 | 67 | patch_retry = patch('retrying.retry', mock_retry_decorator) 68 | yield patch_retry.start() 69 | 70 | patch_retry.stop() 71 | 72 | 73 | @pytest.fixture(scope='function') 74 | def swag_accounts(s3, retry): 75 | """Create mocked SWAG Accounts.""" 76 | from swag_client.backend import SWAGManager 77 | from swag_client.util import parse_swag_config_options 78 | 79 | bucket_name = 'SWAG' 80 | data_file = 'accounts.json' 81 | region = 'us-east-1' 82 | owner = 'third-party' 83 | 84 | s3.create_bucket(Bucket=bucket_name) 85 | os.environ['SWAG_BUCKET'] = bucket_name 86 | os.environ['SWAG_DATA_FILE'] = data_file 87 | os.environ['SWAG_REGION'] = region 88 | os.environ['SWAG_OWNER'] = owner 89 | 90 | swag_opts = { 91 | 'swag.type': 's3', 92 | 'swag.bucket_name': bucket_name, 93 | 'swag.data_file': data_file, 94 | 'swag.region': region, 95 | 'swag.cache_expires': 0 96 | } 97 | 98 | swag = SWAGManager(**parse_swag_config_options(swag_opts)) 99 | 100 | account = { 101 | 'aliases': ['test'], 102 | 'contacts': ['admins@test.net'], 103 | 'description': 'LOL, Test account', 104 | 'email': 'testaccount@test.net', 105 | 'environment': 'test', 106 | 'id': '012345678910', 107 | 'name': 'testaccount', 108 | 'owner': 'third-party', 109 | 'provider': 'aws', 110 | 'sensitive': False, 111 | 'account_status': 'ready', 112 | 'services': [ 113 | { 114 | 'name': 'historical', 115 | 'status': [ 116 | { 117 | 'region': 'all', 118 | 'enabled': True 119 | } 120 | ] 121 | } 122 | ] 123 | } 124 | 125 | swag.create(account) 126 | 127 | 128 | @pytest.fixture(scope='function') 129 | def historical_role(iam, sts): 130 | """Create 
the mocked Historical IAM role that Historical Lambdas would need to assume to List and 131 | Collect details about a given technology in the target account. 132 | 133 | """ 134 | iam.create_role(RoleName='historicalrole', AssumeRolePolicyDocument='{}') 135 | os.environ['HISTORICAL_ROLE'] = 'historicalrole' 136 | 137 | 138 | @pytest.fixture(scope='function') 139 | def historical_sqs(): 140 | """Create the Mocked SQS queues that are used throughout Historical.""" 141 | with mock_sqs(): 142 | client = boto3.client('sqs', region_name='us-east-1') 143 | 144 | # Poller Tasker Queue: 145 | client.create_queue(QueueName='pollertaskerqueue') 146 | os.environ['POLLER_TASKER_QUEUE_NAME'] = 'pollertaskerqueue' 147 | 148 | # Poller Queue: 149 | client.create_queue(QueueName='pollerqueue') 150 | os.environ['POLLER_QUEUE_NAME'] = 'pollerqueue' 151 | 152 | # Event Queue: 153 | client.create_queue(QueueName='eventqueue') 154 | os.environ['EVENT_QUEUE_NAME'] = 'eventqueue' 155 | 156 | # Proxy Queue: 157 | client.create_queue(QueueName='proxyqueue') 158 | 159 | yield client 160 | 161 | 162 | @pytest.fixture(scope='function') 163 | def buckets(s3): 164 | """Create Testing S3 buckets for testing the S3 stack.""" 165 | # Create buckets: 166 | for i in range(0, 50): 167 | s3.create_bucket(Bucket=f'testbucket{i}') 168 | s3.put_bucket_tagging( 169 | Bucket=f'testbucket{i}', 170 | Tagging={ 171 | 'TagSet': [ 172 | { 173 | 'Key': 'theBucketName', 174 | 'Value': f'testbucket{i}' 175 | } 176 | ] 177 | } 178 | ) 179 | s3.put_bucket_lifecycle_configuration(Bucket=f'testbucket{i}', LifecycleConfiguration={ 180 | 'Rules': [ 181 | { 182 | 'Expiration': { 183 | 'Days': 5 184 | }, 185 | 'ID': 'string', 186 | 'Filter': { 187 | 'Prefix': 'string', 188 | 'Tag': { 189 | 'Key': 'string', 190 | 'Value': 'string' 191 | }, 192 | 'And': { 193 | 'Prefix': 'string', 194 | 'Tags': [ 195 | { 196 | 'Key': 'string', 197 | 'Value': 'string' 198 | }, 199 | ] 200 | } 201 | }, 202 | 'Status': 'Enabled', 203 | 'NoncurrentVersionTransitions': [ 204 | { 205 | 'NoncurrentDays': 123, 206 | 'StorageClass': 'GLACIER' 207 | }, 208 | ], 209 | 'NoncurrentVersionExpiration': { 210 | 'NoncurrentDays': 123 211 | } 212 | } 213 | ] 214 | }) 215 | 216 | 217 | @pytest.fixture(scope='function') 218 | def security_groups(ec2): 219 | """Creates security groups.""" 220 | sg = ec2.create_security_group( 221 | Description='test security group', 222 | GroupName='test', 223 | VpcId='vpc-test' 224 | ) 225 | 226 | # Tag it: 227 | ec2.create_tags(Resources=[sg['GroupId']], Tags=[ 228 | { 229 | "Key": "Some", 230 | "Value": "Value" 231 | }, 232 | { 233 | "Key": "Empty", 234 | "Value": "" 235 | } 236 | ]) 237 | 238 | yield sg 239 | 240 | 241 | @pytest.fixture(scope='function') 242 | def vpcs(ec2): 243 | """Creates vpcs.""" 244 | yield ec2.create_vpc( 245 | CidrBlock='192.168.1.1/32', 246 | AmazonProvidedIpv6CidrBlock=True, 247 | InstanceTenancy='default' 248 | )['Vpc'] 249 | 250 | 251 | @pytest.fixture(scope='function') 252 | def mock_lambda_environment(): 253 | """Mocks out the AWS Lambda environment context that AWS Lambda passes into the handler.""" 254 | os.environ['SENTRY_ENABLED'] = 'f' 255 | 256 | class MockedContext: 257 | """Class that Mocks out the Lambda `context` object.""" 258 | 259 | def get_remaining_time_in_millis(self): 260 | """Mocked method to return the remaining Lambda time in milliseconds.""" 261 | return 99999 262 | 263 | return MockedContext() 264 | 265 | 266 | @pytest.fixture(scope='function') 267 | def current_security_group_table(): 268 | 
"""Create the Current Security Group Table.""" 269 | from historical.security_group.models import CurrentSecurityGroupModel 270 | mock_dynamodb2().start() 271 | yield CurrentSecurityGroupModel.create_table(read_capacity_units=1, write_capacity_units=1, wait=True) 272 | mock_dynamodb2().stop() 273 | 274 | 275 | @pytest.fixture(scope='function') 276 | def durable_security_group_table(): 277 | """Create the Durable Security Group Table.""" 278 | from historical.security_group.models import DurableSecurityGroupModel 279 | mock_dynamodb2().start() 280 | yield DurableSecurityGroupModel.create_table(read_capacity_units=1, write_capacity_units=1, wait=True) 281 | mock_dynamodb2().stop() 282 | 283 | 284 | @pytest.fixture(scope='function') 285 | def current_vpc_table(): 286 | """Create the Current VPC Table.""" 287 | from historical.vpc.models import CurrentVPCModel 288 | mock_dynamodb2().start() 289 | yield CurrentVPCModel.create_table(read_capacity_units=1, write_capacity_units=1, wait=True) 290 | mock_dynamodb2().stop() 291 | 292 | 293 | @pytest.fixture(scope='function') 294 | def durable_vpc_table(): 295 | """Create the Durable VPC Table.""" 296 | from historical.vpc.models import DurableVPCModel 297 | mock_dynamodb2().start() 298 | yield DurableVPCModel.create_table(read_capacity_units=1, write_capacity_units=1, wait=True) 299 | mock_dynamodb2().stop() 300 | 301 | 302 | @pytest.fixture(scope='function') 303 | def current_s3_table(dynamodb): 304 | """Create the Current S3 Table.""" 305 | from historical.s3.models import CurrentS3Model 306 | yield CurrentS3Model.create_table(read_capacity_units=1, write_capacity_units=1, wait=True) 307 | 308 | 309 | @pytest.fixture(scope='function') 310 | def durable_s3_table(dynamodb): 311 | """Create the Durable S3 Table.""" 312 | from historical.s3.models import DurableS3Model 313 | yield DurableS3Model.create_table(read_capacity_units=1, write_capacity_units=1, wait=True) 314 | -------------------------------------------------------------------------------- /mkdocs/docs/architecture.md: -------------------------------------------------------------------------------- 1 | # Historical Architecture 2 | Historical is a serverless AWS application that consists of many components. 3 | 4 | Historical is written in Python 3 and heavily leverages AWS technologies such as Lambda, SNS, SQS, DynamoDB, CloudTrail, and CloudWatch. 5 | 6 | ## General Architectural Overview 7 | Here is a diagram of the Historical Architecture: 8 | 9 | 10 | **Please Note:** This stack is deployed _for every technology monitored_! There are many, many Historical stacks that will be deployed. 11 | 12 | ### Polling vs. Events 13 | Historical is *both* a polling and event driven system. It will periodically poll AWS accounts for changes. However, because Historical responds to events in the environment, polling doesn't need to be very aggressive and only happens once every few hours. 14 | 15 | Polling is necessary because events are not 100% reliable. This ensures that data is current just in case an event is dropped. 16 | 17 | Historical is *eventually consistent*, and makes a *best effort* to maintain a current and up-to-date inventory of AWS resources. 18 | 19 | ## Prerequisite Overview 20 | 21 | This is a high-level overview of the prerequisites that are required to make Historical operate. For more details on setting up the required prerequisites, please review the [installation documentation](../installation). 22 | 23 | 1. **ALL AWS accounts** accounts have CloudTrail enabled. 24 | 1. 
**ALL AWS accounts** and **ALL regions** in those accounts have a CloudWatch Event rule that captures ALL events and sends them over the CloudWatch Event Bus to the Historical account for processing. 25 | 1. IAM roles exist in **ALL** accounts and are assumable by the Historical Lambda functions. 26 | 1. Historical makes use of [SWAG](https://github.com/Netflix-Skunkworks/swag-client) to define which AWS accounts Historical is enabled for. While not a hard requirement, use of SWAG is _highly recommended_. 27 | 28 | ## Regions 29 | Historical has the concept of regions that fit into three categories: 30 | 31 | - Primary region 32 | - Secondary region(s) 33 | - Off region(s) 34 | 35 | The **Primary Region** is considered the "Base" of Historical. This region has all of the major components that make up Historical. This region processes all in-region AND off-region originating events. 36 | 37 | The **Off Region(s)** are regions you don't have a lot of infrastructure deployed in. However, you still want visibility in these regions should events happen there. These regions have only a minimal amount of Historical-related infrastructure deployed. These regions will forward ALL events to the Primary Region for processing. 38 | 39 | The **Secondary Region(s)** are regions that are important to you. Secondary regions look like the primary region and process in-region events. If you have a lot of infrastructure within a region, you should place a Historical stack there. This will allow you to quickly receive and process events, and also gives your applications a regionally-local means of accessing Historical data. 40 | 41 | **Note:** Place a Historical off-region stack in any region that is not Primary or Secondary. This will ensure full visibility in your environment. 42 | 43 | ## Component Overview 44 | This section describes some of the high-level architectural components. 45 | 46 | ### Primary Components 47 | Below are the primary components of the Historical architecture: 48 | 49 | 1. CloudWatch Event Rules 50 | 1. CloudWatch Change Events 51 | 1. Poller 52 | 1. Collector 53 | 1. Current Table 54 | 1. DynamoDB Stream Proxy 55 | 1. Differ 56 | 1. Durable Table 57 | 1. Off-region SNS forwarders 58 | 59 | As a general overview, the infrastructure is an event-processing and enrichment pipeline. An event arrives, gets enriched with additional information, and notifications about the change are then provided to downstream subscribers. 60 | 61 | SQS queues are used in as many places as possible to invoke Lambda functions. SQS makes it easy to provide Lambda execution concurrency, auto-scaling, retry of failures without blocking, and dead-letter queuing capabilities. 62 | 63 | SNS topics are used to make it easy for any number (_N_) of interested parties to subscribe to the Historical DynamoDB table changes. Presently, this is only attached to the Durable table. More details on this below. 64 | 65 | ### CloudWatch Event Rules 66 | There are two different CloudWatch Event Rules: 67 | 68 | 1. Timed Events 69 | 1. Change Events 70 | 71 | Timed events are used to kick off the Poller. See the section on the Poller below for additional details. Change events are events that arrive from CloudWatch Events when an AWS resource's configuration changes. 72 | 73 | ### Poller 74 | The Poller's primary function is to obtain a full inventory of AWS resources. 75 | 76 | The Poller is split into two parts: 77 | 78 | 1. Poller Tasker 79 | 1. Poller 80 | 81 | The "Poller Tasker" is a Lambda function that iterates over all AWS accounts Historical is configured for, and tasks the Poller to *list* all resources in the given environment. 82 | 83 | The Poller Tasker in the *PRIMARY REGION* tasks the Poller to list resources that reside in the primary region and all off-regions. A Poller Tasker in a *SECONDARY REGION* will only task the Poller to list resources that reside in that same region. 84 | 85 | The Poller *lists* all resources in a given account/region, and tasks a "Poller Collector" to fetch details about the resource in question. 86 |
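To make the fan-out concrete, here is a minimal, hypothetical sketch of the tasking pattern described above. It is not the actual Historical implementation; the account list, message shape, and helper names are simplified assumptions (the real code resolves accounts via SWAG and uses Historical's own SQS utilities). It simply shows the Poller Tasker enqueuing one polling task per account and region onto the Poller's SQS queue.

```python
"""Hypothetical sketch of a Poller Tasker fan-out; not the real Historical code."""
import json
import os

import boto3


def task_pollers(accounts, regions):
    """Send one polling task per account/region pair to the Poller's SQS queue.

    The account and region lists are assumed inputs; POLLER_QUEUE_NAME is one of
    Historical's documented environment variables.
    """
    sqs = boto3.client('sqs')
    queue_url = sqs.get_queue_url(QueueName=os.environ['POLLER_QUEUE_NAME'])['QueueUrl']

    for account in accounts:
        for region in regions:
            # Each message tells a Poller which account/region to list resources in.
            sqs.send_message(
                QueueUrl=queue_url,
                MessageBody=json.dumps({'account_id': account, 'region': region}),
            )


if __name__ == '__main__':
    task_pollers(['123456789012'], ['us-east-1', 'us-west-2'])
```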
87 | ### Collector 88 | The Collector describes a given AWS resource and stores its configuration in the "Current" DynamoDB table. The Collector is split into two parts (same code, different invocation mechanisms): 89 | 90 | 1. Poller Collector 91 | 1. Event Collector 92 | 93 | The Poller Collector only responds to polling events. The Event Collector only responds to CloudWatch change events. 94 | 95 | The Collector is split into two parts to prevent change events from being sandwiched between polling events. Historical gives priority to change events over polling events to ensure the timeliness of resource configuration changes. 96 | 97 | In both cases, the Collector will go to the AWS account and region that the item resides in, and use `boto3` to describe the configuration of the resource. 98 | 99 | ### Current Table 100 | The "Current" table is a global DynamoDB table that stores the current configuration of a given resource in 101 | AWS. 102 | 103 | This acts as a cache for the current state of the environment. 104 | 105 | The Current table has a DynamoDB Stream that kicks off a DynamoDB Stream Proxy, which then invokes the Differ. 106 | 107 | #### Special Note: 108 | The Current table has a TTL set on all items. This TTL is updated any time a change event arrives, or when the Poller runs. The TTL is set to clean up orphaned items, which can happen if a deletion event is lost. Deleted items will not be picked up by the Poller (which only lists items that exist in the account) and thus will be removed from the Current table on TTL expiration. As a result, the Poller must "see" a resource at least once every few hours, or it will be deemed deleted from the environment. 109 | 110 | ### DynamoDB Stream Proxy 111 | The DynamoDB Stream Proxy is a Lambda function that proxies DynamoDB Stream events to SNS or SQS. The purpose is to task subsequent Lambda functions with the specific changes that happen to the DynamoDB table. 112 | 113 | The Historical infrastructure has two configurations for the DynamoDB Proxy: 114 | 115 | 1. Current Table Forwarder (DynamoDB Stream Proxy to Differ SQS) 116 | 1. Durable Table Forwarder (DynamoDB Stream Proxy to Change Notification SNS) 117 | 118 | The Current Table Forwarder proxies events to the SQS queue that invokes the Differ Lambda function. 119 | 120 | The Durable Table Forwarder proxies events to an SNS topic that can be subscribed to. SNS enables *N* subscribers to receive Historical events. The Durable table proxy serializes the DynamoDB Stream events into an easily consumable JSON document that contains the full and complete configuration of the resource in question, along with the CloudTrail context. This enables downstream applications to make intelligent decisions about the changes that occur, as they have the full and complete context of the resource and the changes made to it.
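As an illustration of the proxy concept only (the production proxy also handles batching, `PROXY_REGIONS` filtering, and oversized items), a stripped-down DynamoDB Stream handler that forwards records to SNS might look like the sketch below. The `TOPIC_ARN` environment variable and the message fields are assumptions made for this example.

```python
"""Hypothetical sketch of a DynamoDB Stream Proxy handler; not the real Historical code."""
import json
import os

import boto3

SNS = boto3.client('sns')


def handler(event, context):
    """Forward DynamoDB Stream records to an SNS topic for downstream subscribers."""
    for record in event.get('Records', []):
        # Prefer the new image; fall back to the old image for REMOVE events.
        image = record['dynamodb'].get('NewImage') or record['dynamodb'].get('OldImage', {})
        message = {
            'eventName': record['eventName'],   # INSERT / MODIFY / REMOVE
            'item': image,                      # still in DynamoDB's typed JSON form
        }
        SNS.publish(
            TopicArn=os.environ['TOPIC_ARN'],
            Message=json.dumps(message),
        )
```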
121 | 122 | #### Special Note: 123 | DynamoDB Streams in Global DynamoDB tables invoke this Lambda whenever a DynamoDB update occurs in ANY of the regions the table is configured to sync with. For the Current table, this can result in Historical Lambda functions _"stepping on each other's toes"_ (this is not a concern for Durable table changes). To avoid this, the Current table DynamoDB Stream Proxy has a `PROXY_REGIONS` environment variable that is configured to only proxy DynamoDB Stream updates for resources that reside in the specified regions. The *PRIMARY REGION* must be configured to proxy events that occur in the primary region and all off-regions. The *SECONDARY REGION(S)* must be configured to proxy only events that occur in the same region. 124 | 125 | #### Another Special Note: 126 | DynamoDB items are capped at 400KB. SNS and SQS have maximum message sizes of 256KB. Logic exists to handle cases where DynamoDB items are too big to send to SNS/SQS. Follow-up Lambdas and subscribers will need to make use of the Historical code to fetch the full configuration of the item from either the Current or Durable table (depending on the use case). Enhancements will be made in the future to make the data easier to consume in these (rare) circumstances. 127 | 128 | ### Differ 129 | The Differ is a Lambda function that gets invoked upon changes to the Current table. The DynamoDB stream provides the Differ (via the Proxy) with the current state of the resource that changed. The Differ checks whether the resource in question has had an effective change. If so, the Differ saves a new change record to the Durable table to maintain the history of the resource as it changes over time, and also saves the CloudTrail context. 130 | 131 | ### Durable Table 132 | The "Durable" table is a Global DynamoDB table that stores a resource's configuration along with its change history. 133 | 134 | The Durable table has a DynamoDB Stream that invokes another DynamoDB Stream Proxy. This is used to notify downstream subscribers of the effective changes that occur to the environment. 135 | 136 | ### Off-Region SNS Forwarders 137 | Very little infrastructure is intentionally deployed in the off-regions. This helps to reduce the cost and complexity of the Historical infrastructure. 138 | 139 | The off-region SNS forwarders are SNS topics that receive CloudWatch events for resource changes that occur in the off-regions. These topics forward events to the Event Collector SQS queue in the primary region for processing. 140 | 141 | ## Special Stacks 142 | Some resource types have different stack configurations due to nuances of the resource type. 143 | 144 | The following resource types have different stack types: 145 | 146 | - S3 147 | - IAM (Coming Soon!) 148 | 149 | ### S3 150 | The AWS S3 stack is almost identical to the standard stack. The difference is due to AWS S3 buckets having a globally unique namespace. 151 | 152 | Because it is not presently possible to poll for only in-region S3 buckets, the S3 poller lives in the primary region only. The poller in the primary region polls for all S3 buckets in all regions. 153 | 154 | The secondary regions will still respond to in-region events, but lack all polling components. 155 | 156 | This diagram showcases the S3 stack. 157 | 158 | ### IAM 159 | This is coming soon! 160 | 161 | ## Installation & Configuration 162 | 163 | Please refer to the [installation docs](../installation) for additional details.
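Finally, to tie the flow together, below is a minimal, hypothetical sketch of a downstream consumer of the Durable table change notifications (SNS fanned out to an SQS-triggered Lambda). The field names inside the message are assumptions based on the description above, not a documented schema; real consumers should use the Historical models to fetch the full configuration whenever an item exceeds the SNS/SQS size limits.

```python
"""Hypothetical downstream consumer of Historical change notifications; illustrative only."""
import json


def handler(event, context):
    """Handle SQS records that carry SNS-delivered Durable table change events."""
    for record in event.get('Records', []):
        envelope = json.loads(record['body'])      # The SQS body contains the SNS envelope.
        change = json.loads(envelope['Message'])   # The SNS Message contains the change event.

        arn = change.get('arn')
        event_name = change.get('eventName')
        print(f'{event_name} detected for {arn}')

        # NOTE: if the item was too large for SNS/SQS (256KB limit), the full
        # configuration will not be inline; fetch it from the Current or Durable
        # table using the Historical models instead.
```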
164 | --------------------------------------------------------------------------------