├── .gitignore
├── DEV.md
├── LICENSE
├── README.md
├── awsbill2graphite.py
├── print_all_csvs.py
├── redact_csv.py
├── requirements.txt
├── static
│   ├── dashboard.png
│   └── grafana_dashboard.json
├── test_all.py
└── test_data
    └── hourly_billing-1.csv

/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 | 
6 | # C extensions
7 | *.so
8 | 
9 | # Distribution / packaging
10 | .Python
11 | env/
12 | build/
13 | develop-eggs/
14 | dist/
15 | downloads/
16 | eggs/
17 | .eggs/
18 | lib/
19 | lib64/
20 | parts/
21 | sdist/
22 | var/
23 | *.egg-info/
24 | .installed.cfg
25 | *.egg
26 | 
27 | # PyInstaller
28 | # Usually these files are written by a python script from a template
29 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
30 | *.manifest
31 | *.spec
32 | 
33 | # Installer logs
34 | pip-log.txt
35 | pip-delete-this-directory.txt
36 | 
37 | # Unit test / coverage reports
38 | htmlcov/
39 | .tox/
40 | .coverage
41 | .coverage.*
42 | .cache
43 | nosetests.xml
44 | coverage.xml
45 | *,cover
46 | .hypothesis/
47 | 
48 | # Translations
49 | *.mo
50 | *.pot
51 | 
52 | # Django stuff:
53 | *.log
54 | 
55 | # Sphinx documentation
56 | docs/_build/
57 | 
58 | # PyBuilder
59 | target/
60 | 
61 | # IPython Notebook
62 | .ipynb_checkpoints
63 | 
64 | # env variables during development
65 | .env
66 | .env.gpg
67 | 
68 | # editor swap files
69 | .*.sw?
70 | .sw?
71 | *~
72 | 
--------------------------------------------------------------------------------
/DEV.md:
--------------------------------------------------------------------------------
1 | # Hacking on awsbill2graphite
2 | 
3 | ## Running tests
4 | 
5 | In the top-level directory, run:
6 | 
7 |     nosetests
8 | 
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 | 
3 | Copyright (c) 2016 Dan Slimmon
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # awsbill2graphite
2 | 
3 | `awsbill2graphite` is a script that converts AWS hourly billing CSVs to Graphite metrics.
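To give a rough idea of the transformation (the row below is fabricated, and only the columns that matter for the mapping are shown), an hourly report row like

    identity/TimeInterval,lineItem/LineItemType,product/location,lineItem/UsageType,lineItem/BlendedCost
    2016-04-04T05:00:00Z/2016-04-04T06:00:00Z,Usage,Asia Pacific (Tokyo),APN1-BoxUsage:c3.2xlarge,4.2480

becomes a Graphite data point like

    awsbill.ap-northeast-1.ec2-instance.c3-2xlarge 4.2480 1459749600

where the timestamp is the epoch second (here assumed UTC) of the end of the hour the row covers.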
4 | 
5 | ![dashboard screenshot](https://raw.githubusercontent.com/danslimmon/awsbill2graphite/master/static/dashboard.png)
6 | 
7 | _If you want to hack on it, check out [DEV.md](https://github.com/danslimmon/awsbill2graphite/blob/master/DEV.md)._
8 | 
9 | So far, it produces the following types of metrics:
10 | 
11 | 1. Per-region, per-EC2-instance-type cost by the hour
12 | 2. EBS metrics, including storage costs, PIOPS costs, per-million-IOPS costs, and snapshot
13 |    storage costs
14 | 3. Per-region RDS costs, including storage, PIOPS, and instance-hours
15 | 4. Per-instance-type ElastiCache costs
16 | 5. Total AWS cost by the hour
17 | 
18 | More are planned.
19 | 
20 | 
21 | ## Prep
22 | 
23 | First of all, you'll need to have hourly billing reports enabled. You can do this
24 | through the AWS billing control panel.
25 | 
26 | `awsbill2graphite` has some dependencies. We don't have a pip package yet (but we
27 | have an [issue](https://github.com/danslimmon/awsbill2graphite/issues/1) for it). To
28 | install the dependencies, go into a
29 | [virtualenv](http://docs.python-guide.org/en/latest/dev/virtualenvs/) and run
30 | 
31 |     pip install -r requirements.txt
32 | 
33 | The script will then have to be run in that virtualenv.
34 | 
35 | To prevent Graphite from creating giant, mostly-empty data files, set the
36 | following in `storage-schemas.conf`:
37 | 
38 |     [awsbill]
39 |     priority = 256
40 |     pattern = ^awsbill\.
41 |     retentions = 1h:3650d
42 | 
43 | ## Usage
44 | 
45 | First set the following environment variables:
46 | 
47 | * `AWSBILL_REPORT_PATH`: The path where the report lives. If downloading from S3, this
48 |   should be `s3://` followed by the bucket name followed by the "Report path" as defined
49 |   in the AWS billing control panel. If reading a local file, it should start with
50 |   `file://` and give the path to an hourly billing CSV file.
51 | * `AWS_ACCESS_KEY_ID`: The identifier for an AWS credentials pair that will enable access
52 |   to the bucket with billing reports in it. If you're using a local file instead of
53 |   downloading the report from S3, you can omit this.
54 | * `AWS_SECRET_ACCESS_KEY`: The secret access key that corresponds to `AWS_ACCESS_KEY_ID`.
55 |   If you're using a local file instead of downloading the report from S3, you can omit
56 |   this.
57 | * `AWSBILL_GRAPHITE_HOST`: The hostname of the Graphite server to which to write metrics.
58 |   If instead you want to output metrics to stdout, set this environment variable to
59 |   `stdout`. If the Graphite port is not the default of 2003, you may append it after a
60 |   colon.
61 | * `AWSBILL_METRIC_PREFIX`: The prefix to use for metrics written to Graphite. If absent,
62 |   metrics will begin with "`awsbill.`". If you set this, you should modify the `[awsbill]`
63 |   stanza you added to Graphite's `storage-schemas.conf` accordingly.
64 | 
65 | Then run
66 | 
67 |     awsbill2graphite.py
68 | 
69 | This will produce metrics named like so:
70 | 
71 |     PREFIX.REGION.ec2-instance.t2-micro
72 |     PREFIX.REGION.ec2-instance.c4-2xlarge
73 |     PREFIX.REGION.ebs.snapshot
74 |     PREFIX.REGION.ebs.piops
75 |     PREFIX.REGION.rds-instance.db-r3-xlarge
76 | 
77 | Each metric will have a data point every hour. This data point represents the total amount
78 | charged to your account for the hour _previous_ to the data point's timestamp.
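For example, to run against a local copy of a report and eyeball the output (the path and the dollar amounts below are made up):

    export AWSBILL_REPORT_PATH="file:///tmp/hourly_billing.csv"
    export AWSBILL_GRAPHITE_HOST="stdout"
    ./awsbill2graphite.py

Each line of output is in Graphite's plaintext protocol format (`metric-name value epoch-timestamp`), so the same stream that goes to a Graphite server can be inspected directly:

    awsbill.ap-northeast-1.ec2-instance.t2-medium 4.2480 1459749600
    awsbill.total-cost.ap-northeast-1 9.7031 1459749600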
79 | 80 | ## Making Graphite/Grafana dashboards with these metrics 81 | 82 | Here is a JSON description of a basic per-region-summary Grafana dashboard: [grafana_dashboard.json](https://github.com/danslimmon/awsbill2graphite/blob/master/static/grafana_dashboard.json). 83 | 84 | A few notes: 85 | 86 | * Snapshots are only billed once daily, so the snapshot metrics will be equal to 0 for 87 | most of their values. The value they do contain will be the cost for that _entire day_, 88 | not the hour. 89 | * At the end of a month, the billing report you get will be missing most of the final 90 | day's data. That's just how AWS hourly billing reports work. Eventually (4 or 5 days 91 | after the end of the month) they give you a final report for the month, with all the 92 | data. So in the interim, you'll have a big ugly dip in your graphs. 93 | -------------------------------------------------------------------------------- /awsbill2graphite.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | import csv 3 | import gzip 4 | import json 5 | import logging 6 | import os 7 | import re 8 | import shutil 9 | import socket 10 | import sys 11 | import tempfile 12 | from collections import defaultdict 13 | from datetime import datetime 14 | from operator import attrgetter 15 | 16 | import boto3 17 | 18 | REGION_NAMES = { 19 | "US East (N. Virginia)": "us-east-1", 20 | "US West (N. California)": "us-west-1", 21 | "US West (Oregon)": "us-west-2", 22 | "EU (Ireland)": "eu-west-1", 23 | "EU (Frankfurt)": "eu-central-1", 24 | "Asia Pacific (Tokyo)": "ap-northeast-1", 25 | "Asia Pacific (Seoul)": "ap-northeast-2", 26 | "Asia Pacific (Singapore)": "ap-southeast-1", 27 | "Asia Pacific (Sydney)": "ap-southeast-2", 28 | "South America (Sao Paulo)": "sa-east-1", 29 | } 30 | 31 | EBS_TYPES = { 32 | "Magnetic": "standard", 33 | "General Purpose": "gp2", 34 | "Provisioned IOPS": "io1", 35 | "Unknown Storage": "unknown" 36 | } 37 | 38 | # As of 2016-09-01, the hourly billing report doesn't have data in the 39 | # 'product/volumeType' column for RDS storage anymore. We have to check 40 | # for a substring of 'lineItem/LineItemDescription' instead. 41 | RDS_STORAGE_TYPES = { 42 | "Provisioned IOPS Storage": "io1", 43 | "provisioned GP2 storage": "gp2", 44 | } 45 | 46 | 47 | def parse_datetime(timestamp): 48 | """Parses a timestamp in the format 2006-01-02T15:04:05Z.""" 49 | # This way is about 31x faster than arrow.get() 50 | # and 6.5x faster than datetime.strptime() 51 | year = int(timestamp[0:4]) 52 | month = int(timestamp[5:7]) 53 | day = int(timestamp[8:10]) 54 | hour = int(timestamp[11:13]) 55 | minute = int(timestamp[14:16]) 56 | second = int(timestamp[17:19]) 57 | return datetime(year, month, day, hour, minute, second) 58 | 59 | 60 | def open_csv(tempdir, region_name): 61 | """Opens the latest hourly billing CSV file. Returns an open file object. 
62 |     Depending on the AWSBILL_REPORT_PATH environment variable, this may
63 |     involve downloading the report from S3, or it may just open a local
64 |     file."""
65 |     report_path = os.getenv("AWSBILL_REPORT_PATH")
66 |     if report_path.startswith("file://"):
67 |         csv_path = report_path[len("file://"):]
68 |     elif report_path.startswith("s3://"):
69 |         csv_path = download_latest_from_s3(report_path, tempdir, region_name)
70 |     else:
71 |         raise ValueError("AWSBILL_REPORT_PATH environment variable must start with 'file://' or 's3://'") # noqa
72 |     return open(csv_path)
73 | 
74 | 
75 | def open_output():
76 |     """Opens the file-like object that will be used for output, and returns it.
77 |     Depending on the AWSBILL_GRAPHITE_HOST environment variable, writes
78 |     to this object may be sent to a Graphite server, or they may be
79 |     written to stdout."""
80 |     output_host = os.getenv("AWSBILL_GRAPHITE_HOST")
81 |     if output_host is None:
82 |         raise ValueError("AWSBILL_GRAPHITE_HOST environment variable must specify the output destination; you may use 'stdout' to print metrics to stdout") # noqa
83 |     elif output_host == "stdout":
84 |         output_file = sys.stdout
85 |     else:
86 |         output_port = 2003
87 |         if ":" in output_host:
88 |             output_port = int(output_host.split(":", 1)[1])
89 |             output_host = output_host.split(":", 1)[0]
90 |         output_file = SocketWriter(output_host, output_port)
91 |     return output_file
92 | 
93 | 
94 | def s3_primary_manifests(objects):
95 |     """Returns the S3 object(s) corresponding to the relevant primary manifests.
96 | 
97 |     The relevant ones are considered to be the second-most- and most recent
98 |     ones, and they are returned in that order. If there are no billing
99 |     cycles older than the most recent, we return a single-element list with
100 |     only the most recent manifest.
101 | 
102 |     `objects` should be an iterable of S3 objects."""
103 |     # The path to the billing report manifest is like this:
104 |     #
105 |     #     <report-prefix>/hourly_billing/<YYYYMMDD>-<YYYYMMDD>/hourly_billing-Manifest.json # noqa
106 |     #
107 |     # We look for the most recent timestamp directory and use the manifest
108 |     # therein to find the most recent billing CSV.
109 |     manifests = [o for o in objects if o.key.endswith("Manifest.json")]
110 | 
111 |     # Filter to those from the second-most- and most recent billing cycles
112 |     manifests.sort(key=attrgetter("key"), reverse=True)
113 |     cycles = set()
114 |     for m in manifests:
115 |         rslt = re.search(r"/(\d{8}-\d{8})/", m.key)
116 |         if rslt is not None:
117 |             cycles.add(rslt.group(1))
118 |     if len(cycles) == 0:
119 |         raise Exception("Failed to find any appropriately-named billing CSVs")
120 |     last_two_cycles = sorted(list(cycles))[-2:]
121 |     if len(last_two_cycles) < 2:
122 |         last_two_cycles = 2 * last_two_cycles
123 |     manifests = [m for m in manifests if
124 |                  last_two_cycles[0] in m.key or last_two_cycles[1] in m.key]
125 | 
126 |     # The primary manifest(s) will be the one(s) with the shortest path length
127 |     manifests.sort(key=lambda a: len(a.key))
128 |     if last_two_cycles[0] == last_two_cycles[1]:
129 |         # There was only one billing cycle present among the manifests
130 |         return [manifests[0]]
131 |     return [manifests[1], manifests[0]]
132 | 
133 | 
134 | def download_latest_from_s3(s3_path, tempdir, region_name):
135 |     """Puts the latest hourly billing report from the given S3 path in a
136 |     local file.
137 | 
138 |     Returns the path to that file."""
139 |     s3 = boto3.resource("s3", region_name=region_name)
140 |     bucket = s3.Bucket(s3_path.split("/")[2])
141 |     primaries = s3_primary_manifests(bucket.objects.all())
142 |     logging.info("Using primary manifest(s) {0}".format(
143 |         [p.key for p in primaries]
144 |     )
145 |     )
146 | 
147 |     # Now we parse the manifest to get the path to the latest billing CSV
148 |     s3_csvs = []
149 |     for pri in primaries:
150 |         manifest = json.loads(pri.get()['Body'].read())
151 |         s3_csvs.extend(manifest["reportKeys"])
152 | 
153 |     # Download each billing CSV to a temp directory and decompress
154 |     try:
155 |         cat_csv_path = os.path.join(tempdir, "billing_full.csv")
156 |         cat_csv = open(cat_csv_path, "w")
157 |         header_written = False
158 |         for s3_csv in s3_csvs:
159 |             logging.info("Downloading CSV from S3: {0}".format(s3_csv))
160 |             local_path = os.path.join(tempdir, s3_csv.split("/")[-1])
161 |             local_file = open(local_path, "w")
162 |             obj = [o for o in bucket.objects.filter(Prefix=s3_csv)][0]
163 |             local_file.write(obj.get()['Body'].read())
164 |             local_file.close()
165 |             logging.info("Decompressing CSV: {0}".format(s3_csv))
166 | 
167 |             with gzip.open(local_path, "r") as f:
168 |                 for line in f:
169 |                     if line.startswith(
170 |                         "identity/LineItemId,"
171 |                     ) and header_written:
172 |                         continue
173 |                     cat_csv.write(line)
174 |                     header_written = True
175 |             # Remove these files as we finish with them to save on disk space
176 |             os.unlink(local_path)
177 |     except Exception as e:
178 |         logging.error(
179 |             "Exception: cleaning up by removing temp directory '{0}'".format(
180 |                 tempdir
181 |             )
182 |         )
183 |         shutil.rmtree(tempdir)
184 |         raise e
185 | 
186 |     cat_csv.close()
187 |     return cat_csv_path
188 | 
189 | 
190 | class SocketWriter(object):
191 |     """Wraps a socket object with a file-like write() method."""
192 |     def __init__(self, host, port):
193 |         self.host = host
194 |         self.port = port
195 |         self._sock = None
196 | 
197 |     def write(self, data):
198 |         if self._sock is None:
199 |             logging.info("Connecting to Graphite server at {0}:{1}".format(
200 |                 self.host,
201 |                 self.port
202 |             )
203 |             )
204 |             self._sock = socket.create_connection((self.host, self.port))
205 |         return self._sock.send(data)
206 | 
207 | 
208 | class MetricLedger(object):
209 |     """Processes Row instances and generates timeseries data from them."""
210 |     def __init__(self, timeseries_patterns):
211 |         """Initializes the MetricLedger with a list of TimeseriesPattern
212 |         objects."""
213 |         self._patterns = timeseries_patterns
214 |         self._timeseries = defaultdict(lambda: defaultdict(float))
215 | 
216 |     def process(self, row):
217 |         """Adds the data from the given Row object to any appropriate
218 |         timeseries."""
219 |         # Skip entries of the wrong type
220 |         if row.content["lineItem/LineItemType"] != "Usage":
221 |             return
222 | 
223 |         # Skip non-hourly entries
224 |         if row.interval() != 3600:
225 |             return
226 |         for pat in self._patterns:
227 |             if pat.match(row):
228 |                 for metric in pat.metric_names(row):
229 |                     self._timeseries[metric][row.end_time()] += row.amount()
230 | 
231 |     def output(self, output_file):
232 |         formatter = MetricFormatter()
233 |         logging.info("Writing metrics to timeseries database")
234 |         for ts_id, ts in self._timeseries.items():
235 |             for timestamp, value in ts.items():
236 |                 output_file.write(formatter.format(ts_id, timestamp, value))
237 |         logging.info("Finished writing %d timeseries", len(self._timeseries))
238 | 
239 |     def get_timeseries(self):
240 |         """Returns self._timeseries (for tests)."""
241 |         return self._timeseries
242 | 
243 | 
244 | class MetricFormatter(object):
245 |     """Converts CSV data to Graphite format."""
246 |     def __init__(self):
247 |         prefix = os.getenv("AWSBILL_METRIC_PREFIX")
248 |         if prefix:
249 |             self._initial_pieces = [prefix]
250 |         else:
251 |             self._initial_pieces = ["awsbill"]
252 | 
253 |     def format(self, ts_id, timestamp, value):
254 |         """Returns the Graphite line that corresponds to the given timeseries
255 |         ID, timestamp, and value."""
256 |         pieces = [p for p in self._initial_pieces]
257 |         pieces.append(ts_id)
258 |         metric_name = ".".join(pieces)
259 |         return "{0} {1:.4f} {2}\n".format(
260 |             metric_name,
261 |             value,
262 |             timestamp.strftime('%s')
263 |         )
264 | 
265 | 
266 | class TimeseriesPattern(object):
267 |     """Describes a set of time series to be generated from the billing data.
268 | 
269 |     This is an abstract class. Provide an implementation of the match() and
270 |     metric_names() methods."""
271 |     def match(self, row):
272 |         """Determines whether the given Row instance matches the timeseries
273 |         pattern.
274 | 
275 |         Returns True if so."""
276 |         raise NotImplementedError("This is an abstract class")
277 | 
278 |     def metric_names(self, row):
279 |         """Returns the names of the metrics to which the given row's amount()
280 |         value should be added.
281 | 
282 |         We assume that match() has been called on the row already, and
283 |         returned True."""
284 |         raise NotImplementedError("This is an abstract class")
285 | 
286 | 
287 | class TsInstanceType(TimeseriesPattern):
288 |     """Describes per-EC2-instance-type Graphite metrics."""
289 |     def match(self, row):
290 |         if row.usage_type():
291 |             return row.usage_type().startswith("ec2-instance.")
292 |         else:
293 |             return False
294 | 
295 |     def metric_names(self, row):
296 |         return [".".join((row.region(), row.usage_type()))]
297 | 
298 | 
299 | class TsEbsStorage(TimeseriesPattern):
300 |     """Describes the per-volume-type EBS storage metrics."""
301 |     def match(self, row):
302 |         return row.usage_type().startswith("ebs.storage.")
303 | 
304 |     def metric_names(self, row):
305 |         return [".".join((row.region(), row.usage_type()))]
306 | 
307 | 
308 | class TsEbsPiops(TimeseriesPattern):
309 |     """Describes the metric for PIOPS-month costs."""
310 |     def match(self, row):
311 |         return row.usage_type() == "ebs.piops"
312 | 
313 |     def metric_names(self, row):
314 |         return [".".join((row.region(), "ebs.piops"))]
315 | 
316 | 
317 | class TsEbsIops(TimeseriesPattern):
318 |     """Describes the metric for IOPS costs."""
319 |     def match(self, row):
320 |         return row.usage_type() == "ebs.iops"
321 | 
322 |     def metric_names(self, row):
323 |         return [".".join((row.region(), "ebs.iops"))]
324 | 
325 | 
326 | class TsEbsSnapshot(TimeseriesPattern):
327 |     """Describes the metric for EBS snapshot costs."""
328 |     def match(self, row):
329 |         return row.usage_type() == "ebs.snapshot"
330 | 
331 |     def metric_names(self, row):
332 |         return [".".join((row.region(), "ebs.snapshot"))]
333 | 
334 | 
335 | class TsRdsInstanceType(TimeseriesPattern):
336 |     """Describes per-RDS-instance-type Graphite metrics."""
337 |     def match(self, row):
338 |         return (row.usage_type().startswith("rds-instance."))
339 | 
340 |     def metric_names(self, row):
341 |         return [".".join((row.region(), row.usage_type()))]
342 | 
343 | 
344 | class TsRdsStorage(TimeseriesPattern):
345 |     """Describes the per-volume-type RDS storage metrics."""
346 |     def match(self, row):
347 |         return row.usage_type().startswith("rds.storage.")
348 | 
349 |     def metric_names(self, row):
350 |         return [".".join((row.region(), row.usage_type()))]
351 | 
352 | 
353 | class TsRdsPiops(TimeseriesPattern):
354 |     """Describes the metric for RDS PIOPS-month costs."""
355 |     def match(self, row):
356 |         return row.usage_type() == "rds.piops"
357 | 
358 |     def metric_names(self, row):
359 |         return [".".join((row.region(), "rds.piops"))]
360 | 
361 | 
362 | class TsElasticacheInstanceType(TimeseriesPattern):
363 |     """Describes per-ElastiCache-instance-type Graphite metrics."""
364 |     def match(self, row):
365 |         return (row.usage_type().startswith("elasticache-instance."))
366 | 
367 |     def metric_names(self, row):
368 |         return [".".join((row.region(), row.usage_type()))]
369 | 
370 | 
371 | class TsRegionTotal(TimeseriesPattern):
372 |     """Describes a Graphite metric containing the sum of all hourly costs per
373 |     region.
374 | 
375 |     This includes costs that we don't explicitly recognize and break out
376 |     into individual metrics. Any cost that shows up in the billing report
377 |     will go into this metric."""
378 |     def match(self, row):
379 |         return True
380 | 
381 |     def metric_names(self, row):
382 |         return ["total-cost.{0}".format(row.region())]
383 | 
384 | 
385 | class Row(object):
386 |     __slots__ = ["content", "_usage_type"]
387 | 
388 |     def __init__(self, col_names, row_list):
389 |         """Initializes a Row object, given the names of the CSV columns and
390 |         their values."""
391 |         self.content = dict(zip(col_names, row_list))
392 |         self._usage_type = None
393 | 
394 |     def region(self):
395 |         """Returns the normalized AWS region for the row, or 'noregion'.
396 | 
397 |         Normalized region names are like 'us-east-2', 'ap-northeast-1'."""
398 |         if self.content["product/location"] in REGION_NAMES:
399 |             # Most services have product/location set
400 |             return REGION_NAMES[self.content["product/location"]]
401 |         elif self.content["lineItem/AvailabilityZone"] and \
402 |                 self.content["lineItem/AvailabilityZone"][-1] in "1234567890":
403 |             # Some services, e.g. ElastiCache, use lineItem/AvailabilityZone
404 |             # instead
405 |             return self.content["lineItem/AvailabilityZone"]
406 |         return "noregion"
407 | 
408 |     def interval(self):
409 |         """Returns the length of the time interval to which this row
410 |         corresponds, in seconds."""
411 |         start, end = [parse_datetime(x) for x in
412 |                       self.content["identity/TimeInterval"].split("/", 1)]
413 |         return int((end - start).total_seconds())
414 | 
415 |     def usage_type(self):
416 |         """Parses the "lineItem/UsageType" field to get at the "subtype"
417 |         (my term).
418 | 
419 |         Usage types can be of many forms. Here are some examples:
420 | 
421 |             USE1-USW2-AWS-In-Bytes
422 |             Requests-RBP
423 |             Request
424 |             APN1-DataProcessing-Bytes
425 |             APN1-BoxUsage:c3.2xlarge
426 | 
427 |         It's a goddamn nightmare. We try our best. Then we return the
428 |         name of the subtype, in the format in which it'll appear in the
429 |         Graphite metric.
430 | Examples of usage types are: 431 | 432 | ec2-instance.c3-2xlarge 433 | ebs.storage.io1 434 | ebs.piops 435 | rds-instance.db-r3-large 436 | 437 | This method returns the empty string if the usage type isn't 438 | known.""" 439 | if self._usage_type is not None: 440 | return self._usage_type 441 | splut = self.content["lineItem/UsageType"].split("-", 1) 442 | if len(splut[0]) == 4 and splut[0][0:2] in ( 443 | "US", 444 | "EU", 445 | "AP", 446 | "SA" 447 | ) and splut[0].isupper() and splut[0][3].isdigit(): 448 | # Stuff before dash was probably a region code like "APN1" 449 | csv_usage_type = splut[1] 450 | else: 451 | csv_usage_type = splut[0] 452 | self._usage_type = "" 453 | 454 | # EC2 455 | if csv_usage_type.startswith("BoxUsage:"): 456 | self._usage_type = self._usage_type_ec2_instance() 457 | if csv_usage_type == "EBS:VolumeP-IOPS.piops": 458 | self._usage_type = "ebs.piops" 459 | if csv_usage_type.startswith("EBS:VolumeUsage"): 460 | self._usage_type = self._usage_type_ebs_storage() 461 | if csv_usage_type == "EBS:VolumeIOUsage": 462 | self._usage_type = "ebs.iops" 463 | if csv_usage_type == "EBS:SnapshotUsage": 464 | self._usage_type = "ebs.snapshot" 465 | 466 | # RDS 467 | if csv_usage_type.startswith("InstanceUsage:") or \ 468 | csv_usage_type.startswith("Multi-AZUsage:"): 469 | self._usage_type = self._usage_type_rds_instance() 470 | if csv_usage_type == "RDS:PIOPS" or \ 471 | csv_usage_type == "RDS:Multi-AZ-PIOPS": 472 | self._usage_type = "rds.piops" 473 | if csv_usage_type.startswith("RDS:") and \ 474 | csv_usage_type.endswith("Storage"): 475 | self._usage_type = self._usage_type_rds_storage() 476 | 477 | # ElastiCache 478 | if csv_usage_type.startswith("NodeUsage:"): 479 | self._usage_type = self._usage_type_elasticache_instance() 480 | 481 | return self._usage_type 482 | 483 | def _usage_type_ec2_instance(self): 484 | splut = self.content["lineItem/UsageType"].split(":", 1) 485 | if len(splut) < 2: 486 | return None 487 | instance_type = splut[1].replace(".", "-") 488 | return "ec2-instance.{0}".format(instance_type) 489 | 490 | def _usage_type_ebs_storage(self): 491 | if "product/volumeType" in self.content: 492 | return "ebs.storage.{0}".format( 493 | EBS_TYPES[self.content["product/volumeType"]] 494 | ) 495 | else: 496 | return "ebs.storage.unknown" 497 | 498 | def _usage_type_rds_instance(self): 499 | splut = self.content["lineItem/UsageType"].split(":", 1) 500 | if len(splut) < 2: 501 | return None 502 | instance_type = splut[1].replace(".", "-") 503 | return "rds-instance.{0}".format(instance_type) 504 | 505 | def _usage_type_rds_storage(self): 506 | line_item_description = self.content['lineItem/LineItemDescription'] 507 | volume_type = "" 508 | for substring in RDS_STORAGE_TYPES.keys(): 509 | if substring in line_item_description: 510 | volume_type = RDS_STORAGE_TYPES[substring] 511 | if volume_type == "": 512 | raise ValueError("Can't determine RDS storage type from line item description: '{0}'".format(line_item_description)) #noqa 513 | return "rds.storage.{0}".format(volume_type) 514 | 515 | def _usage_type_elasticache_instance(self): 516 | splut = self.content["lineItem/UsageType"].split(":", 1) 517 | if len(splut) < 2: 518 | return None 519 | instance_type = splut[1].replace(".", "-") 520 | return "elasticache-instance.{0}".format(instance_type) 521 | 522 | def end_time(self): 523 | return parse_datetime( 524 | self.content["identity/TimeInterval"].split("/", 1)[1] 525 | ) 526 | 527 | def tags(self): 528 | return {} 529 | 530 | def amount(self): 531 | 
return float(self.content["lineItem/BlendedCost"])
532 | 
533 | 
534 | def new_metric_ledger():
535 |     return MetricLedger([
536 |         # EC2
537 |         TsInstanceType(),
538 |         TsEbsStorage(),
539 |         TsEbsPiops(),
540 |         TsEbsIops(),
541 |         TsEbsSnapshot(),
542 |         # RDS
543 |         TsRdsInstanceType(),
544 |         TsRdsStorage(),
545 |         TsRdsPiops(),
546 |         # ElastiCache
547 |         TsElasticacheInstanceType(),
548 |         # Total
549 |         TsRegionTotal(),
550 |     ])
551 | 
552 | 
553 | def generate_metrics(csv_file, output_file):
554 |     """Generates metrics from the given CSV and writes them to the given
555 |     file-like object."""
556 |     reader = csv.reader(csv_file)
557 |     col_names = next(reader)
558 |     ledger = new_metric_ledger()
559 |     logging.info("Calculating billing metrics")
560 |     for row_list in reader:
561 |         row = Row(col_names, row_list)
562 |         ledger.process(row)
563 |     ledger.output(output_file)
564 | 
565 | if __name__ == "__main__":
566 |     logging.basicConfig(format='%(asctime)s %(message)s', level=logging.INFO)
567 |     logging.getLogger('boto').setLevel(logging.CRITICAL)
568 |     logging.getLogger('boto3').setLevel(logging.CRITICAL)
569 |     logging.getLogger('botocore').setLevel(logging.CRITICAL)
570 |     if os.getenv("REGION_NAME"):
571 |         region_name = os.getenv("REGION_NAME")
572 |     else:
573 |         region_name = 'us-west-1'
574 |     try:
575 |         tempdir = tempfile.mkdtemp(".awsbill")
576 |         csv_file = open_csv(tempdir, region_name)
577 |         output_file = open_output()
578 |         generate_metrics(csv_file, output_file)
579 |         logging.info("Removing temp directory '{0}'".format(tempdir))
580 |         shutil.rmtree(tempdir)
581 |         logging.info("Mission complete.")
582 |     except Exception as e:
583 |         logging.exception(e)
--------------------------------------------------------------------------------
/print_all_csvs.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | import gzip
3 | import json
4 | import os
5 | import shutil
6 | import sys
7 | import tempfile
8 | 
9 | import boto3
10 | 
11 | def all_s3_primary_manifests(objects):
12 |     """Returns the S3 object(s) corresponding to all primary manifests.
13 | 
14 |     `objects` should be an iterable of S3 objects."""
15 |     manifests = [o for o in objects if o.key.endswith("Manifest.json")]
16 |     # The primary manifest(s) are the ones with the fewest path segments,
17 |     # i.e. the smallest number of slashes in their keys
18 |     manifests.sort(key=lambda a: len(a.key))
19 |     n_slash = min(m.key.count("/") for m in manifests)
20 |     return [m for m in manifests if m.key.count("/") == n_slash]
21 | 
22 | 
23 | def print_all_from_s3(s3_path, tempdir, region_name):
24 |     """Outputs all hourly billing reports from the given S3 path to stdout."""
25 |     s3 = boto3.resource("s3", region_name=region_name)
26 |     bucket = s3.Bucket(s3_path.split("/")[2])
27 |     primaries = all_s3_primary_manifests(bucket.objects.all())
28 | 
29 |     # Now we parse the manifests to get the paths to all the billing CSVs
30 |     s3_csvs = []
31 |     for pri in primaries:
32 |         manifest = json.loads(pri.get()['Body'].read())
33 |         s3_csvs.extend(manifest["reportKeys"])
34 | 
35 |     # Download each billing CSV to a temp directory and decompress
36 |     header_written = False
37 |     for s3_csv in s3_csvs:
38 |         local_path = os.path.join(tempdir, s3_csv.split("/")[-1])
39 |         local_file = open(local_path, "w")
40 |         obj = [o for o in bucket.objects.filter(Prefix=s3_csv)][0]
41 |         local_file.write(obj.get()['Body'].read())
42 |         local_file.close()
43 | 
44 |         with gzip.open(local_path, "r") as f:
45 |             for line in f:
46 |                 if line.startswith(
47 |                     "identity/LineItemId,"
48 |                 ) and header_written:
49 |                     continue
50 |                 sys.stdout.write(line)
51 |                 header_written = True
52 |         # Remove these files as we finish with them to save on disk space
53 |         os.unlink(local_path)
54 | 
55 | if __name__ == "__main__":
56 |     if os.getenv("REGION_NAME"):
57 |         region_name = os.getenv("REGION_NAME")
58 |     else:
59 |         region_name = 'us-west-1'
60 | 
61 |     tempdir = tempfile.mkdtemp(".awsbill")
62 |     print_all_from_s3(os.getenv("AWSBILL_REPORT_PATH"), tempdir, region_name)
63 |     shutil.rmtree(tempdir)
--------------------------------------------------------------------------------
/redact_csv.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | """Turns an hourly billing CSV into one we can test against.
3 | 
4 | We redact anything proprietary, including:
5 | 
6 | * tag names and values
7 | * cost values
8 | * instance IDs
9 | * line item IDs
10 | * account IDs
11 | 
12 | We write the redacted CSV to stdout."""
13 | 
14 | import sys
15 | import csv
16 | import random
17 | 
18 | from awsbill2graphite import Row
19 | 
20 | ALPHA = "abcdefghijklmnopqrstuvwxyz"
21 | 
22 | INCLUDED_COLS = set((
23 |     "identity/TimeInterval",
24 |     "lineItem/LineItemType",
25 |     "product/location",
26 |     "product/volumeType",
27 | ))
28 | 
29 | def make_alpha(n):
30 |     """Returns a lowercase alphabetic string n characters long."""
31 |     global ALPHA
32 |     return "".join((random.choice(ALPHA) for i in range(n)))
33 | 
34 | def make_instance_type(instance_type):
35 |     """Returns a random instance type string of the same kind as the given one.
36 | 37 | For example, if instance_type is "db.r3.large", we'll return an instance 38 | type starting with "db.".""" 39 | splut = instance_type.split(".") 40 | splut[-2] = random.choice(("t2", "c4", "m4")) 41 | splut[-1] = random.choice(("medium", "large", "2xlarge")) 42 | return ".".join(splut) 43 | 44 | if __name__ == "__main__": 45 | reader = csv.reader(open(sys.argv[1], "rb")) 46 | writer = csv.writer(sys.stdout) 47 | col_names = reader.next() 48 | 49 | # Redact tag names 50 | for i in range(len(col_names)): 51 | if col_names[i].startswith("resourceTags/user:"): 52 | col_names[i] = "resourceTags/user:{0}".format(make_alpha(10)) 53 | writer.writerow(col_names) 54 | 55 | for row_list in reader: 56 | row = [] 57 | for i in range(len(row_list)): 58 | col_name = col_names[i] 59 | col_val = row_list[i] 60 | 61 | if col_name in INCLUDED_COLS: 62 | row.append(col_val) 63 | elif col_name.endswith("Cost"): 64 | row.append(round(random.random()*10., 8)) 65 | elif col_name.startswith("resourceTags/user:"): 66 | row.append(col_val) 67 | elif col_name == "lineItem/UsageType" and "Usage:" in col_val: 68 | splut = col_val.rsplit(":", 1) 69 | splut[-1] = make_instance_type(splut[-1]) 70 | row.append(":".join(splut)) 71 | elif col_name == "lineItem/UsageType": 72 | row.append(col_val) 73 | else: 74 | row.append("") 75 | writer.writerow(row) 76 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | boto3==1.3.0 2 | botocore==1.4.11 3 | python-dateutil==2.5.2 4 | six==1.10.0 5 | -------------------------------------------------------------------------------- /static/dashboard.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/danslimmon/awsbill2graphite/98f60d7c8cf8784f5ad46322310994b86403ebde/static/dashboard.png -------------------------------------------------------------------------------- /static/grafana_dashboard.json: -------------------------------------------------------------------------------- 1 | { 2 | "id": 35, 3 | "title": "aws bill", 4 | "originalTitle": "aws bill", 5 | "tags": [], 6 | "style": "dark", 7 | "timezone": "browser", 8 | "editable": true, 9 | "hideControls": false, 10 | "sharedCrosshair": false, 11 | "rows": [ 12 | { 13 | "collapse": false, 14 | "editable": true, 15 | "height": "250px", 16 | "panels": [ 17 | { 18 | "aliasColors": {}, 19 | "bars": false, 20 | "datasource": null, 21 | "editable": true, 22 | "error": false, 23 | "fill": 5, 24 | "grid": { 25 | "leftLogBase": 1, 26 | "leftMax": null, 27 | "leftMin": 0, 28 | "rightLogBase": 1, 29 | "rightMax": null, 30 | "rightMin": null, 31 | "threshold1": null, 32 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 33 | "threshold2": null, 34 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 35 | }, 36 | "id": 2, 37 | "leftYAxisLabel": "$/month", 38 | "legend": { 39 | "avg": false, 40 | "current": false, 41 | "max": false, 42 | "min": false, 43 | "show": true, 44 | "total": false, 45 | "values": false 46 | }, 47 | "lines": true, 48 | "linewidth": 3, 49 | "links": [], 50 | "nullPointMode": "connected", 51 | "percentage": false, 52 | "pointradius": 5, 53 | "points": false, 54 | "renderer": "flot", 55 | "seriesOverrides": [], 56 | "span": 12, 57 | "stack": true, 58 | "steppedLine": false, 59 | "targets": [ 60 | { 61 | "target": "aliasByNode(scale(summarize(awsbill.total-cost.*, \"1day\", \"sum\"), 30), 2)", 62 | "textEditor": true 63 | } 64 | 
], 65 | "timeFrom": null, 66 | "timeShift": null, 67 | "title": "Total cost per month (USD) (Don't trust last data point)", 68 | "tooltip": { 69 | "shared": true, 70 | "value_type": "cumulative" 71 | }, 72 | "type": "graph", 73 | "x-axis": true, 74 | "y-axis": true, 75 | "y_formats": [ 76 | "short", 77 | "short" 78 | ] 79 | } 80 | ], 81 | "title": "Total" 82 | }, 83 | { 84 | "collapse": false, 85 | "editable": true, 86 | "height": "250px", 87 | "panels": [ 88 | { 89 | "aliasColors": {}, 90 | "bars": false, 91 | "datasource": null, 92 | "editable": true, 93 | "error": false, 94 | "fill": 0, 95 | "grid": { 96 | "leftLogBase": 1, 97 | "leftMax": null, 98 | "leftMin": 0, 99 | "rightLogBase": 1, 100 | "rightMax": null, 101 | "rightMin": 0, 102 | "threshold1": null, 103 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 104 | "threshold2": null, 105 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 106 | }, 107 | "id": 1, 108 | "leftYAxisLabel": "$/month", 109 | "legend": { 110 | "avg": false, 111 | "current": false, 112 | "max": false, 113 | "min": false, 114 | "show": true, 115 | "total": false, 116 | "values": false 117 | }, 118 | "lines": true, 119 | "linewidth": 4, 120 | "links": [], 121 | "nullPointMode": "connected", 122 | "percentage": false, 123 | "pointradius": 5, 124 | "points": false, 125 | "renderer": "flot", 126 | "rightYAxisLabel": "", 127 | "seriesOverrides": [], 128 | "span": 12, 129 | "stack": false, 130 | "steppedLine": false, 131 | "targets": [ 132 | { 133 | "target": "aliasByNode(scale(groupByNode(awsbill.*.ec2-instance.*, 1, \"sumSeries\"), 720), 0)", 134 | "textEditor": true 135 | } 136 | ], 137 | "timeFrom": null, 138 | "timeShift": null, 139 | "title": "Monthly EC2 Instance Costs by Region (USD)", 140 | "tooltip": { 141 | "shared": true, 142 | "value_type": "cumulative" 143 | }, 144 | "type": "graph", 145 | "x-axis": true, 146 | "y-axis": true, 147 | "y_formats": [ 148 | "short", 149 | "short" 150 | ] 151 | } 152 | ], 153 | "title": "EC2 Instances" 154 | }, 155 | { 156 | "collapse": false, 157 | "editable": true, 158 | "height": "250px", 159 | "panels": [ 160 | { 161 | "aliasColors": {}, 162 | "bars": false, 163 | "datasource": null, 164 | "editable": true, 165 | "error": false, 166 | "fill": 0, 167 | "grid": { 168 | "leftLogBase": 1, 169 | "leftMax": null, 170 | "leftMin": 0, 171 | "rightLogBase": 1, 172 | "rightMax": null, 173 | "rightMin": null, 174 | "threshold1": null, 175 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 176 | "threshold2": null, 177 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 178 | }, 179 | "id": 3, 180 | "leftYAxisLabel": "$/month", 181 | "legend": { 182 | "avg": false, 183 | "current": false, 184 | "max": false, 185 | "min": false, 186 | "show": true, 187 | "total": false, 188 | "values": false 189 | }, 190 | "lines": true, 191 | "linewidth": 3, 192 | "links": [], 193 | "nullPointMode": "connected", 194 | "percentage": false, 195 | "pointradius": 5, 196 | "points": false, 197 | "renderer": "flot", 198 | "seriesOverrides": [], 199 | "span": 4, 200 | "stack": false, 201 | "steppedLine": false, 202 | "targets": [ 203 | { 204 | "target": "aliasByNode(scale(groupByNode(awsbill.*.ebs.storage.*, 1, \"sumSeries\"), 720), 0)", 205 | "textEditor": true 206 | } 207 | ], 208 | "timeFrom": null, 209 | "timeShift": null, 210 | "title": "Monthly EBS Storage cost (USD)", 211 | "tooltip": { 212 | "shared": true, 213 | "value_type": "cumulative" 214 | }, 215 | "type": "graph", 216 | "x-axis": true, 217 | "y-axis": true, 218 | "y_formats": [ 219 | "short", 220 | 
"short" 221 | ] 222 | }, 223 | { 224 | "aliasColors": {}, 225 | "bars": false, 226 | "datasource": null, 227 | "editable": true, 228 | "error": false, 229 | "fill": 0, 230 | "grid": { 231 | "leftLogBase": 1, 232 | "leftMax": null, 233 | "leftMin": 0, 234 | "rightLogBase": 1, 235 | "rightMax": null, 236 | "rightMin": null, 237 | "threshold1": null, 238 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 239 | "threshold2": null, 240 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 241 | }, 242 | "id": 4, 243 | "leftYAxisLabel": "$/month", 244 | "legend": { 245 | "avg": false, 246 | "current": false, 247 | "max": false, 248 | "min": false, 249 | "show": true, 250 | "total": false, 251 | "values": false 252 | }, 253 | "lines": true, 254 | "linewidth": 3, 255 | "links": [], 256 | "nullPointMode": "connected", 257 | "percentage": false, 258 | "pointradius": 5, 259 | "points": false, 260 | "renderer": "flot", 261 | "seriesOverrides": [], 262 | "span": 4, 263 | "stack": false, 264 | "steppedLine": false, 265 | "targets": [ 266 | { 267 | "target": "aliasByNode(scale(groupByNode(awsbill.*.ebs.piops, 1, \"sumSeries\"), 720), 0)", 268 | "textEditor": true 269 | } 270 | ], 271 | "timeFrom": null, 272 | "timeShift": null, 273 | "title": "Monthly EBS PIOPS cost (USD)", 274 | "tooltip": { 275 | "shared": true, 276 | "value_type": "cumulative" 277 | }, 278 | "type": "graph", 279 | "x-axis": true, 280 | "y-axis": true, 281 | "y_formats": [ 282 | "short", 283 | "short" 284 | ] 285 | }, 286 | { 287 | "aliasColors": {}, 288 | "bars": false, 289 | "datasource": null, 290 | "editable": true, 291 | "error": false, 292 | "fill": 0, 293 | "grid": { 294 | "leftLogBase": 1, 295 | "leftMax": null, 296 | "leftMin": 0, 297 | "rightLogBase": 1, 298 | "rightMax": null, 299 | "rightMin": null, 300 | "threshold1": null, 301 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 302 | "threshold2": null, 303 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 304 | }, 305 | "id": 5, 306 | "leftYAxisLabel": "$/month", 307 | "legend": { 308 | "avg": false, 309 | "current": false, 310 | "max": false, 311 | "min": false, 312 | "show": true, 313 | "total": false, 314 | "values": false 315 | }, 316 | "lines": true, 317 | "linewidth": 3, 318 | "links": [], 319 | "nullPointMode": "connected", 320 | "percentage": false, 321 | "pointradius": 5, 322 | "points": false, 323 | "renderer": "flot", 324 | "seriesOverrides": [], 325 | "span": 4, 326 | "stack": false, 327 | "steppedLine": false, 328 | "targets": [ 329 | { 330 | "target": "aliasByNode(scale(groupByNode(awsbill.*.ebs.iops, 1, \"sumSeries\"), 720), 0)", 331 | "textEditor": true 332 | } 333 | ], 334 | "timeFrom": null, 335 | "timeShift": null, 336 | "title": "Monthly EBS IOPS cost (USD)", 337 | "tooltip": { 338 | "shared": true, 339 | "value_type": "cumulative" 340 | }, 341 | "type": "graph", 342 | "x-axis": true, 343 | "y-axis": true, 344 | "y_formats": [ 345 | "short", 346 | "short" 347 | ] 348 | } 349 | ], 350 | "title": "EBS" 351 | }, 352 | { 353 | "collapse": false, 354 | "editable": true, 355 | "height": "250px", 356 | "panels": [ 357 | { 358 | "aliasColors": {}, 359 | "bars": false, 360 | "datasource": null, 361 | "editable": true, 362 | "error": false, 363 | "fill": 0, 364 | "grid": { 365 | "leftLogBase": 1, 366 | "leftMax": null, 367 | "leftMin": 0, 368 | "rightLogBase": 1, 369 | "rightMax": null, 370 | "rightMin": null, 371 | "threshold1": null, 372 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 373 | "threshold2": null, 374 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 375 | 
}, 376 | "id": 6, 377 | "leftYAxisLabel": "$/month", 378 | "legend": { 379 | "avg": false, 380 | "current": false, 381 | "max": false, 382 | "min": false, 383 | "show": true, 384 | "total": false, 385 | "values": false 386 | }, 387 | "lines": true, 388 | "linewidth": 3, 389 | "links": [], 390 | "nullPointMode": "connected", 391 | "percentage": false, 392 | "pointradius": 5, 393 | "points": false, 394 | "renderer": "flot", 395 | "seriesOverrides": [], 396 | "span": 12, 397 | "stack": false, 398 | "steppedLine": false, 399 | "targets": [ 400 | { 401 | "target": "aliasByNode(scale(awsbill.*.ebs.snapshot, 30), 1)", 402 | "textEditor": true 403 | } 404 | ], 405 | "timeFrom": null, 406 | "timeShift": null, 407 | "title": "Monthly EBS snapshot cost", 408 | "tooltip": { 409 | "shared": true, 410 | "value_type": "cumulative" 411 | }, 412 | "type": "graph", 413 | "x-axis": true, 414 | "y-axis": true, 415 | "y_formats": [ 416 | "short", 417 | "short" 418 | ] 419 | } 420 | ], 421 | "title": "Snapshots" 422 | }, 423 | { 424 | "collapse": false, 425 | "editable": true, 426 | "height": "250px", 427 | "panels": [ 428 | { 429 | "aliasColors": {}, 430 | "bars": false, 431 | "datasource": null, 432 | "editable": true, 433 | "error": false, 434 | "fill": 0, 435 | "grid": { 436 | "leftLogBase": 1, 437 | "leftMax": null, 438 | "leftMin": null, 439 | "rightLogBase": 1, 440 | "rightMax": null, 441 | "rightMin": null, 442 | "threshold1": null, 443 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 444 | "threshold2": null, 445 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 446 | }, 447 | "id": 8, 448 | "leftYAxisLabel": "$/month", 449 | "legend": { 450 | "avg": false, 451 | "current": false, 452 | "max": false, 453 | "min": false, 454 | "show": true, 455 | "total": false, 456 | "values": false 457 | }, 458 | "lines": true, 459 | "linewidth": 3, 460 | "links": [], 461 | "nullPointMode": "connected", 462 | "percentage": false, 463 | "pointradius": 5, 464 | "points": false, 465 | "renderer": "flot", 466 | "seriesOverrides": [], 467 | "span": 4, 468 | "stack": false, 469 | "steppedLine": false, 470 | "targets": [ 471 | { 472 | "target": "aliasByNode(scale(groupByNode(awsbill.*.rds.storage.*, 1, \"sumSeries\"), 720), 0)", 473 | "textEditor": true 474 | } 475 | ], 476 | "timeFrom": null, 477 | "timeShift": null, 478 | "title": "Monthly RDS storage cost (USD)", 479 | "tooltip": { 480 | "shared": true, 481 | "value_type": "cumulative" 482 | }, 483 | "type": "graph", 484 | "x-axis": true, 485 | "y-axis": true, 486 | "y_formats": [ 487 | "short", 488 | "short" 489 | ] 490 | }, 491 | { 492 | "aliasColors": {}, 493 | "bars": false, 494 | "datasource": null, 495 | "editable": true, 496 | "error": false, 497 | "fill": 0, 498 | "grid": { 499 | "leftLogBase": 1, 500 | "leftMax": null, 501 | "leftMin": 0, 502 | "rightLogBase": 1, 503 | "rightMax": null, 504 | "rightMin": null, 505 | "threshold1": null, 506 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 507 | "threshold2": null, 508 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 509 | }, 510 | "id": 9, 511 | "leftYAxisLabel": "$/month", 512 | "legend": { 513 | "avg": false, 514 | "current": false, 515 | "max": false, 516 | "min": false, 517 | "show": true, 518 | "total": false, 519 | "values": false 520 | }, 521 | "lines": true, 522 | "linewidth": 3, 523 | "links": [], 524 | "nullPointMode": "connected", 525 | "percentage": false, 526 | "pointradius": 5, 527 | "points": false, 528 | "renderer": "flot", 529 | "seriesOverrides": [], 530 | "span": 4, 531 | "stack": false, 532 | 
"steppedLine": false, 533 | "targets": [ 534 | { 535 | "target": "aliasByNode(scale(awsbill.*.rds.piops, 720), 1)", 536 | "textEditor": true 537 | } 538 | ], 539 | "timeFrom": null, 540 | "timeShift": null, 541 | "title": "Monthly RDS PIOPS cost (USD)", 542 | "tooltip": { 543 | "shared": true, 544 | "value_type": "cumulative" 545 | }, 546 | "type": "graph", 547 | "x-axis": true, 548 | "y-axis": true, 549 | "y_formats": [ 550 | "short", 551 | "short" 552 | ] 553 | }, 554 | { 555 | "aliasColors": {}, 556 | "bars": false, 557 | "datasource": null, 558 | "editable": true, 559 | "error": false, 560 | "fill": 0, 561 | "grid": { 562 | "leftLogBase": 1, 563 | "leftMax": null, 564 | "leftMin": 0, 565 | "rightLogBase": 1, 566 | "rightMax": null, 567 | "rightMin": null, 568 | "threshold1": null, 569 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 570 | "threshold2": null, 571 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 572 | }, 573 | "id": 7, 574 | "leftYAxisLabel": "$/month", 575 | "legend": { 576 | "avg": false, 577 | "current": false, 578 | "max": false, 579 | "min": false, 580 | "show": true, 581 | "total": false, 582 | "values": false 583 | }, 584 | "lines": true, 585 | "linewidth": 3, 586 | "links": [], 587 | "nullPointMode": "connected", 588 | "percentage": false, 589 | "pointradius": 5, 590 | "points": false, 591 | "renderer": "flot", 592 | "seriesOverrides": [], 593 | "span": 4, 594 | "stack": false, 595 | "steppedLine": false, 596 | "targets": [ 597 | { 598 | "target": "aliasByNode(scale(groupByNode(awsbill.*.rds-instance.*, 1, \"sumSeries\"), 720), 0)", 599 | "textEditor": true 600 | } 601 | ], 602 | "timeFrom": null, 603 | "timeShift": null, 604 | "title": "Monthly RDS instance cost (USD)", 605 | "tooltip": { 606 | "shared": true, 607 | "value_type": "cumulative" 608 | }, 609 | "type": "graph", 610 | "x-axis": true, 611 | "y-axis": true, 612 | "y_formats": [ 613 | "short", 614 | "short" 615 | ] 616 | } 617 | ], 618 | "title": "New row" 619 | }, 620 | { 621 | "title": "New row", 622 | "height": "250px", 623 | "editable": true, 624 | "collapse": false, 625 | "panels": [ 626 | { 627 | "title": "Monthly ElastiCache cost (USD)", 628 | "error": false, 629 | "span": 12, 630 | "editable": true, 631 | "type": "graph", 632 | "id": 10, 633 | "datasource": null, 634 | "renderer": "flot", 635 | "x-axis": true, 636 | "y-axis": true, 637 | "y_formats": [ 638 | "short", 639 | "short" 640 | ], 641 | "grid": { 642 | "leftLogBase": 1, 643 | "leftMax": null, 644 | "rightMax": null, 645 | "leftMin": 0, 646 | "rightMin": null, 647 | "rightLogBase": 1, 648 | "threshold1": null, 649 | "threshold2": null, 650 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 651 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 652 | }, 653 | "lines": true, 654 | "fill": 0, 655 | "linewidth": 3, 656 | "points": false, 657 | "pointradius": 5, 658 | "bars": false, 659 | "stack": false, 660 | "percentage": false, 661 | "legend": { 662 | "show": true, 663 | "values": false, 664 | "min": false, 665 | "max": false, 666 | "current": false, 667 | "total": false, 668 | "avg": false 669 | }, 670 | "nullPointMode": "connected", 671 | "steppedLine": false, 672 | "tooltip": { 673 | "value_type": "cumulative", 674 | "shared": true 675 | }, 676 | "timeFrom": null, 677 | "timeShift": null, 678 | "targets": [ 679 | { 680 | "target": "aliasByNode(scale(groupByNode(awsbill.*.elasticache-instance.*, 1, \"sumSeries\"), 720), 0)", 681 | "textEditor": true 682 | } 683 | ], 684 | "aliasColors": {}, 685 | "seriesOverrides": [], 686 | "links": [], 687 
| "leftYAxisLabel": "$/month" 688 | } 689 | ] 690 | } 691 | ], 692 | "nav": [ 693 | { 694 | "collapse": false, 695 | "enable": true, 696 | "notice": false, 697 | "now": true, 698 | "refresh_intervals": [ 699 | "5s", 700 | "10s", 701 | "30s", 702 | "1m", 703 | "5m", 704 | "15m", 705 | "30m", 706 | "1h", 707 | "2h", 708 | "1d" 709 | ], 710 | "status": "Stable", 711 | "time_options": [ 712 | "5m", 713 | "15m", 714 | "1h", 715 | "6h", 716 | "12h", 717 | "24h", 718 | "2d", 719 | "7d", 720 | "30d" 721 | ], 722 | "type": "timepicker" 723 | } 724 | ], 725 | "time": { 726 | "from": "now-30d", 727 | "to": "now" 728 | }, 729 | "templating": { 730 | "list": [] 731 | }, 732 | "annotations": { 733 | "list": [] 734 | }, 735 | "schemaVersion": 6, 736 | "version": 10, 737 | "links": [] 738 | } 739 | -------------------------------------------------------------------------------- /test_all.py: -------------------------------------------------------------------------------- 1 | import csv 2 | import random 3 | import unittest 4 | from datetime import datetime 5 | 6 | import awsbill2graphite as a2g 7 | 8 | class LedgerTest(unittest.TestCase): 9 | def setUp(self): 10 | ledger = a2g.new_metric_ledger() 11 | reader = csv.reader(open("test_data/hourly_billing-1.csv", "rb")) 12 | col_names = reader.next() 13 | for row_list in reader: 14 | row = a2g.Row(col_names, row_list) 15 | ledger.process(row) 16 | self.timeseries = ledger.get_timeseries() 17 | 18 | def assert_timeseries_equal(self, metric_name, expected, received): 19 | """Determines whether the two given timeseries dicts are equal (within a tolerance).""" 20 | for k in expected.keys(): 21 | if not received.has_key(k): 22 | self.fail("Key {0} missing from received timeseries '{1}'".format(k, metric_name)) 23 | return 24 | if abs(expected[k] - received[k]) > .00001: 25 | self.fail("Value for {0} for received timeseries {1} is {2}; should be {3}".format( 26 | k, metric_name, expected[k], received[k])) 27 | return 28 | for k in received.keys(): 29 | if not expected.has_key(k): 30 | self.fail("Unexpected key {0} in received timeseries '{1}'".format(k, metric_name)) 31 | return 32 | 33 | def testTsInstanceType(self): 34 | self.assertTrue(self.timeseries.has_key("us-west-1.ec2-instance.m4-2xlarge")) 35 | self.assert_timeseries_equal( 36 | "us-west-1.ec2-instance.m4-2xlarge", 37 | self.timeseries["us-west-1.ec2-instance.m4-2xlarge"], 38 | { 39 | datetime.fromtimestamp(1459746000): 31.497950, 40 | datetime.fromtimestamp(1459764000): 26.083113, 41 | datetime.fromtimestamp(1459782000): 61.615628, 42 | datetime.fromtimestamp(1459800000): 63.319794, 43 | datetime.fromtimestamp(1459789200): 42.888862, 44 | datetime.fromtimestamp(1459807200): 33.440607, 45 | datetime.fromtimestamp(1459753200): 49.640219, 46 | datetime.fromtimestamp(1459771200): 47.892134, 47 | datetime.fromtimestamp(1459735200): 43.360197, 48 | datetime.fromtimestamp(1459814400): 84.484617, 49 | datetime.fromtimestamp(1459760400): 55.846821, 50 | datetime.fromtimestamp(1459778400): 29.705564, 51 | datetime.fromtimestamp(1459742400): 63.989894, 52 | datetime.fromtimestamp(1459796400): 54.198285, 53 | datetime.fromtimestamp(1459731600): 47.450255, 54 | datetime.fromtimestamp(1459749600): 77.140611, 55 | datetime.fromtimestamp(1459803600): 78.267747, 56 | datetime.fromtimestamp(1459767600): 61.143072, 57 | datetime.fromtimestamp(1459785600): 39.729129, 58 | datetime.fromtimestamp(1459810800): 48.819524, 59 | datetime.fromtimestamp(1459774800): 44.610415, 60 | datetime.fromtimestamp(1459792800): 19.039679, 61 
| datetime.fromtimestamp(1459738800): 41.609403, 62 | datetime.fromtimestamp(1459756800): 47.254336, 63 | } 64 | ) 65 | self.assertTrue(self.timeseries.has_key("ap-northeast-1.ec2-instance.t2-medium")) 66 | self.assert_timeseries_equal( 67 | "ap-northeast-1.ec2-instance.t2-medium", 68 | self.timeseries["ap-northeast-1.ec2-instance.t2-medium"], 69 | { 70 | datetime.fromtimestamp(1459807200): 9.228804, 71 | datetime.fromtimestamp(1459753200): 5.313574, 72 | datetime.fromtimestamp(1459771200): 4.238844, 73 | datetime.fromtimestamp(1459746000): 13.584161, 74 | datetime.fromtimestamp(1459764000): 5.319477, 75 | datetime.fromtimestamp(1459760400): 13.284314, 76 | datetime.fromtimestamp(1459778400): 8.792418, 77 | datetime.fromtimestamp(1459742400): 4.248000, 78 | datetime.fromtimestamp(1459735200): 3.921303, 79 | datetime.fromtimestamp(1459796400): 9.269115, 80 | datetime.fromtimestamp(1459810800): 13.109077, 81 | datetime.fromtimestamp(1459803600): 15.168434, 82 | datetime.fromtimestamp(1459785600): 3.440763, 83 | datetime.fromtimestamp(1459782000): 4.131503, 84 | datetime.fromtimestamp(1459800000): 8.607207, 85 | datetime.fromtimestamp(1459774800): 2.417751, 86 | datetime.fromtimestamp(1459814400): 5.292426, 87 | datetime.fromtimestamp(1459756800): 6.206031, 88 | } 89 | ) 90 | 91 | def testS3PrimaryManifest(self): 92 | class _S3Obj: 93 | def __init__(self, k): self.key = k 94 | # Make sure sorting the objects without `key=` causes test to fail 95 | def __lt__(self, o): return (random.randint(0, 1) == 0) 96 | 97 | manifests = [_S3Obj(k) for k in [ 98 | "prefix/hourly_billing/20160201-20160301/hourly_billing-Manifest.json", 99 | "prefix/hourly_billing/20160301-20160401/hourly_billing-Manifest.json", 100 | "prefix/hourly_billing/20160401-20160501/11c0a000-107e-11e6-813f-881fa1019b9e/hourly_billing-1.csv.gz", 101 | "prefix/hourly_billing/20160401-20160501/11c0a000-107e-11e6-813f-881fa1019b9e/hourly_billing-Manifest.json", 102 | "prefix/hourly_billing/20160501-20160601/hourly_billing-Manifest.json", 103 | "prefix/hourly_billing/20160401-20160501/21b3c44a-107e-11e6-8355-881fa1019b9e/hourly_billing-1.csv.gz", 104 | "prefix/hourly_billing/20160401-20160501/21b3c44a-107e-11e6-8355-881fa1019b9e/hourly_billing-2.csv.gz", 105 | "prefix/hourly_billing/20160401-20160501/21b3c44a-107e-11e6-8355-881fa1019b9e/hourly_billing-Manifest.json", 106 | "prefix/hourly_billing/20160501-20160601/2e6de863-107e-11e6-97e6-881fa1019b9e/hourly_billing-1.csv.gz", 107 | "prefix/hourly_billing/20160501-20160601/2e6de863-107e-11e6-97e6-881fa1019b9e/hourly_billing-Manifest.json", 108 | "prefix/hourly_billing/20160401-20160501/hourly_billing-Manifest.json", 109 | "prefix/hourly_billing/20160501-20160601/3d1ed007-107e-11e6-acd4-881fa1019b9e/hourly_billing-1.csv.gz", 110 | "prefix/hourly_billing/20160501-20160601/3d1ed007-107e-11e6-acd4-881fa1019b9e/hourly_billing-Manifest.json", 111 | "prefix/hourly_billing/20160301-20160401/fbc0aa99-1083-11e6-918b-881fa1019b9e/hourly_billing-1.csv.gz", 112 | "prefix/hourly_billing/20160301-20160401/fbc0aa99-1083-11e6-918b-881fa1019b9e/hourly_billing-Manifest.json", 113 | ]] 114 | primaries = a2g.s3_primary_manifests(manifests) 115 | self.assertEqual(primaries[0].key, 116 | "prefix/hourly_billing/20160401-20160501/hourly_billing-Manifest.json") 117 | self.assertEqual(primaries[1].key, 118 | "prefix/hourly_billing/20160501-20160601/hourly_billing-Manifest.json") 119 | --------------------------------------------------------------------------------