├── .gitignore
├── DEV.md
├── LICENSE
├── README.md
├── awsbill2graphite.py
├── print_all_csvs.py
├── redact_csv.py
├── requirements.txt
├── static
│   ├── dashboard.png
│   └── grafana_dashboard.json
├── test_all.py
└── test_data
    └── hourly_billing-1.csv

/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 | 
6 | # C extensions
7 | *.so
8 | 
9 | # Distribution / packaging
10 | .Python
11 | env/
12 | build/
13 | develop-eggs/
14 | dist/
15 | downloads/
16 | eggs/
17 | .eggs/
18 | lib/
19 | lib64/
20 | parts/
21 | sdist/
22 | var/
23 | *.egg-info/
24 | .installed.cfg
25 | *.egg
26 | 
27 | # PyInstaller
28 | # Usually these files are written by a python script from a template
29 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
30 | *.manifest
31 | *.spec
32 | 
33 | # Installer logs
34 | pip-log.txt
35 | pip-delete-this-directory.txt
36 | 
37 | # Unit test / coverage reports
38 | htmlcov/
39 | .tox/
40 | .coverage
41 | .coverage.*
42 | .cache
43 | nosetests.xml
44 | coverage.xml
45 | *,cover
46 | .hypothesis/
47 | 
48 | # Translations
49 | *.mo
50 | *.pot
51 | 
52 | # Django stuff:
53 | *.log
54 | 
55 | # Sphinx documentation
56 | docs/_build/
57 | 
58 | # PyBuilder
59 | target/
60 | 
61 | # IPython Notebook
62 | .ipynb_checkpoints
63 | 
64 | # env variables during development
65 | .env
66 | .env.gpg
67 | 
68 | # editor swap files
69 | .*.sw?
70 | .sw?
71 | *~
72 | 
--------------------------------------------------------------------------------
/DEV.md:
--------------------------------------------------------------------------------
1 | # Hacking on awsbill2graphite
2 | 
3 | ## Running tests
4 | 
5 | In the top-level directory, run:
6 | 
7 |     nosetests
8 | 
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 | 
3 | Copyright (c) 2016 Dan Slimmon
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # awsbill2graphite
2 | 
3 | `awsbill2graphite` is a script that converts AWS hourly billing CSVs to Graphite metrics.
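To give a rough idea of the transformation (the row below is fabricated, and only the columns that matter for the mapping are shown), an hourly report row like

    identity/TimeInterval,lineItem/LineItemType,product/location,lineItem/UsageType,lineItem/BlendedCost
    2016-04-04T05:00:00Z/2016-04-04T06:00:00Z,Usage,Asia Pacific (Tokyo),APN1-BoxUsage:c3.2xlarge,4.2480

becomes a Graphite data point like

    awsbill.ap-northeast-1.ec2-instance.c3-2xlarge 4.2480 1459749600

where the timestamp is the epoch second (here assumed UTC) of the end of the hour the row covers.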
4 | 
5 | ![dashboard screenshot](https://raw.githubusercontent.com/danslimmon/awsbill2graphite/master/static/dashboard.png)
6 | 
7 | _If you want to hack on it, check out [DEV.md](https://github.com/danslimmon/awsbill2graphite/blob/master/DEV.md)._
8 | 
9 | So far, it produces the following types of metrics:
10 | 
11 | 1. Per-region, per-EC2-instance-type cost by the hour
12 | 2. EBS metrics, including storage costs, PIOPS costs, per-million-IOPS costs, and snapshot
13 |    storage costs
14 | 3. Per-region RDS costs, including storage, PIOPS, and instance-hours
15 | 4. Per-instance-type ElastiCache costs
16 | 5. Total AWS cost by the hour
17 | 
18 | More are planned.
19 | 
20 | 
21 | ## Prep
22 | 
23 | First of all, you'll need to have hourly billing reports enabled. You can do this
24 | through the AWS billing control panel.
25 | 
26 | `awsbill2graphite` has some dependencies. We don't have a pip package yet (but we
27 | have an [issue](https://github.com/danslimmon/awsbill2graphite/issues/1) for it). To
28 | install the dependencies, go into a
29 | [virtualenv](http://docs.python-guide.org/en/latest/dev/virtualenvs/) and run
30 | 
31 |     pip install -r requirements.txt
32 | 
33 | The script will then have to be run in that virtualenv.
34 | 
35 | To prevent Graphite from creating giant, mostly-empty data files, set the
36 | following in `storage-schemas.conf`:
37 | 
38 |     [awsbill]
39 |     priority = 256
40 |     pattern = ^awsbill\.
41 |     retentions = 1h:3650d
42 | 
43 | ## Usage
44 | 
45 | First set the following environment variables:
46 | 
47 | * `AWSBILL_REPORT_PATH`: The path where the report lives. If downloading from S3, this
48 |   should be `s3://` followed by the bucket name followed by the "Report path" as defined
49 |   in the AWS billing control panel. If reading a local file, it should start with
50 |   `file://` and give the path to an hourly billing CSV file.
51 | * `AWS_ACCESS_KEY_ID`: The identifier for an AWS credentials pair that will enable access
52 |   to the bucket with billing reports in it. If you're using a local file instead of
53 |   downloading the report from S3, you can omit this.
54 | * `AWS_SECRET_ACCESS_KEY`: The secret access key that corresponds to `AWS_ACCESS_KEY_ID`.
55 |   If you're using a local file instead of downloading the report from S3, you can omit
56 |   this.
57 | * `AWSBILL_GRAPHITE_HOST`: The hostname of the Graphite server to which to write metrics.
58 |   If instead you want to output metrics to stdout, set this environment variable to
59 |   `stdout`. If the Graphite port is not the default of 2003, you may append it after a
60 |   colon.
61 | * `AWSBILL_METRIC_PREFIX`: The prefix to use for metrics written to Graphite. If absent,
62 |   metrics will begin with "`awsbill.`". If you set this, you should modify the `[awsbill]`
63 |   stanza you added to Graphite's `storage-schemas.conf` accordingly.
64 | 
65 | Then run
66 | 
67 |     awsbill2graphite.py
68 | 
69 | This will produce metrics named like so:
70 | 
71 |     PREFIX.REGION.ec2-instance.t2-micro
72 |     PREFIX.REGION.ec2-instance.c4-2xlarge
73 |     PREFIX.REGION.ebs.snapshot
74 |     PREFIX.REGION.ebs.piops
75 |     PREFIX.REGION.rds-instance.db-r3-xlarge
76 | 
77 | Each metric will have a data point every hour. This data point represents the total amount
78 | charged to your account for the hour _previous_ to the data point's timestamp.
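For example, to run against a local copy of a report and eyeball the output (the path and the dollar amounts below are made up):

    export AWSBILL_REPORT_PATH="file:///tmp/hourly_billing.csv"
    export AWSBILL_GRAPHITE_HOST="stdout"
    ./awsbill2graphite.py

Each line of output is in Graphite's plaintext protocol format (`metric-name value epoch-timestamp`), so the same stream that goes to a Graphite server can be inspected directly:

    awsbill.ap-northeast-1.ec2-instance.t2-medium 4.2480 1459749600
    awsbill.total-cost.ap-northeast-1 9.7031 1459749600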
79 | 80 | ## Making Graphite/Grafana dashboards with these metrics 81 | 82 | Here is a JSON description of a basic per-region-summary Grafana dashboard: [grafana_dashboard.json](https://github.com/danslimmon/awsbill2graphite/blob/master/static/grafana_dashboard.json). 83 | 84 | A few notes: 85 | 86 | * Snapshots are only billed once daily, so the snapshot metrics will be equal to 0 for 87 | most of their values. The value they do contain will be the cost for that _entire day_, 88 | not the hour. 89 | * At the end of a month, the billing report you get will be missing most of the final 90 | day's data. That's just how AWS hourly billing reports work. Eventually (4 or 5 days 91 | after the end of the month) they give you a final report for the month, with all the 92 | data. So in the interim, you'll have a big ugly dip in your graphs. 93 | -------------------------------------------------------------------------------- /awsbill2graphite.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | import csv 3 | import gzip 4 | import json 5 | import logging 6 | import os 7 | import re 8 | import shutil 9 | import socket 10 | import sys 11 | import tempfile 12 | from collections import defaultdict 13 | from datetime import datetime 14 | from operator import attrgetter 15 | 16 | import boto3 17 | 18 | REGION_NAMES = { 19 | "US East (N. Virginia)": "us-east-1", 20 | "US West (N. California)": "us-west-1", 21 | "US West (Oregon)": "us-west-2", 22 | "EU (Ireland)": "eu-west-1", 23 | "EU (Frankfurt)": "eu-central-1", 24 | "Asia Pacific (Tokyo)": "ap-northeast-1", 25 | "Asia Pacific (Seoul)": "ap-northeast-2", 26 | "Asia Pacific (Singapore)": "ap-southeast-1", 27 | "Asia Pacific (Sydney)": "ap-southeast-2", 28 | "South America (Sao Paulo)": "sa-east-1", 29 | } 30 | 31 | EBS_TYPES = { 32 | "Magnetic": "standard", 33 | "General Purpose": "gp2", 34 | "Provisioned IOPS": "io1", 35 | "Unknown Storage": "unknown" 36 | } 37 | 38 | # As of 2016-09-01, the hourly billing report doesn't have data in the 39 | # 'product/volumeType' column for RDS storage anymore. We have to check 40 | # for a substring of 'lineItem/LineItemDescription' instead. 41 | RDS_STORAGE_TYPES = { 42 | "Provisioned IOPS Storage": "io1", 43 | "provisioned GP2 storage": "gp2", 44 | } 45 | 46 | 47 | def parse_datetime(timestamp): 48 | """Parses a timestamp in the format 2006-01-02T15:04:05Z.""" 49 | # This way is about 31x faster than arrow.get() 50 | # and 6.5x faster than datetime.strptime() 51 | year = int(timestamp[0:4]) 52 | month = int(timestamp[5:7]) 53 | day = int(timestamp[8:10]) 54 | hour = int(timestamp[11:13]) 55 | minute = int(timestamp[14:16]) 56 | second = int(timestamp[17:19]) 57 | return datetime(year, month, day, hour, minute, second) 58 | 59 | 60 | def open_csv(tempdir, region_name): 61 | """Opens the latest hourly billing CSV file. Returns an open file object. 
62 |     Depending on the AWSBILL_REPORT_PATH environment variable, this may
63 |     involve downloading the report from S3, or it may just open a local
64 |     file."""
65 |     report_path = os.getenv("AWSBILL_REPORT_PATH")
66 |     if report_path.startswith("file://"):
67 |         csv_path = report_path[len("file://"):]
68 |     elif report_path.startswith("s3://"):
69 |         csv_path = download_latest_from_s3(report_path, tempdir, region_name)
70 |     else:
71 |         raise ValueError("AWSBILL_REPORT_PATH environment variable must start with 'file://' or 's3://'") # noqa
72 |     return open(csv_path)
73 | 
74 | 
75 | def open_output():
76 |     """Opens the file-like object that will be used for output, and returns it.
77 |     Depending on the AWSBILL_GRAPHITE_HOST environment variable, writes
78 |     to this object may be sent to a Graphite server, or they may be
79 |     written to stdout."""
80 |     output_host = os.getenv("AWSBILL_GRAPHITE_HOST")
81 |     if output_host is None:
82 |         raise ValueError("AWSBILL_GRAPHITE_HOST environment variable must specify the output destination; you may use 'stdout' to print metrics to stdout") # noqa
83 |     elif output_host == "stdout":
84 |         output_file = sys.stdout
85 |     else:
86 |         output_port = 2003
87 |         if ":" in output_host:
88 |             output_port = int(output_host.split(":", 1)[1])
89 |             output_host = output_host.split(":", 1)[0]
90 |         output_file = SocketWriter(output_host, output_port)
91 |     return output_file
92 | 
93 | 
94 | def s3_primary_manifests(objects):
95 |     """Returns the S3 object(s) corresponding to the relevant primary manifests.
96 | 
97 |     The relevant ones are considered to be the second-most- and most recent
98 |     ones, and they are returned in that order. If there are no billing
99 |     cycles older than the most recent, we return a single-element list with
100 |     only the most recent manifest.
101 | 
102 |     `objects` should be an iterable of S3 objects."""
103 |     # The path to the billing report manifest is like this:
104 |     #
105 |     #     <report-prefix>/hourly_billing/<YYYYMMDD>-<YYYYMMDD>/hourly_billing-Manifest.json # noqa
106 |     #
107 |     # We look for the most recent timestamp directory and use the manifest
108 |     # therein to find the most recent billing CSV.
109 |     manifests = [o for o in objects if o.key.endswith("Manifest.json")]
110 | 
111 |     # Filter to those from the second-most- and most recent billing cycles
112 |     manifests.sort(key=attrgetter("key"), reverse=True)
113 |     cycles = set()
114 |     for m in manifests:
115 |         rslt = re.search(r"/(\d{8}-\d{8})/", m.key)
116 |         if rslt is not None:
117 |             cycles.add(rslt.group(1))
118 |     if len(cycles) == 0:
119 |         raise Exception("Failed to find any appropriately-named billing CSVs")
120 |     last_two_cycles = sorted(list(cycles))[-2:]
121 |     if len(last_two_cycles) < 2:
122 |         last_two_cycles = 2 * last_two_cycles
123 |     manifests = [m for m in manifests if
124 |                  last_two_cycles[0] in m.key or last_two_cycles[1] in m.key]
125 | 
126 |     # The primary manifest(s) will be the one(s) with the shortest path length
127 |     manifests.sort(key=lambda a: len(a.key))
128 |     if last_two_cycles[0] == last_two_cycles[1]:
129 |         # There was only one billing cycle present among the manifests
130 |         return [manifests[0]]
131 |     return [manifests[1], manifests[0]]
132 | 
133 | 
134 | def download_latest_from_s3(s3_path, tempdir, region_name):
135 |     """Puts the latest hourly billing report from the given S3 path in a
136 |     local file.
137 | 
138 |     Returns the path to that file."""
139 |     s3 = boto3.resource("s3", region_name=region_name)
140 |     bucket = s3.Bucket(s3_path.split("/")[2])
141 |     primaries = s3_primary_manifests(bucket.objects.all())
142 |     logging.info("Using primary manifest(s) {0}".format(
143 |         [p.key for p in primaries]
144 |     )
145 |     )
146 | 
147 |     # Now we parse the manifest to get the path to the latest billing CSV
148 |     s3_csvs = []
149 |     for pri in primaries:
150 |         manifest = json.loads(pri.get()['Body'].read())
151 |         s3_csvs.extend(manifest["reportKeys"])
152 | 
153 |     # Download each billing CSV to a temp directory and decompress
154 |     try:
155 |         cat_csv_path = os.path.join(tempdir, "billing_full.csv")
156 |         cat_csv = open(cat_csv_path, "w")
157 |         header_written = False
158 |         for s3_csv in s3_csvs:
159 |             logging.info("Downloading CSV from S3: {0}".format(s3_csv))
160 |             local_path = os.path.join(tempdir, s3_csv.split("/")[-1])
161 |             local_file = open(local_path, "w")
162 |             obj = [o for o in bucket.objects.filter(Prefix=s3_csv)][0]
163 |             local_file.write(obj.get()['Body'].read())
164 |             local_file.close()
165 |             logging.info("Decompressing CSV: {0}".format(s3_csv))
166 | 
167 |             with gzip.open(local_path, "r") as f:
168 |                 for line in f:
169 |                     if line.startswith(
170 |                         "identity/LineItemId,"
171 |                     ) and header_written:
172 |                         continue
173 |                     cat_csv.write(line)
174 |                     header_written = True
175 |             # Remove these files as we finish with them to save on disk space
176 |             os.unlink(local_path)
177 |     except Exception as e:
178 |         logging.error(
179 |             "Exception: cleaning up by removing temp directory '{0}'".format(
180 |                 tempdir
181 |             )
182 |         )
183 |         shutil.rmtree(tempdir)
184 |         raise e
185 | 
186 |     cat_csv.close()
187 |     return cat_csv_path
188 | 
189 | 
190 | class SocketWriter(object):
191 |     """Wraps a socket object with a file-like write() method."""
192 |     def __init__(self, host, port):
193 |         self.host = host
194 |         self.port = port
195 |         self._sock = None
196 | 
197 |     def write(self, data):
198 |         if self._sock is None:
199 |             logging.info("Connecting to Graphite server at {0}:{1}".format(
200 |                 self.host,
201 |                 self.port
202 |             )
203 |             )
204 |             self._sock = socket.create_connection((self.host, self.port))
205 |         return self._sock.send(data)
206 | 
207 | 
208 | class MetricLedger(object):
209 |     """Processes Row instances and generates timeseries data from them."""
210 |     def __init__(self, timeseries_patterns):
211 |         """Initializes the MetricLedger with a list of TimeseriesPattern
212 |         objects."""
213 |         self._patterns = timeseries_patterns
214 |         self._timeseries = defaultdict(lambda: defaultdict(float))
215 | 
216 |     def process(self, row):
217 |         """Adds the data from the given Row object to any appropriate
218 |         timeseries."""
219 |         # Skip entries of the wrong type
220 |         if row.content["lineItem/LineItemType"] != "Usage":
221 |             return
222 | 
223 |         # Skip non-hourly entries
224 |         if row.interval() != 3600:
225 |             return
226 |         for pat in self._patterns:
227 |             if pat.match(row):
228 |                 for metric in pat.metric_names(row):
229 |                     self._timeseries[metric][row.end_time()] += row.amount()
230 | 
231 |     def output(self, output_file):
232 |         formatter = MetricFormatter()
233 |         logging.info("Writing metrics to timeseries database")
234 |         for ts_id, ts in self._timeseries.items():
235 |             for timestamp, value in ts.items():
236 |                 output_file.write(formatter.format(ts_id, timestamp, value))
237 |         logging.info("Finished writing %d timeseries", len(self._timeseries))
238 | 
239 |     def get_timeseries(self):
240 |         """Returns self._timeseries (for tests)."""
241 |         return self._timeseries
242 | 
243 | 
244 | class MetricFormatter(object):
245 |     """Converts CSV data to Graphite format."""
246 |     def __init__(self):
247 |         prefix = os.getenv("AWSBILL_METRIC_PREFIX")
248 |         if prefix:
249 |             self._initial_pieces = [prefix]
250 |         else:
251 |             self._initial_pieces = ["awsbill"]
252 | 
253 |     def format(self, ts_id, timestamp, value):
254 |         """Returns the Graphite line that corresponds to the given timeseries
255 |         ID, timestamp, and value."""
256 |         pieces = [p for p in self._initial_pieces]
257 |         pieces.append(ts_id)
258 |         metric_name = ".".join(pieces)
259 |         return "{0} {1:.4f} {2}\n".format(
260 |             metric_name,
261 |             value,
262 |             timestamp.strftime('%s')
263 |         )
264 | 
265 | 
266 | class TimeseriesPattern(object):
267 |     """Describes a set of time series to be generated from the billing data.
268 | 
269 |     This is an abstract class. Provide an implementation of the match() and
270 |     metric_names() methods."""
271 |     def match(self, row):
272 |         """Determines whether the given Row instance matches the timeseries
273 |         pattern.
274 | 
275 |         Returns True if so."""
276 |         raise NotImplementedError("This is an abstract class")
277 | 
278 |     def metric_names(self, row):
279 |         """Returns the names of the metrics to which the given row's amount()
280 |         value should be added.
281 | 
282 |         We assume that match() has been called on the row already, and
283 |         returned True."""
284 |         raise NotImplementedError("This is an abstract class")
285 | 
286 | 
287 | class TsInstanceType(TimeseriesPattern):
288 |     """Describes per-EC2-instance-type Graphite metrics."""
289 |     def match(self, row):
290 |         if row.usage_type():
291 |             return row.usage_type().startswith("ec2-instance.")
292 |         else:
293 |             return False
294 | 
295 |     def metric_names(self, row):
296 |         return [".".join((row.region(), row.usage_type()))]
297 | 
298 | 
299 | class TsEbsStorage(TimeseriesPattern):
300 |     """Describes the per-volume-type EBS storage metrics."""
301 |     def match(self, row):
302 |         return row.usage_type().startswith("ebs.storage.")
303 | 
304 |     def metric_names(self, row):
305 |         return [".".join((row.region(), row.usage_type()))]
306 | 
307 | 
308 | class TsEbsPiops(TimeseriesPattern):
309 |     """Describes the metric for PIOPS-month costs."""
310 |     def match(self, row):
311 |         return row.usage_type() == "ebs.piops"
312 | 
313 |     def metric_names(self, row):
314 |         return [".".join((row.region(), "ebs.piops"))]
315 | 
316 | 
317 | class TsEbsIops(TimeseriesPattern):
318 |     """Describes the metric for IOPS costs."""
319 |     def match(self, row):
320 |         return row.usage_type() == "ebs.iops"
321 | 
322 |     def metric_names(self, row):
323 |         return [".".join((row.region(), "ebs.iops"))]
324 | 
325 | 
326 | class TsEbsSnapshot(TimeseriesPattern):
327 |     """Describes the metric for EBS snapshot costs."""
328 |     def match(self, row):
329 |         return row.usage_type() == "ebs.snapshot"
330 | 
331 |     def metric_names(self, row):
332 |         return [".".join((row.region(), "ebs.snapshot"))]
333 | 
334 | 
335 | class TsRdsInstanceType(TimeseriesPattern):
336 |     """Describes per-RDS-instance-type Graphite metrics."""
337 |     def match(self, row):
338 |         return (row.usage_type().startswith("rds-instance."))
339 | 
340 |     def metric_names(self, row):
341 |         return [".".join((row.region(), row.usage_type()))]
342 | 
343 | 
344 | class TsRdsStorage(TimeseriesPattern):
345 |     """Describes the per-volume-type RDS storage metrics."""
346 |     def match(self, row):
347 |         return row.usage_type().startswith("rds.storage.")
348 | 
349 |     def metric_names(self, row):
350 |         return [".".join((row.region(), row.usage_type()))]
351 | 
352 | 
353 | class TsRdsPiops(TimeseriesPattern):
354 |     """Describes the metric for RDS PIOPS-month costs."""
355 |     def match(self, row):
356 |         return row.usage_type() == "rds.piops"
357 | 
358 |     def metric_names(self, row):
359 |         return [".".join((row.region(), "rds.piops"))]
360 | 
361 | 
362 | class TsElasticacheInstanceType(TimeseriesPattern):
363 |     """Describes per-ElastiCache-instance-type Graphite metrics."""
364 |     def match(self, row):
365 |         return (row.usage_type().startswith("elasticache-instance."))
366 | 
367 |     def metric_names(self, row):
368 |         return [".".join((row.region(), row.usage_type()))]
369 | 
370 | 
371 | class TsRegionTotal(TimeseriesPattern):
372 |     """Describes a Graphite metric containing the sum of all hourly costs per
373 |     region.
374 | 
375 |     This includes costs that we don't explicitly recognize and break out
376 |     into individual metrics. Any cost that shows up in the billing report
377 |     will go into this metric."""
378 |     def match(self, row):
379 |         return True
380 | 
381 |     def metric_names(self, row):
382 |         return ["total-cost.{0}".format(row.region())]
383 | 
384 | 
385 | class Row(object):
386 |     __slots__ = ["content", "_usage_type"]
387 | 
388 |     def __init__(self, col_names, row_list):
389 |         """Initializes a Row object, given the names of the CSV columns and
390 |         their values."""
391 |         self.content = dict(zip(col_names, row_list))
392 |         self._usage_type = None
393 | 
394 |     def region(self):
395 |         """Returns the normalized AWS region for the row, or 'noregion'.
396 | 
397 |         Normalized region names are like 'us-east-2', 'ap-northeast-1'."""
398 |         if self.content["product/location"] in REGION_NAMES:
399 |             # Most services have product/location set
400 |             return REGION_NAMES[self.content["product/location"]]
401 |         elif self.content["lineItem/AvailabilityZone"] and \
402 |                 self.content["lineItem/AvailabilityZone"][-1] in "1234567890":
403 |             # Some services, e.g. ElastiCache, use lineItem/AvailabilityZone
404 |             # instead
405 |             return self.content["lineItem/AvailabilityZone"]
406 |         return "noregion"
407 | 
408 |     def interval(self):
409 |         """Returns the length of the time interval to which this row
410 |         corresponds, in seconds."""
411 |         start, end = [parse_datetime(x) for x in
412 |                       self.content["identity/TimeInterval"].split("/", 1)]
413 |         return int((end - start).total_seconds())
414 | 
415 |     def usage_type(self):
416 |         """Parses the "lineItem/UsageType" field to get at the "subtype"
417 |         (my term).
418 | 
419 |         Usage types can be of many forms. Here are some examples:
420 | 
421 |             USE1-USW2-AWS-In-Bytes
422 |             Requests-RBP
423 |             Request
424 |             APN1-DataProcessing-Bytes
425 |             APN1-BoxUsage:c3.2xlarge
426 | 
427 |         It's a goddamn nightmare. We try our best. Then we return the
428 |         name of the subtype, in the format in which it'll appear in the
429 |         Graphite metric.
430 | Examples of usage types are: 431 | 432 | ec2-instance.c3-2xlarge 433 | ebs.storage.io1 434 | ebs.piops 435 | rds-instance.db-r3-large 436 | 437 | This method returns the empty string if the usage type isn't 438 | known.""" 439 | if self._usage_type is not None: 440 | return self._usage_type 441 | splut = self.content["lineItem/UsageType"].split("-", 1) 442 | if len(splut[0]) == 4 and splut[0][0:2] in ( 443 | "US", 444 | "EU", 445 | "AP", 446 | "SA" 447 | ) and splut[0].isupper() and splut[0][3].isdigit(): 448 | # Stuff before dash was probably a region code like "APN1" 449 | csv_usage_type = splut[1] 450 | else: 451 | csv_usage_type = splut[0] 452 | self._usage_type = "" 453 | 454 | # EC2 455 | if csv_usage_type.startswith("BoxUsage:"): 456 | self._usage_type = self._usage_type_ec2_instance() 457 | if csv_usage_type == "EBS:VolumeP-IOPS.piops": 458 | self._usage_type = "ebs.piops" 459 | if csv_usage_type.startswith("EBS:VolumeUsage"): 460 | self._usage_type = self._usage_type_ebs_storage() 461 | if csv_usage_type == "EBS:VolumeIOUsage": 462 | self._usage_type = "ebs.iops" 463 | if csv_usage_type == "EBS:SnapshotUsage": 464 | self._usage_type = "ebs.snapshot" 465 | 466 | # RDS 467 | if csv_usage_type.startswith("InstanceUsage:") or \ 468 | csv_usage_type.startswith("Multi-AZUsage:"): 469 | self._usage_type = self._usage_type_rds_instance() 470 | if csv_usage_type == "RDS:PIOPS" or \ 471 | csv_usage_type == "RDS:Multi-AZ-PIOPS": 472 | self._usage_type = "rds.piops" 473 | if csv_usage_type.startswith("RDS:") and \ 474 | csv_usage_type.endswith("Storage"): 475 | self._usage_type = self._usage_type_rds_storage() 476 | 477 | # ElastiCache 478 | if csv_usage_type.startswith("NodeUsage:"): 479 | self._usage_type = self._usage_type_elasticache_instance() 480 | 481 | return self._usage_type 482 | 483 | def _usage_type_ec2_instance(self): 484 | splut = self.content["lineItem/UsageType"].split(":", 1) 485 | if len(splut) < 2: 486 | return None 487 | instance_type = splut[1].replace(".", "-") 488 | return "ec2-instance.{0}".format(instance_type) 489 | 490 | def _usage_type_ebs_storage(self): 491 | if "product/volumeType" in self.content: 492 | return "ebs.storage.{0}".format( 493 | EBS_TYPES[self.content["product/volumeType"]] 494 | ) 495 | else: 496 | return "ebs.storage.unknown" 497 | 498 | def _usage_type_rds_instance(self): 499 | splut = self.content["lineItem/UsageType"].split(":", 1) 500 | if len(splut) < 2: 501 | return None 502 | instance_type = splut[1].replace(".", "-") 503 | return "rds-instance.{0}".format(instance_type) 504 | 505 | def _usage_type_rds_storage(self): 506 | line_item_description = self.content['lineItem/LineItemDescription'] 507 | volume_type = "" 508 | for substring in RDS_STORAGE_TYPES.keys(): 509 | if substring in line_item_description: 510 | volume_type = RDS_STORAGE_TYPES[substring] 511 | if volume_type == "": 512 | raise ValueError("Can't determine RDS storage type from line item description: '{0}'".format(line_item_description)) #noqa 513 | return "rds.storage.{0}".format(volume_type) 514 | 515 | def _usage_type_elasticache_instance(self): 516 | splut = self.content["lineItem/UsageType"].split(":", 1) 517 | if len(splut) < 2: 518 | return None 519 | instance_type = splut[1].replace(".", "-") 520 | return "elasticache-instance.{0}".format(instance_type) 521 | 522 | def end_time(self): 523 | return parse_datetime( 524 | self.content["identity/TimeInterval"].split("/", 1)[1] 525 | ) 526 | 527 | def tags(self): 528 | return {} 529 | 530 | def amount(self): 531 | 
return float(self.content["lineItem/BlendedCost"])
532 | 
533 | 
534 | def new_metric_ledger():
535 |     return MetricLedger([
536 |         # EC2
537 |         TsInstanceType(),
538 |         TsEbsStorage(),
539 |         TsEbsPiops(),
540 |         TsEbsIops(),
541 |         TsEbsSnapshot(),
542 |         # RDS
543 |         TsRdsInstanceType(),
544 |         TsRdsStorage(),
545 |         TsRdsPiops(),
546 |         # ElastiCache
547 |         TsElasticacheInstanceType(),
548 |         # Total
549 |         TsRegionTotal(),
550 |     ])
551 | 
552 | 
553 | def generate_metrics(csv_file, output_file):
554 |     """Generates metrics from the given CSV and writes them to the given
555 |     file-like object."""
556 |     reader = csv.reader(csv_file)
557 |     col_names = next(reader)
558 |     ledger = new_metric_ledger()
559 |     logging.info("Calculating billing metrics")
560 |     for row_list in reader:
561 |         row = Row(col_names, row_list)
562 |         ledger.process(row)
563 |     ledger.output(output_file)
564 | 
565 | if __name__ == "__main__":
566 |     logging.basicConfig(format='%(asctime)s %(message)s', level=logging.INFO)
567 |     logging.getLogger('boto').setLevel(logging.CRITICAL)
568 |     logging.getLogger('boto3').setLevel(logging.CRITICAL)
569 |     logging.getLogger('botocore').setLevel(logging.CRITICAL)
570 |     if os.getenv("REGION_NAME"):
571 |         region_name = os.getenv("REGION_NAME")
572 |     else:
573 |         region_name = 'us-west-1'
574 |     try:
575 |         tempdir = tempfile.mkdtemp(".awsbill")
576 |         csv_file = open_csv(tempdir, region_name)
577 |         output_file = open_output()
578 |         generate_metrics(csv_file, output_file)
579 |         logging.info("Removing temp directory '{0}'".format(tempdir))
580 |         shutil.rmtree(tempdir)
581 |         logging.info("Mission complete.")
582 |     except Exception as e:
583 |         logging.exception(e)
--------------------------------------------------------------------------------
/print_all_csvs.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | import gzip
3 | import json
4 | import os
5 | import shutil
6 | import sys
7 | import tempfile
8 | 
9 | import boto3
10 | 
11 | def all_s3_primary_manifests(objects):
12 |     """Returns the S3 object(s) corresponding to all primary manifests.
13 | 
14 |     `objects` should be an iterable of S3 objects."""
15 |     manifests = [o for o in objects if o.key.endswith("Manifest.json")]
16 |     # The primary manifest(s) are the ones with the fewest path segments,
17 |     # i.e. the smallest number of slashes in their keys
18 |     manifests.sort(key=lambda a: len(a.key))
19 |     n_slash = min(m.key.count("/") for m in manifests)
20 |     return [m for m in manifests if m.key.count("/") == n_slash]
21 | 
22 | 
23 | def print_all_from_s3(s3_path, tempdir, region_name):
24 |     """Outputs all hourly billing reports from the given S3 path to stdout."""
25 |     s3 = boto3.resource("s3", region_name=region_name)
26 |     bucket = s3.Bucket(s3_path.split("/")[2])
27 |     primaries = all_s3_primary_manifests(bucket.objects.all())
28 | 
29 |     # Now we parse the manifests to get the paths to all the billing CSVs
30 |     s3_csvs = []
31 |     for pri in primaries:
32 |         manifest = json.loads(pri.get()['Body'].read())
33 |         s3_csvs.extend(manifest["reportKeys"])
34 | 
35 |     # Download each billing CSV to a temp directory and decompress
36 |     header_written = False
37 |     for s3_csv in s3_csvs:
38 |         local_path = os.path.join(tempdir, s3_csv.split("/")[-1])
39 |         local_file = open(local_path, "w")
40 |         obj = [o for o in bucket.objects.filter(Prefix=s3_csv)][0]
41 |         local_file.write(obj.get()['Body'].read())
42 |         local_file.close()
43 | 
44 |         with gzip.open(local_path, "r") as f:
45 |             for line in f:
46 |                 if line.startswith(
47 |                     "identity/LineItemId,"
48 |                 ) and header_written:
49 |                     continue
50 |                 sys.stdout.write(line)
51 |                 header_written = True
52 |         # Remove these files as we finish with them to save on disk space
53 |         os.unlink(local_path)
54 | 
55 | if __name__ == "__main__":
56 |     if os.getenv("REGION_NAME"):
57 |         region_name = os.getenv("REGION_NAME")
58 |     else:
59 |         region_name = 'us-west-1'
60 | 
61 |     tempdir = tempfile.mkdtemp(".awsbill")
62 |     print_all_from_s3(os.getenv("AWSBILL_REPORT_PATH"), tempdir, region_name)
63 |     shutil.rmtree(tempdir)
--------------------------------------------------------------------------------
/redact_csv.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | """Turns an hourly billing CSV into one we can test against.
3 | 
4 | We redact anything proprietary, including:
5 | 
6 | * tag names and values
7 | * cost values
8 | * instance IDs
9 | * line item IDs
10 | * account IDs
11 | 
12 | We write the redacted CSV to stdout."""
13 | 
14 | import sys
15 | import csv
16 | import random
17 | 
18 | from awsbill2graphite import Row
19 | 
20 | ALPHA = "abcdefghijklmnopqrstuvwxyz"
21 | 
22 | INCLUDED_COLS = set((
23 |     "identity/TimeInterval",
24 |     "lineItem/LineItemType",
25 |     "product/location",
26 |     "product/volumeType",
27 | ))
28 | 
29 | def make_alpha(n):
30 |     """Returns a lowercase alphabetic string n characters long."""
31 |     global ALPHA
32 |     return "".join((random.choice(ALPHA) for i in range(n)))
33 | 
34 | def make_instance_type(instance_type):
35 |     """Returns a random instance type string of the same kind as the given one.
36 | 37 | For example, if instance_type is "db.r3.large", we'll return an instance 38 | type starting with "db.".""" 39 | splut = instance_type.split(".") 40 | splut[-2] = random.choice(("t2", "c4", "m4")) 41 | splut[-1] = random.choice(("medium", "large", "2xlarge")) 42 | return ".".join(splut) 43 | 44 | if __name__ == "__main__": 45 | reader = csv.reader(open(sys.argv[1], "rb")) 46 | writer = csv.writer(sys.stdout) 47 | col_names = reader.next() 48 | 49 | # Redact tag names 50 | for i in range(len(col_names)): 51 | if col_names[i].startswith("resourceTags/user:"): 52 | col_names[i] = "resourceTags/user:{0}".format(make_alpha(10)) 53 | writer.writerow(col_names) 54 | 55 | for row_list in reader: 56 | row = [] 57 | for i in range(len(row_list)): 58 | col_name = col_names[i] 59 | col_val = row_list[i] 60 | 61 | if col_name in INCLUDED_COLS: 62 | row.append(col_val) 63 | elif col_name.endswith("Cost"): 64 | row.append(round(random.random()*10., 8)) 65 | elif col_name.startswith("resourceTags/user:"): 66 | row.append(col_val) 67 | elif col_name == "lineItem/UsageType" and "Usage:" in col_val: 68 | splut = col_val.rsplit(":", 1) 69 | splut[-1] = make_instance_type(splut[-1]) 70 | row.append(":".join(splut)) 71 | elif col_name == "lineItem/UsageType": 72 | row.append(col_val) 73 | else: 74 | row.append("") 75 | writer.writerow(row) 76 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | boto3==1.3.0 2 | botocore==1.4.11 3 | python-dateutil==2.5.2 4 | six==1.10.0 5 | -------------------------------------------------------------------------------- /static/dashboard.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/danslimmon/awsbill2graphite/98f60d7c8cf8784f5ad46322310994b86403ebde/static/dashboard.png -------------------------------------------------------------------------------- /static/grafana_dashboard.json: -------------------------------------------------------------------------------- 1 | { 2 | "id": 35, 3 | "title": "aws bill", 4 | "originalTitle": "aws bill", 5 | "tags": [], 6 | "style": "dark", 7 | "timezone": "browser", 8 | "editable": true, 9 | "hideControls": false, 10 | "sharedCrosshair": false, 11 | "rows": [ 12 | { 13 | "collapse": false, 14 | "editable": true, 15 | "height": "250px", 16 | "panels": [ 17 | { 18 | "aliasColors": {}, 19 | "bars": false, 20 | "datasource": null, 21 | "editable": true, 22 | "error": false, 23 | "fill": 5, 24 | "grid": { 25 | "leftLogBase": 1, 26 | "leftMax": null, 27 | "leftMin": 0, 28 | "rightLogBase": 1, 29 | "rightMax": null, 30 | "rightMin": null, 31 | "threshold1": null, 32 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 33 | "threshold2": null, 34 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 35 | }, 36 | "id": 2, 37 | "leftYAxisLabel": "$/month", 38 | "legend": { 39 | "avg": false, 40 | "current": false, 41 | "max": false, 42 | "min": false, 43 | "show": true, 44 | "total": false, 45 | "values": false 46 | }, 47 | "lines": true, 48 | "linewidth": 3, 49 | "links": [], 50 | "nullPointMode": "connected", 51 | "percentage": false, 52 | "pointradius": 5, 53 | "points": false, 54 | "renderer": "flot", 55 | "seriesOverrides": [], 56 | "span": 12, 57 | "stack": true, 58 | "steppedLine": false, 59 | "targets": [ 60 | { 61 | "target": "aliasByNode(scale(summarize(awsbill.total-cost.*, \"1day\", \"sum\"), 30), 2)", 62 | "textEditor": true 63 | } 64 | 
], 65 | "timeFrom": null, 66 | "timeShift": null, 67 | "title": "Total cost per month (USD) (Don't trust last data point)", 68 | "tooltip": { 69 | "shared": true, 70 | "value_type": "cumulative" 71 | }, 72 | "type": "graph", 73 | "x-axis": true, 74 | "y-axis": true, 75 | "y_formats": [ 76 | "short", 77 | "short" 78 | ] 79 | } 80 | ], 81 | "title": "Total" 82 | }, 83 | { 84 | "collapse": false, 85 | "editable": true, 86 | "height": "250px", 87 | "panels": [ 88 | { 89 | "aliasColors": {}, 90 | "bars": false, 91 | "datasource": null, 92 | "editable": true, 93 | "error": false, 94 | "fill": 0, 95 | "grid": { 96 | "leftLogBase": 1, 97 | "leftMax": null, 98 | "leftMin": 0, 99 | "rightLogBase": 1, 100 | "rightMax": null, 101 | "rightMin": 0, 102 | "threshold1": null, 103 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 104 | "threshold2": null, 105 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 106 | }, 107 | "id": 1, 108 | "leftYAxisLabel": "$/month", 109 | "legend": { 110 | "avg": false, 111 | "current": false, 112 | "max": false, 113 | "min": false, 114 | "show": true, 115 | "total": false, 116 | "values": false 117 | }, 118 | "lines": true, 119 | "linewidth": 4, 120 | "links": [], 121 | "nullPointMode": "connected", 122 | "percentage": false, 123 | "pointradius": 5, 124 | "points": false, 125 | "renderer": "flot", 126 | "rightYAxisLabel": "", 127 | "seriesOverrides": [], 128 | "span": 12, 129 | "stack": false, 130 | "steppedLine": false, 131 | "targets": [ 132 | { 133 | "target": "aliasByNode(scale(groupByNode(awsbill.*.ec2-instance.*, 1, \"sumSeries\"), 720), 0)", 134 | "textEditor": true 135 | } 136 | ], 137 | "timeFrom": null, 138 | "timeShift": null, 139 | "title": "Monthly EC2 Instance Costs by Region (USD)", 140 | "tooltip": { 141 | "shared": true, 142 | "value_type": "cumulative" 143 | }, 144 | "type": "graph", 145 | "x-axis": true, 146 | "y-axis": true, 147 | "y_formats": [ 148 | "short", 149 | "short" 150 | ] 151 | } 152 | ], 153 | "title": "EC2 Instances" 154 | }, 155 | { 156 | "collapse": false, 157 | "editable": true, 158 | "height": "250px", 159 | "panels": [ 160 | { 161 | "aliasColors": {}, 162 | "bars": false, 163 | "datasource": null, 164 | "editable": true, 165 | "error": false, 166 | "fill": 0, 167 | "grid": { 168 | "leftLogBase": 1, 169 | "leftMax": null, 170 | "leftMin": 0, 171 | "rightLogBase": 1, 172 | "rightMax": null, 173 | "rightMin": null, 174 | "threshold1": null, 175 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 176 | "threshold2": null, 177 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 178 | }, 179 | "id": 3, 180 | "leftYAxisLabel": "$/month", 181 | "legend": { 182 | "avg": false, 183 | "current": false, 184 | "max": false, 185 | "min": false, 186 | "show": true, 187 | "total": false, 188 | "values": false 189 | }, 190 | "lines": true, 191 | "linewidth": 3, 192 | "links": [], 193 | "nullPointMode": "connected", 194 | "percentage": false, 195 | "pointradius": 5, 196 | "points": false, 197 | "renderer": "flot", 198 | "seriesOverrides": [], 199 | "span": 4, 200 | "stack": false, 201 | "steppedLine": false, 202 | "targets": [ 203 | { 204 | "target": "aliasByNode(scale(groupByNode(awsbill.*.ebs.storage.*, 1, \"sumSeries\"), 720), 0)", 205 | "textEditor": true 206 | } 207 | ], 208 | "timeFrom": null, 209 | "timeShift": null, 210 | "title": "Monthly EBS Storage cost (USD)", 211 | "tooltip": { 212 | "shared": true, 213 | "value_type": "cumulative" 214 | }, 215 | "type": "graph", 216 | "x-axis": true, 217 | "y-axis": true, 218 | "y_formats": [ 219 | "short", 220 | 
"short" 221 | ] 222 | }, 223 | { 224 | "aliasColors": {}, 225 | "bars": false, 226 | "datasource": null, 227 | "editable": true, 228 | "error": false, 229 | "fill": 0, 230 | "grid": { 231 | "leftLogBase": 1, 232 | "leftMax": null, 233 | "leftMin": 0, 234 | "rightLogBase": 1, 235 | "rightMax": null, 236 | "rightMin": null, 237 | "threshold1": null, 238 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 239 | "threshold2": null, 240 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 241 | }, 242 | "id": 4, 243 | "leftYAxisLabel": "$/month", 244 | "legend": { 245 | "avg": false, 246 | "current": false, 247 | "max": false, 248 | "min": false, 249 | "show": true, 250 | "total": false, 251 | "values": false 252 | }, 253 | "lines": true, 254 | "linewidth": 3, 255 | "links": [], 256 | "nullPointMode": "connected", 257 | "percentage": false, 258 | "pointradius": 5, 259 | "points": false, 260 | "renderer": "flot", 261 | "seriesOverrides": [], 262 | "span": 4, 263 | "stack": false, 264 | "steppedLine": false, 265 | "targets": [ 266 | { 267 | "target": "aliasByNode(scale(groupByNode(awsbill.*.ebs.piops, 1, \"sumSeries\"), 720), 0)", 268 | "textEditor": true 269 | } 270 | ], 271 | "timeFrom": null, 272 | "timeShift": null, 273 | "title": "Monthly EBS PIOPS cost (USD)", 274 | "tooltip": { 275 | "shared": true, 276 | "value_type": "cumulative" 277 | }, 278 | "type": "graph", 279 | "x-axis": true, 280 | "y-axis": true, 281 | "y_formats": [ 282 | "short", 283 | "short" 284 | ] 285 | }, 286 | { 287 | "aliasColors": {}, 288 | "bars": false, 289 | "datasource": null, 290 | "editable": true, 291 | "error": false, 292 | "fill": 0, 293 | "grid": { 294 | "leftLogBase": 1, 295 | "leftMax": null, 296 | "leftMin": 0, 297 | "rightLogBase": 1, 298 | "rightMax": null, 299 | "rightMin": null, 300 | "threshold1": null, 301 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 302 | "threshold2": null, 303 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 304 | }, 305 | "id": 5, 306 | "leftYAxisLabel": "$/month", 307 | "legend": { 308 | "avg": false, 309 | "current": false, 310 | "max": false, 311 | "min": false, 312 | "show": true, 313 | "total": false, 314 | "values": false 315 | }, 316 | "lines": true, 317 | "linewidth": 3, 318 | "links": [], 319 | "nullPointMode": "connected", 320 | "percentage": false, 321 | "pointradius": 5, 322 | "points": false, 323 | "renderer": "flot", 324 | "seriesOverrides": [], 325 | "span": 4, 326 | "stack": false, 327 | "steppedLine": false, 328 | "targets": [ 329 | { 330 | "target": "aliasByNode(scale(groupByNode(awsbill.*.ebs.iops, 1, \"sumSeries\"), 720), 0)", 331 | "textEditor": true 332 | } 333 | ], 334 | "timeFrom": null, 335 | "timeShift": null, 336 | "title": "Monthly EBS IOPS cost (USD)", 337 | "tooltip": { 338 | "shared": true, 339 | "value_type": "cumulative" 340 | }, 341 | "type": "graph", 342 | "x-axis": true, 343 | "y-axis": true, 344 | "y_formats": [ 345 | "short", 346 | "short" 347 | ] 348 | } 349 | ], 350 | "title": "EBS" 351 | }, 352 | { 353 | "collapse": false, 354 | "editable": true, 355 | "height": "250px", 356 | "panels": [ 357 | { 358 | "aliasColors": {}, 359 | "bars": false, 360 | "datasource": null, 361 | "editable": true, 362 | "error": false, 363 | "fill": 0, 364 | "grid": { 365 | "leftLogBase": 1, 366 | "leftMax": null, 367 | "leftMin": 0, 368 | "rightLogBase": 1, 369 | "rightMax": null, 370 | "rightMin": null, 371 | "threshold1": null, 372 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 373 | "threshold2": null, 374 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 375 | 
}, 376 | "id": 6, 377 | "leftYAxisLabel": "$/month", 378 | "legend": { 379 | "avg": false, 380 | "current": false, 381 | "max": false, 382 | "min": false, 383 | "show": true, 384 | "total": false, 385 | "values": false 386 | }, 387 | "lines": true, 388 | "linewidth": 3, 389 | "links": [], 390 | "nullPointMode": "connected", 391 | "percentage": false, 392 | "pointradius": 5, 393 | "points": false, 394 | "renderer": "flot", 395 | "seriesOverrides": [], 396 | "span": 12, 397 | "stack": false, 398 | "steppedLine": false, 399 | "targets": [ 400 | { 401 | "target": "aliasByNode(scale(awsbill.*.ebs.snapshot, 30), 1)", 402 | "textEditor": true 403 | } 404 | ], 405 | "timeFrom": null, 406 | "timeShift": null, 407 | "title": "Monthly EBS snapshot cost", 408 | "tooltip": { 409 | "shared": true, 410 | "value_type": "cumulative" 411 | }, 412 | "type": "graph", 413 | "x-axis": true, 414 | "y-axis": true, 415 | "y_formats": [ 416 | "short", 417 | "short" 418 | ] 419 | } 420 | ], 421 | "title": "Snapshots" 422 | }, 423 | { 424 | "collapse": false, 425 | "editable": true, 426 | "height": "250px", 427 | "panels": [ 428 | { 429 | "aliasColors": {}, 430 | "bars": false, 431 | "datasource": null, 432 | "editable": true, 433 | "error": false, 434 | "fill": 0, 435 | "grid": { 436 | "leftLogBase": 1, 437 | "leftMax": null, 438 | "leftMin": null, 439 | "rightLogBase": 1, 440 | "rightMax": null, 441 | "rightMin": null, 442 | "threshold1": null, 443 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 444 | "threshold2": null, 445 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 446 | }, 447 | "id": 8, 448 | "leftYAxisLabel": "$/month", 449 | "legend": { 450 | "avg": false, 451 | "current": false, 452 | "max": false, 453 | "min": false, 454 | "show": true, 455 | "total": false, 456 | "values": false 457 | }, 458 | "lines": true, 459 | "linewidth": 3, 460 | "links": [], 461 | "nullPointMode": "connected", 462 | "percentage": false, 463 | "pointradius": 5, 464 | "points": false, 465 | "renderer": "flot", 466 | "seriesOverrides": [], 467 | "span": 4, 468 | "stack": false, 469 | "steppedLine": false, 470 | "targets": [ 471 | { 472 | "target": "aliasByNode(scale(groupByNode(awsbill.*.rds.storage.*, 1, \"sumSeries\"), 720), 0)", 473 | "textEditor": true 474 | } 475 | ], 476 | "timeFrom": null, 477 | "timeShift": null, 478 | "title": "Monthly RDS storage cost (USD)", 479 | "tooltip": { 480 | "shared": true, 481 | "value_type": "cumulative" 482 | }, 483 | "type": "graph", 484 | "x-axis": true, 485 | "y-axis": true, 486 | "y_formats": [ 487 | "short", 488 | "short" 489 | ] 490 | }, 491 | { 492 | "aliasColors": {}, 493 | "bars": false, 494 | "datasource": null, 495 | "editable": true, 496 | "error": false, 497 | "fill": 0, 498 | "grid": { 499 | "leftLogBase": 1, 500 | "leftMax": null, 501 | "leftMin": 0, 502 | "rightLogBase": 1, 503 | "rightMax": null, 504 | "rightMin": null, 505 | "threshold1": null, 506 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 507 | "threshold2": null, 508 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 509 | }, 510 | "id": 9, 511 | "leftYAxisLabel": "$/month", 512 | "legend": { 513 | "avg": false, 514 | "current": false, 515 | "max": false, 516 | "min": false, 517 | "show": true, 518 | "total": false, 519 | "values": false 520 | }, 521 | "lines": true, 522 | "linewidth": 3, 523 | "links": [], 524 | "nullPointMode": "connected", 525 | "percentage": false, 526 | "pointradius": 5, 527 | "points": false, 528 | "renderer": "flot", 529 | "seriesOverrides": [], 530 | "span": 4, 531 | "stack": false, 532 | 
"steppedLine": false, 533 | "targets": [ 534 | { 535 | "target": "aliasByNode(scale(awsbill.*.rds.piops, 720), 1)", 536 | "textEditor": true 537 | } 538 | ], 539 | "timeFrom": null, 540 | "timeShift": null, 541 | "title": "Monthly RDS PIOPS cost (USD)", 542 | "tooltip": { 543 | "shared": true, 544 | "value_type": "cumulative" 545 | }, 546 | "type": "graph", 547 | "x-axis": true, 548 | "y-axis": true, 549 | "y_formats": [ 550 | "short", 551 | "short" 552 | ] 553 | }, 554 | { 555 | "aliasColors": {}, 556 | "bars": false, 557 | "datasource": null, 558 | "editable": true, 559 | "error": false, 560 | "fill": 0, 561 | "grid": { 562 | "leftLogBase": 1, 563 | "leftMax": null, 564 | "leftMin": 0, 565 | "rightLogBase": 1, 566 | "rightMax": null, 567 | "rightMin": null, 568 | "threshold1": null, 569 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 570 | "threshold2": null, 571 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 572 | }, 573 | "id": 7, 574 | "leftYAxisLabel": "$/month", 575 | "legend": { 576 | "avg": false, 577 | "current": false, 578 | "max": false, 579 | "min": false, 580 | "show": true, 581 | "total": false, 582 | "values": false 583 | }, 584 | "lines": true, 585 | "linewidth": 3, 586 | "links": [], 587 | "nullPointMode": "connected", 588 | "percentage": false, 589 | "pointradius": 5, 590 | "points": false, 591 | "renderer": "flot", 592 | "seriesOverrides": [], 593 | "span": 4, 594 | "stack": false, 595 | "steppedLine": false, 596 | "targets": [ 597 | { 598 | "target": "aliasByNode(scale(groupByNode(awsbill.*.rds-instance.*, 1, \"sumSeries\"), 720), 0)", 599 | "textEditor": true 600 | } 601 | ], 602 | "timeFrom": null, 603 | "timeShift": null, 604 | "title": "Monthly RDS instance cost (USD)", 605 | "tooltip": { 606 | "shared": true, 607 | "value_type": "cumulative" 608 | }, 609 | "type": "graph", 610 | "x-axis": true, 611 | "y-axis": true, 612 | "y_formats": [ 613 | "short", 614 | "short" 615 | ] 616 | } 617 | ], 618 | "title": "New row" 619 | }, 620 | { 621 | "title": "New row", 622 | "height": "250px", 623 | "editable": true, 624 | "collapse": false, 625 | "panels": [ 626 | { 627 | "title": "Monthly ElastiCache cost (USD)", 628 | "error": false, 629 | "span": 12, 630 | "editable": true, 631 | "type": "graph", 632 | "id": 10, 633 | "datasource": null, 634 | "renderer": "flot", 635 | "x-axis": true, 636 | "y-axis": true, 637 | "y_formats": [ 638 | "short", 639 | "short" 640 | ], 641 | "grid": { 642 | "leftLogBase": 1, 643 | "leftMax": null, 644 | "rightMax": null, 645 | "leftMin": 0, 646 | "rightMin": null, 647 | "rightLogBase": 1, 648 | "threshold1": null, 649 | "threshold2": null, 650 | "threshold1Color": "rgba(216, 200, 27, 0.27)", 651 | "threshold2Color": "rgba(234, 112, 112, 0.22)" 652 | }, 653 | "lines": true, 654 | "fill": 0, 655 | "linewidth": 3, 656 | "points": false, 657 | "pointradius": 5, 658 | "bars": false, 659 | "stack": false, 660 | "percentage": false, 661 | "legend": { 662 | "show": true, 663 | "values": false, 664 | "min": false, 665 | "max": false, 666 | "current": false, 667 | "total": false, 668 | "avg": false 669 | }, 670 | "nullPointMode": "connected", 671 | "steppedLine": false, 672 | "tooltip": { 673 | "value_type": "cumulative", 674 | "shared": true 675 | }, 676 | "timeFrom": null, 677 | "timeShift": null, 678 | "targets": [ 679 | { 680 | "target": "aliasByNode(scale(groupByNode(awsbill.*.elasticache-instance.*, 1, \"sumSeries\"), 720), 0)", 681 | "textEditor": true 682 | } 683 | ], 684 | "aliasColors": {}, 685 | "seriesOverrides": [], 686 | "links": [], 687 
| "leftYAxisLabel": "$/month" 688 | } 689 | ] 690 | } 691 | ], 692 | "nav": [ 693 | { 694 | "collapse": false, 695 | "enable": true, 696 | "notice": false, 697 | "now": true, 698 | "refresh_intervals": [ 699 | "5s", 700 | "10s", 701 | "30s", 702 | "1m", 703 | "5m", 704 | "15m", 705 | "30m", 706 | "1h", 707 | "2h", 708 | "1d" 709 | ], 710 | "status": "Stable", 711 | "time_options": [ 712 | "5m", 713 | "15m", 714 | "1h", 715 | "6h", 716 | "12h", 717 | "24h", 718 | "2d", 719 | "7d", 720 | "30d" 721 | ], 722 | "type": "timepicker" 723 | } 724 | ], 725 | "time": { 726 | "from": "now-30d", 727 | "to": "now" 728 | }, 729 | "templating": { 730 | "list": [] 731 | }, 732 | "annotations": { 733 | "list": [] 734 | }, 735 | "schemaVersion": 6, 736 | "version": 10, 737 | "links": [] 738 | } 739 | -------------------------------------------------------------------------------- /test_all.py: -------------------------------------------------------------------------------- 1 | import csv 2 | import random 3 | import unittest 4 | from datetime import datetime 5 | 6 | import awsbill2graphite as a2g 7 | 8 | class LedgerTest(unittest.TestCase): 9 | def setUp(self): 10 | ledger = a2g.new_metric_ledger() 11 | reader = csv.reader(open("test_data/hourly_billing-1.csv", "rb")) 12 | col_names = reader.next() 13 | for row_list in reader: 14 | row = a2g.Row(col_names, row_list) 15 | ledger.process(row) 16 | self.timeseries = ledger.get_timeseries() 17 | 18 | def assert_timeseries_equal(self, metric_name, expected, received): 19 | """Determines whether the two given timeseries dicts are equal (within a tolerance).""" 20 | for k in expected.keys(): 21 | if not received.has_key(k): 22 | self.fail("Key {0} missing from received timeseries '{1}'".format(k, metric_name)) 23 | return 24 | if abs(expected[k] - received[k]) > .00001: 25 | self.fail("Value for {0} for received timeseries {1} is {2}; should be {3}".format( 26 | k, metric_name, expected[k], received[k])) 27 | return 28 | for k in received.keys(): 29 | if not expected.has_key(k): 30 | self.fail("Unexpected key {0} in received timeseries '{1}'".format(k, metric_name)) 31 | return 32 | 33 | def testTsInstanceType(self): 34 | self.assertTrue(self.timeseries.has_key("us-west-1.ec2-instance.m4-2xlarge")) 35 | self.assert_timeseries_equal( 36 | "us-west-1.ec2-instance.m4-2xlarge", 37 | self.timeseries["us-west-1.ec2-instance.m4-2xlarge"], 38 | { 39 | datetime.fromtimestamp(1459746000): 31.497950, 40 | datetime.fromtimestamp(1459764000): 26.083113, 41 | datetime.fromtimestamp(1459782000): 61.615628, 42 | datetime.fromtimestamp(1459800000): 63.319794, 43 | datetime.fromtimestamp(1459789200): 42.888862, 44 | datetime.fromtimestamp(1459807200): 33.440607, 45 | datetime.fromtimestamp(1459753200): 49.640219, 46 | datetime.fromtimestamp(1459771200): 47.892134, 47 | datetime.fromtimestamp(1459735200): 43.360197, 48 | datetime.fromtimestamp(1459814400): 84.484617, 49 | datetime.fromtimestamp(1459760400): 55.846821, 50 | datetime.fromtimestamp(1459778400): 29.705564, 51 | datetime.fromtimestamp(1459742400): 63.989894, 52 | datetime.fromtimestamp(1459796400): 54.198285, 53 | datetime.fromtimestamp(1459731600): 47.450255, 54 | datetime.fromtimestamp(1459749600): 77.140611, 55 | datetime.fromtimestamp(1459803600): 78.267747, 56 | datetime.fromtimestamp(1459767600): 61.143072, 57 | datetime.fromtimestamp(1459785600): 39.729129, 58 | datetime.fromtimestamp(1459810800): 48.819524, 59 | datetime.fromtimestamp(1459774800): 44.610415, 60 | datetime.fromtimestamp(1459792800): 19.039679, 61 
| datetime.fromtimestamp(1459738800): 41.609403, 62 | datetime.fromtimestamp(1459756800): 47.254336, 63 | } 64 | ) 65 | self.assertTrue(self.timeseries.has_key("ap-northeast-1.ec2-instance.t2-medium")) 66 | self.assert_timeseries_equal( 67 | "ap-northeast-1.ec2-instance.t2-medium", 68 | self.timeseries["ap-northeast-1.ec2-instance.t2-medium"], 69 | { 70 | datetime.fromtimestamp(1459807200): 9.228804, 71 | datetime.fromtimestamp(1459753200): 5.313574, 72 | datetime.fromtimestamp(1459771200): 4.238844, 73 | datetime.fromtimestamp(1459746000): 13.584161, 74 | datetime.fromtimestamp(1459764000): 5.319477, 75 | datetime.fromtimestamp(1459760400): 13.284314, 76 | datetime.fromtimestamp(1459778400): 8.792418, 77 | datetime.fromtimestamp(1459742400): 4.248000, 78 | datetime.fromtimestamp(1459735200): 3.921303, 79 | datetime.fromtimestamp(1459796400): 9.269115, 80 | datetime.fromtimestamp(1459810800): 13.109077, 81 | datetime.fromtimestamp(1459803600): 15.168434, 82 | datetime.fromtimestamp(1459785600): 3.440763, 83 | datetime.fromtimestamp(1459782000): 4.131503, 84 | datetime.fromtimestamp(1459800000): 8.607207, 85 | datetime.fromtimestamp(1459774800): 2.417751, 86 | datetime.fromtimestamp(1459814400): 5.292426, 87 | datetime.fromtimestamp(1459756800): 6.206031, 88 | } 89 | ) 90 | 91 | def testS3PrimaryManifest(self): 92 | class _S3Obj: 93 | def __init__(self, k): self.key = k 94 | # Make sure sorting the objects without `key=` causes test to fail 95 | def __lt__(self, o): return (random.randint(0, 1) == 0) 96 | 97 | manifests = [_S3Obj(k) for k in [ 98 | "prefix/hourly_billing/20160201-20160301/hourly_billing-Manifest.json", 99 | "prefix/hourly_billing/20160301-20160401/hourly_billing-Manifest.json", 100 | "prefix/hourly_billing/20160401-20160501/11c0a000-107e-11e6-813f-881fa1019b9e/hourly_billing-1.csv.gz", 101 | "prefix/hourly_billing/20160401-20160501/11c0a000-107e-11e6-813f-881fa1019b9e/hourly_billing-Manifest.json", 102 | "prefix/hourly_billing/20160501-20160601/hourly_billing-Manifest.json", 103 | "prefix/hourly_billing/20160401-20160501/21b3c44a-107e-11e6-8355-881fa1019b9e/hourly_billing-1.csv.gz", 104 | "prefix/hourly_billing/20160401-20160501/21b3c44a-107e-11e6-8355-881fa1019b9e/hourly_billing-2.csv.gz", 105 | "prefix/hourly_billing/20160401-20160501/21b3c44a-107e-11e6-8355-881fa1019b9e/hourly_billing-Manifest.json", 106 | "prefix/hourly_billing/20160501-20160601/2e6de863-107e-11e6-97e6-881fa1019b9e/hourly_billing-1.csv.gz", 107 | "prefix/hourly_billing/20160501-20160601/2e6de863-107e-11e6-97e6-881fa1019b9e/hourly_billing-Manifest.json", 108 | "prefix/hourly_billing/20160401-20160501/hourly_billing-Manifest.json", 109 | "prefix/hourly_billing/20160501-20160601/3d1ed007-107e-11e6-acd4-881fa1019b9e/hourly_billing-1.csv.gz", 110 | "prefix/hourly_billing/20160501-20160601/3d1ed007-107e-11e6-acd4-881fa1019b9e/hourly_billing-Manifest.json", 111 | "prefix/hourly_billing/20160301-20160401/fbc0aa99-1083-11e6-918b-881fa1019b9e/hourly_billing-1.csv.gz", 112 | "prefix/hourly_billing/20160301-20160401/fbc0aa99-1083-11e6-918b-881fa1019b9e/hourly_billing-Manifest.json", 113 | ]] 114 | primaries = a2g.s3_primary_manifests(manifests) 115 | self.assertEqual(primaries[0].key, 116 | "prefix/hourly_billing/20160401-20160501/hourly_billing-Manifest.json") 117 | self.assertEqual(primaries[1].key, 118 | "prefix/hourly_billing/20160501-20160601/hourly_billing-Manifest.json") 119 | --------------------------------------------------------------------------------