├── README.md
└── prometheus2csv.py

/README.md:
--------------------------------------------------------------------------------
# prometheus2csv

A tool that queries multiple metrics from a Prometheus database through the REST API and saves them into a single CSV file.

## Why do we need this?

Docker monitoring produces a lot of metrics, and to analyze them we want to extract the data in a simple, widely supported format (e.g. CSV). Querying a single metric is easy, but existing tools pay little attention to exporting multiple metrics joined by timestamp (useful for data-analysis scenarios or daily dev reports), so prometheus2csv was born. For more info about two basic Docker monitoring solutions, you can visit my blog [here](http://blog.gluckzhang.com/archives/145/).

## How to use it?

`prometheus2csv` is a command line tool for Python 3. Basic usage is as follows:

```bash
python prometheus2csv.py -h http://prometheus:9090 -c blc_server -o test.csv -s 10s --period=120
```

`http://prometheus:9090` is your Prometheus server's address, `blc_server` is the name of the container whose metrics you want to query, `test.csv` is the output CSV file's name (default is `result.csv`), `10s` is the query resolution step width in the Prometheus query API, and `120` means you will get the most recent 120 **minutes** of data.
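For reference, a minimal sketch of how a relative `--period` maps onto the absolute `start`/`end` Unix timestamps that the Prometheus `query_range` API expects (the helper name `period_to_range` is hypothetical, not part of the tool):

```python
import time

def period_to_range(period_minutes, now=None):
    # Convert a relative period in minutes into (start, end) Unix
    # timestamps; `now` defaults to the current time.
    end_time = int(now if now is not None else time.time())
    start_time = end_time - 60 * period_minutes
    return start_time, end_time

# --period=120 evaluated at a fixed "now", for illustration:
start, end = period_to_range(120, now=1700000000)
```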
### All arguments of prometheus2csv

Required arguments:

- -h / --host: Prometheus server address
- -c / --container: The name of the container whose metrics you want to query

Optional arguments:

- -o / --outfile: CSV filename for the query result, default is `result.csv`
- -s / --step: Query resolution step width in the Prometheus query API, default is `10s`
- --period: Query the most recent period's data; an integer, in minutes. For example, with `--period=120` you will get the data from 120 minutes ago until now
- --start: Start time of the query, as a Unix timestamp or an RFC 3339 string
- --end: End time of the query, as a Unix timestamp or an RFC 3339 string
- --help: Print the basic help info

*Attention: use either start & end OR period to query data; if all three arguments are given, `period` takes priority.*

## Ideal data format

```
-------------------------------------------------
| timestamp + metric1 + metric2 + metric3 + ... |
-------------------------------------------------
| xxxxxxxxx + value1  + value2  + value3  + ... |
-------------------------------------------------
|    ...    +   ...   +   ...   +   ...   + ... |
-------------------------------------------------
```

## Implementation details

prometheus2csv mainly uses the [Prometheus HTTP API](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries) to get the results (`query` and `query_range`).

### Step 1: Get all the metric names based on the container's name

This part is a little tricky, because Prometheus doesn't provide a query function for it directly, but we can query like this:

```
http://prometheus:9090/api/v1/query?query=sum by(__name__)({name="blc_server"})
```

Then you will get the names of all the metrics (read from the `__name__` label) relevant to the container `blc_server`.
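As an illustration of parsing that response, a self-contained sketch (the helper name and the trimmed response dict are made up, but the result shape follows the `/api/v1/query` JSON format):

```python
def extract_metric_names(api_response):
    # Pull the __name__ label out of each series in a Prometheus
    # /api/v1/query JSON response and return them sorted.
    results = api_response['data']['result']
    return sorted(r['metric'].get('__name__', '') for r in results)

# Trimmed, made-up example of the API's result shape:
sample = {
    'status': 'success',
    'data': {'result': [
        {'metric': {'__name__': 'container_memory_usage_bytes'}, 'value': [1000, '4096']},
        {'metric': {'__name__': 'container_cpu_user_seconds_total'}, 'value': [1000, '1.5']},
    ]},
}
names = extract_metric_names(sample)
```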
Hence `timestamp + these_metrics_names` will be our CSV file's header.

### Step 2: Query every metric's values and join them together by timestamp

Construct a series of query URLs based on the metric names; then we can get timestamp-value pairs for every metric. A dictionary is used to store these data: the timestamps are the keys, and each key maps to a list of values.

*As far as I know, Prometheus doesn't provide a method to get multiple metrics' values joined by timestamp, so we have to make a series of queries. Sometimes you can get a similar CSV file from Grafana; this end-user method is discussed in [my blog](http://blog.gluckzhang.com/archives/145/).*

## Commands to deploy a proper cAdvisor + Prometheus

1) Assume that you have some running containers which you want to monitor

2) Run cAdvisor in a container

```bash
sudo docker run \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  -p 8080:8080 \
  --detach=true \
  --name=cadvisor \
  google/cadvisor:latest
```

3) Run Prometheus and set up a job that fetches metrics from cAdvisor

- First of all, create a new configuration file

```yaml
## prometheus.yml ##

global:
  scrape_interval: 15s     # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # By default, evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'blc-monitor'

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'cadvisor'
    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    static_configs:
      - targets: ['cadvisor:8080']
```

- Start Prometheus as a container service, and link it to cadvisor

```bash
sudo docker run -d -p 9090:9090 \
  -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
  --link cadvisor:cadvisor \
  --name=prometheus \
  prom/prometheus \
  --config.file=/etc/prometheus/prometheus.yml
```

Now you can use prometheus2csv to export your monitoring data, cheers!

## TODO

- Generate a config file on the first run: then you can update the query easily, and you can also choose only the metrics you are interested in, instead of querying all metrics.
- Multiple query jobs: we may need to query monitoring data for a series of containers, so it would be better to run prometheus2csv once and get all the info you want (maybe with multiple CSV files, classified by container name).
- Some metrics in Prometheus (actually from cAdvisor) share the same `__name__` but differ in other labels. For example, you might get a series of results for `container_fs_io_time_seconds_total`, because the container can have many `device` values. prometheus2csv should handle these circumstances.
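To illustrate the join described in Step 2 above, here is a self-contained sketch with made-up sample values (the helper name `join_by_timestamp` is hypothetical; the input mimics the `values` arrays returned by `query_range`):

```python
import csv
import io

def join_by_timestamp(series_by_metric):
    # series_by_metric maps metric name -> list of [timestamp, value]
    # pairs, as found in result['values'] of a query_range response.
    # Returns (header, rows) with one row per timestamp.
    metrics = sorted(series_by_metric)
    table = {}
    for metric in metrics:
        for ts, value in series_by_metric[metric]:
            table.setdefault(ts, {})[metric] = value
    header = ['timestamp'] + metrics
    rows = [[ts] + [table[ts].get(m, '') for m in metrics]
            for ts in sorted(table)]
    return header, rows

header, rows = join_by_timestamp({
    'container_cpu_user_seconds_total': [[1000, '1.5'], [1010, '1.7']],
    'container_memory_usage_bytes': [[1000, '4096'], [1010, '8192']],
})

# Write the joined table out as CSV (in-memory here, for illustration).
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(header)
writer.writerows(rows)
```

Unlike the current script, this sketch fills a blank cell when a metric has no sample at a given timestamp, which is one way to handle metrics whose series don't align perfectly.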
--------------------------------------------------------------------------------
/prometheus2csv.py:
--------------------------------------------------------------------------------
#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Filename: prometheus2csv.py

import csv
import requests
import sys
import getopt
import time
import logging

PROMETHEUS_URL = ''
CONTAINER = ''
QUERY_API = '/api/v1/query'
RANGE_QUERY_API = '/api/v1/query_range'
RESOLUTION = ''  # default: 10s
OUTPUTFILE = ''  # default: result.csv
START = ''       # rfc3339 | unix_timestamp
END = ''         # rfc3339 | unix_timestamp
PERIOD = ''      # unit: minute, default: 10

def main():
    handle_args(sys.argv[1:])

    metricnames = query_metric_names()
    logging.info("Querying metric names succeeded, metric number: %s", len(metricnames))

    csvset = query_metric_values(metricnames=metricnames)
    logging.info("Querying metric values succeeded, rows of data: %s", len(csvset))

    write2csv(filename=OUTPUTFILE, metricnames=metricnames, dataset=csvset)

def handle_args(argv):
    global PROMETHEUS_URL
    global OUTPUTFILE
    global CONTAINER
    global RESOLUTION
    global START
    global END
    global PERIOD

    try:
        opts, args = getopt.getopt(argv, "h:o:c:s:", ["host=", "outfile=", "container=", "step=", "help", "start=", "end=", "period="])
    except getopt.GetoptError as error:
        logging.error(error)
        print_help_info()
        sys.exit(2)

    for opt, arg in opts:
        if opt == "--help":
            print_help_info()
            sys.exit()
        elif opt in ("-h", "--host"):
            PROMETHEUS_URL = arg
        elif opt in ("-o", "--outfile"):
            OUTPUTFILE = arg
        elif opt in ("-c", "--container"):
            CONTAINER = arg
        elif opt in ("-s", "--step"):
            RESOLUTION = arg
        elif opt == "--start":
            START = arg
        elif opt == "--end":
            END = arg
        elif opt == "--period":
            PERIOD = int(arg)

    if PROMETHEUS_URL == '':
        logging.error("You should use -h or --host to specify your prometheus server's url, e.g. http://prometheus:9090")
        print_help_info()
        sys.exit(2)
    elif CONTAINER == '':
        logging.error("You should use -c or --container to specify the name of the container which you want to query all the metrics of")
        print_help_info()
        sys.exit(2)

    if OUTPUTFILE == '':
        OUTPUTFILE = 'result.csv'
        logging.warning("You didn't specify the output file's name, will use the default name %s", OUTPUTFILE)
    if RESOLUTION == '':
        RESOLUTION = '10s'
        logging.warning("You didn't specify the query resolution step width, will use the default value %s", RESOLUTION)
    if PERIOD == '' and (START == '' or END == ''):
        PERIOD = 10
        logging.warning("You didn't specify a query period or start & end time, will query the latest %s minutes' data as a test", PERIOD)

def print_help_info():
    print('')
    print('prometheus2csv Help Info')
    print('    python prometheus2csv.py -h <host> -c <container> [-o <outfile>] [-s <step>]')
    print('or: python prometheus2csv.py --host=<host> --container=<container> [--outfile=<outfile>] [--step=<step>]')
    print('---')
    print('Additional options: --start=<start_time> --end=<end_time> --period=<minutes>')
    print('    use start & end, or only use period')

def query_metric_names():
    response = requests.get(PROMETHEUS_URL + QUERY_API, params={'query': 'sum by(__name__)({{name="{0}"}})'.format(CONTAINER)})
    status = response.json()['status']

    if status == "error":
        logging.error(response.json())
        sys.exit(2)

    results = response.json()['data']['result']
    metricnames = list()
    for result in results:
        metricnames.append(result['metric'].get('__name__', ''))
    metricnames.sort()

    return metricnames

def query_metric_values(metricnames):
    csvset = dict()

    if PERIOD != '':
        end_time = int(time.time())
        start_time = end_time - 60 * PERIOD
    else:
        end_time = END
        start_time = START

    metric = metricnames[0]
    response = requests.get(PROMETHEUS_URL + RANGE_QUERY_API, params={'query': '{0}{{name="{1}"}}'.format(metric, CONTAINER), 'start': start_time, 'end': end_time, 'step': RESOLUTION})
    status = response.json()['status']

    if status == "error":
        logging.error(response.json())
        sys.exit(2)

    results = response.json()['data']['result']

    if len(results) == 0:
        logging.error(response.json())
        sys.exit(2)

    # Initialize one row per timestamp, using the first metric's samples
    for value in results[0]['values']:
        csvset[value[0]] = [value[1]]

    # Append the remaining metrics' values to the row with the same timestamp.
    # Note: this assumes every metric returns a single series whose timestamps
    # align with the first metric's (see the TODO in the README).
    for metric in metricnames[1:]:
        response = requests.get(PROMETHEUS_URL + RANGE_QUERY_API, params={'query': '{0}{{name="{1}"}}'.format(metric, CONTAINER), 'start': start_time, 'end': end_time, 'step': RESOLUTION})
        results = response.json()['data']['result']
        for value in results[0]['values']:
            csvset[value[0]].append(value[1])

    return csvset

def write2csv(filename, metricnames, dataset):
    with open(filename, 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(['timestamp'] + metricnames)
        for timestamp in sorted(dataset.keys(), reverse=True):
            writer.writerow([timestamp] + dataset[timestamp])

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    main()
--------------------------------------------------------------------------------