Wednesday, June 13, 2018

Export All Prometheus data to CSV file

We can query Prometheus data through its HTTP API. For example, to query the current value of a metric named cpu, you can use the following API call:
http://prom_server:9090/api/v1/query?query=cpu
If you need the data for the past hour, append a range selector such as [1h] (or [1m] for the past minute) to the metric name:
http://prom_server:9090/api/v1/query?query=cpu[1h]
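
The two URLs above differ only in the range selector appended to the metric name. As a rough sketch, assuming the same prom_server:9090 address, a small helper can build either form and percent-encode the brackets. The name build_query_url is hypothetical and not part of Prometheus.

```python
from urllib.parse import urlencode


def build_query_url(base_url, metric, window=None):
    """Build a /api/v1/query URL; `window` is an optional range such as '1h'."""
    # Append a range selector like [1h] only when a window is given.
    expr = metric if window is None else '{0}[{1}]'.format(metric, window)
    # urlencode percent-encodes the square brackets of the range selector.
    return '{0}/api/v1/query?{1}'.format(base_url.rstrip('/'),
                                         urlencode({'query': expr}))


print(build_query_url('http://prom_server:9090', 'cpu'))
print(build_query_url('http://prom_server:9090', 'cpu', '1h'))
```

Passing the expression through urlencode avoids problems with clients that do not accept raw square brackets in a URL.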

Sample output of the first query (truncated):
{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"collectd_cpu","cpu":"0","instance":"overcloud-cephstorage-0.localdomain","job":"collectd","service":"idle"},"value":[1528895820.304,"2033227691"]},
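
To make the shape of this response concrete, here is a minimal sketch that flattens an instant-query payload into rows. The payload is a hand-closed, single-series version of the sample above (the real response continues with more series), and vector_to_rows is a hypothetical helper name.

```python
import json

# Hand-closed, single-series version of the sample output (an assumption:
# the real response contains more series after this one).
payload = json.loads('''
{"status": "success",
 "data": {"resultType": "vector",
          "result": [{"metric": {"__name__": "collectd_cpu", "cpu": "0",
                                 "instance": "overcloud-cephstorage-0.localdomain",
                                 "job": "collectd", "service": "idle"},
                      "value": [1528895820.304, "2033227691"]}]}}
''')


def vector_to_rows(payload):
    """Flatten a resultType=vector response into (name, timestamp, value, labels)."""
    rows = []
    for series in payload['data']['result']:
        labels = dict(series['metric'])
        # The metric name is stored as the special __name__ label.
        name = labels.pop('__name__', '')
        # An instant query returns one [timestamp, value] pair per series.
        timestamp, value = series['value']
        rows.append((name, timestamp, value, labels))
    return rows


for row in vector_to_rows(payload):
    print(row)
```

Note that the sample value arrives as a string, so convert it with float() if you need to do arithmetic on it.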
 
 
This becomes very tedious if we have hundreds of metrics and need to go over each metric name, query it individually, and export the result to a file.
 
So I used a Python script based on the Robust Perception blog post on query results as CSV:
https://www.robustperception.io/prometheus-query-results-as-csv/
I modified the script to first fetch all metric names, then query each metric from that list and write the results as CSV to standard output, which can be redirected to a file.
 
This can be run as a cron job configured to run hourly. The Python script fetches the last hour of data, and the output is compressed into an archive file:

python prom_csv.py http://prom_server:9090 | gzip > $(date +"%Y_%m_%d_%I_%M_%p")_metrics.gz
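
If you would rather have Python write the archive itself instead of piping through gzip, a minimal sketch could look like the following. The helper names archive_name and write_archive and the sample rows are hypothetical; only the file name pattern is taken from the command above.

```python
import csv
import gzip
import os
import tempfile
from datetime import datetime


def archive_name(now):
    """Mirror the shell pattern date +"%Y_%m_%d_%I_%M_%p" plus _metrics.gz."""
    return now.strftime('%Y_%m_%d_%I_%M_%p') + '_metrics.gz'


def write_archive(rows, directory, now=None):
    """Write CSV rows into a gzip-compressed file and return its path."""
    path = os.path.join(directory, archive_name(now or datetime.now()))
    # Text mode with newline='' is what the csv module expects.
    with gzip.open(path, 'wt', newline='') as handle:
        csv.writer(handle).writerows(rows)
    return path


# Hypothetical rows, only to exercise the helper.
sample_rows = [['name', 'timestamp', 'value'],
               ['collectd_cpu', 1528895820.304, '2033227691']]
out = write_archive(sample_rows, tempfile.mkdtemp(),
                    datetime(2018, 6, 13, 14, 5))
print(out)
```

Since %I is the 12-hour clock, the AM/PM suffix from %p is what keeps the morning and afternoon archives apart.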
 
Python Script
 
import csv
import sys

import requests


def GetMetrixNames(url):
    """Return the list of all metric names known to the Prometheus server."""
    response = requests.get('{0}/api/v1/label/__name__/values'.format(url))
    names = response.json()['data']

    # Return metric names
    return names


"""
Prometheus hourly data as csv.
"""
writer = csv.writer(sys.stdout)

if len(sys.argv) != 2:
    print('Usage: {0} http://localhost:9090'.format(sys.argv[0]))
    sys.exit(1)
metrixNames = GetMetrixNames(sys.argv[1])

# Note: the header is written only once, using the labels of the first metric.
# Metrics with a different label set will not line up with those columns.
writeHeader = True
for metrixName in metrixNames:
    # The range is currently hardcoded to one hour.
    response = requests.get('{0}/api/v1/query'.format(sys.argv[1]),
                            params={'query': metrixName + '[1h]'})
    results = response.json()['data']['result']
    # Build a list of all label names used.
    # Gets all keys and discards __name__.
    labelnames = set()
    for result in results:
        labelnames.update(result['metric'].keys())
    # Canonicalize
    labelnames.discard('__name__')
    labelnames = sorted(labelnames)
    # Write the samples.
    if writeHeader:
        writer.writerow(['name', 'timestamp', 'value'] + labelnames)
        writeHeader = False
    for result in results:
        # A range query returns a list of [timestamp, value] pairs per series,
        # so write one row per sample.
        for sample in result['values']:
            l = [result['metric'].get('__name__', '')] + sample
            for label in labelnames:
                l.append(result['metric'].get(label, ''))
            writer.writerow(l)
 
 
I hope this helps someone.

