Prometheus: Storage, aggregation and query with M3

This document is a getting started guide to using M3DB as remote storage for Prometheus.

M3 Coordinator configuration

To write to a remote M3DB cluster the simplest configuration is to run m3coordinator as a sidecar alongside Prometheus.

Start by downloading the config template. Update the namespaces and the client section for a new cluster to match your cluster’s configuration.

You’ll need to specify the static IPs or hostnames of your M3DB seed nodes, and the name and retention values of the namespace you set up. You can leave the namespace storage metrics type as unaggregated since it’s required by default to have a cluster that receives all Prometheus metrics unaggregated. In the future you might also want to aggregate and downsample metrics for longer retention, and you can come back and update the config once you’ve setup those clusters. You can read more about our aggregation functionality here.

It should look something like:

listenAddress: 0.0.0.0:7201

logging:
  level: info

metrics:
  scope:
    prefix: "coordinator"
  prometheus:
    handlerPath: /metrics
    listenAddress: 0.0.0.0:7203 # until https://github.com/m3db/m3/issues/682 is resolved
  sanitization: prometheus
  samplingRate: 1.0
  extended: none
  
tagOptions:
  idScheme: quoted

clusters:
   - namespaces:
# We created a namespace called "default" and had set it to retention "48h".
       - namespace: default
         retention: 48h
         type: unaggregated
     client:
       config:
         service:
           env: default_env
           zone: embedded
           service: m3db
           cacheDir: /var/lib/m3kv
           etcdClusters:
             - zone: embedded
               endpoints:
# We have five M3DB nodes but only three are seed nodes, they are listed here.
                 - M3DB_NODE_01_STATIC_IP_ADDRESS:2379
                 - M3DB_NODE_02_STATIC_IP_ADDRESS:2379
                 - M3DB_NODE_03_STATIC_IP_ADDRESS:2379
       writeConsistencyLevel: majority
       readConsistencyLevel: unstrict_majority
       writeTimeout: 10s
       fetchTimeout: 15s
       connectTimeout: 20s
       writeRetry:
         initialBackoff: 500ms
         backoffFactor: 3
         maxRetries: 2
         jitter: true
       fetchRetry:
         initialBackoff: 500ms
         backoffFactor: 2
         maxRetries: 3
         jitter: true
       backgroundHealthCheckFailLimit: 4
       backgroundHealthCheckFailThrottleFactor: 0.5

Now start the process up:

m3coordinator -f <config-name.yml>

Or, use the docker container:

docker pull quay.io/m3db/m3coordinator:latest
docker run -p 7201:7201 --name m3coordinator -v <config-name.yml>:/etc/m3coordinator/m3coordinator.yml quay.io/m3db/m3coordinator:latest

Prometheus configuration

Add to your Prometheus configuration the m3coordinator sidecar remote read/write endpoints, something like:

remote_read:
  - url: "http://localhost:7201/api/v1/prom/remote/read"
    # To test reading even when local Prometheus has the data
    read_recent: true
remote_write:
  - url: "http://localhost:7201/api/v1/prom/remote/write"

Also, we recommend adding M3DB and M3Coordinator/M3Query to your list of jobs under scrape_configs so that you can monitor them using Prometheus. With this scraping setup, you can also use our pre-configured M3DB Grafana dashboard.

- job_name: 'm3db'
  static_configs:
    - targets: ['<M3DB_HOST_NAME_1>:7203', '<M3DB_HOST_NAME_2>:7203', '<M3DB_HOST_NAME_3>:7203']
- job_name: 'm3coordinator'
  static_configs:
    - targets: ['<M3COORDINATOR_HOST_NAME_1>:7203']

NOTE: If you are running M3DB with embedded M3Coordinator, you should only have one job. We recommend just calling this job m3. For example:

- job_name: 'm3'
  static_configs:
    - targets: ['<HOST_NAME>:7203']

Querying With Grafana

When using the Prometheus integration with Grafana, there are two different ways you can query for your metrics. The first option is to configure Grafana to query Prometheus directly by following these instructions.

Alternatively, you can configure Grafana to read metrics directly from M3Coordinator in which case you will bypass Prometheus entirely and use M3’s PromQL engine instead. To set this up, follow the same instructions from the previous step, but set the url to: http://<M3_COORDINATOR_HOST_NAME>:7201.