Overview
m3aggregator
is a distributed system written in Go. It does streaming aggregation of ingested time
series data before persisting it in m3db. The goal is to reduce the volume of
time series data stored (especially for longer retentions), which is achieved by reducing its
cardinality and/or datapoint resolution.
m3aggregator
is sharded for horizontal scalability and replicated (leader/
follower modes) for high availability.
The data processed is mapped to a predefined number of shards, depending on
hash of time series id.
Every node of m3aggregator
is responsible for a single
shardset which contains one or more shards assigned to it. The assignment of
shards to the shardsets, and shardsets to individual nodes is part of the
m3aggregator
placement which is stored in etcd as a
binary protobuf struct.
The communication protocol used by m3aggregator
is m3msg.
m3aggregator
receives the input data from m3coordinator.
See Flushing for how the aggregator outputs the
aggregated data (also for the inter-node communication).