Understanding the differences between Alertmanager’s group_wait, group_interval and repeat_interval

Alertmanager is an application that handles alerts sent by client applications such as Prometheus. It can also perform alert grouping, deduplication, silencing, and inhibition, which makes it a useful addition to any modern monitoring infrastructure. That being said, configuring it can be a little daunting, given the many configuration options available and the somewhat vague explanations of some of the terms. While configuring Alertmanager, I came across these three confusing terms: group_wait, group_interval and repeat_interval.
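
For orientation, here is a minimal sketch of where the three settings live in an Alertmanager configuration file. The receiver name `default` and the `group_by` label are placeholders chosen for illustration, and the durations shown are simply Alertmanager's documented defaults, not recommendations.

```yaml
route:
  receiver: default          # placeholder receiver name (assumption)
  group_by: ['alertname']    # illustrative grouping label (assumption)
  group_wait: 30s            # wait before the first notification for a new group
  group_interval: 5m         # wait before notifying about new alerts added to an existing group
  repeat_interval: 4h        # wait before re-sending a notification that was already sent

receivers:
  - name: default            # no integrations configured in this sketch
```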
Read more →

Node-exporter setup with Systemd

For those who aren’t familiar, node-exporter is a Prometheus exporter that exposes hardware and OS metrics from *NIX kernels. To get it up and running, there’s a simple guide in the official Prometheus docs. The issue with that approach is that executing the binary directly isn’t reliable enough for a production environment: nothing ensures that the node_exporter process keeps running, for example after a crash or a reboot. This is where systemd comes in.
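
As a rough sketch of the idea, a systemd unit along these lines would keep the process supervised. The unit file path, the `node_exporter` user and group, and the binary location `/usr/local/bin/node_exporter` are all assumptions made for illustration; adjust them to match the actual installation.

```ini
# Assumed location: /etc/systemd/system/node_exporter.service
[Unit]
Description=Prometheus Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
# Assumes a dedicated, unprivileged node_exporter user and group exist
User=node_exporter
Group=node_exporter
Type=simple
# Assumed install path of the binary
ExecStart=/usr/local/bin/node_exporter
# Restart the process if it exits with an error
Restart=on-failure
RestartSec=5

[Install]
# Start the service at boot
WantedBy=multi-user.target
```

With a unit like this in place, `systemctl daemon-reload` followed by `systemctl enable --now node_exporter` would load it, start it immediately, and start it again on every boot.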
Read more →