What Is A Prometheus Rule?
A Prometheus rule defines how Prometheus processes time-series data, primarily for recording precomputed results or triggering alerts. Rules are written in YAML and referenced in Prometheus' configuration under the rule_files section.
Recording Rules
Recording rules create new time series by precomputing results from PromQL queries. This simplifies complex queries and speeds up dashboards and alerts that evaluate the same expression repeatedly. For example, to calculate the average CPU usage across nodes:
groups:
  - name: recording_rules
    rules:
      - record: job:node_cpu:avg
        # Usage = 1 - idle fraction; averaging over every mode would include idle time.
        expr: 1 - avg by (job) (rate(node_cpu_seconds_total{mode="idle"}[5m]))
The record field specifies the name of the new time series, while expr defines the query to compute its value. Recording rules reduce computational overhead and simplify dashboards and alerts.
Alerting Rules
Alerting rules evaluate conditions to trigger alerts. When conditions are met, Prometheus generates an alert and sends it to external systems like Alertmanager. For example, to alert when CPU usage exceeds 90% for five minutes:
groups:
  - name: alerting_rules
    rules:
      - alert: HighCPUUsage
        # Usage = 1 - idle fraction; an average over all modes would never reach 0.9.
        expr: (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))) > 0.9
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage has exceeded 90% for the last five minutes."
The alert field names the alert, while expr defines the condition to evaluate. The for field specifies how long the condition must hold continuously before the alert fires (until then the alert is pending), and labels and annotations provide context for notifications.
Configuring Rules
Save rules to a file (e.g., rules.yml) and reference it in the Prometheus configuration:
rule_files:
  - /path/to/rules.yml
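As a rough sketch of how this fits into a fuller configuration, the fragment below places rule_files next to a global evaluation interval. The 30s interval and the /etc/prometheus/rules/ directory are illustrative assumptions rather than values from this guide. Entries under rule_files may also be glob patterns, which helps when rules are split across several files.

```yaml
# prometheus.yml (fragment); the interval and the second path are assumed for illustration
global:
  evaluation_interval: 30s        # default frequency at which rule groups are evaluated
rule_files:
  - /path/to/rules.yml
  - /etc/prometheus/rules/*.yml   # hypothetical directory, matched as a glob
```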
Reload Prometheus to apply changes using:
kill -HUP $(pgrep prometheus)
Best Practices
Group related rules logically and precompute complex queries with recording rules to simplify alerting. Use reasonable evaluation intervals to avoid overloading Prometheus and include meaningful labels and annotations for clarity. Test PromQL expressions in the Prometheus UI before deploying them in rules.
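One way these practices might look together is sketched below: a single group with its own evaluation interval, a recording rule that precomputes CPU usage, and an alert that reads the recorded series instead of repeating the query. The group name, the recorded series name, and the 1m interval are assumptions chosen for illustration. Rules within a group are evaluated in order, so the alert sees the value recorded just before it.

```yaml
groups:
  - name: node_cpu        # hypothetical group name
    interval: 1m          # assumed value; overrides the global evaluation_interval for this group
    rules:
      # Precompute per-instance CPU usage as 1 minus the idle fraction.
      - record: instance:node_cpu_utilisation:avg_rate5m
        expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))
      # The alert reuses the recorded series, keeping its expression short.
      - alert: HighCPUUsage
        expr: instance:node_cpu_utilisation:avg_rate5m > 0.9
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is {{ $value | humanizePercentage }}, above the 90% threshold."
```

Keeping the threshold logic in the alert and the arithmetic in the recording rule means dashboards can chart the same recorded series that the alert evaluates.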
Prometheus rules improve monitoring by enabling faster queries and proactive alerting, making systems easier to manage and maintain.