Prometheus - Convert cpu_user_seconds to CPU Usage %?

Better Stack Team
Updated on November 18, 2024

To convert cpu_user_seconds (or a similar counter of CPU time) into a CPU usage percentage in Prometheus, calculate the per-second rate of CPU time consumed over a chosen window, divide that rate by the number of available CPU cores, and multiply by 100.

Step 1: Understanding the Metric

Assuming you have a metric called container_cpu_user_seconds_total, which tracks the total user CPU time consumed by the containers, you can calculate the CPU usage percentage as follows:

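  1. Rate Calculation: Use the rate() function to get the per-second rate of CPU time consumed. Because the counter is measured in seconds, the result is CPU-seconds per second, that is, the number of cores in use.
  2. Normalization: Divide that rate by the total number of available CPU cores, then multiply by 100 to get the percentage.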
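
To make the two steps concrete, here is a minimal Python sketch of the arithmetic only, not of Prometheus itself. It assumes two samples of a CPU-seconds counter taken a known number of seconds apart. The function name and sample values are hypothetical, and unlike rate() the sketch ignores counter resets and extrapolation.

```python
def cpu_percent(first_seconds, last_seconds, window_seconds, cores):
    """Approximate CPU usage % from two counter samples (illustrative only)."""
    # Step 1: per-second rate = CPU-seconds consumed per second of wall time.
    rate_cores = (last_seconds - first_seconds) / window_seconds
    # Step 2: normalize by the available cores and scale to a percentage.
    return 100 * rate_cores / cores


# Hypothetical samples: 60 CPU-seconds consumed over a 300-second window, 4 cores.
usage = cpu_percent(100.0, 160.0, 300, 4)
print(f"{usage:.1f}%")  # prints 5.0%
```

Here 60 CPU-seconds over 300 seconds is 0.2 cores, which is 5% of a 4-core machine.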

Step 2: Sample Query

Here's a sample query that calculates the CPU usage percentage based on container_cpu_user_seconds_total:

 
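100 * sum(rate(container_cpu_user_seconds_total[5m])) by (pod, namespace) / scalar(count(node_cpu_seconds_total{mode="user"}))

Breakdown of the Query

  • rate(container_cpu_user_seconds_total[5m]): This computes the per-second rate of user CPU time consumed, averaged over the last 5 minutes. You can adjust the window as needed.
  • sum(...) by (pod, namespace): This aggregates the CPU usage of all containers, grouped by pod and namespace.
  • scalar(count(node_cpu_seconds_total{mode="user"})): node_cpu_seconds_total has one series per core and per mode, so counting the series for a single mode yields the number of CPU cores across all scraped nodes. Wrapping the count in scalar() lets it divide every pod series. Without it, PromQL tries to match series by label, and the pod-labelled series on the left have no counterpart on the right, so they drop out of the result. Do not replace count with sum here: summing the metric adds up accumulated CPU seconds, not core capacity.
  • 100 * ...: This converts the ratio into a percentage.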
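
The grouping and division steps can be sketched in plain Python as follows. This is only an illustration of the arithmetic: the label values, the per-container rates, and the core count are made-up inputs standing in for what the rate(...) and count(...) expressions would return.

```python
from collections import defaultdict

# Hypothetical per-container rates (in cores), as rate(...) might return them.
container_rates = [
    ({"namespace": "shop", "pod": "api-0", "container": "app"}, 0.30),
    ({"namespace": "shop", "pod": "api-0", "container": "sidecar"}, 0.10),
    ({"namespace": "shop", "pod": "db-0", "container": "postgres"}, 0.80),
]
total_cores = 8  # assumed result of the core-count expression


def percent_by_pod(rates, cores):
    """Sum rates per (pod, namespace), then express each as a % of all cores."""
    grouped = defaultdict(float)
    for labels, rate in rates:
        grouped[(labels["pod"], labels["namespace"])] += rate
    return {key: 100 * value / cores for key, value in grouped.items()}


for (pod, namespace), pct in percent_by_pod(container_rates, total_cores).items():
    print(f"{namespace}/{pod}: {pct:.1f}%")
```

The two containers of api-0 collapse into one entry, mirroring what sum(...) by (pod, namespace) does to the container label.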

Example with Total CPU Cores

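If you prefer to make the per-node core count explicit, count the cores on each instance first and then sum across instances. Filter node_cpu_seconds_total to a single mode so that each core is counted once, and wrap the total in scalar() so it can divide every pod series:

 
100 * sum(rate(container_cpu_user_seconds_total[5m])) by (pod, namespace) / scalar(sum(count(node_cpu_seconds_total{mode="user"}) by (instance)))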
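
One detail worth checking in any core count: node_cpu_seconds_total exposes one series per CPU core and per mode, so the number of series is cores times modes unless the selector is restricted to one mode. The Python sketch below only simulates that label layout. The instance name and the list of modes (typical for node_exporter on Linux) are assumptions for illustration.

```python
# Hypothetical label sets for node_cpu_seconds_total on one 2-core node.
modes = ["idle", "iowait", "irq", "nice", "softirq", "steal", "system", "user"]
series = [
    {"instance": "node-a:9100", "cpu": str(cpu), "mode": mode}
    for cpu in range(2)
    for mode in modes
]


def count_series(series, mode=None):
    """Count series, optionally keeping only those with the given mode label."""
    return sum(1 for s in series if mode is None or s["mode"] == mode)


print(count_series(series))          # prints 16: cores x modes, not a core count
print(count_series(series, "user"))  # prints 2: one series per core
```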

Step 3: Adjusting for Other Modes

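If you want to include other CPU modes, such as system time, add their rates together before aggregating. Note that the container metrics cover user and system time but not idle time; idle time is reported only at the node level, through node_cpu_seconds_total{mode="idle"}. For example:

 
100 * sum(rate(container_cpu_user_seconds_total[5m]) + rate(container_cpu_system_seconds_total[5m])) by (pod, namespace) / scalar(sum(count(node_cpu_seconds_total{mode="user"}) by (instance)))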
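
When two rate vectors are added with +, PromQL pairs series that have identical label sets and drops series that appear on only one side. The following Python sketch mimics that pairing with dictionaries keyed by a label tuple. All keys and values are hypothetical.

```python
# Hypothetical per-container rates (in cores), keyed by (namespace, pod, container).
user_rates = {
    ("shop", "api-0", "app"): 0.30,
    ("shop", "db-0", "postgres"): 0.50,
}
system_rates = {
    ("shop", "api-0", "app"): 0.10,
    ("shop", "db-0", "postgres"): 0.25,
    ("shop", "job-0", "worker"): 0.05,  # no matching user series, so it is dropped
}


def combined_rates(user, system):
    """Add user and system rates for series present in both inputs."""
    return {key: user[key] + system[key] for key in user.keys() & system.keys()}


combined = combined_rates(user_rates, system_rates)
print(len(combined))  # prints 2
```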

Conclusion

By using the rate function along with aggregation and normalization, you can effectively convert CPU usage in seconds to a percentage in Prometheus. This allows for better visibility into resource utilization within your Kubernetes or other environments.

Licensed under CC-BY-NC-SA

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
