Writing SQL queries

Create tailored charts by writing your own custom SQL queries.

Want a dashboard that works out of the box?

Use one of our ready-to-use templates to create a dashboard

Getting started

Create your first chart using a custom SQL query. Start by creating a dashboard in DashboardsCreate dashboard.

Create a new chart on your dashboard and use the query below:

Query counting the number of log lines by level
SELECT {{time}} AS time,
  countMerge(events_count) AS value,
  level AS series
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time, series

Chart preview

Dashboard queries use ClickHouse SQL, which is largely similar to ANSI SQL you’re likely familiar with. In the SQL query above, we leverage the countMerge ClickHouse function.

On top of SQL, dashboard queries feature variables like {{source}} or {{time}}. Variables conveniently insert selected source, current time range, and other selected values into your queries. Read more about query variables.

Queries don't access your logs directly. To make querying fast and efficient we process your logs into metrics and store them as ClickHouse -State combinators. Dashboard queries use ClickHouse -Merge functions to get the values of the metrics. Read more about querying the metrics below.

How to query the metrics?

Want to count the number of lines?

Use countMerge(events_count). Avoid using count() as it returns incorrect results. Metric tables don't have the same number of rows as the original log tables.

Query metrics with no aggregations by their name just like regular database columns.

For metrics with aggregations, use avgMerge(metric_name_avg) to get the average value of a metric.

Similarly, use minMerge(metric_name_min) and maxMerge(metric_name_max) to get a metric's minimum and maximum values. Make sure to always match the -Merge function with the metric suffix correctly:

  • ✅ Correct: Using sumMerge with _sum suffix: sumMerge(metric_name_sum)
  • ❌ Incorrect: Mixing sum and count: sumMerge(metric_name_count)

Want to use percentiles?
Check out the complete example on percentiles below.

Which metrics can you query?

We automatically create metrics for your logs based on the platform of your source.

Find the complete list of available metrics for your logs in Sourcesyour sourceConfigureLogs to metrics tab.

See Query with to know how to query the metric.
Use either ClickHouse aggregations or name of the metric depending on the configured aggregations.

You can add more metrics to use in your SQL queries.

Screenshot of Logs to metrics tables

Query examples

See the examples below for full queries using Metrics.

Number of errors

Chart the number of logs with the error log level:

Query counting number of errors
SELECT {{time}} AS time, countMerge(events_count) AS value
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
  AND level = 'error'
GROUP BY time, series

Use countMerge(events_count) to count the number of log lines. Avoid using count(*).

Filter error log lines using the level metric. The metric has to be present in Logs to metrics configuration with No aggregation selected.

Average request duration

Chart the average duration of request:

Query for average request duration
SELECT {{time}} AS time,
  avgMerge(request_duration_avg) AS "Request duration"
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}

Use avgMerge(request_duration_avg) to calculate an average for the request_duration metric. The metric has to be added in Logs to metricsMetrics with avg aggregation.

Column names Request duration will be used in chart tooltips and the legend. Select which column should be charted on the Y-axis on the right side in Chart settingAxes.

Percentile request duration

Chart the 95th percentile request duration:

Query for percentile request duration
SELECT {{time}} AS time,
  quantilesMerge(0.5, 0.9, 0.95, 0.99)(request_duration_quantiles)[3]
    AS "95th percentile duration"
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time, series

Here we have the request_duration metric with p50, p90, p95, and p99 percentiles set up in Logs to metrics. Calculate the 95th percentile request duration using quantilesMerge. The quantilesMerge function gives us all four percentiles. We select the third percentile using [3].

Adding metrics

Add metrics and in Sourcesyour sourceConfigureLogs to metrics tab.

Click the + Metric button and fill out the inputs:

  • Metric: Name your metric.

  • JSON dot notation: Extract nested fields using JSON path - e.g. context.user.email. For more complex metrics write your own SQL expression.

  • SQL expression: Use JSONExtract() or any valid ClickHouse SQL expression.

    • For example, use JSONExtract(json, 'request', 'duration_ms', 'Nullable(Float64)') to extract request.duration_ms field from your logs.
    • Using 'Nullable(Float64)' parses the value as float and prevents errors when the field is missing. See all ClickHouse data types.
  • Aggregations: Choose aggregations you want to use in your queries. You can select multiple aggregations. For example, use avg and p95 to get the average and 95th percentile values.

    • Choose No aggregation to get a metric without any aggregations you can use for grouping.

Screenshot of adding a new metric

Prometheus-like metrics

Want to access metric labels?

Use metricTag('tag_name') to get value for any metric tag. Use the resulting tag value to select, group by, or anywhere else in your SQL queries.

Metrics you send to Better Stack are processed like logs for fast and efficient querying. We process your metrics and store them as ClickHouse -State combinators.

Everything about querying the metrics written above also applies to writing queries for Prometheus-like metrics.

The most significant difference when writing queries for Prometheus-like metrics is the Logs to metrics configuration. We automatically configure Metrics for your Prometheus-like metrics.

Let's take a look at the Logs to metrics configuration for metric sources.
You can find it in Sourcesyour sourceConfigureLogs to metrics tab:

Screenshot of Logs to metrics for Prometheus-like metrics

  • Value metric represents the immediate value of the metric. Use it with gauge metrics or anytime you want to chart the metric value directly.

  • Rate represents how much the metric value has increased per second. Common use for rates is to calculate the number of requests per second. Use rate with counter metrics.

What are gauges and counter metrics?

Gauge is a metric that can go up and down, used for, e.g., CPU or memory usage.
Counter is a metric that can only increase, commonly used for number of requests.
Read more about metric types

  • Tags contain all metric tags, commonly called labels. Use metricTag('tag_name') to get value for any tag. Use the resulting tag value to select, group by, or anywhere else in your SQL queries.

  • Name is the name of your metric. Use it to select, filter, or anywhere else in your SQL queries.

  • Series_id is a unique value for each metric series. Use GROUP BY series_id when aggregating your metrics.

What is a metric series?

Metric series represent individual series of values that make up your metric. For example, a metric with hostname tag will have one series for each hostname.

Examples with metrics

See complete SQL query examples using ClickHouse aggregator combinators and Prometheus-like metrics.

View Prometheus metrics

List your metrics with their tags and values as a Table chart type:

Query selecting raw metrics
SELECT {{time}} AS time, tags, name, anyLastMerge(value_last)
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time, tags, name

Select metric name, tags, and time. Use anyLastMerge(value_last) to get an unaggregated value for the metric.

Visualize the data using a Table chart type. In the right sidebar, choose Chart settingsChartTable.

CPU load

Chart CPU load using cpu_load gauge metric:

Query calculating CPU load
-- sum values across series
SELECT time, series, SUM(series_maximum)/100.0 as value
    {{time}} AS time, 
    metricTag('host') AS series,
    -- calculate average for each series
    maxMerge(value_max) as series_maximum
  FROM {{source}}
  WHERE name = 'cpu_load' AND time BETWEEN {{start_time}} AND {{end_time}}
  GROUP BY time, series, series_id
GROUP BY time, series

Select cpu_load metric by its name using WHERE name = 'cpu_load'.

Use maxMerge(value_max) to get the maximal CPU load in each time interval. Calculating maximum gives you an accurate visualization of CPU load peaks. Using avgMerge(value_avg) is a good alternative if you care about average values more than peaks.

Write a subquery to aggregate and use a second query to get the final result:

  • Subquery calculates a maximum for each metric series. Using GROUP BY series_id ensures that you get a correct maximum for each series.
  • In the outer query, sum all metrics series together to get the final result.

Chart CPU for multiple servers by setting metricTag('host') as series.

Number of requests

Chart the average number of served requests using requests_total counter metric:

Query for number of requests
-- sum values across series to get total rate
SELECT time, 'Requests per second' AS series, SUM(avg_rate) as value
    {{time}} AS time, 
    -- calculate average rate for each series
    avgMerge(rate_avg) AS avg_rate
  FROM {{source}}
  WHERE name = 'requests_total' AND time BETWEEN {{start_time}} AND {{end_time}}
  GROUP BY time, series_id

Select requests_total metric using WHERE name = 'requests_total'.

Calculate the average number of served requests using avgMerge(rate_avg). The metric requests_total is a counter metric. It represents the total number of requests. Use rate_avg to get an average increase in the metric. Usually, you should avoid using value_* with counter metrics.

As in the previous example, you need to aggregate first to get the average for each metric series. Then, SUM() the averages in the second query to get the final average number of served requests.

Use SELECT 'Requests per second' as series to set the name of the series. It will be used in tooltips and the chart legend.

Number of running servers

Get the number of running servers by counting the number of values for hostname tag:

Query counting number of unique hosts
  {{time}} AS time, 
  COUNT(DISTINCT metricTag('hostname')) AS value
FROM {{source}}
WHERE name = 'up'
  AND time BETWEEN {{start_time}} AND {{end_time}}

Select the up metric using WHERE name = 'up' to only get hostname values for running servers. You should always select metrics by name to prevent other metrics tags and values from causing inaccuracies in your chart. Selecting only the metrics you need will also make your queries faster.

Count the total number of unique values for the hostname tag using COUNT(DISTINCT metricTag('hostname')).

Need help?

Please let us know at hello@betterstack.com.
We're happy to help! 🙏