# Writing SQL queries

<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
  <iframe 
    style="position: absolute; top: 0; left: 0; width: 100%; height: 100%;" 
    src="https://www.youtube.com/embed/kf97nwgL88M" 
    title="YouTube video player" 
    frameborder="0" 
    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" 
    referrerpolicy="strict-origin-when-cross-origin" 
    allowfullscreen>
  </iframe>
</div>

Create tailored charts by writing your own custom SQL queries.

[info]
#### Want a dashboard that works out of the box? 
Use one of our ready-to-use templates to [create a dashboard](https://telemetry.betterstack.com/team/0/dashboards ";_blank")
[/info]

## Getting started

Create your first chart using a custom SQL query. Start by creating a dashboard in [Dashboards](https://telemetry.betterstack.com/team/0/dashboards ";_blank") → **Create dashboard**.

Create a new chart on your dashboard and use the query below:

```sql
[label Query counting the number of log lines by level]
SELECT {{time}} AS time,
  countMerge(events_count) AS value,
  label('level') AS series
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
  AND _row_type = 'log'
GROUP BY time, series
```

![Chart preview](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/f3c07c18-9580-4964-c7d5-82112f7e1b00/lg2x =1932x958)


[warning]
#### Getting `Missing columns` error?
Try selecting a different source or [add a metric for the column](https://betterstack.com/docs/logs/dashboards/logs-to-metrics/) to your source.
[/warning]

Dashboard queries use [ClickHouse SQL](https://clickhouse.com/docs/en/sql-reference), which is largely similar to the ANSI SQL you’re likely familiar with. In the SQL query above, we leverage the `countMerge` [ClickHouse function](https://clickhouse.com/docs/en/sql-reference/aggregate-functions/combinators#-merge). 


[info]
#### Want an in-depth guide on ClickHouse aggregations?

Check out the official blog post about [ClickHouse aggregate combinators](https://clickhouse.com/blog/aggregate-functions-combinators-in-clickhouse-for-arrays-maps-and-states)
[/info]

On top of SQL, dashboard queries feature variables like `{{source}}` or `{{time}}`. Variables conveniently insert selected source, current time range, and other selected values into your queries. Read more about [query variables](https://betterstack.com/docs/logs/dashboards/variables/).

Queries run against a unified row-based metrics schema. All data is stored in a single format, with metrics stored as named values and their labels in a `tags` map. This allows for fast and efficient querying. Read more about [querying the metrics](#how-to-query-the-metrics) below.

[info]
#### Not familiar with SQL?

Follow our examples below or start with [SQL tutorial from W3Schools](https://www.w3schools.com/sql/sql_syntax.asp)
[/info]

[info]
#### Curious about ClickHouse SQL?

Read more about [ClickHouse SQL syntax](https://clickhouse.com/docs/en/sql-reference/syntax) and [compatibility with ANSI SQL](https://clickhouse.com/docs/en/sql-reference/ansi)
[/info]

## How to query the metrics?

We store all data in a unified format, making querying consistent.

*   **Accessing Custom Fields:** All custom or dynamic columns (labels/tags) use the `label()` function. For example, `response_status` becomes `label('response_status')`. Labels are always strings, so `toString(label(...))` is redundant.
*   **Standard Columns:** Columns like `time`, `dt`, `name`, `_row_type`, `series_id`, and `tags` are part of the schema and should **not** use the `label()` wrapper.
*   **Row Type Filtering:** Each ingested event creates multiple rows. Use `_row_type = 'log'` or `_row_type = 'span'` to get counts of the original events. Do **not** use `_row_type` when querying named metrics (where `name = 'metric_name'`).
*   **Type Handling for Labels:** Since `label()` returns a string, explicit type conversions (`toUInt16OrNull()`, `toFloat64OrNull()`) are required for numeric comparisons or operations.
*   **Event Counting:** Use `countMerge(events_count)` for aggregated metrics queries.
*   **No Nested Aggregate Functions:** ClickHouse does not allow nesting aggregate functions (e.g., `sum(countMerge(...))`). Merge functions (`countMerge`, `sumMerge`, `avgMerge`, etc.) already combine internal states.
*   **Time Filtering:** Use `time` in `WHERE` clauses for accurate filtering across bucket boundaries, and `{{time}}` in `SELECT`/`GROUP BY` for bucketing.
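Several of these rules can be combined in a single query. The sketch below is illustrative: it assumes a `response_status` label has been added in your **Logs to metrics** configuration, converts it to a number, and counts server errors in log rows:

```sql
[label Illustrative query counting responses with 5xx status]
SELECT {{time}} AS time,
  countMerge(events_count) AS value
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
  AND _row_type = 'log'
  -- label() returns a string, so convert it before the numeric comparison
  AND toUInt16OrNull(label('response_status')) >= 500
GROUP BY time
```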

[warning]
#### Want to count the number of lines?
Use `countMerge(events_count)`. Avoid using `count(*)` as it returns incorrect results. Metric tables don't have the same number of rows as the original log tables.
[/warning]

For metrics with aggregations, use `avgMerge(value_avg)` to get the average value of a metric.

Similarly, use `minMerge(value_min)` and `maxMerge(value_max)` to get a metric's minimum and maximum values.
Make sure to always match the -Merge function with the metric suffix correctly:

- ✅ Correct: Using **sum**Merge with **_sum** suffix: `sumMerge(bucket_sum)` 
- ❌ Incorrect: Mixing **sum** and **count**: `sumMerge(bucket_count)`
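For example, a single query can chart the average, minimum, and maximum of a metric, pairing each -Merge function with its matching suffix. This sketch assumes a `request_duration` metric configured with `avg`, `min`, and `max` aggregations:

```sql
[label Query charting average, minimum, and maximum duration]
SELECT {{time}} AS time,
  avgMerge(value_avg) AS "Average",
  minMerge(value_min) AS "Minimum",
  maxMerge(value_max) AS "Maximum"
FROM {{source}}
WHERE dt BETWEEN {{start_time}} AND {{end_time}}
  AND name = 'request_duration'
GROUP BY time
```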


**Want to use percentiles?**  
Check out the [complete example on percentiles](#percentile-request-duration) below. 


### Which metrics can you query?

We automatically create metrics for your logs based on the platform of your source.

Find the complete list of available metrics for your logs in [Sources](https://telemetry.betterstack.com/team/0/sources ";_blank") → **your source** → **Configure** → **Logs to metrics** tab.  

See **Query with** to learn how to query each metric.  
Use either [ClickHouse aggregations](https://clickhouse.com/docs/en/sql-reference/aggregate-functions) or the name of the metric, depending on the configured aggregations.
 
You can [add more metrics](https://betterstack.com/docs/logs/dashboards/logs-to-metrics/) to use in your SQL queries.

![Screenshot of Logs to metrics tables](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/3f146f41-9e37-4862-bba1-cac0bd20e000/md1x =2052x1024)

## Query examples

See the examples below for full queries using **Metrics**.

### Number of errors

Chart the number of logs with the error log level:

```sql
[label Query counting number of errors]
SELECT {{time}} AS time,
  countMerge(events_count) AS value,
  label('level') AS series
FROM {{source}}
WHERE dt BETWEEN {{start_time}} AND {{end_time}}
  AND _row_type = 'log'
  AND series = 'error'
GROUP BY time, series
```

Use `countMerge(events_count)` to count the number of log lines. Avoid using `count(*)`.

Filter error log lines using the `label('level')` metric. The metric has to be present in **Logs to metrics** configuration with **No aggregation** selected.

### Average request duration

Chart the average duration of requests:

```sql
[label Query for average request duration]
SELECT {{time}} AS time,
  avgMerge(value_avg) AS "Request duration"
FROM {{source}}
WHERE dt BETWEEN {{start_time}} AND {{end_time}}
  AND name = 'request_duration'
GROUP BY time
```

Use `avgMerge(value_avg)` to calculate an average for the `request_duration` metric. The metric has to be added in **Logs to metrics** → **Metrics** with `avg` aggregation.

The column name `Request duration` will be used in chart tooltips and the legend. Select which column to chart on the Y-axis in the right sidebar under **Chart settings** → **Axes**.


### Percentile request duration

Chart the 95th percentile request duration:

```sql
[label Query for percentile request duration]
SELECT {{time}} AS time,
  histogramQuantile(0.95) AS "95th percentile duration"
FROM {{source}}
WHERE dt BETWEEN {{start_time}} AND {{end_time}}
  AND name = 'request_duration_seconds'
GROUP BY time
```

Here we use the `histogramQuantile(0.95)` helper function to calculate the 95th percentile of the `request_duration_seconds` metric. This assumes `request_duration_seconds` is stored as a histogram metric. When querying multiple histogram metrics at once, use `quantilePrometheusHistogramArrayMergeIf` instead, as described in the histogram documentation.
 

[info]
#### Want to learn how to use the aggregation functions?

Check out the official guide on [ClickHouse Aggregate Combinators](https://clickhouse.com/blog/aggregate-functions-combinators-in-clickhouse-for-arrays-maps-and-states)
[/info]

## Prometheus-like metrics

[info]
#### Want to access metric labels?

Use `label('tag_name')` to get the value of any metric tag. Use the resulting tag value in `SELECT`, `GROUP BY`, or anywhere else in your SQL queries.
[/info]

Metrics you send to Better Stack are processed like logs for fast and efficient querying.
We process your metrics and store them as aggregate function states using [ClickHouse -State combinators](https://clickhouse.com/docs/en/sql-reference/aggregate-functions/combinators#-state).   

Everything about [querying the metrics](#how-to-query-the-metrics) written above also applies to writing queries for Prometheus-like metrics.

The most significant difference when writing queries for Prometheus-like metrics is the **Logs to metrics** configuration. We automatically configure **Metrics** for your Prometheus-like metrics.
 
Let's take a look at the **Logs to metrics** configuration for metric sources.  
You can find it in [Sources](https://telemetry.betterstack.com/team/0/sources ";_blank") → **your source** → **Configure** → **Logs to metrics** tab: 


![Screenshot of Logs to metrics for Prometheus-like metrics](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/d508491e-a2bf-444e-5165-7ed995911100/lg1x =2250x1618)


- **Value** metric represents the immediate value of the metric. Use it with gauge metrics or anytime you want to chart the metric value directly.

- **Rate** represents how much the metric value has increased per second. A common use for rates is calculating the number of requests per second. Use rate with counter metrics. 

[info]
#### What are gauges and counter metrics?

A gauge is a metric that can go up and down, used for values like CPU or memory usage.    
A counter is a metric that can only increase, commonly used for the number of requests.  
Read more about [metric types](https://prometheus.io/docs/concepts/metric_types/)
[/info]


- **Tags** contain all metric tags, commonly called labels. Use `label('tag_name')` to get the value of any tag. Use the resulting tag value in `SELECT`, `GROUP BY`, or anywhere else in your SQL queries.


- **Name** is the name of your metric. Use it in `SELECT`, `WHERE`, or anywhere else in your SQL queries.


- **Series_id** is a unique value for each metric series. Use `GROUP BY series_id` when aggregating your metrics.


[info]
#### What is a metric series?

Metric series represent the individual series of values that make up your metric. For example, a metric with a `hostname` tag will have one series for each hostname. 
[/info]
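Putting these columns together, here is a minimal sketch charting a gauge broken down by one of its tags. The metric name `cpu_load` and the `host` tag are illustrative; substitute your own metric and tag names:

```sql
[label Query charting a gauge metric per tag value]
SELECT {{time}} AS time,
  label('host') AS series,
  maxMerge(value_max) AS value
FROM {{source}}
WHERE name = 'cpu_load'
  AND dt BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time, series
```

Maximums can be aggregated directly across series. For averages or sums, aggregate per `series_id` first, as shown in the examples below.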


## Examples with metrics

See complete SQL query examples using [ClickHouse aggregator combinators](https://clickhouse.com/blog/aggregate-functions-combinators-in-clickhouse-for-arrays-maps-and-states) and Prometheus-like metrics. 

### View Prometheus metrics

List your metrics with their tags and values as a **Table** chart type: 

```sql
[label Query selecting raw metrics]
SELECT {{time}} AS time, tags, name, anyLastMerge(value_last) AS value
FROM {{source}}
WHERE dt BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time, tags, name
LIMIT 10
```

Select metric `name`, `tags`, and `time`. Use `anyLastMerge(value_last)` to get an unaggregated value for the metric.


Visualize the data using a **Table** chart type. In the right sidebar, choose **Chart settings** → **Chart** → **Table**.


### CPU load

Chart CPU load using `cpu_load` gauge metric: 

```sql
[label Query calculating CPU load]
-- sum values across series
SELECT time, series, SUM(series_maximum)/100.0 as value
FROM (
  SELECT
    {{time}} AS time, 
    label('host') AS series,
    -- calculate maximum for each series
    maxMerge(value_max) as series_maximum
  FROM {{source}}
  WHERE name = 'cpu_load' AND dt BETWEEN {{start_time}} AND {{end_time}}
  GROUP BY series_id, time, series
)
GROUP BY time, series
```

Select `cpu_load` metric by its `name` using `WHERE name = 'cpu_load'`.  

Use `maxMerge(value_max)` to get the maximum CPU load in each time interval. Calculating the maximum gives you an accurate visualization of CPU load peaks. Using `avgMerge(value_avg)` is a good alternative if you care more about average values than peaks.

Write a subquery to aggregate and use a second query to get the final result:

- The subquery calculates a maximum for each metric series. Using `GROUP BY series_id` ensures that you get a correct maximum for each series.
- The outer query sums all metric series together to get the final result.

Chart CPU for multiple servers by setting `label('host')` as `series`.

### Number of requests

Chart the average number of served requests using `requests_total` counter metric: 

```sql
[label Query for number of requests]
-- sum values across series to get total rate
SELECT time, 'Requests per second' AS series, SUM(avg_rate) as value
FROM (
  SELECT
    {{time}} AS time, 
    -- calculate average rate for each series
    avgMerge(rate_avg) AS avg_rate
  FROM {{source}}
  WHERE name = 'requests_total' AND dt BETWEEN {{start_time}} AND {{end_time}}
  GROUP BY time, series_id
)
GROUP BY time
```

Select `requests_total` metric using `WHERE name = 'requests_total'`.  

Calculate the average number of served requests using `avgMerge(rate_avg)`. The metric `requests_total` is a counter metric: it represents the total number of requests. Use `rate_avg` to get the average increase of the metric per second. Usually, you should avoid using `value_*` columns with counter metrics. 

As in the previous example, you need to aggregate first to get the average for each metric series. Then, `SUM()` the averages in the outer query to get the final average number of served requests.

Use `SELECT 'Requests per second' as series` to set the name of the series. It will be used in tooltips and the chart legend.  

### Number of running servers

Get the number of running servers by counting the number of unique values of the `hostname` tag:

```sql
[label Query counting number of unique hosts]
SELECT 
  {{time}} AS time, 
  COUNT(DISTINCT label('hostname')) AS value
FROM {{source}}
WHERE name = 'up'
  AND dt BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time
```

Select the `up` metric using `WHERE name = 'up'` to only get `hostname` values for running servers. Always select metrics by name to prevent other metrics' tags and values from causing inaccuracies in your chart. Selecting only the metrics you need also makes your queries faster.

Count the total number of unique values for the `hostname` tag using `COUNT(DISTINCT label('hostname'))`.

## Need help?

Please let us know at hello@betterstack.com.  
We're happy to help! 🙏

