Querying events & time series

Create your first client-facing API in Queries or connect to Better Stack Warehouse to run exploratory queries with the Ad-hoc SQL API.

Start exploring your events in Queries -> Create query. Select a source. Use the following SQL query:

SQL query counting number of events by level
SELECT {{time}} AS time,
  COUNT(*) AS value,
  JSONExtract(raw, 'level', 'Nullable(String)') AS series
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time, series

ClickHouse SQL queries are largely similar to ANSI SQL you’re likely familiar with.

In the query above, we use COUNT(*) to get the number of events for each level.

Extracting event fields

The easiest way to extract a particular event field is to use the JSON dot notation:

Extracting JSON attribute using JSON dot notation
SELECT json.my_field.my_value
FROM {{source}}

Alternatively, extract top-level or nested fields using the JSONExtract() function:

  • Top-level field level: JSONExtract(raw, 'level', 'Nullable(String)')
  • Nested field context.request.status: JSONExtract(raw, 'context', 'request', 'status', 'Nullable(String)')
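
Putting it together, a full query grouping events by a nested field follows the same shape as the level example above (context.request.status is an illustrative field name; adjust the path to match your events):

SQL query counting events by a nested field
SELECT {{time}} AS time,
  COUNT(*) AS value,
  JSONExtract(raw, 'context', 'request', 'status', 'Nullable(String)') AS series
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time, series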

Case-insensitive field access

If the case of fields in your events can vary (for example, level and Level), you can use JSONExtractCaseInsensitive, in the same way as JSONExtract.

JSONExtractCaseInsensitive(raw, 'level', 'Nullable(String)') would fetch either level or Level from an event, depending on which is present.
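
For example, the level query from the start of this page can be made tolerant of either casing; only the extraction function changes:

SQL query counting events by level regardless of casing
SELECT {{time}} AS time,
  COUNT(*) AS value,
  JSONExtractCaseInsensitive(raw, 'level', 'Nullable(String)') AS series
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time, series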

Query variables

On top of SQL, queries feature variables like {{source}} or {{time}}. Variables conveniently insert the selected source, current time range, and other selected values into your queries. Read more about query variables.
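
As a rough intuition for what the variables stand for (illustrative only; the actual expansion is handled by Better Stack and depends on your source, chart resolution, and selected time range):

What the query variables stand for (illustrative)
-- {{source}}     -> the table backing the selected source
-- {{time}}       -> a time-bucketing expression matching the chart resolution
-- {{start_time}} -> the start of the selected time range
-- {{end_time}}   -> the end of the selected time range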

Examples: querying events

Chart number of events by host

Chart how many events your servers produce:

SQL query counting number of events by host
SELECT {{time}} AS time,
  COUNT(*) AS value,
  JSONExtract(raw, 'context', 'hostname', 'Nullable(String)') AS series
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time, series

Want to use the extracted field in fast APIs?

Extract it as a time series to be stored on fast NVMe SSD drives.

Add a time series in Sources -> Your source -> Time series on NVMe SSD:

  • Metric: Name your metric hostname.
  • JSON dot notation: Write JSON path context.hostname.
  • Type: Select String.
  • Aggregations: Leave No aggregation selected.

Once added, you can use the hostname column in Queries or the Ad-hoc SQL API when you select time series in the source picker.
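
With the time series in place, a sketch of the host chart reading from the fast columns instead of raw JSON (countMerge(events_count) is explained under Querying time series below):

Query counting events by host using the hostname time series
SELECT {{time}} AS time,
  countMerge(events_count) AS value,
  hostname AS series
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time, series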

Track specific error in your events

Match specific errors in the event's message field using CASE and LIKE. The LIKE operator performs a case-sensitive search; use ILIKE if you need a case-insensitive match:

SQL query to match a specific error
SELECT {{time}} AS time, 
  COUNT(CASE WHEN 
    JSONExtractString(raw, 'message')
    LIKE '%connection lost%' 
THEN 1
  END) AS value
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time
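
COUNT only counts non-NULL values, so the CASE above omits the ELSE branch and non-matching rows stay out of the tally. ClickHouse's countIf aggregate expresses the same idea more compactly:

Equivalent query using countIf
SELECT {{time}} AS time,
  countIf(JSONExtractString(raw, 'message') LIKE '%connection lost%') AS value
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time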

Create tailored charts by writing your own custom SQL queries.

Want a dashboard that works out of the box?

Use one of our ready-to-use templates to create a dashboard.

Querying time series

Create a new time series query:

Query counting the number of events by level
SELECT {{time}} AS time,
  countMerge(events_count) AS value,
  level AS series
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time, series

Time series queries use ClickHouse SQL, which is largely similar to ANSI SQL you’re likely familiar with. In the SQL query above, we leverage the countMerge ClickHouse function.

Queries don't access your events directly. To make querying fast and efficient, we process your events into time series and store them as partial aggregation states using ClickHouse -State combinators. Time series queries then use the matching -Merge functions to read the final values. Read more about querying time series below.
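
As a self-contained illustration of how -State and -Merge pair up (plain ClickHouse, nothing Better Stack specific; the inner query stands in for the stored time series):

Illustration of -State and -Merge combinators
SELECT countMerge(state) AS total  -- finishes the aggregation: returns 10
FROM (
  SELECT countState() AS state     -- a partial aggregate state, as stored in time series
  FROM numbers(10)
)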

How to query time series?

Want to count the number of events?

Use countMerge(events_count). Avoid using count() as it returns incorrect results. Metric tables don't have the same number of rows as the original event tables.
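
A quick contrast (sketch; both queries run, but only the first answers the question):

Counting events from time series
-- ✅ number of original events:
SELECT countMerge(events_count) AS value FROM {{source}}
-- ❌ counts pre-aggregated rows, not events:
-- SELECT count() AS value FROM {{source}}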

Query time series with no aggregation by their name, just like regular database columns.

For time series with aggregations, use avgMerge(metric_name_avg) to get the average value of a time series.

Similarly, use minMerge(metric_name_min) and maxMerge(metric_name_max) to get a time series' minimum and maximum values. Make sure to always match the -Merge function with the metric suffix correctly:

  • ✅ Correct: Using sumMerge with _sum suffix: sumMerge(metric_name_sum)
  • ❌ Incorrect: Mixing sum and count: sumMerge(metric_name_count)
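
For instance, a sketch charting the minimum, average, and maximum of a hypothetical request_duration time series configured with min, avg, and max aggregations:

Query charting min, avg, and max of a time series
SELECT {{time}} AS time,
  minMerge(request_duration_min) AS "Min duration",
  avgMerge(request_duration_avg) AS "Avg duration",
  maxMerge(request_duration_max) AS "Max duration"
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time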

Want to use percentiles?
Check out the complete example on percentiles below.

Which time series can you query?

You can add more time series to use in your SQL queries.

Screenshot of Logs to metrics tables

Examples: querying time series

See the examples below for full queries using Time series.

Number of errors

Chart the number of events with the error event level:

Query counting number of errors from NVMe SSD-stored time series
SELECT {{time}} AS time, countMerge(events_count) AS value, level AS series
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
  AND level = 'error'
GROUP BY time, series

Use countMerge(events_count) to count the number of events using time series. Avoid using count(*).

Filter error events using the level metric. The metric has to be present in the Time series on NVMe SSD configuration with No aggregation selected.

Average request duration

Chart the average duration of requests:

Query for average request duration
SELECT {{time}} AS time,
  avgMerge(request_duration_avg) AS "Request duration"
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time

Use avgMerge(request_duration_avg) to calculate an average for the request_duration time series. The time series has to be added in Sources -> Your source -> Time series on NVMe SSD with avg aggregation.

The column name Request duration will be used in chart tooltips and the legend. Select which column should be charted on the Y-axis on the right side in Chart settings -> Axes.

Percentile request duration

Chart the 95th percentile request duration:

Query for percentile request duration
SELECT {{time}} AS time,
  quantilesMerge(0.5, 0.9, 0.95, 0.99)(request_duration_quantiles)[3]
    AS "95th percentile duration"
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
GROUP BY time

Here we have the request_duration metric with p50, p90, p95, and p99 percentiles set up in Time series expressions. Calculate the 95th percentile request duration using quantilesMerge. The quantilesMerge function returns all four percentiles as an array; ClickHouse arrays are 1-indexed, so [3] selects the third element, the 95th percentile.

Need help?

Please let us know at hello@betterstack.com.
We're happy to help! 🙏