# Wide events vs. metrics

Better Stack is powered by a **purpose-built data processing pipeline** for time-based events that leverages recent advances in stream processing, cloud storage, and data warehousing.

We developed the pipeline for processing massive internet-scale datasets: think petabytes to exabytes.

The pipeline has a unique combination of properties:

* Massively scalable
* No cardinality limitations
* Sub-second analytical queries
* Cost efficient

How do we achieve these seemingly mutually exclusive properties?

We work with two types of data: **wide events** and **metrics**.

## Wide event

A **wide event** is any time-based JSON document with an arbitrary structure, smaller than 10 MB, stored in object storage.
Think of an [OpenTelemetry](https://betterstack.com/docs/logs/open-telemetry/) span or any structured log line.
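
For illustration, a wide event could be a structured log line for an HTTP request, like this hypothetical example (all attribute names and values are made up):

```json
{
  "timestamp": "2025-05-18T15:46:12Z",
  "level": "info",
  "message": "request completed",
  "method": "GET",
  "path": "/api/orders",
  "status": 200,
  "duration": 0.042,
  "client_ip": "203.0.113.7"
}
```

Any of these attributes can later be filtered, charted, or extracted into metrics.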

![CleanShot 2025-05-18 at 3 .46.12.png](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/8a15cc33-758d-4180-d0c7-9c07be1ca900/lg2x =3680x2280)

## Metrics

**Metrics** are highly compressed, time-based data points with a pre-defined schema, stored on local NVMe SSD drives.
[Prometheus metrics](https://betterstack.com/docs/logs/ingesting-data/metrics/prometheus-scrape/), for example.
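
As an illustration, data points in the Prometheus exposition format look like this (metric name, labels, and values are made up):

```
# HELP http_requests_total Total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="GET",status="200"} 1027
http_requests_total{method="POST",status="500"} 3
```

Note the fixed shape: a name, a small set of labels, and a numeric value — exactly the kind of pre-defined schema that compresses well.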

![CleanShot 2025-05-18 at 4 .03.18.png](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/126061d2-87e5-48de-9238-3bf6b3402600/orig =3680x2280)

While you can ingest metric data directly, the secret sauce of our data pipeline is that it **integrates wide events with metrics via “logs-to-metrics” expressions.**

**Logs-to-metrics expressions** are SQL expressions that extract specific JSON attributes from wide events in real time into highly compressed, locally stored metric data.

![CleanShot 2025-05-18 at 3 .55.37.png](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/24e983f9-483f-49e9-1e41-2f9f79810400/md1x =3680x2280)

**Example:**
Say your structured logs (wide events) contain a `duration` attribute that you want to track in a dashboard over long time ranges on a massive dataset.

A logs-to-metrics expression `JSONExtract(raw, 'duration', 'Nullable(Float64)')` of type `Float64`, aggregated via `avg`, `min`, and `max`, generates metric data that you can chart in your dashboards at scale with high query speed:

```sql
[label Logs-to-metrics query example]
SELECT
  {{time}} AS time,
  avgMerge(value_avg) AS avg_duration,
  minMerge(value_min) AS min_duration,
  maxMerge(value_max) AS max_duration
FROM {{source}}
WHERE name = 'duration'
GROUP BY time
```

The `*Merge()` functions (`avgMerge`, `minMerge`, `maxMerge`) are a ClickHouse specialty. You don’t need to worry about them now; most of the time you will be charting trends with our Drag & drop query builder anyway.
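
If you are curious, here is a rough sketch of the two-phase aggregation behind those functions in plain ClickHouse SQL (generic ClickHouse, not our exact schema): a `-State` function produces a mergeable partial aggregation state at write time, and the matching `-Merge` function finalizes it at query time:

```sql
-- Generic ClickHouse sketch of the -State / -Merge combinator pair.
SELECT avgMerge(partial) AS avg_value
FROM
(
    -- numbers(10) yields 0..9; avgState stores a mergeable
    -- partial state rather than a final number.
    SELECT avgState(toFloat64(number)) AS partial
    FROM numbers(10)
);
-- avg of 0..9 is 4.5
```

Because partial states can be pre-computed at ingestion and merged cheaply at query time, dashboard queries never have to re-read the raw wide events.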

Now, let’s talk about ad-hoc **Events → Explore logs** queries.

![CleanShot 2025-05-18 at 3 .56.51.png](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/15f77b69-ae46-471c-2238-6f80a04da800/md2x =3680x2280)

Imagine you want to see the trend of a different attribute, `client_ip`, over time, **but you don’t have this attribute defined as a metric**.
You can run an ad-hoc query on your wide events in **Events → Explore logs** with:

```sql
[label Ad-hoc Explore query on wide events]
SELECT
  {{time}} AS time,
  JSONExtract(raw, 'client_ip', 'Nullable(String)') AS client_ip,
  COUNT(*) AS event_count
FROM {{source}}
GROUP BY time, client_ip
```

Don’t get scared by this query — most of the time, you’ll use our intuitive Drag & drop query builder.

This Explore query will work great, but **it requires far more resources to process the raw wide events from object storage and will thus be much slower at scale**.
If needed, you can leverage our built-in [sampling](https://betterstack.com/docs/logs/faster-queries/#3-use-sampling-for-exploration) to make any Explore query faster, even over long time intervals.

![CleanShot 2025-05-18 at 4 .11.11.png](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/762cc35b-044a-41b4-03e3-1919e3a33c00/md2x =2412x176)

**Define metrics for trends you need to chart frequently or trends you want to track long-term over large data sets**.

Metrics enable you to create very fast analytical **dashboards with sub-second queries** even for massive datasets.

For everything else, there are wide events.

And the best thing?
You can always change your mind and **add more metrics later.**
We only bill you for the metrics you use.

## Overview: wide events vs. metrics

|                   | **Wide events (logs & spans)**                                                                             | **Metrics**                                                                                 |
| ----------------- | ---------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
| **Examples**      | Any JSON such as structured logs, OpenTelemetry traces & spans, plain text logs                            | Prometheus metrics, OpenTelemetry metrics, metrics extracted from wide events                             |
| **Queried in**    | Live tail, Events                                                                             | Dashboards                                                                                                |
| **Best used for** | Filtering massive amounts of unstructured ad-hoc data; leveraging sampling to chart ad-hoc insights without predefined metrics | Fast dashboards charting metrics over long time periods, tracking long-term trends with metrics over time |
| **Storage**       | Scalable object storage in the cloud                                                                       | High-speed local NVMe drives                                                                              |
| **Cardinality**   | High cardinality                                                                                           | Low cardinality                                                                                           |
| **Compression**   | Somewhat compressed                                                                                        | Heavily compressed                                                                                        |
| **Data format**   | Row store                                                                                                  | Column store                                                                                              |
| **Sampling**      | Sampling available                                                                                         | Always unsampled                                                                                          |
| **Cost**          | Cost-effective                                                                                             | Optimized for performance                                                                                 |

[info]
Are you planning to ingest over 100 TB per month?
Need to store data in a custom data region or your own S3 bucket?
Need a fast query speed even for large datasets?
Please get in touch at **[hello@betterstack.com](mailto:hello@betterstack.com)**.
[/info]
