Welcome to Warehouse 👋

Better Stack Warehouse is a time series data warehouse as an API.


Build your application on the same infrastructure that powers Better Stack. Get a serverless ClickHouse without the scaling headache. Ingest petabytes and run analytical SQL queries at scale.

So affordable you’ll ask what’s wrong with it.

Warehouse leverages recent advances in stream processing, cloud storage, and analytical databases.

  • Massively scalable: Engineered to handle petabytes to exabytes of data without breaking a sweat.
  • No cardinality limitations: Slice and dice your data any way you want, without worrying about high-cardinality fields.
  • Sub-second analytical queries: Extract time series from your JSON events and store them efficiently on local NVMe drives to get fast analytical queries.
  • Save queries as client-facing APIs: Get APIs you can securely use in your frontend by simply saving a query in the dashboard and copying the JSON or CSV URL.
  • Unbeatable cost-efficiency: A smart storage architecture that balances performance and cost, so you only pay for what you need.

How it works: JSON events & Time series

The power of Warehouse comes from its ability to work with two distinct types of data: JSON events and time series. Understanding the difference is key to getting the most out of Warehouse.

JSON events stored on object storage

A JSON event is any time-based JSON object with a timestamp stored in its dt attribute, such as a structured log line or an OpenTelemetry span. JSON events are stored in highly scalable object storage.

JSON events are designed for ad-hoc data exploration, debugging, and high-cardinality analysis.

JSON events give you the full, detailed context for every event, making them perfect for digging into specific issues.
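
To make the shape of a JSON event concrete, here is a minimal Python sketch that builds one. Only the dt attribute comes from the description above. The make_event helper, the other field names, and the ISO 8601 timestamp format are assumptions made for illustration, not a documented schema.

```python
import json
from datetime import datetime, timezone


def make_event(message: str, **fields) -> dict:
    """Build a JSON-serializable event whose timestamp lives in `dt`.

    Hypothetical helper: the ISO 8601 format used for `dt` is an assumption.
    """
    event = {"dt": datetime.now(timezone.utc).isoformat(), "message": message}
    event.update(fields)  # arbitrary extra context, e.g. high-cardinality IDs
    return event


# Illustrative field names; any JSON-compatible attributes would work here.
event = make_event("GET /checkout", duration_ms=182.4, status=200, user_id="u_1842")
line = json.dumps(event)  # one event serialized as a single JSON line
print(line)
```

Every attribute travels with the event, which is what makes later ad-hoc exploration by fields such as user_id possible.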

Time series stored on local NVMe SSD

A time series is a sequence of time-stamped values, like Prometheus metrics, extracted from your raw JSON events. These are highly compressed and stored on fast, local NVMe SSDs for rapid querying.

Time series are designed for building fast client-facing APIs, long-term dashboards, and trend tracking.

Time series data is optimized for performance, enabling sub-second query speeds even over long time ranges.

Creating time series from JSON events

The true magic of Warehouse is how it bridges these two data types. You can define logs-to-metrics expressions: simple SQL statements that extract numerical values from your wide events and convert them into time series in real time.

For example, you can extract the duration from your request logs to create a duration_ms metric. You can then build a high-speed dashboard to visualize the average, p95, and max request latency over months of data, while still having the ability to drill down into the raw log events when needed.
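
The idea behind the duration_ms example can be imitated locally in a few lines of Python. This is only a conceptual sketch: in Warehouse the extraction is defined as a SQL expression and runs on ingestion, whereas the code below processes an in-memory list. The sample events, the one-minute bucket size, the function names, and the nearest-rank percentile method are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from datetime import datetime

# Hypothetical request-log events; only `dt` is a name taken from the docs.
events = [
    {"dt": "2024-05-01T12:00:05+00:00", "path": "/checkout", "duration_ms": 120.0},
    {"dt": "2024-05-01T12:00:40+00:00", "path": "/checkout", "duration_ms": 300.0},
    {"dt": "2024-05-01T12:00:59+00:00", "path": "/cart", "duration_ms": 80.0},
    {"dt": "2024-05-01T12:01:10+00:00", "path": "/checkout", "duration_ms": 95.0},
    {"dt": "2024-05-01T12:01:30+00:00", "message": "no duration here"},
]


def nearest_rank_percentile(sorted_values, pct):
    """Nearest-rank percentile of a non-empty list sorted in ascending order."""
    rank = math.ceil(pct / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


def extract_duration_series(events):
    """Turn raw events into per-minute avg / p95 / max points for duration_ms."""
    buckets = defaultdict(list)
    for event in events:
        value = event.get("duration_ms")
        if value is None:
            continue  # events without the field contribute no data point
        minute = datetime.fromisoformat(event["dt"]).replace(second=0, microsecond=0)
        buckets[minute].append(float(value))

    series = []
    for minute in sorted(buckets):
        values = sorted(buckets[minute])
        series.append({
            "dt": minute.isoformat(),
            "avg": sum(values) / len(values),
            "p95": nearest_rank_percentile(values, 95),
            "max": values[-1],
        })
    return series


series = extract_duration_series(events)
for point in series:
    print(point)
```

The resulting series holds one small numeric record per minute instead of every raw event, which is why a dashboard over months of data can stay fast while the original events remain available for drill-down.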

To learn more about the technical details, read our guide on Wide Events vs. Time Series.

Need custom scaling or data storage?

Are you planning to ingest over 100 TB per month, need a custom data region, or require dedicated high-performance clusters? Please get in touch at hello@betterstack.com. We're here to help you build the perfect solution.