Writing fast s3Cluster queries

Optimize your query performance to get faster results when exploring your raw events.

Running the same query repeatedly?

Extract the values you need as non-aggregated fields under Warehouse -> Sources -> Your source -> Time series on NVMe SSD, instead of querying raw JSON events each time. This converts your frequently accessed events into time series that are much faster to retrieve.

Learn more about Extracting time series from events.

Understanding your data types

Our infrastructure handles JSON events and time series differently, and time series offer much faster query performance. Learn more about JSON events vs. time series.

If you're frequently running the same queries, consider whether your use case would benefit from converting events to time series.

When to use time series vs. JSON events

Use time series when

You run the same query repeatedly.
You need sub-second query performance.
You work with numerical data and aggregations.

Use JSON events when

You perform ad-hoc exploration.
You need full context and details.
You debug specific issues.
You work with unstructured data.

Optimizing ad-hoc event queries

For faster queries on your JSON events, try these optimization techniques:

1. Narrow your time range

Shorter time frames significantly reduce the amount of data processed.
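
As a rough sketch, a time filter can be placed in each branch of the query so that only recent data is read. This assumes `dt` is the event timestamp column (as the ORDER BY in this guide's own example suggests) and reuses the placeholder source names `t123456_your_source_logs` and `t123456_your_source_s3` from that example. The one-hour window is an arbitrary choice for illustration.

```sql
-- Assumption: dt is the event timestamp; the 1-hour window is illustrative only.
SELECT dt, raw
FROM (
  SELECT dt, raw
  FROM remote(t123456_your_source_logs)
  WHERE dt >= now() - INTERVAL 1 HOUR      -- limit the recent data
  UNION ALL
  SELECT dt, raw
  FROM s3Cluster(primary, t123456_your_source_s3)
  WHERE _row_type = 1
    AND dt >= now() - INTERVAL 1 HOUR      -- limit the S3 data in the inner query
)
ORDER BY dt ASC
LIMIT 5000
```

Adjust the interval to the shortest window that still covers the events you are looking for.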

2. Make your s3Cluster function more specific

Using a more specific WHERE clause with the s3Cluster function


SELECT dt, raw
FROM (
  SELECT dt, raw
  FROM remote(t123456_your_source_logs)
  UNION ALL
  SELECT dt, raw
  FROM s3Cluster(primary, t123456_your_source_s3)
  WHERE _row_type = 1 -- add as many filters as possible here, in the inner query
)
WHERE raw LIKE '%My text%'
ORDER BY dt ASC
LIMIT 5000
FORMAT JSONEachRow
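
Following the comment in the query above, one possible variation moves the text filter from the outer query into each branch, so that rows are discarded before UNION ALL combines them. This is a sketch rather than a guaranteed speed-up, and it assumes the same filter applies equally to the recent data and the S3 data.

```sql
-- Variation of the query above: the LIKE filter is applied inside each branch.
SELECT dt, raw
FROM (
  SELECT dt, raw
  FROM remote(t123456_your_source_logs)
  WHERE raw LIKE '%My text%'
  UNION ALL
  SELECT dt, raw
  FROM s3Cluster(primary, t123456_your_source_s3)
  WHERE _row_type = 1
    AND raw LIKE '%My text%'   -- filter in the inner query, next to s3Cluster
)
ORDER BY dt ASC
LIMIT 5000
FORMAT JSONEachRow
```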

3. Query specific sources

Instead of searching across all sources, target the specific source containing your data.

Select individual sources in the source dropdown.
Avoid querying All sources when possible.

4. Use sampling for exploration

Enable Sampling to work with a representative subset of your data while developing and testing queries.

5. Request additional compute

For consistently slow queries on large datasets, we can add more compute power to your cluster:

Share a slow query link with our support team.
We'll analyze your data volume and query performance.
Small adjustments are often available at no charge.
Larger performance improvements for very large datasets may require a custom cluster for an additional cost.

Custom clusters for high performance

For applications requiring consistently fast queries over large datasets and long time periods, we can provision dedicated compute resources:

Tailored setup: Custom cluster sized for your specific needs.
Dedicated compute: No resource sharing with other workloads.
Faster speeds: Optimized for your query patterns and data volume.
Additional cost: Comes with dedicated infrastructure pricing.

Contact our support team at hello@betterstack.com to discuss custom cluster options for your use case 📩

Getting help

Generally speaking, we can make querying as fast as needed through query optimization or infrastructure scaling. If you're experiencing slow query performance:

Try the optimization techniques above.
Share a link to the slow query in Logs & traces with our support team using the in-app chat or at hello@betterstack.com.
Describe your performance requirements and use case.

We're happy to help find the right balance of performance and cost for your needs 🚀
