# SQL API

Query your logs & metrics outside of the Better Stack dashboard with our read-only HTTP API.

You can connect from Grafana, the ClickHouse client over HTTP, or any other HTTP client.

## Getting started

1. Create a username and password by navigating to [**Integrations**](https://telemetry.betterstack.com/team/0/dashboards/connections?tab=sql-api ";_blank") in the side menu.

2. Click **Connect** for the ClickHouse HTTP client.

3. **Follow the instructions** in the form and click **Create connection**.

4. **Copy the password** shown in the flash message and store it securely. **You won't be able to access the password again.**

![CleanShot 2025-03-02 at 11.56.13.png](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/ef75691d-cc2b-4472-c3e7-d6b630921c00/md2x =3248x2146)

[note]
#### Prefer using APIs?

You can manage your ClickHouse connections using [Connection API](https://betterstack.com/docs/logs/api/connections/).
[/note]

## Basic usage

Fetch your data using a simple curl command:

[code-tabs]
```bash
[label Recent logs]
curl -u $USERNAME:$PASSWORD \
  -H 'Content-Type: text/plain' \
  -X POST 'https://eu-nbg-2-connect.betterstackdata.com?output_format_pretty_row_numbers=0' \
  -d "SELECT dt, raw FROM (
    SELECT dt, raw FROM remote(t123456_your_source_logs) 
    UNION ALL 
    SELECT dt, raw FROM s3Cluster(primary, t123456_your_source_s3)
      WHERE _row_type = 1
  ) ORDER BY dt DESC LIMIT 100 FORMAT JSONEachRow"
```
```bash
[label Recent spans]
curl -u $USERNAME:$PASSWORD \
  -H 'Content-Type: text/plain' \
  -X POST 'https://eu-nbg-2-connect.betterstackdata.com?output_format_pretty_row_numbers=0' \
  -d "SELECT dt, raw FROM (
    SELECT dt, raw FROM remote(t123456_your_source_spans)
    UNION ALL
    SELECT dt, raw FROM s3Cluster(primary, t123456_your_source_s3)
      WHERE _row_type = 3
  ) ORDER BY dt DESC LIMIT 100 FORMAT JSONEachRow"
```
```bash
[label Hourly metrics]
curl -u $USERNAME:$PASSWORD \
  -H 'Content-Type: text/plain' \
  -X POST 'https://eu-nbg-2-connect.betterstackdata.com?output_format_pretty_row_numbers=0' \
  -d "SELECT toStartOfHour(dt) AS time, countMerge(events_count) 
      FROM remote(t123456_your_source_metrics) 
      GROUP BY time 
      ORDER BY time DESC 
      LIMIT 24 
      FORMAT Pretty"
```
[/code-tabs]

Replace `$USERNAME:$PASSWORD` with your connection credentials and `t123456_your_source` with one of your available sources, which you can find in [MCP and API](https://telemetry.betterstack.com/team/0/dashboards/connections?tab=sql-api ";_blank"):

![Where to find the data sources](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/42ea7d55-7420-4fa5-f3db-13f97054fe00/lg1x =3818x2388)
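If you are calling the SQL API from a script rather than curl, the same authenticated POST request can be built with Python's standard library. This is a minimal sketch, not an official client; the endpoint and query are the placeholders from the curl examples above, and `build_query_request` is our own helper name:

```python
import base64
import urllib.request

def build_query_request(username: str, password: str, sql: str) -> urllib.request.Request:
    # Basic auth header, equivalent to curl's -u $USERNAME:$PASSWORD.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        "https://eu-nbg-2-connect.betterstackdata.com?output_format_pretty_row_numbers=0",
        data=sql.encode(),  # the SQL query is the request body
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "text/plain",
        },
        method="POST",
    )

req = build_query_request("user", "pass", "SELECT 1 FORMAT JSONEachRow")
print(req.get_header("Authorization"))  # Basic dXNlcjpwYXNz
```

Send the request with `urllib.request.urlopen(req)` once you have filled in your real credentials.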

## Data sources explained

The SQL API provides access to three types of telemetry data for each source: logs, spans, and metrics. This data is split between hot storage for recent data and cold storage for historical data.

### Logs

Use `remote(t123456_your_source_logs)` for fast access to recent logs.

For historical data, use `s3Cluster(primary, t123456_your_source_s3)`. Filter with `WHERE _row_type = 1`.

Use `UNION ALL` to combine recent and historical data for complete results.

### Spans

Use `remote(t123456_your_source_spans)` for fast access to recent spans.

For historical data, use `s3Cluster(primary, t123456_your_source_s3)`. Filter with `WHERE _row_type = 3`.

### Metrics

`remote(t123456_your_source_metrics)` provides aggregated metrics data, similar to [Dashboards](https://betterstack.com/docs/logs/dashboards/).

[note]
Replace `t123456_your_source` with one of your available sources, which you can find in [MCP and API](https://telemetry.betterstack.com/team/0/dashboards/connections?tab=sql-api ";_blank").
[/note]

## Filtering and searching logs

### Search by log content

Filter logs containing specific text or fields:

```sql
[label Accessing all logs containing "My text"]
SELECT dt, raw
FROM (
  SELECT dt, raw
  FROM remote(t123456_your_source_logs)
  UNION ALL
  SELECT dt, raw
  FROM s3Cluster(primary, t123456_your_source_s3)
  WHERE _row_type = 1
)
WHERE raw LIKE '%My text%'
ORDER BY dt ASC
LIMIT 5000
FORMAT JSONEachRow
```
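If the search text comes from user input, escape ClickHouse's `LIKE` wildcards first so the text is matched literally rather than treated as a pattern. A minimal sketch; the helper name is ours, not part of the API:

```python
def escape_like(term: str) -> str:
    # Backslash-escape LIKE wildcards (% and _), backslashes themselves,
    # and single quotes, so user-supplied text matches literally.
    return (term.replace("\\", "\\\\")
                .replace("%", "\\%")
                .replace("_", "\\_")
                .replace("'", "\\'"))

pattern = f"%{escape_like('100%_done')}%"
print(pattern)  # %100\%\_done%
```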

### Extract JSON fields

Access nested JSON fields in your logs:

```sql
[label Accessing recent error logs]
SELECT 
  dt,
  JSONExtract(raw, 'level', 'Nullable(String)') AS severity,
  JSONExtract(raw, 'message', 'Nullable(String)') AS message,
  JSONExtract(raw, 'context.hostname', 'Nullable(String)') AS hostname
FROM remote(t123456_your_source_logs)
WHERE severity = 'ERROR'
LIMIT 100
FORMAT JSONEachRow
```
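A `JSONEachRow` response arrives as one JSON object per line, so it can be parsed line by line instead of as a single document. A minimal sketch in Python:

```python
import json

def parse_json_each_row(body: str) -> list[dict]:
    # Each non-empty line of a JSONEachRow response is a standalone JSON object.
    return [json.loads(line) for line in body.splitlines() if line.strip()]

sample = (
    '{"dt":"2025-03-02 11:56:13","message":"ok"}\n'
    '{"dt":"2025-03-02 11:56:14","message":"boom"}\n'
)
rows = parse_json_each_row(sample)
print(rows[1]["message"])  # boom
```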

## Optimizing queries for large datasets

### Pagination

For large result sets, paginate using time-based ordering:

```sql
[label Using LIMIT and ORDER BY for pagination]
SELECT raw, dt
FROM s3Cluster(primary, t123456_your_source_s3)
WHERE _row_type = 1 
  AND dt BETWEEN toDateTime64(1748449431, 0, 'UTC') AND toDateTime64(1748535831, 0, 'UTC')
ORDER BY dt ASC
LIMIT 1000
FORMAT JSONEachRow
```
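The time-bounded query above can be driven from a loop that advances the lower bound to the last timestamp it has seen. A sketch of the cursor logic only: `fetch_page` stands in for a real API call and is assumed to query with `dt > cursor` ordered by `dt ASC`; the sketch also assumes timestamps are distinct at page boundaries.

```python
def paginate(fetch_page, start, end, page_size=1000):
    """Yield rows by repeatedly querying (cursor, end] and advancing the cursor.

    fetch_page(cursor, end, limit) should run the time-bounded query with
    dt > cursor AND dt <= end, returning up to `limit` rows ordered by dt ASC.
    """
    cursor = start
    while True:
        rows = fetch_page(cursor, end, page_size)
        if not rows:
            return
        yield from rows
        if len(rows) < page_size:
            return  # a short page means the end of the range was reached
        cursor = rows[-1]["dt"]  # next page starts after the last seen timestamp

# Demo with an in-memory stand-in for the API call:
data = [{"dt": i} for i in range(1, 6)]
fake_fetch = lambda cursor, end, limit: [r for r in data if cursor < r["dt"] <= end][:limit]
print(list(paginate(fake_fetch, start=0, end=10, page_size=2)))
```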

### S3 Sampling

To run fast S3 queries over very wide time ranges, apply sampling. Sampling reads only a percentage of the stored S3 files instead of all of them, which is ideal when you want results quickly and an estimated result is acceptable.

```bash
[label Optimizing using S3 sampling]
curl -u $USERNAME:$PASSWORD \
  -H 'Content-Type: text/plain' \
  -X POST 'https://eu-nbg-2-connect.betterstackdata.com?output_format_pretty_row_numbers=0' \
  -d "SELECT toStartOfInterval(dt, INTERVAL 1 DAY) AS day, count(*) AS count
[highlight]
      FROM s3Cluster(primary, t123456_your_source_s3, sampling = 0.5)
[/highlight]
      WHERE _row_type = 1
        AND dt BETWEEN toDateTime64(1748449431, 0, 'UTC') AND toDateTime64(1748535831, 0, 'UTC')
      ORDER BY day ASC
      LIMIT 1000
      FORMAT JSONEachRow"
```
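If sampling reads roughly the given fraction of stored files, sampled aggregates can be scaled back up to estimate the true total. This is our own approximation, not an exact correction guaranteed by the API:

```python
def estimate_total(sampled_count: int, sampling: float) -> int:
    # Rough scale-up: with sampling=0.5, about half the files were read,
    # so double the sampled count. Treat the result as an estimate only.
    return round(sampled_count / sampling)

print(estimate_total(12_345, 0.5))  # 24690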

## Common issues and solutions

### Memory limit exceeded

If you encounter `MEMORY_LIMIT_EXCEEDED` errors:

- **Use shorter time ranges:** Limit your queries to smaller time windows.
- **Add specific filters:** Filter by `dt`, source, or other fields early in your query.
- **Limit result size:** Use `LIMIT` or `max_result_rows` settings.
- **Apply S3 sampling:** See [dedicated section](#s3-sampling) for details.

### Too many simultaneous queries

To avoid hitting the [concurrent query limit](#query-limits):

- **Add delays between requests:** Wait 1-2 seconds between API calls.
- **Use shorter time ranges:** Reduce the scope of each query.
- **Implement retry logic:** Detect the error and retry after a longer delay.
- **Switch to a higher query speed:** Higher [query speeds](https://betterstack.com/docs/warehouse/querying-data/query-speeds/) make all your queries faster and have higher limits for concurrency.
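The retry advice above can be sketched as a small wrapper with exponential backoff. How the concurrency error surfaces depends on your HTTP client, so this generic sketch catches any exception and takes an injectable `sleep` to stay self-contained:

```python
import time

def with_retries(call, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run `call`, retrying with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the last error
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...

# Demo: a call that fails twice before succeeding.
state = {"failures": 2}
def flaky():
    if state["failures"] > 0:
        state["failures"] -= 1
        raise RuntimeError("too many simultaneous queries")
    return "ok"

print(with_retries(flaky, sleep=lambda s: None))  # ok
```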

## Output formats

The SQL API supports various [ClickHouse output formats](https://clickhouse.com/docs/interfaces/formats), most notably:

- `JSON` - A single JSON data structure.
- `JSONEachRow` - One JSON object per line, best for programmatic access.
- `Pretty` - Human-readable table format.
- `CSV` - Comma-separated values.
- `TSV` - Tab-separated values.

Specify the format at the end of your SQL query: `FORMAT JSONEachRow`

[info]
**Missing the brackets around your data when using JSONEachRow?**

Add `SETTINGS output_format_json_array_of_rows = 1` before `FORMAT JSONEachRow` to wrap your rows in `[` and `]` when you need valid JSON output.

Learn more about using [JSON in ClickHouse](https://clickhouse.com/docs/interfaces/formats/JSON).
[/info]

## Best practices

- **Always use LIMIT:** Prevent accidentally fetching too much data.
- **Order your results:** Use `ORDER BY dt DESC` or `ORDER BY dt ASC` for consistent pagination.
- **Filter early:** Apply `WHERE` conditions to reduce data processing.
- **Use appropriate time ranges:** Shorter ranges perform better.
- **Store credentials securely:** Never hardcode usernames and passwords in your scripts.
- **Handle errors gracefully when scripting:** Implement retry logic for temporary failures or rate limiting.

## Query limits

With Standard querying, you can run **up to 4 concurrent queries for logs** and **20 concurrent queries for metrics**.
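When scripting many queries, you can stay under the 4-concurrent-query limit for logs by gating requests with a semaphore. A sketch using Python's standard library, with the API call replaced by a placeholder:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_LOG_QUERIES = 4  # Standard querying limit for logs
gate = threading.Semaphore(MAX_CONCURRENT_LOG_QUERIES)

def run_query(sql: str) -> str:
    with gate:  # at most 4 threads issue a query at once
        # In a real script this would POST `sql` to the SQL API.
        return f"ran: {sql}"

with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(run_query, [f"SELECT {i}" for i in range(10)]))
print(results[0])  # ran: SELECT 0
```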

You can [choose a higher Query speed](https://betterstack.com/docs/warehouse/querying-data/query-speeds/), which is applied to the data region for both your Telemetry and Warehouse sources.
