# Vector embedding data structures and indices

Lower-dimensional vectors take less space and are cheaper to compare, so your queries run faster. We recommend storing embeddings as `Array(BFloat16)` and generating them with our built-in model `embeddinggemma:300m` unless you know what you're doing.
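
To make the size argument concrete, here is a rough back-of-the-envelope sketch. It assumes 768 dimensions per embedding, which is an assumption about the model's output size rather than something stated on this page, so substitute your own number. A `BFloat16` element takes 2 bytes and a `Float32` element takes 4.

```sql
-- Assumption: 768 dimensions per embedding; adjust to your model's actual output size.
SELECT
  768 * 2 AS bytes_per_vector_bfloat16,                  -- 2 bytes per BFloat16 element
  768 * 4 AS bytes_per_vector_float32,                   -- 4 bytes per Float32 element
  100000000 * 768 * 2 AS bytes_for_100m_vectors_bfloat16; -- uncompressed size for 100 million rows
```

Halving either the dimension count or the element width halves the amount of data each distance computation has to read.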

## Indexing vector columns

If you're working with hundreds of millions of events, your queries might benefit from creating vector indices for your embeddings.

![Creating a vector index for embeddings](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/1119d9a3-14ce-4b36-f281-2d4435e1ef00/orig =3680x2278)
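
For orientation only, this is roughly what a comparable index looks like in stock ClickHouse. It is a sketch, not a description of how Warehouse creates the index for you: the table name `events`, the index name `embedding_idx`, and the dimension count of 768 are all hypothetical, and the exact `vector_similarity` arguments depend on your ClickHouse version.

```sql
-- Hypothetical table and index names; 768 is an assumed embedding dimension.
ALTER TABLE events
  ADD INDEX embedding_idx embedding
  TYPE vector_similarity('hnsw', 'cosineDistance', 768);

-- Build the index for data that was inserted before the index existed.
ALTER TABLE events MATERIALIZE INDEX embedding_idx;
```

An HNSW index like this answers nearest-neighbor queries approximately, trading a little accuracy for speed on large tables.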

## Querying embeddings with exact vector distance

Better Stack Warehouse stores embeddings as vector columns that can be indexed and queried efficiently using ClickHouse’s array types and distance functions such as `L2Distance()` or `cosineDistance()`.

```sql
[label Querying embeddings in Warehouse]
SELECT
  description,
  cosineDistance(embedding, embedding({{description}})) AS distance
FROM {{source}}
ORDER BY distance ASC
LIMIT 5;
```

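This returns the five events that are semantically closest to the given description. `cosineDistance()` returns smaller values for more similar vectors, which is why the results are sorted in ascending order.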
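
To illustrate how the values behave, here is a minimal sketch that uses made-up two-dimensional literal arrays instead of real embeddings. `cosineDistance()` is 1 minus the cosine similarity, so it ranges from 0 (same direction) to 2 (opposite direction).

```sql
-- Toy vectors for illustration only; real embeddings have hundreds of dimensions.
SELECT
  cosineDistance([1.0, 0.0], [1.0, 0.0])  AS same_direction,  -- 0
  cosineDistance([1.0, 0.0], [0.0, 1.0])  AS orthogonal,      -- 1
  cosineDistance([1.0, 0.0], [-1.0, 0.0]) AS opposite;        -- 2
```

The closest matches therefore have the lowest values and come first under `ORDER BY ... ASC`.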
