Building Retrieval Augmented Generation (RAG)

This guide demonstrates how to build a Retrieval Augmented Generation (RAG) application using Better Stack Warehouse, all without needing to generate or store your own embeddings.

Overview

Imagine you need to classify incoming emails by comparing them against a large repository of existing emails. A good way to do this is to generate vector embeddings for your existing emails, which lets you run semantic distance searches between those emails and any new ones.

Not familiar with vector embeddings?

Feel free to read through an introduction to embeddings first 🙌

With Better Stack Warehouse:

No manual embedding needed. Let Warehouse automatically generate embeddings for your data at both insertion and query time.
Automatic public APIs with result caching. Use saved queries to retrieve search results in your application without writing complex SQL, and benefit from automatic caching.

This guide will take you through setting up your Warehouse config, importing your existing emails into Warehouse, and then using a saved query with similarity search to find the most relevant historical emails based on a new search term or email body.

Create a source

The source will hold all your data and allow you to query it.

Go to Warehouse -> Sources -> Connect source.
Pick a name and your preferred data region.

Create an embedding definition

First, we need to tell Warehouse where to find the text to generate embeddings for:

Go to Warehouse -> Sources -> Your source -> Embeddings.
Define an embedding by specifying text as the JSON path to read text from and text_embed as the JSON path to write the generated embeddings to.

You can customize model options here. Currently, embeddinggemma:300m is offered, which is an excellent model.

For specific model requests, contact hello@betterstack.com.

Configure time series

Next, we'll optimize storage for fast querying of your embeddings.

A vector index will give you extremely fast similarity comparison, even with hundreds of millions of rows.

Go to Warehouse -> Sources -> Your source -> Time series on NVMe SSD.
Create two time series, both using No aggregations:
- text - a string column to hold the actual email content.
- text_embed - an Array(Float32) column to store the generated embeddings.

Important: Ensure the dimensions in the automatic vector index for text_embed match the output dimensions of the embedding model you defined in the previous step.
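
As a quick sanity check, a sketch like the following can confirm that an embedding taken from a record (for example, one copied from Live tail) has the length your vector index expects. The function name and the toy 4-dimensional vector are illustrative assumptions; look up the actual output dimensions of your configured model rather than relying on any value shown here.

```python
from typing import Sequence


def check_embedding_dimensions(embedding: Sequence[float], expected_dimensions: int) -> None:
    """Raise ValueError if the embedding length differs from the index dimensions."""
    actual = len(embedding)
    if actual != expected_dimensions:
        raise ValueError(
            f"text_embed has {actual} dimensions, "
            f"but the vector index expects {expected_dimensions}"
        )


# Toy example: a 4-dimensional vector checked against a 4-dimensional index.
check_embedding_dimensions([0.12, -0.03, 0.88, 0.41], expected_dimensions=4)
```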

Send your data

Ingest your data into the Warehouse. Each record should have a text field containing the email content.

Send emails to Warehouse

curl -X POST https://$INGESTING_HOST \
   -H "Authorization: Bearer $SOURCE_TOKEN" \
   -H "Content-Type: application/json" \
   -d '[
        { "text": "Hello, this is a test email about product features." },
        { "text": "We are experiencing issues with our new payment gateway." },
        { "text": "Regarding your recent inquiry about account settings and billing." }
      ]'

Warehouse will automatically generate embeddings for these emails based on your embedding definition.
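
The same request can also be sent from application code. The sketch below mirrors the curl call using only Python's standard library. The helper names build_ingest_request and send_emails are hypothetical, and the host and token are assumed to be the same $INGESTING_HOST and $SOURCE_TOKEN values used in the curl example.

```python
import json
import urllib.request
from typing import Iterable


def build_ingest_request(emails: Iterable[str], host: str, token: str) -> urllib.request.Request:
    """Build a POST request carrying one {"text": ...} record per email."""
    records = [{"text": body} for body in emails]
    return urllib.request.Request(
        url=f"https://{host}",
        data=json.dumps(records).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_emails(emails: Iterable[str], host: str, token: str) -> int:
    """Send the records and return the HTTP status code (performs a real network call)."""
    request = build_ingest_request(emails, host, token)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping request construction separate from sending makes it possible to inspect the payload and headers without touching the network.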

If you visit Warehouse -> Sources -> Your source -> Live tail, you should see records containing both text and text_embed fields, with text_embed holding an array of numbers.

Create a query for similarity search

Saved queries let you define a query and get a publicly accessible URL that returns its results as JSON, CSV or TSV to any app or web page, without needing to distribute credentials. As an added bonus, queries are cached automatically, making them ideal for invoking in a front-end app that may be loaded many times.

Go to Warehouse -> Queries as APIs -> Create query.
Pick your source, and make sure time series is selected.
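Create a new query using the SQL below. This query automatically generates an embedding for the provided search term, then uses the cosineDistance ClickHouse function to compute the distance between each stored embedding and the search embedding. A smaller distance means more similar text, which is why the results are sorted in ascending order.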

Query for similar emails

SELECT text,
  cosineDistance(text_embed, embedding({{search}})) AS distance
FROM {{source}}
ORDER BY distance ASC LIMIT 100
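
To make the ordering concrete, here is a minimal pure-Python sketch of what a cosine distance computes: one minus the cosine similarity of the two vectors. It illustrates the math only and is not how Warehouse or ClickHouse evaluate the function internally; the toy 2-dimensional vectors are made up for the example.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity: 0 for same direction, 1 for orthogonal, 2 for opposite.

    Raises ZeroDivisionError if either vector has zero length (all components 0).
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # same direction, prints 0.0
print(cosine_distance([1.0, 0.0], [0.0, 3.0]))  # orthogonal, prints 1.0
```

Because only the direction of the vectors matters, two texts with similar meaning end up close to 0 regardless of the magnitude of their embeddings.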

Save the query to get a unique URL, such as: https://<cluster>.betterstackdata.com/query/<your_query_token>.json

You can now query this URL directly from your application, or even a front-end app, by appending a search query parameter:
https://<cluster>.betterstackdata.com/query/<your_query_token>.json?search=my%20search%20text
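
The search text must be percent-encoded before it is appended to the URL. A small sketch of one way to do that with Python's standard library follows; build_search_url is a hypothetical helper, and the base URL is the placeholder from this guide, to be replaced with your own query URL.

```python
from urllib.parse import quote, urlencode


def build_search_url(base_url: str, search: str) -> str:
    """Append the search text as a percent-encoded `search` query parameter."""
    # quote_via=quote encodes spaces as %20 instead of the default "+".
    return f"{base_url}?{urlencode({'search': search}, quote_via=quote)}"


print(build_search_url(
    "https://<cluster>.betterstackdata.com/query/<your_query_token>.json",
    "payment gateway issues",
))
```

Encoding through urlencode also escapes characters such as & and =, which would otherwise be read as part of the query string rather than the search text.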

The API will return the 100 most similar records from your dataset, with results automatically cached for performance 🚀
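
To complete the retrieval-augmented flow, the returned rows are typically placed into the prompt of a language model. The sketch below assumes the rows have already been parsed into a list of dictionaries with the text and distance columns from the query above. The exact JSON envelope of the response is not shown in this guide, so adapt the parsing to what your endpoint returns. The function name, the max_distance cutoff, and the prompt wording are all illustrative.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, float]]


def build_prompt(new_email: str, rows: List[Row], max_examples: int = 5,
                 max_distance: float = 0.5) -> str:
    """Pick the closest historical emails and format them as context for an LLM prompt."""
    # Drop weak matches, then keep the closest few (smallest distance first).
    relevant = [row for row in rows if float(row["distance"]) <= max_distance]
    relevant.sort(key=lambda row: float(row["distance"]))
    examples = relevant[:max_examples]
    context = "\n".join(f"- {row['text']}" for row in examples)
    return (
        "Similar historical emails:\n"
        f"{context}\n\n"
        "Classify the following new email using the examples above:\n"
        f"{new_email}"
    )


rows = [
    {"text": "We are experiencing issues with our new payment gateway.", "distance": 0.12},
    {"text": "Hello, this is a test email about product features.", "distance": 0.71},
]
print(build_prompt("My card was declined at checkout.", rows))
```

Filtering by distance before building the prompt keeps unrelated emails out of the context, which matters because the query always returns up to 100 rows even when few of them are close matches.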

Wrap-up

By following these steps, you can set up a powerful RAG application without the overhead of managing embedding infrastructure or hardware.

Better Stack handles data ingestion, storage, embedding generation, and efficient querying, allowing you to focus on building your core application logic.

Want to learn more?

Vector embedding data structures and indices

Controlling costs
