Building Retrieval Augmented Generation (RAG)

This guide demonstrates how to build a Retrieval Augmented Generation (RAG) application using Better Stack Warehouse, all without needing to generate or store your own embeddings.

Overview

Imagine you need to classify incoming emails by comparing them against a large repository of existing emails. A good way to do this is to generate vector embeddings for your existing emails, which lets you run semantic distance searches between those emails and any new ones.

With Warehouse:

  • No manual embedding needed: Let Warehouse automatically generate embeddings for your data at both insertion and query time.

  • Automatic public APIs with result caching: Use saved queries to retrieve search results in your application without writing complex SQL, while benefiting from automatic caching.

This guide will take you through setting up your Warehouse config, importing your existing emails into Warehouse, and then using a saved query with similarity search to find the most relevant historical emails based on a new search term or email body.

Main Steps

Create an "Embedding" Definition:

First, we need to tell Warehouse which text to generate embeddings for:

  • Navigate to the Embeddings tab within your source definition.
  • Define an embedding by specifying the JSON path to read text from (e.g., text) and another JSON path to write the generated embeddings to (e.g., text_embed).
  • You can customize model options here. Currently, the embeddinggemma:300m model is offered. For specific model requests, contact hello@betterstack.com.

Configure Time Series on NVMe SSD:

Next, we'll optimize storage for fast querying of your embeddings. A vector index gives you fast similarity comparisons, even with hundreds of millions of rows.

  • Go to Time series on NVMe SSD in your source configuration.
  • Add a time series column named text, of type String, to hold the actual email content.
  • Add a time series column named text_embed, of type Array(Float32), to store the generated embeddings.
  • Important: Ensure the dimensions in the automatic vector index for text_embed match the output dimensions of the embedding model you defined in the previous step.

Send Your Data:

Ingest your data into the Warehouse. Each record should have a text JSON field containing the email content.

Send emails to Warehouse
curl -X POST https://eu-nbg-2-connect.betterstackdata.com \
   -H "Authorization: Bearer $SOURCE_TOKEN" \
   -H "Content-Type: application/json" \
   -d '[
        { "text": "Hello, this is a test email about product features." },
        { "text": "We are experiencing issues with our new payment gateway." },
        { "text": "Regarding your recent inquiry about account settings and billing." }
      ]'
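
The same request can be sent from application code. The following is a minimal Python sketch using only the standard library, assuming the ingestion URL and payload shape from the curl example above. The function names build_ingest_request and send_emails are hypothetical helpers, not part of any Better Stack client.

```python
import json
import urllib.request

# Ingestion endpoint copied from the curl example; yours may differ.
INGEST_URL = "https://eu-nbg-2-connect.betterstackdata.com"


def build_ingest_request(emails, source_token, url=INGEST_URL):
    """Build (but do not send) a POST request with one record per email."""
    # Each record carries the email body in the "text" field, which is the
    # JSON path the embedding definition reads from in this guide.
    records = [{"text": body} for body in emails]
    return urllib.request.Request(
        url,
        data=json.dumps(records).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {source_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_emails(emails, source_token):
    """Send the batch and return the HTTP status code (hypothetical helper)."""
    request = build_ingest_request(emails, source_token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping request construction separate from sending makes the payload easy to inspect before any data leaves your application.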

Warehouse will automatically generate embeddings for these emails based on your embedding definition. If you visit live tail for your source, you should see records containing both "text" and "text_embed" fields, with your float vectors in an array.
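
If you want to confirm that the generated vectors match the dimensions configured on your vector index, a small check like the one below may help. It is a sketch that assumes you have a record available as a Python dict (for example, pasted from live tail); the function name and the dimension value you pass in are your own, not something Warehouse provides.

```python
def check_embedding_dimensions(record, expected_dims, field="text_embed"):
    """Return True if record[field] is a list of exactly expected_dims numbers."""
    vector = record.get(field)
    if not isinstance(vector, list):
        # Field is missing or is not an array: the embedding was not written.
        return False
    if len(vector) != expected_dims:
        # Length differs from the dimension configured on the vector index.
        return False
    # Booleans are ints in Python, so exclude them explicitly.
    return all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in vector
    )
```

For example, check_embedding_dimensions(record, expected_dims=N) should return True when N is the dimension you entered for the text_embed vector index.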

Create a Saved Query for Similarity Search:

Saved queries let you define a query and get a publicly accessible URL for it, so you can fetch JSON, CSV, or TSV from any app or web page without distributing credentials. As an added bonus, query results are cached automatically, which makes saved queries well suited to front-end apps that may be loaded many times.

  • Go to Queries in the Warehouse dashboard.
  • Create a new query using the SQL below. This query automatically generates an embedding for the provided search term, then uses the ClickHouse cosineDistance function to compute the distance between the two vectors. A smaller distance means the texts are more similar.
Query for similar emails
SELECT {{time}} AS time, text, cosineDistance(text_embed, embedding({{search_text}})) AS distance
FROM {{source}}
WHERE time BETWEEN {{start_time}} AND {{end_time}}
ORDER BY distance ASC LIMIT 100
  • Save this query. You will get a unique URL, such as: https://eu-nbg-2-connect.betterstackdata.com/query/<your_query_token>.json
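
To build intuition for the distance column, here is a small pure-Python sketch of cosine distance, defined as 1 minus the cosine similarity of two vectors. It is for illustration only: in the saved query the computation happens inside ClickHouse, and the function below is not part of Warehouse.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction, so this raises ZeroDivisionError for it.
    return 1.0 - dot / (norm_a * norm_b)
```

Vectors pointing the same way give a distance near 0, orthogonal vectors give 1, and opposite vectors give 2, which is why the query sorts by distance in ascending order.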

You can now query this URL directly from your application (even a front-end app) by appending a search_text query parameter, for example ?search_text=my%20search%20text. This returns the 100 most similar records from your dataset, with results automatically cached for performance.
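
The search text has to be percent-encoded before it is appended to the URL. The following Python sketch shows one way to do that with the standard library. The function names are hypothetical, and the shape of the returned JSON is not specified in this guide, so the sketch returns the parsed response as is.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, search_text):
    """Append a percent-encoded search_text parameter to a saved query URL."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    params = urllib.parse.urlencode(
        {"search_text": search_text}, quote_via=urllib.parse.quote
    )
    return f"{base_url}?{params}"


def fetch_similar_emails(base_url, search_text):
    """Fetch and parse the JSON returned by the saved query (hypothetical helper)."""
    url = build_query_url(base_url, search_text)
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)
```

Pass the saved query URL you received when saving the query as base_url, and the new email body or search term as search_text.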

Wrap-up

By following these steps, you can set up a powerful RAG application without the overhead of managing embedding infrastructure or hardware. Better Stack handles data ingestion, storage, embedding generation, and efficient querying, allowing you to focus on building your core application logic.