Full‑Text Search in Postgres with TypeScript
Introduction
Full-text search powers everything from e-commerce product discovery to documentation sites. While many developers reach for complete solutions like Algolia or Elasticsearch, PostgreSQL's built-in full-text search delivers millisecond response times on millions of documents with zero additional infrastructure. We'll build a complete search solution using Bun, Drizzle ORM, and PostgreSQL's tsvector to show you how simple yet powerful database-native search can be.
Prerequisites
Before we dive in, ensure you have the following set up:
- Bun runtime installed (Bun) - see our Introduction to Bun
- PostgreSQL database running and accessible. You'll need at least version
12to support all features used in this guide. - Basic knowledge of PostgreSQL and JavaScript/TypeScript
Project setup
Let's start by initializing our project and setting up the basic structure.
Initialize the project with Bun:
Feel free to delete the default index.ts file - we won't use it:
Now install dependencies:
Create a .env file with your PostgreSQL connection string. Note that Bun automatically loads .env files, so we don't need the dotenv package.
If you don't have a PostgreSQL database up and running, you can use the following command to start a new PostgreSQL database using Docker:
Now let's set up our database connection with Drizzle:
Creating the Basic Schema
We'll start with a simple articles table that we can use to demonstrate search capabilities:
Create the Drizzle configuration file:
Now generate the migration:
And run the migration:
If you would like to learn more about Drizzle, better understand the schema and configuration we just used, and the migrations, see our Getting started with Drizzle ORM guide.
Seeding Sample Data
Let's create some sample articles to work with, feel free to add your own examples:
Run the seed script:
And you should see output like this:
Building a Simple CLI Search App
Let's create a basic CLI application to demonstrate our search functionality. We'll start with a simple prompt-based interface:
There's nothing complex going on, just simple search using the built-in ILIKE wildcard matching, and a CLI around it. We can test it out by running:
Try searching for terms like "logging", "Better Stack", "PostgreSQL", or "TypeScript" to see the ILIKE-based search in action.
Understanding ILIKE Query Limitations
While our basic search works, ILIKE queries have significant limitations:
Performance Issues: ILIKE queries with wildcards can't use standard indexes effectively, leading to full table scans on larger datasets.
Limited Matching: ILIKE requires exact substring matches and doesn't understand word variations (e.g., "optimize" won't match "optimization" from our database articles).
No Relevance Ranking: Results appear in database order rather than relevance to the search query.
Introducing PostgreSQL tsvector
PostgreSQL's tsvector (text search vector) solves these limitations by preprocessing text into a normalized, searchable format. It handles:
- Stemming: Reduces words to their base forms ("logging" → "log")
- Stop word removal: Eliminates common words like "the", "and", "is"
- Language awareness: Understands linguistic rules for better matching
- Ranking: Provides relevance scores for search results
Let's enhance our schema to include full-text search capabilities using PostgreSQL's generated columns:
Let's break down what's happening in our enhanced schema:
to_tsvector('english', text) converts raw text into a tsvector by applying language-specific rules for stemming, stop word removal, and normalization. The 'english' configuration handles common English language processing. Learn more about text search configurations.
setweight(tsvector, weight) assigns importance levels to different parts of your content. Weights range from 'A' (highest) to 'D' (lowest), allowing you to prioritize title matches over body content in search rankings. Read about weight assignment.
Generated columns automatically maintain the search vector whenever the source columns change, eliminating synchronization issues between your content and search index. This PostgreSQL 12+ feature ensures your search data stays current without manual triggers.
GIN indexes (Generalized Inverted Index) are specifically designed for full-text search, providing fast lookups by creating an inverted index of all lexemes. They're typically 3x faster than GiST indexes for search operations but require more storage space. Understanding GIN indexes.
Now that we understand what our new schema does, let's generate and run the migration to update your database:
And then let's update our CLI to use both search methods at the same time and compare the results:
In searchArticlesFTS we are using some new concepts around FTS in PostgreSQL, let's break those down:
Query Processing: The query.replace(/\s+/g, ' & ') transforms user input into PostgreSQL's tsquery format, converting spaces to & operators for AND searches. You can learn more at Parsing queries
Match Filtering: The @@ operator, we use in the where clause, is PostgreSQL's text search match operator. It returns true when the tsvector contains all the terms specified in the tsquery.
Relevance Ranking: ts_rank() calculates how well each document matches the search query, returning a float value where higher numbers indicate better matches. The ranking considers term frequency, document length, and the weights we assigned during tsvector creation. You can learn more at Ranking Search Queries
Snippet Generation: ts_headline() creates context-aware excerpts with highlighted search terms. This function finds the parts of the text most relevant to the search query and creates a snippet with highlighted matches using terminal bold codes. The MaxWords and MinWords parameters control snippet length. You can learn more at Highlighting Results
Comparing Search Methods
With the updated CLI we can now compare both of the approaches at the same time. We can start our search CLI:
And then try the following comparisons:
- Search for "optimize":
- ILIKE search: no results (doesn't match "optimization" in our database article)
- Full-text search: finds the "Database Performance Optimization Techniques" article (understands stemming)
- Search for "application monitor":
- ILIKE search: requires the exact phrase to appear
- Full-text search: finds relevant articles even if words are separated
- Search for "postgres":
- ILIKE search: finds the result containing "PostgreSQL" in the title
- Full-text search: no results, as the English dictionary doesn't recognize "postgres" as a stem of "PostgreSQL", treating them as completely different terms
The side-by-side comparison shows that while full-text search is generally superior, the English text search configuration has limitations with technical terms, brand names, and domain-specific vocabulary. When the dictionary doesn't recognize word relationships (like "postgres" and "PostgreSQL"), full-text search can miss relevant results that ILIKE would find. This highlights the importance of choosing the right search approach based on your content and user search patterns.
Advanced tsvector Features
PostgreSQL's full-text search offers sophisticated features beyond basic matching:
Phrase Searches: Use <-> operator for exact phrase matching:
Proximity Searches: Find words within N positions of each other:
Boolean Operations: Complex search logic with AND, OR, NOT:
As an exercise you can extend the CLI to support these advanced features by modifying the query processing logic.
Performance Considerations
The performance difference between LIKE and tsvector becomes dramatic with larger datasets:
- Small datasets (< 10k records): Both methods perform reasonably well
- Medium datasets (10k-100k records):
tsvectorshows 5-10x improvement - Large datasets (> 100k records):
tsvectorcan be 50x faster with proper indexing
The GIN index on tsvector enables these performance gains by creating an inverted index structure optimized for text search operations.
Final Thoughts
PostgreSQL's built-in full-text search using tsvector provides a powerful alternative to external search engines for many applications. The combination of linguistic intelligence, relevance ranking, and excellent performance makes it an attractive choice for implementing search functionality.
Key advantages include simplified architecture (no additional services), ACID compliance with your main database, and sophisticated search features that rival dedicated search engines. For applications that don't require features like faceted search or complex aggregations, PostgreSQL's full-text search often provides everything needed with significantly less operational complexity.
The code examples in this guide provide a solid foundation for implementing full-text search in your own applications. For more advanced features and configuration options, consult the PostgreSQL documentation on full-text search.
Thanks for reading, and happy searching!