
Full-Text Search: Building Search with Just Your Database
You don't always need Elasticsearch. PostgreSQL's built-in Full-Text Search handles most search needs without extra infrastructure.

Every time I needed to add search functionality, my first thought was always "I should use Elasticsearch." It felt like the obvious choice—search needs a search engine, right? But when I actually looked at my requirements, I realized I wasn't building Google. I just needed to search through a few hundred blog posts or a few thousand products.
So I tried implementing search with PostgreSQL, which I was already using. The verdict? Way more powerful than I expected. Without any additional infrastructure, I could build a solid search experience using just my existing database. Sure, I might need Elasticsearch someday, but PostgreSQL's Full-Text Search was more than enough for now.
I started with LIKE queries because, well, that's what everyone does first.
SELECT * FROM posts
WHERE title LIKE '%search term%' OR content LIKE '%search term%';
This worked fine at small scale. But once I had a few hundred records, things got noticeably slower. The reason is simple: with a leading wildcard, LIKE '%search term%' can't use a regular B-Tree index, so the database has to scan every single row and compare strings. It's like going through every book in a library one by one to find a specific word in the title.
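You can see this directly with EXPLAIN. On the posts table above, with no special indexes in place, the plan for a leading-wildcard LIKE typically reports a Seq Scan, meaning every row gets read:
-- A leading wildcard defeats B-Tree indexes, so the planner falls back to a Seq Scan
EXPLAIN SELECT * FROM posts WHERE title LIKE '%search term%';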
The bigger problem was search quality. Searching for "database" wouldn't find "DB", and searching for "PostgreSQL" would miss "postgres". Case sensitivity and exact spacing mattered. This wasn't search—it was just string matching.
That's when I discovered Full-Text Search. PostgreSQL already had search engine-level features built in.
The core of Full-Text Search is tsvector and tsquery. At first glance they seemed cryptic, but once I understood them, the design made perfect sense.
tsvector transforms a document into a searchable format. Instead of storing raw text, it extracts and normalizes words. Think of it like an index at the back of a book—a pre-organized list of what words exist and where.
SELECT to_tsvector('english', 'The quick brown fox jumps over the lazy dog');
-- 'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2
Notice how words are stemmed to their root form. "jumps" becomes "jump", "lazy" becomes "lazi". Stopwords like "the" and "over" are removed entirely. The numbers after each word indicate their position in the document.
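Stemming is what makes matching forgiving: the query gets normalized the same way as the document, so different inflections of the same word still line up. A quick check with throwaway strings:
-- 'jumping' in the query and 'jumps' in the text both stem to 'jump'
SELECT to_tsvector('english', 'The fox jumps') @@ to_tsquery('english', 'jumping');
-- true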
tsquery represents a search query. It transforms user input into a format that can be matched against tsvector.
SELECT to_tsquery('english', 'fox & dog');
-- 'fox' & 'dog'
SELECT to_tsquery('english', 'quick | slow');
-- 'quick' | 'slow'
The & is AND, | is OR. Now we can match these together:
SELECT to_tsvector('english', 'The quick brown fox') @@ to_tsquery('english', 'fox');
-- true
The @@ operator checks if they match. That's the heart of it.
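One practical note: to_tsquery expects operators between words, so raw user input like 'quick fox' raises a syntax error. plainto_tsquery takes plain text and ANDs the words for you (newer versions also offer websearch_to_tsquery, which understands quotes and OR):
-- plainto_tsquery turns free text into an AND-ed tsquery
SELECT plainto_tsquery('english', 'quick fox');
-- 'quick' & 'fox'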
Theory aside, I needed to implement this for my blog's posts table.
First, I added a search column:
ALTER TABLE posts
ADD COLUMN search_vector tsvector;
-- Combine title and content into tsvector
UPDATE posts
SET search_vector =
to_tsvector('english', coalesce(title, '') || ' ' || coalesce(content, ''));
The coalesce handles NULL values—titles or content might be empty.
Now the search query looks like this:
SELECT title, content, ts_rank(search_vector, query) AS rank
FROM posts, to_tsquery('english', 'database & search') query
WHERE search_vector @@ query
ORDER BY rank DESC;
Here's where ts_rank comes in. It calculates relevance scores for search results: the more often the search terms appear, and the closer together, the higher the score. With a little extra setup it can also favor matches in important fields like the title (see the sketch just below). Just like Google, I can now sort by relevance.
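Favoring the title requires weight labels on the tsvector, attached with setweight. Here's a sketch of a weighted version of the UPDATE above (an optional refinement, not part of my original setup):
-- Label title lexemes 'A' (heaviest) and content lexemes 'B' so ts_rank favors title hits
UPDATE posts
SET search_vector =
  setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
  setweight(to_tsvector('english', coalesce(content, '')), 'B');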
This was already faster than LIKE, but with lots of data it could still slow down. That's where GIN (Generalized Inverted Index) comes in.
CREATE INDEX posts_search_idx ON posts USING GIN(search_vector);
GIN is an index type specialized for Full-Text Search. It uses an inverted index approach—pre-organizing which words appear in which documents. Elasticsearch uses the same technique.
After adding the index, search speed on 10,000 records went from 200ms → 15ms. That's a difference you can feel.
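It's worth confirming that the planner actually uses the index. With enough rows, the plan switches from a sequential scan to a bitmap scan on the GIN index:
-- Should report a Bitmap Index Scan on posts_search_idx instead of a Seq Scan
EXPLAIN ANALYZE
SELECT title FROM posts
WHERE search_vector @@ to_tsquery('english', 'database & search');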
The problem was that every time I created or edited a post, I had to manually update search_vector. Easy to forget and annoying. So I automated it with a trigger.
CREATE FUNCTION posts_search_vector_update() RETURNS trigger AS $$
BEGIN
  NEW.search_vector :=
    to_tsvector('english', coalesce(NEW.title, '') || ' ' || coalesce(NEW.content, ''));
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER posts_search_vector_trigger
BEFORE INSERT OR UPDATE ON posts
FOR EACH ROW
EXECUTE FUNCTION posts_search_vector_update();
Now whenever a post is inserted or updated, search_vector updates automatically. One less thing to worry about.
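A quick sanity check that the trigger fires (a hypothetical sample row; this assumes the other columns of posts have defaults):
-- The BEFORE trigger fills search_vector before the row is written
INSERT INTO posts (title, content) VALUES ('Testing the search trigger', 'It should populate the vector');
SELECT search_vector FROM posts WHERE title = 'Testing the search trigger';
-- Returns the stemmed lexemes from the title and content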
Everything above was for English. Korean introduced new problems.
PostgreSQL's Full-Text Search doesn't do morphological analysis out of the box. Searching for "데이터베이스" (database) only finds exact matches, not "데이터" (data) or "베이스" (base). English words are separated by spaces, but Korean has particles and compound words that make it much more complex.
Proper Korean morphological analyzers require external extensions (like pg_mecab), which adds maintenance overhead. Instead, I found pg_trgm (Trigram).
Trigram splits strings into 3-character chunks for indexing. "데이터베이스" becomes "데이터", "이터베", "터베이", "베이스". It's not morphological analysis, but it enables partial matching.
CREATE EXTENSION pg_trgm;
CREATE INDEX posts_title_trgm_idx ON posts USING GIN(title gin_trgm_ops);
CREATE INDEX posts_content_trgm_idx ON posts USING GIN(content gin_trgm_ops);
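To see exactly which 3-character chunks pg_trgm extracts from a string, show_trgm is handy (it also adds padded trigrams at word boundaries):
-- Inspect the trigrams that will be indexed for a given string
SELECT show_trgm('데이터베이스');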
Now similarity search is possible:
SELECT title, similarity(title, '데이터베이스') AS sim
FROM posts
WHERE title % '데이터베이스'
ORDER BY sim DESC;
The % operator checks similarity, and similarity() calculates the score. This handles typos and spacing variations reasonably well.
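The % operator uses a cutoff, pg_trgm.similarity_threshold, which defaults to 0.3. If short Korean queries come back empty, it can be loosened per session (a judgment call, not part of my original setup):
-- Lower the trigram match threshold for the current session (default 0.3)
SET pg_trgm.similarity_threshold = 0.2;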
I ended up using both. Full-Text Search for English and exact word matching, Trigram for Korean and fuzzy matching.
SELECT
title,
GREATEST(
ts_rank(search_vector, to_tsquery('english', 'database')),
similarity(title, '데이터베이스') * 0.5
) AS final_rank
FROM posts
WHERE
search_vector @@ to_tsquery('english', 'database')
OR title % '데이터베이스'
ORDER BY final_rank DESC;
I take the higher of the two scores. Trigram gets a 0.5 weight to prioritize Full-Text Search results.
My project uses Supabase, which is built on PostgreSQL, so Full-Text Search works out of the box.
I added the same columns, indexes, and triggers in Supabase's SQL Editor, then exposed search through an API.
// Using Full-Text Search from Supabase client
const { data, error } = await supabase
.rpc('search_posts', { search_query: 'database search' });
The RPC function looks like this:
CREATE OR REPLACE FUNCTION search_posts(search_query TEXT)
RETURNS TABLE (
  id UUID,
  title TEXT,
  content TEXT,
  rank REAL
) AS $$
BEGIN
  -- plainto_tsquery handles raw user input like 'database search';
  -- to_tsquery would reject it without an explicit & between the words
  RETURN QUERY
  SELECT
    posts.id,
    posts.title,
    posts.content,
    ts_rank(posts.search_vector, query) AS rank
  FROM posts, plainto_tsquery('english', search_query) query
  WHERE posts.search_vector @@ query
  -- Repeat the expression here: a bare "rank" would be ambiguous with the OUT column
  ORDER BY ts_rank(posts.search_vector, query) DESC
  LIMIT 20;
END;
$$ LANGUAGE plpgsql;
On the frontend, I just call the function. Search logic is encapsulated in the database, keeping the code clean.
After using Full-Text Search for months, I kept asking myself: "When do I actually need Elasticsearch?" Here's what I figured out.
PostgreSQL Full-Text Search is enough when the dataset is modest, the data already lives in PostgreSQL, and you don't need things like distributed indexing or heavy analytics on top of search. For me, with a few thousand blog posts and projects, PostgreSQL was plenty. Having search logic in the database actually made data consistency easier. I'll migrate to Elasticsearch when the scale demands it, but until then, this approach works great.
You don't need new tools just because you need search functionality. PostgreSQL already provides powerful Full-Text Search capabilities, and for most cases, that's enough.
Key takeaways: use tsvector and tsquery to transform your data into a searchable format, ts_rank to sort results by relevance, a GIN index to keep queries fast, and pg_trgm when stemming alone can't handle the language. Search engines aren't complicated; they start with properly using what your database already offers. You'll know when you need Elasticsearch, and when that time comes, you can migrate. For now, PostgreSQL Full-Text Search builds a great search experience with zero extra infrastructure.