Vector DB: A New Database for the AI Era
1. "Why Is My DB So Slow?"
While studying RAG (Retrieval-Augmented Generation) systems, I came across this pattern:
What happens when you store the embedding vectors for a large volume of documents in PostgreSQL and search them with the pgvector extension?
It works, but as the document count grows, search performance becomes a real problem.
When a user asks a question, it takes several seconds to get an answer. "2.5 seconds just to search the DB?" Without a suitable index, every query has to scan every vector, so latency grows roughly linearly with the amount of data. At some point, the service becomes unusable.
The explanation I found in the documentation: "RDBs (Relational DBs) are specialized for finding exact values. For vector search, you need to use a Vector Database."
So what happens when you migrate to a dedicated vector DB like Pinecone? According to the benchmarks and case studies I've read, search performance improves dramatically compared to the naive approach. Same data, so why such a huge difference?
2. Why Was I Confused at First?
Confusion 1: "How is it different from a regular DB?"
"You store data and SELECT it. It's the same mechanism. Why learn a new DB?" We have search engines like Elasticsearch. Why do we need another new thing? I couldn't accept it.
Confusion 2: "How is it so fast?"
Thinking back to high school math, Dot Product calculations for 100,000 vectors should require massive computation. How is it possible in 0.05 seconds? It was a mystery.
3. The 'Aha!' Moment
The analogy that made it click was a library.
Regular DB (RDB) = Finding by Call Number (ID)
When you type "Call Number 800.12-34" at the kiosk, the librarian goes to that exact shelf and picks the book.
- Query: SELECT * FROM books WHERE id = 123
- Feature: Exact Match only. If book 123 isn't there, you find nothing.
- Speed: Extremely fast thanks to Indexes (B-Tree).
Vector DB = Finding "Semantically Similar" Books
It's like asking the librarian, "Do you have a novel where the protagonist travels to space, but it's kind of sad?" The librarian needs to know the Content (Meaning) of the books, not just numbers.
- Query: Find books similar to [0.1, 0.8, 0.3, ...]
- Feature: Semantic Similarity Search. No need for exact keyword match.
- Speed: Standard method (checking every book) is incredibly slow. So special tech (HNSW) is needed.
I understood it with this analogy. Vector DBs are specialized for "Finding the Approximate Nearest Neighbor rather than the exact value."
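To make the two kinds of search concrete, here is a tiny sketch. The catalog, the titles, and the 3-dimensional "meaning" vectors are all made up for illustration (real embeddings have hundreds of dimensions), and exact_lookup and most_similar are hypothetical helper names.

```python
import math

# Hypothetical catalog: id -> (title, toy vector meaning [space, sadness, cooking])
books = {
    101: ("A sad space voyage", [0.9, 0.8, 0.1]),
    102: ("A cheerful cookbook", [0.0, 0.1, 0.9]),
    103: ("A funny space comedy", [0.8, 0.1, 0.2]),
}

def exact_lookup(book_id):
    """RDB-style search: the id matches or nothing comes back."""
    return books.get(book_id)

def cosine(a, b):
    """Cosine similarity between two plain Python lists."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def most_similar(query_vector):
    """Vector-style search: always returns the id of the closest book."""
    return max(books, key=lambda book_id: cosine(books[book_id][1], query_vector))

print(exact_lookup(999))                  # None: no book has this id
nearest = most_similar([1.0, 0.7, 0.0])   # "space, and kind of sad"
print(nearest, books[nearest][0])         # 101 A sad space voyage
```

The exact lookup fails as soon as the key is off by one, while the similarity search always returns the closest candidate, even when nothing matches perfectly.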
4. Vector DB vs Regular DB: Critical Differences
| Feature | Regular DB (MySQL, PostgreSQL) | Vector DB (Pinecone, Chroma) |
|---|---|---|
| Search Method | Exact Keyword/ID Match | Semantic Similarity (Cosine Similarity) |
| Data Type | Integer, String, Date | 1536-dim Vector (Float Array) |
| Index | B-Tree (Sort-based) | HNSW, IVF (Graph/Cluster-based) |
| Query Example | WHERE content LIKE '%AI%' | vector_search(embedding, top_k=5) |
| Main Use | Payments, User Mgmt, Boards | Chatbots, Recommender Sys, Image Search |
RDB handles "Data that must not be wrong" (Money, Inventory). Vector DB handles "Data where similar is good" (Recommendations, Search).
5. Core Tech: Indexing (HNSW)
The secret to Vector DBs being dramatically faster than naive approaches lies in an indexing algorithm called HNSW (Hierarchical Navigable Small World). The name is long and scary, but the principle is simple: "Highways and Local Roads."
Suppose we have 1 million data points. The brute-force method (Flat Search) compares my query vector with all 1 million vectors one by one. Naturally, it's slow.
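A minimal sketch of that brute-force search with NumPy looks like this. The data is random, and the sizes (10,000 vectors, 64 dimensions) are arbitrary assumptions chosen so it runs quickly; flat_search is a name I made up.

```python
import numpy as np

def flat_search(vectors, query, top_k=5):
    """Brute-force cosine search: scores the query against every stored vector."""
    norms = np.linalg.norm(vectors, axis=1) * np.linalg.norm(query)
    scores = (vectors @ query) / norms       # one similarity score per stored vector
    top = np.argsort(-scores)[:top_k]        # indices of the highest scores first
    return top, scores[top]

rng = np.random.default_rng(0)
vectors = rng.normal(size=(10_000, 64)).astype(np.float32)
# Build a query that is a slightly noisy copy of vector 42
query = vectors[42] + 0.01 * rng.normal(size=64).astype(np.float32)

ids, scores = flat_search(vectors, query)
print(ids[0])   # 42: the vector the query was derived from
```

Every query touches all n vectors across all d dimensions, so the work per query grows in proportion to n × d. That is fine for thousands of vectors and painful for millions.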
HNSW divides data into Layers.
- Layer 2 (Highway): Only a small sample of the data points lives here, joined by long-range links. We find the approximate location here. ("Let's go towards Seoul")
- Layer 1 (Arterial): A bit denser. ("Let's go towards Gangnam District")
- Layer 0 (Local): All data is connected. ("Let's go to 123 Yeoksam-dong")
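When searching, you take the highway to jump near the destination quickly, then drop to local roads for a precise search. Thanks to this, you typically only need to compare a few hundred vectors to find the answer among millions. The trade-off is that the result is approximate: the search can occasionally miss the true nearest neighbor.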
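The highway idea can be sketched in a few lines. This is a toy version under loud assumptions, not real HNSW: real implementations insert points one by one, assign layers with an exponentially decaying probability, and keep a list of candidates instead of a single current node. Here each upper layer is just a random 1/8 sample of the layer below, every node links to its 8 nearest neighbors on that layer, and the search is a plain greedy walk. build_layers and layered_search are hypothetical names.

```python
import random
import numpy as np

def build_layers(vectors, num_layers=3, m=8, shrink=8, seed=0):
    """Toy layered graph: layers[0] holds every point, each higher layer
    keeps a random 1/shrink sample of the layer below."""
    rng = random.Random(seed)
    members = list(range(len(vectors)))
    layers = []
    for _ in range(num_layers):
        ids = np.array(members)
        graph = {}
        for i in members:
            dists = np.linalg.norm(vectors[ids] - vectors[i], axis=1)
            nearest = ids[np.argsort(dists)[:m + 1]]
            graph[i] = [int(j) for j in nearest if j != i][:m]  # m closest, excluding itself
        layers.append(graph)
        members = rng.sample(members, max(1, len(members) // shrink))
    return layers

def layered_search(vectors, layers, query):
    """Greedy descent: on each layer, keep moving to the closest neighbor."""
    current = next(iter(layers[-1]))  # arbitrary entry point on the top layer
    best = float(np.linalg.norm(vectors[current] - query))
    compared = {current}
    for graph in reversed(layers):    # highway first, local roads last
        while graph[current]:
            neighbors = graph[current]
            dists = np.linalg.norm(vectors[neighbors] - query, axis=1)
            compared.update(neighbors)
            j = int(np.argmin(dists))
            if dists[j] >= best:
                break                 # no neighbor is closer: drop down one layer
            current, best = neighbors[j], float(dists[j])
    return current, best, len(compared)

rng = np.random.default_rng(0)
vectors = rng.normal(size=(2000, 16))
query = rng.normal(size=16)
layers = build_layers(vectors)
found, dist, compared = layered_search(vectors, layers, query)
exact = int(np.argmin(np.linalg.norm(vectors - query, axis=1)))
print(f"layered: id={found}, compared {compared} of {len(vectors)} vectors; exact: id={exact}")
```

The printed line shows how many vectors the walk actually compared against the query and whether it landed on the same id as the exhaustive search. A greedy walk like this can stop at a point that is merely close, which is exactly the "approximate" in Approximate Nearest Neighbor.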
6. Practical Guide: Using Major Vector DBs
1. Pinecone (Most Popular Managed Service)
No installation or server management required. Just an API Key.
```python
from pinecone import Pinecone, ServerlessSpec

# 1. Initialize the client
pc = Pinecone(api_key="your-api-key")

# 2. Create the index (run once)
pc.create_index(
    name="my-index",
    dimension=1536,   # OpenAI embedding dimension
    metric="cosine",  # Similarity metric
    # Recent SDK versions require a spec; cloud and region are example values
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

# 3. Upsert data as (id, vector, metadata) tuples
index = pc.Index("my-index")
index.upsert(vectors=[
    ("id1", [0.1, 0.2, ...], {"text": "Delicious Apple"}),  # "..." stands for the remaining dimensions
    ("id2", [0.3, 0.4, ...], {"text": "Red Fruit"}),
])

# 4. Query
results = index.query(
    vector=[0.15, 0.25, ...],  # Vector for 'Apple'
    top_k=5,
    include_metadata=True,
)
for match in results.matches:
    print(f"Score: {match.score}, Text: {match.metadata['text']}")
```
2. Chroma (Open Source, Local Run)
Great for testing locally or if you're worried about costs.
```python
import chromadb

client = chromadb.Client()  # Runs in memory
collection = client.create_collection("my_collection")

# Automatically embeds text with the default embedding model (convenient!)
collection.add(
    documents=["This is an apple", "This is a banana"],
    ids=["id1", "id2"],
)

# Search by text directly
results = collection.query(
    query_texts=["Find something similar to apple"],
    n_results=1,
)
print(results["documents"])  # Best-matching documents for each query text
```
7. Mathematics of Similarity: Distance Metrics
How do we measure "Similarity"? There are three main ways.
- Cosine Similarity: Measures the angle between two vectors. Great for text where document length doesn't matter. (Most common in NLP).
  - 1: Same direction, -1: Opposite direction.
- Euclidean Distance (L2): Measures the straight-line distance between two points. Used when magnitude matters (e.g., image brightness).
- Dot Product: Multiplies magnitudes and cosine of the angle. Used when both magnitude and direction matter (e.g., recommendation systems where popularity matters).
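Cosine Similarity is the most common choice for text search, but defaults differ between vector DBs (Chroma, for example, defaults to squared L2), so check the metric when you create an index.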
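The three metrics are easy to compare side by side with NumPy. The vectors below are made up so the results can be checked by hand: b points the same way as a but is twice as long, and c points the opposite way.

```python
import numpy as np

def cosine_similarity(a, b):
    """Angle only: ignores how long the vectors are."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    """Straight-line (L2) distance between the two points."""
    return float(np.linalg.norm(a - b))

def dot_product(a, b):
    """Direction and magnitude combined."""
    return float(a @ b)

a = np.array([1.0, 2.0, 3.0])
b = 2 * a   # same direction, twice the magnitude
c = -a      # opposite direction

print(cosine_similarity(a, b))   # about 1.0: same direction, length is ignored
print(cosine_similarity(a, c))   # about -1.0: opposite direction
print(euclidean_distance(a, b))  # about 3.74 (the square root of 14): b is farther out
print(dot_product(a, b))         # 28.0: grows with magnitude
```

Cosine treats a and b as identical, while L2 and Dot Product both notice that b is longer. That difference is the whole reason the choice of metric matters.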
8. Use Cases: Not Just Chatbots
Vector DBs are versatile.
- Image Search: "Find a dress that looks like this photo." (Embed images, not text)
- Recommendation Systems: "Users who bought this also bought..." (Embed user behavior)
- Anomaly Detection: "Is this transaction weird?" (If the vector is far from the normal clusters, it may be fraud.)
- Audio Search: "Find songs with a similar vibe." (Embed audio waveforms)
Any data that can be turned into numbers (Vectors) can be searched by similarity.
9. Practical Tips: Checklist Before Adopting Vector DB
- How much data?
  - Under 100k: Just use PostgreSQL (pgvector) or a local library (FAISS). Don't add another system to manage.
  - Over 1 million: Definitely use a specialized Vector DB (Pinecone, Milvus).
- Embedding Dimension?
  - OpenAI text-embedding-3-small is 1536 dimensions.
  - Open Source models (all-MiniLM-L6-v2) are 384 dimensions. Smaller dimensions are faster and cheaper.
- Need Metadata Filtering?
  - If you have requirements like "Search only within 2024 data," choose a DB with good metadata filtering performance (Pinecone, Weaviate).
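For the data-size and dimension questions, a back-of-the-envelope storage estimate helps. The sketch below assumes every dimension is stored as a 4-byte float32 and counts only the raw vectors, with no index overhead and no metadata. raw_vector_bytes is a helper name I made up.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for the vectors alone, assuming float32 values by default."""
    return num_vectors * dim * bytes_per_value

# (number of vectors, embedding dimension) combinations from the checklist
scenarios = [(100_000, 1536), (1_000_000, 1536), (1_000_000, 384)]
for n, dim in scenarios:
    gigabytes = raw_vector_bytes(n, dim) / 1e9
    print(f"{n:>9,} vectors x {dim:>4} dims = {gigabytes:.2f} GB")
```

This prints 0.61 GB, 6.14 GB, and 1.54 GB. Going from 1536 to 384 dimensions cuts the raw footprint to a quarter, which is where "faster and cheaper" comes from.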
10. Summary
A Vector DB is a database specialized for storing high-dimensional vectors and searching them at high speed for semantic similarity, using indexes such as HNSW.
It is the core storage that allows AI to 'understand' text, images, and audio, and is the heart of RAG (Retrieval-Augmented Generation) systems.
We have moved beyond the era of SQL (SELECT *) to the era of Vectors (Find Similar).