Database Indexing: Why B-Tree is the King of Databases
1. The Anatomy of a Slow Query
Imagine you visit a library with 1 million books, but they are piled up randomly on the floor.
You ask the librarian: "Can I have 'Harry Potter'?"
The librarian has to check every single book stack by stack until they find it. In database terms, this is a Full Table Scan. It creates huge I/O/ CPU load and kills performance.
Now, imagine the books are sorted on shelves by Category, then Author, then Title.
The librarian walks to 'Fiction', finds 'Rowling', and picks the book instantly.
This is an Index Scan.
In Computer Science, there are many data structures for searching: Arrays, Linked Lists, Hash Maps, Binary Trees.
But why does almost every Relational Database (MySQL, PostgreSQL, Oracle) choose the B-Tree (or its variant B+Tree) as the default indexing structure?
Why not a Red-Black Tree? Why not a Hash Map?
The answer lies in the hardware: The Disk Drive.
2. The Bottleneck is the Disk
In Algorithm class, we learn that Binary Search Trees (BST) have a time complexity of O(log N).
This assumes accessing a node takes constant time. This is true in RAM (Random Access Memory).
But databases are too big for RAM. They live on Disks (HDD/SSD).
- Accessing RAM: ~100 nanoseconds.
- Accessing Disk: ~10 milliseconds (HDD).
Disk access is 100,000 times slower than RAM access.
Therefore, the goal of a Database Index is not just to minimize comparisons (CPU), but to minimize Disk Reads (I/O).
The Problem with Binary Trees
A standard Binary Tree node has at most 2 children.
If you have 1 million records (2^20), the tree height is roughly 20.
To find a leaf node, you might need to jump to 20 different places on the disk.
20 Disk seeks = Slow.
The Solution: B-Tree (Fat Trees)
The B-Tree (Balanced Tree) was designed specifically for storage systems.
Instead of 2 children, a B-Tree node can have hundreds or thousands of children.
Imagine a node is size of a Disk Block (e.g., 4KB or 16KB). We cram as many keys as possible into that one block.
If a node can hold 1000 keys:
- Level 1: 1 block (1,000 keys)
- Level 2: 1,000 blocks (1,000,000 keys)
- Level 3: 1,000,000 blocks (1,000,000,000 keys)
With a B-Tree of Height 3, we can index 1 Billion rows.
Finding any record takes only 3 Disk Reads. This is the magic of B-Trees.
3. B-Tree vs Hash Index
Hash Maps provide O(1) access time. That sounds faster than B-Tree's O(log N).
Indeed, for "Exact Match" queries (SELECT * FROM users WHERE id = 123), Hash Indexes are unbeatable.
However, Hash Indexes are useless for Range Queries.
SELECT * FROM users WHERE age > 25;
-- or
SELECT * FROM orders ORDER BY date DESC;
A Hash function scatters data randomly. age=25 might be at address 0x100, and age=26 at 0x999.
There is no concept of "next". To find everyone older than 25, a Hash Index has to scan everything.
B-Trees Keep Data Sorted.
In a B-Tree (specifically B+Tree), the leaf nodes form a Linked List.
Once you traverse the tree to find age=25, you just follow the horizontal pointer to read 26, 27, 28... efficiently.
This "Sequential Access" is incredibly fast on disks.
4. Clustered vs Non-Clustered Indexes
Understanding this distinction is vital for schema design.
Clustered Index
- Organizes the physical storage of data.
- The table rows are stored in the leaf nodes of the index.
- Sorting the clustered index sorts the actual data on disk.
- One per table (Data can only be sorted in one way physically).
- Usually the Primary Key.
Non-Clustered Index (Secondary Index)
- A separate structure from the data rows.
- The leaf nodes contain the index key and a pointer (usually the Primary Key) to the actual data row.
- Performance penalty: It requires a "Double Lookup" (Lookup Index -> Get PK -> Lookup Clustered Index -> Get Data). This is often called a Bookmarking Lookup or Key Lookup.
- Covering Index: If the Index contains all the columns requested by the SELECT query, the database doesn't need to look up the actual table row. This effectively turns a Non-Clustered index into a fast Clustered-like read.
5. Indexing Anti-Patterns: Why is my query slow?
Creating an index is not enough. You must write queries that can use the index.
1. Functions on Columns
-- WRONG: The database must calculate YEAR(date) for every row.
SELECT * FROM sales WHERE YEAR(sale_date) = 2023;
-- RIGHT: We search against the raw column values.
SELECT * FROM sales WHERE sale_date BETWEEN '2023-01-01' AND '2023-12-31';
2. The Leading Wildcard
-- WRONG: Cannot use index.
SELECT * FROM users WHERE name LIKE '%Son';
-- RIGHT: Can use index (B-Tree knows where 'S' starts).
SELECT * FROM users WHERE name LIKE 'Son%';
Think of a dictionary. You can easily find words starting with 'App', but you cannot easily find words ending with 'ple' without reading the whole dictionary.
3. OR Conditions
If you verify WHERE col1 = A OR col2 = B, and only col1 is indexed, the database might abandon the index and do a full scan, because it has to check col2 anyway.
Solution: Use UNION or ensure both columns have indexes (Index Merge).
4. Low Cardinality
Creating an index on a Gender column (Male/Female) is usually wasteful.
If an index helps you filter out 50% of the rows, the optimizer might decide it's faster to just read the whole table sequentially than to jump around randomly (Random I/O) reading 50%.
Indexes work best when they filter out 95-99% of the rows.
6. The Cost of Indexes
"Why not index every column?"
Each index is a data structure that must be maintained.
- Insert Penalty: When adds a row, DB must add entries to all indexes.
- Update Penalty: If you change an indexed column, the index tree must be rebalanced.
- Storage Cost: Indexes take up disk space and RAM.
The Golden Rule: Index primarily for your WHERE, JOIN, and ORDER BY clauses, but keep write-heavy tables light on indexes.
7. Structure of B+Tree (The Standard)
Most modern databases actually use B+Trees, a variation of B-Trees.
- B-Tree: Data can be stored in internal nodes or leaf nodes.
- B+Tree: Data is stored ONLY in leaf nodes. Internal nodes only hold keys for routing.
- Pro 1: Internal nodes are lighter -> More keys per block -> Lower tree height -> Fewer Disk I/Os.
- Pro 2: Leaf nodes are linked together -> Faster Full Scans (range queries).
Summary
- B-Tree minimizes Disk I/O by being "short and fat".
- Hash Index is fast for exact match, but useless for ranges/sorting.
- Clustered Index is the table itself; Non-Clustered is a pointer.
- Don't apply functions to indexed columns in
WHERE clauses.
- Indexing is a balance between Read Speed vs Write Speed.
Mastering indexes distinguishes a "Coder" from a "Senior Backend Engineer". It's the bridge between logical SQL and physical hardware.