SSD vs HDD: How Storage Devices Work

SSD vs HDD: The Mechanics of Storage

"I tuned the query because the DB was slow"

Back when I was a junior developer, my service slowed down. My senior asked, "What storage is the DB running on?" I answered, "Probably... just a hard drive?" He looked at me with pity and told me to switch to an SSD. Like magic, the problems vanished. I had spent all night tuning SQL queries, but the bottleneck had been the hardware all along.

What is the difference between SSD and HDD that creates such a gap? It's the difference between a Vinyl Record and a USB Thumb Drive.

1. HDD fetches data by "walking" like a librarian

At first I thought HDD was just "a slower SSD." Same kind of storage, different speed. I was wrong. The mechanism itself was completely different.

HDD is a Vinyl Record (LP).

Structure: A magnetic platter spins at 7,200 RPM (or 10,000 RPM). A needle (Head) reads the data.
Problems:
1. Must wait for the platter to spin (Rotational Latency).
2. Must wait for the needle to move (Seek Time).
3. To read random data, the needle has to dance frantically.

Because of this "Physical Movement," there is a definite speed limit.
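
To see where that speed limit comes from, here is a rough back-of-the-envelope sketch in Python. The rotational part follows directly from the RPM; the 4 ms average seek time is an assumed figure for illustration, not a measured one, and the function names are my own.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average wait for the target sector: half of one full rotation."""
    ms_per_rotation = 60_000 / rpm
    return ms_per_rotation / 2


def estimated_random_iops(rpm: int, avg_seek_ms: float) -> float:
    """Rough ceiling on random IOPS: one seek plus one rotational wait per request."""
    return 1000 / (avg_seek_ms + avg_rotational_latency_ms(rpm))


if __name__ == "__main__":
    # avg_seek_ms=4.0 is an assumption; real drives vary.
    for rpm in (7_200, 10_000):
        latency = avg_rotational_latency_ms(rpm)
        iops = estimated_random_iops(rpm, avg_seek_ms=4.0)
        print(f"{rpm} RPM: {latency:.2f} ms rotational latency, ~{iops:.0f} random IOPS")
```

With these assumptions a 7,200 RPM drive lands at roughly 120 random operations per second, no matter how fast the rest of the machine is.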

I thought of a librarian analogy. An HDD is a librarian walking to fetch books from shelves. The farther the book, the slower. If 10 books are scattered across different shelves? The librarian has to walk back and forth 10 times.

That's why using an HDD for a DB server is a disaster. Databases often "read 1,000 tiny pieces of data one at a time." The HDD needle goes crazy trying to find all 1,000 locations.

2. SSD teleports

SSD is a USB Thumb Drive.

Structure: Traps electrons in semiconductor chips (Flash Memory).
Traits: No moving parts at all.
Pros:
1. Access speed is the same regardless of location (Random Access).
2. No need to wait for a needle. Reads are purely electrical, so latency is measured in microseconds instead of milliseconds.

I imagined a teleporting librarian. The moment the librarian knows where a book is, they teleport there and grab it. 10 scattered books? No problem. Just teleport 10 times.

This is when it clicked. "SSD shines at random access. HDD is only tolerable at sequential access."

Sequential Read: Reading a 1GB movie file from start to finish → HDD is OK (needle moves in one direction)
Random Read: Reading 1,000 scattered database records → HDD hell, SSD heaven
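
The gap between those two cases can be sketched with a tiny model. The latency and throughput figures below are ballpark assumptions I chose for illustration (about 10 ms per random HDD read, 0.1 ms per random SSD read, 150 MB/s and 500 MB/s sequential), not benchmarks of any particular drive.

```python
def random_read_seconds(num_reads: int, latency_ms: float) -> float:
    """Total time when every read pays the full access latency."""
    return num_reads * latency_ms / 1000


def sequential_read_seconds(size_mb: float, throughput_mb_s: float) -> float:
    """Total time when the device can stream at its sequential throughput."""
    return size_mb / throughput_mb_s


# Assumed ballpark figures, for illustration only.
HDD = {"latency_ms": 10.0, "throughput_mb_s": 150.0}
SSD = {"latency_ms": 0.1, "throughput_mb_s": 500.0}

if __name__ == "__main__":
    for name, dev in (("HDD", HDD), ("SSD", SSD)):
        seq = sequential_read_seconds(1000, dev["throughput_mb_s"])
        rnd = random_read_seconds(1000, dev["latency_ms"])
        print(f"{name}: 1GB sequential {seq:.1f}s, 1,000 random reads {rnd:.1f}s")
```

Under these assumptions the sequential gap is only a few times, while the random-read gap is about a hundredfold.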

3. NAND Flash: SSD isn't perfect either

SSD is not magic. It traps electrons in semiconductor chips, but this "electron prison" isn't eternal.

NAND Flash Types: How many bits per cell?

NAND Flash, the core of SSDs, comes in types. It depends on how many bits you store per cell.

Type	Bits/Cell	Lifespan (approx. P/E Cycles)	Speed	Price	Use Case
SLC (Single-Level Cell)	1	100,000	Fastest	Very expensive	Enterprise servers, industrial
MLC (Multi-Level Cell)	2	10,000	Fast	Expensive	High-end laptops, workstations
TLC (Triple-Level Cell)	3	3,000	Moderate	Affordable	Consumer SSDs (most of us)
QLC (Quad-Level Cell)	4	1,000	Slow	Cheap	High-capacity storage (read-heavy)

P/E Cycles means Program/Erase Cycles. Simply put, "how many times can you write and erase?"

When I first saw these numbers, I panicked. "TLC can only handle 3,000 writes? Won't it break soon?"

Turns out Wear Leveling makes it fine.

Wear Leveling: Spread the wear evenly

SSD controllers are smart. They don't keep writing to the same cells. They use all cells evenly.
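
As a toy illustration of the idea (real controllers are far more sophisticated, and `ToyWearLeveler` is a hypothetical name of mine), the simplest possible policy is "always write to the least-worn block":

```python
class ToyWearLeveler:
    """Toy model: every write goes to the block with the fewest erases so far."""

    def __init__(self, num_blocks: int) -> None:
        self.erase_counts = [0] * num_blocks

    def write(self) -> int:
        # Pick the least-worn block, then count one more program/erase cycle on it.
        block = min(range(len(self.erase_counts)), key=lambda i: self.erase_counts[i])
        self.erase_counts[block] += 1
        return block


if __name__ == "__main__":
    ssd = ToyWearLeveler(num_blocks=8)
    for _ in range(1_000):
        ssd.write()
    print(ssd.erase_counts)  # wear ends up spread evenly across all blocks
```

Even though the "file" is rewritten a thousand times, no single block takes more than its fair share of the wear.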

For example, say you write and delete a 10GB file daily on a 256GB SSD. The controller writes this file to a different physical location each time. It rotates through the entire 256GB.

Result: A TLC SSD (3,000 P/E Cycles, 256GB) can theoretically write 256GB × 3,000 = 768TB. If you write 10GB per day? You can use it for 210 years.
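
The same arithmetic as a small sketch. The `write_amplification` parameter is my own addition: it models the drive physically writing more than you asked for, and 1.0 reproduces the idealized number above.

```python
def total_write_budget_gb(capacity_gb: float, pe_cycles: int,
                          write_amplification: float = 1.0) -> float:
    """Host data the drive can absorb before the cells wear out (idealized)."""
    return capacity_gb * pe_cycles / write_amplification


def lifespan_years(capacity_gb: float, pe_cycles: int, daily_writes_gb: float,
                   write_amplification: float = 1.0) -> float:
    """Years until the write budget is used up at a constant daily write rate."""
    budget = total_write_budget_gb(capacity_gb, pe_cycles, write_amplification)
    return budget / daily_writes_gb / 365


if __name__ == "__main__":
    print(total_write_budget_gb(256, 3_000))        # GB of total writes
    print(round(lifespan_years(256, 3_000, 10)))    # years at 10GB per day
```

Treat the result as an upper bound: it assumes perfect wear leveling and ignores everything else that can kill a drive.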

Now I get it. SSDs last way longer than I thought.

Write Amplification: You write more than you think

But SSDs have one more trap. When you write 1GB, the drive may physically write more than that, say 1.5GB.

Why? SSDs can't overwrite in place.

HDD: Overwrites the same spot directly.
SSD: Must erase existing data first. But "erase" only works at block level.

A block contains multiple pages.

Write/Read: Page level (4KB ~ 16KB)
Erase: Block level (256KB ~ 4MB)

To modify a 4KB file?

Read the entire block (256KB).
Modify only the changed part.
Erase the block.
Write the entire thing back.

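You modified 4KB but wrote 256KB, 64 times more. This is Write Amplification. This is the worst case, though. Real controllers usually write the changed page to a fresh location and clean up the stale copy later, so the typical ratio is far lower.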
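
The ratio for the 4KB example can be computed directly. This is a minimal sketch of the worst case only, assuming the change fits inside block boundaries; the function names are hypothetical.

```python
KIB = 1024


def worst_case_physical_write(change_bytes: int, block_bytes: int) -> int:
    """Bytes rewritten if every touched block is fully read, erased and rewritten."""
    blocks_touched = -(-change_bytes // block_bytes)  # ceiling division
    return blocks_touched * block_bytes


def write_amplification(change_bytes: int, block_bytes: int) -> float:
    """Physical bytes written divided by the bytes the host asked to write."""
    return worst_case_physical_write(change_bytes, block_bytes) / change_bytes


if __name__ == "__main__":
    print(write_amplification(4 * KIB, 256 * KIB))  # the 4KB edit in a 256KB block
```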

This problem is mitigated by the TRIM command.

TRIM: OS tells the SSD "this data can be erased"

When you delete a file, the OS doesn't actually erase the data. It just marks the space as "free" in the file table.

Problem: The SSD doesn't know this. It still thinks valid data exists there. When you try to write new data to that space? You go through the "read → modify → erase → write" process.

TRIM command tells the SSD: "These blocks are no longer used. You can erase them in advance."

The SSD erases these blocks during idle time. Writing new data later becomes much faster.
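
Here is a deliberately simplified model of why that helps. It is not how any real firmware is written; `ToyBlock` and its methods are hypothetical names. The point is only that pages the SSD still believes are live must be copied elsewhere before the block can be erased.

```python
class ToyBlock:
    """One erase block; each page is either live (True) or stale (False)."""

    def __init__(self, pages: int) -> None:
        self.live = [True] * pages  # the block starts out full of file data

    def trim(self, page_indexes) -> None:
        # The OS tells the SSD these pages belong to a deleted file.
        for i in page_indexes:
            self.live[i] = False

    def pages_to_copy_before_erase(self) -> int:
        # Live pages must be moved elsewhere before the block can be erased.
        return sum(self.live)


if __name__ == "__main__":
    without_trim = ToyBlock(pages=64)
    with_trim = ToyBlock(pages=64)
    with_trim.trim(range(64))  # file deleted, and the SSD was told about it
    print(without_trim.pages_to_copy_before_erase(), with_trim.pages_to_copy_before_erase())
```

Without TRIM the drive dutifully preserves 64 pages of garbage. With TRIM it can erase the block immediately.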

Checking if TRIM is enabled on Linux:

# Check if TRIM schedule is active
systemctl status fstrim.timer

# Manually run TRIM (when SSD performance drops)
sudo fstrim -v /

I once had an SSD slow down after a few months because I didn't know this. One TRIM run brought it back to full speed.

4. Real World: IOPS (Input/Output Operations Per Second)

One of the most critical metrics in server performance is IOPS. Simply put: "How many read/write operations can the disk complete per second?"

HDD vs SSD Comparison

HDD (7200 RPM): Approx. 100 ~ 200 IOPS
SSD (SATA): Approx. 50,000 ~ 100,000 IOPS
NVMe SSD: Approx. 500,000+ IOPS

Do you see it? 100 vs 50,000. Running a DB (which reads/writes tiny data constantly) on an HDD is like putting a Ferrari engine (CPU) on a car with millstones for wheels (HDD).
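
To make those numbers tangible, IOPS can be converted into throughput at a given request size. A minimal sketch, assuming 4KB requests (a common database page size, but an assumption here):

```python
def throughput_mib_s(iops: float, block_kib: float = 4.0) -> float:
    """Data rate when every operation moves block_kib of data."""
    return iops * block_kib / 1024


def seconds_for(num_ops: int, iops: float) -> float:
    """Time to finish num_ops operations at a steady IOPS rate."""
    return num_ops / iops


if __name__ == "__main__":
    for name, iops in (("HDD", 150), ("SATA SSD", 50_000), ("NVMe SSD", 500_000)):
        rate = throughput_mib_s(iops)
        wait = seconds_for(1_000_000, iops)
        print(f"{name}: {rate:.1f} MiB/s at 4KB, {wait:.0f}s for 1M operations")
```

At 150 IOPS an HDD doing 4KB random reads moves well under 1 MiB per second, even though the same drive can stream a large file at over a hundred times that rate.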

AWS EBS Volume Types: Buying IOPS with money

When you run a server on AWS, you have to choose an EBS (Elastic Block Store) volume. At first I didn't know what this was, so I just used the default (gp2).

Later I realized IOPS affects pricing dramatically.

Type	IOPS	Throughput	Use Case	Price (approx.)
gp3 (General Purpose SSD)	3,000 ~ 16,000 (configurable)	125 ~ 1,000 MB/s	Most workloads	Cheap (default)
gp2 (Old General Purpose SSD)	100 ~ 16,000 (scales with size)	250 MB/s	Old default	More expensive than gp3
io1 (Provisioned IOPS)	Up to 64,000	1,000 MB/s	DB servers, high performance	Very expensive
io2 (Next-gen io1)	Up to 64,000	1,000 MB/s	Mission-critical DB	Very expensive, on par with io1 (99.999% durability)
st1 (HDD)	500	500 MB/s	Logs, big data	Cheap
sc1 (Cold HDD)	250	250 MB/s	Archives, backups	Cheapest

Key insight:

gp3 lets you configure IOPS and throughput separately. gp2 automatically sets IOPS based on volume size.
io1/io2 let you provision a specific IOPS level that the volume is designed to deliver consistently. You pay more, but performance is far more predictable.
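
The gp2 sizing rule can be written down in a few lines. This sketch uses the baseline rule AWS has published for gp2 (3 IOPS per GiB, with a floor of 100 and a ceiling of 16,000). Check the current EBS documentation before relying on it, since limits change over time.

```python
GP3_BASELINE_IOPS = 3_000  # gp3 starts here regardless of volume size


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline: 3 IOPS per GiB, clamped to the 100..16,000 range."""
    return min(max(3 * size_gib, 100), 16_000)


if __name__ == "__main__":
    for size in (20, 100, 1_000, 6_000):
        print(f"{size} GiB gp2 volume: {gp2_baseline_iops(size)} baseline IOPS")
```

A small gp2 volume therefore has a much lower baseline than the same-sized gp3 volume, which is one reason gp3 is the easier default.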

In my experience, gp3 was enough for most cases, unless your DB server handles tens of thousands of queries per second.

If RDS performance is slow, first check IOPS usage in CloudWatch.

# Monitor disk I/O on EC2 instance
iostat -x 1

# Check which process is using disk the most
sudo iotop

I didn't know these commands at first. When my server slowed down, I only checked CPU and memory, then discovered disk I/O was the bottleneck way too late.
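
If you want to see roughly where such numbers come from, Linux exposes cumulative per-device counters in /proc/diskstats. The sketch below is my own simplification, not how iostat itself is implemented: it takes two snapshots and divides the difference by the interval. It relies on the documented field order, where the 4th column is reads completed and the 8th is writes completed.

```python
import os
import time


def parse_diskstats(text: str) -> dict[str, tuple[int, int]]:
    """Map device name -> (reads completed, writes completed)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        stats[fields[2]] = (int(fields[3]), int(fields[7]))
    return stats


def iops_between(before, after, interval_s: float) -> dict[str, tuple[float, float]]:
    """Per-device (read IOPS, write IOPS) between two snapshots."""
    result = {}
    for dev, (reads, writes) in after.items():
        if dev in before:
            reads0, writes0 = before[dev]
            result[dev] = ((reads - reads0) / interval_s, (writes - writes0) / interval_s)
    return result


if __name__ == "__main__" and os.path.exists("/proc/diskstats"):
    def snapshot():
        with open("/proc/diskstats") as f:
            return parse_diskstats(f.read())

    first = snapshot()
    time.sleep(1)
    print(iops_between(first, snapshot(), 1.0))
```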

Disk Benchmarking: Measuring IOPS with fio

"Is my server's SSD actually fast?" I sometimes wonder. Especially with cloud servers, since you never get to see the hardware.

You can benchmark directly with fio (Flexible I/O Tester).

# Install fio (Ubuntu)
sudo apt install fio

# Random read IOPS test (4KB blocks)
fio --name=random-read \
    --ioengine=libaio \
    --iodepth=64 \
    --rw=randread \
    --bs=4k \
    --direct=1 \
    --size=1G \
    --numjobs=4 \
    --runtime=60 \
    --group_reporting

# Random write IOPS test (4KB blocks)
fio --name=random-write \
    --ioengine=libaio \
    --iodepth=64 \
    --rw=randwrite \
    --bs=4k \
    --direct=1 \
    --size=1G \
    --numjobs=4 \
    --runtime=60 \
    --group_reporting

Interpreting results:

IOPS: Look for read: IOPS=12345 in the output.
If HDD: 100 ~ 200 IOPS
If SATA SSD: 10,000 ~ 100,000 IOPS
If NVMe SSD: 100,000+ IOPS
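
If you run fio with --output-format=json (ideally with --output=result.json so the file contains only JSON), the same check can be scripted. This is a sketch that assumes the report has a top-level "jobs" list whose entries carry "read" and "write" objects with an "iops" field. The class cutoffs mirror the rough ranges above and are my own simplification.

```python
import json


def total_iops(fio_json: str) -> dict[str, float]:
    """Sum read and write IOPS across all jobs in a fio JSON report."""
    report = json.loads(fio_json)
    totals = {"read": 0.0, "write": 0.0}
    for job in report.get("jobs", []):
        for direction in totals:
            totals[direction] += job.get(direction, {}).get("iops", 0.0)
    return totals


def device_class(iops: float) -> str:
    """Very rough bucket based on random 4KB IOPS (assumed cutoffs)."""
    if iops < 1_000:
        return "HDD-class"
    if iops < 100_000:
        return "SATA SSD-class"
    return "NVMe-class"


if __name__ == "__main__":
    sample = json.dumps({"jobs": [{"read": {"iops": 12345.0}, "write": {"iops": 0.0}}]})
    read_iops = total_iops(sample)["read"]
    print(read_iops, device_class(read_iops))
```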

I ran this test once and realized: "Oh, my local MacBook SSD is genuinely fast. AWS gp3 is slower than I thought." (On macOS, libaio is not available, so swap --ioengine=libaio for --ioengine=posixaio.)

Cloud storage is shared and network-attached, so performance can dip when neighbors are busy or when you exceed what you provisioned. That's why provisioned IOPS (io1/io2) is expensive. You are paying for a performance level that is reserved for you.

5. Hot Data vs Cold Data: Right storage for the right data

So is SSD always the answer? The problem is Cost. HDD is overwhelmingly cheaper per gigabyte.

HDD: ~$20 ~ $30 per TB
SSD: ~$100 ~ $150 per TB

That is roughly a 5x difference.

So you have to separate storage based on "how often you use the data."
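
A quick sketch of what that separation is worth. The prices are the midpoints of the ranges above ($25 and $125 per TB), and the 100 TB and 20% hot figures are assumed numbers for illustration, not measurements from my own setup.

```python
HDD_USD_PER_TB = 25.0   # midpoint of the $20 ~ $30 range
SSD_USD_PER_TB = 125.0  # midpoint of the $100 ~ $150 range


def storage_cost_usd(total_tb: float, hot_fraction: float) -> float:
    """Cost when the hot fraction lives on SSD and everything else on HDD."""
    hot_tb = total_tb * hot_fraction
    cold_tb = total_tb - hot_tb
    return hot_tb * SSD_USD_PER_TB + cold_tb * HDD_USD_PER_TB


if __name__ == "__main__":
    all_ssd = storage_cost_usd(100, hot_fraction=1.0)
    tiered = storage_cost_usd(100, hot_fraction=0.2)
    print(f"all SSD: ${all_ssd:,.0f}, tiered: ${tiered:,.0f}")
```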

Hot Data: Frequently used data

Examples: DB, Cache (Redis), OS, log files being analyzed
Storage: Must use SSD
Why: Heavy random access, needs high IOPS

Warm Data: Occasionally used data

Examples: Last 3 months of logs, backup files (restored weekly)
Storage: SSD (gp3) or slower SSD
Why: Not used frequently, but needs fast reads when accessed

Cold Data: Rarely used data

Examples: 3-year-old logs, CCTV footage, legally required archive data
Storage: HDD or archive tiers (AWS st1/sc1, S3 Glacier)
Why: Large capacity, almost never read

How I applied this in practice:

1. RDS DB Server

Main server: io2 (Provisioned IOPS, 64,000 IOPS provisioned)
Read replicas: gp3 (cheaper)

2. Log Storage

Last 7 days: EBS gp3 (Elasticsearch indexing)
8 ~ 90 days: S3 Standard
91+ days: S3 Glacier (legal retention)

3. Backups

Daily backups (last 30 days): S3 Standard (fast recovery)
Monthly backups (1 year): S3 Glacier (cheap)

This cut storage costs by 60%.
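
The log retention rule above boils down to a lookup by age. In practice this kind of policy is usually expressed as storage lifecycle rules rather than application code, so treat this as a sketch of the decision only. The function name is hypothetical.

```python
def log_storage_tier(age_days: int) -> str:
    """Pick a storage tier for a log file based on its age in days."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days <= 7:
        return "EBS gp3"      # hot: still being indexed and searched
    if age_days <= 90:
        return "S3 Standard"  # warm: occasionally read
    return "S3 Glacier"       # cold: kept for legal retention


if __name__ == "__main__":
    for age in (1, 30, 400):
        print(age, log_storage_tier(age))
```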

6. Summary: Rotating Things Are Slow

Type	HDD	SSD
Analogy	Vinyl Record (LP)	USB Thumb Drive
Mechanism	Physical Rotation (Motor)	Semiconductor (Chips)
IOPS	100 ~ 200	50,000+
Use Case	Backups, Video Archives	OS, DB, Web Server

"Anything with moving parts is slow and prone to failure."

Remembering just this will prevent half of your server outages.

And one more thing. "Separate storage based on how often you use the data."

HDD is slow but cheap. Perfect for Cold Data. SSD is fast but expensive. Use it only for Hot Data to avoid cost explosion.

From my trial-and-error experience, following these two rules eliminates most storage problems.