
Race Conditions: Bugs That Depend on Timing
The code is perfect, but money occasionally disappears. The worst bugs are the ones created by timing.


I had exactly 1 million won in my bank account. I went to ATM A and deposited 100,000 won. At the exact same time, my friend used ATM B to deposit another 100,000 won into my account. Simple math says I should have 1.2 million won now, right?
When I checked my balance, it showed 1.1 million won. 100,000 won just vanished into thin air.
I called the bank. "System error," they said. "Let me check again," they said. "This is impossible," they insisted. I looked at the code myself - it was perfect. I checked the logs - both deposits were processed successfully. Yet the money was gone.
This is the worst kind of bug because it doesn't reproduce consistently. It works fine in testing. It works 99 times out of 100. It only breaks when the timing is just right. This is what we call a race condition.
When I first encountered this problem, I was completely lost. The code was obviously correct. How can balance = balance + 10 possibly fail? It's just addition!
The answer lies in the fact that this single line doesn't execute atomically. From the CPU's perspective, this innocent-looking line of code actually breaks down into three distinct steps:
// What you see: one line
balance = balance + 10;
// What the CPU actually does (machine code level)
// 1. READ: Load balance value from memory into register (100)
// 2. MODIFY: Add 10 to the register value (110)
// 3. WRITE: Store register value back to memory (110)
This three-step sequence is called the Read-Modify-Write pattern. The problem emerges when two threads (or processes) try to execute this pattern simultaneously. The execution order can get interleaved in dangerous ways:
Time | Thread A                 | Thread B
  1  | READ balance -> 100      | -
  2  | -                        | READ balance -> 100 (A hasn't written yet!)
  3  | MODIFY: 100 + 10 -> 110  | -
  4  | WRITE 110                | -
  5  | -                        | MODIFY: 100 + 10 -> 110
  6  | -                        | WRITE 110 (overwrites A's result!)
Both threads read balance = 100. Each adds 10 to get 110. Both write 110 to memory. The result is 110. We should have added 20, but only 10 got added. The other 10 evaporated.
In a bank account, that's money disappearing. In inventory management, that's stock counts getting corrupted. The code itself is flawless, but the outcome depends on execution timing. That's the essence of a race condition.
The concept really clicked for me when someone explained it using a bathroom analogy.
There's a single bathroom. The door has no lock. Two people (A and B) approach simultaneously. Both check the door (READ). Both see it's unlocked. Both think "It's free!" Both enter (WRITE). Collision. Disaster.
The solution is simple: install a lock. Person A enters and locks the door. Person B arrives, sees it's locked, and waits. Person A finishes and unlocks the door. Person B enters. This is exactly what a Mutex (Mutual Exclusion) does.
In code, it looks like this:
// Problematic code
async function deposit(amount) {
  const current = await getBalance(); // READ
  const updated = current + amount;   // MODIFY
  await setBalance(updated);          // WRITE
}
// If two deposits run concurrently, race condition occurs
The code region where Read-Modify-Write happens is called a critical section. This section must allow only one thread at a time. Ensuring this is called mutual exclusion.
JavaScript doesn't have native locks, but conceptually it would look like this:
// Protected with a Mutex
const lock = new Mutex();
async function deposit(amount) {
  await lock.acquire(); // Acquire the lock (other callers wait here)
  try {
    // Critical Section begins
    const current = await getBalance();
    const updated = current + amount;
    await setBalance(updated);
    // Critical Section ends
  } finally {
    lock.release(); // Always release the lock, even if setBalance throws
  }
}
Now even if two deposit calls run simultaneously, only one can be in the critical section at any moment. The first one in must finish before the second one can enter.
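JavaScript has no built-in Mutex class, so if you want to play with the idea, here is one minimal sketch built on Promises. The class name and its acquire/release API simply mirror the conceptual code above; it only covers async tasks inside a single process, not real shared-memory threads.
// Minimal promise-based Mutex sketch (illustrative, not a standard library class)
class Mutex {
  constructor() {
    this._locked = false;
    this._waiters = []; // resolve functions of callers waiting for the lock
  }
  async acquire() {
    if (!this._locked) {
      this._locked = true;
      return;
    }
    // Wait until a release() hands the lock to us
    await new Promise(resolve => this._waiters.push(resolve));
  }
  release() {
    const next = this._waiters.shift();
    if (next) {
      next(); // hand the lock directly to the next waiter
    } else {
      this._locked = false;
    }
  }
}
With something like this in place, the deposit example above works as written: the second caller's acquire() doesn't resolve until the first caller calls release().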
Another approach is using atomic operations. "Atomic" means "indivisible" - it cannot be split into smaller parts. The idea is to make Read-Modify-Write happen as a single, uninterruptible operation.
// Atomic update in database
await db.query('UPDATE accounts SET balance = balance + ? WHERE id = ?', [amount, accountId]);
The database handles this SQL atomically. No other query can interfere midway. Another technique is using CPU-level instructions like Compare-And-Swap (CAS).
// CAS concept (pseudocode)
function compareAndSwap(addr, expected, newValue) {
  // This entire operation executes atomically
  if (memory[addr] === expected) {
    memory[addr] = newValue;
    return true;
  }
  return false;
}
// Usage example
while (true) {
  const current = balance;
  const updated = current + 10;
  if (compareAndSwap(&balance, current, updated)) {
    break; // Success
  }
  // Failed - another thread modified balance first, retry
}
CAS says "if the value is still what I read, update it; otherwise fail." We loop until it succeeds. This approach is called lock-free programming.
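JavaScript does expose real CAS through the Atomics API on a SharedArrayBuffer (memory that worker threads can share). Here is a sketch of the same retry loop with the balance stored as a 32-bit integer at index 0; the setup is illustrative, not production banking code.
// Real CAS in JavaScript via Atomics on shared memory
const buffer = new SharedArrayBuffer(4); // room for one Int32
const account = new Int32Array(buffer);
account[0] = 100; // starting balance

function atomicDeposit(amount) {
  while (true) {
    const current = Atomics.load(account, 0);
    const updated = current + amount;
    // compareExchange returns the value that was actually in memory;
    // if it equals `current`, our write went through atomically.
    if (Atomics.compareExchange(account, 0, current, updated) === current) {
      return updated;
    }
    // Another thread changed the value first - loop and retry
  }
}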
Race conditions aren't limited to application code. Databases face them too.
Lost Update Problem: Two transactions read the same row, both modify it, both write back. One modification disappears. Just like our bank account example.
Dirty Read Problem: Transaction A modifies data but hasn't committed yet. Transaction B reads that uncommitted data. If A rolls back, B has read data that never existed.
To prevent these issues, databases provide transaction isolation levels. Setting it to SERIALIZABLE makes transactions behave as if they executed sequentially. Race conditions are completely eliminated. The tradeoff is performance.
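As a rough sketch of what that looks like from application code, here is the deposit wrapped in a SERIALIZABLE transaction. I'm using node-postgres (pg) as the example driver; the pool object and the accounts table are assumptions carried over from the earlier snippets.
// Deposit inside a SERIALIZABLE transaction (sketch, assuming the `pg` driver)
async function depositSerializable(pool, accountId, amount) {
  const client = await pool.connect();
  try {
    await client.query('BEGIN ISOLATION LEVEL SERIALIZABLE');
    await client.query(
      'UPDATE accounts SET balance = balance + $1 WHERE id = $2',
      [amount, accountId]
    );
    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err; // serialization failures are typically caught and retried by the caller
  } finally {
    client.release();
  }
}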
Another approach is choosing between pessimistic locking (take a row lock up front with SELECT ... FOR UPDATE so nobody else can touch the row) and optimistic locking (assume there is no conflict and detect one at write time with a version column):
-- Optimistic locking example
UPDATE accounts
SET balance = 110, version = version + 1
WHERE id = 123 AND version = 5;
-- If version isn't 5 (another transaction modified it), returns 0 rows affected
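The application-side retry loop around that UPDATE might look roughly like this. db.query is the same hypothetical helper from the earlier atomic-update example, and the affectedRows field on the result is an assumption about the driver:
// Optimistic locking retry loop (sketch; db.query and affectedRows are assumed helpers)
async function depositOptimistic(accountId, amount) {
  while (true) {
    const row = await db.query(
      'SELECT balance, version FROM accounts WHERE id = ?', [accountId]
    );
    const result = await db.query(
      'UPDATE accounts SET balance = ?, version = version + 1 WHERE id = ? AND version = ?',
      [row.balance + amount, accountId, row.version]
    );
    if (result.affectedRows === 1) return; // nobody changed the row in between
    // The version moved on - another transaction won this round, so re-read and retry
  }
}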
Another race condition variant is the TOCTOU bug. The state changes between the "time of check" and the "time of use."
// TOCTOU bug example
async function deleteFile(filePath) {
  if (await fileExists(filePath)) { // Time Of Check
    // ... time passes ...
    // Another process might delete the file!
    await fs.unlink(filePath); // Time Of Use
  }
}
The file existed when we checked, but might be gone when we use it. The solution is to make check and use atomic.
// Fix: skip the check - just delete and handle the error
// (assumes fs is require('fs/promises'))
try {
  await fs.unlink(filePath);
} catch (err) {
  if (err.code !== 'ENOENT') throw err; // ENOENT means the file was already gone - ignore it; rethrow anything else
}
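The same idea covers the mirror-image case: creating a file only if it doesn't already exist. Instead of an exists() check followed by a write, Node's 'wx' open flag makes check-and-create a single atomic operation. The lock-file use below is just an illustration:
const fs = require('fs/promises');

// Atomically "check that it doesn't exist and create it"
async function createLockFile(filePath) {
  try {
    const handle = await fs.open(filePath, 'wx'); // 'wx' fails with EEXIST if the file already exists
    await handle.close();
    return true; // we created the file
  } catch (err) {
    if (err.code === 'EEXIST') return false; // someone else created it first
    throw err;
  }
}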
Frontend code has race conditions too. A common one in React involves useState:
const [count, setCount] = useState(0);
// Buggy code
function increment() {
setCount(count + 1); // Reads previous value and adds 1
}
// What if button is clicked twice rapidly?
// Both increments run almost simultaneously
// Both read count = 0
// Both call setCount(1)
// Result: count = 1 (not 2!)
The solution is functional updates:
function increment() {
  setCount(prev => prev + 1); // Receives previous value as function parameter
}
// React guarantees order internally
// First increment: prev = 0, result = 1
// Second increment: prev = 1, result = 2
Race condition bugs are incredibly frustrating. They don't reproduce consistently, don't show up in logs, and only appear with specific timing. Here are techniques I use:
Thread Sanitizer: In C/C++, compiling with the -fsanitize=thread flag makes the program detect data races at runtime.
Log timestamps and thread IDs: Helps track which thread did what and when.
Artificially inject delays: Adding a sleep() in suspicious areas manipulates the timing so the bug reproduces on demand (see the sketch after this list).
Use immutable data structures: If you never modify data (always create new instances), race conditions can't happen. This is core to functional programming.
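To make the delay-injection trick concrete, here is a small self-contained Node.js sketch that reproduces the lost update on demand by widening the window between READ and WRITE; all the names are illustrative:
let balance = 100;

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

async function racyDeposit(amount) {
  const current = balance;    // READ
  await sleep(50);            // artificial delay widens the race window
  balance = current + amount; // WRITE - may overwrite a concurrent deposit
}

async function main() {
  await Promise.all([racyDeposit(10), racyDeposit(10)]);
  console.log(balance);       // prints 110, not 120 - one deposit was lost
}

main();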
Race conditions aren't about buggy code in the usual sense - they're about unlucky interleavings of correct code. Each line is right, but concurrent execution with unfortunate timing breaks the result. Understanding this concept connects the dots: why locks are needed in multithreading, why transactions are essential, and why immutability matters in async programming and distributed systems.
As the bank account example shows, in systems that deal with money these bugs translate into real financial loss. That's why you must identify critical sections precisely and use the appropriate synchronization mechanism (a mutex, atomic operations, transactions). If you don't, you'll find yourself waking up at 3 AM to production bugs. Trust me, I've been there.
The key insight I've internalized: perfect code can produce imperfect results when execution order matters. Race conditions force you to think not just about what your code does, but about when it does it relative to other concurrent operations. It's a shift from sequential thinking to temporal thinking.
And honestly, the scariest part is these bugs often appear after deployment, when real users are hammering your system concurrently in ways your tests never imagined. That first production race condition taught me more about concurrency than any textbook ever could. Now I look at every shared state access with suspicion, every async operation with caution. Because I know that somewhere in that innocent-looking code, timing could be waiting to strike again.