
Race Conditions: Bugs That Depend on Timing
The code is perfect, but money occasionally disappears. The worst bugs are the ones created by timing.


I had exactly 1 million won in my bank account. I went to ATM A and deposited 100,000 won. At the exact same time, my friend used ATM B to deposit another 100,000 won into my account. Simple math says I should have 1.2 million won now, right?
When I checked my balance, it showed 1.1 million won. 100,000 won just vanished into thin air.
I called the bank. "System error," they said. "Let me check again," they said. "This is impossible," they insisted. I looked at the code myself - it was perfect. I checked the logs - both deposits were processed successfully. Yet the money was gone.
This is the worst kind of bug because it doesn't reproduce consistently. It works fine in testing. It works 99 times out of 100. It only breaks when the timing is just right. This is what we call a race condition.
When I first encountered this problem, I was completely lost. The code was obviously correct. How can balance = balance + 10 possibly fail? It's just addition!
The answer lies in the fact that this single line doesn't execute atomically. From the CPU's perspective, this innocent-looking line of code actually breaks down into three distinct steps:
// What you see: one line
balance = balance + 10;
// What the CPU actually does (machine code level)
// 1. READ: Load balance value from memory into register (100)
// 2. MODIFY: Add 10 to the register value (110)
// 3. WRITE: Store register value back to memory (110)
This three-step sequence is called the Read-Modify-Write pattern. The problem emerges when two threads (or processes) try to execute this pattern simultaneously. The execution order can get interleaved in dangerous ways:
Time | Thread A                 | Thread B
  1  | READ balance -> 100      | -
  2  | -                        | READ balance -> 100 (A hasn't written yet!)
  3  | MODIFY: 100 + 10 -> 110  | -
  4  | WRITE 110                | -
  5  | -                        | MODIFY: 100 + 10 -> 110
  6  | -                        | WRITE 110 (overwrites A's result!)
Both threads read balance = 100. Each adds 10 to get 110. Both write 110 to memory. The result is 110. We should have added 20, but only 10 got added. The other 10 evaporated.
In a bank account, that's money disappearing. In inventory management, that's stock counts getting corrupted. The code itself is flawless, but the outcome depends on execution timing. That's the essence of a race condition.
The concept really clicked for me when someone explained it using a bathroom analogy.
There's a single bathroom. The door has no lock. Two people (A and B) approach simultaneously. Both check the door (READ). Both see it's unlocked. Both think "It's free!" Both enter (WRITE). Collision. Disaster.
The solution is simple: install a lock. Person A enters and locks the door. Person B arrives, sees it's locked, and waits. Person A finishes and unlocks the door. Person B enters. This is exactly what a Mutex (Mutual Exclusion) does.
In code, it looks like this:
// Problematic code
async function deposit(amount) {
  const current = await getBalance(); // READ
  const updated = current + amount;   // MODIFY
  await setBalance(updated);          // WRITE
}
// If two deposits run concurrently, race condition occurs
The code region where Read-Modify-Write happens is called a critical section. This section must allow only one thread at a time. Ensuring this is called mutual exclusion.
JavaScript doesn't have native locks, but conceptually it would look like this:
// Protected with a Mutex
const lock = new Mutex();
async function deposit(amount) {
  await lock.acquire(); // Acquire the lock (other callers wait here)
  try {
    // Critical Section begins
    const current = await getBalance();
    const updated = current + amount;
    await setBalance(updated);
    // Critical Section ends
  } finally {
    lock.release(); // Always release the lock, even if setBalance throws
  }
}
Now even if two deposit calls run simultaneously, only one can be in the critical section at any moment. The first one in must finish before the second one can enter.
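JavaScript has no built-in Mutex class, so if you want to play with the idea, here is one minimal sketch built on Promises. The class name and its acquire/release API simply mirror the conceptual code above; it only covers async tasks inside a single process, not real shared-memory threads.
// Minimal promise-based Mutex sketch (illustrative, not a standard library class)
class Mutex {
  constructor() {
    this._locked = false;
    this._waiters = []; // resolve functions of callers waiting for the lock
  }
  async acquire() {
    if (!this._locked) {
      this._locked = true;
      return;
    }
    // Wait until a release() hands the lock to us
    await new Promise(resolve => this._waiters.push(resolve));
  }
  release() {
    const next = this._waiters.shift();
    if (next) {
      next(); // hand the lock directly to the next waiter
    } else {
      this._locked = false;
    }
  }
}
With something like this in place, the deposit example above works as written: the second caller's acquire() doesn't resolve until the first caller calls release().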
Another approach is using atomic operations. "Atomic" means "indivisible" - it cannot be split into smaller parts. The idea is to make Read-Modify-Write happen as a single, uninterruptible operation.
// Atomic update in database
await db.query('UPDATE accounts SET balance = balance + ? WHERE id = ?', [amount, accountId]);
The database handles this SQL atomically. No other query can interfere midway. Another technique is using CPU-level instructions like Compare-And-Swap (CAS).
// CAS concept (pseudocode)
function compareAndSwap(addr, expected, newValue) {
  // This entire operation executes atomically
  if (memory[addr] === expected) {
    memory[addr] = newValue;
    return true;
  }
  return false;
}
// Usage example
while (true) {
  const current = balance;
  const updated = current + 10;
  if (compareAndSwap(&balance, current, updated)) {
    break; // Success
  }
  // Failed - another thread modified balance first, retry
}
CAS says "if the value is still what I read, update it; otherwise fail." We loop until it succeeds. This approach is called lock-free programming.
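JavaScript does expose real CAS through the Atomics API on a SharedArrayBuffer (memory that worker threads can share). Here is a sketch of the same retry loop with the balance stored as a 32-bit integer at index 0; the setup is illustrative, not production banking code.
// Real CAS in JavaScript via Atomics on shared memory
const buffer = new SharedArrayBuffer(4); // room for one Int32
const account = new Int32Array(buffer);
account[0] = 100; // starting balance

function atomicDeposit(amount) {
  while (true) {
    const current = Atomics.load(account, 0);
    const updated = current + amount;
    // compareExchange returns the value that was actually in memory;
    // if it equals `current`, our write went through atomically.
    if (Atomics.compareExchange(account, 0, current, updated) === current) {
      return updated;
    }
    // Another thread changed the value first - loop and retry
  }
}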
Race conditions aren't limited to application code. Databases face them too.
Lost Update Problem: Two transactions read the same row, both modify it, both write back. One modification disappears. Just like our bank account example.
Dirty Read Problem: Transaction A modifies data but hasn't committed yet. Transaction B reads that uncommitted data. If A rolls back, B has read data that never existed.
To prevent these issues, databases provide transaction isolation levels. Setting it to SERIALIZABLE makes transactions behave as if they executed sequentially. Race conditions are completely eliminated. The tradeoff is performance.
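As a rough sketch of what that looks like from application code, here is the deposit wrapped in a SERIALIZABLE transaction. I'm using node-postgres (pg) as the example driver; the pool object and the accounts table are assumptions carried over from the earlier snippets.
// Deposit inside a SERIALIZABLE transaction (sketch, assuming the `pg` driver)
async function depositSerializable(pool, accountId, amount) {
  const client = await pool.connect();
  try {
    await client.query('BEGIN ISOLATION LEVEL SERIALIZABLE');
    await client.query(
      'UPDATE accounts SET balance = balance + $1 WHERE id = $2',
      [amount, accountId]
    );
    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err; // serialization failures are typically caught and retried by the caller
  } finally {
    client.release();
  }
}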
Another approach is choosing between pessimistic locking (take a row lock up front with SELECT ... FOR UPDATE so nobody else can touch the row) and optimistic locking (assume there is no conflict and detect one at write time with a version column):
-- Optimistic locking example
UPDATE accounts
SET balance = 110, version = version + 1
WHERE id = 123 AND version = 5;
-- If version isn't 5 (another transaction modified it), returns 0 rows affected
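The application-side retry loop around that UPDATE might look roughly like this. db.query is the same hypothetical helper from the earlier atomic-update example, and the affectedRows field on the result is an assumption about the driver:
// Optimistic locking retry loop (sketch; db.query and affectedRows are assumed helpers)
async function depositOptimistic(accountId, amount) {
  while (true) {
    const row = await db.query(
      'SELECT balance, version FROM accounts WHERE id = ?', [accountId]
    );
    const result = await db.query(
      'UPDATE accounts SET balance = ?, version = version + 1 WHERE id = ? AND version = ?',
      [row.balance + amount, accountId, row.version]
    );
    if (result.affectedRows === 1) return; // nobody changed the row in between
    // The version moved on - another transaction won this round, so re-read and retry
  }
}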
Another race condition variant is the TOCTOU bug. The state changes between the "time of check" and the "time of use."
// TOCTOU bug example
async function deleteFile(filePath) {
  if (await fileExists(filePath)) { // Time Of Check
    // ... time passes ...
    // Another process might delete the file!
    await fs.unlink(filePath); // Time Of Use
  }
}
The file existed when we checked, but might be gone when we use it. The solution is to make check and use atomic.
// Fix: skip the check - just delete and handle the error
// (assumes fs is require('fs/promises'))
try {
  await fs.unlink(filePath);
} catch (err) {
  if (err.code !== 'ENOENT') throw err; // ENOENT means the file was already gone - ignore it; rethrow anything else
}
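The same idea covers the mirror-image case: creating a file only if it doesn't already exist. Instead of an exists() check followed by a write, Node's 'wx' open flag makes check-and-create a single atomic operation. The lock-file use below is just an illustration:
const fs = require('fs/promises');

// Atomically "check that it doesn't exist and create it"
async function createLockFile(filePath) {
  try {
    const handle = await fs.open(filePath, 'wx'); // 'wx' fails with EEXIST if the file already exists
    await handle.close();
    return true; // we created the file
  } catch (err) {
    if (err.code === 'EEXIST') return false; // someone else created it first
    throw err;
  }
}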
Frontend code has race conditions too. A common one in React involves useState:
const [count, setCount] = useState(0);
// Buggy code
function increment() {
setCount(count + 1); // Reads previous value and adds 1
}
// What if button is clicked twice rapidly?
// Both increments run almost simultaneously
// Both read count = 0
// Both call setCount(1)
// Result: count = 1 (not 2!)
The solution is functional updates:
function increment() {
  setCount(prev => prev + 1); // Receives previous value as function parameter
}
// React guarantees order internally
// First increment: prev = 0, result = 1
// Second increment: prev = 1, result = 2
Race condition bugs are incredibly frustrating. They don't reproduce consistently, don't show up in logs, and only appear with specific timing. Here are techniques I use:
Thread Sanitizer: In C/C++, compiling with the -fsanitize=thread flag makes the program detect data races at runtime.
Log timestamps and thread IDs: Helps track which thread did what and when.
Artificially inject delays: Adding a sleep() in suspicious areas manipulates the timing so the bug reproduces on demand (see the sketch after this list).
Use immutable data structures: If you never modify data (always create new instances), race conditions can't happen. This is core to functional programming.
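To make the delay-injection trick concrete, here is a small self-contained Node.js sketch that reproduces the lost update on demand by widening the window between READ and WRITE; all the names are illustrative:
let balance = 100;

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

async function racyDeposit(amount) {
  const current = balance;    // READ
  await sleep(50);            // artificial delay widens the race window
  balance = current + amount; // WRITE - may overwrite a concurrent deposit
}

async function main() {
  await Promise.all([racyDeposit(10), racyDeposit(10)]);
  console.log(balance);       // prints 110, not 120 - one deposit was lost
}

main();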
Race conditions aren't about buggy code in the usual sense - they're about unlucky interleavings of correct code. Each line is right, but concurrent execution with unfortunate timing breaks the result. Understanding this concept connects the dots: why locks are needed in multithreading, why transactions are essential, and why immutability matters in async programming and distributed systems.
As the bank account example shows, in systems that deal with money these bugs translate into real financial loss. That's why you must identify critical sections precisely and use the appropriate synchronization mechanism (a mutex, atomic operations, transactions). If you don't, you'll find yourself waking up at 3 AM to production bugs. Trust me, I've been there.
The key insight I've internalized: perfect code can produce imperfect results when execution order matters. Race conditions force you to think not just about what your code does, but about when it does it relative to other concurrent operations. It's a shift from sequential thinking to temporal thinking.
And honestly, the scariest part is these bugs often appear after deployment, when real users are hammering your system concurrently in ways your tests never imagined. That first production race condition taught me more about concurrency than any textbook ever could. Now I look at every shared state access with suspicion, every async operation with caution. Because I know that somewhere in that innocent-looking code, timing could be waiting to strike again.