Event Sourcing: Persisting the 'Journey', Not Just the Destination
1. The Single Source of Truth
In most systems, the database stores the current state of an entity. If you change a user's address, the old address is overwritten and lost forever.
In Event Sourcing, the primary source of truth is not the state, but the sequence of events that led to that state.
State is merely a derivative: a "left-fold" over the stream of events.
$$ State = f(InitialState, Event_1, Event_2, \ldots, Event_n) $$
This paradigm shift is akin to version control systems like Git. You don't just store the final code; you store the commits (diffs). If you lose the final code, you can always reconstruct it by replaying the commits from the beginning.
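To make the fold concrete, here is a minimal sketch in TypeScript; the account events and state shape are hypothetical, not taken from any particular library:

// Hypothetical domain events for a bank account.
type AccountEvent =
  | { type: 'AccountOpened'; owner: string }
  | { type: 'MoneyDeposited'; amount: number }
  | { type: 'MoneyWithdrawn'; amount: number };

interface AccountState {
  owner: string;
  balance: number;
}

// One step of the left-fold: apply a single event to the state.
function apply(state: AccountState, event: AccountEvent): AccountState {
  switch (event.type) {
    case 'AccountOpened':
      return { owner: event.owner, balance: 0 };
    case 'MoneyDeposited':
      return { ...state, balance: state.balance + event.amount };
    case 'MoneyWithdrawn':
      return { ...state, balance: state.balance - event.amount };
  }
}

const events: AccountEvent[] = [
  { type: 'AccountOpened', owner: 'alice' },
  { type: 'MoneyDeposited', amount: 100 },
  { type: 'MoneyWithdrawn', amount: 30 },
];

// State = f(InitialState, Event_1, ..., Event_n) is literally a reduce.
const state = events.reduce(apply, { owner: '', balance: 0 });
console.log(state); // { owner: 'alice', balance: 70 }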
2. Command Sourcing vs. Event Sourcing
A common confusion is between storing Commands (CreateOrder) and Events (OrderCreated).
- Command: An intent. "Do this". It can be rejected (e.g., insufficient funds).
- Event: A fact. "This has happened". It cannot be rejected because it's already history.
In Event Sourcing, we store Events, not Commands.
Storing commands is called Command Sourcing, and it's dangerous because replaying a command (like PayCreditCard) might have side effects (charging the user twice) or might fail differently depending on the time of replay. Events are safe to replay because they simply state facts.
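A minimal sketch of the difference, with illustrative names: the command handler is the only place where rejection can happen; replaying the resulting event is pure.

// A command is an intent; handling it can fail.
interface WithdrawMoney { type: 'WithdrawMoney'; amount: number }

// An event is a fact; applying it cannot fail.
interface MoneyWithdrawn { type: 'MoneyWithdrawn'; amount: number }

function handleCommand(state: { balance: number }, cmd: WithdrawMoney): MoneyWithdrawn[] {
  if (cmd.amount > state.balance) {
    // Rejection happens here, exactly once, at write time.
    throw new Error('Insufficient funds');
  }
  return [{ type: 'MoneyWithdrawn', amount: cmd.amount }];
}

// Replay is a pure state transition: no side effects, no second charge.
function applyEvent(state: { balance: number }, event: MoneyWithdrawn) {
  return { balance: state.balance - event.amount };
}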
3. The Synergy with CQRS
Event Sourcing is rarely used alone; it is almost always paired with CQRS (Command Query Responsibility Segregation).
Why? Because querying an Event Store is hard.
Imagine trying to ask an Event Store: "Show me all users who live in New York and bought a toaster last month."
To answer this, you would have to replay the events of every single user in the system to determine their current address and purchase history. That is computationally prohibitive for anything beyond a toy dataset.
By applying CQRS:
- The Write Side (Command): Accepts commands, validates them, and appends events to the Event Store (e.g., Kafka, EventStoreDB). Appending to a log is effectively an O(1) write.
- The Read Side (Query): Listens to these events and updates a separate "Read Database" (e.g., ElasticSearch, PostgreSQL). This database is denormalized and optimized specifically for the queries the UI needs.
This separation allows for Polyglot Persistence. You can project the same stream of user events into a Graph DB for social recommendations, a Relational DB for billing, and a Search Engine for text search.
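As a sketch of the read side, a projector might look like this; the subscription mechanism and read-database API are hypothetical:

// Hypothetical read-side projector: consumes events and maintains a
// denormalized user_profiles table shaped for the UI's queries.
type UserEvent =
  | { type: 'UserRegistered'; userId: string; name: string }
  | { type: 'AddressChanged'; userId: string; city: string };

interface ReadDb {
  upsertProfile(userId: string, fields: Record<string, string>): Promise<void>;
}

async function project(event: UserEvent, db: ReadDb): Promise<void> {
  switch (event.type) {
    case 'UserRegistered':
      await db.upsertProfile(event.userId, { name: event.name });
      break;
    case 'AddressChanged':
      // "All users who live in New York" becomes a plain indexed
      // query against the read table, not a full replay.
      await db.upsertProfile(event.userId, { city: event.city });
      break;
  }
}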
4. The Hard Parts: Consistency and Complexity
Event Sourcing is not a silver bullet. It introduces significant complexity.
Eventual Consistency
Since the Read Model is updated asynchronously after the Write Model records the event, there is a lag (milliseconds to seconds). A user might add an item to the cart, refresh the page, and see an empty cart. Dealing with this UI/UX challenge requires techniques like Optimistic UI updates or WebSockets that push updates to the client.
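One possible shape of an optimistic update on the client; this is a framework-agnostic TypeScript sketch, not a prescribed API:

// Optimistic UI: show the change immediately, reconcile if the command fails.
async function addToCart(
  item: string,
  cart: string[],
  sendCommand: (cmd: { type: string; item: string }) => Promise<void>,
): Promise<void> {
  cart.push(item); // 1. update local state right away
  try {
    await sendCommand({ type: 'AddItemToCart', item }); // 2. send the command
  } catch {
    cart.pop(); // 3. roll back if the command is rejected
  }
}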
Event Schema Evolution
Code changes, and business requirements change. The event structure you design today will be obsolete next year.
Since events are immutable and persisted forever, you cannot simply ALTER TABLE.
You must handle schema evolution strategies:
- Multiple Versions: Your code must be able to handle OrderPlaced_v1 and OrderPlaced_v2.
- Upcasting: A middleware layer that transforms old events into the new schema on-the-fly during replay (see the sketch after this list).
- Copy-and-Replace: In extreme cases, creating a new event stream by migrating and transforming the old one (expensive).
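A minimal upcaster sketch, assuming (purely for illustration) that OrderPlaced_v1 stored a single total string and v2 splits it into amount and currency:

interface OrderPlacedV1 { version: 1; orderId: string; total: string } // e.g. "10.00 USD"
interface OrderPlacedV2 { version: 2; orderId: string; amount: number; currency: string }

// Upcaster: transforms old events into the current schema on the fly
// during replay, without rewriting the immutable stored stream.
function upcast(event: OrderPlacedV1 | OrderPlacedV2): OrderPlacedV2 {
  if (event.version === 2) return event;
  const [amount, currency] = event.total.split(' ');
  return { version: 2, orderId: event.orderId, amount: Number(amount), currency };
}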
5. Implementation Challenges
Snapshots
Replaying 1 million events to get the current balance of a bank account is slow.
To mitigate this, we periodically save the calculated state (a Snapshot) to a separate store.
When loading an aggregate, we load the latest snapshot and only replay the events that occurred after that snapshot. This keeps load times roughly constant: they are bounded by the snapshot interval rather than the full history length.
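In sketch form, with hypothetical store interfaces:

interface Snapshot<S> { state: S; version: number }

interface EventStore<E> {
  // Events with version strictly greater than afterVersion.
  loadEvents(aggregateId: string, afterVersion: number): Promise<E[]>;
}

interface SnapshotStore<S> {
  loadLatest(aggregateId: string): Promise<Snapshot<S> | null>;
}

// Load = latest snapshot + fold over only the events recorded after it.
async function loadAggregate<S, E>(
  id: string,
  events: EventStore<E>,
  snapshots: SnapshotStore<S>,
  apply: (state: S, event: E) => S,
  initial: S,
): Promise<S> {
  const snap = await snapshots.loadLatest(id);
  const base = snap ? snap.state : initial;
  const after = snap ? snap.version : 0;
  const tail = await events.loadEvents(id, after);
  return tail.reduce(apply, base);
}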
EventStoreDB
While you can use PostgreSQL or MongoDB as an event store, specialized databases like EventStoreDB exist.
They offer features like:
- Optimistic Concurrency Control: Preventing two users from modifying the same aggregate at the same time using expectedVersion (see the sketch after this list).
- Projections: Built-in JavaScript functions to generate new streams from existing ones.
- Subscriptions: Easy APIs to subscribe to event streams for building Read Models.
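For example, with the official Node client (@eventstore/db-client), a guarded append looks roughly like the sketch below; treat the exact option names as an assumption and verify them against the current docs:

import { EventStoreDBClient, jsonEvent } from '@eventstore/db-client';

const client = EventStoreDBClient.connectionString('esdb://localhost:2113?tls=false');

async function placeOrder(): Promise<void> {
  const event = jsonEvent({
    type: 'OrderPlaced',
    data: { orderId: '42', amount: 100 },
  });
  // Fails if another writer has already appended past revision 3;
  // this is the optimistic concurrency check from the list above.
  await client.appendToStream('order-42', [event], {
    expectedRevision: BigInt(3),
  });
}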
6. Implementing Event Store with PostgreSQL
If you don't want to use a specialized DB, PostgreSQL is a fantastic Event Store thanks to JSONB.
Here is a simplified schema:
CREATE TABLE events (
id BIGSERIAL PRIMARY KEY,
aggregate_id UUID NOT NULL,
aggregate_type VARCHAR(255) NOT NULL,
event_type VARCHAR(255) NOT NULL,
event_data JSONB NOT NULL,
version BIGINT NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
UNIQUE(aggregate_id, version) -- Optimistic Concurrency Control
);
Indexing Best Practices:
- Composite Index:
CREATE INDEX idx_aggregate ON events (aggregate_id, version ASC);
This allows you to fetch events for a specific entity efficiently: SELECT * FROM events WHERE aggregate_id = ? ORDER BY version ASC.
- GIN Index:
CREATE INDEX idx_data ON events USING GIN (event_data);
This allows querying the payload itself, e.g., "find all events where color is red": SELECT * FROM events WHERE event_data @> '{"color": "red"}';
This approach gives you the ACID guarantees of Postgres with the flexibility of a document store.
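To show the UNIQUE(aggregate_id, version) constraint doing the concurrency work, here is a sketch of an append function using node-postgres (pg); the function shape and error mapping are illustrative:

import { Pool } from 'pg';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

async function appendEvent(
  aggregateId: string,
  aggregateType: string,
  eventType: string,
  eventData: object,
  expectedVersion: number, // the version the caller last saw
): Promise<void> {
  try {
    await pool.query(
      `INSERT INTO events (aggregate_id, aggregate_type, event_type, event_data, version)
       VALUES ($1, $2, $3, $4, $5)`,
      [aggregateId, aggregateType, eventType, JSON.stringify(eventData), expectedVersion + 1],
    );
  } catch (err) {
    // 23505 = unique_violation: someone else already wrote this version.
    if ((err as { code?: string }).code === '23505') {
      throw new Error('Concurrency conflict: aggregate was modified concurrently');
    }
    throw err;
  }
}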
7. Case Study: LMAX Exchange (High Performance)
The most famous example of Event Sourcing is LMAX Exchange, a high-frequency trading platform in London.
They needed to process 6 million orders per second with less than 100-microsecond latency.
Traditional ACID databases, with their lock-based concurrency control, were too slow.
They invented the Disruptor pattern (an in-memory ring buffer) and used Event Sourcing purely in-memory.
- All incoming orders are serialized into an event stream.
- A single-threaded processor reads these events and updates the in-memory state.
- Since there is only one thread, there are no locks, no contention, and no context switching.
- If the server crashes, they just replay the event log from disk to rebuild the memory state in seconds.
This proved that Event Sourcing isn't just for complex business logic; it's also a pattern for extreme performance.
8. When NOT to Use Event Sourcing
Given the complexity, knowing when not to use it is as important as knowing when to use it.
Any architecture decision is a trade-off. Event Sourcing pays off only when the complexity of the domain model is high.
Do NOT use if:
- Simple CRUD App: If your app is mostly forms over data, Event Sourcing is 10x over-engineering.
- No need for Audit: If you don't care about history, don't pay the price of storing it.
- Strict Read Consistency Requirements: If you need reads that instantly reflect writes (Read-Your-Writes) without extra machinery, standard relational DBs are better.
Use if:
- Audit is critical: Banking, Medical, Legal logs.
- Complex Business Logic: Workflow engines, Order processing.
- Analytics / Deep Insight: You want to derive new insights from old behavior data later.
9. Summary
Event Sourcing is a powerful pattern for complex domains where auditability, history, and flexibility are paramount. It decouples the recording of "what happened" from the interpretation of those facts. However, the operational overhead, learning curve, and the challenge of eventual consistency mean it should be adopted with caution. Don't use it for a simple blog or a TODO app; use it for accounting systems, inventory management, or complex collaborative software where the history of interactions is as valuable as the current state.