Event Sourcing: CDC vs. Outbox Pattern

Key Points
  • CDC captures changes directly from the database transaction log, avoiding application code modifications but creating dependency on specialized tools like Debezium.
  • The Outbox Pattern ensures strong consistency by saving the operation and its event in the same transaction, eliminating inconsistencies between the two at the cost of extra transactional overhead.
  • The choice between CDC and Outbox depends on context: legacy systems favor CDC, while new applications with strong consistency requirements benefit from Outbox.
  • Hybrid implementations can use Outbox for critical operations and CDC for auxiliary data, combining benefits of both approaches.
  • The real challenge lies in capturing business context: CDC loses information about operation intent, while Outbox allows enriching events with contextual data.

Event sourcing is an architectural pattern that captures all state changes of an application as a sequence of immutable events. Instead of storing only the current state, we maintain a complete log of all changes that have occurred over time.

When implementing event sourcing, one of the main architectural decisions we face is how to reliably capture and publish these events. Two approaches stand out: Change Data Capture (CDC) and the Outbox Pattern.

Change Data Capture (CDC): Intelligent Monitoring

CDC monitors database changes and transforms them into events. It's like having a silent observer that records everything that happens, without interfering with the normal operation of the application.

How CDC Works

  1. Monitoring: An external tool monitors the database transaction log
  2. Capture: Detects committed changes (INSERT, UPDATE, DELETE) in near real time
  3. Transformation: Converts these changes into structured events
  4. Publishing: Sends events to a message broker (Kafka, RabbitMQ, etc.)
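
To make the transformation step concrete, here is a minimal sketch of turning one change record into a structured event. The record shape is a simplified assumption loosely modeled on the Debezium change envelope ("op", "before", "after", "source"); the ChangeEvent class and the to_event function are hypothetical names for illustration, and only create, update and delete operations are handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    table: str
    kind: str                  # "created", "updated" or "deleted"
    before: Optional[dict]
    after: Optional[dict]
    changed_fields: tuple


# Assumed one-letter operation codes, as used in Debezium's "op" field.
OP_TO_KIND = {"c": "created", "u": "updated", "d": "deleted"}


def to_event(record: dict) -> ChangeEvent:
    """Turn one simplified change record into a structured event."""
    before, after = record.get("before"), record.get("after")
    changed = ()
    if before and after:
        # Only updates carry both row images, so only they get a field diff.
        changed = tuple(sorted(k for k in after if before.get(k) != after[k]))
    return ChangeEvent(
        table=record["source"]["table"],
        kind=OP_TO_KIND[record["op"]],
        before=before,
        after=after,
        changed_fields=changed,
    )


record = {
    "op": "u",
    "source": {"table": "orders"},
    "before": {"id": 7, "status": "pending", "total": 120},
    "after": {"id": 7, "status": "paid", "total": 120},
}
event = to_event(record)
print(event.kind, event.changed_fields)  # updated ('status',)
```

In a real pipeline the resulting event would then be handed to the broker client; that part is left out here because it depends on the tool and broker in use.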

CDC Advantages

✅ Non-Intrusive Approach

  • Doesn't require application code modifications
  • Works with legacy systems without refactoring
  • Low coupling between event capture and business logic

✅ Comprehensive Capture

  • Monitors all changes, even those made directly to the database
  • Doesn't lose events due to application failures, because changes are read from the committed transaction log
  • Ensures no change goes unnoticed

✅ Performance

  • Minimal overhead on the main application
  • Asynchronous event processing
  • Doesn't impact critical transactions

CDC Disadvantages

❌ Infrastructure Complexity

  • Requires specialized tools (Debezium, Maxwell, etc.)
  • More complex configuration and maintenance
  • Dependency on external tools

❌ Eventual Consistency

  • Small delay between change and event publication
  • Doesn't guarantee a global order in high concurrency scenarios, especially once events are spread across several topics or partitions
  • Possible duplications in failure scenarios
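
Because duplicates can show up after a failure, consumers are usually written to be idempotent. The sketch below is one possible way to do that, assuming every event carries a unique "event_id" (an assumption, not something CDC tools guarantee under that name). The IdempotentConsumer class is hypothetical, and the in-memory set stands in for what would need to be a durable store.

```python
class IdempotentConsumer:
    """Applies each event at most once, keyed by a unique event id."""

    def __init__(self):
        self.seen_ids = set()   # stand-in for a durable deduplication store
        self.balance = 0

    def handle(self, event: dict) -> bool:
        if event["event_id"] in self.seen_ids:
            return False        # duplicate delivery: ignore it
        self.balance += event["amount"]
        self.seen_ids.add(event["event_id"])
        return True


consumer = IdempotentConsumer()
deliveries = [
    {"event_id": "e1", "amount": 50},
    {"event_id": "e2", "amount": 25},
    {"event_id": "e1", "amount": 50},   # redelivered after a simulated failure
]
applied = [consumer.handle(e) for e in deliveries]
print(applied, consumer.balance)  # [True, True, False] 75
```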

❌ Context Limitations

  • Events based only on data changes
  • Loss of business context
  • Difficulty capturing operation intent
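
A small illustration of the context problem: two different business operations can produce exactly the same row change, so a consumer that only sees the diff cannot tell them apart. The row_diff helper and the order data are hypothetical and exist only for this example.

```python
def row_diff(before: dict, after: dict) -> dict:
    """Return the columns whose values changed, as {column: (old, new)}."""
    return {k: (before.get(k), v) for k, v in after.items() if before.get(k) != v}


order = {"id": 7, "status": "pending"}

# Two different business operations that end in the same UPDATE statement.
cancelled_by_customer = {**order, "status": "cancelled"}
cancelled_by_fraud_check = {**order, "status": "cancelled"}

diff_a = row_diff(order, cancelled_by_customer)
diff_b = row_diff(order, cancelled_by_fraud_check)
print(diff_a)            # {'status': ('pending', 'cancelled')}
print(diff_a == diff_b)  # True: the intent is not recoverable from the diff
```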

Outbox Pattern: Full Control in the Application

The Outbox Pattern implements event capture directly in the application, using a special table (outbox) to store events that will be published later.

How Outbox Pattern Works

  1. Unified Transaction: Within a database transaction, the application:
    • Executes the business operation
    • Inserts the corresponding event in the outbox table
  2. Publishing: A separate process reads the outbox table and publishes events
  3. Cleanup: Removes successfully published events
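
The unified transaction in step 1 can be sketched with Python's built-in sqlite3 module. The table layout, the place_order function and the "OrderPlaced" event with its "channel" field are assumptions made for the example; a real system would use its own schema and database. The point is only that the business write and the outbox write commit or roll back together.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        event_type TEXT NOT NULL,
        payload TEXT NOT NULL
    );
""")


def place_order(order_id: int, fail: bool = False) -> None:
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'placed')", (order_id,))
        if fail:
            raise RuntimeError("simulated failure before the event is written")
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"order_id": order_id, "channel": "web"})),
        )


place_order(1)
try:
    place_order(2, fail=True)   # neither the order nor an event is persisted
except RuntimeError:
    pass

orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
events = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
print(orders, events)  # 1 1
```

The failed call leaves no trace in either table, which is the guarantee the pattern is built around: there is never an order without its event, nor an event without its order.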

Outbox Pattern Advantages

✅ Strong Consistency

  • Operation and event are saved in the same transaction
  • Ensures events are generated only for successful operations
  • Event order follows the outbox insertion order, as long as the publisher reads rows sequentially

✅ Full Control

  • Application defines exactly which events to generate
  • Rich business context in events
  • Flexibility to enrich events with additional information

✅ Conceptual Simplicity

  • Straightforward pattern to implement
  • Doesn't require complex external tools
  • Easy to test and debug

Outbox Pattern Disadvantages

❌ Code Modification

  • Requires changes to existing application
  • Coupling between business logic and event generation
  • More code to maintain

❌ Transaction Overhead

  • Each business operation includes writing to outbox table
  • Possible impact on transaction performance
  • Increases transaction size

❌ Publishing Complexity

  • Need to implement or use a reliable event publisher
  • Retry and fallback management
  • Duplication risks if not properly implemented
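
A publisher for the outbox table can be sketched as a polling loop that sends each row and deletes it only after a successful send. Everything named here (the FlakyBroker class, the relay_once function, the single-column payload) is hypothetical and simplified; a production relay would also need locking, backoff and monitoring.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)")
with conn:
    conn.executemany("INSERT INTO outbox (payload) VALUES (?)", [("a",), ("b",), ("c",)])


class FlakyBroker:
    """Hypothetical broker stand-in that fails on its second send call."""

    def __init__(self):
        self.calls = 0
        self.delivered = []

    def send(self, payload: str) -> None:
        self.calls += 1
        if self.calls == 2:
            raise ConnectionError("simulated broker outage")
        self.delivered.append(payload)


def relay_once(broker) -> int:
    """Publish pending rows in id order; stop at the first failure."""
    rows = conn.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
    published = 0
    for row_id, payload in rows:
        try:
            broker.send(payload)
        except ConnectionError:
            break  # keep the row so the next poll retries it
        with conn:
            conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
        published += 1
    return published


broker = FlakyBroker()
first = relay_once(broker)
second = relay_once(broker)
remaining = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
print(first, second, broker.delivered, remaining)  # 1 2 ['a', 'b', 'c'] 0
```

Note that the delete happens after the send. If the process crashed between the two, the row would be sent again on the next poll, so this design gives at-least-once delivery and relies on consumers tolerating duplicates.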

Practical Comparison: When to Use Each Approach

Use CDC when:

  • Legacy system: You have an existing application that can't be easily modified
  • Multiple applications: Several applications write to the same database
  • Performance priority: You can't afford extra write overhead inside business transactions
  • Strong infrastructure team: Your team has the expertise to configure and maintain CDC tools

Use Outbox Pattern when:

  • Control is important: You want full control over which events are generated
  • Rich context: Events need information not in the database
  • Strong consistency: You can't accept any inconsistency between operation and event
  • New application: You're developing a system from scratch

Hybrid Implementation: Best of Both Worlds

In many cases, you can combine both approaches:

┌─────────────────┐    ┌─────────────────┐
│   Critical      │    │   Auxiliary     │
│   Operations    │    │   Changes       │
│                 │    │                 │
│ Outbox Pattern  │    │      CDC        │
└─────────────────┘    └─────────────────┘
         │                       │
         └───────┬───────────────┘
                 │
         ┌─────────────────┐
         │  Unified Event  │
         │     Stream      │
         └─────────────────┘

Hybrid Strategy

  1. Outbox for critical events: Use Outbox Pattern for important business operations where consistency is fundamental
  2. CDC for auxiliary data: Use CDC to capture changes in support tables, logs, auditing
  3. Unified stream: Combine both streams into a single ordered event stream
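
As a rough sketch of step 3, two already-ordered streams can be merged by a shared ordering key. The "ts" field and the event names are assumptions for the example. In practice, timestamps from different producers are not a reliable global order, so a real system would need an agreed sequencing scheme.

```python
import heapq
from operator import itemgetter

# Each input stream is assumed to be sorted by "ts" already.
outbox_events = [
    {"ts": 1, "source": "outbox", "type": "OrderPlaced"},
    {"ts": 4, "source": "outbox", "type": "OrderPaid"},
]
cdc_events = [
    {"ts": 2, "source": "cdc", "type": "audit_log.inserted"},
    {"ts": 3, "source": "cdc", "type": "customer_prefs.updated"},
]

# heapq.merge lazily interleaves sorted inputs without re-sorting everything.
unified = list(heapq.merge(outbox_events, cdc_events, key=itemgetter("ts")))
print([e["source"] for e in unified])  # ['outbox', 'cdc', 'cdc', 'outbox']
```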

Tools and Technologies

For CDC:

  • Debezium: Open-source platform for CDC with multiple database support
  • AWS DMS: AWS managed service for data migration and replication
  • Maxwell: CDC for MySQL focused on simplicity
  • Kafka Connect: Connectors for CDC integration with Apache Kafka

For Outbox Pattern:

  • Transactional Outbox: Implementations in frameworks like Spring Boot
  • Axon Framework: Framework for CQRS/Event Sourcing with integrated Outbox
  • NServiceBus: .NET platform with native Outbox Pattern support

Performance Considerations

CDC Performance:

  • Pros: Low impact on main application
  • Cons: Possible database overhead for log parsing

Outbox Performance:

  • Pros: Fine control over when and how events are generated
  • Cons: Additional overhead in each business transaction

Conclusion: Choosing the Right Approach

There's no single answer. The choice between CDC and Outbox Pattern depends on your specific context:

For legacy systems with multiple applications: CDC offers a non-intrusive solution that can be implemented without major refactoring.

For new systems with strict consistency requirements: The Outbox Pattern provides full control and ensures events are generated only for successful operations.

For complex systems: A hybrid approach can leverage the benefits of both patterns.

Decision Principles:

  1. Evaluate current context: How much can you modify the existing application?
  2. Consider consistency requirements: Can you accept eventual consistency?
  3. Analyze team capability: Which approach can your team better implement and maintain?
  4. Think about evolution: How will the solution scale and evolve over time?

Event sourcing is a journey, not a destination. Start with the approach that makes the most sense for your current context and evolve as needed. The important thing is to reliably and usefully capture the events that represent your business history.

