STUB

Two-Phase Commit vs Sagas

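The pattern: two answers to “how do you keep a transaction that spans multiple services consistent?” 2PC: a coordinator asks every participant “can you commit?” (the prepare phase) and sends “commit” only if all of them vote yes; any no vote or timeout during voting aborts everyone. The outcome is atomic, but a participant that has voted yes must hold its locks until it hears the decision, so an unreachable coordinator leaves it blocked. Sagas: each step is a local transaction paired with a compensating action; if step 5 fails, run the compensations for steps 4→1 in reverse order. The result is eventually consistent, and no step holds locks while waiting on another service.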
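The saga half of the pattern can be sketched in a few lines. This is a minimal in-process illustration, not a production orchestrator: `SagaStep`, `run_saga`, and the three example steps are hypothetical names, and the sketch assumes a failed step has already rolled back its own local transaction, so only the earlier steps need compensating.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]      # the step's local transaction
    compensate: Callable[[], None]  # semantic undo of that transaction


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            # Assumption: the failed step left no partial local state behind,
            # so only previously completed steps are compensated.
            for done in reversed(completed):
                done.compensate()
            return False
        completed.append(step)
    return True


log: List[str] = []


def fail_payment() -> None:
    raise RuntimeError("payment declined")


steps = [
    SagaStep("reserve", lambda: log.append("reserve"), lambda: log.append("undo reserve")),
    SagaStep("ship", lambda: log.append("ship"), lambda: log.append("undo ship")),
    SagaStep("charge", fail_payment, lambda: log.append("undo charge")),
]
ok = run_saga(steps)
```

Here the third step fails, so the runner undoes the second and then the first. Between "ship" and "undo ship" other services can observe the intermediate state, which is the inconsistency window a saga accepts.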

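The trade-off: atomicity vs. availability under failure. 2PC gives strong atomicity but is fragile under failure (the famous “blocking protocol”): if the coordinator crashes after collecting votes, prepared participants stay in doubt and keep their locks until it recovers. Sagas tolerate failures but require designing compensations and accepting a window in which other transactions can see intermediate state. Modern systems trend toward sagas + idempotency because a saga keeps making progress through the crashes and partitions that stall 2PC; fault-tolerant commit protocols built on consensus exist, but they add coordination cost.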
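The blocking behaviour is easier to see in a toy model. The sketch below is an in-memory simulation under stated assumptions: `Participant`, `State`, `two_phase_commit`, and the `crash_after_prepare` flag are hypothetical names invented for illustration, and there is no networking, logging to disk, or recovery.

```python
from enum import Enum
from typing import List, Optional


class State(Enum):
    INIT = "init"
    PREPARED = "prepared"    # voted yes; holding locks until the decision arrives
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = State.INIT

    def prepare(self) -> bool:
        # Phase 1: a yes vote is a promise the participant cannot revoke alone.
        if self.can_commit:
            self.state = State.PREPARED
            return True
        self.state = State.ABORTED
        return False

    def finish(self, commit: bool) -> None:
        # Phase 2: apply the coordinator's decision.
        if self.state is State.PREPARED:
            self.state = State.COMMITTED if commit else State.ABORTED


def two_phase_commit(
    participants: List[Participant], crash_after_prepare: bool = False
) -> Optional[bool]:
    votes = [p.prepare() for p in participants]
    if crash_after_prepare:
        # Simulated coordinator failure: no decision is ever delivered.
        return None
    decision = all(votes)
    for p in participants:
        p.finish(decision)
    return decision


a, b = Participant("orders"), Participant("payments")
happy = two_phase_commit([a, b])

c, d = Participant("orders"), Participant("payments", can_commit=False)
vetoed = two_phase_commit([c, d])

e, f = Participant("orders"), Participant("payments")
stuck = two_phase_commit([e, f], crash_after_prepare=True)
```

In the third run both participants remain in `PREPARED`: neither can commit or abort on its own without risking disagreement with the other, which is the in-doubt state that makes 2PC a blocking protocol.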

[Deepen Year 2 Phase 8 + revisit when designing cross-service workflows in Year 4-5.]

  • Idempotency — saga steps and their compensations must be idempotent or replay corrupts state.
  • Consensus — 2PC is consensus on a single decision; sagas opt out of consensus entirely.
  • Delivery semantics — choreographed sagas ride on at-least-once event delivery; orchestrated sagas hide that with a workflow engine.
  • Eventual consistency — the window between saga step and compensation is exactly this.
  • Append-only log — choreographed sagas use the log as the coordination substrate.
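The idempotency and delivery-semantics points above can be made concrete with a deduplicating handler. This is a sketch under assumptions: `InventoryService`, its methods, and the message-id format are hypothetical, and the seen-set lives in memory here, whereas a real service would need to record it in the same local transaction as the stock change.

```python
from typing import Set


class InventoryService:
    """Hypothetical saga participant that deduplicates on a per-message id."""

    def __init__(self, stock: int) -> None:
        self.stock = stock
        self._seen: Set[str] = set()  # ids of messages already applied

    def reserve(self, message_id: str, qty: int) -> None:
        # Saga step: a redelivered message is recognised and ignored.
        if message_id in self._seen:
            return
        if qty > self.stock:
            raise ValueError("insufficient stock")
        self.stock -= qty
        self._seen.add(message_id)

    def release(self, message_id: str, qty: int) -> None:
        # Compensation: must be just as safe to replay as the step itself.
        if message_id in self._seen:
            return
        self.stock += qty
        self._seen.add(message_id)


svc = InventoryService(stock=10)

# At-least-once delivery: the same reserve event arrives three times.
for _ in range(3):
    svc.reserve("order-42:reserve", 4)
stock_after_reserve = svc.stock

# The compensation is also delivered more than once.
for _ in range(2):
    svc.release("order-42:release", 4)
stock_after_release = svc.stock
```

Without the id check, the three deliveries would subtract 12 from a stock of 10, and the duplicated compensation would return more stock than was ever reserved.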

First touched in Year 2 Phase 8; revisited in Year 4 Phase 21 and Year 5 Phase 27 when cross-service workflows show up inside services/llm-gateway/ and services/aiops/.