Lambda and Kappa Architectures
The pattern: two answers to “how do I serve fast and accurate analytics over events?” Lambda runs a streaming (speed) layer that is low-latency but approximate alongside a batch layer that is slow but accurate, and merges their results at query time. Kappa makes the stream the only path: “batch” is just a slow replay of the retained log through the same streaming code. One codebase, one source of truth.
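A minimal sketch of Lambda's query-time merge, assuming a simple per-key count as the analytic. The event shape, the `BATCH_CUTOFF` timestamp, and all function names are hypothetical, chosen only for illustration: the batch view covers everything up to the last batch run, the speed view covers what arrived since, and a query adds the two.

```python
from collections import Counter

# Hypothetical events: (timestamp, key). In a real system these come from the event log.
EVENTS = [(1, "a"), (2, "b"), (3, "a"), (4, "a"), (5, "b")]


def batch_view(events, cutoff):
    """Accurate counts over events with timestamp <= cutoff (the last batch run)."""
    return Counter(key for ts, key in events if ts <= cutoff)


def speed_view(events, cutoff):
    """Incremental counts over events that arrived after the last batch run."""
    return Counter(key for ts, key in events if ts > cutoff)


def query(key, batch, speed):
    """Query-time merge: the batch result plus the speed layer's delta."""
    return batch[key] + speed[key]


BATCH_CUTOFF = 3  # assumption: the last batch job processed everything up to ts=3
batch = batch_view(EVENTS, BATCH_CUTOFF)
speed = speed_view(EVENTS, BATCH_CUTOFF)

merged_a = query("a", batch, speed)
merged_b = query("b", batch, speed)
print(merged_a, merged_b)
```

Here both views happen to use the same counting logic. In a real Lambda deployment they are usually separate implementations, which is where drift comes from.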
The trade-off: complexity vs. flexibility. Lambda has two codebases (stream and batch), and drift between them is the operational nightmare: the same logic implemented twice can silently disagree. Kappa has one codebase, but it requires the broker (Kafka) to retain enough history for replay, and the stream code must be able to reproduce everything the batch job would compute. Modern systems trend toward Kappa; Lambda persists where streaming reprocessing isn’t viable.
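Kappa's “one codebase” claim can be sketched as follows, assuming an in-memory list stands in for the retained broker log and the list index stands in for the offset. `apply_event` and `consume` are hypothetical names, not a broker API. The same function serves the live tail and a full replay from offset 0.

```python
from collections import defaultdict

# Hypothetical retained log; the list index plays the role of the offset.
LOG = [
    {"user": "u1", "amount": 10},
    {"user": "u2", "amount": 5},
    {"user": "u1", "amount": 7},
]


def apply_event(view, event):
    """The single processing function, used for live traffic and for replay."""
    view[event["user"]] += event["amount"]


def consume(log, from_offset, view):
    """Apply every event from from_offset to the end; return the next offset to read."""
    for event in log[from_offset:]:
        apply_event(view, event)
    return len(log)


# Live path: the view follows the tail of the log.
live_view = defaultdict(int)
next_offset = consume(LOG, 0, live_view)

# A new event arrives and the live consumer picks it up incrementally.
LOG.append({"user": "u2", "amount": 1})
next_offset = consume(LOG, next_offset, live_view)

# "Batch" in Kappa: rebuild a fresh view by replaying from offset 0 with the same code.
replayed_view = defaultdict(int)
consume(LOG, 0, replayed_view)

print(dict(live_view) == dict(replayed_view))
```

The replay only works because the log still holds offset 0. If retention had dropped the early events, the rebuilt view would be missing them, which is the retention requirement stated above.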
Deepens in Year 3 Phase 17: Batch Processing, once both halves exist on abukix.commits. Prerequisites: the streaming side from Year 3 Phase 16: Stream Processing and the batch side from Phase 17 itself. The architecture only makes sense after both have been operated independently.
Related patterns
- stream-processing: the speed layer in Lambda; the only layer in Kappa.
- batch-processing: the accuracy layer in Lambda; the replay path in Kappa.
- append-only-log: Kappa is only viable when the log retains enough history to replay.
- materialized-views: both architectures end at a queryable view; the question is how it gets maintained.
- idempotency: re-running stream replay over the same events must converge to the same result.
- delivery-semantics: replay correctness depends on either at-least-once delivery with idempotent processing, or exactly-once semantics.
- eventual-consistency: the speed layer is approximate until the batch layer (Lambda) or replay (Kappa) catches up.
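
The idempotency and delivery-semantics entries above can be illustrated with a small sketch, assuming each event carries a unique id and that at-least-once delivery may hand the consumer the same event more than once. The tuple layout and `build_view` are hypothetical. Deduplicating on the id makes a replay with redeliveries converge to the same view as a clean run.

```python
def build_view(deliveries):
    """Fold deliveries into per-account totals, applying each event id at most once."""
    seen = set()  # ids already applied; a real system would persist this with the view
    totals = {}
    for event_id, account, amount in deliveries:
        if event_id in seen:
            continue  # duplicate delivery: skip so the total is not double-counted
        seen.add(event_id)
        totals[account] = totals.get(account, 0) + amount
    return totals


# Hypothetical events: (event_id, account, amount)
events = [("e1", "acct", 100), ("e2", "acct", -30), ("e3", "other", 5)]

# At-least-once delivery: e2 and e1 are each delivered a second time.
with_redelivery = [events[0], events[1], events[1], events[2], events[0]]

clean = build_view(events)
replayed = build_view(with_redelivery)
print(clean == replayed)
```

Without the `seen` check, the redelivered run would double-count e1 and e2 and the two views would diverge.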