ADR Template

Architecture Decision Record. Use when you make a significant architectural choice with trade-offs. Never accept a decision without documented alternatives.

An ADR is the receipt for a decision that will outlast the moment of making it. It captures the context (what was true when you decided), the alternatives you weighed, the choice you made, and the consequences you accepted. Six months from now, when you (or a hiring panel reading ops-handbook) ask “why did we pick Iceberg?” — the ADR is the answer.

ADRs are append-only history. You don’t edit accepted ADRs to reflect new thinking; you write a new ADR that supersedes the old one. The chain of supersession is the artifact — it shows how your reasoning evolved as the platform did.

This template fits the broader ROOT discipline that decisions need to be traceable to be learnable. A decision without context is folklore; a decision with documented alternatives and trade-offs is a pattern you can apply again.

Template

Copy into ops-handbook/adrs/{NNNN}-{slug}.md:

---
title: "ADR NNNN: <decision>"
slug: NNNN-<slug>
tags: [adr]
status: proposed | accepted | superseded | deprecated
date: YYYY-MM-DD
supersedes: NNNN-<other>     # if applicable
superseded-by: NNNN-<other>  # if this is superseded later
---

# ADR NNNN: <decision in one sentence>

## Status

`accepted` (or `proposed` for in-flight, `superseded` once replaced)

## Context

What problem are we solving? What constraints exist? Why is this decision being made *now* and not later?

2-4 paragraphs. Concrete example:

> basecamp's lakehouse tier (Year 3) needs a table format for analytics workloads on top of MinIO. The format must support schema evolution, time-travel reads, and concurrent writers. Current state: no table format chosen; raw Parquet with directory-based partitioning is the placeholder. Constraints: single-operator homelab; must run on K3s with <16GB RAM headroom; must be Trino- and Spark-compatible.

## Decision

The decision in 1-2 sentences. Specific. No hedging.

> "We will use Iceberg as the table format for basecamp's lakehouse, with Nessie as the catalog."

## Alternatives considered

> **A decision without alternatives is not an ADR — it's a fait accompli. Force yourself to enumerate at least 2 alternatives.**

### Alternative 1: Delta Lake

- What it is: Databricks-originated table format with strong Spark integration
- Why it's a viable choice: mature, broad ecosystem, good time-travel
- Why we didn't pick it: catalog story is weaker outside Databricks; Trino support lags Iceberg by ~6 months; ROOT's Year 3 phase doc on lakehouse explicitly compares the two and the COMPARE step needs the more open ecosystem

### Alternative 2: Apache Hudi

- What it is: streaming-first table format with upsert semantics
- Why it's a viable choice: best-in-class for CDC ingestion patterns
- Why we didn't pick it: ROOT's Year 3 ingest is batch-first (Airflow + Spark); upsert workloads land in Year 4 and can revisit

### Alternative 3: Raw Parquet + directory partitioning

- What it is: the status quo placeholder
- Why it's a viable choice: zero new dependencies; lowest operational surface
- Why we didn't pick it: no schema evolution, no atomic commits, no time-travel — exactly the trade-offs the lakehouse pattern exists to solve

## Consequences

### Positive

- Schema evolution is a first-class operation, not a migration project
- Time-travel reads enable cheap "what did this table look like last Tuesday" debugging
- Trino and Spark both read Iceberg natively — the COMPARE step in Year 3 Phase 16 stays cheap

### Negative

- Nessie adds a JVM service to the platform tier; one more thing to monitor
- Iceberg metadata files accumulate; a `OPTIMIZE`/`VACUUM` cadence becomes a runbook
- Catalog migration if we ever swap Nessie for Polaris will be painful

### Neutral

- Storage cost is identical to raw Parquet (Iceberg is a metadata layer, not a new file format)
- Query performance is comparable to partitioned Parquet for the workloads we run

## When to revisit

Specific conditions that would make us reconsider:
- If basecamp adds 5+ analytics tenants and Nessie becomes a bottleneck
- If the Spark-Iceberg ergonomics change meaningfully in Iceberg v2 release
- If we hit the metadata-file-count ceiling and `OPTIMIZE` cadence exceeds 1hr/week
- If Year 4's ML workloads need upsert semantics that Iceberg can't deliver

## References

- Related ADR: [0001-minio-as-lakehouse-storage](#)
- Pattern: [../../patterns/data/table-formats.md](#)
- External: [Iceberg spec](https://iceberg.apache.org/spec/), [Nessie docs](https://projectnessie.org)

How to write a good ADR

Five rules:

Numbered + dated. ADRs are append-only history. Don’t renumber.
Context first, decision second. Future-you needs to know why this decision made sense at the time — not just what was decided.
At least 2 alternatives. No alternatives = no real decision. If you can’t name two plausible options, the decision isn’t ripe.
Specific consequences. Vague “improves maintainability” is not a consequence; “lets us swap Trino for Spark without changing storage” is.
When to revisit. Decisions age. Document the conditions that would reopen this one — that’s the difference between a decision and a dogma.

When to write an ADR (vs. just commit)

Write an ADR when:

The decision will outlast the person making it
Reverting requires migration effort (not just a code revert)
Future-you will ask “why did we pick X?”
More than one reasonable alternative exists

Don’t write an ADR for:

Linter rules
Code style choices
One-off bug fixes

Numbering

Sequential, zero-padded to 4 digits:

0001-cilium-cni.md
0002-iceberg-over-delta.md
0003-mcp-over-custom-rest.md

Never reuse numbers. Never renumber. ADRs are immutable history.

Cross-reference

Style: doc-style-guide.md
Related templates: postmortem-template.md (postmortems often spawn ADRs as action items), pattern-template.md (ADRs reference the patterns they implement)
Program context: Master Plan, AI Learning Protocol, ops-handbook plan