Schema Evolution

Change the schema without rewriting the data. Add columns, drop columns, widen types via metadata changes only. The pattern that prevents the 18-month 'the table changed' outage.

Add a column. Drop a column. Widen a type. Only metadata changes; data files stay put. Readers and writers cooperate via schema versioning. Status: STUB, to be promoted to OUTLINE in Y4 Phase 31.

What this pattern is

Schema evolution is the discipline of changing a schema without rewriting the data. Modern table formats (Iceberg, Delta Lake, Hudi) and serialization formats (Avro, Protobuf) support cheap schema changes: adding columns (old data reads as null or the declared default for the new column), dropping columns (data stays, just isn’t read), widening types (int → bigint), renaming via column-id mapping. In Iceberg, partition evolution (built on hidden partitioning) extends the same idea to partition layout: changing the partition spec does not rewrite existing files. Readers and writers cooperate via schema versioning: a reader written against schema v2 can still read data written under v1.
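A minimal sketch of how ID-based column resolution makes these changes metadata-only. It is a toy model, not any format's actual reader: the Column class, the read_row function, and the sample schemas are hypothetical names introduced for illustration, and a stored row is modeled as a plain mapping from field ID to value.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Column:
    field_id: int        # stable identity; never reused after a drop
    name: str            # display name; free to change
    type: str            # logical type label, e.g. "int", "bigint", "string"
    default: Any = None  # surfaced for files written before the column existed


# Schema v2 (hypothetical). Relative to v1: user_id (id 1) widened to bigint,
# email (id 2) renamed, legacy_flag (id 3) dropped, country (id 4) added.
SCHEMA_V2 = [
    Column(1, "user_id", "bigint"),
    Column(2, "email_address", "string"),
    Column(4, "country", "string", default="unknown"),
]


def read_row(stored: dict[int, Any], read_schema: list[Column]) -> dict[str, Any]:
    """Project one stored row (field_id -> value) onto the reader's schema."""
    row = {}
    for col in read_schema:
        # Lookup is by ID, so a rename needs no data change; IDs absent from
        # the file fall back to the default; IDs absent from the schema
        # (dropped columns) are simply never read.
        row[col.name] = stored.get(col.field_id, col.default)
    return row


# A row written under v1 and never rewritten.
old_file_row = {1: 42, 2: "a@example.com", 3: 0}
print(read_row(old_file_row, SCHEMA_V2))
# {'user_id': 42, 'email_address': 'a@example.com', 'country': 'unknown'}
```

Python integers have no fixed width, so the int → bigint widening is only a label here; in a real format the reader promotes the stored 32-bit value at read time.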

The pattern separates schema (metadata) from data (files). Adding a column is a metadata operation that typically completes in well under a second, regardless of table size; the same operation on a naive design would require rewriting every existing row to include a value for the new column, potentially hours or days of I/O. This cost difference is what determines whether schema changes are casual (add a column during a lunch break) or terrifying (six-week migration project). Well-designed formats make them casual.
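A back-of-envelope estimate of the rewrite side of that comparison. The table size and throughput below are hypothetical numbers chosen for illustration, not measurements.

```python
def rewrite_hours(table_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to rewrite a table once at a sustained throughput."""
    return table_bytes / throughput_bytes_per_s / 3600


TB = 10**12
MB = 10**6

# Hypothetical: a 10 TB table rewritten at 500 MB/s sustained.
hours = rewrite_hours(10 * TB, 500 * MB)
print(f"{hours:.1f} hours")  # 5.6 hours
```

The metadata-only path does not scale with table size at all, which is the whole point of the separation.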

The cooperation between readers and writers is the load-bearing invariant, and it runs in two directions. Backward compatibility: a reader on schema v3 can read data written under v2. Forward compatibility: when a writer produces data under schema v3, older readers on schema v2 can still read it correctly. The changes that are safe in both directions are narrow: adding optional fields with defaults, and dropping fields that readers already treated as optional. Widening a type is backward-compatible only (a v2 reader expecting int cannot safely read bigint values), so readers must upgrade before writers do. Breaking changes (renaming without column IDs, narrowing types, changing default values in incompatible ways) require version-aware readers or coordinated deploys. Avro, Protobuf, and Iceberg all encode compatibility rules explicitly; teams that violate them regret it.
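The two directions can be made concrete with a small checker. This is a sketch under simplified, Avro-like assumptions (fields matched by name, a reader-side default covers a missing field, and only a subset of numeric promotions is modeled); Field, can_read, and compatibility are hypothetical names, not any registry's API. Terms follow the Schema Registry convention: backward means the new schema can read old data, forward means the old schema can read new data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    type: str
    has_default: bool = False


Schema = dict[str, Field]

# Writer type -> reader types it can be promoted to (a subset of Avro's rules).
PROMOTIONS = {
    "int": {"int", "long", "float", "double"},
    "long": {"long", "float", "double"},
    "float": {"float", "double"},
}


def can_read(reader: Schema, writer: Schema) -> bool:
    """True if data written with `writer` can be decoded using `reader`."""
    for name, rfield in reader.items():
        wfield = writer.get(name)
        if wfield is None:
            # Field missing from the data: the reader needs a default.
            if not rfield.has_default:
                return False
        elif rfield.type not in PROMOTIONS.get(wfield.type, {wfield.type}):
            return False
    # Writer fields unknown to the reader are ignored.
    return True


def compatibility(old: Schema, new: Schema) -> str:
    backward = can_read(reader=new, writer=old)  # new readers, old data
    forward = can_read(reader=old, writer=new)   # old readers, new data
    if backward and forward:
        return "full"
    return "backward" if backward else "forward" if forward else "none"


v1 = {"id": Field("int"), "email": Field("string")}
add_optional = {**v1, "country": Field("string", has_default=True)}
add_required = {**v1, "country": Field("string")}
widen_id = {**v1, "id": Field("long")}

print(compatibility(v1, add_optional))  # full
print(compatibility(v1, add_required))  # forward
print(compatibility(v1, widen_id))      # backward
```

A check of this shape is cheap enough to run in CI against the last published schema, so that an incompatible change fails a build instead of a consumer.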

The pattern prevents the canonical 18-month outage: somebody changed the column shape, a downstream pipeline didn’t notice, and ML model training starts producing garbage two months later, by which point nobody can reproduce the issue. Disciplined schema evolution makes the change explicit, testable, and reversible.

Concrete instances in the wild

Iceberg schema evolution. Iceberg tracks columns by ID, not name. Renames are metadata-only. Adding a column is ALTER TABLE ADD COLUMN; dropping is ALTER TABLE DROP COLUMN; widening is ALTER TABLE ALTER COLUMN. Zero data-file rewrites.
Avro schemas with schema registry. Confluent Schema Registry enforces compatibility rules (backward, forward, full). Producers register new schema versions; consumers read the schema version each record was written with.
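Protobuf field numbers. Protobuf fields are identified by number, not name. Adding a field with a new number is compatible; renaming a field does not change the binary wire format (though it does change generated code and the JSON encoding); removing a field is compatible as long as consumers don’t require it and the number is marked reserved so it is never reused.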
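Delta Lake schema evolution. Similar to Iceberg. Adding columns is a metadata operation; dropping and renaming are metadata-only when column mapping is enabled, and type widening is available as an opt-in table feature in recent releases. Delta’s schema enforcement rejects writes that don’t match the table schema, preventing unintended schema drift.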
Postgres logical replication with pglogical. DDL is not replicated automatically; schema changes are applied on both sides, either manually or through pglogical.replicate_ddl_command, and subscribers match columns by name.
JSON with backward-compatible parsers. REST APIs that add optional fields and let old clients ignore them. The web’s default schema-evolution strategy, informal but effective.
GraphQL schema deprecation. Fields marked @deprecated remain queryable; new fields are added; old clients keep working. The pattern applied to API contracts.
Database migration tools. Alembic, Flyway, and Liquibase codify schema-evolution operations as sequenced, versioned migrations. The tools do not enforce compatibility themselves; teams that want rolling deploys write each migration to be backward-compatible with the previous schema version (expand first, contract later).
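The informal JSON strategy in the list above (and Protobuf's handling of unknown fields) comes down to a tolerant reader: ignore what you don't recognize, default what is missing. A minimal sketch, assuming a hypothetical UserV1 record type and a hypothetical parse_tolerant helper:

```python
import json
from dataclasses import dataclass, fields


@dataclass
class UserV1:
    id: int
    email: str
    plan: str = "free"  # optional field: older payloads may omit it


def parse_tolerant(cls, payload: str):
    """Build `cls` from JSON, dropping keys the class does not declare."""
    raw = json.loads(payload)
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in raw.items() if k in known})


# A newer producer added "country"; an older producer never sent "plan".
newer = '{"id": 7, "email": "a@example.com", "plan": "pro", "country": "DE"}'
older = '{"id": 7, "email": "a@example.com"}'

print(parse_tolerant(UserV1, newer))  # UserV1(id=7, email='a@example.com', plan='pro')
print(parse_tolerant(UserV1, older))  # UserV1(id=7, email='a@example.com', plan='free')
```

A payload missing a field without a default (id or email here) still raises TypeError, which is the desired behavior: tolerance covers optional additions, not the removal of required data.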

Why this pattern matters

The alternative — every schema change requires rewriting the data — makes schemas terrifying. Teams accumulate schemas they wish they could change but can’t afford to. Column names become meaningless because they can never be renamed. Types stay too narrow because widening requires downtime. Deprecated fields never get removed because the migration is scarier than the accumulated technical debt. Over years, this compounds into schemas that everyone hates and no one can fix.

With disciplined schema evolution, the fear disappears. Renaming a column is trivial. Widening a type is trivial. Deprecating a field is a metadata operation followed by a retention wait. Teams can iterate on their schemas the way they iterate on code, and the schema stays clean instead of becoming a museum of past decisions.

The pattern also enables data-lake sanity. Data lakes without schema evolution devolve into “the schemaless swamp” where every table has slightly different column shapes based on when data was ingested. Every consumer has to handle every historical variant. Iceberg-style schema evolution keeps the table shape unified across time, even as the schema changes — old data files logically appear to have the new schema (with defaults for new columns), and consumers just query the table.

The failure mode to watch: schema evolution doesn’t protect against semantic changes. Redefining a column named revenue from “gross revenue” to “net revenue” is at most a metadata change (often no schema change at all), yet it has catastrophic downstream implications. The pattern gives you syntactic evolution safely; semantic evolution still requires human coordination. Every schema change ADR should name both the syntactic change (column added, type widened) and the semantic implication (what the change means for downstream consumers).

Depth progression

STUB     ← you are here.
OUTLINE  Promoted when Y4 Phase 31 (lakehouse) exercises Iceberg's schema-
         evolution operations directly.
DEEP     Promoted after Y4 end with at least one real schema-evolution event
         that flowed through readers + writers without breakage.

Preview: what OUTLINE will answer

When Y4 Phase 31 promotes this entry to OUTLINE, it will name:

PROBLEM. How do you change a schema in a live system without rewriting the underlying data or breaking existing consumers?
PRINCIPLES. Column identity by ID, not name. Backward-compatible defaults. Explicit compatibility rules (backward, forward, full). Metadata operations, not data operations. Coordination via schema versioning.
TRADE-OFFS. Backward compatibility (new readers can read data from older writers) vs forward compatibility (older readers can read data from newer writers) vs full compatibility (both). Column-ID mapping (Iceberg-style) vs field-number mapping (Protobuf-style) vs name-based (fragile). Rolling deploys (partial schema versions coexist) vs coordinated deploys (all consumers move together).
TOOLS (time-stamped as of 2026-06): Iceberg, Delta Lake, Hudi, Avro + Confluent Schema Registry, Protobuf, database migration tools (Alembic, Flyway, Liquibase, Django migrations).

The DEEP promotion, after Y4 with real schema-evolution experience, will add MASTERY (executing multiple schema changes through the Iceberg + CDC + downstream-consumer chain), COMPARE (Iceberg schema evolution vs Avro schema registry vs Protobuf field numbers), OPERATE (a real incident where a schema change broke a downstream consumer), and CONTRIBUTE (a documentation fix or clarification to Iceberg or Avro).

Canonical references

Apache Iceberg specification, schema-evolution section — the definitive reference for modern lakehouse schema evolution.
Confluent’s blog posts on schema compatibility rules — backward, forward, and full compatibility explained with examples.
Google’s Protobuf language guide, particularly the sections on field numbers and reserved fields.
Netflix Iceberg adoption blog posts — practical stories about schema-evolution in production.
Scott Ambler and Pramod Sadalage’s Refactoring Databases (Martin Fowler Signature Series): the canonical text on schema evolution as continuous practice.

Cross-references

Y4 Phase 31: Lakehouse
Related: snapshot-plus-delta, lakehouse, cdc
Canonical reference: Iceberg spec, Avro schema-evolution rules