Model Lifecycle
The pattern: train → evaluate → register → deploy → monitor → retrain. Each hand-off needs a contract: reproducibility (data version + code version + params + seeds), gated promotion (the candidate must clear an eval threshold), versioned serving (canary rollout), drift monitoring (compare prod vs. train distributions), and an automatic retrain trigger when drift crosses a threshold.
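The first two contracts (reproducibility and gated promotion) can be sketched in a few lines of standard-library Python. This is a minimal illustration, not any registry's API: the `RunManifest` schema, the `should_promote` helper, and the default thresholds (`min_metric=0.80`, `min_gain=0.0`) are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunManifest:
    """Hypothetical record of what is needed to reproduce a training run."""
    data_version: str  # e.g. a dataset snapshot id
    code_version: str  # e.g. a git commit SHA
    params: dict       # hyperparameters
    seed: int          # RNG seed

    def fingerprint(self) -> str:
        # Canonical JSON (sorted keys) so equal manifests hash identically,
        # regardless of the order in which params were written.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


def should_promote(candidate_metric: float,
                   production_metric: float,
                   min_metric: float = 0.80,   # assumed absolute floor
                   min_gain: float = 0.0) -> bool:  # assumed required gain
    """Promotion gate: clear the floor and do not regress vs. production."""
    clears_floor = candidate_metric >= min_metric
    no_regression = candidate_metric - production_metric >= min_gain
    return clears_floor and no_regression


if __name__ == "__main__":
    manifest = RunManifest("snap-2024-01", "abc123", {"lr": 0.1, "depth": 3}, 42)
    print("run fingerprint:", manifest.fingerprint())
    print("promote?", should_promote(candidate_metric=0.85, production_metric=0.83))
```

Storing the fingerprint next to the registered model gives the later stages (deploy, monitor, retrain) a single key that ties a served version back to its data, code, params, and seed.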
The trade-off: MLOps overhead vs. silent model rot. ML systems without lifecycle discipline degrade quietly: features drift, the world changes, accuracy drops, and no one notices until a quarterly review. The discipline is real overhead (MLflow, KServe, drift monitoring, a retrain pipeline), but the alternative is “we deployed this model 18 months ago and forgot about it.” That’s career-limiting at Staff/Principal level.
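The quiet degradation described above is what a drift check is meant to surface. One common way to compare a production feature distribution against the training one is the Population Stability Index (PSI); the sketch below is one possible implementation, not a prescribed method. The function names, the equal-width binning, and the 0.2 retrain threshold (a widely quoted rule of thumb) are assumptions for illustration.

```python
import math
from bisect import bisect_right
from typing import List, Sequence


def psi(train: Sequence[float], prod: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index of prod relative to train for one feature."""
    lo, hi = min(train), max(train)
    if hi == lo:
        raise ValueError("training sample has no spread; cannot build bins")
    # Interior edges of equal-width bins over the training range.
    # Production values outside that range fall into the first or last bin.
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor at eps so empty bins do not produce log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    p, q = fractions(train), fractions(prod)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def needs_retrain(train: Sequence[float], prod: Sequence[float],
                  threshold: float = 0.2) -> bool:
    """Retrain trigger: fire when drift crosses the (assumed) threshold."""
    return psi(train, prod) > threshold


if __name__ == "__main__":
    train_sample = [i / 100 for i in range(100)]        # roughly uniform on [0, 1)
    prod_sample = [0.5 + i / 200 for i in range(100)]   # shifted toward the top half
    print("PSI:", round(psi(train_sample, prod_sample), 3))
    print("retrain?", needs_retrain(train_sample, prod_sample))
```

In practice a check like this would run per feature on a schedule, and the trigger would feed the retrain step of the lifecycle rather than print a result.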
Deepens in Year 4 Phase 20: MLOps Foundations (MLflow as the registry frame everything hangs off) and reaches DEEP in Phase 25: GPU Infrastructure when drift detection auto-rolls back a regressed model. Phase 21: ML Serving + mlship v0 is where canary serving first lands.
Related patterns
- train-serve-skew — the failure that makes “monitor” a non-optional step.
- feature-store — provides the reproducible feature view at every promotion.
- inference-shapes — each shape has its own canary + rollback story.
- rag-as-pattern — applies the same lifecycle to embeddings, indexes, and prompts.
- prompt-as-program — same versioning shape, applied to prompts.
- snapshot-plus-delta — versioned data is half of reproducibility.

- mlship — capstone CLI that turns this lifecycle into one command.
- basecamp — MLflow + KServe + Kubeflow live in Tier 5/6.