Operator Pattern

CRD + custom controller. The K8s-native shape for encoding domain-specific operational knowledge, built with kubebuilder + controller-runtime. The senior-IC differentiator of /root.

Custom resource describes the what. Custom controller implements the how. kubebuilder + controller-runtime make this accessible at any scale. Status: STUB — promoted to OUTLINE in Y3 Phase 20; built in Y3 Phase 26.

What this pattern is

The operator pattern extends Kubernetes’ control-loop model with domain-specific knowledge. A CustomResourceDefinition (CRD) adds a new resource type for a domain concept (a database, a Workload, an IncidentReport); each custom resource of that type declares the desired state of one instance. A custom controller watches those resources and reconciles each one into underlying Kubernetes primitives (Deployments, Services, ConfigMaps, NetworkPolicies). The pattern turns Kubernetes into a platform for authoring domain-specific automation, not just a container orchestrator.
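
The declare-then-reconcile split described above can be sketched in plain Go with no Kubernetes dependencies. Every name here (WorkloadSpec, Deployment, Cluster, Reconcile) is a hypothetical stand-in rather than the controller-runtime API, and an in-memory map plays the role of the API server. Treat it as a sketch of the shape, not of how Phase 26 will implement it.

```go
package main

import "fmt"

// WorkloadSpec is a hypothetical custom resource spec: the "what".
type WorkloadSpec struct {
	Name     string
	Image    string
	Replicas int
}

// Deployment is a simplified stand-in for the built-in primitive.
type Deployment struct {
	Image    string
	Replicas int
}

// Cluster is an in-memory stand-in for state stored in the API server.
type Cluster struct {
	Deployments map[string]Deployment
}

// Reconcile is the "how": it compares desired state with observed state
// and writes only when they differ. It reports whether it changed anything.
func Reconcile(c *Cluster, w WorkloadSpec) bool {
	want := Deployment{Image: w.Image, Replicas: w.Replicas}
	got, exists := c.Deployments[w.Name]
	if exists && got == want {
		return false // already converged, nothing to do
	}
	c.Deployments[w.Name] = want
	return true
}

func main() {
	c := &Cluster{Deployments: map[string]Deployment{}}
	w := WorkloadSpec{Name: "api", Image: "api:1.0", Replicas: 3}

	fmt.Println(Reconcile(c, w)) // true: first pass creates the Deployment
	fmt.Println(Reconcile(c, w)) // false: second pass finds nothing to change

	// Simulate drift: something outside the controller scaled it down.
	c.Deployments["api"] = Deployment{Image: "api:1.0", Replicas: 1}
	fmt.Println(Reconcile(c, w)) // true: drift is corrected
}
```

Running Reconcile twice in a row is harmless, which is the property that lets a real controller retry freely after a crash or a failed API call.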

The pattern is operationally pervasive in 2026. CloudNativePG (Postgres operator), Strimzi (Kafka), KubeRay (Ray), KServe (model serving), Karpenter (node autoscaling), Cilium operator, Flux’s source/kustomization controllers: every major platform component on basecamp arrives as an operator. This isn’t an accident; it’s convergent design. The K8s API is a general-purpose declarative-reconciliation substrate. Every operational domain that fits that shape becomes an operator.

The /root differentiator: you build two of these yourself. Phase 26’s platform-ctl operator reconciles the Workload CRD. Phase 50’s services/aiops/ operator reconciles IncidentReport CRDs. The exercise of building one with kubebuilder + controller-runtime is the senior-IC signal: it demonstrates K8s-native fluency in a way that “I configured ArgoCD” does not. Reading operator source code is a common Y3-Y5 activity; writing one is what separates people who run a platform from platform engineers.

The pattern composes tightly with control-loops (operators are control loops applied to CRDs), declarative-vs-imperative-infrastructure (CRDs are declarations; controllers are reconcilers), and gitops (Flux applies CRD instances; operators reconcile them). Everything K8s-native uses this pattern; understanding it is understanding modern platform engineering.

Concrete instances in the wild

  • CloudNativePG (Postgres operator). basecamp’s Postgres runs under this operator. Postgres cluster is a CRD; the operator handles failover, backup, restore, PITR.
  • Strimzi (Kafka operator). Kafka clusters as CRDs. Operator handles broker lifecycle, topic management, user creation.
  • KubeRay (Ray operator). Ray clusters as CRDs. Autoscaling, GPU scheduling, cluster management.
  • KServe (model serving operator). InferenceService CRD. Model serving lifecycle managed as K8s resources.
  • Karpenter (node autoscaler). NodePool CRDs (called Provisioner before the v1beta1 API) describe node pools; Karpenter provisions nodes as workloads demand.
  • Cilium operator. Manages Cilium CNI configuration, network policies, mesh identities.
  • Flux controllers. GitRepository, Kustomization, HelmRelease are all CRDs; each has a dedicated controller.
  • cert-manager. Certificate CRDs; controller handles issuance, renewal, rotation with Let’s Encrypt or private CAs.
  • External Secrets Operator. ExternalSecret CRDs sync secrets from external stores (Vault, cloud secret managers) into K8s Secrets.
  • basecamp platform-ctl operator (Y3 Phase 26). Workload CRD; controller reconciles into Deployment + Service + Ingress + NetworkPolicy + SLO.
  • basecamp aiops operator (Y5 Phase 50). IncidentReport CRD; controller performs triage, suggests runbook, requests human approval, executes.

Why this pattern matters

Kubernetes without operators is a container orchestrator. Kubernetes with operators is a platform for authoring domain-specific automation. The pattern turns any operational domain that fits the declarative-reconciliation shape into a first-class K8s resource. Databases become as manageable as Deployments. Message queues become as manageable as Services. Machine learning workloads become as manageable as Pods.

The pattern also encodes operational knowledge in code. A senior DBA’s understanding of Postgres failover, backup, and recovery lives in the CloudNativePG operator. A senior data engineer’s understanding of Kafka topic partitioning lives in the Strimzi operator. Anyone using the operator inherits that expertise without having to develop it independently. The operator is the mechanism for scaling operational knowledge.

Building operators is the platform-engineering equivalent of building compilers in software engineering. It requires deep understanding of the target domain, careful design of the CRD interface, robust handling of failure modes, and mature software engineering practices. Building one well is a signal of platform-engineering maturity. Building two is a demonstration that the pattern is internalized.

The failure modes are worth naming. Operators that reinvent basic K8s primitives (a “custom Deployment” operator) add ceremony without value. Operators that don’t handle failure gracefully leave orphaned resources or stuck reconciliation. Operators that hide too much prevent debugging when things go wrong. The senior discipline: operators only for domains that genuinely need domain-specific automation, with reconciliation loops that handle failure cleanly, and observability that lets operators be debugged like any other workload.
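
One way to avoid the orphaned-resource failure mode is finalizer-style cleanup: the controller marks the object while it is live, and on deletion it removes external state before releasing the object. The sketch below uses plain Go and hypothetical names (Resource, External, cleanupFinalizer); the Deleting flag stands in for a non-nil deletionTimestamp, and none of this is the real Kubernetes API.

```go
package main

import "fmt"

// cleanupFinalizer is a hypothetical finalizer name.
const cleanupFinalizer = "example.invalid/cleanup"

// Resource is a hypothetical custom resource with the two metadata
// fields that drive finalizer-based cleanup.
type Resource struct {
	Name       string
	Deleting   bool // stands in for a non-nil deletionTimestamp
	Finalizers []string
}

// External is a stand-in for state the controller created outside the object.
type External struct {
	Buckets map[string]bool
}

func has(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

func remove(list []string, s string) []string {
	out := []string{}
	for _, v := range list {
		if v != s {
			out = append(out, v)
		}
	}
	return out
}

// Reconcile registers the finalizer while the object is live and, once
// deletion is requested, cleans up before dropping the finalizer.
func Reconcile(r *Resource, ext *External) {
	if !r.Deleting {
		if !has(r.Finalizers, cleanupFinalizer) {
			r.Finalizers = append(r.Finalizers, cleanupFinalizer)
		}
		ext.Buckets[r.Name] = true // idempotent "ensure exists"
		return
	}
	if has(r.Finalizers, cleanupFinalizer) {
		delete(ext.Buckets, r.Name) // safe to repeat after a partial failure
		r.Finalizers = remove(r.Finalizers, cleanupFinalizer)
	}
}

func main() {
	r := &Resource{Name: "db"}
	ext := &External{Buckets: map[string]bool{}}

	Reconcile(r, ext)
	fmt.Println(len(r.Finalizers), ext.Buckets["db"]) // 1 true

	r.Deleting = true
	Reconcile(r, ext)
	fmt.Println(len(r.Finalizers), ext.Buckets["db"]) // 0 false
}
```

In a real cluster the API server keeps a deleted object around until its finalizer list is empty, so a controller that crashes mid-cleanup gets another chance on the next reconcile.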

Modern platforms make operator development accessible. kubebuilder and Operator SDK scaffold operator projects in minutes. controller-runtime provides high-level abstractions over the K8s API. Testing frameworks (envtest, kuttl) support operator TDD. What used to be arcane Kubernetes-internals expertise is now well-documented practice.

Depth progression

STUB     ← you are here.
OUTLINE  Promoted when Y3 Phase 20 introduces operators as a category of K8s component.
DEEP     Promoted after Y5 Phase 50 — by then you've BUILT two operators (Phase 26,
         Phase 50) and operated 8-10 third-party operators on basecamp.

Preview: what OUTLINE will answer

When Y3 Phase 20 promotes this entry to OUTLINE, it will name:

  • PROBLEM. How do you encode operational knowledge in a form that scales across an organization?
  • PRINCIPLES. Domain concept becomes a CRD. Controller reconciles CRD instances into underlying primitives. Reconciliation is idempotent, level-triggered, and rate-limited. Status subresource reports actual state. Finalizers handle cleanup. Webhooks (validating, mutating) enforce invariants at admission.
  • TRADE-OFFS. Build vs buy (existing operator vs custom). CRD granularity (fine = flexible, coarse = simple). Push vs pull reconciliation. In-cluster vs out-of-cluster deployment. Language choice (Go dominant, but Rust and Python operators exist).
  • TOOLS (time-stamped as of 2026-06): kubebuilder (scaffold operators in Go), Operator SDK (Red Hat’s alternative), controller-runtime (Go library), Metacontroller (author operators in any language), Python operators (kopf), Rust operators (kube-rs).
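
The "level-triggered" principle in the list above is easiest to see with a deduplicating work queue: events only say which object to look at, and the reconciler reads current state when it runs. The sketch below is plain Go with hypothetical names (Queue, Drain); client-go's workqueue package provides the production-grade version, with rate limiting on top.

```go
package main

import "fmt"

// Queue is a minimal deduplicating work queue keyed by object name.
type Queue struct {
	order   []string
	pending map[string]bool
}

func NewQueue() *Queue {
	return &Queue{pending: map[string]bool{}}
}

// Add enqueues a key unless it is already waiting to be processed.
func (q *Queue) Add(key string) {
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get pops the next key; ok is false when the queue is empty.
func (q *Queue) Get() (key string, ok bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

// Drain reconciles every queued key against the current desired state
// and returns how many reconcile passes ran.
func Drain(q *Queue, desired, actual map[string]int) int {
	runs := 0
	for {
		key, ok := q.Get()
		if !ok {
			return runs
		}
		actual[key] = desired[key] // read the level, not the event payload
		runs++
	}
}

func main() {
	desired := map[string]int{} // latest replica count per object
	actual := map[string]int{}  // what the controller has applied
	q := NewQueue()

	// Three rapid edits to the same object before the controller runs.
	for _, n := range []int{1, 5, 3} {
		desired["api"] = n
		q.Add("api")
	}

	runs := Drain(q, desired, actual)
	fmt.Println(runs, actual["api"]) // 1 3
}
```

Three edits collapse into one reconcile that lands on the latest value, so a missed or duplicated event cannot leave the controller acting on stale intent.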

The DEEP promotion, after Y5 Phase 50 with two operators built, will add MASTERY (operating the two custom operators for months, plus 8-10 third-party operators on basecamp), COMPARE (platform-ctl operator design vs aiops operator design), OPERATE (a specific operator-related incident and its resolution), and CONTRIBUTE (an upstream contribution to kubebuilder, controller-runtime, or a specific operator project).

Canonical references

  • kubebuilder book. Free at book.kubebuilder.io. The canonical operator-authoring reference.
  • Operator SDK documentation. Free at sdk.operatorframework.io.
  • controller-runtime documentation. Free at pkg.go.dev/sigs.k8s.io/controller-runtime.
  • CNCF Operator White Paper (TAG App Delivery). Free from the CNCF.
  • Programming Kubernetes (Michael Hausenblas, Stefan Schimanski, O’Reilly). Practical operator engineering reference.

Cross-references