Platform Engineering & Data

Months 25-36. Add observability at depth (kernel-level via eBPF) and the data engineering layer (lakehouse, stream + batch, serving, governance) on top of Year 2’s platform. By year-end, basecamp goes public, and your “Homelab life API” runs end-to-end on it. Exit ramp: Senior DevOps / Data Platform Engineer.

The role of Year 3

Year 3 is the inflection point where the platform stops being something you run and starts being something you offer. Year 1 gave you a single-machine intuition; Year 2 made that intuition distributed and multi-cloud. Year 3 is what turns the resulting infrastructure into a product surface — observable from the kernel up, durable at the storage tier, queryable from a notebook, and credible enough to put github.com/abukix/basecamp in someone else’s hands.

The Master Plan calls out three transitions where the role identity changes. Year 3 sits squarely on the second one: platform-as-tool → platform-as-product. Tier 3 (Lakehouse) and Tier 4 (Processing) of basecamp come online, Tier 8 (Data Serving) gets its first occupant, and the JupyterHub-as-a-service entry point makes the platform feel like something a user logs into rather than something you SSH into.

It’s also the year observability graduates from “Prometheus + Grafana, mostly working” to a discipline. eBPF gives you kernel-level events without sidecar instrumentation. OpenTelemetry unifies traces/logs/metrics under a single emission contract. And cardinality discipline — the unglamorous one — is what keeps the bill from tripling every time you add a label.
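To make the cardinality point concrete, here is a minimal sketch in Python. It assumes the usual worst case, where the number of time series for one metric is bounded by the product of the distinct values of each label. The label names and value counts are hypothetical, not measurements from basecamp.

```python
from math import prod


def series_upper_bound(label_values: dict[str, int]) -> int:
    """Worst-case series count for one metric: product of distinct values per label."""
    return prod(label_values.values())


# Hypothetical label sets for a single HTTP request counter.
base = {"service": 20, "status_code": 5, "method": 4}
with_pod = {**base, "pod": 50}  # same metric, one extra label

print(series_upper_bound(base))      # 400
print(series_upper_bound(with_pod))  # 20000
```

In this sketch a single `pod` label multiplies the bound by 50, which is the kind of jump a cardinality budget is meant to catch before the label ships.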

What you’ll know at the end of Year 3

Observability at depth — three pillars (metrics, logs, traces) unified under OpenTelemetry; kernel-level events via eBPF without per-service instrumentation; cardinality-as-cost discipline so labels don’t silently 10x storage; SLI/SLO from Y2 reinforced with telemetry that actually maps to user pain.
Lakehouse architecture — MinIO + Iceberg + Nessie as the open-source 2025 stack; storage/compute separation as a first-class design choice; schema-on-read vs schema-on-write chosen per dataset, not per-team religion; snapshot-plus-delta (Iceberg snapshots) for time travel and rollback.
Stream + batch processing — Redpanda + Flink for stream, Spark + Airflow + dbt for batch, Lambda vs Kappa chosen as an architectural decision rather than a default. Delivery-semantics reinforced from Y2.
Data serving — Trino as the federated query layer over heterogeneous sources; Superset as the analytics frontend; query caching reinforced from Y1 fundamentals.
Data governance + lineage — DataHub or OpenMetadata as the catalog; OpenLineage as the cross-tool lineage contract; column-level masking; audit logs; dbt tests gating production deploys.
JupyterHub-as-a-service — notebooks as a first-class platform surface, not a one-off install. The entry point for every Studio composition recipe Y4/Y5 will land.
Operating a public platform — basecamp goes public at the P19 capstone, at the end of the year. You’ll know what it takes to invite strangers to clone your infrastructure.
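The delivery-semantics item above can be illustrated with a small sketch. Assuming the broker delivers at-least-once (so duplicates are possible), a sink that remembers event IDs makes redelivery harmless, which is the usual route to effectively-once results. The `IdempotentSink` class and the event IDs are hypothetical stand-ins for a real keyed upsert; this is not Redpanda's or Flink's API.

```python
class IdempotentSink:
    """Applies each event at most once, keyed by event ID."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.total = 0

    def apply(self, event_id: str, amount: int) -> bool:
        """Return True if the event changed state, False if it was a duplicate."""
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        self.total += amount
        return True


# At-least-once delivery: "e2" arrives twice after a simulated retry.
deliveries = [("e1", 5), ("e2", 7), ("e2", 7), ("e3", 1)]

sink = IdempotentSink()
applied = [sink.apply(event_id, amount) for event_id, amount in deliveries]

print(applied)     # [True, True, False, True]
print(sink.total)  # 13
```

The duplicate is dropped at the sink rather than prevented at the source, so the pipeline can retry freely without double counting.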

You’ll be deployable as a Senior DevOps / Data Platform Engineer or SRE with a data specialty. By the end of Year 3, your homelab platform has the same data engineering shape as production-scale platforms at Netflix, Spotify, and Uber: same patterns, smaller scale.

Phase map

Phase	Title	Approx. weeks	Approx. hours	Pattern depth focus
14	Observability + eBPF	8	100	three-pillars-and-unified-telemetry, cardinality-as-cost, runbook-as-code, blameless-postmortem
15	Lakehouse: MinIO + Iceberg + JupyterHub	8	100	oltp-vs-olap (DEEP), schema-on-read-vs-write, append-only-log, snapshot-plus-delta
16	Stream Processing: Redpanda + Flink	8	90	stream-processing, lambda-and-kappa, delivery-semantics (reinforced)
17	Batch Processing: Spark + Airflow + dbt	8	90	batch-processing, materialized-views, idempotency (reinforced)
18	Data Serving: Trino + Superset	7	80	caching (reinforced from Y1)
19	Data Governance (capstone)	8	100	all Y3 patterns reach DEEP; blameless-postmortem (DEEP)
	Year 3 Final Exam	2	24	—
Total		~49 weeks	~584 hrs	~14 patterns deepened

12 hrs/week × 52 weeks = 624 hrs. Year 3 fits with ~40 hrs slack — by far the most slack of any year, because the data layer has the most “wait, actually” moments and you’ll need it.
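The budget arithmetic can be checked with a few lines of Python. The weeks and hours are copied from the phase map above; the variable names are only illustrative.

```python
# (phase, weeks, hours) copied from the phase map.
phases = [
    ("P14", 8, 100),
    ("P15", 8, 100),
    ("P16", 8, 90),
    ("P17", 8, 90),
    ("P18", 7, 80),
    ("P19", 8, 100),
    ("Final exam", 2, 24),
]

HOURS_PER_WEEK = 12
WEEKS_PER_YEAR = 52

planned_weeks = sum(weeks for _, weeks, _ in phases)
planned_hours = sum(hours for _, _, hours in phases)
available_hours = HOURS_PER_WEEK * WEEKS_PER_YEAR
slack_hours = available_hours - planned_hours

print(planned_weeks, planned_hours)  # 49 584
print(available_hours, slack_hours)  # 624 40
```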

What ships during Year 3 (the data tier of basecamp goes operational)

Year 3 is the year the data tier of basecamp comes online. Tier 3 (Lakehouse), Tier 4 (Processing), and Tier 8 (Serving) from the 9-tier stack all land here. By the end of P19, the platform is a real data engineering surface — and it goes public.

Project	Phase	Launch energy
`basecamp`	P19 (capstone)	GOES PUBLIC: sanitized via Sealed Secrets, then launched with a README, a blog post, and a “basecamp at end of Year 3” announcement. The biggest launch of the year.
`personal-api`	P17 + P18	First personal service running on the platform. GitHub commits → Airflow → Iceberg → Trino → REST API. Demonstrates the platform works for you.
`terralabs`	continuous	Adds data-infra modules (Redpanda K8s, MinIO operator, Iceberg catalog)
`platform-ctl`	continuous	Adds `cluster bootstrap` + data-pipeline subcommands; still private
`ops-handbook`	continuous	Runbooks for every new tier; first eBPF-driven postmortem; cardinality-budget ADR

personal-api is the year’s narrative payoff. It proves the platform isn’t a portfolio piece — it actually runs your stuff. Cinematic content writes itself: “I logged 5 years of GitHub activity into my own homelab lakehouse and queried it with Trino.”

The basecamp public release is the credibility moment. Up to now, basecamp has been a private repo. At P19 it gets sanitized (Sealed Secrets, no real domains, generic READMEs), tagged, and pushed to github.com/abukix/basecamp. From this point on, anyone can clone the platform and run an equivalent of it. That’s the moat described in the Master Plan — and Year 3 is when it becomes real.

Patterns deepened in Year 3

The 14 patterns listed below reach DEEP this year, heavy on storage-and-data, observability-and-ops, and stream-vs-batch:

observability-and-ops/three-pillars-and-unified-telemetry (P14)
observability-and-ops/cardinality-as-cost (P14)
observability-and-ops/runbook-as-code (P14 — formalize the discipline that’s been building since Y1)
observability-and-ops/blameless-postmortem (P19 capstone)
storage-and-data/oltp-vs-olap (DEEP from Y1’s first OUTLINE)
storage-and-data/schema-on-read-vs-write (P15)
storage-and-data/append-only-log (P15 — Parquet as concrete example)
storage-and-data/snapshot-plus-delta (P15 — Iceberg snapshots)
storage-and-data/materialized-views (P17 dbt)
storage-and-data/lsm-vs-btree (Y1 OUTLINE → DEEP via real workload comparison)
storage-and-data/write-ahead-logging (Y1 OUTLINE → DEEP via Postgres + Iceberg WAL parallels)
stream-vs-batch/stream-processing (P16)
stream-vs-batch/batch-processing (P17)
stream-vs-batch/lambda-and-kappa (P16 → P17 architectural choice)

Each is promoted via the depth ladder described in the Master Plan: STUB → OUTLINE on first phase touch → DEEP after 3+ months of operating something that depends on the pattern. By the end of Y3, the cumulative DEEP count reaches roughly 35.
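Of the patterns listed above, snapshot-plus-delta is the easiest to sketch in a few lines. The toy below is an assumption-laden illustration, not Iceberg's API: `ToyTable` is a hypothetical class that models a table as a list of immutable snapshots, where each commit applies a delta (files added, files removed) to the current snapshot. Real Iceberg tracks manifests and metadata files rather than plain sets, but the time-travel and rollback behavior has the same shape.

```python
class ToyTable:
    """Toy table: each commit appends an immutable snapshot (a set of data files)."""

    def __init__(self) -> None:
        self._snapshots: list[frozenset[str]] = [frozenset()]  # snapshot 0 is empty
        self._current = 0

    def commit(self, added=(), removed=()) -> int:
        """Apply a delta to the current snapshot and return the new snapshot ID."""
        files = (self.files() | frozenset(added)) - frozenset(removed)
        self._snapshots.append(files)
        self._current = len(self._snapshots) - 1
        return self._current

    def files(self, snapshot_id=None) -> frozenset:
        """Read the current snapshot, or time-travel to an older one."""
        sid = self._current if snapshot_id is None else snapshot_id
        if not 0 <= sid < len(self._snapshots):
            raise KeyError(f"unknown snapshot {sid}")
        return self._snapshots[sid]

    def rollback(self, snapshot_id: int) -> None:
        """Point the table back at an older snapshot; nothing is deleted."""
        self.files(snapshot_id)  # validates the ID
        self._current = snapshot_id


table = ToyTable()
s1 = table.commit(added=["a.parquet"])
s2 = table.commit(added=["b.parquet"])
s3 = table.commit(removed=["a.parquet"])

print(sorted(table.files()))    # ['b.parquet']
print(sorted(table.files(s2)))  # ['a.parquet', 'b.parquet']
table.rollback(s1)
print(sorted(table.files()))    # ['a.parquet']
```

Because old snapshots are never mutated, rollback is just a pointer move, and a query pinned to `s2` keeps seeing the same files regardless of later commits.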

The Studio composition recipe that lands this year

“Homelab life API” — the Y3 composition recipe:

GitHub events → Airflow (P17) → Iceberg lakehouse (P15)
                                         ↓
                              Trino (P18) → REST API → portal command palette (Y5)

Every component lands across Y3 phases. By Y3 end, you can ask “how many commits did I make to mlship in March 2027?” and get an answer that ran through your own platform end-to-end.
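As a rough sketch of the question above, the Python below runs the commit count against an in-memory SQLite database, used here purely as a stand-in for Trino over Iceberg. The `github_commits` table, its columns, and the sample rows are all hypothetical; the real schema depends on how the Airflow job lands the events.

```python
import sqlite3

# Half-open time range keeps month boundaries unambiguous.
QUERY = """
SELECT COUNT(*)
FROM github_commits
WHERE repo = ?
  AND committed_at >= ?
  AND committed_at < ?
"""


def commits_in_range(conn: sqlite3.Connection, repo: str, start: str, end: str) -> int:
    """Count commits for one repo with start <= committed_at < end (ISO-8601 strings)."""
    row = conn.execute(QUERY, (repo, start, end)).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE github_commits (sha TEXT PRIMARY KEY, repo TEXT, committed_at TEXT)"
)
# Made-up sample rows, including edge cases on both month boundaries.
conn.executemany(
    "INSERT INTO github_commits VALUES (?, ?, ?)",
    [
        ("a1", "mlship", "2027-02-28T23:59:00"),
        ("b2", "mlship", "2027-03-01T09:00:00"),
        ("c3", "mlship", "2027-03-31T22:15:00"),
        ("d4", "basecamp", "2027-03-10T12:00:00"),
        ("e5", "mlship", "2027-04-01T00:00:00"),
    ],
)

print(commits_in_range(conn, "mlship", "2027-03-01", "2027-04-01"))  # 2
```

The same half-open range predicate carries over to a Trino query with timestamp literals, and the REST API in the recipe would be a thin wrapper around a function like `commits_in_range`.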

This is also the first explicit composition recipe documented as a runnable example in basecamp/examples/. The other 4 recipes (RAG, AI-incident-triage, train+register+deploy, AI on-call) land in Y4/Y5 — but they all build on the lakehouse + serving layer this year stands up.

Cloud requirements

Year 3 cloud spend: $0
  All work runs on the homelab + cloud accounts you already have from Y2.
  Year 3 is a "deepen what you have" year, not a "spread to new clouds" year.

The exception: if your homelab can’t fit MinIO + Iceberg + Spark + Airflow + Trino simultaneously (it probably can with 64GB after the Month-25 upgrade), you’ll need to either upgrade or temporarily lean on cloud resources for one workload. Plan ahead.

Hardware milestone: Month 25 upgrade

RAM:     32GB → 64GB DDR5 (sell 32GB kit for ~$40)
Cost:    ~$150-200 net
Why:     Year 3 stack is RAM-hungry. MinIO + Postgres + Spark + Airflow + Trino
         + Loki + Prometheus + dbt + JupyterHub does not fit in 32GB.

Schedule the upgrade for week 1 of Phase 14 — observability tooling alone wants ~12GB. Full SKU breakdown and sourcing notes: homelab/hardware.

Reading order

This index — orient on the year’s arc and what crosses the public-release line.
phase-14 — observability is the foundation everything else stands on. eBPF + OTel + cardinality discipline before anything else gets added to the platform.
phase-15 — the lakehouse is the data-layer foundation. JupyterHub lands here too, since notebooks are how you’ll actually exercise the lakehouse.
phase-16, phase-17, phase-18 in order — stream, then batch, then serving. They each consume the lakehouse from P15.
phase-19 — capstone + basecamp public release. Governance is what makes the public release responsible rather than reckless.
final-exam ~2 weeks before end of P19.

DDIA Ch. 10 (Batch Processing) and Ch. 11 (Stream Processing) are Year 3’s reading spine. Pace 1 chapter / 4 weeks alongside the phase work.

Year 3 graduation

You can:
- Design + operate observability for a multi-cluster platform (incl. eBPF kernel-level events)
- Architect + run a lakehouse (MinIO + Iceberg)
- Build streaming pipelines (Redpanda + Flink) with exactly-once-ish semantics
- Build batch pipelines (Spark + Airflow + dbt) with backfill discipline
- Federate queries across data sources (Trino) + serve to analysts (Superset)
- Govern data (catalog, lineage, access control, quality tests)
- Run notebooks-as-a-service (JupyterHub on the platform)
- Operate a public OSS platform (basecamp now public)

Exit ramp: Senior DevOps / Data Platform Engineer / Site Reliability Engineer
Confidence: ~35 patterns DEEP, multi-cloud platform operational, basecamp public,
            personal-api running on your own platform

→ program/year-4/index.md — ML & AI Infrastructure builds Tiers 5/6/7 on top of the data tier you just stood up.