Platform Portal + Governance: Abukix Studio launches

Fourth phase of Year 5. The unified Web UI that ties every layer together. Abukix Studio launches publicly at studio.abukix.dev. Plus AI security: prompt injection defense, output filtering, signed model verification. ~10 weeks, ~120 hrs.

Phase 29 is the public-launch phase of Year 5. P26, P27, and P28 built the agent layer; P29 wraps it (and every prior tier) in a coherent product surface and ships it as a public artifact. That means three launches inside one 10-week window: Abukix Studio at studio.abukix.dev, platform-ctl going public after living private since Year 2, and the services/aiops/ marketing push. See brand/identity for how Studio and the program are positioned together.

The frame the Master Plan sets for this phase: the portal is integration, not invention. Years 1-4 built the components; this phase wires them into something a stranger can land on, poke at, and conclude that its builder reasons in patterns. The other half of P29 is AI security (prompt injection defense, output filtering, capability allowlisting, signed model verification), because launching an LLM-fronted public surface without OWASP-LLM-Top-10 controls is how you turn a brand into a postmortem.

By phase end, two patterns reach DEEP: security/zero-trust-security (applied to the AI surface) and security/threat-modeling (applied to the agent attack surface). A third, infrastructure-and-platform/platform-as-product, is reinforced (the proof Studio is a product, not a stack). See patterns/ml-and-ai/ for the framing on AI security as a discipline.

Prerequisites

Phase 28 complete — services/aiops/ live in production

basecamp Tier 1-9 all operational

You accept: the portal is integration, not invention. Year 1-4 built the components; this phase wires them into a coherent product. AI security is co-equal — applied to the LLM surface with the same rigor as Y2 platform security.

Why this phase exists

By the end of Year 5 you exit as a Staff/Principal AI Platform Engineer. The portal is the artifact that proves “I can architect AND ship an internal platform product end-to-end.” This phase ships Abukix Studio as a public surface.

Plus: AI security is the discipline that keeps the Year 5 launches from embarrassing you. Prompt injection defense, output filtering, capability allowlisting (already in place from P27), and signed model verification are the OWASP-LLM-Top-10 controls applied to your platform.

1. PROBLEM

basecamp has 30+ services across 9 tiers. Users (you + future-collaborators + visitors at studio.abukix.dev) need:

One UI surfacing everything (services, models, agents, pipelines, alerts)
Self-service workflows (create service, run pipeline, deploy model — already enabled by Backstage + platform-ctl; surface in portal)
Composition recipes runnable via UI (the recipes documented in Y3-Y4, plus this phase’s recipe 5)
Governance (policies, approval, audit)
AI security (prompt injection, model exfil, MCP server sandboxing)

Abukix Studio is the answer.

2. PRINCIPLES

2.1 Portal as platform UX

Backstage (from Year 2) is the catalog. The portal extends Backstage with:

Cloud view (Proxmox + EKS + GKE health)
Infrastructure view (basecamp ArgoCD state)
Studio view — the user-visible product (notebooks, ML, agents, composition recipes)
command palette AI assistant (uses query-helper agent from P26 + basecamp-mcp from P27)

→ Pattern: platform-as-product (reinforced — Studio is the product)

Investigate:

Map Abukix Studio’s structure: Cloud / Infrastructure / Studio (notebooks/ML/agents) / command palette
Build the homelab equivalent of an enterprise developer portal pattern
What does a visitor at studio.abukix.dev see? Define the public-vs-private surface

2.2 Approval workflows

For sensitive operations (destructive actions, policy exceptions, prompt-store changes), require approval.

Investigate:

Build approval workflow: request → reviewer assigned → approve/deny → execute
Wire to platform-ctl: high-risk ops route through approval
Wire to aiops: destructive runbook steps route through approval (already from P28)
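
The request → reviewer assigned → approve/deny → execute flow above can be sketched as a small state machine. This is a minimal illustration, not the platform-ctl or aiops implementation: the class name ApprovalRequest, the State values, and the in-memory audit list are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class State(Enum):
    REQUESTED = "requested"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    DENIED = "denied"
    EXECUTED = "executed"


# Legal transitions: request -> reviewer assigned -> approve/deny -> execute.
TRANSITIONS = {
    State.REQUESTED: {State.IN_REVIEW},
    State.IN_REVIEW: {State.APPROVED, State.DENIED},
    State.APPROVED: {State.EXECUTED},
    State.DENIED: set(),
    State.EXECUTED: set(),
}


@dataclass
class ApprovalRequest:
    """Hypothetical approval record; names are illustrative only."""
    requester: str
    operation: str
    state: State = State.REQUESTED
    reviewer: Optional[str] = None
    audit: list = field(default_factory=list)

    def _move(self, new_state: State, actor: str) -> None:
        # Reject any transition that is not in the table above.
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}"
            )
        self.audit.append((self.state.value, new_state.value, actor))
        self.state = new_state

    def assign(self, reviewer: str) -> None:
        self._move(State.IN_REVIEW, reviewer)
        self.reviewer = reviewer

    def decide(self, reviewer: str, approve: bool) -> None:
        if reviewer != self.reviewer:
            raise ValueError("only the assigned reviewer can decide")
        self._move(State.APPROVED if approve else State.DENIED, reviewer)

    def execute(self, action: Callable[[], str]) -> str:
        # The action runs only if the request is currently APPROVED.
        self._move(State.EXECUTED, self.requester)
        return action()
```

Because execute() goes through the same transition table, a high-risk operation cannot run from REQUESTED, IN_REVIEW, or DENIED, and every state change leaves an audit entry that the governance view could surface.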

2.3 AI security: the OWASP LLM Top 10 applied

→ Pattern: zero-trust-security (DEEP for AI surface) → Pattern: threat-modeling (DEEP — apply to the agent attack surface)

The 10 LLM threats (named and ordered as in the 2023 v1.1 list; the 2025 edition renames and reorders several): prompt injection, insecure output handling, training data poisoning, model DoS, supply chain, sensitive info disclosure, insecure plugin design, excessive agency, overreliance, model theft.

Investigate:

Read OWASP Top 10 for LLM Applications (current version)
Map each to a portal-level control:
1. Prompt injection — input filter on llm-gateway (regex + classifier)
2. Insecure output — output filter (PII redaction, code-execution prevention)
3. Training data poisoning — verify model checksums on load
4. Model DoS — rate limit + token budget (already from Y4 P24)
5. Supply chain — Cosign-verify model weights (extends Y2 P12)
6. Sensitive info disclosure — output redaction; audit logs
7. Insecure plugin/MCP — capability allowlist (P27)
8. Excessive agency — approval gates (P28)
9. Overreliance — eval + human-in-loop discipline
10. Model theft — auth + rate limit + watermarking (preview)
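
Three of the controls in the mapping above (1, 2/6, and 3) are small enough to sketch directly. The following is a rough illustration under stated assumptions, not the llm-gateway code: the function names, the three injection phrasings, and the two PII patterns are hypothetical and deliberately narrow. The regex screen covers only the "regex" half of control 1 (the classifier is out of scope here), and a pinned SHA-256 digest is a weaker check than the Cosign signature verification named in control 5.

```python
import hashlib
import re

# Illustrative injection phrasings only; a real deny-list would be broader
# and paired with a classifier, as control 1 describes.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.I),
    re.compile(r"reveal\s+(your\s+)?(system|hidden)\s+prompt", re.I),
    re.compile(r"you\s+are\s+now\s+in\s+developer\s+mode", re.I),
]

# Narrow PII patterns for the sketch: email addresses and IPv4 literals.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def screen_prompt(prompt: str) -> bool:
    """Return True when the prompt passes the regex screen (control 1)."""
    return not any(p.search(prompt) for p in INJECTION_PATTERNS)


def redact_pii(text: str) -> str:
    """Mask emails and IPv4 addresses in model output (controls 2 and 6)."""
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    return IPV4.sub("[REDACTED_IP]", text)


def checksum_matches(path: str, expected_sha256: str) -> bool:
    """Compare a file's SHA-256 against a pinned digest (control 3)."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256.lower()
```

In a gateway these would run at different points: screen_prompt before the request reaches the model, redact_pii on the response path, and checksum_matches once at model load, with a failed check refusing to serve.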

2.4 Service registry + ownership at scale

Every service has: owner, tier, dependencies, SLO, runbook, on-call, deployment lineage.

Investigate:

Extend Backstage catalog with custom plugins for ML/agents
Surface in portal: “who owns this service?” answerable in 1 click
Lineage view: “what depends on llm-gateway?” — graph view
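
The lineage question ("what depends on llm-gateway?") is a reverse traversal of the dependency graph. A minimal sketch, assuming the catalog can be exported as a mapping from each service to the services it depends on; the edges in DEPENDS_ON below are invented for illustration and do not describe the real basecamp topology.

```python
from collections import defaultdict, deque

# Hypothetical catalog slice: service -> services it depends on.
DEPENDS_ON = {
    "llm-gateway": [],
    "query-helper": ["llm-gateway"],
    "basecamp-mcp": ["query-helper"],
    "aiops": ["llm-gateway", "basecamp-mcp"],
    "home-dash": ["aiops"],
}


def dependents_of(target: str, depends_on: dict[str, list[str]]) -> list[str]:
    """Return every service that depends on target, directly or transitively."""
    # Invert the edges so we can walk from a dependency to its consumers.
    reverse = defaultdict(set)
    for service, deps in depends_on.items():
        for dep in deps:
            reverse[dep].add(service)

    # Breadth-first walk over the inverted graph; `seen` also guards cycles.
    seen: set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in reverse[current]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    seen.discard(target)  # a cycle could otherwise list the target itself
    return sorted(seen)


print(dependents_of("llm-gateway", DEPENDS_ON))
# -> ['aiops', 'basecamp-mcp', 'home-dash', 'query-helper']
```

The sorted list is the data behind the graph view; the same inverted map answers the blast-radius question before a risky change to a shared service.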

2.5 The hosted demo at studio.abukix.dev

The public-facing surface. Visitors (limited, free, rate-limited) can:

Run a demo notebook (preloaded data, sandboxed K8s namespace, 10-min timeout)
Try a small RAG demo over public sample data
Deploy a tiny model via mlship (preview only — read-only execution)
Talk to command palette agent (read-only, sandboxed tools)

CPU-only models (Phi-3-mini quantized via llama.cpp). No GPU. Cost cap: $30-50/month.

Investigate:

Cloudflare Pages for the static frontend
Cloud Run / Fargate for API tier (small, scale-to-zero)
Per-IP rate limit
Demo data (sanitized) seeded from your real abukix.commits + sample weekly logs
“Sign up to clone basecamp and run your own” CTA at the bottom
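
The per-IP rate limit in the list above is commonly built as a token bucket. The sketch below is one possible shape, not the Studio implementation: the class name PerIpRateLimiter, the rate and burst parameters, and the injectable clock are assumptions made for the example.

```python
import time
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: float
    updated: float


class PerIpRateLimiter:
    """Token bucket per client IP: `burst` requests up front, then `rate` per second."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock  # injectable so tests can control time
        self.buckets: dict[str, Bucket] = {}

    def allow(self, ip: str) -> bool:
        now = self.clock()
        # A new IP starts with a full bucket.
        bucket = self.buckets.setdefault(
            ip, Bucket(tokens=float(self.burst), updated=now)
        )
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = now - bucket.updated
        bucket.tokens = min(self.burst, bucket.tokens + elapsed * self.rate)
        bucket.updated = now
        if bucket.tokens >= 1:
            bucket.tokens -= 1
            return True
        return False
```

One design caveat: this keeps state in process memory, which is lost whenever a scale-to-zero API tier spins down and is not shared across instances. For the public demo the buckets would likely need to live in a shared store or be enforced at the edge.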

3. TRADE-OFFS

Decision	Option A	Option B	When
Frontend	Backstage extended	Next.js custom	Both
Approval	Backstage TaskRunner	GitHub PR	custom workflow
Audit	Loki (logs)	Postgres (queryable)	Both
Demo hosting	Cloudflare Pages + Workers	Self-host on basecamp	Cloudflare for the frontend; basecamp for the demo backend

4. TOOLS (as of 2025-10)

Backstage (extended, from Y2)
Next.js + Tailwind CSS (custom Studio surface)
Cloudflare Pages (frontend hosting)
OPA Gatekeeper OR Kyverno (policy enforcement; already from Y2)

5. MASTERY

5.1 Reading list

Required	Why
Backstage Plugin documentation (if extending)	The implementation
OWASP LLM Top 10 (current version)	The threats
NIST AI Risk Management Framework	The discipline
Anthropic’s Acceptable Use Policy + Trust & Safety blog	The pattern

5.2 Operational depth checklist

[ ] Abukix Studio frontend (Next.js + Tailwind) deployed to Cloudflare Pages
[ ] Backstage extended with custom Studio plugin (ML + agents views)
[ ] Portal home: surfaces overall platform health, recent runs, alerts
[ ] Per-service page: runbook + on-call + SLO + recent deploys (Backstage already does this)
[ ] ML/agents view: registered models, deployed agents, current health
[ ] Approval workflow: request → review → execute
[ ] AI security: prompt injection detection on llm-gateway input
[ ] AI security: output filter (PII redaction)
[ ] AI security: signed model weight verification on load
[ ] AI security: per-agent capability + audit log review
[ ] command palette agent in portal (uses query-helper from P26 via basecamp-mcp from P27)
[ ] Composition recipe 5 (AI-assisted on-call) lands as runnable example
[ ] home-dash personal service: pulls from triage + personal-api + notes-rag + aiops; surfaces in command palette
[ ] studio.abukix.dev public hosted demo: read-mostly, rate-limited, $30-50/month cap
[ ] platform-ctl: goes PUBLIC (with sanitized internal references)
[ ] Service registry with owner, tier, SLO, runbook per service

5.3 The Year 5 launches

This phase has THREE public launches inside the same 10-week window:

1. Abukix Studio (studio.abukix.dev)
   - Hosted demo of the platform
   - Blog post: "Abukix Studio — what 5 years built"
   - LinkedIn announcement
   - Show HN ("the homelab AI platform you can clone or visit")
   - Demo video

2. platform-ctl (github.com/abukix/platform-ctl)
   - Goes from private to public
   - README + 5 example workflows
   - Blog post: "platform-ctl: one CLI for the platform"
   - LinkedIn announcement

3. services/aiops/ documentation push
   - Already public via basecamp; this is the marketing
   - Blog post: "AIOps in practice: an agent that operates the platform you built"
   - LinkedIn announcement

Three distinct launches in 10 weeks. Each gets its own day; no collisions.

These set the stage for the big one in P30 — mlship v2 + pattern paper.

6. COMPARE: Backstage extension vs Next.js custom

If you start fresh in Next.js you get UX freedom but lose the Backstage catalog. If you extend Backstage you keep the catalog but accept its UX.

For Abukix Studio: 600 words, defend the choice. The likely answer is both — Backstage for the internal/operator surface (catalog, lineage, approvals) and a Next.js Studio shell for the public-facing demo at studio.abukix.dev. Argue the boundary.

7. OPERATE

4+ runbooks (portal-incident, ai-security-event-investigation, studio-demo-cost-spike, cmd-k-agent-misfire)
2+ postmortems
Weekly log

8. CONTRIBUTE

Backstage plugins (custom MLflow plugin?), MCP-server-discovery extensions, OWASP LLM resources.

Validation criteria

[ ] All 16 operational depth checks
[ ] Portal live with: Home / Catalog / ML+Agents / Approvals / Studio (public) / command palette
[ ] AI security controls verified (block prompt injection, redact PII, verify signed weights)
[ ] studio.abukix.dev public + cost-capped
[ ] platform-ctl public with launch
[ ] aiops marketing push done
[ ] home-dash personal service running
[ ] Composition recipe 5 (AI-assisted on-call) runnable
[ ] 4+ runbooks; 2+ postmortems
[ ] All security patterns DEEP:
    - zero-trust-security, threat-modeling
[ ] Exit Test passed

Exit Test

Time: 3 hours.

Build (90 min): add a “cost-per-model” view to the portal, aggregated across all KServe deployments. Pull from Prometheus + MLflow. Surface in Studio.
AI security exercise (60 min): defend llm-gateway against 5 attempted prompt injections (provided). Block all 5 + document each.
Articulate (30 min): 600 words: “Walk through the security stack of llm-gateway: from user prompt through to model response. What controls fire where?”

Anti-patterns

Anti-pattern	Why
Studio launch without rate limits	$1000+ cloud bill in week 1
AI security retrofitted after launch	Pre-launch is when threats are cheap to fix
Custom frontend “for design freedom” with no design budget	Worse UX than Backstage default
Approval workflow no one reviews	“Approval” becomes a rubber stamp
Forgetting to lock down studio.abukix.dev demo to CPU-only	GPU spam empties wallet

Patterns deepened this phase

platform-as-product → reinforced
zero-trust-security → DEEP
threat-modeling → DEEP

→ Next: Phase 30: Capstone — mlship v2 + Pattern Paper