AI Operations

aiops

AI-augmented on-call. Not autonomous chaos.

aiops applies LLMs to the tedious, rules-based parts of operations work: incident summarization, log clustering, runbook selection, and alert-quality assessment. Every action is auditable, and nothing runs without human approval.

Capabilities

What you can do

01

Incident summaries

When triage pages, aiops generates a first-cut summary from the trace and logs.

02

Log clustering

Similar errors get grouped. You see 5 patterns, not 500 lines.
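One simple way to get this kind of grouping is to mask the variable parts of each line (numbers, hex values) and group lines by the resulting template. The sketch below is an illustration only, not the aiops implementation: the masking rules and the names `template_of` and `cluster_logs` are assumptions, and a real system may well use embeddings or an LLM instead.

```python
import re
from collections import defaultdict

# Hypothetical masking rules: swap variable-looking tokens for placeholders.
_MASKS = [
    (re.compile(r"0x[0-9a-fA-F]+"), "<HEX>"),
    (re.compile(r"\b\d+(\.\d+)*\b"), "<NUM>"),
]

def template_of(line: str) -> str:
    """Reduce a log line to a template by masking numbers and hex values."""
    for pattern, placeholder in _MASKS:
        line = pattern.sub(placeholder, line)
    return line.strip()

def cluster_logs(lines):
    """Group lines by template; return (template, lines) pairs, largest group first."""
    clusters = defaultdict(list)
    for line in lines:
        clusters[template_of(line)].append(line)
    return sorted(clusters.items(), key=lambda kv: len(kv[1]), reverse=True)
```

Calling `cluster_logs` on a batch of raw lines returns one entry per pattern, so a reviewer can read the templates first and expand a group only when needed.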

03

Runbook retrieval

Given an alert, aiops finds the closest runbook and suggests it as a starting point.
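As a rough illustration of "closest runbook", the sketch below scores each runbook against the alert text with bag-of-words cosine similarity. This is an assumed stand-in technique, not necessarily what aiops does (an embedding model is a likely alternative), and `closest_runbook` and the runbook dictionary shape are hypothetical.

```python
import math
import re
from collections import Counter

def _tokens(text):
    """Lowercase word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def _cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def closest_runbook(alert_text, runbooks):
    """runbooks maps name -> text. Returns (name, score), or (None, 0.0) if nothing overlaps."""
    alert_vec = _tokens(alert_text)
    best_name, best_score = None, 0.0
    for name, text in runbooks.items():
        score = _cosine(alert_vec, _tokens(text))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

The result is only a suggestion: returning the score alongside the name lets the on-call engineer see how weak or strong the match is before following the runbook.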

04

Alert quality

After each incident, aiops rates the alerts: which fired usefully, which were noise.

05

Fully audited

Every LLM call is logged with input, output, model, and cost. Reviewable per incident.
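A minimal sketch of what such an audit record could look like, assuming a JSON-lines log. The four logged fields (input, output, model, cost) come from the description above; the field names, the `incident_id` and `timestamp` fields, and the helper functions are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass(frozen=True)
class LLMCallRecord:
    incident_id: str   # assumed key for per-incident review
    model: str
    input: str
    output: str
    cost_usd: float
    timestamp: str     # assumed; UTC, ISO 8601

def log_llm_call(stream, incident_id, model, prompt, output, cost_usd):
    """Append one record to a writable text stream as a JSON line."""
    record = LLMCallRecord(
        incident_id=incident_id,
        model=model,
        input=prompt,
        output=output,
        cost_usd=cost_usd,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    stream.write(json.dumps(asdict(record)) + "\n")
    return record

def calls_for_incident(lines, incident_id):
    """Filter JSON-line records down to a single incident."""
    return [r for r in map(json.loads, lines) if r["incident_id"] == incident_id]
```

An append-only, one-record-per-line log keeps each call reviewable on its own and makes per-incident filtering a simple scan.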

06

No autonomous action

aiops proposes. A human accepts. There is no auto-remediation, by design.
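The propose-then-accept rule can be enforced in code rather than by convention. The sketch below is one assumed shape for that gate, not the actual aiops design: `Proposal`, `approve`, `run`, and `NotApprovedError` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional

class NotApprovedError(RuntimeError):
    """Raised when a proposal is run before a human has accepted it."""

@dataclass
class Proposal:
    description: str
    action: Callable[[], str]
    approved_by: Optional[str] = None

    def approve(self, reviewer: str) -> None:
        """Record which human accepted the proposal."""
        if not reviewer.strip():
            raise ValueError("reviewer is required")
        self.approved_by = reviewer

    def run(self) -> str:
        """Execute the action only if a reviewer has approved it."""
        if self.approved_by is None:
            raise NotApprovedError(f"not approved: {self.description}")
        return self.action()
```

Because `run` refuses to execute until `approved_by` is set, there is no code path from suggestion to action that skips the human, and the reviewer's name is available for the audit trail.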


Get started with aiops

Clone the repo, read the plan, and start building your own version.
