lg
LLM Routing

llm-gateway

One endpoint. Every model.

llm-gateway is the LLM proxy that fronts every model call in basecamp. Uniform request shape (OpenAI-compatible), per-caller rate limiting, per-model cost tracking, and pluggable routing to any provider or a local vLLM instance.

Capabilities

What you can do

01

OpenAI-compatible interface

Any OpenAI SDK client works. No provider-specific SDKs needed.

02

Multi-provider routing

Anthropic, OpenAI, Cohere, local vLLM — pick per request or per caller.

03

Rate limits per caller

Prevent one runaway agent from burning through the org's daily budget.

04

Cost tracking

Every token attributed. Slack alert when a caller crosses a threshold.

05

Streaming everywhere

Server-sent events end-to-end. No buffering, no head-of-line blocking.

06

Prompt caching

Anthropic-style prompt caching passed through for the providers that support it.

Explore the rest of /root

Get started with llm-gateway

Clone the repo, read the plan, and start building your own version.

All projects