Now in public beta

The fast lane to
production AI

Gateway, SDK, and observability for LLM-powered applications. Rate limiting, caching, cost tracking, and deep tracing — all in one.

99.9%
Uptime SLA
2 SDKs
Python & TS
3 modes
Gateway, SDK, both
Providers: OpenAI · Anthropic · Google · Mistral (soon) · Groq (soon)

Integration

Choose your integration mode

Start simple, add depth when you need it.

Gateway Only

Zero code changes

Point your existing SDK to our gateway. Instant observability, caching, and rate limiting.

  • Universal LLM proxy
  • Response caching
  • Rate limiting
  • Cost tracking

SDK Only

Deep tracing

Wrap your LLM clients for detailed traces and spans. Background batching keeps tracing off the request path, adding near-zero latency.

  • Trace hierarchies
  • @observe decorators
  • Span nesting
  • Background batching
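To illustrate why background batching adds near-zero latency: recording a span is just an in-memory append, and export happens later in a batch. The sketch below is a minimal illustration of that pattern, not the actual Muxx SDK — every name here (`SpanBatcher`, `traced`) is hypothetical.

```typescript
// Minimal sketch of a background-batching span exporter.
// All names are illustrative, not the real Muxx SDK API.

type Span = { name: string; startMs: number; endMs: number };

class SpanBatcher {
  private buffer: Span[] = [];

  constructor(
    private flushFn: (spans: Span[]) => void,
    private maxBatch = 100,
  ) {}

  // Recording a span is just an array push: no network I/O
  // on the request path.
  record(span: Span): void {
    this.buffer.push(span);
    if (this.buffer.length >= this.maxBatch) this.flush();
  }

  // Drain the buffer and hand it to the exporter. In a real SDK
  // this would also run periodically on a background timer.
  flush(): void {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.flushFn(batch);
  }
}

// Wrap any async function so it emits a span when it settles.
function traced<T>(
  batcher: SpanBatcher,
  name: string,
  fn: () => Promise<T>,
): Promise<T> {
  const startMs = Date.now();
  return fn().finally(() =>
    batcher.record({ name, startMs, endMs: Date.now() }),
  );
}
```

The caller's promise resolves as soon as `fn` does; span export cost is paid later, off the hot path.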

SDK + Gateway

Full power

Combine SDK tracing with gateway features. Deep observability AND infrastructure benefits.

  • All features combined
  • Unified logging
  • Cache hit tracking
  • Gateway timing
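One way to picture "unified logging" in the combined mode: the SDK measures total request time client-side, the gateway reports its own timing and cache status, and the two are merged into one record. The sketch below shows that shape only — the field and function names are assumptions, not the Muxx API.

```typescript
// Illustrative sketch: merging SDK-side timing with gateway-reported
// metadata into one unified log record. All names are hypothetical.

type GatewayMeta = { cacheHit: boolean; gatewayMs: number };

type UnifiedRecord = {
  name: string;
  totalMs: number;   // measured client-side by the SDK wrapper
  gatewayMs: number; // reported by the gateway
  networkMs: number; // remainder: time spent outside the gateway
  cacheHit: boolean;
};

async function observe<T>(
  name: string,
  fn: () => Promise<{ result: T; meta: GatewayMeta }>,
  log: (rec: UnifiedRecord) => void,
): Promise<T> {
  const start = Date.now();
  const { result, meta } = await fn();
  const totalMs = Date.now() - start;
  log({
    name,
    totalMs,
    gatewayMs: meta.gatewayMs,
    networkMs: Math.max(0, totalMs - meta.gatewayMs),
    cacheHit: meta.cacheHit,
  });
  return result;
}
```

With both halves in one record, a cache hit shows up as low gateway time on the same trace that carries the client-side span.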

Features

Everything for production AI

Universal Gateway

Single endpoint for all LLM providers

Response Caching

Semantic caching reduces costs

Rate Limiting

Per-user and per-key limits

Deep Tracing

Hierarchical traces and spans

Cost Tracking

Real-time cost per request

Live Logs

Search and filter instantly
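As a rough picture of how semantic caching can reduce costs: instead of requiring an exact prompt match, the cache returns a stored response when a new prompt's embedding is close enough to a previous one. The sketch below assumes embeddings are already computed; the similarity threshold and class names are stand-ins, not Muxx internals.

```typescript
// Sketch of a semantic cache: reuse a response when a new prompt's
// embedding is sufficiently similar to one already seen.
// Threshold and names are illustrative assumptions.

type Entry = { embedding: number[]; response: string };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

class SemanticCache {
  private entries: Entry[] = [];
  constructor(private threshold = 0.95) {}

  // Return the best match at or above the threshold, if any.
  get(embedding: number[]): string | undefined {
    let best: Entry | undefined;
    let bestScore = this.threshold;
    for (const e of this.entries) {
      const score = cosine(embedding, e.embedding);
      if (score >= bestScore) {
        bestScore = score;
        best = e;
      }
    }
    return best?.response;
  }

  set(embedding: number[], response: string): void {
    this.entries.push({ embedding, response });
  }
}
```

A near-duplicate prompt ("What's the capital of France?" vs. "Capital of France?") lands near the cached embedding and skips the provider call entirely, which is where the cost savings come from.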

Code Examples

Integrate in seconds

import OpenAI from "openai";

// Just change the base URL - that's it!
const client = new OpenAI({
  baseURL: "https://gateway.muxx.dev/v1",
  apiKey: process.env.MUXX_API_KEY,
});

// Your existing code works instantly
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

Try it

See Muxx in action

Run the same request twice to see caching in action: the playground reports latency, tokens, cost, and cache status for each response.

Ready to ship AI with confidence?

Join developers who trust Muxx in production.

Free tier: 10,000 requests/month. No credit card required.