Universal LLM API Proxy & Resilience Library

One proxy. Any LLM provider. Zero code changes.

A self-hosted proxy that provides OpenAI- and Anthropic-compatible API endpoints for all your LLM providers. It works with any application that supports a custom OpenAI or Anthropic base URL, including Claude Code and Opencode, with no code changes required in your existing tools.

This project consists of two components:

The API Proxy — A FastAPI application providing universal /v1/chat/completions (OpenAI) and /v1/messages (Anthropic) endpoints
The Resilience Library — A reusable Python library for intelligent API key management, rotation, and failover

Why Use This?

Universal Compatibility — Works with any app supporting OpenAI or Anthropic APIs: Claude Code, Opencode, Continue, Roo/Kilo Code, Cursor, JanitorAI, SillyTavern, custom applications, and more
One Endpoint, Many Providers — Configure Gemini, OpenAI, Anthropic, and any LiteLLM-supported provider once. Access them all through a single API key
Anthropic API Compatible — Use Claude Code or any Anthropic SDK client with non-Anthropic providers like Gemini, OpenAI, or custom models
Built-in Resilience — Automatic key rotation, failover on errors, rate limit handling, and intelligent cooldowns
Exclusive Provider Support — Includes custom providers not available elsewhere, including Gemini CLI

Quick Start

Windows

Download the latest release from GitHub Releases
Unzip the downloaded file
Run proxy_app.exe — the interactive TUI launcher opens

macOS / Linux

# Download and extract the release for your platform
chmod +x proxy_app
./proxy_app

Docker

Using the pre-built image (recommended):

# Pull and run directly
docker run -d \
  --name llm-api-proxy \
  -p 8000:8000 \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/oauth_creds:/app/oauth_creds \
  -v $(pwd)/logs:/app/logs \
  -v $(pwd)/usage:/app/usage \
  -e SKIP_OAUTH_INIT_CHECK=true \
  ghcr.io/mirrowel/llm-api-key-proxy:latest

Using Docker Compose:

# Create your .env file and usage directory first, then:
cp .env.example .env
mkdir usage
docker compose up -d

Important: Create the usage/ directory before running Docker Compose so usage stats persist on the host.

Note: For OAuth providers, complete authentication locally first using the credential tool, then mount the oauth_creds/ directory or export credentials to environment variables.

From Source

git clone https://github.com/Mirrowel/LLM-API-Key-Proxy.git
cd LLM-API-Key-Proxy
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
python src/proxy_app/main.py

Tip: Running with command-line arguments (e.g., --host 0.0.0.0 --port 8000) bypasses the TUI and starts the proxy directly.

Connecting to the Proxy

Once the proxy is running, configure your application with these settings:

Setting	Value
Base URL / API Endpoint	`http://127.0.0.1:8000/v1`
API Key	Your `PROXY_API_KEY`

Model Format: `provider/model_name`

Important: Models must be specified in the format provider/model_name. The provider/ prefix tells the proxy which backend to route the request to.

gemini/gemini-2.5-flash          ← Gemini API
openai/gpt-4o                    ← OpenAI API
anthropic/claude-3-5-sonnet      ← Anthropic API
openrouter/anthropic/claude-3-opus  ← OpenRouter
gemini_cli/gemini-2.5-pro        ← Gemini CLI (OAuth)
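
The routing rule above can be illustrated with a short sketch. It assumes the proxy splits on the first slash only, so anything after the provider prefix (including further slashes, as in the OpenRouter example) is passed through as the model name. `split_model` is a hypothetical helper for illustration, not part of the proxy's API.

```python
def split_model(model: str) -> tuple[str, str]:
    """Split 'provider/model_name' into (provider, model_name)."""
    # partition() splits on the first "/" only, so nested names survive intact.
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model_name', got {model!r}")
    return provider, name
```

A client can run a check like this before sending a request, which catches the common mistake of omitting the provider prefix.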

Usage Examples

Python (OpenAI Library)

from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",
    api_key="your-proxy-api-key"
)

response = client.chat.completions.create(
    model="gemini/gemini-2.5-flash",  # provider/model format
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

curl

curl -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-proxy-api-key" \
  -d '{
    "model": "gemini/gemini-2.5-flash",
    "messages": [{"role": "user", "content": "What is the capital of France?"}]
  }'

JanitorAI / SillyTavern / Other Chat UIs

Go to API Settings
Select "Proxy" or "Custom OpenAI" mode
Configure:
- API URL: http://127.0.0.1:8000/v1
- API Key: Your PROXY_API_KEY
- Model: provider/model_name (e.g., gemini/gemini-2.5-flash)
Save and start chatting

Continue / Cursor / IDE Extensions

In your configuration file (e.g., config.json):

{
  "models": [
    {
      "title": "Gemini via Proxy",
      "provider": "openai",
      "model": "gemini/gemini-2.5-flash",
      "apiBase": "http://127.0.0.1:8000/v1",
      "apiKey": "your-proxy-api-key"
    }
  ]
}

Claude Code

Claude Code natively supports custom Anthropic API endpoints. The recommended setup is to edit your Claude Code settings.json:

{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "your-proxy-api-key",
    "ANTHROPIC_BASE_URL": "http://127.0.0.1:8000",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "gemini/gemini-3-pro",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "gemini/gemini-3-flash",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "openai/gpt-5-mini"
  }
}

Now you can use Claude Code with Gemini, OpenAI, or any other configured provider.

Anthropic Python SDK

from anthropic import Anthropic

client = Anthropic(
    base_url="http://127.0.0.1:8000",
    api_key="your-proxy-api-key"
)

# Use any provider through Anthropic's API format
response = client.messages.create(
    model="gemini/gemini-3-flash",  # provider/model format
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.content[0].text)

API Endpoints

Endpoint	Description
`GET /`	Status check — confirms proxy is running
`POST /v1/chat/completions`	Chat completions (OpenAI format)
`POST /v1/messages`	Chat completions (Anthropic format) — Claude Code compatible
`POST /v1/messages/count_tokens`	Count tokens for Anthropic-format requests
`POST /v1/embeddings`	Text embeddings
`GET /v1/models`	List all available models with pricing & capabilities
`GET /v1/models/{model_id}`	Get details for a specific model
`GET /v1/providers`	List configured providers
`POST /v1/token-count`	Calculate token count for a payload
`POST /v1/cost-estimate`	Estimate cost based on token counts

Tip: The /v1/models endpoint is useful for discovering available models in your client. Many apps can fetch this list automatically. Add ?enriched=false for a minimal response without pricing data.
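
For clients that do not fetch the list automatically, the following standard-library sketch builds the request by hand. It assumes `/v1/models` accepts the same `Authorization: Bearer` header as the chat endpoint; the helper names are hypothetical, and the response is returned as decoded JSON without assuming its shape.

```python
import json
import urllib.request
from urllib.parse import urlencode


def models_request(base_url: str, api_key: str, enriched: bool = True) -> urllib.request.Request:
    """Build a GET request for the proxy's /v1/models endpoint."""
    url = base_url.rstrip("/") + "/models"
    if not enriched:
        # ?enriched=false asks for the minimal response without pricing data.
        url += "?" + urlencode({"enriched": "false"})
    # Assumption: the endpoint uses the same Bearer token as chat completions.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def fetch_models(base_url: str, api_key: str, enriched: bool = True):
    """Send the request and return the decoded JSON body unchanged."""
    request = models_request(base_url, api_key, enriched)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_models("http://127.0.0.1:8000/v1", "your-proxy-api-key", enriched=False)` against a running proxy returns the minimal list.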

Managing Credentials

The proxy includes an interactive tool for managing all your API keys and OAuth credentials.

Using the TUI

Run the proxy without arguments to open the TUI
Select "🔑 Manage Credentials"
Choose to add API keys or OAuth credentials

Using the Command Line

python -m rotator_library.credential_tool

Credential Types

Type	Providers	How to Add
API Keys	Gemini, OpenAI, Anthropic, OpenRouter, Groq, Mistral, NVIDIA, Cohere, Chutes	Enter key in TUI or add to `.env`
OAuth	Gemini CLI	Interactive browser login via credential tool

The `.env` File

Credentials are stored in a .env file. You can edit it directly or use the TUI:

# Required: Authentication key for YOUR proxy
PROXY_API_KEY="your-secret-proxy-key"

# Provider API Keys (add multiple with _1, _2, etc.)
GEMINI_API_KEY_1="your-gemini-key"
GEMINI_API_KEY_2="another-gemini-key"
OPENAI_API_KEY_1="your-openai-key"
ANTHROPIC_API_KEY_1="your-anthropic-key"

Copy .env.example to .env as a starting point.
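
The numbered-variable convention can be read back into a per-provider mapping with a few lines of Python. This is a minimal sketch of the naming pattern only, not the library's actual discovery logic; `collect_api_keys` is a hypothetical helper, and the resulting dictionary has the same shape as the `api_keys` argument shown in the library's basic usage.

```python
import re
from collections import defaultdict

# Matches names such as GEMINI_API_KEY_1; PROXY_API_KEY has no numeric suffix and is skipped.
KEY_PATTERN = re.compile(r"^([A-Z0-9_]+)_API_KEY_(\d+)$")


def collect_api_keys(env: dict[str, str]) -> dict[str, list[str]]:
    """Group <PROVIDER>_API_KEY_<N> variables by provider, ordered by N."""
    found = defaultdict(list)
    for name, value in env.items():
        match = KEY_PATTERN.match(name)
        if match and value:
            provider = match.group(1).lower()
            found[provider].append((int(match.group(2)), value))
    # Sort by the numeric suffix so _1 comes before _2 regardless of file order.
    return {provider: [key for _, key in sorted(pairs)] for provider, pairs in found.items()}
```

Passing `dict(os.environ)` after loading the `.env` file would apply the same grouping to the live environment.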

The Resilience Library

The proxy is powered by a standalone Python library that you can use directly in your own applications.

Key Features

Async-native with asyncio and httpx
Intelligent key selection with tiered, model-aware locking
Deadline-driven requests with configurable global timeout
Automatic failover between keys on errors
OAuth support for Gemini CLI
Stateless deployment ready — load credentials from environment variables

Basic Usage

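import asyncio

from rotator_library import RotatingClient


async def main():
    # Keys are grouped by provider; the client fails over between them on errors.
    client = RotatingClient(
        api_keys={"gemini": ["key1", "key2"], "openai": ["key3"]},
        global_timeout=30,
        max_retries=2,
    )

    # "async with" must run inside a coroutine, so the call is wrapped in main().
    async with client:
        response = await client.acompletion(
            model="gemini/gemini-2.5-flash",
            messages=[{"role": "user", "content": "Hello!"}],
        )
        print(response)


asyncio.run(main())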

Library Documentation

See the Library README for complete documentation including:

All initialization parameters
Streaming support
Error handling and cooldown strategies
Provider plugin system
Credential prioritization

Interactive TUI

The proxy includes a powerful text-based UI for configuration and management.

TUI Features

🚀 Run Proxy — Start the server with saved settings
⚙️ Configure Settings — Host, port, API key, request logging, raw I/O logging
🔑 Manage Credentials — Add/edit API keys and OAuth credentials
📊 View Provider & Advanced Settings — Inspect providers and launch the settings tool
📈 View Quota & Usage Stats (Alpha) — Usage, quota windows, fair-cycle status
🔄 Reload Configuration — Refresh settings without restarting

Configuration Files

File	Contents
`.env`	All credentials and advanced settings
`launcher_config.json`	TUI-specific settings (host, port, logging)
`quota_viewer_config.json`	Quota viewer remotes + per-provider display toggles
`usage/usage_<provider>.json`	Usage persistence per provider

Features

Core Capabilities

Universal OpenAI-compatible endpoint for all providers
Multi-provider support via LiteLLM fallback
Automatic key rotation and load balancing
Interactive TUI for easy configuration
Detailed request logging for debugging

🛡️ Resilience & High Availability

Global timeout with deadline-driven retries
Escalating cooldowns per model (10s → 30s → 60s → 120s)
Key-level lockouts for consistently failing keys
Stream error detection and graceful recovery
Batch embedding aggregation for improved throughput
Automatic daily resets for cooldowns and usage stats
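
The escalating cooldown schedule listed above can be sketched as a simple lookup. The step values come from the list; the assumptions are that the schedule is indexed by consecutive failures for a model and that it stays at the last step once exhausted. `cooldown_seconds` is a hypothetical name, not a library function.

```python
COOLDOWN_STEPS = (10, 30, 60, 120)  # seconds: 10s -> 30s -> 60s -> 120s


def cooldown_seconds(consecutive_failures: int) -> int:
    """Return the cooldown for the given number of consecutive failures."""
    if consecutive_failures <= 0:
        return 0  # no failures, no cooldown
    # Assumption: after the fourth failure the cooldown stays at the final step.
    index = min(consecutive_failures, len(COOLDOWN_STEPS)) - 1
    return COOLDOWN_STEPS[index]
```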

🔑 Credential Management

Auto-discovery of API keys from environment variables
OAuth discovery from standard paths (~/.gemini/)
Duplicate detection warns when the same account is added multiple times
Credential prioritization — paid tier used before free tier
Stateless deployment — export OAuth to environment variables
Local-first storage — credentials isolated in oauth_creds/ directory

⚙️ Advanced Configuration

Model whitelists/blacklists with wildcard support
Per-provider concurrency controls (OPTIMAL_CONCURRENT_REQUESTS_PER_KEY_<PROVIDER> and MAX_CONCURRENT_REQUESTS_PER_KEY_<PROVIDER>)
Rotation modes — balanced (distribute load) or sequential (use until exhausted)
Priority multipliers — higher concurrency for paid credentials
Model quota groups — shared cooldowns for related models
Temperature override — prevent tool hallucination issues
Weighted random rotation — unpredictable selection patterns

🔌 Provider-Specific Features

Gemini CLI:

Zero-config Google Cloud project discovery
Internal API access with higher rate limits
Automatic fallback to preview models on rate limit
Paid vs free tier detection

NVIDIA NIM:

Dynamic model discovery
DeepSeek thinking support

📝 Logging & Debugging

Per-request file logging with --enable-request-logging
Raw I/O logging with --enable-raw-logging (proxy boundary payloads)
Unique request directories with full transaction details
Streaming chunk capture for debugging
Performance metadata (duration, tokens, model used)
Provider-specific logs for active custom providers

Advanced Configuration

Environment Variables Reference

Proxy Settings

Variable	Description	Default
`PROXY_API_KEY`	Authentication key for your proxy	Required
`OAUTH_REFRESH_INTERVAL`	Token refresh check interval (seconds)	`600`
`SKIP_OAUTH_INIT_CHECK`	Skip interactive OAuth setup on startup	`false`

Per-Provider Settings

Pattern	Description	Example
`<PROVIDER>_API_KEY_<N>`	API key for provider	`GEMINI_API_KEY_1`
`OPTIMAL_CONCURRENT_REQUESTS_PER_KEY_<PROVIDER>`	Soft spread-before-stacking target	`OPTIMAL_CONCURRENT_REQUESTS_PER_KEY_OPENAI=1`
`MAX_CONCURRENT_REQUESTS_PER_KEY_<PROVIDER>`	Hard concurrent request ceiling (`<=0` means unlimited)	`MAX_CONCURRENT_REQUESTS_PER_KEY_OPENAI=-1`
`ROTATION_MODE_<PROVIDER>`	`balanced` or `sequential`	`ROTATION_MODE_GEMINI=sequential`
`IGNORE_MODELS_<PROVIDER>`	Blacklist (comma-separated, supports `*`)	`IGNORE_MODELS_OPENAI=*-preview*`
`WHITELIST_MODELS_<PROVIDER>`	Whitelist (overrides blacklist)	`WHITELIST_MODELS_GEMINI=gemini-2.5-pro`

Advanced Features

Variable	Description
`ROTATION_TOLERANCE`	`0.0`=deterministic, `3.0`=weighted random (default)
`CONCURRENCY_MULTIPLIER_<PROVIDER>_PRIORITY_<N>`	Concurrency multiplier per priority tier
`QUOTA_GROUPS_<PROVIDER>_<GROUP>`	Models sharing quota limits
`OVERRIDE_TEMPERATURE_ZERO`	`remove` or `set` to prevent tool hallucination
`GEMINI_CLI_QUOTA_REFRESH_INTERVAL`	Quota baseline refresh interval in seconds (default: 300)

Model Filtering (Whitelists & Blacklists)

Control which models are exposed through your proxy.

Blacklist Only

# Hide all preview models
IGNORE_MODELS_OPENAI="*-preview*"

Pure Whitelist Mode

# Block all, then allow specific models
IGNORE_MODELS_GEMINI="*"
WHITELIST_MODELS_GEMINI="gemini-2.5-pro,gemini-2.5-flash"

Exemption Mode

# Block preview models, but allow one specific preview
IGNORE_MODELS_OPENAI="*-preview*"
WHITELIST_MODELS_OPENAI="gpt-4o-2024-08-06-preview"

Logic order: Whitelist check → Blacklist check → Default allow
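
The logic order can be expressed as a small predicate. This sketch assumes shell-style wildcard matching (Python's `fnmatch`) against the bare model name without the provider prefix; the proxy's real matcher may differ, and `is_model_allowed` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def parse_patterns(value: str) -> list[str]:
    """Split a comma-separated setting into trimmed, non-empty patterns."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_model_allowed(model: str, ignore: str = "", whitelist: str = "") -> bool:
    """Apply whitelist check, then blacklist check, then default allow."""
    if any(fnmatchcase(model, pattern) for pattern in parse_patterns(whitelist)):
        return True  # the whitelist overrides the blacklist
    if any(fnmatchcase(model, pattern) for pattern in parse_patterns(ignore)):
        return False
    return True  # not mentioned anywhere: allowed by default
```

With `ignore="*"` every model is blocked unless the whitelist names it, which reproduces the pure whitelist mode shown earlier.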

Concurrency & Rotation Settings

Concurrency Limits

# Balanced mode defaults to optimal=1 and max=-1, so it spreads first
# but will stack on busy keys instead of blocking when every key is busy.
ROTATION_MODE_OPENAI=balanced
OPTIMAL_CONCURRENT_REQUESTS_PER_KEY_OPENAI=1
MAX_CONCURRENT_REQUESTS_PER_KEY_OPENAI=-1

# Sequential mode defaults to optimal=-1 and max=-1 for sticky/unlimited use.
ROTATION_MODE_GEMINI=sequential
OPTIMAL_CONCURRENT_REQUESTS_PER_KEY_GEMINI=-1
MAX_CONCURRENT_REQUESTS_PER_KEY_GEMINI=-1

# Constrained providers can set optimal and max to the same value.
MAX_CONCURRENT_REQUESTS_PER_KEY_GEMINI_CLI=1
OPTIMAL_CONCURRENT_REQUESTS_PER_KEY_GEMINI_CLI=1

# Mode-specific forms override provider-wide values only for that mode.
MAX_CONCURRENT_REQUESTS_PER_KEY_OPENAI_BALANCED=-1
OPTIMAL_CONCURRENT_REQUESTS_PER_KEY_OPENAI_BALANCED=1

optimal is a soft target used for capacity phases: the rotator prefers credentials below optimal, then stacks on healthy credentials when no below-optimal credential remains. max is the hard safety ceiling; 0 or any negative value means unlimited.
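
The two capacity phases can be sketched as follows. This is an illustration of the described behavior, not the rotator's implementation: it ignores credential health, treats a non-positive `optimal` as "no soft target" (an inference from the sequential defaults), and breaks ties by picking the least-loaded credential, which is a choice made for this example.

```python
from typing import Optional


def pick_credential(in_flight: dict[str, int], optimal: int, maximum: int) -> Optional[str]:
    """Choose a credential given the current in-flight request count per credential."""

    def under_max(count: int) -> bool:
        # 0 or any negative maximum means unlimited.
        return maximum <= 0 or count < maximum

    # Phase 1: prefer credentials that are still below the soft target.
    if optimal > 0:
        below = [key for key, count in in_flight.items() if count < optimal and under_max(count)]
        if below:
            return min(below, key=in_flight.get)

    # Phase 2: stack on the least-loaded credential that is under the hard ceiling.
    stackable = [key for key, count in in_flight.items() if under_max(count)]
    if stackable:
        return min(stackable, key=in_flight.get)

    return None  # every credential is at the hard ceiling
```

With `optimal=1` and `maximum=-1`, requests spread across idle credentials first and then stack instead of blocking; setting both to `1` makes the call return `None` once every credential is busy.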

Rotation Modes

# sequential (default): Use one key until it errors/exhausts, preserving provider-side cache locality
ROTATION_MODE_GEMINI=sequential

# balanced: Distribute load evenly - opt in for per-minute rate limits
ROTATION_MODE_OPENAI=balanced

Priority Multipliers

Paid credentials can handle more concurrent requests. Legacy priority multipliers apply to hard max concurrency; provider-specific optimal multipliers can also raise the soft target where supported:

# Priority 1: 10x concurrency
CONCURRENCY_MULTIPLIER_GEMINI_CLI_PRIORITY_1=10

# Priority 2: 3x
CONCURRENCY_MULTIPLIER_GEMINI_CLI_PRIORITY_2=3
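
As a rough sketch of how a multiplier could feed into the hard ceiling: the code below assumes the multiplier is applied by plain multiplication, that a missing variable means 1x, and that an unlimited ceiling stays unlimited. `effective_max` is a hypothetical helper, not a library function.

```python
def effective_max(env: dict[str, str], provider: str, priority: int, base_max: int) -> int:
    """Scale the hard concurrency ceiling by the multiplier for a priority tier."""
    if base_max <= 0:
        return base_max  # unlimited stays unlimited
    name = f"CONCURRENCY_MULTIPLIER_{provider.upper()}_PRIORITY_{priority}"
    # Assumption: tiers without a configured multiplier use 1x.
    multiplier = int(env.get(name, 1))
    return base_max * multiplier
```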

Model Quota Groups

Models sharing quota limits:

# Example: group provider models that share quota
QUOTA_GROUPS_GEMINI_CLI_PRO="gemini-2.5-pro,gemini-3-pro-preview"
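
To make the variable format concrete, the sketch below turns `QUOTA_GROUPS_<PROVIDER>_<GROUP>` settings into a model-to-group lookup. Because provider names can themselves contain underscores (for example `GEMINI_CLI`), the provider is passed in explicitly rather than parsed from the variable name; `quota_groups` is a hypothetical helper for illustration.

```python
def quota_groups(env: dict[str, str], provider: str) -> dict[str, str]:
    """Map each model name to its quota group for one provider."""
    prefix = f"QUOTA_GROUPS_{provider.upper()}_"
    model_to_group: dict[str, str] = {}
    for name, value in env.items():
        if not name.startswith(prefix):
            continue
        group = name[len(prefix):].lower()
        for model in value.split(","):
            if model.strip():
                model_to_group[model.strip()] = group
    return model_to_group
```

Two models that map to the same group would then share a cooldown when either one hits its quota.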

Timeout Configuration

Fine-grained control over HTTP timeouts:

TIMEOUT_CONNECT=30              # Connection establishment
TIMEOUT_WRITE=30                # Request body send
TIMEOUT_POOL=60                 # Connection pool acquisition
TIMEOUT_READ_STREAMING=180      # Between streaming chunks (3 min)
TIMEOUT_READ_NON_STREAMING=600  # Full response wait (10 min)

Recommendations:

Long thinking tasks: Increase TIMEOUT_READ_STREAMING to 300-360s
Unstable network: Increase TIMEOUT_CONNECT to 60s
Large outputs: Increase TIMEOUT_READ_NON_STREAMING to 900s+
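
The sketch below shows one way these variables could be resolved for a single request. It assumes the values listed above are the defaults and that the read timeout is chosen according to whether the request streams; `load_timeouts` and `Timeouts` are hypothetical names.

```python
from dataclasses import dataclass

# Fallback values copied from the example above (assumed to be the defaults).
DEFAULTS = {
    "TIMEOUT_CONNECT": 30.0,
    "TIMEOUT_WRITE": 30.0,
    "TIMEOUT_POOL": 60.0,
    "TIMEOUT_READ_STREAMING": 180.0,
    "TIMEOUT_READ_NON_STREAMING": 600.0,
}


@dataclass(frozen=True)
class Timeouts:
    connect: float
    write: float
    pool: float
    read: float


def load_timeouts(env: dict[str, str], streaming: bool) -> Timeouts:
    """Resolve the four HTTP timeouts (in seconds) for one request."""

    def get(name: str) -> float:
        return float(env.get(name, DEFAULTS[name]))

    # Streaming requests time out between chunks; others wait for the full body.
    read_name = "TIMEOUT_READ_STREAMING" if streaming else "TIMEOUT_READ_NON_STREAMING"
    return Timeouts(get("TIMEOUT_CONNECT"), get("TIMEOUT_WRITE"), get("TIMEOUT_POOL"), get(read_name))
```

The four fields line up with the `connect`, `write`, `pool`, and `read` arguments of `httpx.Timeout`, so the result can be passed along to an httpx client.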

OAuth Providers

Gemini CLI

Uses Google OAuth to access internal Gemini endpoints with higher rate limits.

Setup:

Run python -m rotator_library.credential_tool
Select "Add OAuth Credential" → "Gemini CLI"
Complete browser authentication
Credentials saved to oauth_creds/gemini_cli_oauth_1.json

Features:

Zero-config project discovery
Automatic free-tier project onboarding
Paid vs free tier detection
Smart fallback on rate limits
Quota baseline tracking with background refresh (accurate remaining quota estimates)
Sequential rotation mode (uses credentials until quota exhausted)

Quota Groups: Models that share quota are automatically grouped:

Pro: gemini-2.5-pro, gemini-3-pro-preview
2.5-Flash: gemini-2.0-flash, gemini-2.5-flash, gemini-2.5-flash-lite
3-Flash: gemini-3-flash-preview

All models in a group deplete the shared quota equally. Quota windows last 24 hours and are tracked per model.

Environment Variables (for stateless deployment):

Single credential (legacy):

GEMINI_CLI_ACCESS_TOKEN="ya29.your-access-token"
GEMINI_CLI_REFRESH_TOKEN="1//your-refresh-token"
GEMINI_CLI_EXPIRY_DATE="1234567890000"
GEMINI_CLI_EMAIL="your-email@gmail.com"
GEMINI_CLI_PROJECT_ID="your-gcp-project-id"  # Optional
GEMINI_CLI_TIER="standard-tier"  # Optional: standard-tier or free-tier

Multiple credentials (insert `_N_` after the `GEMINI_CLI` prefix, where N is 1, 2, 3...):

GEMINI_CLI_1_ACCESS_TOKEN="ya29.first-token"
GEMINI_CLI_1_REFRESH_TOKEN="1//first-refresh"
GEMINI_CLI_1_EXPIRY_DATE="1234567890000"
GEMINI_CLI_1_EMAIL="first@gmail.com"
GEMINI_CLI_1_PROJECT_ID="project-1"
GEMINI_CLI_1_TIER="standard-tier"

GEMINI_CLI_2_ACCESS_TOKEN="ya29.second-token"
GEMINI_CLI_2_REFRESH_TOKEN="1//second-refresh"
GEMINI_CLI_2_EXPIRY_DATE="1234567890000"
GEMINI_CLI_2_EMAIL="second@gmail.com"
GEMINI_CLI_2_PROJECT_ID="project-2"
GEMINI_CLI_2_TIER="free-tier"
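
The numbered layout above can be grouped back into one record per credential. This is a minimal sketch of the naming scheme only, not the library's loader; the field names are taken from the variables shown, and `load_gemini_cli_credentials` is a hypothetical helper.

```python
import re

FIELD_PATTERN = re.compile(
    r"^GEMINI_CLI_(\d+)_(ACCESS_TOKEN|REFRESH_TOKEN|EXPIRY_DATE|EMAIL|PROJECT_ID|TIER)$"
)


def load_gemini_cli_credentials(env: dict[str, str]) -> list[dict[str, str]]:
    """Collect GEMINI_CLI_<N>_<FIELD> variables into one dict per credential."""
    by_index: dict[int, dict[str, str]] = {}
    for name, value in env.items():
        match = FIELD_PATTERN.match(name)
        if match:
            index, field = int(match.group(1)), match.group(2).lower()
            by_index.setdefault(index, {})[field] = value
    # Return credentials ordered by their numeric index.
    return [by_index[index] for index in sorted(by_index)]
```

The legacy single-credential variables (without a number) do not match the pattern and would need separate handling.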

Feature Toggles:

GEMINI_CLI_QUOTA_REFRESH_INTERVAL=300  # Quota refresh interval in seconds (default: 300 = 5 min)

Stateless Deployment (Export to Environment Variables)

For platforms without file persistence (Railway, Render, Vercel):

Set up credentials locally:

python -m rotator_library.credential_tool
# Complete OAuth flows

Export to environment variables:

python -m rotator_library.credential_tool
# Select "Export [Provider] to .env"

Copy generated variables to your platform: The tool creates files like gemini_cli_credential_1.env containing all necessary variables.
Set SKIP_OAUTH_INIT_CHECK=true to skip interactive validation on startup.

OAuth Callback Port Configuration

Customize OAuth callback ports if defaults conflict:

Provider	Default Port	Environment Variable
Gemini CLI	8085	`GEMINI_CLI_OAUTH_PORT`
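
Before changing a port, it can help to check whether the default is actually taken. The sketch below tries to bind the port locally; it assumes the callback listener binds on the loopback interface, which the documentation does not state, and `port_is_free` is a hypothetical helper.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # another process already holds the port
    return True
```

If `port_is_free(8085)` returns `False`, setting `GEMINI_CLI_OAUTH_PORT` to an unused port avoids the conflict.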

Deployment

Command-Line Arguments

python src/proxy_app/main.py [OPTIONS]

Options:
  --host TEXT                Host to bind (default: 0.0.0.0)
  --port INTEGER             Port to run on (default: 8000)
  --enable-request-logging   Enable detailed per-request logging
  --enable-raw-logging       Capture raw proxy I/O payloads
  --add-credential           Launch interactive credential setup tool

Examples:

# Run on custom port
python src/proxy_app/main.py --host 127.0.0.1 --port 9000

# Run with logging
python src/proxy_app/main.py --enable-request-logging

# Run with raw I/O logging
python src/proxy_app/main.py --enable-raw-logging

# Add credentials without starting proxy
python src/proxy_app/main.py --add-credential

Render / Railway / Vercel

See the Deployment Guide for complete instructions.

Quick Setup:

Fork the repository
Create a .env file with your credentials
Create a new Web Service pointing to your repo
Set build command: pip install -r requirements.txt
Set start command: uvicorn src.proxy_app.main:app --host 0.0.0.0 --port $PORT
Upload .env as a secret file

OAuth Credentials: Export OAuth credentials to environment variables using the credential tool, then add them to your platform's environment settings.

Docker

The proxy is available as a multi-architecture Docker image (amd64/arm64) from GitHub Container Registry.

Quick Start with Docker Compose:

# 1. Create your .env file with PROXY_API_KEY and provider keys
cp .env.example .env
nano .env

# 2. Create usage directory (usage_*.json files are created automatically)
mkdir usage

# 3. Start the proxy
docker compose up -d

# 4. Check logs
docker compose logs -f

Important: Create the usage/ directory before running Docker Compose so usage stats persist on the host.

Manual Docker Run:

# Create usage directory if it doesn't exist
mkdir usage

docker run -d \
  --name llm-api-proxy \
  --restart unless-stopped \
  -p 8000:8000 \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/oauth_creds:/app/oauth_creds \
  -v $(pwd)/logs:/app/logs \
  -v $(pwd)/usage:/app/usage \
  -e SKIP_OAUTH_INIT_CHECK=true \
  -e PYTHONUNBUFFERED=1 \
  ghcr.io/mirrowel/llm-api-key-proxy:latest

Development with Local Build:

# Build and run locally
docker compose -f docker-compose.dev.yml up -d --build

Volume Mounts:

Path	Purpose
`.env`	Configuration and API keys (read-only)
`oauth_creds/`	OAuth credential files (persistent)
`logs/`	Request logs and detailed logging
`usage/`	Usage statistics persistence (`usage_*.json`)

Image Tags:

Tag	Description
`latest`	Latest stable from `main` branch
`dev-latest`	Latest from `dev` branch
`YYYYMMDD-HHMMSS-<sha>`	Specific version with timestamp and commit

OAuth with Docker:

For OAuth providers such as Gemini CLI, you must authenticate locally first:

Run python -m rotator_library.credential_tool on your local machine
Complete OAuth flows in browser
Either:
- Mount oauth_creds/ directory to container, or
- Export credentials to .env using the export option

Custom VPS / Systemd

Option 1: Authenticate locally, deploy credentials

Complete OAuth flows on your local machine
Export to environment variables
Deploy .env to your server

Option 2: SSH Port Forwarding

# Forward callback ports through SSH
ssh -L 51121:localhost:51121 -L 8085:localhost:8085 user@your-vps

# Then run credential tool on the VPS

Systemd Service:

[Unit]
Description=LLM API Key Proxy
After=network.target

[Service]
Type=simple
WorkingDirectory=/path/to/LLM-API-Key-Proxy
ExecStart=/path/to/python -m uvicorn src.proxy_app.main:app --host 0.0.0.0 --port 8000
Restart=always

[Install]
WantedBy=multi-user.target

See VPS Deployment for the complete guide.

Troubleshooting

Issue	Solution
`401 Unauthorized`	Verify `PROXY_API_KEY` matches your `Authorization: Bearer` header exactly
`500 Internal Server Error`	Check provider key validity; enable `--enable-request-logging` for details
All keys on cooldown	All keys failed recently; check `logs/detailed_logs/` for upstream errors
Model not found	Verify format is `provider/model_name` (e.g., `gemini/gemini-2.5-flash`)
OAuth callback failed	Ensure callback port (8085, 51121, 11451) isn't blocked by firewall
Streaming hangs	Increase `TIMEOUT_READ_STREAMING`; check provider status

Detailed Logs:

When --enable-request-logging is enabled, check logs/detailed_logs/ for:

request.json — Exact request payload
final_response.json — Complete response or error
streaming_chunks.jsonl — All SSE chunks received
metadata.json — Performance metrics
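
A short script can locate the newest request directory and report which of these files it contains. The sketch assumes each request gets its own subdirectory directly under `logs/detailed_logs/`; the helper names are hypothetical, and the file contents are not interpreted.

```python
from pathlib import Path
from typing import Optional

EXPECTED_FILES = (
    "request.json",
    "final_response.json",
    "streaming_chunks.jsonl",
    "metadata.json",
)


def latest_request_dir(root: str = "logs/detailed_logs") -> Optional[Path]:
    """Return the most recently modified request directory, or None if there is none."""
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    directories = [path for path in root_path.iterdir() if path.is_dir()]
    return max(directories, key=lambda path: path.stat().st_mtime, default=None)


def present_files(request_dir: Path) -> dict[str, bool]:
    """Report which of the expected log files exist in a request directory."""
    return {name: (Path(request_dir) / name).is_file() for name in EXPECTED_FILES}
```

A missing `streaming_chunks.jsonl` is expected for non-streaming requests, so only `request.json` and `final_response.json` are worth treating as required when debugging.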

Documentation

Document	Description
Technical Documentation	Architecture, internals, provider implementations
Library README	Using the resilience library directly
Deployment Guide	Hosting on Render, Railway, VPS
.env.example	Complete environment variable reference

License

This project is dual-licensed:

Proxy Application (src/proxy_app/) — MIT License
Resilience Library (src/rotator_library/) — LGPL-3.0
