
OpenAI-compatible providers

The openai-compat provider (crates/lime-provider-openai-compat/) talks to any service that implements the OpenAI Chat Completions API. Lime ships built-in presets for the popular ones; you can also register a fully custom endpoint.

Authentication

Each compat service has its own credential bucket so multiple services can coexist:

Terminal window
lime login --provider openai-compat --service groq --with-api-key \
--with-base-url https://api.groq.com/openai/v1
lime login --provider openai-compat --service openrouter --with-api-key \
--with-base-url https://openrouter.ai/api/v1

--service NAME namespaces the credentials under openai_compat:<service>. Models declared with a matching service field in models.json route to the right bucket automatically.
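
For illustration, a models.json entry pinned to the Groq service might look roughly like the sketch below. The field names mirror the lime model add flags shown later on this page; the exact schema is an assumption here and is documented under Custom models.

models.json (illustrative entry)
{
  "slug": "llama-3.3-70b",
  "provider": "openai-compat",
  "service": "groq",
  "api_id": "llama-3.3-70b-versatile",
  "context_window": 128000
}

Because service is "groq", requests for this slug use the openai_compat:groq credential bucket and Groq's default base URL.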

Built-in presets

Preset         Provider id    Default base URL                         Env var fallback
OpenAI         openai         https://api.openai.com/v1                OPENAI_API_KEY
Groq           groq           https://api.groq.com/openai/v1           GROQ_API_KEY
OpenRouter     openrouter     https://openrouter.ai/api/v1             OPENROUTER_API_KEY
Together AI    together       https://api.together.xyz/v1              TOGETHER_API_KEY
Mistral        mistral        https://api.mistral.ai/v1                MISTRAL_API_KEY
DeepInfra      deepinfra      https://api.deepinfra.com/v1/openai      DEEPINFRA_API_KEY
Cerebras       cerebras       https://api.cerebras.ai/v1               CEREBRAS_API_KEY
xAI            xai            https://api.x.ai/v1                      XAI_API_KEY
Perplexity     perplexity     https://api.perplexity.ai                PERPLEXITY_API_KEY
Cohere         cohere         https://api.cohere.ai/compatibility/v1   COHERE_API_KEY
NVIDIA         nvidia         https://integrate.api.nvidia.com/v1      NVIDIA_API_KEY
Ollama         ollama         http://localhost:11434/v1                (none — local)

If a preset’s env var is set, it is used as a fallback when no stored credential is present.
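
For example, with no stored Groq credential, an exported key is picked up automatically (this assumes a Groq-backed model slug is already in your catalog):

Terminal window
export GROQ_API_KEY=...        # fallback credential for the groq preset
lime --model llama-3.3-70b     # no prior lime login needed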

Custom endpoints

For an internal LLM proxy, vLLM, LM Studio, or any other Chat-Completions-compatible service:

Terminal window
lime login --provider openai-compat --service my-proxy \
--with-api-key --with-base-url https://my-llm-proxy.internal/v1

Then add a model entry that pins to that service:

Terminal window
lime model add \
--slug internal-llama \
--provider openai-compat --service my-proxy \
--api-id internal/llama-70b \
--context-window 128000

See Custom models for the full schema.
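
Once registered, the model is selected by its slug like any other catalog entry:

Terminal window
lime --model internal-llama    # sent to https://my-llm-proxy.internal/v1 using the openai_compat:my-proxy credentials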

Compatibility quirks

The compat layer is conservative with optional fields so it works with the widest range of servers (a request sketch after this list illustrates the difference):

  • reasoning_effort is sent only when the preset declares support for it; Ollama, vLLM, and LM Studio receive it only when they advertise reasoning support.
  • stream_options.include_usage is sent only when the server supports it; otherwise Lime estimates token usage client-side.
  • Tool calls (tools / tool_choice) are sent only when the model entry declares supports_parallel_tool_calls or the preset is known to implement function calling.
  • System prompts that exceed a server’s context window trigger the runtime’s auto-compaction path before the request is retried.
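
As a rough sketch (the model names and tool definition below are hypothetical), the same chat request could be serialized with all optional fields for a full-featured preset, and stripped down for a minimal local server:

Request to a preset that advertises reasoning, usage reporting, and tool calling:
{
  "model": "full-featured-model",
  "messages": [{ "role": "user", "content": "What's the weather in Paris?" }],
  "stream": true,
  "stream_options": { "include_usage": true },
  "reasoning_effort": "medium",
  "tools": [{
    "type": "function",
    "function": { "name": "get_weather", "parameters": { "type": "object", "properties": {} } }
  }],
  "tool_choice": "auto"
}

Request to a minimal server that advertises none of these (token usage is then estimated client-side):
{
  "model": "local-model",
  "messages": [{ "role": "user", "content": "What's the weather in Paris?" }],
  "stream": true
}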

Common usage

Terminal window
# Pick a model by slug from the merged catalog
lime --model llama-3.3-70b # routed to whatever service was registered

Provider routing is automatic: the model entry’s service field (or provider for non-compat) determines which credential bucket is used and which base URL receives the request.