
OpenAI-compatible providers

The openai-compat provider (crates/lime-provider-openai-compat/) talks to any service that implements the OpenAI Chat Completions API. Lime ships built-in presets for the popular ones; you can also register a fully custom endpoint.

Authentication

Each compat service has its own credential bucket so multiple services can coexist:

Terminal window
lime login --provider openai-compat --service groq --with-api-key \
--with-base-url https://api.groq.com/openai/v1
lime login --provider openai-compat --service openrouter --with-api-key \
--with-base-url https://openrouter.ai/api/v1

--service NAME namespaces the credentials under openai_compat:<service>. Models declared with a matching service field in models.json route to the right bucket automatically.
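
For illustration, a models.json entry pinned to the Groq service might look roughly like the sketch below. The field names mirror the lime model add flags shown later on this page; the exact schema is an assumption here and is documented under Custom models.

models.json (illustrative entry)
{
  "slug": "llama-3.3-70b",
  "provider": "openai-compat",
  "service": "groq",
  "api_id": "llama-3.3-70b-versatile",
  "context_window": 128000
}

Because service is "groq", requests for this slug use the openai_compat:groq credential bucket and Groq's default base URL.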

Built-in presets

Preset         Provider id    Default base URL                         Env var fallback
OpenAI         openai         https://api.openai.com/v1                OPENAI_API_KEY
Groq           groq           https://api.groq.com/openai/v1           GROQ_API_KEY
OpenRouter     openrouter     https://openrouter.ai/api/v1             OPENROUTER_API_KEY
Together AI    together       https://api.together.xyz/v1              TOGETHER_API_KEY
Mistral        mistral        https://api.mistral.ai/v1                MISTRAL_API_KEY
DeepInfra      deepinfra      https://api.deepinfra.com/v1/openai      DEEPINFRA_API_KEY
Cerebras       cerebras       https://api.cerebras.ai/v1               CEREBRAS_API_KEY
xAI            xai            https://api.x.ai/v1                      XAI_API_KEY
Perplexity     perplexity     https://api.perplexity.ai                PERPLEXITY_API_KEY
Cohere         cohere         https://api.cohere.ai/compatibility/v1   COHERE_API_KEY
NVIDIA         nvidia         https://integrate.api.nvidia.com/v1      NVIDIA_API_KEY
Ollama         ollama         http://localhost:11434/v1                (none — local)

If a preset’s env var is set, it is used as a fallback when no stored credential is present.
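
For example, with no stored Groq credential, an exported key is picked up automatically (this assumes a Groq-backed model slug is already in your catalog):

Terminal window
export GROQ_API_KEY=...        # fallback credential for the groq preset
lime --model llama-3.3-70b     # no prior lime login needed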

Custom endpoints

For an internal LLM proxy, vLLM, LM Studio, or any other Chat-Completions-compatible service:

Terminal window
lime login --provider openai-compat --service my-proxy \
--with-api-key --with-base-url https://my-llm-proxy.internal/v1

Then add a model entry that pins to that service:

Terminal window
lime model add \
--slug internal-llama \
--provider openai-compat --service my-proxy \
--api-id internal/llama-70b \
--context-window 128000

See Custom models for the full schema.
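
Once registered, the model is selected by its slug like any other catalog entry:

Terminal window
lime --model internal-llama    # sent to https://my-llm-proxy.internal/v1 using the openai_compat:my-proxy credentials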

Compatibility quirks

The compat layer is conservative with optional fields so it works with the widest range of servers (a request sketch after this list illustrates the difference):

  • reasoning_effort is sent only when the preset declares support for it; Ollama, vLLM, and LM Studio receive it only when they advertise reasoning support.
  • stream_options.include_usage is sent only when the server supports it; otherwise Lime estimates token usage client-side.
  • Tool calls (tools / tool_choice) are sent only when the model entry declares supports_parallel_tool_calls or the preset is known to implement function calling.
  • System prompts that exceed a server’s context window trigger the runtime’s auto-compaction path before the request is retried.
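
As a rough sketch (the model names and tool definition below are hypothetical), the same chat request could be serialized with all optional fields for a full-featured preset, and stripped down for a minimal local server:

Request to a preset that advertises reasoning, usage reporting, and tool calling:
{
  "model": "full-featured-model",
  "messages": [{ "role": "user", "content": "What's the weather in Paris?" }],
  "stream": true,
  "stream_options": { "include_usage": true },
  "reasoning_effort": "medium",
  "tools": [{
    "type": "function",
    "function": { "name": "get_weather", "parameters": { "type": "object", "properties": {} } }
  }],
  "tool_choice": "auto"
}

Request to a minimal server that advertises none of these (token usage is then estimated client-side):
{
  "model": "local-model",
  "messages": [{ "role": "user", "content": "What's the weather in Paris?" }],
  "stream": true
}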

Common usage

Terminal window
# Pick a model by slug from the merged catalog
lime --model llama-3.3-70b # routed to whatever service was registered

Provider routing is automatic: the model entry’s service field (or provider for non-compat) determines which credential bucket is used and which base URL receives the request.