# Custom models
The model catalog Lime uses at runtime is the merge of:
- The built-in catalog compiled into the binary.
- User entries in `~/.lime/models.json` (or the file named by `LIME_MODELS_FILE`, if set).
User entries override built-ins on slug collision. Brand-new slugs are appended.
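The merge semantics can be sketched as follows (illustrative Python, not Lime's actual code; catalogs are modeled here as ordered slug-to-entry maps):

```python
def merge_catalogs(builtin: dict, user: dict) -> dict:
    """User entries override built-ins on slug collision; new slugs append."""
    merged = dict(builtin)
    merged.update(user)  # same slug: user wins; brand-new slug: appended at the end
    return merged

builtin = {"gpt-4o": {"provider": "openai", "context_window": 128_000}}
user = {
    "gpt-4o": {"provider": "openai", "context_window": 200_000},  # overrides built-in
    "llama-3.3-70b": {"provider": "openai_compat", "service": "groq"},  # appended
}
merged = merge_catalogs(builtin, user)
```

Because the user map is applied last, a user entry with a built-in slug replaces the built-in wholesale rather than being field-merged with it.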
## The fast path: `lime model add`

The CLI generates a valid `models.json` entry for you:

```shell
lime model add \
  --slug llama-3.3-70b \
  --provider openai-compat --service groq \
  --api-id llama-3.3-70b-versatile \
  --context-window 128000
```

`--service` is required (and only accepted) when `--provider openai-compat` is used.
It pins the model to a specific compat credential bucket
(`openai_compat:<service>`). Set up the bucket first:

```shell
lime login --provider openai-compat --service groq \
  --with-api-key --with-base-url https://api.groq.com/openai/v1
```

Then `lime model list` will show the merged catalog with `(builtin)` /
`(user)` tags, and `lime model rm <slug>` removes a user entry — built-ins
always survive.
## All `lime model add` flags
| Flag | Required | Default | Notes |
|---|---|---|---|
| `--slug SLUG` | ✓ | — | Identifier used in `--model` and `/model`. |
| `--provider NAME` | ✓ | — | One of `openai`, `anthropic`, `gemini`, `openai-compat`, `custom`. |
| `--context-window N` | ✓ | — | Full context window in tokens. |
| `--service NAME` | — | — | Required with `--provider openai-compat`. Pins to a credential bucket. |
| `--display-name STR` | — | `<slug>` | UI label. |
| `--description STR` / `--desc STR` | — | `"Custom model"` | One-line description. |
| `--api-id STR` / `--api-model-id STR` | — | omitted (falls back to the slug) | Wire model id sent to the API. When omitted, the runtime uses the slug. |
| `--max-output N` | — | `clamp(window / 4, 1, 32_768)` | Default response cap. |
| `--upper-max-output N` | — | `min(max_output × 2, window / 2)` (never below `max_output`) | Upper retry / thinking cap. |
| `--auto-compact N` | — | `ContextBudget::auto_threshold(window, max_output)` | Token count at which auto-compaction fires. |
| `--param-preset NAME` | — | `verbosity` | One of `none`, `verbosity`, `reasoning`, `reasoning-and-verbosity`. |
| `--vision` | — | off | Model accepts images. |
| `--reasoning-summaries` | — | off | OpenAI-style reasoning summaries. |
| `--parallel-tools` | — | off | Multiple tool calls per turn. |
| `--extended-thinking` / `--thinking` | — | off | Anthropic-style extended thinking. |
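The default output caps in the table can be reproduced as a quick sketch (illustrative Python, assuming the table's formulas are exact; the `--auto-compact` default calls Lime-internal code and is omitted here):

```python
def default_max_output(window: int) -> int:
    """Default response cap: window / 4, clamped to [1, 32_768]."""
    return max(1, min(window // 4, 32_768))

def default_upper_max_output(window: int, max_output: int) -> int:
    """Upper retry/thinking cap: min(max_output * 2, window / 2),
    but never below max_output itself."""
    return max(max_output, min(max_output * 2, window // 2))

# For a 128k-token context window:
window = 128_000
mo = default_max_output(window)                # 32_000
upper = default_upper_max_output(window, mo)   # 64_000
```

Note that the "never below `max_output`" floor only matters when `--max-output` is set explicitly to more than half the window; with the derived default it never triggers.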
## `models.json` schema (v1)
```json
{
  "version": 1,
  "models": [
    {
      "slug": "llama-3.3-70b",
      "display_name": "Llama 3.3 70B",
      "description": "Llama 3.3 70B via Groq",
      "provider": "openai_compat",
      "service": "groq",
      "api_model_id": "llama-3.3-70b-versatile",
      "context_window": 128000,
      "max_output_tokens": 8192,
      "upper_max_output_tokens": 32768,
      "auto_compact_threshold": 100000,
      "param_preset": "verbosity_only",
      "supports_parallel_tool_calls": true,
      "supports_vision": false,
      "supports_reasoning_summaries": false,
      "supports_extended_thinking": false
    }
  ]
}
```

| Field | Type / values |
|---|---|
| `version` | Always `1`. A mismatched version causes Lime to fall back to the built-in catalog. |
| `slug` | String. Required. |
| `display_name` | String. Defaults to the slug. |
| `description` | String. |
| `provider` | One of `openai`, `anthropic`, `gemini`, `openai_compat`, `custom`. |
| `service` | String. Required for `openai_compat`; ignored for other providers. |
| `api_model_id` | String. The wire id sent to the API. Defaults to the slug. |
| `context_window` | Integer. Required. |
| `max_output_tokens` | Integer. |
| `upper_max_output_tokens` | Integer. |
| `auto_compact_threshold` | Integer. |
| `param_preset` | One of `none`, `verbosity_only`, `reasoning_only`, `reasoning_and_verbosity`. |
| `supports_parallel_tool_calls` | Boolean. |
| `supports_vision` | Boolean. |
| `supports_reasoning_summaries` | Boolean. |
| `supports_extended_thinking` | Boolean. |
## Loading rules
- `LIME_MODELS_FILE` overrides the default `~/.lime/models.json` path.
- If the file is missing, Lime uses the built-in catalog.
- If the file is unparseable or has the wrong `version`, Lime logs a warning to stderr and falls back to the built-in catalog. No turn ever fails because of a malformed `models.json`.
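The fallback behavior can be sketched as follows (illustrative Python, not Lime's actual loader; the `builtin` parameter stands in for the catalog compiled into the binary):

```python
import json
import os
import sys

def load_catalog(builtin: dict, default_path: str = "~/.lime/models.json") -> dict:
    """Merge the user catalog over the built-in one.

    Any problem with the user file (missing, unparseable, wrong version)
    degrades to the built-in catalog instead of failing the turn.
    """
    path = os.environ.get("LIME_MODELS_FILE") or os.path.expanduser(default_path)
    try:
        with open(path) as f:
            data = json.load(f)
    except FileNotFoundError:
        return dict(builtin)  # missing file: silently use built-ins
    except (OSError, json.JSONDecodeError):
        print(f"warning: ignoring malformed {path}", file=sys.stderr)
        return dict(builtin)  # unparseable: warn, use built-ins
    if data.get("version") != 1:
        print(f"warning: unsupported models.json version in {path}", file=sys.stderr)
        return dict(builtin)  # wrong version: warn, use built-ins
    merged = dict(builtin)
    merged.update({m["slug"]: m for m in data.get("models", [])})
    return merged
```

The key property is that every failure path returns a usable catalog; the warning goes to stderr and the session continues.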
## When to use `provider: custom`

`custom` is for model entries you own end-to-end — typically used in
combination with a plugin that registers a non-standard wire format. Most
users never need it; `openai-compat` covers the long tail of HTTP services.