# ot.Router

> Rule-based router with aliases, load balancing, and fallbacks (LiteLLM-style)

```python
class ot.Router(
    model_list: list[dict],
    fallbacks: Optional[list[dict]] = None,
    strategy: str = "round-robin",
    num_retries: int = 2,
    timeout: float = 120.0,
)
```

Use this when you want **explicit, deterministic routing**, for example "try
GPT-4o first, then Claude, then DeepSeek if both fail". For semantic
auto-routing (where the model is chosen per prompt based on cluster and error
profile), use [`load_router`](/api-reference/load-router) instead.

## Constructor parameters

### `model_list` (required)

A list of deployment configs. Each entry maps a **logical alias** (`model_name`)
to a **concrete provider/model**:

```python
model_list = [
    {"model_name": "smart", "model": "openai/gpt-4o"},
    {"model_name": "smart", "model": "anthropic/claude-sonnet-4-6"},  # redundancy
    {"model_name": "fast",  "model": "groq/llama-3.1-8b-instant"},
    {"model_name": "cheap", "model": "deepseek/deepseek-chat"},
]
```

Per-entry fields (`model_name` and `model` are required; the rest are optional):

| Field        | Type    | Description                                    |
| ------------ | ------- | ---------------------------------------------- |
| `model_name` | `str`   | The alias your app uses (e.g. `"smart"`).      |
| `model`      | `str`   | The concrete `"provider/model"` to call.       |
| `api_key`    | `str?`  | Override provider key for this deployment.     |
| `api_base`   | `str?`  | Override provider base URL.                    |
| `weight`     | `float` | For `weighted-random` strategy. Default `1.0`. |
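
To make the alias-to-deployment mapping concrete, the following sketch groups entries by `model_name` and fills in the documented default for `weight`. `group_by_alias` is a hypothetical helper written for illustration; it is not part of the library, and how the router stores deployments internally is an assumption.

```python
from collections import defaultdict

model_list = [
    {"model_name": "smart", "model": "openai/gpt-4o"},
    {"model_name": "smart", "model": "anthropic/claude-sonnet-4-6", "weight": 2.0},
    {"model_name": "fast", "model": "groq/llama-3.1-8b-instant"},
]

def group_by_alias(entries):
    """Hypothetical helper: map each alias to its deployments, in list order."""
    groups = defaultdict(list)
    for entry in entries:
        # Apply the documented default weight of 1.0 when the field is omitted.
        groups[entry["model_name"]].append({"weight": 1.0, **entry})
    return dict(groups)

deployments = group_by_alias(model_list)
print([d["model"] for d in deployments["smart"]])
```

Two entries sharing the alias `"smart"` end up in the same group, which is what gives that alias redundancy.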

### `fallbacks`

A list of `{alias: [fallback_models]}` maps:

```python
fallbacks = [
    {"smart": ["deepseek/deepseek-chat", "mistral/mistral-large-latest"]},
    {"fast":  ["anthropic/claude-3-haiku-20240307"]},
]
```

Fallbacks are tried only **after** every `model_list` deployment for the alias
has failed. Each entry is a fully qualified `"provider/model"` string, not an alias.
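
Because `fallbacks` is a list of single-alias maps, it can help to picture it merged into one alias-to-models lookup. The sketch below does that; `fallback_lookup` is a hypothetical helper, and merging repeated aliases by extending the list (rather than overwriting) is an assumption, not documented behavior.

```python
fallbacks = [
    {"smart": ["deepseek/deepseek-chat", "mistral/mistral-large-latest"]},
    {"fast": ["anthropic/claude-3-haiku-20240307"]},
]

def fallback_lookup(fallback_maps):
    """Hypothetical helper: merge the list of alias maps into one dict."""
    lookup = {}
    for mapping in fallback_maps or []:
        for alias, models in mapping.items():
            # Extend rather than overwrite, so an alias may appear in several maps.
            lookup.setdefault(alias, []).extend(models)
    return lookup

lookup = fallback_lookup(fallbacks)
print(lookup.get("smart", []))
print(lookup.get("cheap", []))  # an alias with no fallbacks configured
```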

### `strategy`

How to order the deployments within one alias on each call:

| Strategy                  | Behavior                                   |
| ------------------------- | ------------------------------------------ |
| `"round-robin"` (default) | Cycle through deployments in order.        |
| `"least-cost"`            | Pick the cheapest deployment first.        |
| `"lowest-latency"`        | Pick the one with the best recent latency. |
| `"weighted-random"`       | Random pick weighted by `weight` field.    |

Strategy only changes **which deployment is tried first**; on failure the
router falls through to the others.
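
As a rough illustration of "strategy only picks who goes first", the sketch below orders a deployment list for two of the strategies. Both functions are hypothetical and only approximate the behavior in the table; the real implementation may differ.

```python
import random

def order_round_robin(deployments, call_index):
    """Rotate the list so each call starts one position further along."""
    start = call_index % len(deployments)
    return deployments[start:] + deployments[:start]

def order_weighted_random(deployments, rng=random):
    """Pick the first deployment by weight; keep the rest as fall-through."""
    weights = [d.get("weight", 1.0) for d in deployments]
    first = rng.choices(deployments, weights=weights, k=1)[0]
    return [first] + [d for d in deployments if d is not first]

deps = [
    {"model": "openai/gpt-4o", "weight": 3.0},
    {"model": "anthropic/claude-sonnet-4-6", "weight": 1.0},
]
for call_index in range(3):
    print([d["model"] for d in order_round_robin(deps, call_index)])
```

In both cases the full list is returned, so the deployments that were not chosen first remain available when the first one fails.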

### `num_retries`

Number of retries per deployment before moving to the next, so each deployment
is attempted up to `1 + num_retries` times. Default `2`.

### `timeout`

Per-request timeout in seconds. Default `120`.
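
`num_retries` and `timeout` together bound how long a single call can take. The arithmetic below is a back-of-the-envelope sketch that assumes the timeout applies to each attempt individually and that the 300 ms backoff occurs only between attempts on the same deployment; both points are assumptions, and the function names are hypothetical.

```python
def worst_case_attempts(n_deployments, num_retries, n_fallbacks):
    """Each deployment gets 1 + num_retries attempts; each fallback gets one."""
    return n_deployments * (1 + num_retries) + n_fallbacks

def worst_case_seconds(n_deployments, num_retries, n_fallbacks,
                       timeout=120.0, backoff=0.3):
    """Upper bound, assuming every attempt runs to the full timeout."""
    attempts = worst_case_attempts(n_deployments, num_retries, n_fallbacks)
    backoffs = n_deployments * num_retries  # one pause before each retry
    return attempts * timeout + backoffs * backoff

# Two deployments, num_retries=2, one fallback, timeout=60 s.
print(worst_case_attempts(2, 2, 1))
print(worst_case_seconds(2, 2, 1, timeout=60.0))
```

If that bound is too high for an interactive path, lowering `timeout` or `num_retries` shrinks it directly.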

## Methods

### `.completion(model, messages, **kwargs) → ModelResponse`

Sync completion, same shape as `ot.completion`. The `model` argument is
the **alias** (e.g. `"smart"`), not a provider/model string. `**kwargs`
is passed through to the underlying `ot.completion` call.

```python
resp = router.completion(
    model="smart",                                 # alias
    messages=[{"role": "user", "content": "..."}],
    temperature=0,
    max_tokens=200,
)
print(resp.choices[0].message.content)
```

### `.acompletion(model, messages, **kwargs) → ModelResponse`

Async version of `.completion`. It takes the same arguments and returns a
coroutine that must be awaited to obtain the `ModelResponse`.
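
The usual reason to reach for the async variant is to run several requests concurrently. The sketch below shows that pattern with `asyncio.gather`. To keep it runnable without credentials it uses `StubRouter`, a hypothetical stand-in that returns a plain string instead of a `ModelResponse`; with a real router you would pass your `ot.Router` instance instead.

```python
import asyncio

class StubRouter:
    """Stand-in for a configured router; NOT the real ot.Router."""

    async def acompletion(self, model, messages, **kwargs):
        await asyncio.sleep(0)  # simulate awaiting a network call
        return f"[{model}] echo: {messages[-1]['content']}"

async def main(router):
    prompts = ["Define entropy.", "Define variance."]
    calls = [
        router.acompletion(model="fast", messages=[{"role": "user", "content": p}])
        for p in prompts
    ]
    # gather runs the coroutines concurrently and preserves input order.
    return await asyncio.gather(*calls)

results = asyncio.run(main(StubRouter()))
print(results)
```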

## Full example

```python
import opentracy as ot

router = ot.Router(
    model_list=[
        {"model_name": "smart", "model": "openai/gpt-4o"},
        {"model_name": "smart", "model": "anthropic/claude-sonnet-4-6"},
        {"model_name": "fast",  "model": "groq/llama-3.1-8b-instant"},
    ],
    fallbacks=[{"smart": ["deepseek/deepseek-chat"]}],
    strategy="least-cost",
    num_retries=2,
    timeout=60,
)

resp = router.completion(
    model="smart",
    messages=[{"role": "user", "content": "Explain Bayes' theorem."}],
)
print(resp.choices[0].message.content)
```

## How failure handling works

For a call to alias `"smart"` with `num_retries=2`:

1. Order the `"smart"` deployments per `strategy` → `[D1, D2]`.
2. Try `D1` up to `1 + num_retries = 3` times, with a 300 ms backoff between attempts.
3. If all attempts on `D1` fail, move to `D2` and try it up to 3 times.
4. If both deployments are exhausted, try each entry in `fallbacks["smart"]`
   exactly once.
5. If everything fails, raise the last captured exception.
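
The steps above can be sketched as a plain retry loop. This is an illustrative reconstruction under stated assumptions, not the library's code: `call_with_failover` and `flaky_call` are hypothetical names, and `call` stands for whatever performs one request against a concrete model.

```python
import time

def call_with_failover(deployments, fallback_models, call,
                       num_retries=2, backoff_s=0.3):
    """Hypothetical sketch: retry each deployment, then try each fallback once."""
    last_exc = None
    for model in deployments:
        for attempt in range(1 + num_retries):
            try:
                return call(model)
            except Exception as exc:
                last_exc = exc
                if attempt < num_retries:
                    time.sleep(backoff_s)  # pause only between attempts
    for model in fallback_models:  # each fallback gets exactly one attempt
        try:
            return call(model)
        except Exception as exc:
            last_exc = exc
    if last_exc is None:
        raise ValueError("no deployments or fallbacks configured")
    raise last_exc

attempts = []

def flaky_call(model):
    """Fake provider call: only the DeepSeek fallback succeeds."""
    attempts.append(model)
    if model != "deepseek/deepseek-chat":
        raise RuntimeError(f"{model} unavailable")
    return f"ok from {model}"

result = call_with_failover(
    ["openai/gpt-4o", "anthropic/claude-sonnet-4-6"],
    ["deepseek/deepseek-chat"],
    flaky_call,
    backoff_s=0.0,  # skip the pause so the demo runs instantly
)
print(result, len(attempts))
```

In this run both `"smart"` deployments use all three attempts before the single fallback attempt succeeds, for seven attempts in total.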

Stats per deployment are updated on every attempt (`dep.requests`,
`dep.errors`, `dep.total_latency_ms`), which is how `lowest-latency` and
`least-cost` strategies get their data.
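
A minimal sketch of how such counters could feed a latency-based ordering is shown below. The field names follow the text, but the `DeploymentStats` class, its `record` method, and the use of mean latency as the ranking key are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class DeploymentStats:
    """Illustrative counters; the class itself is hypothetical."""
    model: str
    requests: int = 0
    errors: int = 0
    total_latency_ms: float = 0.0

    def record(self, latency_ms, ok):
        """Update counters after one attempt, successful or not."""
        self.requests += 1
        self.total_latency_ms += latency_ms
        if not ok:
            self.errors += 1

    @property
    def avg_latency_ms(self):
        # Untried deployments report 0.0, so they sort first and get sampled.
        return self.total_latency_ms / self.requests if self.requests else 0.0

a = DeploymentStats("openai/gpt-4o")
b = DeploymentStats("anthropic/claude-sonnet-4-6")
a.record(900.0, ok=True)
a.record(700.0, ok=False)
b.record(500.0, ok=True)

fastest_first = sorted([a, b], key=lambda d: d.avg_latency_ms)
print([d.model for d in fastest_first])
```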

## When to use `Router` vs `load_router`

| Use `ot.Router`                         | Use `ot.load_router`                  |
| --------------------------------------- | ------------------------------------- |
| You want explicit rules                 | You want the model picked per-prompt  |
| You care about availability (fallbacks) | You care about cost-quality tradeoff  |
| You're doing A/B across known models    | You have traffic but no routing rules |
| Fast to configure, understood by ops    | Pre-trained — no config needed        |

They compose: `Router` aliases can point at models that in turn go through
the semantic router via `"auto"`, so you can layer rule-based policy on
top of learned routing.
