# ot.completion

> A single OpenAI-compatible completion call for every provider, with optional engine routing and full tool and streaming support.

```python
ot.completion(
    model: str,
    messages: list[dict],
    *,
    api_key: Optional[str] = None,
    api_base: Optional[str] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    top_p: Optional[float] = None,
    stream: bool = False,
    stop: Optional[str | list[str]] = None,
    tools: Optional[list[dict]] = None,
    tool_choice: Optional[str | dict] = None,
    timeout: float = 120.0,
    num_retries: int = 0,
    fallbacks: Optional[list[str]] = None,
    force_engine: bool = False,
    force_direct: bool = False,
    **kwargs,
) -> ModelResponse | Iterator[StreamChunk]
```

## Parameters

| Name           | Type                  | Description                                                                                                                                                                              |
| -------------- | --------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `model`        | `str`                 | `"provider/model"` (e.g. `"openai/gpt-4o-mini"`) or a bare name that auto-detects (`"gpt-4o-mini"`, `"claude-3-haiku-20240307"`). `"auto"` means semantic routing — requires the engine. |
| `messages`     | `list[dict]`          | OpenAI-format messages: `[{"role": "user" \| "assistant" \| "system" \| "tool", "content": "..."}]`.                                                                                     |
| `api_key`      | `str?`                | Override the provider key from env. Defaults to the provider's env var (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, ...).                                                                     |
| `api_base`     | `str?`                | Override the provider base URL. Useful for proxies, vLLM, local models.                                                                                                                  |
| `temperature`  | `float?`              | `0.0`–`2.0`. Omitted if `None` (uses provider default).                                                                                                                                  |
| `max_tokens`   | `int?`                | Output cap.                                                                                                                                                                              |
| `top_p`        | `float?`              | Nucleus sampling.                                                                                                                                                                        |
| `stream`       | `bool`                | `True` → returns an iterator of `StreamChunk`.                                                                                                                                           |
| `stop`         | `str` or `list[str]?` | Stop sequence(s).                                                                                                                                                                        |
| `tools`        | `list[dict]?`         | Function/tool definitions — OpenAI format. The engine translates to provider-native shapes.                                                                                              |
| `tool_choice`  | `str` or `dict?`      | `"auto"`, `"required"`, `"none"`, or `{"type": "function", "function": {"name": "..."}}`.                                                                                                |
| `timeout`      | `float`               | Seconds. Default 120.                                                                                                                                                                    |
| `num_retries`  | `int`                 | Retries on transient errors before falling through to the next fallback model.                                                                                                           |
| `fallbacks`    | `list[str]?`          | Other model strings to try in order if `model` fails.                                                                                                                                    |
| `force_engine` | `bool`                | Always route through the OpenTracy engine even if `OPENTRACY_ENGINE_URL` is unset.                                                                                                       |
| `force_direct` | `bool`                | Always call the provider directly, skipping any engine routing.                                                                                                                          |
| `**kwargs`     |                       | Passed through to the request body (e.g. `user`, `logprobs`, `response_format`).                                                                                                         |

## Returns

### Non-streaming — `ModelResponse`

An OpenAI-compatible chat-completion dict with attribute access. Standard
fields plus OpenTracy extras:

```python
resp.id                                  # str
resp.choices[0].message.content          # the answer
resp.choices[0].message.tool_calls       # if tools were used
resp.usage.prompt_tokens                 # int
resp.usage.completion_tokens             # int
resp.usage.total_tokens                  # int

# Extras
resp._provider                           # "openai" | "anthropic" | ...
resp._cost                               # USD for this call (float)
resp._latency_ms                         # float
resp._routing                            # dict — alias, selected model, scores if engine route
```
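
The fields above can be gathered into a flat record for logging or cost tracking. The following is a minimal sketch, not part of the library: `summarize` is a hypothetical helper, and `fake_resp` is a hand-built stand-in so the snippet runs without network access. In real use you would pass the object returned by `ot.completion(...)`.

```python
from types import SimpleNamespace


def summarize(resp):
    """Collect usage counters and OpenTracy extras from one response into a dict."""
    return {
        "provider": resp._provider,
        "prompt_tokens": resp.usage.prompt_tokens,
        "completion_tokens": resp.usage.completion_tokens,
        "total_tokens": resp.usage.total_tokens,
        "cost_usd": resp._cost,
        "latency_ms": resp._latency_ms,
    }


# Hypothetical stand-in for a ModelResponse (assumption: attribute access as listed above).
fake_resp = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=8, total_tokens=20),
    _provider="openai",
    _cost=0.00003,
    _latency_ms=412.5,
)

summary = summarize(fake_resp)
print(summary)
```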

### Streaming — `Iterator[StreamChunk]`

```python
for chunk in ot.completion(..., stream=True):
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```

`StreamChunk` mirrors OpenAI's SSE delta format across all providers. The
engine translates Anthropic / Bedrock event-streams into OpenAI SSE.
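
If you need the full text after streaming finishes, the deltas can be accumulated as they arrive. The sketch below assumes only the chunk shape shown above (`chunk.choices[0].delta.content`). `collect_stream` and `fake_chunk` are hypothetical names, and the fake chunks stand in for the iterator that `ot.completion(..., stream=True)` returns.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas of a chunk iterator into one string."""
    parts = []
    for chunk in chunks:
        # Defensive check: some OpenAI-style streams can emit a chunk with no choices.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # skip None and empty deltas (e.g. role-only or tool-call chunks)
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    # Hypothetical stand-in for a StreamChunk, used only so the sketch runs offline.
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


full_text = collect_stream([fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo")])
print(full_text)
```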

## Examples

**Basic call:**

```python
import opentracy as ot

resp = ot.completion(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "hello"}],
    temperature=0,
    max_tokens=20,
)
```

**Cross-provider with fallbacks:**

```python
resp = ot.completion(
    model="anthropic/claude-sonnet-4-6",
    messages=[...],
    fallbacks=["openai/gpt-4o", "deepseek/deepseek-chat"],
    num_retries=1,
)
```
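
To reason about how `num_retries` and `fallbacks` combine, it can help to enumerate the attempts. The sketch below is one plausible reading of the parameter descriptions, not the library's actual implementation. It assumes the retry budget applies to every model in the chain, including fallbacks, which the parameter table does not confirm. `attempt_order` is a hypothetical helper.

```python
def attempt_order(model, fallbacks=None, num_retries=0):
    """List (model, attempt_number) pairs in the order they would be tried.

    Assumption: each model gets 1 initial try plus `num_retries` retries
    before the next model in the chain is tried.
    """
    order = []
    for name in [model, *(fallbacks or [])]:
        for attempt in range(1 + num_retries):
            order.append((name, attempt + 1))
    return order


plan = attempt_order(
    "anthropic/claude-sonnet-4-6",
    fallbacks=["openai/gpt-4o", "deepseek/deepseek-chat"],
    num_retries=1,
)
for name, attempt in plan:
    print(f"{name} (attempt {attempt})")
```

Under that assumption, the call above makes at most six provider requests before giving up.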

**Tool calling (provider-agnostic):**

```python
resp = ot.completion(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
        },
    }],
    tool_choice="auto",
)
print(resp.choices[0].message.tool_calls)
```
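
Once the model requests a tool, your code has to run it and send the result back as a `"tool"` message. The sketch below assumes each entry in `tool_calls` follows the OpenAI shape (`id`, `function.name`, and `function.arguments` as a JSON string). The local `get_weather` implementation, the `TOOLS` registry, and `run_tool_calls` are hypothetical, and `fake_calls` stands in for `resp.choices[0].message.tool_calls`.

```python
import json
from types import SimpleNamespace


def get_weather(city):
    # Hypothetical local implementation of the tool declared in the request.
    return {"city": city, "forecast": "sunny"}


# Registry mapping tool names to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(tool_calls):
    """Execute each requested tool and return OpenAI-format "tool" messages."""
    messages = []
    for call in tool_calls or []:  # tool_calls may be None when no tool was used
        func = TOOLS[call.function.name]
        args = json.loads(call.function.arguments or "{}")
        result = func(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),
        })
    return messages


# Hypothetical stand-in for resp.choices[0].message.tool_calls.
fake_calls = [
    SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(name="get_weather", arguments='{"city": "Paris"}'),
    )
]

tool_messages = run_tool_calls(fake_calls)
print(tool_messages)
```

To continue the conversation, the usual OpenAI-style pattern is to append the assistant message that contained the tool calls, then these tool messages, and call `ot.completion` again.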

**Engine routing (semantic auto):**

```python
# Set OPENTRACY_ENGINE_URL="http://localhost:8080", or pass force_engine=True
import opentracy as ot

resp = ot.completion(
    model="auto",          # engine picks per-prompt based on learned clusters
    messages=[...],
    force_engine=True,
)
print(resp._routing)       # {"selected_model": "gpt-4o-mini", "cluster_id": 84, ...}
```

## Async

```python
import asyncio
import opentracy as ot

async def main():
    resp = await ot.acompletion(
        model="openai/gpt-4o-mini",
        messages=[{"role": "user", "content": "hi"}],
    )
    return resp.choices[0].message.content

asyncio.run(main())
```

`acompletion` takes the same parameters and returns the same shape.
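
A common reason to use the async variant is to send several prompts concurrently. The following sketch uses `asyncio.gather` for that. `ask_all` is a hypothetical helper that takes the completion coroutine as an argument, and `fake_acompletion` is a stand-in so the snippet runs offline. In real use you would pass `ot.acompletion` instead.

```python
import asyncio
from types import SimpleNamespace


async def ask_all(acompletion, model, prompts):
    """Send one request per prompt concurrently and return the answers in order."""
    tasks = [
        acompletion(model=model, messages=[{"role": "user", "content": prompt}])
        for prompt in prompts
    ]
    # gather preserves input order regardless of which request finishes first.
    responses = await asyncio.gather(*tasks)
    return [r.choices[0].message.content for r in responses]


async def fake_acompletion(model, messages):
    # Hypothetical stand-in for ot.acompletion: echoes the prompt in upper case.
    await asyncio.sleep(0)
    text = messages[-1]["content"].upper()
    return SimpleNamespace(
        choices=[SimpleNamespace(message=SimpleNamespace(content=text))]
    )


answers = asyncio.run(ask_all(fake_acompletion, "openai/gpt-4o-mini", ["hi", "bye"]))
print(answers)
```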

## Errors

| Error                                                   | Meaning                                                                                                                              |
| ------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| `ValueError("Cannot resolve provider for model '...'")` | Bare model name that didn't match any known prefix, and `OPENTRACY_ENGINE_URL` isn't set. Add `provider/` prefix or set the env var. |
| `ValueError("No API key for <provider>")`               | Provider's env var isn't set and `api_key=` wasn't passed.                                                                           |
| `ValueError("Unknown provider: <name>")`                | Provider string doesn't match any of the 13 known providers.                                                                         |
| `ImportError("openai package required for async...")`   | `acompletion` needs the `openai` Python package (it's in the default install — reinstall if you see this).                           |
| Provider-specific HTTP errors                           | Surfaced as `openai.APIError` / `urllib.error.HTTPError` with the provider's status + message.                                       |
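
The `ValueError` cases in the table can be told apart by their message prefix. The sketch below maps each one to the remediation the table suggests. `explain_error` is a hypothetical helper, and the `raise` stands in for a failing `ot.completion(...)` call. Matching on message text is an assumption that the wording stays as documented here.

```python
def explain_error(exc):
    """Map a ValueError from the table above to a short remediation hint."""
    text = str(exc)
    if text.startswith("Cannot resolve provider"):
        return "add a provider/ prefix to the model or set OPENTRACY_ENGINE_URL"
    if text.startswith("No API key"):
        return "set the provider's env var or pass api_key="
    if text.startswith("Unknown provider"):
        return "check the provider name"
    return "unrecognized error"


try:
    # Stand-in for ot.completion(...) failing because no key is configured.
    raise ValueError("No API key for openai")
except ValueError as exc:
    hint = explain_error(exc)

print(hint)
```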