# ot.Distiller

> REST client for distillation — create jobs, poll progress, fetch artifacts

<Note>
  **For most users, use [`ot.distill()`](/api-reference/distill)** — it
  runs the full pipeline in-process and returns a callable `Student`
  with zero external services. `Distiller` is the long-running,
  queued, multi-tenant REST flow for teams running the full self-hosted
  stack.
</Note>

```python theme={null}
class ot.Distiller(
    base_url: str = "http://localhost:8000",
    api_key: Optional[str] = None,
    timeout: float = 60.0,
)
```

Thin HTTP client for the engine's `/v1/distillation/*` endpoints.
**Requires the self-hosted stack running** — see the
[self-host guide](/guides/self-host).

## Constructor

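| Name       | Type            | Default                   | Description                                                                             |
| ---------- | --------------- | ------------------------- | --------------------------------------------------------------------------------------- |
| `base_url` | `str`           | `"http://localhost:8000"` | Base URL of the REST API. The default assumes a local self-hosted engine on port 8000.  |
| `api_key`  | `Optional[str]` | `None`                    | Bearer token, needed only if the API is behind auth.                                    |
| `timeout`  | `float`         | `60.0`                    | Per-request HTTP timeout in seconds.                                                    |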
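One way to keep these settings out of source code is to read them from the environment. The sketch below is only an illustration: the helper `distiller_kwargs` and the `OPENTRACY_*` variable names are hypothetical, and nothing here implies the library reads them on its own. The fallbacks mirror the defaults in the signature above.

```python
import os


def distiller_kwargs(env=None):
    """Build constructor keyword arguments from environment-style settings.

    The OPENTRACY_* names are hypothetical; the fallbacks mirror the
    documented constructor defaults.
    """
    env = os.environ if env is None else env
    return {
        "base_url": env.get("OPENTRACY_BASE_URL", "http://localhost:8000"),
        # Treat an empty string the same as "no key configured".
        "api_key": env.get("OPENTRACY_API_KEY") or None,
        "timeout": float(env.get("OPENTRACY_TIMEOUT", "60.0")),
    }


# A plain dict stands in for os.environ so the example is reproducible.
kwargs = distiller_kwargs({"OPENTRACY_API_KEY": "secret-token"})
print(kwargs)
```

With the SDK installed, the result can be passed straight through as `ot.Distiller(**kwargs)`.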

## Methods

### `.create(...) → dict`

Create a new distillation job.

```python theme={null}
job = d.create(
    name="ticket-triage v1",            # label, any string
    dataset_id=None,                    # existing dataset to train on
    student_model="llama-3.2-1b",       # small open model to fine-tune
    teacher_model="openai/gpt-4o",      # model that produces labels
    num_prompts=1000,                   # how many prompts from the dataset
    n_samples=4,                        # BOND candidates per prompt
    training_steps=500,                 # fine-tune steps
    bond_beta=0.5,                      # BOND preference weight
    bond_gamma=0.1,                     # KL regularization
    temperature=0.8,                    # teacher sampling temperature
    export_gguf=True,                   # convert trained adapter to GGUF
    quantization_types=["q4_k_m", "q8_0"],
    description="",
    extra_config=None,                  # dict — passed through to engine
)

# Returns:
# {"id": "job_abc123", "status": "queued", "created_at": "...", ...}
```

### `.estimate(...) → dict`

Dry-run cost estimation — no job is created.

```python theme={null}
est = d.estimate(
    student_model="llama-3.2-1b",
    num_prompts=200,
    n_samples=2,
)
# {"estimated_cost": 0.94, "is_sandbox": False, "tier": "local",
#  "balance": 999999, "sufficient": True}
```
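The estimate can act as a gate before submitting. A minimal sketch, assuming the response has the `estimated_cost` and `sufficient` keys shown above; the helper `within_budget` and the `max_cost` cap are hypothetical, and the cap is in whatever unit the engine reports.

```python
def within_budget(est, max_cost):
    """Return True when the balance is sufficient and the cost is under a cap."""
    # A missing cost is treated as unaffordable rather than as free.
    cost = est.get("estimated_cost", float("inf"))
    return bool(est.get("sufficient")) and cost <= max_cost


# Sample response copied from the example above.
est = {"estimated_cost": 0.94, "is_sandbox": False, "tier": "local",
       "balance": 999999, "sufficient": True}

print(within_budget(est, max_cost=5.00))   # True
print(within_budget(est, max_cost=0.50))   # False
```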

### `.get(job_id) → dict`

Fetch current state of a job.

```python theme={null}
job = d.get("job_abc123")
# {
#   "id": "job_abc123",
#   "status": "generating",        # queued | generating | curating | training | exporting | completed | failed
#   "phase": "data_generation",
#   "progress": {"prompts_done": 120, "prompts_total": 500},
#   "metrics": {"teacher_cost_total": 0.82, ...},
#   ...
# }
```
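For logging, a job dict can be reduced to a single line. This is a sketch that assumes the response shape shown above; `describe` and `is_terminal` are hypothetical helpers, and the terminal set contains only the two states this page names as terminal.

```python
TERMINAL = {"completed", "failed"}


def describe(job):
    """Format a one-line summary from a job dict shaped like the one above."""
    progress = job.get("progress") or {}
    done = progress.get("prompts_done")
    total = progress.get("prompts_total")
    # Guard against a missing or zero total before dividing.
    pct = f"{100 * done / total:.0f}%" if done is not None and total else "n/a"
    return f'{job["id"]}: {job["status"]} ({job.get("phase", "?")}) {pct}'


def is_terminal(job):
    """True once the job can no longer change state."""
    return job["status"] in TERMINAL


job = {
    "id": "job_abc123",
    "status": "generating",
    "phase": "data_generation",
    "progress": {"prompts_done": 120, "prompts_total": 500},
}
print(describe(job))      # job_abc123: generating (data_generation) 24%
print(is_terminal(job))   # False
```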

### `.wait(job_id, timeout=3600, poll_interval=5.0, on_update=None) → dict`

Block until the job reaches a terminal state (`completed` or `failed`).

```python theme={null}
def show(update):
    print(update["status"], update.get("phase"), update.get("progress"))

job = d.wait("job_abc123", on_update=show)
```
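Conceptually, waiting is a polling loop around `.get`. The sketch below illustrates that pattern and is not the library's implementation: `wait_for` and `fake_get` are hypothetical names, and raising `TimeoutError` when the deadline passes is an assumption, since this page does not say what `.wait` does on timeout.

```python
import time


def wait_for(fetch, job_id, timeout=3600, poll_interval=5.0, on_update=None):
    """Poll fetch(job_id) until the status is terminal or the timeout elapses."""
    deadline = time.monotonic() + timeout
    last = None
    while True:
        job = fetch(job_id)
        # Only notify the callback when something actually changed.
        if on_update is not None and job != last:
            on_update(job)
        last = job
        if job["status"] in ("completed", "failed"):
            return job
        if time.monotonic() >= deadline:
            # Assumption: the real client may signal timeouts differently.
            raise TimeoutError(f"{job_id} still {job['status']} after {timeout}s")
        time.sleep(poll_interval)


# Stand-in for d.get that replays a fixed sequence of states.
states = iter(["queued", "training", "training", "completed"])


def fake_get(job_id):
    return {"id": job_id, "status": next(states)}


seen = []
final = wait_for(fake_get, "job_abc123", poll_interval=0.0,
                 on_update=lambda u: seen.append(u["status"]))
print(seen, final["status"])
```

The repeated `training` state is reported only once, because the callback fires on change rather than on every poll.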

### `.stream_progress(job_id, poll_interval=5.0) → Iterable[dict]`

Generator yielding status updates as they change.

```python theme={null}
for update in d.stream_progress("job_abc123"):
    print(update)
```
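When only state transitions matter, the stream can be wrapped in a small filter. A sketch under assumptions: `phase_transitions` is a hypothetical helper, and because this page does not say whether the generator ends by itself once the job finishes, the loop stops explicitly at a terminal status.

```python
def phase_transitions(updates):
    """Yield (status, phase) pairs only when they change; stop at a terminal status."""
    previous = None
    for update in updates:
        current = (update["status"], update.get("phase"))
        if current != previous:
            yield current
            previous = current
        if update["status"] in ("completed", "failed"):
            break


# A list stands in for d.stream_progress("job_abc123").
updates = [
    {"status": "queued"},
    {"status": "generating", "phase": "data_generation"},
    {"status": "generating", "phase": "data_generation"},
    {"status": "training"},
    {"status": "completed"},
    {"status": "queued"},  # never reached: iteration stops at "completed"
]
transitions = list(phase_transitions(updates))
for status, phase in transitions:
    print(status, phase)
```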

### `.metrics(job_id, limit=5000) → list[dict]`

Per-step training metrics (loss, ot, memory) — the series you'd plot.

```python theme={null}
for m in d.metrics("job_abc123"):
    print(m["step"], m["loss"], m["ot"])
```
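Before plotting, a quick numeric summary of the loss series is often enough to tell whether training converged. A minimal sketch, assuming each entry carries the `step` and `loss` keys used above; `loss_summary` is a hypothetical helper and the sample values are made up.

```python
def loss_summary(metrics, window=3):
    """Summarize a per-step loss series: first, last, best, and a trailing mean."""
    losses = [m["loss"] for m in metrics if m.get("loss") is not None]
    if not losses:
        return None
    tail = losses[-window:]
    return {
        "first": losses[0],
        "last": losses[-1],
        "best": min(losses),
        "tail_mean": sum(tail) / len(tail),
    }


# Made-up series standing in for d.metrics("job_abc123").
sample = [{"step": s, "loss": l}
          for s, l in [(1, 2.0), (2, 1.5), (3, 1.0), (4, 1.25), (5, 0.75)]]
summary = loss_summary(sample)
print(summary)
# {'first': 2.0, 'last': 0.75, 'best': 0.75, 'tail_mean': 1.0}
```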

### `.candidates(job_id, limit=100) → list[dict]`

Teacher-generated candidates (before curation), with judge scores.

### `.logs(job_id) → str`

Full text logs from the training subprocess.

### `.artifacts(job_id) → dict`

Paths to the trained artifacts on the engine side.

```python theme={null}
artifacts = d.artifacts("job_abc123")
# {
#   "adapter_path": "/app/data/distillation/job_abc123/adapter/",
#   "gguf_paths": {
#     "q4_k_m": ".../gguf/model-q4_k_m.gguf",
#     "q8_0":   ".../gguf/model-q8_0.gguf",
#   },
#   "tokenizer_path": ".../adapter/tokenizer.model",
#   "config_path": ".../train_config.json",
# }
```
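If a job exports several quantizations, a preference order makes the choice explicit. This sketch assumes the `gguf_paths` mapping shown above; `pick_gguf` is a hypothetical helper, and the paths are kept elided exactly as in the example.

```python
def pick_gguf(artifacts, preferred=("q4_k_m", "q8_0")):
    """Return (quantization, path) for the first preferred type present, else None."""
    paths = artifacts.get("gguf_paths") or {}
    for quant in preferred:
        if quant in paths:
            return quant, paths[quant]
    return None


# Response shape copied from the example above; paths stay elided.
artifacts = {
    "adapter_path": "/app/data/distillation/job_abc123/adapter/",
    "gguf_paths": {
        "q4_k_m": ".../gguf/model-q4_k_m.gguf",
        "q8_0": ".../gguf/model-q8_0.gguf",
    },
}
print(pick_gguf(artifacts))
print(pick_gguf(artifacts, preferred=("q8_0",)))
```

Keep in mind that these paths refer to the engine's filesystem, so they are only directly usable from a process that shares it.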

### `.cancel(job_id) → dict`

Cancel a running job. Safe at any phase; partial artifacts are kept.

### `.delete(job_id) → dict`

Delete the job record and all its artifacts from disk.

### `.list(status=None, limit=50, offset=0) → list[dict]`

List jobs, optionally filtered by status.

```python theme={null}
running = d.list(status="training")
recent  = d.list(limit=5)
```
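To walk every job rather than a single page, `limit` and `offset` can drive a generator. A sketch under assumptions: `iter_jobs` and `fake_list` are hypothetical, and treating a short or empty page as the end of the listing is an assumption about the endpoint, not documented behavior.

```python
def iter_jobs(list_page, status=None, page_size=50):
    """Yield every job by advancing offset until a short or empty page appears."""
    offset = 0
    while True:
        page = list_page(status=status, limit=page_size, offset=offset)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return
        offset += len(page)


# In-memory stand-in for d.list with the same keyword arguments.
jobs = [{"id": f"job_{i:03d}", "status": "completed"} for i in range(120)]


def fake_list(status=None, limit=50, offset=0):
    rows = [j for j in jobs if status is None or j["status"] == status]
    return rows[offset:offset + limit]


all_jobs = list(iter_jobs(fake_list))
print(len(all_jobs), all_jobs[0]["id"], all_jobs[-1]["id"])   # 120 job_000 job_119
```

With a real client, `iter_jobs(d.list, status="training")` would follow the same pattern.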

### `.teacher_models() → list[dict]`

Available teachers (populated from the engine's model registry).

```python theme={null}
for t in d.teacher_models():
    print(t["id"], t["provider"], t["available"])
```
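The registry listing can be used to validate a model choice before submitting a job. A minimal sketch, assuming the `id` and `available` keys printed above; `available_ids` is a hypothetical helper and the second registry entry is invented for illustration.

```python
def available_ids(models):
    """Return the ids of registry entries flagged as available."""
    return [m["id"] for m in models if m.get("available")]


# Made-up registry standing in for d.teacher_models().
teachers = [
    {"id": "openai/gpt-4o", "provider": "openai", "available": True},
    {"id": "example/teacher-b", "provider": "example", "available": False},
]

usable = available_ids(teachers)
print(usable)                       # ['openai/gpt-4o']
print("openai/gpt-4o" in usable)    # True
```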

### `.student_models() → list[dict]`

Available students (populated from the engine's HF whitelist + local
models). Typical entries include `llama-3.2-1b`, `llama-3.2-3b`,
`qwen3-0.6b`, `qwen3-1.7b`, `qwen3-4b`, `mistral-small`, `phi-3.5-mini`.

## TrainingClient (lower-level)

`Distiller` wraps `TrainingClient` — the raw HTTP layer. Use
`TrainingClient` directly only if you need fine control over retries,
custom endpoints, or headers.

```python theme={null}
from opentracy import TrainingClient

tc = TrainingClient(base_url="http://localhost:8000")
response = tc.post("/v1/distillation/jobs", json={...})
```

## Errors

All network / server errors raise `DistillerError` with the HTTP status
and response body attached. Wrap `.create` and `.wait` in try/except if
you need graceful degradation.

```python theme={null}
from opentracy import Distiller, DistillerError

d = Distiller()

try:
    job = d.create(...)
    job = d.wait(job["id"])
except DistillerError as e:
    print(f"distillation failed: {e.status} {e.message}")
    print(e.response_body)
```
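Transient server errors can be retried with exponential backoff. The sketch below is an illustration only: `with_retries` is a hypothetical helper, `FakeDistillerError` is a stand-in that models just the `.status` attribute used above, and the choice of 502, 503, and 504 as retryable statuses is an assumption.

```python
import time


class FakeDistillerError(Exception):
    """Stand-in for DistillerError that models only the .status attribute."""

    def __init__(self, status, message=""):
        super().__init__(message)
        self.status = status


def with_retries(call, error_type, attempts=3, base_delay=1.0,
                 retry_on=(502, 503, 504)):
    """Run call(); retry on selected HTTP statuses with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except error_type as e:
            # Re-raise non-retryable errors, and the last failure, unchanged.
            if getattr(e, "status", None) not in retry_on or attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))


calls = {"n": 0}


def flaky():
    """Fail twice with 503, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeDistillerError(503, "unavailable")
    return {"id": "job_abc123", "status": "queued"}


result = with_retries(flaky, FakeDistillerError, base_delay=0.0)
print(result["id"], "after", calls["n"], "attempts")
```

Retrying is safest for reads such as `d.get`; retrying `d.create` could submit the same job twice if the first request actually reached the engine.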

## Typical flow

```python theme={null}
from opentracy import Distiller

d = Distiller()

# 1. Discover
print([t["id"] for t in d.teacher_models()][:3])
print([s["id"] for s in d.student_models()][:3])

# 2. Estimate
est = d.estimate(student_model="llama-3.2-1b", num_prompts=500, n_samples=4)
assert est["sufficient"], "not enough credits"

# 3. Submit
job = d.create(
    name="ticket-triage v1",
    dataset_id="ds_support_tickets",
    teacher_model="openai/gpt-4o",
    student_model="llama-3.2-1b",
    num_prompts=500,
    n_samples=4,
    training_steps=100,
)

# 4. Wait + track
job = d.wait(
    job["id"],
    on_update=lambda u: print(u["status"], u.get("phase")),
)

# 5. Inspect
print(d.metrics(job["id"])[-1])           # last training step metric
artifacts = d.artifacts(job["id"])
print(artifacts["gguf_paths"])
```
