# Traces

> Every request captured — the raw asset the rest of the pipeline depends on

<img src="https://mintcdn.com/opentracy/GPH9CFICBELzB50g/images/calm.jpeg?fit=max&auto=format&n=GPH9CFICBELzB50g&q=85&s=4740496dd8898cd74494bf6f56fc4cfa" alt="OpenTracy ghost mascot peacefully watching the traffic — traces are quiet, always-on observation" width="1280" height="698" data-path="images/calm.jpeg" />

A **trace** is a row that records one LLM call: what was asked, what came
back, which model answered, how much it cost, and how long it took.

Traces are the single source of truth for everything downstream:
auto-routing, evaluation, dataset curation, and distillation. Without
traces, none of those stages has any data to work with.

## What's in a trace

Every completion that flows through the OpenTracy engine is persisted with
this shape:

| Field            | Type     | Meaning                                                             |
| ---------------- | -------- | ------------------------------------------------------------------- |
| `trace_id`       | uuid     | Primary key for this single request.                                |
| `tenant_id`      | string   | Which project / workspace this trace belongs to.                    |
| `timestamp`      | datetime | UTC start of the request.                                           |
| `model`          | string   | Concrete model the request was answered by (e.g. `gpt-4o-mini`).    |
| `provider`       | string   | `openai`, `anthropic`, `groq`, ... or `opentracy` for the distilled student model. |
| `input_text`     | string   | The user message(s) — full content.                                 |
| `output_text`    | string   | The assistant response — full content.                              |
| `input_tokens`   | int      | Prompt tokens.                                                      |
| `output_tokens`  | int      | Completion tokens.                                                  |
| `total_cost_usd` | float    | USD cost for this request (uses live pricing tables).               |
| `latency_ms`     | float    | End-to-end time in milliseconds.                                    |
| `status`         | string   | `success`, `error`, `timeout`.                                      |
| `routing_alias`  | string?  | The alias the app asked for, e.g. `smart`. Null if direct.          |
| `cluster_id`     | int?     | Which semantic cluster the prompt landed in (after embedding).      |
| `metadata`       | json     | Anything you attach: `user_id`, `session_id`, A/B flag, custom tags. |
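
For client-side code that handles trace rows, it can help to mirror the table as a typed structure. The following is a minimal sketch, not an OpenTracy class: the `Trace` dataclass and the sample values are assumptions made for illustration, and only the field names and types come from the table above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional
from uuid import UUID, uuid4


@dataclass
class Trace:
    """Hypothetical client-side mirror of the trace fields in the table."""
    trace_id: UUID
    tenant_id: str
    timestamp: datetime                    # UTC start of the request
    model: str
    provider: str
    input_text: str
    output_text: str
    input_tokens: int
    output_tokens: int
    total_cost_usd: float
    latency_ms: float
    status: str                            # "success", "error", or "timeout"
    routing_alias: Optional[str] = None    # None when the model was called directly
    cluster_id: Optional[int] = None       # None until the prompt is clustered
    metadata: dict[str, Any] = field(default_factory=dict)


# Made-up sample row; none of these values come from a real deployment.
example = Trace(
    trace_id=uuid4(),
    tenant_id="demo-project",
    timestamp=datetime.now(timezone.utc),
    model="gpt-4o-mini",
    provider="openai",
    input_text="Classify this ticket: 'My invoice is wrong.'",
    output_text="billing",
    input_tokens=14,
    output_tokens=1,
    total_cost_usd=0.000003,
    latency_ms=412.0,
    status="success",
    metadata={"feature": "ticket_classifier"},
)
```

The two nullable columns (`string?`, `int?`) become `Optional` fields with a `None` default, which matches a direct call that used no alias.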

**Where they live**: a ClickHouse column-store, designed so that billion-row
aggregations (cost per model per day, latency p99 per alias) stay fast.
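
To make those two aggregations concrete, here is a small in-memory sketch in Python. In a real deployment this work would be done by ClickHouse queries; the function names, the dict-based rows, and the sample values are all assumptions for illustration, and the p99 uses the simple nearest-rank method.

```python
import math
from collections import defaultdict
from datetime import datetime, timezone


def cost_per_model_per_day(traces):
    """Sum total_cost_usd grouped by (model, calendar day of timestamp)."""
    totals = defaultdict(float)
    for t in traces:
        totals[(t["model"], t["timestamp"].date())] += t["total_cost_usd"]
    return dict(totals)


def latency_p99_per_alias(traces):
    """Nearest-rank 99th percentile of latency_ms for each routing alias."""
    by_alias = defaultdict(list)
    for t in traces:
        if t["routing_alias"] is not None:   # skip direct (non-aliased) calls
            by_alias[t["routing_alias"]].append(t["latency_ms"])
    result = {}
    for alias, values in by_alias.items():
        values.sort()
        rank = math.ceil(0.99 * len(values))  # 1-based nearest rank
        result[alias] = values[rank - 1]
    return result


# Made-up rows carrying only the fields these aggregations need.
sample = [
    {"model": "gpt-4o-mini", "routing_alias": "smart", "total_cost_usd": 0.002,
     "latency_ms": 420.0, "timestamp": datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)},
    {"model": "gpt-4o-mini", "routing_alias": "smart", "total_cost_usd": 0.003,
     "latency_ms": 910.0, "timestamp": datetime(2024, 5, 1, 17, 30, tzinfo=timezone.utc)},
    {"model": "gpt-4o-mini", "routing_alias": None, "total_cost_usd": 0.001,
     "latency_ms": 380.0, "timestamp": datetime(2024, 5, 2, 8, 15, tzinfo=timezone.utc)},
]

daily_cost = cost_per_model_per_day(sample)
p99 = latency_p99_per_alias(sample)
```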

**How they're created**: you don't call a separate `log_trace()` API. Any
request that hits `/v1/chat/completions` on the engine — whether from the
Python SDK, the OpenAI SDK pointed at the engine URL, or a raw curl — is
traced automatically.

## How traces are created (three paths)

### Path A — Python SDK

```python theme={null}
import opentracy as ot

resp = ot.completion(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "..."}],
)
# A trace for this request is already in ClickHouse. The response also
# carries the usage summary as attached metadata:
print(resp._cost, resp._latency_ms, resp._routing)
```

### Path B — OpenAI SDK (drop-in)

```python theme={null}
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",   # OpenTracy engine
    api_key="any",                         # engine handles provider auth
)

resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "..."}],
)
# Traced. No library changes in your app.
```

### Path C — Raw HTTP

```bash theme={null}
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4o-mini", "messages": [{"role": "user", "content": "..."}]}'
```

Any OpenAI-compatible client works, because the engine speaks that protocol.

## Inspecting traces

The REST API exposes a search interface:

```bash theme={null}
curl "http://localhost:8000/v1/traces?model=gpt-4o-mini&limit=10"
```

The response is a list of trace rows matching the filter. Supported filters
include `model`, `provider`, `routing_alias`, `status`, `tenant_id`, `since`,
`until`, and full-text search on `input_text` / `output_text`.
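
The same search can be scripted from Python with the standard library. This is a sketch under assumptions: `traces_url` and `fetch_traces` are hypothetical helpers, not SDK functions, and the response is assumed to be JSON. Only the `/v1/traces` path, the port, and the filter names come from the text above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "http://localhost:8000"  # REST API address from the curl example


def traces_url(base=API_BASE, **filters):
    """Build a /v1/traces URL, skipping any filter whose value is None."""
    query = {k: v for k, v in filters.items() if v is not None}
    return f"{base}/v1/traces?{urlencode(query)}"


def fetch_traces(**filters):
    """GET the matching traces. Assumes the API answers with a JSON body."""
    with urlopen(traces_url(**filters)) as resp:
        return json.load(resp)


# Builds the same request as the curl example, without sending it.
url = traces_url(model="gpt-4o-mini", limit=10)
```

Calling `fetch_traces(model="gpt-4o-mini", limit=10)` would send the request, so it needs a running API to succeed.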

In the UI, traces live under **Traces → Browse** with filters, a cost/latency
chart on top, and a detail drawer that shows the full prompt + response
and lets you "add to dataset".

## Attaching metadata

You control metadata — use it to link traces back to your app's world:

```python theme={null}
resp = ot.completion(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "..."}],
    metadata={
        "user_id": "u_42",
        "session_id": "s_7af",
        "feature": "ticket_classifier",
        "ab_variant": "B",
    },
)
```

When you go to build a dataset later, these fields are how you say
"give me all traces from the `ticket_classifier` feature where the
user gave positive feedback".
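
As a rough illustration of that kind of selection, the sketch below filters already-fetched trace rows by their metadata. It assumes each row is a dict with a `metadata` dict; the `traces_for_feature` helper and the `feedback` key are hypothetical and only exist if your app attaches them.

```python
def traces_for_feature(traces, feature, **extra):
    """Keep traces whose metadata matches `feature` and every extra key/value."""
    wanted = {"feature": feature, **extra}
    return [
        t for t in traces
        if all(t.get("metadata", {}).get(k) == v for k, v in wanted.items())
    ]


# Made-up rows; "feedback" is an app-defined metadata key, not a built-in field.
rows = [
    {"trace_id": "a", "metadata": {"feature": "ticket_classifier", "feedback": "positive"}},
    {"trace_id": "b", "metadata": {"feature": "ticket_classifier", "feedback": "negative"}},
    {"trace_id": "c", "metadata": {"feature": "summarizer", "feedback": "positive"}},
    {"trace_id": "d", "metadata": {}},
]

selected = traces_for_feature(rows, "ticket_classifier", feedback="positive")
```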

## Privacy and PII

Traces store **full prompt and response text by default**, which is what
makes distillation possible. Two settings cover environments that cannot
store that text:

* Set `OPENTRACY_TRACE_REDACT=true` to strip matched patterns (emails,
  phone numbers, credit cards) before the trace is persisted.
* Set `OPENTRACY_TRACE_CONTENT=false` to store only metadata + token
  counts + cost, dropping the text entirely. You lose distillation
  ability but keep cost analytics.
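
To show what these two modes mean for a stored row, here is an illustrative sketch. It is not the engine's implementation: the regular expressions, the placeholder tokens, and the `prepare_for_storage` helper are assumptions, and the engine's real patterns may match differently.

```python
import re

# Deliberately simple, hypothetical patterns for the three categories.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
CARD = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")   # 13 to 16 digits, optional separators
PHONE = re.compile(r"\+?\d[\d ().-]{7,}\d")


def redact(text):
    """Replace matched emails, card numbers, and phone numbers with tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    text = CARD.sub("[CARD]", text)   # run before PHONE, which would also match card digits
    return PHONE.sub("[PHONE]", text)


def prepare_for_storage(trace, redact_patterns=False, keep_content=True):
    """Return a copy of the row as it might be stored under each setting."""
    row = dict(trace)
    if not keep_content:
        # Metadata-only mode: drop the text, keep tokens, cost, and metadata.
        row.pop("input_text", None)
        row.pop("output_text", None)
    elif redact_patterns:
        row["input_text"] = redact(row["input_text"])
        row["output_text"] = redact(row["output_text"])
    return row


raw = {
    "input_text": "Contact jane@example.com or +1 415-555-0132, card 4111 1111 1111 1111.",
    "output_text": "Noted.",
    "input_tokens": 24,
    "total_cost_usd": 0.00001,
}
redacted = prepare_for_storage(raw, redact_patterns=True)
metadata_only = prepare_for_storage(raw, keep_content=False)
```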

A per-trace override is possible via the `X-Opentracy-Trace` header:

```python theme={null}
resp = ot.completion(
    ...,
    extra_headers={"X-Opentracy-Trace": "metadata-only"},
)
```

## From trace to dataset

One trace is a row. A dataset is a curated bundle of many rows, grouped
by intent. The transformation is covered in [Datasets](/concepts/datasets) —
that's where the pipeline starts producing training-ready artifacts.
