# OpenTracy

> The auto-distillation layer for your LLM calls — drop-in OpenAI-compatible SDK that turns every request into training data for cheaper custom models.

## Docs

- [ot.completion](https://opentracy.com/docs/api-reference/completion.md): The single-call OpenAI-compatible completion — every provider, optional engine routing, full tool/stream support
- [ot.distill](https://opentracy.com/docs/api-reference/distill.md): One-call distillation — train a custom student from a dataset and get back a callable Student
- [ot.Distiller](https://opentracy.com/docs/api-reference/distiller.md): REST client for distillation — create jobs, poll progress, fetch artifacts
- [ot.load_router](https://opentracy.com/docs/api-reference/load-router.md): Load the semantic auto-router with pre-trained weights
- [POST /v1/chat/completions](https://opentracy.com/docs/api-reference/rest/chat-completions.md): The OpenAI-compatible completion endpoint — every provider, optional semantic routing
- [Distillation endpoints](https://opentracy.com/docs/api-reference/rest/distillation.md): Create jobs, poll status, fetch artifacts over plain HTTP
- [GET /v1/models & /health](https://opentracy.com/docs/api-reference/rest/models.md): Discover what's configured and check service readiness
- [REST API Overview](https://opentracy.com/docs/api-reference/rest/overview.md): HTTP-first access to OpenTracy — for apps in any language, not just Python
- [POST /v1/route](https://opentracy.com/docs/api-reference/rest/route.md): Get the router's decision without running the completion
- [GET /v1/traces](https://opentracy.com/docs/api-reference/rest/traces.md): Search captured requests by model, time, status, or metadata
- [ot.Router](https://opentracy.com/docs/api-reference/router.md): Rule-based router with aliases, load balancing, and fallbacks (LiteLLM-style)
- [Auto-routing](https://opentracy.com/docs/concepts/auto-routing.md): How the router picks a model per prompt — semantic clusters, per-model error profiles, cost-quality tradeoff
- [Datasets](https://opentracy.com/docs/concepts/datasets.md): How curated traces become the training-ready data that feeds distillation and evaluation
- [Distillation](https://opentracy.com/docs/concepts/distillation.md): Train a cheap student model from teacher labels; this is the capability that separates OpenTracy from a generic LLM gateway
- [The Pipeline](https://opentracy.com/docs/concepts/pipeline.md): How requests, traces, datasets, and distilled models fit together into one closed loop
- [Traces](https://opentracy.com/docs/concepts/traces.md): Every request captured — the raw asset the rest of the pipeline depends on
- [Drop-in OpenAI replacement](https://opentracy.com/docs/guides/drop-in-openai.md): Point any existing OpenAI SDK app at OpenTracy — zero library changes, every request becomes a trace
- [Python SDK](https://opentracy.com/docs/guides/python-sdk.md): Using opentracy directly — the Python-first path for new apps
- [Self-hosting](https://opentracy.com/docs/guides/self-host.md): Run the full stack — engine, REST API, ClickHouse, and UI — with Docker Compose
- [Welcome to OpenTracy](https://opentracy.com/docs/index.md): The auto-distillation layer for your LLM calls — drop in, cut cost, keep quality
- [Quickstart](https://opentracy.com/docs/quickstart.md): Two lines to your first completion with cost + latency — no server, no setup

## OpenAPI Specs

- [openapi](https://opentracy.com/docs/api-reference/openapi.json): Machine-readable OpenAPI description of the REST API