# Quickstart

> Two lines to your first completion with cost + latency: no server, no setup

By the end of this page, **in under three minutes**, you'll have made a real LLM call, seen the cost and latency on the response, swapped providers with one string change, and added automatic fallbacks. No server, no Docker, no config files.

**What you need right now:** an OpenAI API key (or Anthropic, Groq, etc.; any of the 13 providers). Nothing else.

## 1. Install (30 seconds)

```bash
pip install opentracy
```

```bash
export OPENAI_API_KEY=sk-...
```

## 2. Your first call (30 seconds)

```python
import opentracy as ot

resp = ot.completion(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello in three words."}],
)

print(resp.choices[0].message.content)
print(f"cost: ${resp._cost:.6f} latency: {resp._latency_ms:.0f}ms")
```

```text
Hi there, friend!
cost: $0.000008 latency: 612ms
```

**This is the hook.** Every response already carries `_cost` and `_latency_ms`. You didn't wire up any observability; it's on by default.

`ot.completion` is OpenAI-compatible, so `resp.choices[0].message.content`, `resp.usage`, and streaming all work like you'd expect.

## 3. Switch providers with one string (1 minute)

Same function, same message shape, different provider.
No new SDK, no new auth code:

```python
# Anthropic
resp = ot.completion(
    model="anthropic/claude-haiku-4-5-20251001",
    messages=[{"role": "user", "content": "Say hello in three words."}],
)

# Groq (Llama, sub-second)
resp = ot.completion(
    model="groq/llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Say hello in three words."}],
)

# DeepSeek (cheap reasoning)
resp = ot.completion(
    model="deepseek/deepseek-chat",
    messages=[{"role": "user", "content": "Say hello in three words."}],
)
```

Each provider reads its own env var (`ANTHROPIC_API_KEY`, `GROQ_API_KEY`, `DEEPSEEK_API_KEY`, ...). The 13-provider matrix is in the [completion reference](/api-reference/completion#parameters).

## 4. Add fallbacks (1 minute)

Production calls that survive one provider being down:

```python
resp = ot.completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Draft a pithy tagline."}],
    fallbacks=[
        "anthropic/claude-sonnet-4-6",
        "deepseek/deepseek-chat",
    ],
    num_retries=1,
)

print(resp._provider)  # which one actually answered
```

If OpenAI rate-limits you, Anthropic picks up. If Anthropic is degraded, DeepSeek does. You don't get paged.

## 5. Done. What you now have.

- Same message format, same response shape. Any existing code moves over.
- Switch at any time. One string change, no auth rewrite.
- `_cost` and `_latency_ms` on every response. No setup.
- Survive provider outages without writing retry logic yourself.

## Where to go next

- Point existing OpenAI code at OpenTracy with zero library changes. **\~2 minutes.**
- Let the router pick the cheapest model that's good enough per prompt. **\~5 minutes** (downloads \~100 MB of weights once).
- Self-host to capture every trace in ClickHouse + a UI for analytics. **\~30 minutes** (needs Docker).
- Fine-tune a tiny student from your traffic. The cost-reduction wedge. **\~2 hours** (needs self-host + a GPU).

## Optional: try the semantic auto-router

If you want to see the full pipeline in action, including the model picking itself per prompt based on learned error profiles, load the pre-trained router. This downloads \~100 MB of weights on first run and caches them in `~/.local/share/opentracy/`.

```python
import opentracy as ot

router = ot.load_router(cost_weight=0.5)

for prompt in [
    "What is the capital of France?",
    "Prove the square root of 2 is irrational.",
    "Write a haiku about autumn.",
]:
    d = router.route(prompt)
    print(f"[{d.selected_model:<24}] cluster={d.cluster_id:>3} {prompt}")
```

```text
[ministral-3b-latest     ] cluster= 84 What is the capital of France?
[gpt-4o                  ] cluster= 47 Prove the square root of 2 is irrational.
[ministral-3b-latest     ] cluster= 29 Write a haiku about autumn.
```

Easy trivia → a cheap small model. Math proof → a strong model. **No rules from you.** See [Auto-routing](/concepts/auto-routing) for the full picture.
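
To build some intuition for what a `cost_weight` knob could mean, here is a toy sketch of cost-weighted model selection. It is not OpenTracy's actual routing logic: the cluster names, error rates, relative prices, scoring formula, and the `pick_model` helper are all hypothetical, chosen only to show how a per-cluster error profile might be traded off against price.

```python
# Hypothetical illustration only: NOT OpenTracy's real router.
# All numbers below are made up for the example.

# Assumed error rate of each model on two imaginary prompt clusters.
ERROR_PROFILES = {
    "trivia": {"small-model": 0.05, "large-model": 0.02},
    "proofs": {"small-model": 0.60, "large-model": 0.10},
}

# Assumed relative price per call, scaled to the range [0, 1].
RELATIVE_COST = {"small-model": 0.05, "large-model": 0.40}


def pick_model(cluster: str, cost_weight: float) -> str:
    """Return the model with the lowest blend of expected error and cost."""
    if not 0.0 <= cost_weight <= 1.0:
        raise ValueError("cost_weight must be between 0 and 1")
    errors = ERROR_PROFILES[cluster]

    def score(model: str) -> float:
        # Lower is better: weight error by (1 - cost_weight), price by cost_weight.
        return (1.0 - cost_weight) * errors[model] + cost_weight * RELATIVE_COST[model]

    return min(errors, key=score)


for cluster in ("trivia", "proofs"):
    print(f"{cluster}: {pick_model(cluster, cost_weight=0.5)}")
```

In this toy model, raising `cost_weight` toward 1 pushes every cluster to the cheaper option, while lowering it toward 0 favors the model with the lowest assumed error rate.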