1 · Estimate your prompt & set your traffic
Token counts use a client-side BPE-approximate estimator (±10–15%). Nothing you paste ever leaves your browser.
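For a feel of how a client-side estimate with an error band can work, here is a minimal sketch. It does not reproduce this page's BPE-approximate estimator: it swaps in a much cruder characters-per-token rule of thumb, and the function name, the 4-characters-per-token ratio, and the 15% tolerance default are all assumptions chosen for illustration.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0, tolerance: float = 0.15):
    """Return (low, mid, high) token estimates for `text`.

    Assumption: ~4 characters per token, a rough rule of thumb for English
    prose. This is NOT a BPE tokenizer and will drift on code or non-English text.
    """
    if not text:
        return (0, 0, 0)
    # Ceiling division so that any non-empty text counts as at least one token.
    mid = -(-len(text) // int(chars_per_token)) if chars_per_token == int(chars_per_token) \
        else int(len(text) / chars_per_token) + 1
    low = round(mid * (1 - tolerance))   # optimistic end of the band
    high = round(mid * (1 + tolerance))  # pessimistic end of the band
    return (low, mid, high)
```

Budgeting against the high end of the band is the safer choice when a prompt sits close to a model's context limit.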
Cached input is billed at roughly 25% of the normal input rate (a cross-provider average). The cache discount stacks with the batch discount.
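As a rough sketch of how the two discounts might combine, the function below bills cached tokens at 25% of the list rate and then applies a batch discount on top. Treating "stacks" as multiplicative is an assumption, as are the function name and the 50% `batch_discount` default; actual provider rules vary.

```python
def effective_input_cost(input_tokens, price_per_million, cached_fraction=0.0,
                         cached_rate=0.25, batch=False, batch_discount=0.5):
    """USD cost of the input side of one request.

    cached_rate=0.25 mirrors the ~25% cross-provider average quoted above.
    batch_discount=0.5 is an assumed placeholder, not a quoted rate.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    # Cached tokens are billed at a reduced rate, fresh tokens at the full rate.
    cost = (fresh + cached * cached_rate) * price_per_million / 1_000_000
    if batch:
        # Assumption: the batch discount multiplies the already-discounted cost.
        cost *= (1 - batch_discount)
    return cost
```

For example, 1M input tokens at $2.00 per 1M with half of them cached comes to $1.25, and $0.625 if the assumed 50% batch discount also applies.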
2 · Cheapest model that fits your context window
Your context need = input + output tokens per request (plus headroom for history or RAG chunks). Models that don't fit are dimmed in the table below.
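The fit check described above can be sketched as follows. The model entries, field names, and prices are hypothetical, and the sketch assumes headroom counts toward the context need but is not billed.

```python
def context_need(input_tokens, output_tokens, headroom_tokens=0):
    """Tokens the context window must hold for one request."""
    return input_tokens + output_tokens + headroom_tokens

def cheapest_fitting(models, input_tokens, output_tokens, headroom_tokens=0):
    """Return the cheapest model whose context window fits, or None."""
    need = context_need(input_tokens, output_tokens, headroom_tokens)
    fitting = [m for m in models if m["context"] >= need]
    if not fitting:
        return None

    def per_request_usd(m):
        # Assumption: only input and output tokens are billed, not the headroom.
        return (input_tokens * m["input_per_m"]
                + output_tokens * m["output_per_m"]) / 1_000_000

    return min(fitting, key=per_request_usd)

# Hypothetical models; prices are USD per 1M tokens.
MODELS = [
    {"name": "model-a", "context": 8_000, "input_per_m": 0.5, "output_per_m": 1.5},
    {"name": "model-b", "context": 128_000, "input_per_m": 3.0, "output_per_m": 15.0},
    {"name": "model-c", "context": 200_000, "input_per_m": 1.0, "output_per_m": 5.0},
]
```

With 20,000 input tokens, 1,000 output tokens, and 4,000 tokens of headroom, the need is 25,000 tokens, so `model-a` is excluded despite its lower price and `model-c` is the cheapest model that fits.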
3 · Side-by-side model comparison
Click any column to sort. Prices are standard API list rates in USD per 1M tokens — estimates as of June 2026.
| Model | Tier | Input /1M | Output /1M | Context | Quality* | Monthly | Value |
|---|---|---|---|---|---|---|---|
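
A figure like the Monthly column can be derived from the per-1M list rates along these lines. This is a minimal sketch that assumes traffic is expressed as requests per month and ignores cache and batch discounts; the function and parameter names are illustrative.

```python
def monthly_cost(requests_per_month, input_tokens, output_tokens,
                 input_per_m, output_per_m):
    """Estimated monthly spend in USD at list rates (no discounts applied)."""
    # Token-weighted price of one request, still scaled by 1M.
    per_request_scaled = input_tokens * input_per_m + output_tokens * output_per_m
    # Prices are quoted per 1M tokens, so divide once at the end.
    return requests_per_month * per_request_scaled / 1_000_000
```

For instance, 100,000 requests a month at 1,500 input and 500 output tokens each, priced at $3.00 input and $15.00 output per 1M, works out to about $1,200 per month.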
Monthly spend by provider
Cheapest model vs. highest-quality model per provider at your current traffic (log scale).
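The pairing this chart shows can be sketched as a group-by over the table rows. The `provider`, `monthly`, and `quality` field names are assumptions for illustration, and `log_position` simply shows what plotting on a log scale does to the spend values.

```python
import math
from collections import defaultdict

def provider_extremes(rows):
    """Map each provider to (cheapest_row, highest_quality_row)."""
    by_provider = defaultdict(list)
    for row in rows:
        by_provider[row["provider"]].append(row)
    return {
        provider: (min(group, key=lambda r: r["monthly"]),
                   max(group, key=lambda r: r["quality"]))
        for provider, group in by_provider.items()
    }

def log_position(monthly_usd):
    """Position of a spend value on a base-10 log axis."""
    if monthly_usd <= 0:
        raise ValueError("a log scale needs a positive spend")
    return math.log10(monthly_usd)
```

On a log axis, $10 and $1,000 per month sit two units apart, which keeps a cheap model visible next to one that costs a hundred times more.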
Unlock the cost-modeling toolkit
A blended multi-model router calculator, saved scenarios with side-by-side compare, and one-click PDF + live-formula Excel export. One-time unlock — runs offline on this device.