Pricing · 2026-06-29
LLM Pricing
Token costs at major providers, in USD per 1M tokens.
AI APIs ranked by input + output token cost.
OpenAI API Pricing: All OpenAI model prices in one table, including GPT-5, GPT-5 Mini, embeddings and more.
Anthropic Claude Pricing: All Anthropic Claude prices, including Opus, Sonnet, Haiku and prompt caching costs.
About this list
Almost every commercial LLM provider charges separately for input tokens (the prompt you send) and output tokens (the response you get back). Output tokens typically cost 3–5× more than input tokens because generation is autoregressive: each output token depends on all the tokens before it, so the response is produced one token at a time, whereas the whole prompt can be processed in parallel.
The table below normalises all prices to USD per 1 million tokens. This is the industry-standard unit; per-1K or per-token rates are easy to misread by 1000×. The "Total" column sums input + output for a quick apples-to-apples comparison, but your real cost depends heavily on your input/output ratio.
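To make the point about input/output ratio concrete, here is a minimal sketch of per-request cost from per-1M prices. The two models and their prices are hypothetical, chosen so that both have the same "Total" while differing in real cost for an input-heavy workload.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one request, with prices quoted in USD per 1M tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Hypothetical prices for illustration only (not taken from the table).
model_a = {"input": 1.00, "output": 9.00}  # "Total" column would show 10.00
model_b = {"input": 4.00, "output": 6.00}  # "Total" column would show 10.00

# Input-heavy workload: 10,000 input tokens and 500 output tokens per request.
cost_a = request_cost_usd(10_000, 500, model_a["input"], model_a["output"])
cost_b = request_cost_usd(10_000, 500, model_b["input"], model_b["output"])

print(f"model_a: ${cost_a:.4f} per request")
print(f"model_b: ${cost_b:.4f} per request")
```

With these made-up numbers, model_a is about three times cheaper per request despite the identical "Total", because the workload is dominated by input tokens.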
What the table does NOT show
- Prompt caching discounts: Anthropic and OpenAI offer cache_read rates roughly 2–10× cheaper than standard input (50–90% off). If you reuse a long system prompt, caching can dominate total cost.
- Tiered pricing above 200K tokens: Google and Anthropic charge a premium for very long inputs. Each model detail page shows the >200K tier when applicable.
- Reasoning tokens: thinking models (o-series, Claude Extended Thinking, DeepSeek R1) bill internal reasoning at output rates, and the reasoning often runs 2–3× the length of the visible answer.
- Volume discounts and prepaid credits: most providers offer 10–30% off at scale. These are not reflected here.
- Audio / image surcharges: multimodal inputs have separate per-image or per-second rates.
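As a rough illustration of how the first and third items change a bill, the sketch below prices one request with cached input and hidden reasoning tokens. All prices and token counts are hypothetical, and the model is simplified: it ignores cache-write surcharges, long-context tiers and volume discounts.

```python
def adjusted_cost_usd(fresh_input: int, cached_input: int,
                      visible_output: int, reasoning_tokens: int,
                      input_price: float, cache_read_price: float,
                      output_price: float) -> float:
    """Per-request cost in USD; all prices are USD per 1M tokens.

    Assumes reasoning tokens are billed at the output rate and cached
    input at a separate cache-read rate (a simplification).
    """
    billed_output = visible_output + reasoning_tokens
    return (fresh_input * input_price
            + cached_input * cache_read_price
            + billed_output * output_price) / 1_000_000


# Hypothetical prices: $3.00 input, $0.30 cache read, $15.00 output per 1M tokens.
# Request: 8,000-token cached system prompt, 500 fresh input tokens,
# 400 visible output tokens, 800 reasoning tokens.
with_extras = adjusted_cost_usd(500, 8_000, 400, 800, 3.00, 0.30, 15.00)

# Headline-rate estimate: no caching, reasoning tokens not counted.
headline = adjusted_cost_usd(8_500, 0, 400, 0, 3.00, 0.30, 15.00)

print(f"with caching and reasoning: ${with_extras:.4f}")
print(f"headline estimate:          ${headline:.4f}")
```

In this example caching cuts the input share sharply, while reasoning tokens triple the output share, so the two effects pull in opposite directions and neither is visible in headline rates.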
How to use this page
- Find the cheapest model that meets your context window and capability requirements (tool calling, structured output, vision).
- Open its detail page to check cache pricing, output limits and provider availability.
- Use the pricing calculator to estimate monthly cost for your actual token volume.
- Compare 2–3 finalists side-by-side on a comparison page.
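The steps above can be sketched as a small filter-and-sort over a model catalogue. The `Model` record, its field names and the sample entries are all hypothetical; this is one possible way to shortlist finalists, not how the site itself ranks models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    context_window: int
    capabilities: frozenset
    input_price: Optional[float]   # USD per 1M tokens, None if unknown
    output_price: Optional[float]  # USD per 1M tokens, None if unknown


def shortlist(models, min_context, required, input_tokens, output_tokens, k=3):
    """Return up to k eligible models, cheapest first for the given request shape."""
    def cost(m):
        return (input_tokens * m.input_price + output_tokens * m.output_price) / 1_000_000

    eligible = [
        m for m in models
        if m.context_window >= min_context
        and required <= m.capabilities           # all required capabilities present
        and m.input_price is not None            # skip models with unknown pricing
        and m.output_price is not None
    ]
    return sorted(eligible, key=cost)[:k]


# Made-up catalogue entries for illustration.
catalogue = [
    Model("alpha", 128_000, frozenset({"tool_calling", "vision"}), 0.50, 2.00),
    Model("beta", 32_000, frozenset({"tool_calling"}), 0.10, 0.40),
    Model("gamma", 200_000, frozenset({"tool_calling", "structured_output"}), 1.00, 4.00),
    Model("delta", 128_000, frozenset({"tool_calling"}), None, None),
]

finalists = shortlist(catalogue, min_context=100_000, required={"tool_calling"},
                      input_tokens=3_000, output_tokens=500)
print([m.name for m in finalists])
```

Here "beta" is excluded for its small context window and "delta" for its unknown price, leaving "alpha" and "gamma" as finalists.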
Prices are refreshed daily. Models showing "Unknown" do not publish a public per-token rate — this usually means enterprise-only or invite-gated access. We deliberately do not show them as $0 to avoid misleading rankings.
Showing top 100 of 977. Use the full directory to see the rest.
Frequently asked questions
Why are output tokens more expensive than input tokens?
Generation is autoregressive — each output token requires a full forward pass conditioned on all previous tokens, which cannot be batched as efficiently as reading input tokens in parallel. The 3–5× premium most providers charge reflects this compute asymmetry.
What does 'per 1M tokens' mean in practice?
One million tokens is roughly 750,000 English words, or about 3,000 pages of standard text. A typical chatbot request uses 1,000–5,000 input tokens and 200–1,000 output tokens, so 1M tokens covers roughly 170 to 830 such requests, depending on your workload.
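The arithmetic behind that estimate can be checked in a few lines. This sketch pools input and output tokens into one count, which is a simplification, since the two are billed at different rates.

```python
def requests_per_million(input_tokens: int, output_tokens: int) -> int:
    """How many whole requests of this size fit into 1M total tokens."""
    return 1_000_000 // (input_tokens + output_tokens)


# Smallest and largest "typical" chatbot requests from the ranges quoted above.
light = requests_per_million(1_000, 200)    # 1,200 tokens per request
heavy = requests_per_million(5_000, 1_000)  # 6,000 tokens per request

print(light, heavy)
```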
Why are some models showing 'Unknown' instead of a price?
We deliberately do not coerce missing data to $0. 'Unknown' means the provider does not publish a public per-token rate — often models behind enterprise sales or invite-only access. Treating Unknown as free would push paid-but-unpriced models to the top of every cheap list.
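A minimal sketch of the difference, using `None` to stand for "Unknown". The model names and prices are invented, and the function is an illustration of the idea rather than the site's actual ranking code.

```python
from typing import Dict, List, Optional


def rank_by_total(prices: Dict[str, Optional[float]]) -> List[str]:
    """Sort model names by total price, keeping unknown prices at the end."""
    # The key is a tuple: priced models (False) sort before unpriced ones (True),
    # then by price within the priced group.
    return sorted(prices, key=lambda name: (prices[name] is None, prices[name] or 0.0))


def rank_naive(prices: Dict[str, Optional[float]]) -> List[str]:
    """What happens if missing prices are coerced to 0."""
    return sorted(prices, key=lambda name: prices[name] or 0.0)


# Invented totals in USD per 1M tokens; None means no public price.
totals = {"model-a": 2.50, "model-b": None, "model-c": 0.40}

print(rank_by_total(totals))  # unpriced model ranked last
print(rank_naive(totals))     # unpriced model wrongly ranked first
```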
How often do these prices change?
Vendor list-price moves are typically picked up within hours of an announcement, and our pipeline re-syncs daily. Each change is written to /changelog so you can audit historical pricing over time.
Does this include prompt caching or batch discounts?
No. The table shows standard headline rates only. Prompt caching (Anthropic, OpenAI) can reduce input cost by 50–90%. Batch API discounts (OpenAI) offer ~50% off for non-real-time workloads. Both are shown on each model's detail page.
Prices in USD per 1M tokens. Unknown means the provider does not publish per-token pricing.
Pricing and capabilities are refreshed daily and reconciled against each provider's official documentation. Always verify critical production decisions with the provider directly.