AI Model Intelligence

Best AI models · 2026-05-12

Cheapest LLM APIs in 2026

Lowest-priced text models from established vendors.

How we picked these

  • Sorted by sum of input + output price per 1M tokens.
  • Models with $0 placeholder pricing (GitHub Copilot, free-tier rebroadcasts) are excluded, because an unknown price is not the same as free.
  • Only text models with valid context windows are included.
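The three rules above can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the `Model` fields, the `rank_cheapest` helper and the `placeholder-model` entry are hypothetical, and treating an Unknown price as contributing 0 to the sum is an assumption (it appears consistent with the order shown on this page, but the page does not state it).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Model:
    name: str
    input_price: Optional[float]   # USD per 1M input tokens, None if unknown
    output_price: Optional[float]  # USD per 1M output tokens, None if unknown
    context: Optional[int]         # context window in tokens, None if missing
    modality: str = "text"


def total_price(m: Model) -> float:
    # Assumption: an unknown price contributes 0 to the sum.
    return (m.input_price or 0.0) + (m.output_price or 0.0)


def rank_cheapest(models: List[Model], limit: int = 10) -> List[Model]:
    eligible = [
        m for m in models
        if m.modality == "text"
        and m.context is not None and m.context > 0
        and total_price(m) > 0  # drops $0 placeholder entries
    ]
    return sorted(eligible, key=total_price)[:limit]


sample = [
    Model("BGE M3", 0.012, None, 128_000),
    Model("Voxtral Small 24B 2507", 0.002, 0.002, 32_000),
    Model("placeholder-model", 0.0, 0.0, 128_000),  # hypothetical $0 entry
    Model("All-MiniLM-L6-v2", 0.009, None, 256),
    Model("BGE Reranker Base", 0.003, None, 128_000),
]
for rank, m in enumerate(rank_cheapest(sample), start=1):
    print(rank, m.name, f"${total_price(m):.3f}")
```

The sample prices are the ones listed on this page. The placeholder entry is dropped because its total is $0.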

Top 10 picks

1. BGE Reranker Base (cloudflare-ai-gateway)

$0.003 in / Unknown out

  • Context: 128K
  • Providers: 1

Voxtral Small 24B 2507 (Mistral)

$0.002 in / $0.002 out

  • Context: 32K
  • Providers: 3
  • Tool calling
  • Structured output
  • Open weights
4. All-MiniLM-L6-v2 (digitalocean)

$0.009 in / Unknown out

  • Context: 256
  • Providers: 1
  • Open weights
9. BGE M3 (cloudflare-ai-gateway)

$0.012 in / Unknown out

  • Context: 128K
  • Providers: 1

Recommended stack by tier

Same shortlist sliced four ways — pick the tier that matches your budget and constraints.

Budget

cloudflare-ai-gateway
BGE Reranker Base
$0.003 in / Unknown out · 128K ctx

Lowest total per-1M-token cost in this list ($0.003 in; output price unknown).

Lowest-cost option that still meets the use case. Pick this when you have high volume or strict unit-economics.

Balanced

Alibaba (Qwen)
Qwen3 Embedding 4B
$0.010 in / Unknown out · 33K ctx

Median price in this list ($0.010 in), typically the safest default.

Good-enough quality at a mid-tier price. The default choice for most production apps.

Premium

cloudflare-ai-gateway
PLaMo Embedding 1B
$0.019 in / Unknown out · 128K ctx

Highest-priced pick in the list ($0.019 in), usually the flagship.

Highest-capability model in this list. Pick when accuracy or reasoning matters more than cost.

Open-weight

Mistral
Voxtral Small 24B 2507
$0.002 in / $0.002 out · 32K ctx

Open weights and the cheapest in that subset ($0.002 in / $0.002 out).

Open weights — self-host on your own GPUs, fine-tune on private data, run offline. Pricing here reflects the cheapest API host.
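The four tiers can be read as simple selections over the price-sorted shortlist. The sketch below is a guess at that logic, not the site's implementation: `pick_tiers` is a hypothetical helper, the upper median is an assumed choice for "Balanced", and `open_weights` is set only where this page shows the flag. Prices are the per-1M totals from this page, with Unknown counted as 0 (also an assumption).

```python
def pick_tiers(models):
    """models: list of dicts with 'name', 'price' (USD per 1M) and 'open_weights'."""
    by_price = sorted(models, key=lambda m: m["price"])
    open_models = [m for m in by_price if m["open_weights"]]
    return {
        "budget": by_price[0],                    # cheapest overall
        "balanced": by_price[len(by_price) // 2],  # upper median (assumed)
        "premium": by_price[-1],                   # most expensive in the list
        "open_weight": open_models[0] if open_models else None,
    }


shortlist = [
    {"name": "BGE Reranker Base", "price": 0.003, "open_weights": False},
    {"name": "Voxtral Small 24B 2507", "price": 0.004, "open_weights": True},
    {"name": "All-MiniLM-L6-v2", "price": 0.009, "open_weights": True},
    {"name": "Qwen3 Embedding 4B", "price": 0.010, "open_weights": False},
    {"name": "BGE M3", "price": 0.012, "open_weights": False},
    {"name": "PLaMo Embedding 1B", "price": 0.019, "open_weights": False},
]
tiers = pick_tiers(shortlist)
for tier, model in tiers.items():
    print(tier, "->", model["name"])
```

Only six of the ten picks are visible on this page, so the median here is taken over a partial list.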

Frequently asked questions

Which AI model is the best for the lowest possible cost in 2026?

Right now we put BGE Reranker Base from cloudflare-ai-gateway at the top, primarily because its sum of input + output per-million-token cost is the lowest among models that are not placeholder-priced. Rankings are recomputed from live model metadata — see "How we picked these" above for the exact rule.

What is the cheapest option in this list?

BGE Reranker Base (cloudflare-ai-gateway) is the lowest-priced pick at $0.003 per 1M input tokens; its output price is not published. Every other entry costs more.

How are these rankings generated?

Each pick comes from a programmatic rule defined in our use-case-rules config: a hard filter (e.g. tool calling required, context ≥ 100K) plus a numeric score combining capability, context window and price. We never hand-curate the order, but we do hand-curate the rule. The full data source is the models.dev API, refreshed daily.
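A rule of that shape, a hard filter followed by a numeric score, might look roughly like the following. Everything here is hypothetical: the rule keys, the weights, the `capability` field and the example models are invented for illustration, and the real use-case-rules config is not shown on this page.

```python
import math


def passes_filter(model, rule):
    # Hard filter: any failed requirement removes the model outright.
    if rule.get("require_tool_calling") and not model.get("tool_calling"):
        return False
    return model.get("context", 0) >= rule.get("min_context", 0)


def score(model, weights):
    # Higher is better: capability and context add, price subtracts.
    return (
        weights["capability"] * model["capability"]
        + weights["context"] * math.log2(model["context"])
        - weights["price"] * model["price"]
    )


def apply_rule(models, rule):
    kept = [m for m in models if passes_filter(m, rule)]
    return sorted(kept, key=lambda m: score(m, rule["weights"]), reverse=True)


rule = {
    "require_tool_calling": True,
    "min_context": 100_000,
    "weights": {"capability": 10.0, "context": 1.0, "price": 1.0},
}
models = [
    {"name": "model-a", "tool_calling": True, "context": 128_000, "capability": 0.5, "price": 1.0},
    {"name": "model-b", "tool_calling": False, "context": 200_000, "capability": 0.9, "price": 0.5},
    {"name": "model-c", "tool_calling": True, "context": 200_000, "capability": 0.9, "price": 3.0},
    {"name": "model-d", "tool_calling": True, "context": 32_000, "capability": 0.8, "price": 0.2},
]
print([m["name"] for m in apply_rule(models, rule)])
```

In this sketch the hand-curated part is the `rule` dictionary, while the ordering falls out of the score.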

How often is this page updated?

The underlying model data is refreshed once per day from models.dev, and the static page is rebuilt when the data changes. The 'Last updated' date below shows the most recent rebuild.

Are 'Unknown' priced models excluded?

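Yes, when no per-million-token price is published at all. Entries that list a published input price alongside an 'Unknown' output price still appear in the ranking. 'Unknown' here means the provider does not publish a public rate card. It is not the same as 'free', so showing such models at $0 would mislead.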

Last updated:

Prices in USD per 1M tokens. Unknown means the provider does not publish per-token pricing.
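To turn a per-1M-token price into a cost for a given workload, divide the token count by one million and multiply by the price. The helper below is a minimal sketch with a hypothetical name (`estimate_cost`). It returns `None` rather than guessing when a price that is actually needed is Unknown.

```python
from typing import Optional

TOKENS_PER_UNIT = 1_000_000  # prices are quoted per 1M tokens


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: Optional[float],
    output_price: Optional[float],
) -> Optional[float]:
    """Return the cost in USD, or None if a required price is unknown."""
    cost = 0.0
    for tokens, price in ((input_tokens, input_price), (output_tokens, output_price)):
        if tokens == 0:
            continue  # an unknown price does not matter if no tokens use it
        if price is None:
            return None
        cost += tokens / TOKENS_PER_UNIT * price
    return cost


# 5M input tokens at $0.003 per 1M, no output tokens (e.g. a reranker).
print(estimate_cost(5_000_000, 0, 0.003, None))
# Output tokens requested but the output price is unknown.
print(estimate_cost(1_000, 10, 0.003, None))
```

The first call comes to about $0.015. The second returns `None` because the output price is not published.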

Data is sourced from models.dev and normalized for comparison. Prices and capabilities may change. Always verify critical production decisions with the provider's official documentation.