AI Model Intelligence

Provider · 2026-05-12

inference

1 canonical model · 4 records (including derivatives / fine-tunes)
| Model | Input / 1M | Output / 1M | Context | Providers | Tags |
| --- | --- | --- | --- | --- | --- |
| Osmosis Structure 0.6B | $0.100 | $0.500 | 4K | 1 | tools · open-weights |
| Mistral Nemo 12B Instruct (derivative) | $0.038 | $0.100 | 16K | 1 | tools · open-weights |
| Qwen 2.5 7B Vision Instruct (derivative) | $0.200 | $0.200 | 125K | 1 | tools · vision · open-weights |
| Google Gemma 3 (derivative) | $0.150 | $0.300 | 125K | 1 | tools · vision · open-weights |

Frequently asked questions

How many AI models does inference offer?

We track 1 canonical inference model plus 3 community fine-tunes / derivatives (marked as derivatives in the table above). The list is recomputed daily from models.dev.

Which inference model is the cheapest?

Osmosis Structure 0.6B is currently the lowest-priced canonical inference model, at $0.100 per 1M input tokens and $0.500 per 1M output tokens. (Among derivatives, Mistral Nemo 12B Instruct is cheaper still, at $0.038 input / $0.100 output.) For the full apples-to-apples list, see /pricing/cheapest-llm-api.
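Per-1M-token prices scale linearly with usage, so estimating a request's cost is simple arithmetic. A minimal sketch, using the Osmosis Structure 0.6B list prices from the table above (the function name and token counts are illustrative, not part of any API):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float = 0.100,
                 output_price_per_m: float = 0.500) -> float:
    """USD cost of one request at per-1M-token list prices."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# A 2,000-token prompt with a 500-token completion:
print(f"${request_cost(2_000, 500):.6f}")  # $0.000450
```

Swap in the input/output prices of any row in the table to compare models at your own traffic mix.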

Which inference model has the largest context window?

Among canonical models, Osmosis Structure 0.6B offers 4K tokens; the Qwen 2.5 7B Vision Instruct and Google Gemma 3 derivatives reach 125K. The context window is the total of prompt + completion.
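Because the window caps prompt and completion combined, a request must reserve completion budget out of the same limit. A minimal guard, assuming the 4K (4,096-token) window listed for Osmosis Structure 0.6B (the helper name is hypothetical):

```python
CONTEXT_WINDOW = 4_096  # 4K window, per the table above

def fits_context(prompt_tokens: int, max_completion_tokens: int,
                 window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the requested completion budget fits the window."""
    return prompt_tokens + max_completion_tokens <= window

print(fits_context(3_000, 1_000))  # True  (4,000 <= 4,096)
print(fits_context(3_500, 1_000))  # False (4,500 >  4,096)
```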

Which inference models support tool calling?

All four inference models listed above support tool calling, with Osmosis Structure 0.6B being a popular pick. The tags column in the table marks every model with tool-calling support.

What are the best alternatives to inference?

Depends on the use case. For raw cost savings, look at /pricing/cheapest-llm-api. For agent-oriented workloads, /best/best-ai-model-for-agents. For long-document workflows, /best/best-long-context-llm.

How fresh is this inference pricing data?

Daily. Our pipeline pulls models.dev each morning and rebuilds these pages whenever the data changes, so list-price moves and new model releases land within roughly 24 hours.

Last updated:

Prices in USD per 1M tokens. Unknown means the provider does not publish per-token pricing.

Data is sourced from models.dev and normalized for comparison. Prices and capabilities may change. Always verify critical production decisions with the provider's official documentation.