AI Model Intelligence

Provider · 2026-05-12

inference

1 canonical model · 4 records (including derivatives / fine-tunes)
| Model | Input / 1M | Output / 1M | Context | Providers | Tags |
| --- | --- | --- | --- | --- | --- |
| Osmosis Structure 0.6B | $0.100 | $0.500 | 4K | 1 | tools · open-weights |
| Mistral Nemo 12B Instruct (derivative) | $0.038 | $0.100 | 16K | 1 | tools · open-weights |
| Qwen 2.5 7B Vision Instruct (derivative) | $0.200 | $0.200 | 125K | 1 | tools · vision · open-weights |
| Google Gemma 3 (derivative) | $0.150 | $0.300 | 125K | 1 | tools · vision · open-weights |

Frequently asked questions

How many AI models does inference offer?

We track 1 canonical inference model plus 3 community fine-tunes / derivatives (marked as derivatives in the table above). The list is recomputed daily from models.dev.

Which inference model is the cheapest?

Osmosis Structure 0.6B is currently the lowest-priced canonical inference model, at $0.100 per 1M input tokens and $0.500 per 1M output tokens. (Among derivatives, Mistral Nemo 12B Instruct is cheaper still, at $0.038 input / $0.100 output.) For the full apples-to-apples list, see /pricing/cheapest-llm-api.
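Per-1M-token prices scale linearly with usage, so estimating a request's cost is simple arithmetic. A minimal sketch, using the Osmosis Structure 0.6B list prices from the table above (the function name and token counts are illustrative, not part of any API):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float = 0.100,
                 output_price_per_m: float = 0.500) -> float:
    """USD cost of one request at per-1M-token list prices."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# A 2,000-token prompt with a 500-token completion:
print(f"${request_cost(2_000, 500):.6f}")  # $0.000450
```

Swap in the input/output prices of any row in the table to compare models at your own traffic mix.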

Which inference model has the largest context window?

Among canonical models, Osmosis Structure 0.6B offers 4K tokens; the Qwen 2.5 7B Vision Instruct and Google Gemma 3 derivatives reach 125K. The context window is the total of prompt + completion.
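Because the window caps prompt and completion combined, a request must reserve completion budget out of the same limit. A minimal guard, assuming the 4K (4,096-token) window listed for Osmosis Structure 0.6B (the helper name is hypothetical):

```python
CONTEXT_WINDOW = 4_096  # 4K window, per the table above

def fits_context(prompt_tokens: int, max_completion_tokens: int,
                 window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the requested completion budget fits the window."""
    return prompt_tokens + max_completion_tokens <= window

print(fits_context(3_000, 1_000))  # True  (4,000 <= 4,096)
print(fits_context(3_500, 1_000))  # False (4,500 >  4,096)
```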

Which inference models support tool calling?

All four inference models listed above support tool calling, with Osmosis Structure 0.6B being a popular pick. The tags column in the table marks every model with tool-calling support.

What are the best alternatives to inference?

Depends on the use case. For raw cost savings, look at /pricing/cheapest-llm-api. For agent-oriented workloads, /best/best-ai-model-for-agents. For long-document workflows, /best/best-long-context-llm.

How fresh is this inference pricing data?

Daily. Our pipeline pulls models.dev each morning and rebuilds these pages whenever the data changes, so list-price moves and new model releases land within roughly 24 hours.

Last updated:

Prices in USD per 1M tokens. Unknown means the provider does not publish per-token pricing.

Data is sourced from models.dev and normalized for comparison. Prices and capabilities may change. Always verify critical production decisions with the provider's official documentation.