AI Model Intelligence

Feature · 2026-06-29

AI Models with a Long Context Window

Models with a context window of 200K tokens or more.

What is it?

  • Long-context LLMs accept 200K tokens or more in a single prompt: entire books, multiple project files, or long transcripts.
  • Some models scale to context windows of 1M, 2M, or more tokens.
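A quick way to judge whether a document is a candidate for a long-context model is to estimate its token count and compare it with the window. The sketch below assumes roughly 4 characters per token, which is only a rule of thumb; real counts depend on each model's tokenizer. The function names and the `reserved_for_output` default are hypothetical and chosen for illustration.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. ASSUMPTION: ~4 characters per token."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(text: str, context_window: int,
                    reserved_for_output: int = 4_000) -> bool:
    """Check whether the prompt plus a reserved output budget fits the window."""
    return estimate_tokens(text) + reserved_for_output <= context_window


# A synthetic "book" of 750,000 characters (about 187,500 estimated tokens).
book = "word " * 150_000

print(estimate_tokens(book))
print(fits_in_context(book, context_window=200_000))  # fits a 200K window
print(fits_in_context(book, context_window=128_000))  # too large for 128K
```

For anything close to the limit, count tokens with the provider's own tokenizer instead of relying on the heuristic.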

Why it matters

  • Long context complements or replaces RAG: you can paste in the whole corpus instead of retrieving only chunks.
  • Recall can degrade as the prompt grows, and long prompts get expensive because they are billed per 1M tokens.
  • Some providers charge tiered prices above 200K tokens; see the model detail pages.
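To make the cost point concrete, here is a minimal sketch of per-request pricing. The flat rates are the DeepSeek V4 Flash figures from the table on this page ($0.140 input, $0.280 output per 1M tokens). The tiered example is hypothetical: the rates are invented, and it assumes the higher rate applies to the entire prompt once it exceeds the threshold, which is only one of several schemes a provider might use.

```python
def prompt_cost_usd(input_tokens: int, output_tokens: int,
                    input_per_1m: float, output_per_1m: float) -> float:
    """Cost of one request at flat per-1M-token prices."""
    return (input_tokens / 1_000_000 * input_per_1m
            + output_tokens / 1_000_000 * output_per_1m)


def tiered_input_cost_usd(input_tokens: int, base_per_1m: float,
                          long_per_1m: float,
                          threshold: int = 200_000) -> float:
    """Input cost under an ASSUMED tier scheme: once the prompt exceeds
    the threshold, the higher rate applies to every input token."""
    rate = long_per_1m if input_tokens > threshold else base_per_1m
    return input_tokens / 1_000_000 * rate


# 500K-token prompt, 2K-token answer, flat rates taken from the table.
flat = prompt_cost_usd(500_000, 2_000, input_per_1m=0.140, output_per_1m=0.280)
print(f"flat: ${flat:.5f}")

# Hypothetical tier: $1.00 per 1M up to 200K tokens, $2.00 per 1M beyond.
print(f"short prompt: ${tiered_input_cost_usd(100_000, 1.00, 2.00):.2f}")
print(f"long prompt:  ${tiered_input_cost_usd(300_000, 1.00, 2.00):.2f}")
```

A single request looks cheap at these rates, but the same prompt resent on every turn of a conversation multiplies the input cost accordingly.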

509 models with this feature

| Model | Provider | Input / 1M | Output / 1M | Context | Hosts |
|---|---|---|---|---|---|
| Ling-2.6-flash | openrouter | $0.010 | $0.030 | 262K | 1 |
| Google Gemma 3 27B Instruct | Google | $0.030 | $0.110 | 203K | 10 |
| Qwen3 235B A22B 2507 | Alibaba (Qwen) | $0.071 | $0.100 | 262K | 3 |
| Qwen3.5 9B | Alibaba (Qwen) | $0.040 | $0.150 | 262K | 14 |
| Qwen3 235B A22B Instruct 2507 | Alibaba (Qwen) | $0.100 | $0.100 | 262K | 16 |
| Qwen3-235B-A22B-Thinking-2507 | Alibaba (Qwen) | $0.100 | $0.100 | 262K | 16 |
| Greg 1 Mini | crof | $0.070 | $0.150 | 229K | 1 |
| Qwen3 30B A3B Instruct 2507 | Alibaba (Qwen) | $0.048 | $0.193 | 262K | 12 |
| Qwen Turbo | Alibaba (Qwen) | $0.050 | $0.200 | 1M | 5 |
| Hy3 preview | openrouter | $0.063 | $0.210 | 262K | 1 |
| Amazon Nova Lite 1.0 | nano-gpt | $0.059 | $0.238 | 300K | 1 |
| Ministral 3 8B 2512 | Mistral | $0.150 | $0.150 | 262K | 3 |
| Nova Lite | vercel | $0.060 | $0.240 | 300K | 1 |
| Ministral 8B | llmgateway | $0.150 | $0.150 | 262K | 1 |
| Amazon: Nova Lite 1.0 | kilo | $0.060 | $0.240 | 300K | 1 |
| Nova Lite | amazon-bedrock | $0.060 | $0.240 | 300K | 1 |
| Nova Lite 1.0 | openrouter | $0.060 | $0.240 | 300K | 1 |
| Laguna XS.2 | openrouter | $0.100 | $0.200 | 262K | 1 |
| inclusionAI: Ling-2.6 Flash | kilo | $0.080 | $0.240 | 262K | 1 |
| Ling 2.6 Flash | nano-gpt | $0.080 | $0.240 | 262K | 1 |
| Hy3 preview | siliconflow | $0.066 | $0.260 | 262K | 1 |
| Tencent: Hy3 Preview | kilo | $0.066 | $0.260 | 262K | 1 |
| Tencent: Hy3 preview | nano-gpt | $0.066 | $0.260 | 262K | 1 |
| GLM-4.7-Flash | Z.AI / Zhipu | $0.040 | $0.300 | 200K | 19 |
| Qwen Long | Alibaba (Qwen) | $0.072 | $0.287 | 10M | 2 |
| Seed 1.6 Flash (250715) | llmgateway | $0.070 | $0.300 | 256K | 1 |
| Gemini 2.0 Flash-Lite | Google | $0.075 | $0.300 | 1.05M | 4 |
| ByteDance Seed: Seed 1.6 Flash | kilo | $0.075 | $0.300 | 262K | 1 |
| Seed 1.6 Flash | openrouter | $0.075 | $0.300 | 262K | 1 |
| Llama 4 Scout | Meta | $0.080 | $0.300 | 328K | 5 |
| Step 3.5 Flash | routing-run | $0.096 | $0.288 | 262K | 1 |
| Gemma 4 26B A4B IT | Google | $0.060 | $0.330 | 262K | 16 |
| Qwen3 30B A3B Thinking 2507 | Alibaba (Qwen) | $0.051 | $0.340 | 262K | 4 |
| Gemma 4 31B IT | Google | $0.100 | $0.300 | 262K | 26 |
| Step 3.5 Flash | StepFun | $0.100 | $0.300 | 256K | 11 |
| MiMo-V2-Flash | xiaomi | $0.100 | $0.300 | 262K | 6 |
| Ministral 3 14B 2512 | Mistral | $0.200 | $0.200 | 262K | 3 |
| Step 3.5 Flash 2603 | StepFun | $0.100 | $0.300 | 256K | 2 |
| Mimo-V2-Flash | qiniu-ai | $0.100 | $0.300 | 256K | 1 |
| MiMo-V2-Flash | huggingface | $0.100 | $0.300 | 262K | 1 |
| Ling-2.6-flash | novita-ai | $0.100 | $0.300 | 262K | 1 |
| XiaomiMiMo/MiMo-V2-Flash | novita-ai | $0.100 | $0.300 | 262K | 1 |
| Step 3.5 Flash | stepfun-ai | $0.100 | $0.300 | 256K | 1 |
| Step 3.5 Flash 2603 | stepfun-ai | $0.100 | $0.300 | 256K | 1 |
| Ministral 14B | llmgateway | $0.200 | $0.200 | 262K | 1 |
| MiMo V2 Flash | meganova | $0.100 | $0.300 | 262K | 1 |
| Greg (Roleplay) | crof | $0.100 | $0.300 | 229K | 1 |
| Step 3.5 Flash 2603 | routing-run | $0.100 | $0.300 | 262K | 1 |
| OWL | nano-gpt | $0.100 | $0.300 | 1.05M | 1 |
| MiMo V2 Flash Original | xiaomi | $0.102 | $0.306 | 256K | 1 |
| MiMo V2 Flash (Thinking) Original | xiaomi | $0.102 | $0.306 | 256K | 1 |
| MiMo V2 Flash (Thinking) | xiaomi | $0.102 | $0.306 | 256K | 1 |
| DeepSeek V4 Flash | DeepSeek | $0.140 | $0.280 | 1M | 31 |
| DeepSeek Chat | DeepSeek | $0.140 | $0.280 | 1M | 8 |
| DeepSeek Reasoner | DeepSeek | $0.140 | $0.280 | 1M | 5 |
| MiMo V2.5 | opencode-go | $0.140 | $0.280 | 1M | 1 |
| MiMo-V2.5 | llmgateway | $0.140 | $0.280 | 1M | 1 |
| Coding Router Low | nano-gpt | $0.140 | $0.280 | 1M | 1 |
| Coding Router Medium | nano-gpt | $0.140 | $0.280 | 1M | 1 |
| Gemini 2.5 Flash Lite Preview 09-2025 | Google | $0.090 | $0.360 | 1.05M | 6 |

Showing the top 60 of 509. Filter further in the full directory.

Frequently asked questions

How many AI models support 200K+ context?

509 canonical models in our database currently support 200K+ context. The list is regenerated on every data refresh, so it always reflects the latest releases tracked in our catalogue.

What is the cheapest model with 200K+ context?

Ling-2.6-flash from openrouter is currently the lowest-priced option, at $0.010 per 1M input tokens and $0.030 per 1M output tokens. The table above is sorted by ascending price.
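The page does not say which price the ranking uses. The 60 rows shown are consistent with ordering by the sum of input and output price, so the sketch below uses that as its sort key; treat it as an assumption, not the site's documented method. The rows are copied from the table, and the `Row` type and `combined_price` helper are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    model: str
    provider: str
    input_per_1m: float   # USD per 1M input tokens
    output_per_1m: float  # USD per 1M output tokens


# A few rows from the table, deliberately out of order.
rows = [
    Row("Qwen Turbo", "Alibaba (Qwen)", 0.050, 0.200),
    Row("Ling-2.6-flash", "openrouter", 0.010, 0.030),
    Row("GLM-4.7-Flash", "Z.AI / Zhipu", 0.040, 0.300),
    Row("Google Gemma 3 27B Instruct", "Google", 0.030, 0.110),
]


def combined_price(row: Row) -> float:
    """ASSUMED ranking key: input price plus output price."""
    return row.input_per_1m + row.output_per_1m


ranked = sorted(rows, key=combined_price)
cheapest = ranked[0]

for row in ranked:
    print(f"{row.model:<30} {combined_price(row):.3f}")
print("cheapest:", cheapest.model, "from", cheapest.provider)
```

For a workload dominated by long prompts and short answers, sorting by `input_per_1m` alone may be a more useful ranking.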

Which model with 200K+ context has the largest context window?

Qwen Long from Alibaba (Qwen) leads with a 10M-token context window. This matters if your documents run well beyond 200K tokens.
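Comparing context sizes programmatically means turning labels such as 262K or 1.05M into numbers first. The sketch below assumes the labels are rounded display values in decimal thousands and millions, so the results are approximations of the true limits. `parse_context` is a hypothetical helper, and the sample values are taken from the table.

```python
def parse_context(label: str) -> int:
    """Convert a label like '262K' or '1.05M' to an approximate token count.

    ASSUMPTION: K means 1,000 and M means 1,000,000 (rounded display values).
    """
    units = {"K": 1_000, "M": 1_000_000}
    label = label.strip().upper()
    suffix = label[-1]
    if suffix in units:
        return round(float(label[:-1]) * units[suffix])
    return int(label)


# Context labels for a few models from the table.
contexts = {
    "GLM-4.7-Flash": "200K",
    "Llama 4 Scout": "328K",
    "Gemini 2.0 Flash-Lite": "1.05M",
    "Qwen Long": "10M",
}

largest = max(contexts, key=lambda name: parse_context(contexts[name]))
print(largest, parse_context(contexts[largest]))
```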

Which models are available on the most providers?

Production-readiness usually correlates with how many independent providers host the same weights. The top three by provider count are: Kimi K2.6 (49), Kimi K2.5 (48), GLM-5.1 (47).

How is a 200K+ context model different from a regular LLM?

Long-context models accept 200K or more input tokens, enough for entire books, codebases, or hours of transcripts in one prompt. Effective recall can degrade as the input grows, and cost rises with every token sent, so a big context window is not always a better choice than RAG.

How often is this list updated?

Daily. Our data pipeline syncs once a day, regenerates the canonical model list, and rebuilds these pages so newly released models appear within 24 hours.

Last updated:

Prices in USD per 1M tokens. Unknown means the provider does not publish per-token pricing.

Pricing and capabilities are refreshed daily and reconciled against each provider's official documentation. Always verify critical production decisions with the provider directly.