Capability · 2026-06-29
Long-context AI models
Models with a context window of 200K tokens or more.
What is it?
- Long-context LLMs accept 200K tokens or more in a single prompt: entire books, multi-file repositories, or hours of transcripts.
- Some models go up to 1M, 2M, or even more tokens of context.
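To get a feel for what fits in a 200K window, the following is a minimal sketch that estimates token counts from character length. The 4-characters-per-token ratio and the `reserved_output` budget are assumptions for illustration only; real counts depend on each model's tokenizer, so use the provider's own token counter for anything that matters.

```python
import math

# Rough heuristic only: real token counts depend on the model's tokenizer.
CHARS_PER_TOKEN = 4  # assumption for illustration, not a documented constant


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate for `text`, rounded up."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(
    text: str,
    context_window: int = 200_000,
    reserved_output: int = 4_000,  # hypothetical room left for the reply
) -> bool:
    """True if the estimated prompt plus the reserved output fits the window."""
    return estimate_tokens(text) + reserved_output <= context_window
```

Under this heuristic, a 400,000-character document (about 100K estimated tokens) fits comfortably, while an 800,000-character one does not once room for the reply is reserved.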
Why it matters
- Long context complements or replaces RAG: you can paste in the full content instead of retrieving only fragments.
- Effective recall can degrade as the prompt gets longer, and long prompts become expensive at per-million-token prices.
- Some providers apply tiered pricing above 200K tokens. Check each model's detail page.
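The effect of tiered pricing can be sketched as follows. The rates are hypothetical, and the sketch assumes the higher rate applies to the whole prompt once it crosses the threshold; some providers may instead charge the higher rate only on the tokens above it, so check the provider's pricing page.

```python
def input_cost_usd(
    input_tokens: int,
    base_rate: float,   # USD per 1M input tokens at or below the threshold
    long_rate: float,   # USD per 1M input tokens once the prompt exceeds it
    threshold: int = 200_000,
) -> float:
    """Estimate input cost under whole-prompt tiered pricing (an assumption)."""
    rate = long_rate if input_tokens > threshold else base_rate
    return input_tokens / 1_000_000 * rate


# Hypothetical rates: $3.00 per 1M up to 200K, $6.00 per 1M beyond.
short_prompt = input_cost_usd(100_000, base_rate=3.0, long_rate=6.0)
long_prompt = input_cost_usd(400_000, base_rate=3.0, long_rate=6.0)
print(f"100K tokens: ${short_prompt:.2f}")
print(f"400K tokens: ${long_prompt:.2f}")
```

With these made-up rates, quadrupling the prompt from 100K to 400K tokens raises the input cost eightfold (about $0.30 versus about $2.40), because the longer prompt is billed at the higher tier.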
509 models with this capability
Showing the top 60 of 509. Use the full directory to filter further.
Frequently asked questions
How many AI models support 200K+ context?
509 canonical models in our database currently support 200K+ context. The list is regenerated on every data refresh, so it always reflects the latest releases tracked in our catalogue.
What is the cheapest model with 200K+ context?
Ling-2.6-flash from OpenRouter is currently the lowest-priced option, at $0.010 per 1M input tokens and $0.030 per 1M output tokens. The full table above is sorted price-ascending.
Which model with 200K+ context has the largest context window?
Qwen Long (Alibaba (Qwen)) has the largest context window, at 10M tokens. This may matter if you need long-document understanding well beyond the 200K threshold.
Which models are available on the most providers?
Production-readiness usually correlates with how many independent providers host the same weights. The top three by provider count are: Kimi K2.6 (49), Kimi K2.5 (48), GLM-5.1 (47).
How is a 200K+ context model different from a regular LLM?
Long-context models accept at least 200K input tokens, enough for entire books, codebases, or hours of transcripts in one prompt. Effective recall can degrade and cost grows as inputs get longer, so a large context window is not always a better choice than RAG.
How often is this list updated?
Daily. Our data pipeline syncs once a day, regenerates the canonical model list, and rebuilds these pages so newly released models appear within 24 hours.
Explore more
Top models with this capability
- Ling-2.6-flash: $0.01 in / $0.03 out
- Google Gemma 3 27B Instruct: $0.03 in / $0.11 out
- Qwen3 235B A22B 2507: $0.07 in / $0.10 out
- Qwen3.5 9B: $0.04 in / $0.15 out
- Qwen3 235B A22B Instruct 2507: $0.10 in / $0.10 out
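Because input and output are priced separately, which of these models is cheapest depends on the shape of the request. The sketch below uses the five price pairs listed above (USD per 1M tokens); the function names and the two example workloads are illustrative assumptions, and it ignores any tiered pricing a provider may apply.

```python
# (input USD per 1M tokens, output USD per 1M tokens), as listed above.
PRICES = {
    "Ling-2.6-flash": (0.01, 0.03),
    "Google Gemma 3 27B Instruct": (0.03, 0.11),
    "Qwen3 235B A22B 2507": (0.07, 0.10),
    "Qwen3.5 9B": (0.04, 0.15),
    "Qwen3 235B A22B Instruct 2507": (0.10, 0.10),
}


def request_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at flat per-token prices (no tiering assumed)."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[str]:
    """Model names ordered from cheapest to most expensive for this workload."""
    return sorted(
        PRICES,
        key=lambda model: request_cost_usd(model, input_tokens, output_tokens),
    )


# Hypothetical workloads: a long prompt with a short answer, and the reverse.
print(rank_by_cost(input_tokens=200_000, output_tokens=2_000))
print(rank_by_cost(input_tokens=10_000, output_tokens=100_000))
```

Ling-2.6-flash is cheapest in both cases, but the runner-up changes: Google Gemma 3 27B Instruct for the input-heavy request, Qwen3 235B A22B 2507 for the output-heavy one.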
Last updated:
Prices in USD per 1M tokens. Unknown means the provider does not publish per-token pricing.
Pricing and capabilities are refreshed daily and reconciled against each provider's official documentation. Always verify critical production decisions with the provider directly.