AI Model Intelligence

Tools · 2026-05-22

Context Window Calculator

Convert LLM token budgets into something you can plan around: book pages, English words, lines of code or minutes of transcribed audio.

Context       English words   Book pages   Code lines   Transcript   CJK chars
8K tokens     6.0K            23           889          6.7 min      8.0K
32K tokens    24K             91           3.6K         27 min       32K
128K tokens   96K             366          14K          107 min      128K
200K tokens   150K            571          22K          167 min      200K
1M tokens     750K            2.9K         111K         833 min      1.0M
2M tokens     1.5M            5.7K         222K         1.7K min     2.0M

Conversion assumptions

  • 1 token ≈ 0.75 English words or 4 characters (BPE tokenizers)
  • 1 standard book page ≈ 350 tokens
  • 1 line of typical TypeScript / Python ≈ 9 tokens
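  • 1 minute of transcribed speech ≈ 1,200 tokens (the rate behind the Transcript column). Note that ≈150 wpm × 1.33 tokens/word works out to about 200 tokens per minute, so the Transcript figures are a conservative lower bound.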
  • CJK scripts: ~1 token per character (worst case for non-Latin scripts)

Real-world numbers vary by tokenizer family (GPT-4o, Llama 3, DeepSeek V3 use slightly different vocabularies). Treat these as planning estimates, not exact measurements.
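The conversions above are simple ratios, so they can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the constant names, the `Capacity` class and the `capacity` function are hypothetical, and the ratios are copied directly from the assumptions list.

```python
from dataclasses import dataclass

# Planning ratios copied from the assumptions list; these are heuristics,
# not tokenizer measurements. Adjust them if your own counts differ.
WORDS_PER_TOKEN = 0.75
TOKENS_PER_PAGE = 350
TOKENS_PER_CODE_LINE = 9
TOKENS_PER_TRANSCRIPT_MINUTE = 1200
TOKENS_PER_CJK_CHAR = 1


@dataclass(frozen=True)
class Capacity:
    english_words: float
    book_pages: float
    code_lines: float
    transcript_minutes: float
    cjk_chars: float


def capacity(tokens: int) -> Capacity:
    """Convert a token budget into the planning units used in the table."""
    return Capacity(
        english_words=tokens * WORDS_PER_TOKEN,
        book_pages=tokens / TOKENS_PER_PAGE,
        code_lines=tokens / TOKENS_PER_CODE_LINE,
        transcript_minutes=tokens / TOKENS_PER_TRANSCRIPT_MINUTE,
        cjk_chars=tokens / TOKENS_PER_CJK_CHAR,
    )


if __name__ == "__main__":
    for budget in (8_000, 32_000, 128_000, 200_000, 1_000_000, 2_000_000):
        c = capacity(budget)
        print(
            f"{budget:>9,} tokens: {c.english_words:,.0f} words, "
            f"{c.book_pages:,.0f} pages, {c.code_lines:,.0f} code lines, "
            f"{c.transcript_minutes:,.1f} min, {c.cjk_chars:,.0f} CJK chars"
        )
```

Keeping the ratios as named constants makes it easy to swap in values measured with a specific tokenizer.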

Models per budget tier

See the full ranked table at best long-context LLMs or every model meeting the threshold at /capabilities/long-context.

Frequently asked questions

Why doesn't 1 million tokens equal 1 million words?

A token is a model-specific unit roughly equivalent to 0.75 English words or 4 characters, so a token count always exceeds the word count. A 1M-token context window holds about 750,000 English words, or roughly 2,900 standard book pages at 350 tokens per page.

Does code take more or fewer tokens than English?

More. Source code tokenises at roughly 1 token per 3 characters, versus about 4 characters per token for English prose, because programming languages use a lot of punctuation and short keywords that the BPE tokenizer doesn't merge. At about 9 tokens per line, a 100K-token window fits roughly 11,000 lines of TypeScript or Python.

What about Chinese, Japanese or Korean (CJK) text?

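CJK scripts tokenise much less efficiently per character than English: plan for about 1 token per character, against roughly 4 characters per token for English. A 200K-token window therefore fits about 200,000 CJK characters, compared with roughly 800,000 characters of English text.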
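For mixed-language text, the two per-character heuristics can be combined into a rough estimate. The sketch below is an assumption-laden illustration: `is_cjk` and `estimate_tokens` are hypothetical helpers, the code point ranges cover only a small subset of CJK blocks, and the result is a planning figure rather than a real tokenizer count.

```python
import math

# Code point ranges treated as CJK for this estimate. This is a deliberately
# small, illustrative subset; extension blocks and CJK punctuation are omitted.
CJK_RANGES = (
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul Syllables
)


def is_cjk(ch: str) -> bool:
    """Return True if the character falls in one of the ranges above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJK_RANGES)


def estimate_tokens(text: str) -> int:
    """Estimate tokens: 1 per CJK character, 1 per 4 other characters."""
    cjk = sum(1 for ch in text if is_cjk(ch))
    other = len(text) - cjk
    return cjk + math.ceil(other / 4)
```

For exact counts, run the text through the tokenizer of the model you plan to use.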

Should I just use the biggest context window I can?

Not always. Cost scales with input length, and effective recall (retrieving the right detail later) often degrades for content buried in the middle of very long prompts. For many apps, classical RAG over a 200K-window model is a better fit than stuffing a 1M window.
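To make the cost side concrete, here is a small sketch comparing a fully stuffed 1M-token prompt with a retrieval-style prompt. The function name, the price and the 20,000-token retrieved prompt size are all hypothetical values chosen for illustration; substitute the published per-1M-token price of the model you use.

```python
def input_cost_usd(input_tokens: int, usd_per_million_tokens: float) -> float:
    """Input cost for one call, with the price quoted per 1M tokens."""
    return input_tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical price, for illustration only (USD per 1M input tokens).
PRICE = 3.00

stuffed = input_cost_usd(1_000_000, PRICE)  # whole corpus in the prompt
retrieved = input_cost_usd(20_000, PRICE)   # only the retrieved chunks

print(f"stuffed prompt:   ${stuffed:.2f} per call")
print(f"retrieved prompt: ${retrieved:.2f} per call")
```

Because the cost is linear in input tokens, the ratio between the two calls equals the ratio of their prompt sizes, whatever the actual price.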

Does the context budget include the model's reply?

Yes. The context window is the total of input plus output that fits in one call. The output limit is a separate, smaller cap on how much the model can generate per reply. For example, with a 128K context and a 16K max output, reserving the full reply leaves 112K tokens for input.
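The arithmetic can be sketched as a simple pre-flight check. The function names below are hypothetical, and the sketch assumes you already have a token count for the prompt and want to reserve the full output cap for the reply.

```python
def max_input_tokens(context_window: int, reserved_output: int) -> int:
    """Tokens left for the prompt once the reply budget is reserved."""
    if reserved_output > context_window:
        raise ValueError("reserved output exceeds the context window")
    return context_window - reserved_output


def fits(prompt_tokens: int, context_window: int, reserved_output: int) -> bool:
    """True if the prompt plus the reserved reply fit in one call."""
    return prompt_tokens <= max_input_tokens(context_window, reserved_output)


# Example from the text: 128K context with a 16K max output.
print(max_input_tokens(128_000, 16_000))
print(fits(120_000, 128_000, 16_000))
```

Reserving the full output cap is the conservative choice; if replies are known to be short, a smaller reservation frees more room for input.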

Prices in USD per 1M tokens. Unknown means the provider does not publish per-token pricing.

Pricing and capabilities are refreshed daily and reconciled against each provider's official documentation. Always verify critical production decisions with the provider directly.