Question 1

Why doesn't 1 million tokens equal 1 million words?

Accepted Answer

Because a token is a model-specific unit of text, not a word. In English, one token is roughly 0.75 words or 4 characters, so a 1M-token context window holds about 750,000 English words: roughly 2,900 standard book pages at about 260 words per page, the ratio the conversion table uses.
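
As a rough illustration, the rule of thumb above can be written as two small helpers. This is a sketch, not a tokenizer: the constants are the approximate ratios quoted in the answer, the function names are made up for this example, and real counts vary by model and by text.

```python
import math

# Rules of thumb quoted in the answer above; real ratios vary by model and text.
WORDS_PER_TOKEN = 0.75
CHARS_PER_TOKEN = 4


def tokens_to_words(tokens: int) -> int:
    """Approximate number of English words that fit in a token budget."""
    return round(tokens * WORDS_PER_TOKEN)


def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text, based on character count only."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


if __name__ == "__main__":
    print(tokens_to_words(1_000_000))  # 750000
```

For an exact count, use the tokenizer that ships with the model you are calling; an estimate like this is only useful for quick sizing.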

Question 2

Does code take more or fewer tokens than English?

Accepted Answer

More. Source code tokenises at roughly 1 token per 3 characters, versus about 4 for English prose, because programming languages use a lot of punctuation and short keywords that the BPE tokenizer doesn't merge. A 100K-token window typically fits about 11,000 lines of TypeScript or Python, assuming roughly 9 tokens per line as in the conversion table.
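
A minimal sketch of the same estimate for source code, assuming the 3-characters-per-token rule of thumb from the answer. The average line length is a parameter you supply, not something the answer specifies, and both function names are hypothetical.

```python
import math

CHARS_PER_TOKEN_CODE = 3  # rule of thumb from the answer above, not a measured value


def estimate_code_tokens(source: str) -> int:
    """Rough token estimate for a source file, based on character count only."""
    return math.ceil(len(source) / CHARS_PER_TOKEN_CODE)


def lines_that_fit(window_tokens: int, avg_chars_per_line: float) -> int:
    """Approximate number of code lines that fit in a token window.

    avg_chars_per_line is an assumption the caller must provide,
    ideally measured on their own codebase.
    """
    if avg_chars_per_line <= 0:
        raise ValueError("avg_chars_per_line must be positive")
    tokens_per_line = avg_chars_per_line / CHARS_PER_TOKEN_CODE
    return int(window_tokens / tokens_per_line)
```

The result is very sensitive to the line length you assume, so measuring the average on your own repository is worth the effort before relying on the number.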

Question 3

What about Chinese, Japanese or Korean (CJK) text?

Accepted Answer

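CJK scripts tokenise much less efficiently per character: typically 1 token per character, versus about 4 characters per token for English. A 200K-token window therefore fits about 200,000 CJK characters, whereas the same window holds about 150,000 English words (roughly 570 book pages).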
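
For mixed-language text, one way to apply this is to count CJK characters at 1 token each and everything else at about 4 characters per token. The sketch below assumes those two ratios and uses a simplified, non-exhaustive set of Unicode ranges to detect CJK characters; the names are hypothetical.

```python
import math

# Simplified, non-exhaustive code point ranges (an assumption of this sketch).
CJK_RANGES = (
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul Syllables
)


def is_cjk(ch: str) -> bool:
    """True if the character falls in one of the ranges above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJK_RANGES)


def estimate_mixed_tokens(text: str) -> int:
    """Rough estimate: 1 token per CJK character, 4 characters per token otherwise."""
    cjk = sum(1 for ch in text if is_cjk(ch))
    other = len(text) - cjk
    return cjk + math.ceil(other / 4)
```

Rare ideographs and punctuation outside these ranges fall into the "other" bucket here, so treat the result as a lower bound for heavily CJK text.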

Question 4

Should I just use the biggest context window I can?

Accepted Answer

Not always. Cost scales with input length, and effective recall (the model's ability to retrieve the right detail) often degrades for content buried in the middle of very long prompts. For many apps, classical RAG over a 200K-window model will beat stuffing a 1M window.
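
To make the cost point concrete, here is a small sketch comparing a fully stuffed prompt with a retrieval-based one. The price and the 20K-token retrieved prompt are hypothetical placeholders chosen for illustration, not real quotes or measurements; only the linear scaling is the point.

```python
def input_cost(prompt_tokens: int, price_per_million: float) -> float:
    """Input cost of one call, assuming pricing is linear in prompt tokens."""
    return prompt_tokens / 1_000_000 * price_per_million


# Hypothetical placeholder price per 1M input tokens, not a real quote.
PRICE_PER_MILLION = 2.00

stuffed = input_cost(1_000_000, PRICE_PER_MILLION)  # whole corpus in the prompt
retrieved = input_cost(20_000, PRICE_PER_MILLION)   # hypothetical RAG prompt size

if __name__ == "__main__":
    print(f"stuffed: {stuffed:.2f}, retrieved: {retrieved:.2f} per call")
```

Whatever the actual price, the per-call ratio between the two approaches equals the ratio of their prompt sizes, which is why the difference compounds quickly at volume.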

Question 5

Does the context budget include the model's reply?

Accepted Answer

Yes. The context window is the total of input plus output that fits in one call. The output limit is a separate, smaller cap on how much the model can generate per reply (e.g. 128K context with 16K max output), so tokens reserved for the reply reduce the room left for the prompt.
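
The budgeting rule in this answer amounts to a subtraction. The sketch below assumes you reserve the full output cap for the reply; the function names are hypothetical, and the 128K and 16K figures are the example from the answer, read here as 128,000 and 16,000 tokens.

```python
def max_input_tokens(context_window: int, reserved_output: int) -> int:
    """Tokens left for the prompt after reserving room for the reply."""
    if reserved_output > context_window:
        raise ValueError("reserved output cannot exceed the context window")
    return context_window - reserved_output


def fits(prompt_tokens: int, context_window: int, reserved_output: int) -> bool:
    """True if the prompt plus the reserved reply fit in one call."""
    return prompt_tokens + reserved_output <= context_window


if __name__ == "__main__":
    print(max_input_tokens(128_000, 16_000))  # 112000
```

Reserving the full output cap is conservative; if your replies are always short, a smaller reservation frees more room for the prompt.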

Context	English words	Book pages	Code lines	Transcript	CJK chars
8K tokens	6.0K	23	889	6.7 min	8.0K
32K tokens	24K	91	3.6K	27 min	32K
128K tokens	96K	366	14K	107 min	128K
200K tokens	150K	571	22K	167 min	200K
1M tokens	750K	2.9K	111K	833 min	1.0M
2M tokens	1.5M	5.7K	222K	1.7K min	2.0M
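
The rows above are consistent with a handful of fixed ratios. The sketch below reproduces them; apart from 0.75 words per token and 1 CJK character per token, the constants are inferred from the table's numbers rather than stated by the source, so treat them as assumptions.

```python
WORDS_PER_TOKEN = 0.75            # stated in the FAQ
CJK_CHARS_PER_TOKEN = 1           # stated in the FAQ
WORDS_PER_PAGE = 262.5            # inferred from the table (e.g. 150K words, 571 pages)
TOKENS_PER_CODE_LINE = 9          # inferred from the table (e.g. 8K tokens, 889 lines)
TOKENS_PER_TRANSCRIPT_MIN = 1200  # inferred from the table (e.g. 8K tokens, 6.7 min)


def convert(tokens: int) -> dict:
    """Approximate capacities of a token budget, using the ratios above."""
    words = tokens * WORDS_PER_TOKEN
    return {
        "english_words": round(words),
        "book_pages": round(words / WORDS_PER_PAGE),
        "code_lines": round(tokens / TOKENS_PER_CODE_LINE),
        "transcript_min": round(tokens / TOKENS_PER_TRANSCRIPT_MIN, 1),
        "cjk_chars": tokens * CJK_CHARS_PER_TOKEN,
    }


if __name__ == "__main__":
    for budget in (8_000, 32_000, 128_000, 200_000, 1_000_000, 2_000_000):
        print(budget, convert(budget))
```

Swapping in ratios measured on your own content is the main reason to keep the constants separate from the function.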

Context Window Calculator
