LLMs and Context Windows

Pranav Tiwari
Jan 29, 2024

Day 29 / 366

While reasoning power and so-called intelligence are the most important factors when evaluating an LLM, context length is crucial as well. You can think of the context length as a measure of how much information the LLM can hold in its memory at any given time. It dictates how many instructions the model can keep track of and how much output it can generate.

What are tokens?

The context length of LLMs is measured in tokens. LLMs break text into tokens, which are then converted to numerical representations so that they can be fed to neural networks. Tokens are not synonymous with words, but we can use the following approximations to estimate how many tokens a piece of text might have (see the sketch after the list for a way to count them exactly):

  • 1 token ~= 4 chars in English.
  • 1 token ~= ¾ words.
  • 100 tokens ~= 75 words.
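
If you want an exact count rather than an approximation, here is a minimal sketch using OpenAI's tiktoken library (the tokenizer used by the GPT-3.5/GPT-4 family); the sample sentence is just an illustration.

```python
# Minimal token-counting sketch using OpenAI's tiktoken library
# (pip install tiktoken). cl100k_base is the encoding used by the
# GPT-3.5 / GPT-4 family of models.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "Context length is measured in tokens, not words or characters."
tokens = enc.encode(text)

print(f"characters: {len(text)}")
print(f"words:      {len(text.split())}")
print(f"tokens:     {len(tokens)}")
# For ordinary English text the counts should roughly follow the
# 1 token ~= 4 chars ~= 0.75 words rules of thumb above.
```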

Context lengths of GPT models

GPT models are the strongest out there so far, but do they have the longest context lengths as well?

The latest GPT-4 Turbo model has a context length of 128k tokens. Using the rule of thumb above, that is just under 100k words (128,000 × 0.75 ≈ 96,000), roughly the length of an average novel. This means you can feed an entire book to GPT-4 Turbo and ask it to answer questions or perform tasks using that book's contents, without resorting to something like RAG.

However, GPT-4 Turbo is expensive, and many applications right now still use GPT-3.5 since it makes the most sense cost-wise. GPT-3.5 has a context length of just 4,096 tokens, which is around 3,000 words. There is also GPT-3.5 Turbo with a 16k-token context length, but it can only output a maximum of 4,096 tokens. A rough way to check whether a document fits in a given window is sketched below.
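
This is not an official API, just a rough illustration: the model names, the limits (taken from the figures quoted in this post), and the 500-token output reserve are all assumptions for the example.

```python
# Rough "does this document fit?" sketch. The context limits below are the
# approximate figures quoted in this post, not authoritative values.
import tiktoken

CONTEXT_LIMITS = {
    "gpt-4-turbo": 128_000,
    "gpt-3.5-turbo-16k": 16_000,
    "gpt-3.5": 4_096,
}

def fits_in_context(text: str, limit: int, reserve_for_output: int = 500) -> bool:
    """True if the prompt plus a small output budget stays within the window."""
    enc = tiktoken.get_encoding("cl100k_base")
    return len(enc.encode(text)) + reserve_for_output <= limit

# Stand-in for a long document you might want to summarise or query.
document = "All happy families are alike. " * 2000

for model, limit in CONTEXT_LIMITS.items():
    print(f"{model:>18}: fits = {fits_in_context(document, limit)}")
```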

Compared with the open-source alternatives, the Mistral 7B model has a context length of 8k. Mistral AI's Mixtral model (a mixture of eight 7B 'expert' models) has a context length of 32k.

The Llama 2 model by Meta also has a context length of 4k.

And there are models with context lengths even higher than GPT-4's; for instance, this one has a context length of 200k.

But I haven’t tested this out, so I won’t be able to say it its any good.
