MeshWorld.

What Is a Context Window and Why Should Developers Care?

By Vishnu Damwala

There is a concept that underpins almost every quirk, limitation, and capability of working with large language models. It shows up when Claude forgets something you said earlier. It shows up when your app suddenly gets expensive. It shows up when pasting a long document works brilliantly or fails entirely.

That concept is the context window.

What it is

When you send a message to Claude, you are not just sending that message. You are sending everything that matters for Claude to understand and respond to it — the conversation history so far, any system prompt, any documents or code you pasted in, and your actual question.

The context window is the maximum amount of text that can exist in that combined input at one time, measured in tokens.

Tokens are roughly chunks of text — about four characters each in English. “Hello world” is two tokens. A typical paragraph is about 100 tokens. A full page of text is about 500 tokens.
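The four-characters-per-token rule can be turned into a tiny back-of-the-envelope estimator. This is a sketch, not a real tokenizer — the function name and heuristic are ours, and actual token counts vary, especially for code and non-English text:

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token for English prose.
    Real tokenizers will disagree, sometimes substantially, for code,
    whitespace-heavy input, and non-English text."""
    return max(1, round(len(text) / 4))

# A full page of prose (~2,000 characters) lands near the ~500-token figure
print(estimate_tokens("x" * 2000))  # 500
```

For anything where the count matters (billing, hitting the limit), use a real tokenizer or the provider's own token counting facility rather than this heuristic.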

Claude’s current context window is 200,000 tokens. That is roughly 150,000 words — a substantial novel’s worth of text. A year ago that would have sounded enormous. Now it is table stakes for a production AI tool.

What happens at the limit

When your combined input exceeds the context window, you cannot send it. The API returns an error. The chat interface tells you the conversation is too long.

But there is a subtler problem that happens before you hit the hard limit: quality degrades.

The further back something is in the context, the less reliably the model attends to it. This is not a bug — it is a fundamental characteristic of how these models work. If you paste a document, have a 40-message conversation, and then ask about something from the document, you might get a less precise answer than if you had asked immediately after pasting it.

This is why “just add more context” is not always the answer. More context can actually hurt if the important information gets buried.

Why it matters for building apps

If you are using the API to build a chat feature, you manage the context window yourself. Anthropic does not do it for you.

Every time a user sends a message, you need to send back the full conversation history. Your code is responsible for deciding what to include and what to drop.

This means if a user has a 200-message conversation, you either:

  • Send all 200 messages every time (expensive, and eventually hits the limit)
  • Truncate old messages (the user loses access to older context)
  • Summarize old messages (keep the gist, drop the detail)
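The truncation strategy from the list above can be sketched as a sliding window: walk backwards from the newest message and keep as much as fits in a token budget. The message shape and the four-characters-per-token estimator are illustrative assumptions, not the Anthropic SDK:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token (real tokenizers differ).
    return max(1, round(len(text) / 4))

def trim_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep the most recent messages that fit within a token budget.
    `messages` are {"role": ..., "content": ...} dicts, oldest first.
    This is the plain truncation strategy: older messages are dropped."""
    kept: list[dict] = []
    used = 0
    for msg in reversed(messages):          # walk from newest to oldest
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order

history = [
    {"role": "user", "content": "a" * 400},       # ~100 tokens
    {"role": "assistant", "content": "b" * 400},  # ~100 tokens
    {"role": "user", "content": "c" * 400},       # ~100 tokens
]
print(len(trim_history(history, budget=250)))  # 2 — the oldest message is dropped
```

A real implementation would also reserve budget for the system prompt and the model's response, and usually avoids splitting a user/assistant pair.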

Which strategy you use depends on your use case. A customer support bot might not need messages from three months ago. A research assistant might need to remember every source it found.

The point is: you have to think about this. It does not manage itself.

Tokens are money

Every token in your context window costs money. Input tokens (everything you send) and output tokens (Claude’s response) are priced separately, with input usually cheaper than output.

At Sonnet pricing, 200,000 input tokens cost about $0.60. That sounds cheap until you multiply it by 10,000 users having 20-message conversations per day.
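The arithmetic above is worth making explicit. The $3-per-million-input-token rate used here matches Sonnet pricing at the time of writing, but treat it as an assumption and check current rates before building a cost model on it:

```python
# Assumed input rate: $3.00 per million tokens (verify against current pricing).
INPUT_PRICE_PER_MTOK = 3.00

def input_cost(tokens: int, price_per_mtok: float = INPUT_PRICE_PER_MTOK) -> float:
    """Dollar cost of sending `tokens` input tokens at the given rate."""
    return tokens * price_per_mtok / 1_000_000

print(input_cost(200_000))                 # 0.6 — one maxed-out request
print(input_cost(200_000) * 10_000 * 20)   # 120000.0 — 10k users x 20 msgs/day
```

The second line is the point: a full context window resent on every turn, at scale, is a six-figure daily bill. This is what makes truncation, summarization, and caching cost levers rather than nice-to-haves.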

For high-volume applications, context window management is directly a cost management issue. Strategies like prompt caching (Anthropic caches repeated system prompts so you only pay for them once) and aggressive conversation summarization can cut costs significantly.
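As a sketch of what prompt caching looks like in practice, here is the shape of a request payload using the `cache_control` field Anthropic documents for its Messages API. The model id and prompt text are placeholders, and field names should be confirmed against the current API reference:

```python
# Illustrative request payload — not an SDK call. The `cache_control` block
# marks a stable prefix (here, a long system prompt) for caching, so repeat
# requests that reuse the identical prefix are billed at a reduced rate.
LONG_SYSTEM_PROMPT = "You are a support assistant for ExampleCo. " * 100

payload = {
    "model": "claude-sonnet-4-20250514",   # placeholder model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LONG_SYSTEM_PROMPT,
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [{"role": "user", "content": "Where is my order?"}],
}

print(payload["system"][0]["cache_control"])
```

The key design point: caching only pays off when the prefix is byte-identical between requests, which is why stable system prompts and large static documents are the natural candidates.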

Practical things to know

System prompts count. Your system prompt is part of the context window. A 5,000-token system prompt means 5,000 fewer tokens available for conversation.

Documents count. When you “paste a PDF” into a chat interface, all that text enters the context window. A 50-page document is roughly 25,000 tokens.

Code counts. Pasting a large codebase into context is one of the heaviest things you can do. A 10,000-line codebase might be 80,000-100,000 tokens.

Images count differently. When you send an image, it takes up a variable number of tokens depending on size and resolution. Claude has to “see” the image, which uses context space.

The context does not persist between sessions. When you start a new conversation, the context window is empty. Claude has no memory of previous conversations unless you explicitly include them.

The “200k context window” is not a full solution

Large context windows are exciting. Being able to paste an entire codebase and ask Claude to reason about it is genuinely useful.

But bigger context is not a substitute for thoughtful context management. The practical advice:

  • Put the most important information close to the end of the prompt, not buried at the start
  • Do not pad context with irrelevant information hoping it helps — it often hurts
  • For applications, think about what context each query actually needs and send only that
  • Use conversation summarization for long-running sessions rather than passing the full history
  • Know your typical token usage and design your cost model around it
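The summarization advice from the list above can be sketched as follows. The `summarize` callback is an assumed hook (in a real app it would typically be another, cheaper model call), and the token estimator is the same rough four-characters-per-token heuristic:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token (real tokenizers differ).
    return max(1, round(len(text) / 4))

def compact_history(messages: list[dict], budget: int, summarize) -> list[dict]:
    """If the conversation exceeds `budget` tokens, collapse everything except
    the most recent exchanges into a single summary message. `summarize` is a
    caller-supplied function taking the older messages and returning a string."""
    total = sum(estimate_tokens(m["content"]) for m in messages)
    if total <= budget:
        return messages
    recent, older = messages[-4:], messages[:-4]   # keep the last two exchanges
    summary = summarize(older)
    head = {"role": "user", "content": f"Summary of earlier conversation: {summary}"}
    return [head] + recent

# Usage with a stub summarizer (a real app would call a model here):
history = [{"role": "user", "content": "x" * 400}] * 10   # ~1,000 tokens total
compacted = compact_history(history, budget=300, summarize=lambda ms: "stub")
print(len(compacted))  # 5 — one summary message plus the four most recent
```

Note that the summary itself consumes tokens and loses detail, which is the trade-off the article describes: you keep the gist and drop the specifics.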

Why this matters even if you just use Claude

Even if you are not building apps and just use Claude through claude.ai, understanding context windows explains things that might otherwise seem mysterious.

Why does Claude sometimes seem to forget something from earlier in the conversation? Long context degradation — it is technically there, but far enough back that it gets less weight.

Why does a fresh conversation sometimes produce better results than continuing a long one? Because the fresh conversation has no noise — only the information relevant to your current question.

Why does Claude sometimes say “I don’t have access to that information” when you told it three messages ago? Could be context distance. Try mentioning it again in your current message.

Context is everything — literally. What Claude sees is what Claude knows, for that conversation, in that moment. Understanding that is the foundation for using it well.


Related: What Is an LLM? A Plain English Guide for Developers · How to Add Claude to Your App Using the Anthropic API