Context Window Packing

Pack the system prompt, tool definitions, conversation history, RAG chunks, and an output reserve into a fixed token budget, then recover when the total overflows.

Context budget: 32,768 tokens
Pack components
System prompt (800 tokens): instructions, persona, policies
Tool definitions (1,200 tokens): JSON schemas, function signatures
Output reserve (2,048 tokens): reserved for model generation
Conversation turns: 4
RAG chunks: 3
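
The packing arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual implementation: the fixed sizes and the 32,768-token limit come from the panel, while `tokens_per_turn` and `tokens_per_chunk` are assumed averages that the source does not specify.

```python
from dataclasses import dataclass

CONTEXT_LIMIT = 32_768  # total context budget, in tokens


@dataclass
class Budget:
    system_prompt: int = 800
    tool_definitions: int = 1_200
    output_reserve: int = 2_048
    turns: int = 4
    rag_chunks: int = 3
    tokens_per_turn: int = 600   # assumption: average size of one conversation turn
    tokens_per_chunk: int = 500  # assumption: average size of one RAG chunk

    def total(self) -> int:
        """Sum the fixed components plus the variable history and RAG cost."""
        fixed = self.system_prompt + self.tool_definitions + self.output_reserve
        variable = self.turns * self.tokens_per_turn + self.rag_chunks * self.tokens_per_chunk
        return fixed + variable

    def overflow(self, limit: int = CONTEXT_LIMIT) -> int:
        """Return how many tokens exceed the limit (0 if everything fits)."""
        return max(0, self.total() - limit)


budget = Budget()
print(budget.total(), budget.overflow())
```

The output reserve counts against the budget even though it holds no input: if the prompt fills the whole window, the model has no room left to generate.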
Overflow — choose a recovery strategy
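
The source does not list the recovery strategies themselves, so the following is one plausible sketch under stated assumptions: drop the oldest conversation turns first (always keeping the most recent one), then drop the lowest-ranked RAG chunks. The function name `recover` and its parameters are hypothetical.

```python
def recover(fixed: int, turns: list[int], chunks: list[int], limit: int = 32_768):
    """Trim history and RAG until the packed context fits within `limit`.

    fixed:  tokens for system prompt + tool definitions + output reserve
    turns:  token counts per conversation turn, oldest first
    chunks: token counts per RAG chunk, best-ranked first
    """
    turns, chunks = list(turns), list(chunks)  # work on copies

    def used() -> int:
        return fixed + sum(turns) + sum(chunks)

    # Assumed strategy 1: drop the oldest turns, but keep the latest one.
    while used() > limit and len(turns) > 1:
        turns.pop(0)

    # Assumed strategy 2: drop the lowest-ranked chunks.
    while used() > limit and chunks:
        chunks.pop()

    if used() > limit:
        raise ValueError("still over budget after trimming history and RAG")
    return turns, chunks


kept_turns, kept_chunks = recover(4_048, [10_000, 10_000, 10_000, 2_000], [500, 500, 500])
print(kept_turns, kept_chunks)
```

Other strategies, such as summarizing old turns or shrinking the output reserve, trade recall or generation length for space and would slot in as additional steps.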
Lost in the middle — position vs. attention

Models often attend strongly to the start and end of context; middle segments get less effective recall.

Start (system) | Middle (RAG / old turns) | End (recent)

Tip: place critical instructions at the beginning, and put the freshest user message and the key retrieved docs near the end.
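
As a rough illustration of the tip, the ordering can be expressed as a small assembly function. The name `assemble` and the `n_key` parameter are hypothetical, and the sketch assumes chunks arrive sorted best-ranked first.

```python
def assemble(system: str, old_turns: list[str], chunks: list[str],
             latest_user: str, n_key: int = 1) -> list[str]:
    """Order context segments so important material avoids the middle.

    chunks are assumed to be sorted best-ranked first; the top `n_key`
    chunks are treated as the key retrieved docs.
    """
    key, rest = chunks[:n_key], chunks[n_key:]
    return [
        system,        # start: critical instructions
        *old_turns,    # middle: older conversation turns
        *rest,         # middle: lower-ranked retrieved chunks
        *key,          # near the end: key retrieved docs
        latest_user,   # end: freshest user message
    ]


order = assemble("S", ["t1", "t2"], ["c1", "c2", "c3"], "U")
print(order)
```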