Context Window Packing
Pack system, tools, history, RAG, and output reserve into a fixed token budget — then recover when you overflow.
Context budget
OVERFLOW
0 / 32,768 tokens
0%
Overflow — choose a recovery strategy
Lost in the middle — position vs. attention
Models often attend strongly to the start and end of context; middle segments get less effective recall.
Start (system)Middle (RAG / old turns)End (recent)
Tip: place critical instructions at the beginning, put the freshest user message and key retrieved docs near the end.