he first thing I learned when I started running agents in production is that the model is the easy part. The model is ready, tested at absurd scale, and behaves predictably within certain limits. What fails — almost always — is the context that reaches it.
Context engineering is the fancy name for a simple discipline: deliberately deciding what the agent sees, in what order, and what it does NOT see. Every token has an attention cost. Every paragraph stuffed into the context "just in case" steals focus from another paragraph that actually matters.
What goes in
Context that works is the minimum necessary for the agent to make a good decision. It's not "everything that might be relevant" — it's "what changes the answer if I remove it". The rest is structured noise.
"Context that works is the minimum necessary for a good decision — not the maximum that fits in the window."
In practice, this means building "task brief"-style contexts before each agent call: what we need to solve, what it CAN assume, what it CANNOT assume, and which tools are available. Instead of dumping entire logs, we dump summaries. Instead of dumping the whole codebase, we dump the 3-5 snippets that matter.
What stays out
Residue from previous calls. Conversation history that no longer says anything. Whole stack traces when the first line is enough. Internal documentation that is out of date. Each of these looks useful — and each of them degrades the decision.
If you can't explain why a piece of context is there, it probably shouldn't be. Compaction isn't optimization — it's hygiene.
The good news is that this discipline is cheap. It doesn't require a new model, doesn't require new infra. It requires only that someone stops and designs — explicitly — the context of each call as if it were an API contract. Because, at heart, it is.