Simon Willison's guide explains the mechanics behind coding agents: LLMs as completion engines, chat-templated prompts, stateless context replay, token caching optimizations, and tool-calling via harness software. Covers how system prompts, reasoning/thinking modes, and tool loops (Bash, Python execution) combine to produce agentic behavior. Practical foundation for engineers building with or on top of coding agents — especially relevant given the explicit mention of Claude Opus 4.6 and the OpenAI Codex system prompt as a reference example.
Infrastructure
How coding agents work
Coding agents combine stateless LLM completions with tool-calling harnesses, reasoning modes, and token caching optimizations to create persistent agentic behavior; Willison's guide breaks down the mechanical patterns powering systems like Claude Opus 4.6.
Thursday, March 19, 2026, 12:00 PM UTC · 2 min read · Source: Simon Willison · By sys://pipeline
Tags
infrastructure
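The tool loop described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not code from Willison's guide: `call_model` stands in for a stateless LLM completion API, and `run_tool` stands in for a harness tool executor such as Bash or Python execution. The key mechanic is that the model keeps no state between calls, so the harness replays the entire message history on every turn.

```python
# Minimal sketch of a coding-agent tool loop (hypothetical names throughout).

def call_model(messages):
    # Stand-in for a stateless LLM completion call. A real harness would
    # send the full chat-templated history (system prompt + every prior
    # turn) on each request, since the model retains nothing between calls.
    if messages[-1]["role"] != "tool":
        return {"tool": "bash", "args": {"cmd": "ls"}}  # model asks for a tool
    return {"text": "done"}  # model produces a final answer

def run_tool(name, args):
    # Stand-in for the harness's tool executor (e.g. Bash or Python).
    return f"ran {name} with {args}"

def agent_loop(user_prompt, max_turns=5):
    messages = [
        {"role": "system", "content": "You are a coding agent."},
        {"role": "user", "content": user_prompt},
    ]
    for _ in range(max_turns):
        reply = call_model(messages)  # full history replayed every turn
        if "tool" in reply:
            # Execute the requested tool and feed the result back into
            # the context, then loop again.
            result = run_tool(reply["tool"], reply["args"])
            messages.append({"role": "tool", "content": result})
            continue
        messages.append({"role": "assistant", "content": reply["text"]})
        return reply["text"]
    return None  # turn budget exhausted

print(agent_loop("List the files"))
```

Because the full history is resent on every turn, the early tokens of the conversation are identical across requests, which is exactly what makes prefix/token caching effective in these systems.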