AI Coding Agents
Tool Calling Loop: How a Coding Harness Drives a Stateless Model
Coding harnesses execute a loop where the model emits a tool call, pauses, the harness runs the tool, appends the result to history, and re-invokes the model — making the model functionally stateless between steps.
AI Coding Harness: Tools, System Prompt, Permissions Around the Model
An AI coding harness is the package of tools, environment, system prompt, and permissions layer that wraps a language model into an agent. The harness, not the model alone, determines real-world coding performance.
Context Rot in Long AI Coding Sessions: Why Agents Get Worse as Context Fills
Context rot is the documented degradation of language models as their context fills, driven by softmax-normalized attention spreading thinner over more tokens, and it is sharpest in agentic coding workloads with heavy tool-call churn.
Matt Mayer Harness Benchmark: Same Opus, 16-Point Swing Between Claude Code and Cursor
The Matt Mayer benchmark held the model, PRD, and rubric constant while swapping the harness, finding Opus scored 77% in Claude Code versus 93% in Cursor — a 16-point swing attributable to the harness alone.
Claude 1M Context Performance: Opus vs Sonnet vs Competitors
On the multi-needle MRCR v2 benchmark at 1M tokens, Claude Sonnet 4.5 collapses to 18.5% while Opus 4.6 holds 76-78% and Opus 4.7's long-context retrieval was overhauled further, making Opus best-in-class for usable long context as of April 2026.
Build-Your-Own Coding Harness: The 200-Line Core Loop
A minimal AI coding harness is a 60-200 line program that wraps the tool-call loop around three tools — read_file, list_files, edit_file — or even just bash alone, because modern models are trained on tool-call transcripts.