Engineering teams that have been using AI coding tools in production for a year or more tend to show the same pattern. Output volume is clearly up. Bugs per developer are up too. Review time has gone through the roof. Something in the economics of the work has flipped, and the st
Engineering teams that have been using AI coding tools in production for a year or more tend to show the same pattern. Output volume is clearly up. Bugs per developer are up too. Review time has gone through the roof. Something in the economics of the work has flipped, and the standard productivity numbers aren't catching it.
The work has split into a part that's getting faster, and a part that's getting heavier. The typing leaves the developer's hands. Almost everything that makes the typing worth anything stays with them, and carries more weight than before.
Where the volume is going
By early 2026, somewhere between 30 and 50 percent of code committed to GitHub is AI-authored or AI-assisted. GitHub Copilot alone generates about 46 percent of code in users' sessions. For Java developers it rises to 61 percent. At Anthropic, the CPO reported that 'effectively 100 percent' of their code is AI-written, under human direction.
The typing is cheap. But AI-generated code, left to itself, contains around 2.7 times more security vulnerabilities than human-written code. In pull-request analysis, AI PRs had about 1.7 times as many issues, with critical issues per hundred PRs up 40 percent. Forty-five percent of samples contained OWASP Top 10 vulnerabilities. What used to happen inside a developer's head while they typed, now has to happen somewhere else.
Where the weight is going
Producing code used to be the expensive step. Review and integration were cheap. Now producing code is cheap, and everything around it is expensive. Median pull-request review time is up 441 percent. Bugs per developer, up 54. Incidents per PR, up 243. Senior engineers spend 4.3 minutes on an AI suggestion versus 1.2 minutes on one from a human. Engineers with AI tools were 19 percent slower while believing they had been 20 percent faster — a forty-point gap between feels and is.

The part that stays with the developer is an expanding range:
- Understanding what the code should actually do, in relation to a customer, a product, a business
- Designing the architecture the code sits inside
- Writing the context, briefs, and prompts that set the agents up to succeed
- Building and maintaining the test infrastructure AI outputs have to pass through
- Deciding what to trust, what to rework, what to throw out
- Integrating what AI produces with systems it doesn't have in context
- Organising the work itself: how agents work together, what standing context they carry, where handoffs happen
In volume terms, AI takes on about eighty percent of the typing. In qualitative terms, eighty percent of what determines whether the output is any good stays with the human, and weighs more than before.
What the shift feels like from inside
Three months ago, most of what I did went through my keyboard. Now most of it goes through a set of instructions I give to something else. What stays with me is deciding what to do, reviewing what came back, integrating it with what already exists. The same split runs through any work that involves producing something with AI in the loop. In software development it's further along, and clearer.
A recent conversation with the founder of a Dutch development studio illustrates this: they run about five agents per workstream covering security review, infrastructure, dev-ops, and code generation. One developer acts as an 'AI-automation process consultant' — designing the context the agents work in, reviewing outputs against the architecture. A team that used to ship features in weeks now ships in days.

Where the value lives
Producing code has gotten cheap. Everything that makes code hold together has gotten more expensive: architectural judgment, specification, test design, security discipline, context design for the agents, knowing what to trust in review.
A team strong in architecture, specification, review discipline, and taste will use these tools to compound. A team weak in those places will compound the weakness. AI amplifies what's already there.

The split isn't a loss of work. It's a reallocation of weight. What used to live in the typing now lives in the thinking around the typing. That's harder, not easier. It's also a lot more interesting.



