Where the weight in software work is moving

Published Apr 19, 2026 by Ernesto Spruyt · 6 min read

Engineering teams that have been using AI coding tools in production for a year or more tend to show the same pattern. Output volume is clearly up. Bugs per developer are up too. Review time has gone through the roof. Something in the economics of the work has flipped, and the standard productivity numbers aren’t catching it.

What’s happening underneath the numbers is a split. Writing software used to be one thing. You thought about what to build, translated that into code, made sure the code worked, integrated it, fixed what broke. One person, one flow, one craft. The individual steps were visible, but they belonged together.

That’s not how it holds together anymore. The work has split into a part that’s getting faster, and a part that’s getting heavier. The typing leaves the developer’s hands. Almost everything that makes the typing worth anything stays with them, and carries more weight than before.

Where the volume is going

By early 2026, somewhere between 30 and 50 percent of code committed to GitHub is AI-authored or AI-assisted, depending on how you count. GitHub Copilot alone generates about 46 percent of code in users’ sessions; for Java developers that rises to 61 percent. At Anthropic, an extreme case, the CPO reported late last year that “effectively 100 percent” of their code is AI-written, under human direction.

The direction is unambiguous. For greenfield work, boilerplate, translation between frameworks, scaffolding, internal tools that don’t need to survive: AI does the typing now. The typing has been promoted to a utility.

The typing is cheap. It also requires explicit attention for things that used to sit implicit. AI-generated code, left to itself, contains around 2.7 times more security vulnerabilities than human-written code. In pull-request analysis, AI PRs had about 1.7 times as many issues, with critical issues per hundred PRs up 40 percent. Forty-five percent of samples contained OWASP Top 10 vulnerabilities. For Java, the security failure rate reached 72 percent. The AI does what it’s set up to do. What used to happen inside a developer’s head while they typed, now has to happen somewhere else.

Where the weight is going

Here the conventional production economy inverted. Producing code used to be the expensive step. Review and integration were cheap, because you were checking the work of someone who already knew what they were building and why. Now producing code is cheap, and everything around it is expensive.

Median pull-request review time is up 441 percent. Bugs per developer, up 54 percent. Incidents per PR, up 243 percent. Senior engineers spend 4.3 minutes on an AI suggestion versus 1.2 minutes on one from a human. A study with senior developers on mature codebases produced a counterintuitive finding: engineers with AI tools were 19 percent slower, while believing they had been 20 percent faster. A forty-point gap between feels and is.

“Review” on its own is too narrow a word for what’s happening. The part that stays with the developer isn’t one task. It’s an expanding range:

  • Understanding what the code should actually do, in relation to a customer, a product, a business
  • Designing the architecture the code sits inside
  • Writing the context, briefs, and prompts that set the agents up to succeed
  • Building and maintaining the test infrastructure AI outputs have to pass through
  • Deciding what to trust, what to rework, what to throw out
  • Integrating what AI produces with systems it doesn’t have in context
  • Organising the work itself: how agents work together, what standing context they carry, where handoffs happen

This is what good developers have always done. The ratio has changed. In volume terms, AI takes on about eighty percent of the typing. In qualitative terms, eighty percent of what determines whether the output is any good stays with the human, and weighs more than before.

That’s the piece that’s easy to miss when looking only at output numbers. The output going up doesn’t mean the work got lighter. The weight moved.

What the shift feels like from inside

A small observation from my own work makes the pace tangible. Three months ago, most of what I did went through my keyboard. Now most of it goes through a set of instructions I give to something else. What stays with me is deciding what to do, reviewing what came back, integrating it with what already exists. Three months ago, not three years. I’m not a developer. I run a company. The same split runs through any work that involves producing something with AI in the loop. In software development it’s further along, and clearer.

For how this shows up in software specifically, a recent conversation with the founder of a Dutch development studio gave me a concrete picture.

They run about five agents per workstream. Different agents handle different layers: security review, infrastructure, DevOps, code generation. One developer sits in the role of what he called an “AI-automation process consultant”: designing the context the agents work in, reviewing outputs against the architecture, deciding where a human has to stay in the loop. A team that used to ship features in weeks now ships in days. The typing volume is far higher. The human judgment layer is deeper, more deliberate, and also more front-loaded: much of it is encoded in how the agents are set up, briefed, and handed off, not performed fresh on every task.
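To make that setup concrete, here is a minimal sketch of how agents-per-layer plus a human sign-off gate could be modelled. The layer names, the `human_gate` mechanism, and the briefs are my own illustration of the studio's description, not their actual tooling or configuration.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """One agent assigned to a single layer of a workstream."""
    layer: str             # e.g. "security review", "code generation"
    standing_context: str  # the brief the agent carries between tasks

@dataclass
class Workstream:
    """Roughly five agents per workstream, one human in the loop."""
    agents: list[Agent] = field(default_factory=list)
    # Layers where a human must review before anything ships (hypothetical)
    human_gate: set[str] = field(default_factory=set)

    def needs_human(self, layer: str) -> bool:
        return layer in self.human_gate

# Hypothetical configuration mirroring the studio's description
ws = Workstream(
    agents=[
        Agent(layer, f"standing brief for {layer}")
        for layer in ["security review", "infrastructure",
                      "devops", "code generation", "integration"]
    ],
    human_gate={"security review", "integration"},
)
```

The point of the sketch is the front-loading: the judgment lives in the standing briefs and in which layers are gated, decided once per workstream rather than per task.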

That picture isn’t yet mainstream. But it’s how the front-runners are operating right now, in April 2026.

Where the value lives

Producing code has gotten cheap. Everything that makes code hold together has gotten more expensive. Architectural judgment. Specification. Test design. Security discipline. Context design for the agents. How the work itself is organised. Knowing what to trust in review. These are the parts that now decide whether anything produced with AI is worth shipping.

This changes where the value in a team lives. A team strong in architecture, in specification, in review discipline, in taste, will use these tools to compound. A team weak in those places will compound the weakness. AI amplifies what’s already there.

Whether working this way is worth it comes down to those conditions. The same setup produces compounding in one team and technical debt in another.

The split isn’t a loss of work. It’s a reallocation of weight. What used to live in the typing now lives in the thinking around the typing. That’s harder, not easier. It’s also a lot more interesting.