Based in Italy · 2026

What happens when coding is not the bottleneck anymore

AI made writing code fast. Teams are not shipping much faster. The bottleneck moved, and the Theory of Constraints called it years ago.

  • ai
  • ai-native-engineering
  • engineering
  • teams
  • theory-of-constraints
  • productivity

Engineering teams in 2026 can write code up to 5 times faster than 2 years ago, at least on codebases that are well prepared for AI. But the cycle time, from a ticket opened to a feature in production, is barely moving. Not 5x. Not 2x. Sometimes a little. Sometimes nothing.

This gap is the whole story of where AI-Native Engineering is right now. The coding part got fast. Everything else did not. And until you sit with that and decide what to do about it, your team will keep buying more Cursor seats and asking why the chart still looks flat.

The bottleneck moved. It did not disappear.

For most of software engineering history, the slowest step from "idea" to "shipped" was a human typing code. Not the only slow step, but the biggest one. We built everything around it. Tickets to feed it. Sprints to measure it. More hiring to scale it.

Then coding agents got good. A senior engineer can now turn a clear spec into a working PR in under an hour for a wide class of tasks. The "typing code" box on the value-stream map shrank. So far, great.

The problem is what comes next. If you take any system and remove one constraint, you do not get a faster system. You get a system where the next slow step is now very, very visible. The bottleneck moved upstream into the work before the code, and downstream into the work after the code.

A quick detour into the Theory of Constraints, because it predicted this exactly. The idea comes from manufacturing in the 1980s. The observation is simple: in any system that turns inputs into outputs, there is one step that is slower than all the others. That step sets the pace of the whole system, and nothing else does.

Picture a factory with 10 machines. Machine 4 produces 100 units per hour. The other 9 produce 200. The factory ships 100 units per hour, not 200, not 150. Speed up machine 7 from 200 to 300 and the factory still ships 100. Only when you make machine 4 faster does the whole place move.

So a system's throughput is set by its slowest stage. Speed up any other stage and the throughput does not change. The only thing that grows is the pile of work-in-progress in front of the real constraint. Push hard enough and the system actually gets worse, because work piles up faster than the constraint can drain it.
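To make the factory arithmetic concrete, here is a minimal sketch. It assumes a strictly serial line with unlimited input, where every stage upstream of the constraint runs flat out. The function names are mine, chosen for illustration.

```python
def throughput(rates):
    """Units per hour a serial line ships: the rate of its slowest stage."""
    return min(rates)


def wip_growth_per_hour(rates):
    """Units per hour piling up in front of the slowest stage.

    Assumes unlimited input and that every upstream stage runs flat out.
    """
    slowest = rates.index(min(rates))
    if slowest == 0:
        return 0  # nothing sits upstream of the constraint
    feed_rate = min(rates[:slowest])  # what actually arrives at the constraint
    return feed_rate - rates[slowest]


# The factory from the text: machine 4 does 100 units/hour, the rest do 200.
factory = [200, 200, 200, 100, 200, 200, 200, 200, 200, 200]

# Speed up machine 7, which sits downstream of the constraint.
faster_machine_7 = list(factory)
faster_machine_7[6] = 300

# Speed up machines 1 to 3, which sit upstream of the constraint.
faster_upstream = [300, 300, 300] + factory[3:]

# Speed up machine 4, the constraint itself.
faster_machine_4 = list(factory)
faster_machine_4[3] = 150

print(throughput(factory), wip_growth_per_hour(factory))                    # 100 100
print(throughput(faster_machine_7), wip_growth_per_hour(faster_machine_7))  # 100 100
print(throughput(faster_upstream), wip_growth_per_hour(faster_upstream))    # 100 200
print(throughput(faster_machine_4), wip_growth_per_hour(faster_machine_4))  # 150 50
```

Only the last change moves the number that matters. Speeding up the upstream machines ships nothing extra and doubles the rate at which the pile grows.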

The framework comes with 5 steps you can run in a loop: find the constraint, get the most out of it, line up everything else to support it, make it bigger, and then do not get stuck doing the same thing once the constraint moves.

Theory of Constraints: the five focusing steps (identify, exploit, subordinate, elevate, avoid inertia)

AI-Native Engineering, applied only to coding, is exactly that move: speeding up a stage that is not the constraint. You make the implementer faster. So now PRs pile up, waiting for someone to review them. The reviewed PRs pile up, waiting for product or stakeholder sign-off. The product asks pile up, waiting for a decision from leadership. Every queue that was hidden before, because the engineer was slower than all of them, is now full and visible. The team feels busier than ever, and the roadmap looks the same. (At this point someone usually says "we just need to do more standups." Please do not.)

Where the new constraints actually live

When a team starts writing code 5x faster, here is what becomes painfully obvious. None of this is new. It was just hidden while implementation was the slowest step.

Code review. The most obvious one. A reviewer who used to look at 3 PRs a day is now looking at 8, and each PR is the same size but written by a system that has no intuition for the team's invariants yet. Review time per PR goes up. The queue grows. Reviewers become the constraint, and they hate it. Yes, "LGTM" is still a coping mechanism. No, it is not a strategy.

Product decisions. A PM who could feed 4 specs per sprint is now being asked for 12. The thinking a real PRD needs (talking to users, reading data, weighing trade-offs) is not something AI has made dramatically faster yet. Especially when the actual bottleneck on a decision is "find time on 6 calendars."

Design. Mockups, components, design systems. Accessibility, copy review, brand. The handoff from design to engineering is still mostly a meeting. AI helps draft things, but the alignment is human, and the people are the same people as last year.

Stakeholder alignment. Legal, security, compliance, finance, ops, support, marketing. Every one of them has an inbox. Every one of them has a queue. Every one of them has a calendar that does not care about your sprint goal.

Leadership decisions. "Should we even build this?" "Are we OK changing this API?" "Do we have budget for that vendor?" These travel up the org and back down at the speed of human management bandwidth, which is roughly the same speed it was in 2015.

Cross-team coordination. Your team ships fast. The team that owns the database does not. The platform team owns the deploy. The data team owns the schema you depend on. Your fast code finds the slowest neighbor and waits, like every microservice you ever wrote.

Look at the list. None of these are coding problems. They are deciding and agreeing problems. AI made the doing fast. The deciding is still as slow as it was.

The real unit is the cross-functional team

You see this most clearly when you stop measuring the engineer and start measuring the cross-functional team.

A fast engineering team inside a slow cross-functional team is like a sports car parked in traffic. The horsepower is real. It just does not matter. The team's throughput is still set by the slowest function, not the fastest one. And in most product teams in 2026, the slowest function is no longer engineering.

This is what the early "10x engineer" story missed. The 10x framing assumes engineering is still the constraint. For more and more teams, it is not. The new constraint is the connective tissue: how a PM, a designer, a tech lead, a stakeholder, and a reviewer get to a shared understanding fast enough to keep up with the implementer that no longer slows them down.

The general name for this is collaboration. The specific failure mode is: the cross-functional team is not speeding up much, because the slowest functions inside it have not.

What actually moves the system

If you take the Theory of Constraints seriously, the moves are not subtle. You either elevate the constraint, or you exploit it better, or you restructure the system so the constraint moves to a place you can attack. Some of this is old Lean thinking. Some of this is specific to the AI era.

1. Smaller, more focused, fully cross-functional teams

The biggest structural lever is to stop running a team where engineering, product, and design are 3 separate queues passing tickets to each other like a relay race nobody wins. Replace it with a pod: 1 PM, 1 designer, 2 to 4 engineers, sitting close to the stakeholders that matter, owning a problem from start to end.

The reason has nothing to do with engineering output. It is about decision latency. In a pod, "is this the right thing to build?" is a 10-minute chat. In a sliced-up cross-functional team, the same question takes a week, because it has to walk across 4 calendars and 2 Slack channels. AI does not fix calendars. Smaller pods do.

Amazon's "2-pizza team" rule has been re-discovered many times for many reasons. With AI accelerating implementation, the reason is sharper now: the smaller the team, the lower the coordination tax. And the coordination tax is now the biggest line on the bill.

2. Speed up code review (this is the next big chapter)

Code review is the most visible bottleneck right after coding, which is why most teams hit it first. Engineers feel it every day. Reviewers feel it more.

There is a real chapter's worth of moves to make here: AI as a first-pass reviewer for the boring stuff (style, obvious bugs, test coverage, hallucinated APIs), spec-anchored review so the reviewer is checking against a written intent and not trying to re-derive it from the diff at 5pm on a Friday, narrower PRs so the human review has a bounded scope, and team norms about what AI-generated code has to carry with it before it is even allowed near the queue.

I will write more about this in a future chapter of the book, because the difference between a team that solves review and one that does not is roughly the difference between 2x and 5x in real throughput. For now the structural point is enough: if you speed up implementation without speeding up review, you are filling a queue you cannot drain. That is a backlog, not a roadmap.
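A queue you cannot drain is easy to put in numbers. The sketch below is a deliberately naive day-by-day model. The rates are illustrative assumptions, not measurements from any team: reviewers can clear 4 PRs a day, and the number of PRs opened goes from 3 to 8 a day.

```python
def review_backlog(opened_per_day, reviewed_per_day, days, start=0):
    """Return the number of PRs still waiting for review at the end of each day.

    Assumes constant daily rates, which no real team has.
    """
    backlog = start
    history = []
    for _ in range(days):
        backlog += opened_per_day                   # new PRs join the queue
        backlog -= min(backlog, reviewed_per_day)   # reviewers clear what they can
        history.append(backlog)
    return history


# Hypothetical numbers: review capacity stays at 4 PRs per day.
before_ai = review_backlog(opened_per_day=3, reviewed_per_day=4, days=5)
after_ai = review_backlog(opened_per_day=8, reviewed_per_day=4, days=5)

print(before_ai)  # [0, 0, 0, 0, 0]
print(after_ai)   # [4, 8, 12, 16, 20]
```

As long as PRs arrive faster than they are reviewed, the backlog grows by the difference every single day. Nothing on the implementation side can fix that.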

3. AI for PMs, designers, and leadership, not only engineers

This is the move most companies have not made yet. It is also the one with the biggest payoff at the team level.

If your engineers are using Claude Code, Cursor, and an AGENTS.md file, and your PM is still writing PRDs in a blank Notion page at midnight, you have not built an AI-Native team. You have built an AI-native engineering function inside a non-AI-native company. The two are very different.

The same logic goes upstream. PMs can use AI to draft PRDs, run competitive scans, summarize user interviews, generate test scenarios, and pressure-test their own thinking before it ever lands on an engineer's desk. Designers can use AI to generate variants, document components, draft copy, and prototype interactions. Leadership can use AI to summarize, model trade-offs, and prepare faster decisions.

The point is not "everyone uses ChatGPT now." The point is that every role on the cross-functional team needs its own version of the AI-Native practice: clear intent, real context, a human-in-the-loop checkpoint, an explicit verification step. When only engineering has it, the gain is capped at the engineering part of the system. The math is brutal here.
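One way to see that math is Amdahl's law, borrowed here from parallel computing and applied to cycle time. This is my framing, under the assumption that the steps of a ticket happen one after another. The 20% share of cycle time spent on implementation is an illustrative assumption, not a measurement.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: end-to-end speedup when only part of the work gets faster.

    fraction      -- share of total cycle time spent in the accelerated step (0 to 1)
    local_speedup -- how many times faster that step becomes
    """
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Assumption for illustration: implementation is 20% of ticket-to-production time.
coding_share = 0.20

five_x_coding = overall_speedup(coding_share, 5)
instant_coding = overall_speedup(coding_share, float("inf"))

print(round(five_x_coding, 2))   # 1.19
print(round(instant_coding, 2))  # 1.25
```

Under that assumption, 5x faster coding buys about 1.19x end to end, and even infinitely fast coding tops out at 1.25x. The other 80% of the cycle sets the ceiling.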

4. Smaller bets, faster prototypes, real artifacts

When implementation is cheap, the right shape of work changes. You stop running a single 12-week project that has to be perfect on the first try. You start running 4 to 6 small bets that take days to prototype and then either get killed or scaled. Yes, this means killing more of your own work. No, your ego does not deserve a feature flag.

This is where vibe prototyping, which is genuinely useful, reconnects with serious engineering. A working prototype is the cheapest way to align people on something. Stakeholders react to a thing, not to a slide. The 2-week debate about "should this be a modal or a dropdown?" becomes a 1-hour debate after everyone has clicked the prototype.

The shift is from "let's decide and then build" to "let's build the smallest version that lets us decide." AI drops the cost of that smallest version by an order of magnitude. Teams that use this change their decision cadence. Teams that do not, keep arguing about wireframes.

5. SDD as the cross-functional contract

The other underrated lever is to push spec-driven development out of engineering and into the cross-functional team.

A real spec, the kind I described in What is AI-Native Engineering?, is not only an engineering artifact. It is a contract. It states the behavior, the constraints, the acceptance criteria, and the edge cases in a language that PM, design, and engineering can all read and amend. It is the natural place to find disagreements early, when resolving them is cheap. Not at PR review on Thursday afternoon, when resolving them is expensive and also nobody is in the mood.

Teams that share a spec template across functions get a benefit nobody talks about: their stakeholder review compresses. Instead of looping security in at the end on a finished PR, you loop them in at the spec, where changing 2 sentences is enough to fix the problem. The same logic applies to design, product, legal, and platform.

The spec becomes the place where cross-functional alignment happens, and also the place from which AI implementation is launched. Same artifact, doing two jobs. That is what makes it efficient.
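As a rough sketch of what such a shared artifact could look like as a data structure: the four content sections below come from the description above, while the class name, the sign-off mechanism, and the readiness rule are hypothetical choices made for illustration. Treat it as one possible shape, not a template from the book.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """A hypothetical cross-functional spec: one artifact, several readers."""
    title: str
    behavior: str = ""
    constraints: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)
    edge_cases: list[str] = field(default_factory=list)
    # Function name -> has that function approved the spec? (illustrative)
    sign_offs: dict[str, bool] = field(default_factory=dict)

    def missing_sections(self):
        """Names of content sections that are still empty."""
        sections = {
            "behavior": self.behavior,
            "constraints": self.constraints,
            "acceptance_criteria": self.acceptance_criteria,
            "edge_cases": self.edge_cases,
        }
        return [name for name, value in sections.items() if not value]

    def pending_sign_offs(self):
        """Functions that have not approved the spec yet."""
        return [who for who, approved in self.sign_offs.items() if not approved]

    def ready_for_implementation(self):
        """Assumed rule: every section filled in and every listed function signed off."""
        return not self.missing_sections() and not self.pending_sign_offs()


spec = Spec(
    title="Export invoices as CSV",  # made-up example feature
    behavior="A user can download all invoices for a date range as one CSV file.",
    constraints=["Export must not expose invoices from other accounts."],
    acceptance_criteria=["File opens in a spreadsheet with one row per invoice."],
    sign_offs={"product": True, "design": True, "security": False},
)

print(spec.missing_sections())          # ['edge_cases']
print(spec.pending_sign_offs())         # ['security']
print(spec.ready_for_implementation())  # False
```

The useful part is that the gaps show up before any code exists: an empty edge-case section and a missing security sign-off are both cheap to fix at this stage.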

The spec as a contract between PM, design, and engineering

How to know which constraint you have

The honest version of this advice is: do not adopt all 5 moves at the same time. Find your team's real constraint and elevate that one.

Quick diagnostic. Spend 1 week tracking, for every piece of work your team is doing, where it spends time. Not "in progress" vs "done." Specifically: how many hours it spends in each of these states.

  1. Waiting for a spec or a decision.
  2. Waiting for a design.
  3. Waiting for stakeholder input.
  4. Being implemented.
  5. Waiting for review.
  6. Waiting for deploy or QA.

Whichever bucket is biggest is your constraint. Everything else is local optimization until you fix that one. If your biggest bucket is "being implemented," congratulations: AI coding tools are your next investment. If your biggest bucket is "waiting for review" or "waiting for stakeholder input," more AI in your IDE will literally make things worse, because it fills the queue in front of the real constraint even faster. It works on your machine. The queue does not care.
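The diagnostic fits in a few lines. This sketch assumes you log one entry per work item and state with the hours spent there. The ticket names and the hours are made up for illustration, and the state labels mirror the six buckets listed above.

```python
STATES = [
    "waiting for spec or decision",
    "waiting for design",
    "waiting for stakeholder input",
    "being implemented",
    "waiting for review",
    "waiting for deploy or QA",
]


def hours_by_state(entries):
    """Sum the logged hours per state across all work items."""
    totals = {state: 0.0 for state in STATES}
    for _item, state, hours in entries:
        if state not in totals:
            raise ValueError(f"unknown state: {state!r}")
        totals[state] += hours
    return totals


def biggest_bucket(totals):
    """The state that absorbed the most hours: the likely constraint."""
    return max(totals, key=totals.get)


# Hypothetical one-week log: (work item, state, hours spent in that state).
week_log = [
    ("TICKET-1", "waiting for spec or decision", 10),
    ("TICKET-1", "being implemented", 4),
    ("TICKET-1", "waiting for review", 20),
    ("TICKET-2", "waiting for stakeholder input", 16),
    ("TICKET-2", "being implemented", 3),
    ("TICKET-2", "waiting for review", 12),
]

totals = hours_by_state(week_log)
print(biggest_bucket(totals))       # waiting for review
print(totals["being implemented"])  # 7.0
```

In this made-up week, implementation takes 7 hours out of 65 and review takes 32. More code generation would only add to the 32.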

Most teams I have looked at in the last 12 months had implementation as their 3rd or 4th biggest bucket. Not their 1st. They were investing in coding speed because it was visible and exciting, and not because it was the constraint.

The thing to remember

The Theory of Constraints called this situation a long time ago. Speed up any non-constraint and you get zero throughput improvement and a bigger pile of half-done work. Speed up the actual constraint, even by 10%, and the whole system moves.

AI-Native Engineering, done well, is not about making engineers code faster. It is about figuring out which step in your team's value stream is actually constraining throughput, and pointing AI exactly there. Sometimes that is coding. In 2026, often it is not.

The next chapter of this work is about the most obvious second constraint: code review. That is where AI-accelerated implementation crashes hardest into a human-paced process, and that is where the next round of practical gains is sitting.

For now, if you take one thing from this post: stop measuring the engineer. Start measuring the cross-functional team. That is the unit that ships, and that is the unit the constraint lives in.

Where to go next

  1. The book. AI-Native Engineering: Building Production-Ready Software with AI covers the full practice, including the code-review chapter I teased above. alfonsograziano.it/book
  2. The newsletter. Weekly, written from the book drafts. Subscribers get new chapters before they go public. Subscribe on Substack
  3. Start with the foundation. If you have not read What is AI-Native Engineering? and On AI-assisted software engineering, please start there. This post assumes the shared vocabulary.

The engineering organizations that win the next 18 months are not the ones with the most Cursor licenses. They are the ones that figured out where their real constraint is, and pointed the tools at it.