Personal Blog  ·  Essay
Essay · Engineering · The AI Rush

The productivity illusion: when shipping speed becomes the enemy of quality.

AI made building software faster than it has ever been. It also made building bad software faster than it has ever been. Teams are confusing the two, and the cost is showing up everywhere except the dashboards that matter.

Something is going wrong in software teams right now, and most of the people inside them can feel it without being able to name it. The PR throughput is up. The feature shipping cadence is up. The release notes are full of new things. By every metric the leadership tracks, the team is more productive than ever. And yet the product is somehow getting worse, the engineers are exhausted, and the customers are starting to notice. This is the shape of the problem I want to talk about.

The forces creating it aren't mysterious. They're mostly downstream of two simultaneous shifts: AI made writing code dramatically cheaper, and a wave of small AI-native apps started eating the low end of every SaaS market. The combination put established software companies into a defensive crouch they don't quite know how to escape, and the way they're trying to escape it is making everything worse.

The market backdrop, briefly

Most software gets used the way most gym memberships get used. 90% of the users spend most of their time on 20% of the features. Everyone in product knows this. It's been the boring truth of the industry for as long as I've been in it.

What's changed is that small AI-native apps can now replicate that critical 20% in a weekend, ship it at $5/month or for free, and pull in the long tail of new users who would otherwise have started their journey with the incumbent. In creative software especially, across photography, image editing, and design, the AI shortcuts have collapsed entire complex workflows into single prompts. The value proposition of "spend six months learning a sophisticated tool" is decreasing every quarter that an AI can do the same job in one click.

The incumbents are not crazy to feel threatened. Their moat was the depth of their tooling and the size of their distribution. AI is hollowing out the first, and the second isn't going to hold forever on its own. So they do what any threatened incumbent does: they speed up. They use the same AI tools to ship features at the same pace as the upstarts. They release small free apps to feed users into their bigger ecosystem. They make sure there's always buzz in the market about something new.

I get it. The strategy is rational, even necessary. But the way it's actually being executed, on the ground, in real engineering teams, is corroding the thing the strategy was supposed to protect.

AI didn't just make building software faster. It made building bad software faster. We've been measuring the first while quietly accumulating the cost of the second.

Four things I've watched happen

Here's what the productivity illusion looks like inside an engineering team. None of these are hypothetical. All of them are patterns I've watched up close in the last twelve months.

→ Symptom 01

The 50-file pull request

The PR touches 40 to 60 files. It's accompanied by a 30-page AI-generated design doc explaining the changes. The author has, charitably, skimmed the diff. Reviewers do their best, but the volume is so large that subtle bugs hide easily in code that everyone treats as boilerplate.

My team used to mandate that PRs touch a bounded number of files, with larger changes split into a series of focused, reviewable diffs. That practice has quietly evaporated. Nobody decided to abandon it. It just stopped being enforced once AI made the cost of generating a large change trivial, while the cost of reviewing one stayed exactly what it always was.

Before · careful diffs
Bounded PRs, focused review

Changes split across multiple small PRs. Each one tells a single story. Reviewers can actually hold the change in their head.

~5 files touched · 100% lines reviewed

After · AI-scale diffs
Mega PRs, performative review

One sprawling change touching everything. Reviewers approve based on the description, not the code. Bugs land hidden among "boilerplate."

40+ files touched · ~20% lines actually read

→ Symptom 02

Refactor-by-prompt

Refactoring legacy code used to be a careful, deliberate activity. You'd identify the scope, build a test harness, make the change in small reversible steps, and validate the behavior was preserved at every checkpoint. Now people fire off "modernize this module" prompts to Claude Code, get back a working-looking diff, and ship it.

The dangerous shift is that LLM hallucinations are still there, but the model's confidence has gone way up. The output reads as authoritative even when it's quietly wrong. Code review can't catch this kind of error. The code looks fine and the explanation sounds reasonable. Only automated tests and careful manual validation surface the regressions, and the QE teams running those have been overwhelmed by the sheer volume of changes flooding the pipeline.
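One way to make the "test harness first" step concrete is a characterization test: record what the legacy code does today on a fixed set of inputs, then require the refactored version to match before it ships. The sketch below is only an illustration. `legacy_price` and `refactored_price` are hypothetical stand-ins I made up for the example, and the boundary bug is planted deliberately to show the kind of error that reads fine in review.

```python
from typing import Callable, Iterable


def legacy_price(quantity: int, unit_cents: int) -> int:
    """Hypothetical legacy function: 10% bulk discount at 10 or more units."""
    total = quantity * unit_cents
    if quantity >= 10:
        total = total - total // 10
    return total


def refactored_price(quantity: int, unit_cents: int) -> int:
    """Hypothetical 'modernized' version with a planted boundary bug (> instead of >=)."""
    total = quantity * unit_cents
    return total - total // 10 if quantity > 10 else total


def find_regressions(
    old: Callable[..., int],
    new: Callable[..., int],
    cases: Iterable[tuple],
) -> list[tuple]:
    """Run both versions on the same inputs and collect every behavioral difference."""
    mismatches = []
    for args in cases:
        expected = old(*args)
        actual = new(*args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches


# Sweep quantities 0..20 at a fixed unit price so the discount boundary is covered.
CASES = [(quantity, 250) for quantity in range(0, 21)]

regressions = find_regressions(legacy_price, refactored_price, CASES)
for args, expected, actual in regressions:
    print(f"behavior changed for {args}: {expected} -> {actual}")
```

Both versions look reasonable in a diff. Only the input sweep shows that an order of exactly 10 units lost its discount.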

→ The new failure mode

In some cases, our pre-release customers have become our QA team. They don't know it. They give us vocal feedback when workflows break. But it's not working in our favor: every broken workflow is a small subtraction from the credibility we spent years building.

→ Symptom 03

The backlog as a graveyard

The engineering backlog (small quality-of-life bugs, performance improvements, cleanup tickets, that one rare crash report) used to get dedicated cycles. Now it doesn't. The focus has shifted entirely to "how many new features can we deliver this quarter?" and anything that doesn't look like a feature falls off the priority list.

What gets really strange is what counts as a "feature." A fancy new submit button is a feature. Making a slider 10x faster by tweaking the underlying algorithm isn't. One shows up on a launch slide. The other doesn't. Guess which one gets the engineering time.

→ The new prioritization logic

What gets shipped vs. what gets shelved

New AI-assisted submit button with shimmer effect: Shippable · launch slide-worthy
Slider performance optimization, 3x faster: Backlog · invisible win
Onboarding flow redesign #4 this year: Shippable · measurable engagement
Rare crash that affects 0.3% of pro users: Backlog · low impact
"AI-powered" badge on existing feature: Shippable · marketing-friendly
Memory leak in long-running session: Backlog · hard to reproduce

→ Symptom 04

"Let the agent handle it"

There's a paradigm emerging that I find genuinely troubling. Let the agent write the code. Let the agent review the code. Let the agent run the tests. Let the agent fix the bugs. And when something fails in production, let the agent take the blame. It's childish, and it doesn't survive contact with reality.

Accountability should always lie with the human developer or orchestrator. Pushing it onto the tool is a category error. Tools don't have careers, don't get fired, don't have skin in the game. Every time we outsource the thinking, we lose a little of the muscle that did it. The ability to identify bugs through pattern recognition, to feel the wrongness of a solution before you can articulate it, to hold the complexity of a system in your head: these are cognitive skills that atrophy when an agent does the work.

The cost is real, and falling on everyone

The strange thing about the productivity illusion is that it isn't making anyone happy. It looks like a zero-sum situation where someone is gaining at someone else's expense, but if you actually look around the room, every stakeholder is losing in their own way.

→ The pain, distributed

Everyone is running, nobody is happy

Customers
Receive half-baked features that look new but ship buggy. Spend hours filing bug reports for things they paid to use. Lose trust in the product line they once loved.
Engineers
Ship code they're not proud of. Watch their craft be measured in PRs per week. Feel the burnout building. Lose the sense of ownership that made the job feel meaningful.
QE teams
Drowning in a volume of changes they cannot possibly validate. Discover regressions days after they shipped. Watch test coverage and confidence both quietly decay.
PMs & EMs
Caught between the leadership's "ship more" message and the team's "we're burning out" reality. Forced to deliver bad news in both directions. No good choice available.
Management
Watching the stock price reflect the market's growing skepticism about long-term strategy. The output metrics look great. The outcome metrics don't.
Investors
Looking at ARR growth that doesn't match the velocity story being told to them. Wondering when the bill comes due.

This is what makes the situation so strange. There's no villain in this story. The market pressure is real. The strategic response is rational. The individual engineers are doing their best inside the constraints they're given. The PMs are responding to the same incentives the executives are responding to. Everyone is being reasonable, locally, and the aggregate result is a slow-motion erosion of the thing that made the company worth working at in the first place.

What we're actually losing

I want to be specific about the cost, because "quality" is a fuzzy word that lets executives shrug off the conversation.

We're losing institutional craft. The unwritten standards that experienced engineers held (small PRs, careful review, deliberate refactoring, validated changes) were never in a document. They lived in the culture, transmitted from senior to junior, reinforced by what got praised and what didn't. When the culture stops reinforcing them, they evaporate within a couple of quarters. Getting them back is much harder than losing them.

We're losing customer trust. Not all at once. Just one frustrated bug report at a time. A pro user who hits three broken workflows in a month doesn't write an angry blog post. They just quietly start using a competitor for the part of their workflow that's most broken. Eventually they stop showing up. By the time it's visible in the churn metrics, the damage is years deep.

We're losing engineering judgment. The instinct to push back on a bad spec. The pattern recognition that says "this refactor is going to bite us." The careful reading that catches a subtle bug. These are muscles, and they only stay strong if you use them. An entire generation of engineers who learned the job in an AI-pair-programmed world might never develop them fully, and that's a structural loss for the industry.

We're losing accountability. When code is generated by an agent, reviewed by another agent, and shipped by a third, the question "who is responsible for this?" becomes harder to answer. And in software, an unanswered question of responsibility eventually becomes an actual problem that bites real users.

→ The honest part

I want to be careful not to sound like a complete reactionary. AI tools are genuinely transformative. Used carefully, they make engineers significantly better: faster on boilerplate, broader in their reach, more willing to attempt ambitious refactors. The problem isn't the tools. The problem is the management theory and team culture that have grown up around the tools, and the assumption that more output equals more value.

What would actually help

I don't have a clean five-bullet fix for this. But here are the things I've started doing on my own team, and that I'd suggest to anyone watching the same patterns unfold:

Re-enforce the PR size limit, with teeth. A 40-file AI-generated PR isn't a productivity win. It's a review failure waiting to happen. Treat large changes the way you'd treat large transactions in a database: break them up, validate each one, ensure rollback is possible.
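"With teeth" can mean a required CI check rather than a norm people are asked to remember. Here is a minimal sketch of such a check, under stated assumptions: the limit of 10 files, the `origin/main` base branch, and the function names are all placeholders I picked for illustration, not a recommendation of specific numbers.

```python
import subprocess

MAX_FILES = 10  # assumed limit; pick whatever your reviewers can actually hold in their heads


def changed_files(base: str = "origin/main") -> list[str]:
    """List files changed on the current branch since its merge base with `base`."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def check_pr_size(files: list[str], limit: int = MAX_FILES) -> tuple[bool, str]:
    """Pure decision function, kept separate from git so it is easy to test."""
    count = len(files)
    if count <= limit:
        return True, f"OK: {count} files changed (limit {limit})"
    return False, f"Too large: {count} files changed (limit {limit}); split into smaller, reviewable PRs"


def main() -> int:
    """Entry point for CI: returns a non-zero exit code when the PR is over the limit."""
    ok, message = check_pr_size(changed_files())
    print(message)
    return 0 if ok else 1


# Demonstration on a synthetic 40-file change, no git repository required.
ok, message = check_pr_size([f"src/module_{i}.py" for i in range(40)])
print(message)
```

In a pipeline, the script would end with `raise SystemExit(main())` so that an oversized PR fails the check. A real setup would probably also want an explicit, logged override for generated files or mechanical renames, so the limit stays credible instead of being routinely bypassed.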

Make engineering quality a measured outcome, not a vibe. Bug counts in the first 30 days post-release. Time to fix incoming critical bugs. Performance regression rates. These are tractable metrics that resist the productivity illusion because they measure how well the shipped work holds up, not how much of it was shipped.
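None of these metrics needs special tooling. As a rough sketch, assuming bug records can be exported from the tracker with a report date, a fix date, and a severity flag, the three numbers reduce to a few lines. The `Bug` record, the function names, and the sample data below are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from statistics import median
from typing import Optional


@dataclass
class Bug:
    reported: date
    fixed: Optional[date]  # None while the bug is still open
    critical: bool


def bugs_in_first_30_days(release_day: date, bugs: list[Bug]) -> int:
    """Count bugs reported within 30 days of the release date."""
    window_end = release_day + timedelta(days=30)
    return sum(1 for bug in bugs if release_day <= bug.reported < window_end)


def median_days_to_fix_critical(bugs: list[Bug]) -> Optional[float]:
    """Median days from report to fix, over critical bugs that have been fixed."""
    durations = [
        (bug.fixed - bug.reported).days
        for bug in bugs
        if bug.critical and bug.fixed is not None
    ]
    return median(durations) if durations else None


def regression_rate(releases_with_regression: int, total_releases: int) -> float:
    """Share of releases that shipped at least one performance regression."""
    return releases_with_regression / total_releases if total_releases else 0.0


# Made-up sample data for one release.
release_day = date(2024, 3, 1)
bugs = [
    Bug(reported=date(2024, 3, 3), fixed=date(2024, 3, 5), critical=True),
    Bug(reported=date(2024, 3, 10), fixed=date(2024, 3, 11), critical=False),
    Bug(reported=date(2024, 3, 20), fixed=date(2024, 3, 30), critical=True),
    Bug(reported=date(2024, 4, 15), fixed=None, critical=False),
]

early_bugs = bugs_in_first_30_days(release_day, bugs)
fix_time = median_days_to_fix_critical(bugs)
perf_rate = regression_rate(2, 8)
print(early_bugs, fix_time, perf_rate)
```

The design choice that matters is that every input is an outcome observed after release. Nothing in it counts PRs, commits, or features.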

Reserve cycles for non-feature work explicitly. A standing 20% of engineering time for performance, cleanup, debt, and infrastructure. Not "if we have time." Reserved. The teams that survive long-term are the ones that protect this; the teams that don't are the ones whose codebase becomes unmaintainable by year three.
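"Reserved" is easier to defend when it is checked at sprint planning instead of argued about afterwards. A small sketch, assuming tickets carry a category label and a point estimate. The category names mirror the list above, while the ticket format and the sample sprint are invented for the example.

```python
NON_FEATURE = {"performance", "cleanup", "debt", "infrastructure"}
RESERVED_SHARE = 0.20  # the standing 20% from the text


def non_feature_share(tickets: list[tuple[str, int]]) -> float:
    """Fraction of planned points that goes to non-feature work."""
    total = sum(points for _, points in tickets)
    if total == 0:
        return 0.0
    reserved = sum(points for kind, points in tickets if kind in NON_FEATURE)
    return reserved / total


def meets_reservation(tickets: list[tuple[str, int]]) -> bool:
    """True when the plan honors the reserved share."""
    return non_feature_share(tickets) >= RESERVED_SHARE


# Hypothetical sprint plan: (category, story points).
sprint = [
    ("feature", 13),
    ("feature", 8),
    ("performance", 3),
    ("debt", 2),
    ("feature", 5),
]

share = non_feature_share(sprint)
print(f"Non-feature share: {share:.0%} (target {RESERVED_SHARE:.0%})")
print("Reservation met" if meets_reservation(sprint) else "Reservation NOT met")
```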

Use AI as an accelerator, not an autopilot. The right model for AI in engineering is "you still drive, the AI just makes the car faster." Not "the AI drives, you nap in the back seat." The former produces leverage. The latter produces accidents.

Speak up. If you're an engineer watching this happen, name it. If you're a manager seeing the cost build, surface it. The productivity illusion only persists as long as everyone agrees to pretend the dashboard is the truth. Once enough people inside the org start saying out loud that the dashboard is lying, the conversation has to start.

There's a bright light at the end of this tunnel only if we choose to find it. The tools aren't the problem. The story we're telling ourselves about what the tools mean, that's the problem. And stories can be rewritten, once enough people inside them start telling a different one.

Thanks for reading ✦
