Your engineering org shipped more code last quarter than any quarter in its history. Your dashboards are green. Your velocity charts point up and to the right. And your most expensive engineers are quietly drowning. That is not a contradiction. It is the whole story, and almost nobody on the executive floor is reading it correctly.
Here is the signal hiding under the celebration. GitClear analyzed 211 million changed lines of code authored between 2020 and 2024 across repositories owned by Google, Microsoft, Meta, and enterprise C-corps. The finding: the share of code lines associated with refactoring (the work of consolidating and reusing what already exists) collapsed from 25 percent of changed lines in 2021 to under 10 percent in 2024, while copy-pasted (cloned) lines climbed from 8.3 percent to 12.3 percent over the same window. For the first time on record, copy-pasted code surpassed moved code. Duplicate code blocks rose roughly eightfold.
Translate that out of metrics and into money. AI did not pay down your technical debt. It opened a credit card with no limit and started financing more debt at machine speed. The minimum payment has not come due yet. When it does, it lands on the people you can least afford to lose.
Velocity Is Not Progress. It Is a Loan.
Every executive who bought AI coding tools bought a velocity story: more lines, more pull requests, faster tickets. The velocity is real. So is the catch. Code churn, the share of lines reverted or rewritten within two weeks of being committed, climbed from 5.5 percent of new code in 2020 to a projected 7.9 percent in 2024. Code that gets thrown away inside a fortnight was never progress. It was rework you paid for twice: once to write it, once to undo it.
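The churn definition is simple enough to turn into arithmetic. Here is a minimal sketch, assuming a hypothetical record for each new line that holds its commit date and the date it was later reverted or rewritten, if ever. This is not GitClear's methodology; the record shape, the names, and the sample dates are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical record: one entry per line of new code.
@dataclass
class LineRecord:
    committed: date
    revised: Optional[date] = None  # None means the line was never reverted or rewritten

CHURN_WINDOW = timedelta(days=14)  # "within two weeks", per the definition above

def churn_rate(lines: list[LineRecord]) -> float:
    """Share of new lines reverted or rewritten within the churn window."""
    if not lines:
        return 0.0
    churned = sum(
        1 for ln in lines
        if ln.revised is not None and ln.revised - ln.committed <= CHURN_WINDOW
    )
    return churned / len(lines)

# Made-up sample data, for illustration only.
sample = [
    LineRecord(date(2024, 3, 1), date(2024, 3, 5)),   # revised after 4 days: churn
    LineRecord(date(2024, 3, 1), date(2024, 4, 20)),  # revised later, outside the window
    LineRecord(date(2024, 3, 2)),                     # never revised
    LineRecord(date(2024, 3, 3), date(2024, 3, 17)),  # revised at exactly 14 days: churn
]
print(f"churn rate: {churn_rate(sample):.1%}")
```

Two of the four sample lines fall inside the window, so the sketch reports half the sample as churn. On a real repository the inputs would come from commit history rather than a hand-built list.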
This is what makes the credit-card framing exact. A loan lets you consume now and pay later, with interest. AI-assisted output lets you ship now and maintain later, with compounding interest in the form of duplication, churn, and a shrinking base of reusable code. The bill is denominated in senior-engineer hours, the scarcest currency you have.
The maintenance burden was already brutal before any of this. Stripe's developer research found the average developer spends more than 17 hours a week on maintenance issues like debugging and refactoring, with close to four of those hours spent specifically fixing bad code, an opportunity cost Stripe pegged near $85 billion annually worldwide. Now multiply a pre-AI maintenance load against a codebase generating duplicates eightfold faster. The interest rate just went up.
The Defect Math Nobody Put in the Business Case
Duplication is not a style preference. It is a defect multiplier with a measured rate. Empirical studies find that cloned code is modified more frequently and carries more defects than non-cloned code, and clone-focused research has documented bug-propagation rates in the high teens, with one study finding that around 17 to 18 percent of buggy clones propagate their defects. The mechanism is simple and merciless: when you copy a block instead of moving or referencing it, every latent bug in that block now lives in N places. Fix one, miss the rest, and you have shipped an inconsistent-change defect, the most expensive kind to trace.
So the eightfold rise in duplicate blocks is not an aesthetic regression. It is a forward-loaded defect liability that has not detonated yet because the duplicates are young. Clones get dangerous when the original logic changes and the copies do not. That gap widens over months, which is exactly why the 2024 cohort of AI-assisted code has not yet shown its full defect bill.
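The "N places" problem is easier to see once the copies are counted. Here is a minimal sketch, assuming a hypothetical detector that hashes every window of consecutive non-blank lines and reports the windows that appear in more than one location. Real clone detectors are far more sophisticated; the window size, the whitespace-only normalization, and the file names are illustrative assumptions.

```python
import hashlib
from collections import defaultdict

WINDOW = 3  # hypothetical minimum block size, in lines

def normalize(line: str) -> str:
    """Collapse whitespace so formatting differences do not hide a clone."""
    return " ".join(line.split())

def find_duplicate_blocks(files: dict[str, str], window: int = WINDOW):
    """Map each repeated block's hash to the (file, start_line) places it appears."""
    seen = defaultdict(list)
    for path, text in files.items():
        # Keep original 1-based line numbers, but skip blank lines.
        numbered = [
            (i, normalize(raw))
            for i, raw in enumerate(text.splitlines(), start=1)
            if raw.strip()
        ]
        for k in range(len(numbered) - window + 1):
            chunk = numbered[k:k + window]
            body = "\n".join(line for _, line in chunk)
            digest = hashlib.sha256(body.encode()).hexdigest()
            seen[digest].append((path, chunk[0][0]))
    # Only blocks found in more than one place count as duplicates.
    return {h: locs for h, locs in seen.items() if len(locs) > 1}

# Made-up files, for illustration only: the same tax logic pasted twice.
files = {
    "billing.py": "def tax(x):\n    rate = 0.2\n    return x * rate\n",
    "invoice.py": "import os\n\ndef tax(x):\n  rate = 0.2\n  return x * rate\n",
}
dupes = find_duplicate_blocks(files)
for locations in dupes.values():
    print(locations)
```

Every location in a reported list is a place where the same latent bug lives. A fix that reaches only one of them is the inconsistent-change defect described above.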
Velocity Up, Quality Down: The Scorecard
Put the trend lines side by side and the pattern stops being ambiguous. Here is the GitClear data, pre-AI baseline against the AI-saturated present.
| Metric | 2020-2021 baseline | 2024 | Direction |
|---|---|---|---|
| Refactored / moved lines (code reuse) | ~25% | under 10% | Down sharply |
| Copy-pasted (cloned) lines | 8.3% | 12.3% | Up |
| Code churn (revised within 2 weeks) | 5.5% | ~7.9% | Up |
| Duplicate code blocks (frequency) | baseline | ~8x | Up steeply |
Every row that should fall is rising. The one row that should rise, reuse, is in free fall. A scorecard like this on a financial statement would trigger an audit. On an engineering dashboard it gets celebrated as throughput.
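The rows read differently depending on whether you look at percentage points or relative change. Here is a small sketch of that arithmetic, using only the figures in the table. Treating "under 10%" as exactly 10 is an assumption, so the reuse decline it computes is a floor, not an exact value.

```python
# Figures taken from the scorecard above: (baseline %, 2024 %).
scorecard = {
    "refactored / moved lines": (25.0, 10.0),  # "under 10%" treated as 10 (assumption)
    "copy-pasted lines": (8.3, 12.3),
    "two-week churn": (5.5, 7.9),
}

def relative_change(before: float, after: float) -> float:
    """Change expressed as a percentage of the baseline value."""
    return (after - before) / before * 100.0

for metric, (before, after) in scorecard.items():
    points = after - before
    relative = relative_change(before, after)
    print(f"{metric}: {points:+.1f} points, {relative:+.1f}% relative")
```

On these inputs, a 15-point drop in reuse is a relative decline of 60 percent, and because the 2024 figure is "under" 10 percent, the true decline is at least that large. The 4-point rise in cloned lines is close to a 50 percent relative increase.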
The Delivery Data Agrees, and That Should Worry You
If this were one vendor's report, you could wave it off. It is not. Google's 2024 DORA report, built on roughly 3,000 respondents and a decade of delivery research, found that every 25 percent increase in AI adoption was associated with an estimated 1.5 percent decrease in delivery throughput and a 7.2 percent reduction in delivery stability. Read that twice. The tool sold on speed correlated with less stable delivery. DORA's own framing is that AI helps individual and team conditions while pressuring the delivery outcomes that actually matter to customers, unless teams hold the line on small batch sizes and serious testing.
Then there is the uncomfortable controlled experiment. METR ran a randomized trial with 16 experienced open-source developers across 246 tasks in repositories they knew well. The developers predicted AI would make them about 24 percent faster. They reported feeling roughly 20 percent faster. The measured result: they were 19 percent slower with AI than without it. The gap between perceived speedup and actual slowdown is the single most important number for any executive funding this. Your teams feel faster. The instruments say otherwise. You are managing to the feeling.
Your Own Engineers Stopped Believing the Hype
The people closest to the code are already pricing in the risk, even as adoption climbs. Stack Overflow's 2025 Developer Survey found that while AI use is near-universal, trust in AI accuracy fell to 29 percent, down from 40 percent the year before, and more developers now actively distrust AI output (46 percent) than trust it (33 percent). The top frustration, cited by 45 percent of respondents, is AI solutions that are almost right but not quite, the precise failure mode that makes debugging slower, not faster.
And note who distrusts it most. The most experienced developers report the lowest trust and the highest active distrust. These are your senior engineers, the people with the context to spot an almost-right answer and the accountability to fix it. They are not Luddites. They are the early-warning system, and the system is flashing.
Why the Bill Lands on Your Best People
Here is the part the business case never modeled. AI lowers the cost of producing a plausible-looking line of code close to zero. It does not lower the cost of understanding code, reconciling duplicated logic, or safely changing a system. That work still requires senior judgment, and there is no AI discount on it.
So the flow goes one direction. Junior and mid-level engineers, plus the AI itself, generate volume. Senior engineers absorb the consequences:
- Review load. More pull requests, each requiring a human who can tell almost-right from right. That filter is your senior bench, and it does not scale linearly.
- Duplication archaeology. When a cloned block needs a fix, someone has to find every copy. That someone has the whole-system map in their head, and it is not the AI.
- Churn cleanup. Code reverted within two weeks still consumed review, testing, and merge cycles. Senior engineers eat that waste.
- Trust arbitration. Every almost-right suggestion is a judgment call. Volume of calls scales with output. Capacity to make them does not.
You are converting your most leveraged, hardest-to-replace people into a maintenance layer for machine-generated debt. That is not a productivity gain. It is a misallocation of your highest-value capital, dressed up as one.
The Macro Bill Is Already Enormous
Zoom out and the stakes get concrete. CISQ's report on the cost of poor software quality in the US put the 2022 figure at $2.41 trillion, with accumulated software technical debt alone estimated near $1.52 trillion and named the single largest obstacle to changing existing codebases. That figure was tallied before AI assistants began industrializing duplication. The trajectory of GitClear's metrics says the 2026 successor to that number will not be smaller.
This is the strategic miss. Leaders bought AI tools to reduce engineering cost. The credible read of the data is that, managed carelessly, the tools increase the most durable engineering cost there is: the cost of owning and changing software over its life. You did not cut the bill. You moved it off this quarter's ledger and onto a future one, with interest.
The Security Bill Hiding Inside the Debt
Debt is bad enough when it just slows you down. It gets worse when it ships with a breach attached. The same engine that floods your codebase with duplicated, unreviewed code is also writing insecure code at a rate no human team would tolerate.
Veracode tested more than 100 large language models against 80 curated coding tasks and found that 45% of AI-generated code samples failed security tests and introduced OWASP Top 10 vulnerabilities. When the model had a choice between a secure and an insecure way to solve the problem, it picked the insecure one nearly half the time. The detail that should end the "we'll just use a better model" argument: security performance stayed flat regardless of model size or recency, even as functional correctness improved. Newer and larger models wrote code that worked more often and was no safer. Java was the worst offender at a security failure rate above 70%, with Python, C#, and JavaScript landing between 38% and 45%.
The human side is more uncomfortable. In a controlled Stanford study, participants with access to an AI assistant wrote significantly less secure code than those without one, and were more likely to believe their code was secure. Less safe, more confident. That combination is how vulnerabilities get merged without a second look. The researchers also found that participants who trusted the AI less and reworked their prompts produced safer code, which tells you exactly where the risk concentrates: in the developers who treat the output as finished.
The industry behavior confirms the exposure. Snyk surveyed more than 500 practitioners and found that 56.4% said insecure AI suggestions were common, yet over 75% believed AI code was more secure than human code. The operational gap is the punchline: nearly 80% of developers admitted bypassing security policies for AI code, and only about 10% scanned most of it.
Now stack this on the duplication problem from earlier. A hardcoded secret or an injection flaw in a copy-pasted block does not stay in one place. It propagates everywhere the block was cloned. You are not carrying one vulnerability. You are carrying it with interest, replicated across every file that inherited the pattern. Insecure duplicated code is not technical debt in the ordinary sense. It is debt with a breach attached, and the collateral is your customer data.
Who Actually Gains, and Who Pays the Interest
Strip away the aggregate productivity numbers and a more awkward pattern appears: the benefit and the bill go to different people.
An NBER controlled experiment found Copilot users completed a programming task 55.8% faster, and that less experienced programmers benefited the most from the tool. The largest field study agrees. Across 1,974 developers at Microsoft and Accenture, the MIT-led experiment found developers completed 12.92% to 21.83% more pull requests per week, with less experienced developers showing higher adoption and greater gains. The framing in those papers is optimistic: AI helps people break into software development.
Read the same finding as a CFO and it inverts. If your largest speed gains land with your most junior engineers, then AI is an amplifier pointed at the people who produce the most volume and exercise the least judgment about what should not be written. They ship more pull requests, more new code, more cloned blocks. Meanwhile the experienced engineers, the expensive ones, get no comparable lift. The METR result referenced earlier showed seasoned developers on familiar codebases actually slowed down with AI assistance. So the productivity curve tilts toward juniors generating output and away from seniors generating leverage.
Then the bill arrives. Someone has to review the flood, trace which duplicated block is the real one, and clean up the insecure patterns the juniors trusted and the policies they bypassed. That work does not go to the junior who wrote it. It goes to the senior who can tell a subtle defect from a correct line. You have engineered a system where your cheapest capital creates the work and your most expensive capital absorbs it. That is the exact inversion of how you want a senior engineer's hours spent. The productivity gain is real. It just shows up on one team's dashboard and the cost shows up on another's calendar.
How to Stop Financing the Debt
None of this is an argument against AI in engineering. It is an argument against measuring AI by output and ignoring the interest. Operators who get this right will run the same tools and get the opposite outcome. The difference is governance.
- Measure debt, not volume. Put duplication rate, two-week churn, and the moved-versus-copied ratio on the same dashboard as velocity. If churn and duplication are climbing, your velocity number is a loan, not a win. Tools like GitClear exist precisely to make this visible.
- Make refactoring a funded line item. The fall in code reuse from 25 percent of changed lines to under 10 percent, a relative collapse of roughly 60 percent, is a budget signal. Reuse does not happen for free when the cheapest path is to generate a new copy. Fund consolidation explicitly or watch the duplicates compound.
- Protect senior capacity as scarce capital. Cap review queues. Track the share of senior time spent on machine-generated cleanup. If it is rising, you are mispricing your most valuable people.
- Hold the delivery basics. DORA is blunt: AI does not wreck stability on its own. Skipping small batches and serious testing does. Test coverage and tight feedback loops are how you keep the interest rate low.
- Treat almost-right as a defect class. Your engineers already named the top failure mode. Build review standards that assume AI output is plausible until proven correct, not correct until proven wrong.
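The first rule in the list above can be sketched as a dashboard check. This assumes hypothetical per-quarter metrics already exported from whatever tooling you use. The field names, the "both rising" rule, and the sample numbers are illustrative assumptions, not a standard and not any vendor's API.

```python
from dataclasses import dataclass

@dataclass
class QuarterMetrics:
    # All fields are hypothetical names for numbers your tooling would supply.
    merged_prs: int
    duplication_rate: float  # share of changed lines that are copy-pasted
    churn_rate: float        # share of new lines revised within two weeks
    moved_lines: int
    copied_lines: int

    @property
    def moved_to_copied(self) -> float:
        """Moved-versus-copied ratio; above 1.0 means reuse outpaces cloning."""
        if self.copied_lines == 0:
            return float("inf")
        return self.moved_lines / self.copied_lines

def classify_velocity(prev: QuarterMetrics, curr: QuarterMetrics) -> str:
    """Label a velocity gain as a 'win' or a 'loan' using the rule of thumb above."""
    velocity_up = curr.merged_prs > prev.merged_prs
    debt_up = (
        curr.duplication_rate > prev.duplication_rate
        and curr.churn_rate > prev.churn_rate
    )
    if velocity_up and debt_up:
        return "loan"
    if velocity_up:
        return "win"
    return "flat"

# Made-up quarters, for illustration only.
q1 = QuarterMetrics(merged_prs=400, duplication_rate=0.083, churn_rate=0.055,
                    moved_lines=2500, copied_lines=830)
q2 = QuarterMetrics(merged_prs=520, duplication_rate=0.123, churn_rate=0.079,
                    moved_lines=1000, copied_lines=1230)
print(classify_velocity(q1, q2), round(q2.moved_to_copied, 2))
```

In the sample, pull requests rise while duplication and churn rise with them, so the quarter is labeled a loan, and the moved-to-copied ratio falls below 1.0. The thresholds are deliberately crude. The point is that the debt metrics sit in the same function as the velocity metric, so one cannot be reported without the others.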
The teams that win the next 24 months will not be the ones that shipped the most AI-generated lines. They will be the ones that read the interest statement early, kept their reuse ratio high, and refused to let a green velocity chart hide a red balance sheet. Output was never the constraint. Ownership of what you shipped always was. AI did not change that law. It just made it far more expensive to ignore.
Strategia-X helps operators turn AI velocity into durable engineering leverage instead of compounding debt. Start the conversation at strategia-x.com.
-Rocky