DORA 2025: More AI, More Code, Flatter Delivery

Ninety percent of developers use AI daily, yet organizational delivery metrics have barely moved. The 2025 DORA report explains why: AI amplifies what is already there, making strong teams stronger and dysfunctional ones more chaotic. Here is what the data actually shows.

Introduction

Ninety percent of software developers now use AI tools daily — and yet their organizations’ delivery performance has barely moved. That is the central and somewhat uncomfortable finding of the 2025 DORA State of AI-assisted Software Development report, published by Google Cloud in July 2025. After surveying nearly 5,000 technology professionals and conducting over 100 hours of qualitative interviews, the DORA researchers arrived at a conclusion that cuts against the prevailing enthusiasm: AI accelerates individuals, but it does not automatically improve the system.

What the Numbers Actually Show

The adoption figures are remarkable. Roughly 90% of developers report using AI in their daily work, with 65% saying they are heavily reliant on it. The average respondent had been using AI tools for 16 months by the time of the survey — long enough to move past novelty and into genuine integration. Seventy-one percent use AI to write new code; the majority also use it for documentation, debugging, and exploring unfamiliar frameworks.

At the individual level, the productivity gains are real and measurable. Faros AI, one of the DORA report’s research partners, analyzed telemetry from over 10,000 developers and found a 21% increase in tasks completed, 98% more pull requests merged, and 47% more pull requests handled per developer per day. Over 80% of respondents believe AI has made them more productive, and 59% say it has improved code quality.

Those numbers sound like a success story. The problem shows up when you zoom out to the team and organizational level.

Why Delivery Metrics Stay Flat

The classic DORA metrics — lead time, deployment frequency, change failure rate, and mean time to restore — have not improved in proportion to the surge in individual output. In some dimensions they have gotten worse. Pull request sizes have grown by 154% on AI-assisted teams, and code review time has increased by 91%. Bug rates have climbed 9%. The speed at which individuals write code has outrun the organizational capacity to review, test, and safely ship it.
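For readers who have not computed these themselves, here is a minimal sketch of how the four DORA metrics fall out of deployment records. The data shape is hypothetical (a list of dicts with commit, deploy, and restore timestamps); real pipelines would pull these from CI/CD and incident tooling.

```python
from datetime import datetime

# Hypothetical deployment records: commit time, time it reached production,
# whether the change failed in production, and (for failures) restore time.
deploys = [
    {"committed": datetime(2025, 7, 1, 9),  "deployed": datetime(2025, 7, 1, 15), "failed": False, "restored": None},
    {"committed": datetime(2025, 7, 2, 10), "deployed": datetime(2025, 7, 3, 11), "failed": True,  "restored": datetime(2025, 7, 3, 13)},
    {"committed": datetime(2025, 7, 4, 8),  "deployed": datetime(2025, 7, 4, 20), "failed": False, "restored": None},
]
window_days = 7  # length of the observation window

# Lead time for changes: commit -> running in production (mean, in hours).
lead_time_hours = sum(
    (d["deployed"] - d["committed"]).total_seconds() / 3600 for d in deploys
) / len(deploys)

# Deployment frequency: deploys per day over the window.
deploy_frequency = len(deploys) / window_days

# Change failure rate: share of deploys that caused a production failure.
failures = [d for d in deploys if d["failed"]]
change_failure_rate = len(failures) / len(deploys)

# Mean time to restore: failure hitting production -> service restored.
mttr_hours = sum(
    (f["restored"] - f["deployed"]).total_seconds() / 3600 for f in failures
) / len(failures)

print(f"lead time: {lead_time_hours:.1f} h, deploys/day: {deploy_frequency:.2f}, "
      f"CFR: {change_failure_rate:.0%}, MTTR: {mttr_hours:.1f} h")
```

The point of the sketch is that all four metrics are properties of the delivery system, not of any individual contributor — which is why they can stay flat while per-developer output surges.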

This maps directly to what the Harness State of AI in Software Engineering report called the “AI Velocity Paradox”: coding is only about 15% of the total work involved in shipping software. The other 85% — review, testing, security scanning, compliance, deployment — remains largely manual. When you accelerate the 15% without changing the 85%, you create a pile-up. The DORA data confirms it: AI adoption now correlates with increased delivery instability, more change failures, and longer cycle times to recover when things go wrong. We covered the shape of this problem earlier this month in AI Velocity Paradox: More Code, More Bottlenecks.
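The arithmetic behind the pile-up is just Amdahl's law applied to delivery work. Taking the Harness figure above at face value (coding is ~15% of end-to-end work), even a dramatic speedup on coding alone barely moves the whole pipeline:

```python
def effective_speedup(accelerated_fraction: float, local_speedup: float) -> float:
    """Amdahl's-law estimate: overall speedup when only a fraction of the
    end-to-end work is accelerated and the rest is unchanged."""
    remaining = 1 - accelerated_fraction
    return 1 / (remaining + accelerated_fraction / local_speedup)

# If AI doubles coding speed but coding is only 15% of delivery work:
print(effective_speedup(0.15, 2.0))   # ~1.08, i.e. about 8% faster end to end
# Even with effectively instant coding, the ceiling is modest:
print(effective_speedup(0.15, 1e9))   # ~1.18
```

The real situation is worse than this model suggests, because the un-accelerated 85% does not merely stay constant — review and testing load grows with the extra code volume.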

The DORA team’s framing for this dynamic is pointed: AI functions as an amplifier. It magnifies whatever is already true about an organization. Teams with mature engineering practices convert the individual productivity gains into organizational throughput improvements. Teams with fragmented tooling, weak processes, or unclear ownership accelerate the production of technical debt instead. As the report puts it, AI “magnifies the strengths of high-performing organizations and the dysfunctions of struggling ones.”

The Seven Capabilities That Separate Winners

The DORA team did not stop at diagnosing the problem. They identified seven organizational capabilities that predict whether AI investments actually translate into delivery improvements, publishing them in the companion DORA AI Capabilities Model.

The first is a clear, communicated AI stance — organizations that have articulated what AI is and is not for see better outcomes than those that leave adoption to individual discretion. The second is a healthy data ecosystem: AI tools are only as useful as the internal information they can access. Organizations where documentation is siloed, inconsistent, or out of date find that AI confidently retrieves wrong answers.

The third capability — AI-accessible internal repositories — is closely related: code, runbooks, and architectural decisions need to be structured so that AI can actually navigate them. The fourth and fifth capabilities are fundamentals that predate AI entirely: strong version control practices and working in small batches. The DORA data shows that teams ignoring these basics do not recover them by adding AI; they compound the problem.

The sixth capability is a user-centric focus, which the researchers flag as particularly consequential: teams without it see negative AI impacts, not neutral ones. The seventh is quality internal platforms — standardized developer toolchains that reduce cognitive overhead and make AI assistance predictable rather than chaotic.

Seven Archetypes, Not Four Performance Tiers

One structural change in this year’s report is worth noting. DORA has retired the traditional four-tier performance classification (Elite, High, Medium, Low) and replaced it with seven team archetypes: Foundational Challenges, Legacy Bottleneck, Constrained by Process, High Impact Low Cadence, Stable and Methodical, Pragmatic Performers, and Harmonious High-Achievers. The shift acknowledges that organizations do not fail or succeed on a single axis. A team can be technically strong but organizationally slow, or fast but fragile. The new archetypes make it easier to diagnose specifically where AI is likely to help and where it is likely to cause harm.

This has practical implications for how engineering leaders should frame their AI strategy. The question is not “how do we adopt more AI?” but “which archetype are we, and which of the seven capabilities are we missing?”

The Trust Gap and the Stability Problem

Despite the broad adoption numbers, 30% of developers report low confidence in AI-generated code quality. That skepticism is grounded: the DORA data shows AI adoption correlating with increased software delivery instability even as throughput rises. More code reaching review does not mean better code reaching production. A separate Harness survey found that 72% of organizations had experienced at least one production incident caused by AI-generated code. The DORA team’s conclusion is that organizations need faster feedback loops and stronger safety mechanisms — not fewer, not the same ones running slower — to handle the higher volumes that AI-assisted teams produce.

Notably, the report finds no correlation between AI adoption and increased burnout or developer friction. The instability problem is organizational, not psychological. Developers are generally fine. Their delivery systems are under strain. This echoes what earlier research on experienced developers found: the tools may slow certain practitioners down not because the tools are bad, but because the surrounding systems were not designed for the volume they now produce.

What Engineering Leaders Should Actually Do

The DORA report is not an argument against AI adoption — 90% adoption has already made that debate moot. It is an argument for deliberate adoption. Three practical takeaways stand out from the data. First, measure the whole system, not just developer output: if your PR review times are ballooning and your change failure rate is climbing, adding more AI coding assistance will make both worse. Second, audit your seven capabilities before scaling AI investment; the organizations seeing the best returns are not the ones with the most AI tools, they are the ones with the strongest foundations. Third, treat documentation and internal knowledge structure as first-class infrastructure — AI tools that cannot find accurate internal context will hallucinate your architecture.

The research partners behind this report include GitHub (whose tools are used by 90% of the Fortune 100) and GitLab (used by over 50% of the Fortune 100), so the sample is not skewed toward early adopters. These are production systems, mature organizations, and real delivery pipelines. The flat metrics are not a failure of imagination. They are a signal about where the work actually needs to happen.

Conclusion

The 2025 DORA report is the clearest evidence yet that AI is not a delivery strategy — it is a capability multiplier. Organizations that use it as a shortcut to skip the hard work of engineering culture, process design, and technical foundations will find their problems running faster. Those that treat it as one layer on top of solid ground will see compounding returns. The next wave of AI tooling will likely make individuals even more productive; the question for 2026 is whether organizations can build the review, testing, and deployment infrastructure to match.


