Written by Matt Hogan
If the first article in this series was about the broader strategic frame, this one is about where most organizations actually encounter AI first: engineering.
That is not surprising. Software development is unusually well suited to this moment. It is digital, highly structured in places, rich in patterns, and full of repetitive tasks that can be accelerated without eliminating the need for judgment. If you were designing a function in which AI would show visible returns early, engineering would be near the top of the list.
That, however, is also what makes it easy to misunderstand.
The most common interpretation of AI in engineering is still too narrow. It tends to focus on code generation and then draws a straight line from faster coding to fewer engineers. That framing is superficially compelling, but it misses the more important shift. What AI is changing first is not the need for engineering. It is the shape of engineering work itself.
The distinction matters.
For years, the industry has treated coding as if it were the center of software development. In practice, it never really was. Code matters, of course, but the actual work of engineering has always been broader than implementation. It includes understanding systems, tracing dependencies, designing changes, reasoning about risk, deciding where patterns should hold and where they should break, debugging failures that emerge only under real conditions, and maintaining coherence over time as systems evolve. The act of writing code is only one part of that larger whole.
What AI has done is compress part of that whole — often dramatically — while leaving the rest intact.
That is why the early results are both real and easy to overread. McKinsey’s work showed that on some common software tasks, developers can complete work up to twice as fast with generative AI, particularly on activities like documentation, code generation, and refactoring. Google’s enterprise randomized controlled trial, which is more useful than many smaller demonstrations because it studied a complex enterprise-grade task rather than a toy exercise, found developers about 21% faster with AI assistance. DORA’s 2025 report, drawing on a survey of nearly 5,000 technology professionals and more than 100 hours of qualitative research, found both widespread adoption and broad productivity gains, but its most important conclusion was not the size of the gains. It was that AI acts primarily as an amplifier. It strengthens the systems that are already functioning well and exposes the weaknesses of the ones that are not.
That last point is the one I think matters most.
If AI were simply a coding accelerator, then the primary question would be tool choice: which assistant, which model, which interface, which license. Those questions matter, but only at the surface. The more consequential issue is what happens when implementation gets faster but the surrounding system does not. If engineers can produce code more quickly, but requirements remain fuzzy, testing remains weak, review remains overloaded, and context remains scattered across repositories, tickets, chat threads, and people’s heads, then some of the local gain will be real but much of the system-level gain will be lost. Atlassian’s recent developer-experience work is useful here. It found that many developers report saving more than ten hours per week with generative AI, yet much of that reclaimed time is offset by broader organizational inefficiencies that still shape how work moves. The implication is hard to ignore: coding speed is not the same thing as engineering throughput.
This is why I think the right way to understand Pillar 1 is not as “AI for coding,” but as AI-assisted engineering.
That phrase sounds broader because it is broader. It includes code generation, but it also includes implementation planning, test generation, self-review, documentation, and support in unfamiliar parts of the codebase. More importantly, it changes the engineering loop itself. Instead of beginning with a blank editor and moving linearly toward a PR, the loop increasingly becomes: understand, analyze, plan, generate, refine, test, review, and then submit. AI has a role in each of those stages, but the engineer’s role does not disappear. It shifts upward. Less time is spent on boilerplate or reconstructing common patterns. More time is spent deciding whether the generated solution is actually the right one, whether the edge cases matter, whether the tests are meaningful, and whether the change fits the architecture rather than merely satisfying the ticket.
That shift is why the recurring “one engineer can now build an application” narrative is both directionally true and strategically incomplete.
There are contexts in which a single engineer, working with strong tools, can do something today that would have taken a small team not long ago. That is real. It is also most visible in bounded settings: internal tools, prototypes, greenfield applications, and work where implementation is the dominant constraint. But most real engineering organizations are not operating in those conditions all the time. They are dealing with existing systems, multiple services, legacy assumptions, operational obligations, security constraints, and the accumulated complexity of software that has been alive long enough to matter. In those conditions, the bottleneck is not simply code production. It is coordination, correctness, and continuity.
The emerging academic evidence points in the same direction. A recent Science paper on the global diffusion of AI coding assistants found measurable productivity benefits associated with AI-assisted coding, but those benefits were concentrated among senior developers, while early-career developers showed no statistically significant gains. Another recent randomized study of experienced open-source developers working in mature repositories found something even more useful as a caution: in that setting, AI actually increased completion time by 19%. Neither of these studies invalidates the broader productivity story. What they do is force the more interesting conclusion: gains are real, but they are heterogeneous. They depend on codebase maturity, workflow design, user experience, and the degree to which the person using the tool has the context and judgment to direct it effectively.
That is one reason I do not think the right executive response is to push engineering teams toward a simplistic mandate to “use AI more.” A better response is to standardize the conditions under which AI actually produces leverage.
At Liferaft, that means distinguishing between two modes of use. One is inline assistance: quick help while coding, small implementations, localized refactors, scaffolding, and test generation. The other is broader task-oriented execution: multi-file changes, ambiguous work, larger features, migrations, non-obvious debugging, and planning. In practical terms, that is why I think the pairing makes sense: Copilot for inline assistance in the IDE, and Codex, used CLI-first, for task-oriented execution. The point is not that these are the only tools that matter. The point is that they reflect two very different kinds of engineering work, and that work benefits from different modes of interaction.
What matters even more than the tools, though, is the workflow around them.
For small tasks, the engineering loop can remain lightweight. Use AI inline, move quickly, generate or extend tests, ask for a quick self-review, then submit. For larger or less bounded work, the workflow should be more deliberate: begin with analysis, ask for a plan before asking for code, review that plan, generate a first pass, refine locally, expand tests, run checks, inspect the diff, and then ask AI to critique the change before it reaches another human reviewer. This is not process for its own sake. It is how you convert an assistant from a faster autocomplete engine into something closer to a productivity multiplier.
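To make the deliberate loop concrete, it can be sketched as an ordered sequence of stages, each gated by either the assistant or the engineer. This is an illustrative model only, assuming nothing about any particular tool: the stage names and the `checkpoints` helper are hypothetical, not part of a real assistant's API.

```python
# Hypothetical sketch of the deliberate loop for larger or less bounded work.
# Stage names are illustrative; "ai" marks assistant-driven stages,
# "human" marks the checkpoints where the engineer's judgment gates progress.

DELIBERATE_LOOP = [
    ("analyze",       "ai"),     # understand the system before touching it
    ("plan",          "ai"),     # ask for a plan before asking for code
    ("review_plan",   "human"),  # engineer approves or corrects the plan
    ("generate",      "ai"),     # first-pass implementation
    ("refine",        "human"),  # local edits and judgment calls
    ("expand_tests",  "ai"),     # broaden coverage beyond the happy path
    ("run_checks",    "human"),  # tests and checks run before review
    ("inspect_diff",  "human"),  # read the change as a reviewer would
    ("self_critique", "ai"),     # AI critiques the change pre-review
    ("submit",        "human"),  # accountability stays with the engineer
]

def checkpoints(loop):
    """Return the stages where a human decision gates progress."""
    return [stage for stage, actor in loop if actor == "human"]
```

The structural point of the sketch is simply that human checkpoints are interleaved throughout the loop rather than concentrated at the end, which is what separates a deliberate workflow from a faster autocomplete.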
That distinction is important because there is now enough evidence to show that adoption alone does not produce durable gains. DORA’s AI Capabilities Model makes that point explicitly: organizations realize more value when AI adoption is accompanied by stronger systems, clearer workflows, and more mature engineering practices. In other words, AI does not rescue weak engineering discipline. It rewards strong engineering discipline.
This is also why I do not think AI in engineering should be framed defensively — either by leaders or by engineers themselves.
The wrong way to talk about it is as if the craft is being hollowed out, or as if the core skill of the profession is about to disappear. The more accurate interpretation is that some forms of labor are becoming less scarce, while some forms of judgment are becoming more valuable. When a draft can be generated quickly, the quality of the outcome depends even more on the quality of the person steering it. Architecture matters more. Review matters more. The ability to reason about failure modes matters more. Taste matters more. Knowing when not to accept the obvious solution matters more.
If that sounds familiar, it should. Most technological shifts do not erase expertise. They move its center of gravity.
That is what I think Pillar 1 really is.
It is not the end state. It is the beginning of a broader change in how engineering work is organized. But it matters because it is where the organization learns the first real lessons: how to distinguish speed from throughput, how to redesign workflows rather than just adding tools, how to preserve accountability while increasing leverage, and how to make stronger engineers more effective rather than merely making output more abundant.
That is also why I think the right expectation is not magic.
The best available evidence supports real gains, but not fantasy. In most organizations, a well-executed Pillar 1 should be thought of as a meaningful productivity and quality initiative, not as a promise that one person now replaces many. The real payoff is that engineering teams can spend less time on low-leverage work, move faster through routine implementation, strengthen tests and self-review, and apply more of their attention to the parts of software development that were always hardest to automate in the first place.
That is already significant.
And it is only the first pillar.