With an eye on the huge downstream pressure that AI code tools are putting on software engineering teams, GitLab released its AI Accountability Report to assess which direction the “industry conversation” is moving in.
The narrative now appears to have shifted from how quickly teams can generate code to whether they can actually control what they ship.
The Harris Poll for GitLab surveyed 1,528 developers and technology buyers across six countries. Some 91% of organizations have two or more AI coding tools in active use, and 78% report that developers are writing and committing code faster since adopting AI tools. But speed is outpacing control, with 43% of respondents reporting they cannot reliably distinguish AI-generated code from human-written code in their own codebase.
Manav Khurana, chief product and marketing officer at GitLab, says the study sheds light on a governance gap opening up due to the sheer volume of code now being produced.
The AI Code Review Bottleneck
“AI has shifted the bottleneck from writing code to reviewing it — 85% of our survey respondents confirmed this,” Khurana says. “Developers have an increased load of validating code they didn’t write and may not fully understand. The gains from writing code faster are washed away by the lag in days-long review cycles.”
He notes that while coding itself has sped up, writing code is only one part of the software development lifecycle (SDLC): requirements come before it; review, security, testing, and deployment run alongside it; and enhancements, integrations, and maintenance follow it.
Khurana argues that the fix is an agentic infrastructure that lets the rest of the software delivery process move at the same pace as agentic coding. That requires machine-scale execution, context across the full lifecycle, governance built into the flow, and orchestration across all layers.
A Break in the AI Coding Toolchain
“Only 28% of organizations say their SDLC tools are fully integrated with shared data and workflows,” Khurana highlights. “A developer reviewing an agent-generated merge request can see who invoked the agent and what issue it was tied to. What they often can’t see without pulling from multiple systems is what security findings it touched, what policy governed it, and whether the risk it introduced was ever resolved.”
GitLab’s position is that when governance is built into the platform, code reviews run automatically according to team and company policies. All agent actions are tied to an identity, logged against a policy, and surfaced in the review flow automatically.
“The goal is to make the governance layer invisible to the developer so reviewers can focus on the decisions that require human judgment,” Khurana says.
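As a rough illustration of what "tied to an identity, logged against a policy, and surfaced in the review flow" could look like in practice, here is a minimal Python sketch. It is not GitLab's implementation or API: the AgentAction and ActionLog names, every field, and the sample identifiers are hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AgentAction:
    """One logged agent action. All field names are hypothetical."""
    agent_id: str        # identity the action is tied to
    invoked_by: str      # person who invoked the agent
    policy_id: str       # policy the action was evaluated against
    merge_request: str   # where the action should surface for review
    description: str
    policy_passed: bool
    logged_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class ActionLog:
    """Append-only, in-memory log; a real system would persist entries."""

    def __init__(self) -> None:
        self._entries: list[AgentAction] = []

    def record(self, action: AgentAction) -> None:
        self._entries.append(action)

    def review_summary(self, merge_request: str) -> list[str]:
        """Return one summary line per action logged for a merge request."""
        return [
            f"{a.agent_id} (invoked by {a.invoked_by}): {a.description} "
            f"[policy {a.policy_id}: "
            f"{'pass' if a.policy_passed else 'NEEDS REVIEW'}]"
            for a in self._entries
            if a.merge_request == merge_request
        ]


# Example: two actions on the same (made-up) merge request.
example_log = ActionLog()
example_log.record(
    AgentAction("agent-7", "alice", "POL-12", "MR-101", "bumped dependency", True)
)
example_log.record(
    AgentAction("agent-7", "alice", "POL-03", "MR-101", "edited auth check", False)
)
for line in example_log.review_summary("MR-101"):
    print(line)
```

In this sketch the reviewer sees only the summary lines, with failed policy checks flagged, which mirrors the idea of keeping routine governance out of the way while leaving judgment calls to a person.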
What GitLab Is Building
To support machine-scale agentic execution, GitLab has developed a new Git backend and interface that the company claims will “sustain millions of agent sessions reliably” at high speed. In its own testing, GitLab has recorded up to 50x faster wall-clock time and up to 1,000x less network traffic than the current generation of Git.
“We have also engineered for context,” Khurana explains. “GitLab Orbit [introduced on June 10] provides agents with a context graph connecting code, pipelines, work items, security findings, and production signals. In our testing, we’re seeing agents work up to 11x faster, require 4.5x fewer tokens, and have 45x fewer hallucinations. More notably, agents can now answer questions they previously couldn’t because they can get all the context they need with a single graph call.”
Additional governance and orchestration developments are also in progress to ensure agent actions are automatically coordinated across the SDLC according to the policies that teams define.
Three Questions to Ask About Any AI-Generated Code
The GitLab report defines AI accountability as the organizational and technical capability to answer three questions about any line of AI-generated code:
- Where did this code come from?
- What was it meant to do?
- Who is responsible for it once it’s in production?
GitLab states that most organizations cannot answer those questions today.
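One way to make the three questions concrete is to treat them as fields on a provenance record and check which ones are still empty. The Python sketch below is an assumption for illustration only: the report does not prescribe a schema, and CodeProvenance, QUESTIONS, and unanswered are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CodeProvenance:
    """Hypothetical record attached to a change; None means unknown."""
    origin: Optional[str] = None   # where did this code come from?
    intent: Optional[str] = None   # what was it meant to do?
    owner: Optional[str] = None    # who is responsible in production?


# Maps each (hypothetical) field to the question it answers.
QUESTIONS = {
    "origin": "Where did this code come from?",
    "intent": "What was it meant to do?",
    "owner": "Who is responsible for it once it's in production?",
}


def unanswered(record: CodeProvenance) -> list[str]:
    """Return the accountability questions this record cannot answer."""
    return [
        question
        for name, question in QUESTIONS.items()
        if not getattr(record, name)
    ]


# Example: origin and intent are known, but nobody owns the code yet.
partial = CodeProvenance(origin="agent session", intent="fix login timeout")
print(unanswered(partial))
```

A check like this could run before merge, so that a change with any unanswered question is held back until someone fills in the missing field.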
Khurana says that escalating costs are usually a clear signal that the governance gap is widening. When agents consume tokens inefficiently against infrastructure that wasn’t built for them, it is a sign that the context and governance layers are missing.
“Most organizations have pursued agentic software engineering by adding AI coding tools on top of the infrastructure they already have, and the problems are showing up quickly,” Khurana says. “This is where GitLab’s approach differs. GitLab is building the agentic infrastructure that other tools do not address — from execution at machine scale to context, governance, and orchestration across the software lifecycle. A coding assistant makes one developer faster — what we are doing is what makes the whole system move at machine speed without losing control of it.”
Other findings from the report reinforce the urgency: 91% of respondents say they are likely to invest in AI code governance tools in the next 12 months, and 98% have already allocated or expect to allocate budget. Additionally, 85% agree that the next phase of AI in software will focus less on generating code and more on governing it.
What This Means for Developers
Khurana points to what he calls “a maturation” in how enterprises are thinking about AI, one that, if executed well, moves AI coding from a productivity tool to a foundational capability that can scale.
That maturation has implications at the senior level for engineering project management, but it also has ramifications for junior developers just starting their careers.
“One of the skills that matters most now is judgment,” Khurana concludes. “Junior developers who invest in understanding systems deeply — not just syntax — and can trace code back through pipelines, security findings, and production signals, are the ones who make agentic engineering work.”
Agents can generate code faster than any developer, but what they cannot yet do is evaluate whether that code is right for the system and the requirements it needs to meet. Closing that judgment gap remains a distinctly human responsibility.