If you've been working with CI/CD for any significant amount of time, you're likely familiar with the frustration of waiting. You push a commit, and then you wait. You wait for the environment to spin up, you wait for dependencies to download, you wait for tests to run—often on code that hasn't even changed.
In large enterprise systems, this isn't just a nuisance—it's a massive productivity black hole.
"Based on our experience, about 50-80% of total build time is typically wasted doing redundant work." 10:46
Think about this. Up to 80% of the time our expensive cloud infrastructure is running, it's doing work it has already done before. AI agents and developers alike need constant, rapid feedback to iterate.
If a compile/test cycle takes 30 minutes, then:
- Your AI agent is idle or "hallucinating" on stale data
- Your human developer is losing productivity to context-switching and experiencing greater cognitive fatigue
For both humans and AI, slow feedback is not just costly; it erodes overall productivity. With the rise of autonomous AI agent swarms, this bottleneck is about to get much worse, and it is a primary impediment to realizing the full business value of your AI investment.
This blog will walk through why this is happening, the specific offenders clogging our pipes, and how we can solve it using a concept we call a Build Artifact CDN—specifically, Develocity Universal Cache.
If you prefer watching a video to reading a post, check out our webinar recording.
The traditional CI/CD infrastructure was built on the assumption that humans write code at human speeds. We used to commit a few times a day, push changes, and wait for feedback. But that assumption is rapidly becoming obsolete.
With the surge of AI-driven development, the volume of code entering our pipelines is exploding. Recent surveys suggest that in just the last six months, lines of code per developer have increased by at least 75%, and pull request sizes have grown by 33%.
We are seeing a "firehose" effect. Some companies are seeing a 5x to 10x growth in commit volume. As Amazon engineer Joe Magerramov noted regarding his team's experience:
"The overall math changes and what used to be a production impacting bug once or twice a year can become a weekly occurrence." 01:55
The question for engineering leadership is simple: Now that we can code faster, can we deliver faster? And can we do it without our CI bill increasing by 500%?
Not without intelligent, comprehensive caching. Let's see why.
Most of us have moved to ephemeral CI agents (in Jenkins, GitHub Actions, and similar systems) for security and stability. We love them because they start with a clean slate every time. But "clean slate" means "empty cache."
Every time a build runs, we are:
- Downloading dependencies from scratch
- Initializing toolchains (compiling build scripts, calculating task graphs)
- Rebuilding and retesting code that hasn't changed
If you're working on a 100,000-line codebase and change two files, a standard ephemeral build will often still pay the tax for the entire project. This leads to wasted egress costs, network bandwidth strain, and—most importantly—slower feedback loops.
Let's look at the data. Using Develocity Analytics, we can actually see where the time goes.
In the real-world example we analyzed (image above), a single organization downloaded 7 TB of dependencies in a single week. That's nearly 20 days of cumulative download time.
For a specific build example, we can look at the Apollo GraphQL project. A single CI build spent 4 minutes 38 seconds just downloading dependencies. That is pure wall-clock time spent doing absolutely nothing productive.
Before your build system (such as Gradle) even runs a task, it has to figure out what to do. It compiles scripts and computes the task graph, and in large repos this is a significant time sink. We've seen examples where this configuration phase alone takes over seven minutes per build.
This is the classic "re-running tests on code I didn't touch" problem. In a Maven build for the Apache StreamPipes project, we saw an 8-minute build where over six minutes of work could have been completely avoided because the inputs hadn't changed.
So, how do we stop the waste?
"What happens when something is slow in computing? What do we do? The answer is almost always the same, right? You cache it." 01:00
But not just any cache. If your cache is in a remote repository halfway across the world, you're still fighting network latency. You need the cache to be close to where the build is happening.
Think about Netflix, Steam, or Akamai. They use CDNs (Content Delivery Networks) to put nodes close to you so your content loads instantly. We need to apply that same logic to our CI infrastructure. We need a Build Artifact CDN.
This is what Develocity Universal Cache does. It places a node on the same network as your CI agents (and even close to your developers). The CI system pushes to the closest node, and that node handles the distribution.
Develocity Universal Cache tames the three offenders we mentioned earlier with three distinct layers of caching.
The first layer, the Setup Cache, addresses build environment setup by accelerating the initialization phase of builds. Build tools like Gradle perform compute-intensive steps before actual execution begins: compiling scripts, building the task graph, and calculating file hashes.
By restoring the initialization state on the CI agent before the build starts, the Setup Cache prevents builds from spending time doing work already done by a previous build. This optimization typically results in a 50% reduction of the Gradle configuration phase time.
This layer, the Artifact Cache, replaces the "download the internet" phase. Instead of each CI agent reaching across the WAN to Maven Central, npmjs.com, or a remote Artifactory for thousands of individual dependencies, Universal Cache serves as a high-speed bridge.
By caching packages on the local network (LAN) immediately adjacent to your build agents, it transforms a slow, 2,000-part external trek into a single, high-bandwidth local pull. This reduces egress costs, shields your pipeline from external repository downtime, and ensures that bandwidth is no longer a bottleneck.
We typically see a 95%+ improvement in dependency resolution time here. Importantly, if you have a new dependency (like in a Dependabot pull request), the system is fine-grained and smart enough to fetch just that new file from the source, while intelligently caching any unchanged dependencies.
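In Gradle terms, routing dependency traffic through a nearby node is just a repository redirection. Here is a minimal sketch of what that looks like in `settings.gradle.kts`, assuming a cache node reachable at a hypothetical internal hostname (`cache.internal.example`); the exact endpoint paths depend on your deployment.

```kotlin
// settings.gradle.kts -- sketch, assuming a Universal Cache node proxies
// upstream repositories at a hypothetical internal hostname.
dependencyResolutionManagement {
    repositories {
        maven {
            name = "universal-cache"
            // The local node serves cached packages and fetches cache
            // misses from the upstream repository on demand.
            url = uri("https://cache.internal.example/maven")
        }
    }
    // Fail the build if a project declares its own repositories, so no
    // dependency resolution silently bypasses the local cache node.
    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
}
```

The `FAIL_ON_PROJECT_REPOS` mode is a deliberate guardrail: it keeps every dependency request on the LAN path, which is what makes the 95%+ resolution-time improvement repeatable across builds.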
The Build Cache is the most transformative layer. It looks at the inputs of your build tasks (compilation, tests, Checkstyle).
"If I use the same source file... and I run that through Java C, I'm going to get the same output as long as I have the same inputs... I can get significant benefits from downloading a previously cached version of that and then skipping that compute." 30:00
If the inputs haven't changed, the Build Cache downloads the output (the compiled class, the test result) instead of running the process. We are essentially trading a tiny bit of local network traffic for a massive amount of CPU time.
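For Gradle builds, pointing at a remote build cache node is a few lines of settings configuration. This is a minimal sketch using Gradle's built-in `HttpBuildCache` type; the node URL is a hypothetical placeholder, and a Develocity deployment would supply its own endpoint.

```kotlin
// settings.gradle.kts -- minimal remote build cache sketch; the URL below
// is a hypothetical placeholder for your nearby cache node.
buildCache {
    local {
        // Keep the local disk cache as the fastest path on warm machines.
        isEnabled = true
    }
    remote<HttpBuildCache> {
        url = uri("https://cache.internal.example/cache/")
        // Only CI populates the shared cache; developer machines and
        // CI agents alike can read from it.
        isPush = System.getenv("CI") != null
    }
}
```

Restricting pushes to CI is a common design choice: it keeps the shared cache populated from trusted, reproducible environments while everyone still benefits from the hits.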
How does this look in practice on an ephemeral CI agent? Here's the lifecycle of a build using Universal Cache:
- Pre-provisioning: Before the build script even starts, the agent connects to the local Universal Cache node.
- Bulk download: It pulls down the dependencies, wrappers, and environment setup files.
- Execution: The build runs.
- New dependencies: If you added a library, it downloads normally.
- Build caching: For tasks whose inputs haven't changed, it pulls the results from the cache node immediately.
- Updates: At the end of the build, any new artifacts or task outputs are pushed back to the local cache node so the next build (or the next developer) can use them.
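The build-caching step in this lifecycle only works for tasks that declare their inputs and outputs, so Gradle can hash them. As an illustration, here is a hypothetical custom task in `build.gradle.kts` made cacheable with the standard annotations (the task name and file paths are invented for the example):

```kotlin
// build.gradle.kts -- a task is cacheable only when Gradle can hash its
// inputs and relocate its outputs; this hypothetical task declares both.
@CacheableTask
abstract class GenerateReport : DefaultTask() {
    @get:InputFiles
    @get:PathSensitive(PathSensitivity.RELATIVE)
    abstract val sources: ConfigurableFileCollection

    @get:OutputFile
    abstract val report: RegularFileProperty

    @TaskAction
    fun run() {
        // Executed only when the hashed inputs change; otherwise the
        // output file is restored from the local or remote build cache.
        report.get().asFile.writeText("files: ${sources.files.size}")
    }
}

tasks.register<GenerateReport>("generateReport") {
    sources.from(fileTree("src"))
    report.set(layout.buildDirectory.file("reports/summary.txt"))
}
```

`@PathSensitive(RELATIVE)` matters here: it lets a cache entry produced on one CI agent match on another agent whose checkout lives at a different absolute path.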
The results are immediate and visible. In the two images below, you can see that caching enabled the Apollo GraphQL project to go from 4m 38s of dependency downloads to zero. Similarly, the Apache StreamPipes project reduced its overall build time from 8 minutes to under 3 minutes.
None of these optimizations are possible if you're flying blind. This brings us to the final, critical piece of the puzzle: Observability.
It's mind-boggling to realize that we would never push a production application live without monitoring tools like Datadog or New Relic, yet we routinely run our CI pipelines, the factory floor of our software, with zero monitoring.
"Most of us are adding hundreds of thousands, if not millions of lines of code a year. All of that is just load on your system." 38:00
Without observability, you cannot answer basic questions such as:
- Is my build getting slower over time?
- When did my tests become flaky?
- Where did this artifact come from?
Develocity provides the data required to pinpoint these bottlenecks. The examples above, the 7 TB of weekly dependency downloads and the 7-minute configuration phase, were only possible to identify because of Build Scan observability data.
We're in an era where writing code is easier and faster than ever before. But if our delivery pipeline can't keep up, that increased coding speed just translates to increased waiting time and increased infrastructure bills.
We need to decouple the cost of CI from the volume of code, and Develocity Universal Cache is how we do it:
- Artifact Cache reduces network bandwidth
- Setup Cache gets the build tool working faster
- Build Cache eliminates redundant CPU usage
By implementing a unified system like Develocity Universal Cache, we can turn that "firehose" of AI-generated code and rapid commits into shipped features, without the wait.
If you're ready to stop wasting time on redundant work, request a guided trial of Develocity to see your own data in action.
