Developer Productivity Engineering Blog

How Cash App speeds up local builds with Gradle Enterprise’s remote build cache

One of the most powerful features of Gradle Enterprise is its remote build cache. It allows teams to share the benefits of caching, even for local builds. If your teammate has recently built some code, and that code hasn’t changed, there’s no reason to rebuild it. 

At DroidCon London 2021, John Rodriguez from Square and Rooz Mohazzabi from Gradle talked about how the Cash App team realized significant performance benefits from remote build caching. Across their distributed team, however, not everyone got the same benefits at first. Read on to see how Cash App set up their remote build cache to make everyone more efficient. (Note: Square is now Block, Inc., a publicly traded company listed on the New York Stock Exchange.)

The problem

Cash App is Square’s mobile payment service. John’s development team builds the Android version of their app. There are currently 30-40 developers on the team, so their builds are fairly complex. Based on their existing relationship, Rooz and Gradle’s Nelson Osacky (formerly of Square) approached John to see how they could make the Cash App build even faster. One of the main tenets of Developer Productivity Engineering is to be proactive and not wait for developers to complain before addressing build and test time issues. Good DPE organizations continuously work to make builds and tests faster. 

Another important consideration was the upcoming changes to their team and their code base. Square wanted to build on Cash App’s success, particularly during the pandemic. With that in mind, they were looking at growing the team substantially. Even if their build was perfect to start with, there were going to be challenges going forward. A build optimized for today’s code base won’t be optimized for that code base six months from now. 

The team focused on a simple problem: how to improve and maintain build and test speed across geographies. In particular, the first build of the day was especially slow, and so was the first build after a developer switched branches or projects. Effective caching would deliver benefits on those first builds and throughout the day. 

The solution

The obvious solution was to use Gradle Enterprise’s remote build cache. The Gradle Build Cache was introduced in 2017, and it supports both Maven and Gradle builds. The benefits of a build cache on a local machine are obvious; if code hasn’t changed on my machine since the last time I built it, there’s no reason to build it again. The Gradle Enterprise remote build cache delivers those benefits across an entire team. 
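
To make the idea concrete, here is a toy model of a shared build cache in Python. It is only a sketch of the general technique (key the output of a task on a hash of its inputs), not how Gradle Enterprise is implemented. The names cache_key, run_task, and remote_cache are hypothetical, and a plain dictionary stands in for the remote cache node.

```python
import hashlib


def cache_key(task_name, input_files):
    """Hash the task name plus every input path and its contents.

    input_files is a dict mapping a path to that file's bytes.
    """
    digest = hashlib.sha256(task_name.encode("utf-8"))
    for path in sorted(input_files):
        digest.update(b"\0" + path.encode("utf-8") + b"\0")
        digest.update(input_files[path])
    return digest.hexdigest()


def run_task(task_name, input_files, remote_cache, build):
    """Return (output, from_cache); build() runs only on a cache miss."""
    key = cache_key(task_name, input_files)
    if key in remote_cache:
        return remote_cache[key], True
    output = build(input_files)
    remote_cache[key] = output  # share the result with everyone else
    return output, False
```

If a teammate (or the CI server) has already run the same task on identical inputs, the second caller gets the stored output back and never invokes build(). Changing a single byte of any input produces a different key and forces a rebuild.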

Elsewhere in the team’s infrastructure, their CI server built the code every night, so every day started with a fresh build of the code on the CI server. Putting that freshly built code in the cache for the entire team to use was sure to improve things. 

Note: A build cache is different from, but complementary to, a dependency cache such as JFrog’s Artifactory or Sonatype’s Nexus. In a nutshell, a build cache makes building code from source faster by letting you avoid recompiling code that hasn’t changed. A dependency cache makes Gradle tasks and Maven goals faster by letting you avoid downloading dependencies again. Check out this blog post on build caches and dependency caches if you’d like to learn more. 

With the remote build cache turned on, the average AssembleDebug time across the organization was faster, but not significantly so at first. In looking for more substantial savings, John had the idea to tag each local build with the location of the machine. Using the powerful analytics of Gradle Enterprise, the team could look for patterns and trends based on those tags. In other words, Gradle Enterprise would give them rich data about their builds and the tools to analyze that data. 

Here’s some pseudocode that generates the geotagged information associated with each build: 

ip = [this machine’s external ip address from
      https://ipinfo.io/ip] 

geo = [this node’s geographical properties from
      https://geoiplookup.io/$ip]

location = $geo.city, $geo.region, $geo.country

coordinates = $geo.latitude, $geo.longitude 

This automates tagging based on each machine’s IP address. The metadata above could be used to analyze build data based on other factors. For example, $geo.isp returns the name of this machine’s internet provider; it’s possible that some slow build times could be caused by the ISP. 
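
The pseudocode above might translate into something like the following Python sketch. It reuses the two URLs from the pseudocode and assumes, without confirmation from the source, that the geo lookup returns a JSON object with city, region, country, latitude, and longitude fields. The function names fetch_text and geo_tags are hypothetical, and the fetch step is passed in as a parameter so the tagging logic can be exercised without network access.

```python
import json
from urllib.request import urlopen


def fetch_text(url, timeout=5):
    """Fetch a URL and return its body as stripped text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def geo_tags(fetch=fetch_text):
    """Build the location and coordinates tags for this machine."""
    ip = fetch("https://ipinfo.io/ip")
    # Assumption: the lookup responds with JSON containing these fields.
    geo = json.loads(fetch(f"https://geoiplookup.io/{ip}"))
    location = ", ".join(str(geo[k]) for k in ("city", "region", "country"))
    coordinates = f"{geo['latitude']}, {geo['longitude']}"
    return {"location": location, "coordinates": coordinates}
```

The resulting dictionary would then be attached to the build as tags or custom values by whatever mechanism the build already uses for metadata.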

The build cache was physically located on the West Coast, but the team (originally in Kitchener, Ontario, Canada) has members in Melbourne, Australia; New York; Seattle; and San Francisco. Obviously, latency for some users could be significant. Using the tagged builds and the ability of Gradle Enterprise to analyze build data, the team could see just how significant that latency actually was.

The problem with the solution

As the team sorted through the data, they realized that team members on the East Coast of the United States weren’t getting much benefit from the remote cache. In the San Francisco area on the West Coast, though, the remote cache was delivering great results. 

Getting the remote build cache to work was simple, but getting it to work well was more complicated. 

With the tags added to their build data, they used the reporting features of Gradle Enterprise to discover that East Coast users were saving an average of 38 seconds of wall clock time per build. Developers in San Francisco, on the other hand, were saving an average of more than 3 minutes and 10 seconds per build. That’s a difference of just over 2½ minutes per build. 
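
The kind of comparison described here can be sketched in a few lines of Python. This is not how Gradle Enterprise computes its reports; it only illustrates grouping per-build savings by a location tag. The function names are hypothetical, and the sample records are made-up values chosen so the averages match the 38 seconds and 3:10 quoted above.

```python
from collections import defaultdict


def average_savings_by_location(builds):
    """builds is an iterable of (location_tag, seconds_saved) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # location -> [sum, count]
    for location, seconds_saved in builds:
        totals[location][0] += seconds_saved
        totals[location][1] += 1
    return {loc: total / count for loc, (total, count) in totals.items()}


def format_mm_ss(seconds):
    """Render a number of seconds as m:ss."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}:{secs:02d}"


# Made-up sample records, not real Cash App data.
sample_builds = [
    ("New York", 30), ("New York", 46),
    ("San Francisco", 180), ("San Francisco", 200),
]
averages = average_savings_by_location(sample_builds)
gap = averages["San Francisco"] - averages["New York"]
print(format_mm_ss(gap))  # prints 2:32
```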

One way to address the problem would be to add a second cache node on the East Coast. If East Coast users got the same time savings, that would be a benefit, but would it be enough to justify the cost of the second node? With the data from Gradle Enterprise, the team ran some numbers: 

  • Per build, San Francisco developers saved 2:32 (152 seconds) more than East Coast developers
  • East Coast developers ran an average of 709 local AssembleDebug builds per week
  • Adding a build cache node on the East Coast would deliver 709 x 2:32 = 1796.1 minutes of saved engineering time per week
  • Assuming 48 work weeks per year, that added up to 1796.1 x 48 = roughly 86,214 minutes, or 1436.91 hours, of saved engineering time per year.
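
The arithmetic in the list above can be checked with a short Python sketch. The inputs (3:10 and 0:38 saved per build, 709 builds per week, 48 work weeks) come from the figures in this post; the function name yearly_hours_saved is hypothetical.

```python
def yearly_hours_saved(extra_seconds_per_build, builds_per_week, weeks_per_year):
    """Return (minutes saved per week, hours saved per year)."""
    minutes_per_week = extra_seconds_per_build * builds_per_week / 60
    hours_per_year = minutes_per_week * weeks_per_year / 60
    return minutes_per_week, hours_per_year


# 3:10 saved in San Francisco minus 0:38 saved on the East Coast.
gap_seconds = (3 * 60 + 10) - 38  # 152 seconds, i.e. 2:32

minutes_per_week, hours_per_year = yearly_hours_saved(gap_seconds, 709, 48)
print(f"{minutes_per_week:.1f} minutes per week")  # 1796.1 minutes per week
print(f"{hours_per_year:.2f} hours per year")      # 1436.91 hours per year
```

Keeping the per-build gap in seconds until the last step avoids the small rounding drift that comes from multiplying the already rounded 1796.1 figure.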

With these numbers in mind, the team could decide whether adding another cache node on the East Coast made sense. On the one hand is the cost of creating, configuring, and maintaining the cache node; on the other is the cost of more than 1,400 hours of an Android engineer’s time. The team was able to make the business case that the additional cache node would more than pay for itself.

Conclusion

Gradle Enterprise’s remote build caching is an extremely powerful feature that can make your development teams much more efficient. Optimizing the remote build cache may take some work, but the data and analytics in Gradle Enterprise help you make informed decisions about your infrastructure and your build systems. The Cash App team was able to get significant savings from the remote build cache, find areas where the cache wasn’t performing as well, then use hard data to justify a second cache node that brought higher productivity to the whole organization.