Speed Up Apache Maven Builds

Hell has frozen over. Develocity supports Apache Maven!

Maven users get faster Maven builds and fine-grained build analysis without having to migrate to Gradle. For Gradle users, nothing changes for you, and more exciting features are in development.

This 1-hour webcast demonstrates how you can:

Connect Maven builds to Develocity
Capture build scans for Maven builds and understand where build time is spent
Speed up Maven builds with a remote build cache shared by your entire team and CI
Get insights on Maven build reliability

View the full recording below:

Speed up Apache Maven(™) Builds with Gradle Enterprise

Etienne: Welcome, everyone, and thanks for joining our webinar. My name is Etienne. I’m a V.P. of Engineering at Gradle and my main focus is working on Develocity. And with me today is Stefan. Do you want to briefly introduce yourself?

Stefan: Sure. I’m Stefan Oehme. I’ve been with Gradle for about three years now, mostly working on performance, and my latest exploits have been improving the performance of Maven, which is part of what we’re talking about today.

Etienne: Absolutely. So, we’re very excited to share for the first time what we’re doing to the benefit of the Maven users. So, what do we want to accomplish today?

Pretty much two things –

We want people to understand what’s happening in the Gradle build, sorry the Maven build process so they can start reasoning about what’s going on. And once you can do that, you can make informed decisions on how you make your build faster & more reliable.
The other part is we want to show you how we can actually accelerate your builds using Develocity. We’re going to share a little bit of context of Develocity and then we’re going to show some demos.

Stefan: Right.

Etienne: So just to avoid any confusion. What do we do at Gradle? We’re creating two pieces of software. We’re delivering the Gradle Build Tool to the open-source community, which you have probably either used or at least heard of. Gradle has over seven million downloads a month, so it’s very likely that you have been in touch with Gradle in one way or another. And then we’re also building Develocity. It’s a very different product. It’s a commercial SaaS product. You install it on-premise and the goal is to improve developer productivity across the enterprise. You can use it with both Maven builds and Gradle builds, or you can even have projects that have both.

Stefan: It’s very likely to have both especially for migrating or you just have a big diverse team.

Etienne: Yes, like Android builds. Some Java teams are using Gradle or Maven. Android teams are also using Gradle. So that’s an important distinction to make between the two tools. All right. And our focus today is absolutely on Develocity. So, what is Develocity?

In short, to give you a bit more context here as well, Develocity connects to every developer and every CI machine that builds with Maven or Gradle.

Stefan: Or even with some other build tools?

Etienne: Yes, in the future it’s very likely we will support other build tools as well. And once you’re connected, Develocity captures all the data about your build. And like I said, once you have the data, you can reason about it. And you can make informed decisions about it.

So, where do we capture this? We capture this in something called build scans. Build scans are a persistent, viewable record of what happened in the build, and we’re going to see that in action too. There’s also what we call the Export API, which is a way to get to that build scan data in a way that allows you to build your own analysis around it. And then the third aspect of Develocity, besides scans and connecting all your builds, is that we also provide the build cache. That allows you to dramatically speed up your builds. So, as you run your build and you’re doing the same work that has been done previously, you can just reuse it and save the cost. We’re going to see that in action as well.
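The reuse idea Etienne describes can be sketched in a few lines. This is an illustrative model only, not how Develocity is implemented: the names `cache_key`, `BuildCache`, and `fake_compile` are hypothetical, and the assumption is simply that a goal's output can be stored under a hash of the goal name and the content of its inputs.

```python
import hashlib


def cache_key(goal, inputs):
    """Derive a key from the goal name and the content of every input file."""
    digest = hashlib.sha256(goal.encode("utf-8"))
    for name in sorted(inputs):  # sorted so dict ordering never changes the key
        digest.update(b"\0" + name.encode("utf-8") + b"\0")
        digest.update(hashlib.sha256(inputs[name]).digest())
    return digest.hexdigest()


class BuildCache:
    """In-memory stand-in for a build cache (hypothetical, for illustration)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def run(self, goal, inputs, work):
        key = cache_key(goal, inputs)
        if key in self._store:
            self.hits += 1          # same work was done before: reuse it
            return self._store[key]
        self.misses += 1
        output = work(inputs)       # do the real work once
        self._store[key] = output
        return output


def fake_compile(inputs):
    # Hypothetical stand-in for a real compiler invocation.
    return {name.replace(".java", ".class"): len(src) for name, src in inputs.items()}
```

Because the key depends only on the inputs, not on when the work was done, running the same goal on unchanged sources is a hit, and going back to an earlier state of the sources is a hit as well.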

So that’s in short, what is Develocity. To back that up with some numbers. We have two examples here. One is from ourselves right from the Gradle team.

Stefan: That’s a big build.

Etienne: Yes, we run a lot of tests and every day for a full day of 24 hours, we’d save about 60 days of build time.

Stefan: I remember, we used to have this instruction in our README on GitHub that said, just run Gradle build, this may take a few hours. And so, we have a CI build. But even with that, we would be wasting a lot of time if we didn’t have the reuse that the build cache brings us.

Etienne: And with these numbers, it would be more like, please come back in a few days.

Stefan: Exactly, by now for sure.

Etienne: Yeah. So pretty impressive improvements that we get out of the build cache and, of course, the build scans, because the build scans give us the insights to keep it optimized and to find issues and fix them. And then we have an example from a customer, Tableau. They’re an existing customer and they did some very thorough analysis around how they’re building and how much time is spent, and they came to the conclusion that every week they save about one day for those building a lot. So that’s like an extra day that you can put into the software you’re building instead of just waiting.

Stefan: Doing something useful rather than just waiting.

Etienne: Yes. Wait and browse or do something not so useful there. There’s also a webinar that we did with somebody from Tableau and it’s really worth watching. It’s very entertaining and very interesting. All right. So, before we get into demos we just want to do a quick poll. The first one being: What version of Maven are you building with? So please use the poll facility to tell us what version you’re using. We’re very interested in the outcome.

Stefan: Exactly because that informs us what we have to support.

Etienne: Yes.

Stefan: So, you’re very much shaping the outcome of our work here.

Etienne: How far back in history will you go with supporting versions of Maven?

Stefan: 2.x is not an option.

Etienne: Yes.

Stefan: All right. I see a lot of results rolling in.

Etienne: So, we have some people here that are not using Maven at all.

Stefan: But are probably interested

Etienne: Maybe they are all on Gradle. Only a very small fraction is using a very old version of Maven, below 3.3, and the majority are using 3.3 or newer.

Stefan: So that sounds promising.

Etienne: Yeah, I would think that’s what we were hoping and expecting as well. We will capture more data points, but excellent. It’s good to have this data. So, thanks for giving us your feedback.

And then we would like to get to the second question before we start with the demos and that question is for all those that are using Maven. So, who’s using a forked compiler?

Stefan: So, for those of you who don’t know: if you’re not configuring it, by default the Maven compiler will run in-process. But if you use a toolchain or you set a custom executable, then it will use a forked compiler instead, in a new VM.

Etienne: Why are we asking this question?

Stefan: It affects performance quite a lot. The in-process compiler is quite efficient. It reuses the warmed-up classes of the current JVM. The out-of-process compiler, on the other hand, starts up a new JVM for every single compilation that you do, which costs a lot of time. It used to be more necessary when you wanted to do cross-compilation. You were running your build on, let’s say, Java 8, but you wanted to make sure that you weren’t using any APIs or features that don’t exist on the VM you are targeting, like Java 6 or some other vendor’s. So, this is not as important anymore. Now that we have Java 9 plus with the --release flag, you get automatic validation that you’re using the right API. So yeah, I was just curious, how many people are using it? And it seems that about 16% say, yes, I do use the forked compiler, and the vast majority, 84%, say no. So that’s good to know. Thanks, everybody.

Etienne: And we’ll see that no matter which one you use, everybody can benefit once we get to the cache demo.

Stefan: So, for the demo, I had decided not to use forking. The poll validates that choice: the demo will show you roughly what you can expect if you’re not using forking. If you do use it, it’s going to be even better.

Etienne: All right, great. So, thanks for your feedback on the poll. And with that, we can go over to the demo. We’ll start with build scans, right? Before we can start optimizing our build, making it faster and more reliable, we need to know what’s going on, and that’s where build scans come into play. What I have here is the Maven project itself that we’ve checked out from GitHub.

Stefan: That’s very inception of you.

Etienne: We’re just going to run clean compile. The only difference to what you would do at home is that we’ve also enabled the build scan extension for Maven. So, what happens as I run clean compile is that in the background the Maven extension is capturing data about what’s happening in the build. What that is exactly we’ll see in a second, once we get the build scan.

Stefan: So, it looks like there’s a URL there.

Etienne: Yes, exactly. At the end of the build you can see that the build published a scan, meaning whatever was captured while running the build has been published to the Develocity server. So now let’s take a look at that.

So, this is a visual representation of what happened in your build. And I’d like to just go through it briefly so you get an understanding of what kind of data we’re capturing. We have a summary where you can see when it started and when it finished, which will give you an indication of how long it ran. We’ll be showing that in different places too. You can see what version of Maven you were using and which version of the build scan extension you were using, and that gives you a rough idea of what was going on. You can also see what projects were actually built while running this build. And you can get even more data about each project by inspecting it. We also show you switches, so you can see what options you were running the build with.

Stefan: So, you no longer have to ask basically what was this run with.

Etienne: Yes, exactly. And I’ll come back to that in a second. So, I didn’t use any switches here, or they were all turned off, but they might be turned on in other situations. We also capture information about the infrastructure, which is, of course, very interesting. What OS did it run on? How many cores were available? How many threads was Maven configured with? What JDK was used? How much heap was available? And so on.

And then we also capture what was actually run, which goals were executed. And we can see that in the timeline view, where you can see all the goals. We can also drill into goals to get a bit more information about how long they ran, what the class behind them was, and which plugin they came from. Which, of course, is also very interesting once you have your own plugins.

Stefan: Exactly, or if something goes wrong and you want to know who to blame.

Etienne: Yes. It’s always good to know who to blame, for sure!

Stefan: But also getting this kind of information from the command line would be rather painful. So yes, it’s really cool.

Etienne: We saw the amount of output we had on the console. And so here is a much more distilled and approachable view, which is also searchable. If I am interested in a certain project, I can search for that, or if I’m only interested in a certain phase, I can do that. I have all kinds of search functionality. There is also the outcome that I can filter by. We’ll come back to that in the context of the build cache. Right now all the goals were executed, as I expect, because we are not using the cache yet.

Then we also have a performance view, which we’re going to enhance quite a bit before the release. But just to give you an idea, here we see a breakdown of where was the time spent. Because once you understand where it’s spent you can start drilling in and trying to optimize that time and make it lower. You can also see how much time was spent on garbage collection. Not too bad. We’ve seen worse.

Stefan: Absolutely, well, you also gave it plenty of RAM. I think it was 8 gigabytes. So, I would be worried if that wasn’t enough.

Etienne: Exactly yes.

Stefan: Yes, this is really useful, we’ve seen many builds that spend most of their time garbage collecting rather than building.

Etienne: Yeah, and besides that we also have some more information about the goal execution. You can see how many goals were run and so on. But I’ll come back to that a bit later. So in summary, what you get are insights about what happened in your build, and you can then start reasoning and making decisions on what to improve. There’s one more thing I want to show you and that is the concept of custom values, because it’s quite powerful. When I run the build I can add my own data to it, not just what we capture but what you want to capture. You can add tags. You can see I added a tag to indicate it’s a local build. If I ran on CI, I could tag it as being a CI build. We can also add custom values, which are key-value pairs. I added just a few samples here. So, we can say what the build group was, the build team, whatever that means. It’s not so important here. But maybe more important as a use case is that I can also capture the local state of my repository. I did that by executing a Git command during the build, very early on, from a plugin that I integrated, and what we can see here is the changes I had locally.

Stefan: So, you changed the POM. If you had a failure, I would ask you, hey, what was that change you made to your POM? Maybe that’s the problem. I’d ask that before I dig down into our own build plugins and wonder if I broke you.
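As a rough sketch of the "capture the local state of my repository" use case: the snippet below turns the output of `git status --porcelain` into key-value pairs. The key names (`Git status`, `Git changed files`) and the function names are hypothetical, and how the pairs are attached to the scan (system properties or a script, as mentioned in the talk) is deliberately left out, since the extension's API is not shown here.

```python
import subprocess


def parse_porcelain(text):
    """Turn `git status --porcelain` output into (status, path) pairs."""
    changes = []
    for line in text.splitlines():
        if len(line) < 4:
            continue  # not a valid "XY path" entry
        status, path = line[:2], line[3:]
        changes.append((status.strip(), path))
    return changes


def custom_values(porcelain_text):
    """Build hypothetical custom values describing the local working tree."""
    changes = parse_porcelain(porcelain_text)
    return {
        "Git status": ", ".join(f"{s} {p}" for s, p in changes) or "clean",
        "Git changed files": str(len(changes)),
    }


def local_git_status():
    """Run git in the current directory; returns '' if it is not a repository."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout if result.returncode == 0 else ""
```

In a real build, `custom_values(local_git_status())` would be evaluated early in the build, which is also why this step shows up in the timeline with a measurable cost.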

Etienne: Yeah, absolutely.

Stefan: This saves a lot of headaches.

Etienne: Yes, and I’ll show in a second that we can actually answer that question, or you will be able to answer it for yourself, via the links that we can also integrate. So this is how we can enrich the data in the build scan with our own data. I can do this via system properties or, like I said, for example, by integrating the Groovy plugin, and I can have any kind of custom logic that decides when to report which kind of custom values.

Stefan: Things like Groovy always come to the rescue when there’s something you need to do.

Etienne: Yes, absolutely. And for those to whom this hasn’t become obvious yet, this is very powerful to share with others. If I have an issue with my build and I want an associate to look at it, I can share a scan. I can point to it and I can say, OK, I want a link to the custom values, and I send a link like this to Stefan and he can look at it. He will see what the state of my machine or of the CI build was. It doesn’t matter if it was run on CI or locally. And he can start investigating. He doesn’t even have to ask me, like you said, what JDK I used or how much heap I had available. It’s all right there.

Stefan: Usually there is so much back and forth when somebody says, oh, the build broke for me. Well, what did you call? What were your local changes? What branch? Blah, blah, blah.

Etienne: Yes, you can go back and forth.

Stefan: And then you’re in a different time zone. I mean, we’re all distributed, right?

So sometimes somebody might have an issue and I’m asleep. Yeah and then you have to get back to them and then they’re asleep. So, it’s not pretty. And this makes that so much easier.

Etienne: Yes, it has got to the point where I just say, give me a build scan, and I don’t answer the question before I have the build scan.

Stefan: Absolutely. Don’t show us any logs, just show us the build scan that was created, because we also integrate with CI.

Etienne: And maybe one more thing on sharing. I can be very specific. I can say, this execute goal, I’m not sure what’s going on with this one.

And I really want you to look at this. So, I can actually deep-link to exactly this line. I share a link to this line, and you don’t even have to search.

Stefan: So instead of sending me your debug log, and me sifting through it with grep, you just send me a link. Yes, absolutely.

Etienne: You see right away what I want you to look at, versus just looking at the scan or the timeline. So, it’s very much to the point. To complete the custom values aspect, what I would like to show is that I can also run a diff via Git of my local state, create a gist for it, publish that to GitHub, and then include the link to that gist here in my build scan.

Stefan: So, I can actually see what you changed, not just which files.

Etienne: Yes, exactly, so let me show you. I ran this already locally, just to speed it up a little bit. You can see it has a local tag, meaning it ran locally. And we now have a custom link. I can give it any name and I can put any kind of link behind it. And now if I go here again to the custom values, I can see, OK, the POM changed and there are two unknown files.

Etienne: So, let’s go back to the diff, and if I click here (I already did, so let’s just reuse the tab) I now see the gist with exactly the changes that I had locally.

So, your question, what did you change locally? Quite a lot. We answered that here. For those interested, here we are including the script that is actually creating these custom values and links and tags.

Stefan: So, he added the Groovy plugin.

Etienne: Yes, exactly right. There are a few things that we don’t have yet in Maven build scans that we are going to have in the first release. And I just want to show you the analogous views from the Gradle side.

Etienne: OK, I just want to go through that really quickly. The first one is failures. So, if you have a failure, we’re going to show you the failure with the full stack trace, making it even easier to debug an issue. We’re also going to show you plugins, so what plugins were applied to your build. That is very informative, especially when you look at a build you aren’t really familiar with and want to get some idea. And you can then also see which project actually applied a certain plugin. So again, this is from a Gradle build, but it would work similarly for Maven.

Stefan: I’m sorry to interrupt. There was a question whether that gist you showed was a private gist.

Etienne: Yeah, I created it as a private gist. But I could just as well have made it public.

Stefan: You could. You could have linked basically anything.

Etienne: Yes, absolutely. I used GitHub, but maybe other repositories support that as well, and you can use those too. So, it’s totally unrestricted. You have all the freedom that you have with using a scripting language in that sense, or you could even use this statically.

Stefan: So maybe the question was also, can you share this with me? So, as soon as that plugin is out, we’ll probably share this as a recipe for Git diffs. I mean, it makes total sense, right? And it’s actually not a lot of code.

Etienne: Exactly. Then we will also have a tests view for you. So here you can see all the tests that ran and how long they took. This is always ordered by duration. That’s something we do in general, we always order by duration, so you can see the slowest thing first. Of course, there’s deep linking here as well to get to these individual tests. And I can see the failed ones, and I can see more details about the failed ones to resolve them really quickly. The other thing we have to keep in mind besides the sharing aspect is the local versus CI aspect. So, say I run a build on CI and it fails and I’m not sure what’s going on. What did I do before? I would go on that box and try to rerun it and hopefully reproduce the same problem. And now I just take the scan that was created, and I can reason about the same build that already ran, versus rebuilding and hoping it happens again.

Stefan: Hopefully I have an SSH key to log into that box.

Etienne: Exactly, yes. And that’s what you have to do with Maven right now. So that all goes away and you can investigate those builds independently, by yourself.

So that’s another view we’re going to add. And the last one is the dependencies view, which is, of course, a very interesting one to be able to see. I have my dependencies, and what does my dependency tree look like? What are their child dependencies? What’s the whole transitive tree?

Stefan: And I mean, the same makes sense for plugin dependencies, right? If I add another dependency to the Java compiler plugin because I want to use a different compiler. Or like Surefire: Surefire also has this provider concept where we can exchange how Surefire runs things. So, you might want to know, hey, OK, we’re using the Eclipse compiler. Which version? So showing the plugin classpath would be an obvious thing to do here as well.

Etienne: Yep. And then we also have this feature that you can go both ways. You can see what this dependency depends on, and you can also ask who is depending on me as a dependency, and who’s bringing this in. And what I show here for the case of Gradle will be done very similarly for the case of Maven. So these are some more views that we are going to add. All right. So just to finish my demo here on scans, let’s go back to the timeline one more time. I showed the deep linking and how you can see the durations. You can also sort by duration, like I mentioned, so we quickly see which goal took the longest to run. Not too surprisingly, it is executing the script, because it had to invoke git status, and git diff in one case, which takes some time. But once we know, we can decide what to do about it.
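The "go both ways" lookup on a dependency tree can be sketched as a forward graph plus its inverse. This is a minimal illustration assuming dependencies are given as a plain mapping; the function names and the sample graph are hypothetical and say nothing about how the dependencies view is built.

```python
from collections import defaultdict


def invert(graph):
    """Build the 'who depends on me' graph from a 'what do I depend on' graph."""
    reverse = defaultdict(set)
    for parent, children in graph.items():
        for child in children:
            reverse[child].add(parent)
    return reverse


def transitive(graph, start):
    """Everything reachable from `start`, excluding `start` itself."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Hypothetical sample: an app with two libraries sharing one transitive dependency.
SAMPLE = {
    "app": ["lib-a", "lib-b"],
    "lib-a": ["commons"],
    "lib-b": ["commons"],
    "commons": [],
}
```

`transitive(SAMPLE, "app")` answers "what does this depend on", while `transitive(invert(SAMPLE), "commons")` answers "who is bringing this in".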

Stefan: And once you know, you could make your decision based on that. Say, hey, I’m only going to do this on certain builds. I’m not going to do it if you’re just doing a small thing. I’m only going to do this if you call verify, which takes longer anyway, so the diff doesn’t add so much time, maybe.

Etienne: Yes, exactly. And the upload time as well. In the case of git diff, it uploads something, so you pay a cost, and you can then decide whether you want to pay it or not, and when you want to pay it.

Yes, absolutely. So if we go back to the performance view one more time, we can see that we spent a total of 19 seconds executing our goals, and a total of 158 goals were actually run. Or considered for running, I should say.

And then we see a further breakdown of which goals were not run because we could leverage the cache, and which goals were actually executed. We can see right now we haven’t really been using the cache. We’re going to see that once you show your demo. So every goal was executed.

Stefan: But we could have cached some, I see there. So there is a link there.

Etienne: Yes, there is potential for cacheability. So the Maven extension, the cache extension, already noticed that there are some goals that we can actually cache, but we didn’t yet. We’re going to enable that later on. And some are just not cacheable yet, and they will not be unless we make changes to the actual goals. And then I can also see how that distributes in terms of time, and so on. We’ll see how that looks with the cache enabled once you show your builds. The other thing you can see down here in the goal execution is the breakdown. Here you see all the goals and which one is the longest. That’s always the one that extends furthest to the right, the ones that take the longest to run. But we can also break this up by project, and we can see for each project how long it took, and then within each project how long each goal took and which one was the longest to run.

Stefan: That’s especially interesting when I’m doing parallel builds and I want to know which one I should be breaking down, basically, because parallelism in Maven is done by project, not by task.

Etienne: Yes, very good point. So we get these insights here as well. And also with the parallel thing that you mentioned: if we were to go to the console and try to parse the output to see where the start was and where the end was, that would not work at all. But the way we do it, based on events, we really know when something starts and when something ends, regardless of what is in between

Stefan: and what it belongs to because you know where it’s coming from.

Etienne: Exactly, we actually know what it belongs to, and we can do all the breakdowns if we want to, whether it’s by phase or something else.

Stefan: Maybe an aggregation by plugin, to see which plugin is contributing the most.
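To illustrate why start and finish events make these breakdowns possible even when parallel projects interleave, here is a small sketch. The event tuple layout and the function names are assumptions made up for this example, not the extension's actual event model.

```python
from collections import defaultdict


def goal_durations(events):
    """Pair up start/finish events.

    `events` is an iterable of (timestamp_ms, kind, project, goal) tuples,
    where kind is "start" or "finish". Events from different projects may
    interleave freely, as they would in a parallel build.
    """
    started = {}
    durations = {}
    for ts, kind, project, goal in events:
        key = (project, goal)
        if kind == "start":
            started[key] = ts
        elif kind == "finish":
            durations[key] = ts - started.pop(key)
    return durations


def total_by(durations, index):
    """Sum durations by project (index 0) or by goal (index 1)."""
    totals = defaultdict(int)
    for key, ms in durations.items():
        totals[key[index]] += ms
    return dict(totals)


# Hypothetical interleaved event stream from two projects building in parallel.
SAMPLE_EVENTS = [
    (0, "start", "project1", "compile"),
    (5, "start", "project2", "compile"),
    (40, "finish", "project2", "compile"),
    (50, "finish", "project1", "compile"),
    (50, "start", "project1", "test"),
    (90, "finish", "project1", "test"),
]
```

Each event carries what it belongs to, so no console parsing is needed; the same pairing would support a breakdown by phase or by plugin if those fields were part of the event.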

Etienne: Yep yep. So that’s all. So let’s take a look at how we can get the number of goals being avoided up from zero.

Stefan: Let’s do that, let’s have a look at how we can actually make Maven builds faster. I’ll just give a quick introduction to the project we’ll be looking at. This is one of our performance testing examples that we use to test the Gradle build tool and also Maven. It normally has 500 sub-projects and about 5 million lines of code. So this should be right up with the biggest projects that you listeners have, I hope. I’m not going to show that in its entirety, because it’s going to take many minutes to actually compile, even with the in-process compiler, and many more minutes to then run the tests. So I reduced it to 10 sub-projects, and these projects form kind of a tree. So you will have higher projects, like project 10 or project 5, depending on lower projects, like project 2 or project 0, so that we also get some idea of how the cache behaves when we change an upstream dependency.

So I’m just going to start with a clean slate. I have an empty local cache and I have an empty remote cache. So let’s run a clean compile, clean test in fact, and hope that nothing comes from the cache, because it’s actually quite hard to get a cold run when you’ve been running builds all day. I’ve used up all permutations of the same words. So I’ve changed an internal implementation detail of the cache extension to make sure that nothing will come from the cache.

Nice. So you can see Maven running. And this is kind of your daily bread and butter, I guess, for most developers out there, and some of you might say, hey, I would be very happy if this was my daily bread and butter. These tests fly by really fast, and they are actually trivial. These tests are just calling a setter and then calling the getter and asserting they’re equal. So it’s really nothing.

Etienne: So how do we know we didn’t use the cache?

Stefan: So we can look at the build scan and it’ll tell us. Let’s look at the performance tab. And what we see: OK, it took about 46 seconds to execute, and of these 46 seconds we got zero cache hits and 45.7 seconds actually executing things. So we’ve been able to use the build scan to confirm there were no cache hits. And once the build cache tab is implemented, we’ll also see how much overhead the build cache has.

Because what just happened now is I executed, and I put the result into my local build cache. But what I also did is I uploaded it to the remote build cache, which is shared with my whole team. This is not normally something I do as a local user. This is more something that I would let my CI server do, because I trust it a little more to be set up correctly and not have some virus on it, and so on. You can never really trust local dev machines. So normally it would be the CI server that fetches the latest changes, builds them, and then populates the cache with that.

And then when I come to work in the morning, I just say git pull, Maven clean build, and it executes. We’ll see how long a clean test takes in the morning, after the cache is fully populated. You can see it’s kind of starting off slowly. That’s because Maven is starting a cold JVM. But then things get really nice, and I’m done after just eight seconds instead of 45. So let’s look at the build scan to confirm what we just saw, in the goal execution tab. We can see 44 goals came from the cache, and it took us about 6 and 1/2 seconds to load them. Another 34 goals were not cacheable, and they took about 1.6 seconds. So they must be relatively fast goals. We can look at them, and it turns out it’s just clean. Well, cleaning is relatively fast, although I think we can make it faster. Challenge accepted. And process-resources. Well, this project doesn’t have many resources, so it doesn’t have much to do.
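The figures Stefan reads off the two scans can be checked with a little arithmetic. The sketch below only uses numbers quoted in the talk (45.7 seconds of goal execution without the cache; 44 goals loaded from the cache in about 6.5 seconds; 34 non-cacheable goals in about 1.6 seconds); the function name and the dictionary keys are made up for the example.

```python
def summarize(cold_seconds, cached_goals, cached_seconds,
              uncached_goals, uncached_seconds):
    """Compare a cold run against a run served mostly from the cache."""
    warm = cached_seconds + uncached_seconds
    return {
        "goals": cached_goals + uncached_goals,
        "warm_goal_seconds": round(warm, 1),
        "seconds_saved": round(cold_seconds - warm, 1),
        "speedup": round(cold_seconds / warm, 1),
    }


# Numbers quoted in the demo.
FIGURES = summarize(45.7, 44, 6.5, 34, 1.6)
```

The warm goal time works out to 8.1 seconds across 78 goals, which matches the "eight seconds instead of 45" seen on the console, a saving of 37.6 seconds or roughly a 5.6x speedup for this small project.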

Etienne: Can you go back to the first scan one more time, which was actually slower? Let’s take a look at one of these goals that is actually cacheable and get an idea of how long it took to upload it to the cache.

Stefan: The result was both a local miss and a remote miss, and it was a store. We stored it locally and then we stored it remotely, and the build cache added a total overhead of about 2 seconds. The packing at the end of it took about 58 milliseconds. That’s not a lot. And the remote upload took about half a second, which is very slow. This might also be because this was the very first module that ran, so we still had a cold JVM.

If we look at things further down the line, let’s say project 8 compile, it should look a little different. Yeah, we’re getting more into the range of 0.8 seconds. So there is some overhead for sure. But the savings are just worth it.

So let’s look at one of those cache hits. Let’s look at project 8 again, because the first one is the one that is suffering from all the JVM warm-up overhead.

So it’s kind of pathological. And we see we had a local cache hit and it took us a mere 40 milliseconds to unpack it. So we could actually try and deactivate the local cache and see what would be the cost if you know, I’m pulling everything from remote like I have nothing cached locally.

So I think we’re pulling this from San Francisco right now. So this is going a long way to get here to Berlin. Normally you would have a setup where you have a distributed cache with different nodes close to different developers.

So it took 20 seconds. Still much better than the 45. But there was a cost. And this distributed cache I was talking about would then also proactively push things to the cache nodes that are closer to the developers.

So if this happened in a real setup like we have for our own build, then I wouldn’t be waiting 20 seconds, because the cache hits would be coming from Frankfurt. It wouldn’t be the eight seconds from the local cache, but it would probably be more like, I don’t know, 10 or 12, much faster than what we just saw.

So we had 19 seconds of things coming from the cache. Let’s look at project 8 compile again. And we can see it was a remote hit. The download took us about 180 milliseconds, and then another 45 milliseconds to unpack it. So much better than actually executing the compiler.

Etienne: Another nice use case for the cache, even with just the local cache, is if you switch branches, and we did that today quite a bit. So you’re in one branch, you build, and it’s cached locally even if you have the remote cache disabled. You switch branches and you come back to it later. Not necessarily on the next run, but maybe an hour later with many runs in between.

And you can still fish out from the cache what you built before.

Stefan: Exactly right. So I could go back to my initial state. So remove that method again. And I would get a cache hit again because it’s not just the last built. It’s all builds that I ever ran are potentially coming from the cache.

Of course, there’s some cleanup: we have some logic that purges your local build cache after a certain number of days, and I guess the remote build cache also has some maintenance attached to it, so it won’t grow out of bounds. But it will give you a big speedup, even if you’re changing projects, changing branches, and so on. Yeah, excellent. All right.

Etienne: So maybe before we wrap up, one more thing about the cache that could be interesting. We are connected to Gradle Enterprise in the US right now, and we’re here in Berlin.

So there’s quite some latency involved, and maybe some bandwidth constraints. But what we can do in Gradle Enterprise is set up nodes that are closer to where we are. So we could have one set up here in Germany, or at least in Europe, and that cache node is then connected to a remote cache. And what we said about the local cache going to the remote cache only if something is not available locally then also applies to cache nodes: they go to the remote cache only if necessary. And we can even pre-emptively push changes from the remote node to a much closer node.

Stefan: So if your CI server builds and pushes to one node, it can then fan out to all your other developers, so that people working in Frankfurt and Shanghai all have a hot cache.

Etienne: Exactly. Or we have people in Australia, so they’re very happy that their cache is already populated in Australia and they don’t have to fetch it once themselves first. All right.
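The lookup order described here (local cache first, then a nearby node, then the remote cache, plus pre-emptive replication) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how Develocity implements it: `CacheTier`, `lookup`, and `replicate` are hypothetical names, and entries are kept in memory instead of on disk or over the network.

```python
from typing import Dict, List, Optional


class CacheTier:
    """One cache level (hypothetical): local disk, a nearby node, or the remote cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.entries: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self.entries.get(key)

    def put(self, key: str, value: bytes) -> None:
        self.entries[key] = value


def lookup(key: str, tiers: List[CacheTier]) -> Optional[str]:
    """Return the name of the tier that served the key, or None on a miss.

    Tiers are ordered nearest first. On a hit, the entry is copied into
    every nearer tier so the next lookup stays closer to the developer.
    """
    for index, tier in enumerate(tiers):
        value = tier.get(key)
        if value is not None:
            for nearer in tiers[:index]:
                nearer.put(key, value)
            return tier.name
    return None


def replicate(key: str, source: CacheTier, targets: List[CacheTier]) -> None:
    """Pre-emptively push one entry from a source tier to other nodes."""
    value = source.get(key)
    if value is not None:
        for target in targets:
            target.put(key, value)
```

With tiers ordered `[local, node, remote]`, a first lookup of an entry that only the remote cache holds is served by the remote tier, and a second lookup of the same key is served locally.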

So we gave you a demo of build scans and a demo of the build cache, and showed how that fits into understanding the build better and using that information to make your build faster.

And we have a last question we want to ask you before we come to the summary, and that is: which of the goals in your build takes the longest, or which are the ones that take the longest? We don’t have a poll for that because it’s a free-text answer, so please just add a comment to our comment channel. And if you don’t know which one it is, please write that as well. I wouldn’t be surprised if some people or some teams don’t know what takes the longest, because it’s pretty hard to find out if you don’t have a tool that surfaces it.

Stefan: Yeah, you only have the command-line output. It’s more anecdotal, I guess: “Yes, I saw this one taking long once. I saw this time in my console.”

So, speaking for myself, I remember it being the Vaadin widgetset compiler, which was using, I think, Google Web Toolkit under the hood. That took ages, and now they have a Gradle plugin for that. And when they made that cacheable, that was a huge win.

Etienne: Nice. How did you find out?

Stefan: Back then, that was Maven land. But that team later switched to Gradle, and then they told me, “Hey, it just became cacheable,” and I was so happy. It’s great. All right. (responses to the poll showing up)

Etienne & Stefan:

So we have tests. Test goals. Right, end-to-end tests. “Not sure yet.” 70% of the full time is spent on the tests. “clean install” with tests skipped: it’s 15 minutes. Well, that is a lot. I mean, maybe they have code generation.

Otherwise this would be just compilation, and 15 minutes would be a lot of code, yeah. Wow, tests. So I guess we’re right on track with making Surefire and Failsafe cacheable first, with more coming. There will be more mojos coming for sure. Excellent: 700 modules, 5 million lines of code. Thank you, Peter.

Etienne: Yes. And by the way, just remember, we also recorded a separate webinar on CI, on how to optimize your CI, where we go into how you would set up the cache on CI so that your CI pipeline benefits the most, but also how local developers would benefit, and so on. So there is a whole science around how you can optimize your setup and your pipeline to make the best use of the cache, and how you would use scans to investigate if it’s not working as expected. All right. So what have we learned, before we give you a chance to ask some questions? We’ve learned how we can accelerate your builds with Develocity.

We’ve also shown you a bit of how you make your builds more reliable. There’s more coming: as we capture more for failure analysis, it will become even more powerful. We’ve shown you how you can debug your builds using the current data, and how you can even export the data via the export API and do your own analysis, if you are interested in that.

And the thing to keep in mind is that it works for all builds whether they’re local builds or CI builds. So when is all this coming? It’s not out yet. We’re still working on it. But it’s progressing nicely.

And so once we’re done, we’ll have build caching for the compile and test goals, and possibly even some more.

Stefan: I mean, compile and test are pretty much in the bag by now. So we actually seem to have some air left in the plan to add some more goals, and it will be interesting to see what the poll tells us. If you have more ideas besides compile and test, because those are basically a done deal, let us know and we’ll consider them.

Etienne: Yep. We will also ship build scans like I showed you, plus a few more views, currently focused on build issues and performance.

The timeline and investigating cache issues, that’s really primarily for the build master, but the sharing aspect can be really interesting for developers as well.

All right. We will also have a scan list. I didn’t show that; I can show it in a second when we get to what’s coming in the next release.

And the scan list allows you to see all the scans, and you can sort them and query them, meaning you can also filter them. For example, take the tags I showed: if you want to see just the builds on CI, you could say, give me all the builds that have the tag CI, maybe reduce it to one project, and maybe say you only want to see the failed builds of that project.

Stefan: And then you have some data to work with. Or maybe builds where a certain mojo failed, so that you can tell whether this is really a problem or whether it only happens for this one developer.

Etienne: Yep, absolutely. And then you can start fixing it. And then we also have the data analysis that you can do yourself: the export API, which is available for all the data we capture in both Gradle and Maven. And then there is what’s coming after, in the next release. Oh, and by the way, that release, 2019.1, we expect to ship within the next six to eight weeks. So we’re getting there. After that, we also want to have build caching for custom goals. So if people have created their own goals, we want to make it possible for them to cache those too.

Stefan: So first of all, of course, there’s the internal use case, but it could also be used for the standard Maven components or open-source plugins: if they are interested, they could add those caching annotations. It will most probably be annotations, because that’s just the least verbose way of adding this information. So that would be possible not just for your own custom things, but also for community modules. Absolutely, that’s right.

Etienne: And hopefully we’ll get there. I mean, that’s the dream for our cache: that everything is cacheable out of the box at some point. Then we can spend more time building software instead of waiting for it to finish building. We will also provide more Maven build insights, the things I mentioned like tests, dependencies, and so on. And those are, of course, also very interesting for developers: my test failed, I don’t know why, it runs on CI. I can share the scan with somebody else so they can help me investigate and debug. Or understanding your dependencies, which can create all kinds of odd errors downstream.

Stefan: You change a first-level dependency, and then it draws in some new version of some other one that you didn’t want.

Etienne: We will also have a build scan comparison. And we’ll have a performance dashboard.

Stefan: The comparison is actually a real killer feature for me. Just today we were trying out the build cache on the Maven project itself, and we were wondering: damn, it’s not working, it’s too slow. The most important projects, the ones with lots of code, were not being cached. And we didn’t have the comparison yet, and that really hurt.

I basically had to do it with debug logging to eventually get to the bottom of it. They have a properties file that contains the current time in milliseconds, and that, of course, completely kills reuse if every build puts a new file in there. So after removing that and making the build no longer look like it’s 1999, it was properly cached.

Etienne: Nice, nice. And we do even have ways, or we will have ways, that you can say, OK, there is a timestamp in there.

Stefan: But ignore that kind of thing, right. We could also do normalization. If you absolutely want to keep your build timestamp, we can make it so that you can tell the Gradle build cache: OK, in this file there’s a property called x, just ignore it. Well, also the Maven build cache… Well, the Gradle Maven build cache. This is going to get real interesting.
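To illustrate why a timestamp in a properties file defeats caching, and how ignoring one property restores reuse, here is a minimal sketch. It is not the product’s configuration or implementation: `normalized_hash` is a hypothetical helper, the property name `build.timestamp` is made up for the example, and the parser assumes simple `key=value` lines only.

```python
import hashlib
from typing import Iterable


def normalized_hash(properties_text: str, ignored_keys: Iterable[str]) -> str:
    """Hash a properties file while skipping the given keys.

    Assumes simple `key=value` lines; real .properties files also allow
    `:` separators, escapes, and line continuations.
    """
    ignored = set(ignored_keys)
    kept = []
    for line in properties_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, _, value = stripped.partition("=")
        if key.strip() in ignored:
            continue  # volatile property: leave it out of the hash
        kept.append(f"{key.strip()}={value.strip()}")
    # Sort so that property order does not affect the result.
    canonical = "\n".join(sorted(kept))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()
```

Two files that differ only in the ignored property then produce the same hash, so a cache keyed on that hash would see them as the same input, while a change to any other property still changes the hash.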

Etienne: So let me just finish what I said about the scan list, the build comparison, and the dashboard. Here we see the dashboard. What we’re going to build after this first release is what we already have for Gradle: a view over a large number of builds that have already happened.

You see the performance metrics, you can see trends, you can see outliers, and you can then investigate which build that was; you can even drill into that build and investigate it there. So it’s a very powerful feature that also allows you to calculate really easily how much you are actually saving. Because that’s oftentimes what you’re interested in: how much am I actually saving, is the build cache actually pulling its weight? And the other one is the build comparison that you already mentioned. Here I have opened two scans, and they are different. Compile Java ran on both, and the question is: why did it run on both, what was different? Like you said before, we can’t tell in detail yet, so we have to start digging deeper by hand. But with the build comparison that we already have in Gradle, and that we’re going to have for Maven very soon as well, you can actually see which files changed.

Stefan: I think that really is the key: not just saying, OK, the sources changed. Yeah, great, I have 10 million files. Which one? The actual file, that is so useful.
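The idea of pointing at the actual file that changed between two builds can be sketched as a diff over per-file fingerprints. This is only an illustration, assuming each build recorded a mapping from file path to content hash; `InputDiff` and `compare_inputs` are hypothetical names, not part of any product API.

```python
from typing import Dict, List, NamedTuple


class InputDiff(NamedTuple):
    added: List[str]
    removed: List[str]
    changed: List[str]


def compare_inputs(first: Dict[str, str], second: Dict[str, str]) -> InputDiff:
    """Compare two {path: content hash} maps captured for the same goal."""
    added = sorted(second.keys() - first.keys())
    removed = sorted(first.keys() - second.keys())
    changed = sorted(
        path for path in first.keys() & second.keys() if first[path] != second[path]
    )
    return InputDiff(added, removed, changed)
```

If both builds share thousands of identical entries and one file’s hash differs, `changed` contains just that one path, which is the answer to “which one?”.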

Etienne: Yes, great. And the last thing is the scan list. Like I mentioned, here you see all the scans, and then you can start restricting it. You can say, I’m only interested in the CI builds, or I’m only interested in a certain project, or I want to see only failed builds, whatever, also constraining it time-wise, like how far back in time you want to go. You can, for example, say: last week we had “x” failed builds on that project on CI, and maybe compare that with how many local builds failed for the same project, and start thinking about, OK.

Stefan: How can you make this better? So you can actually track your progress. Also, you know, being a build master can sometimes be a tough job, because you get all the blame when things fail, but you rarely get the credit when you make things really awesome and fast. But with Gradle Enterprise you actually do, because you can just show: hey, look, last week our average build time was this slow and this week it’s this fast, or last week we had this many failures and this week they’re gone. That actually gives you the validation that you’re doing a great job.
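The kind of query described for the scan list (by tag, project, outcome, and time window) can be sketched as a simple filter over scan records. This is a minimal illustration with made-up names: the `Scan` record and `filter_scans` function are hypothetical and far simpler than real build scan data.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Scan:
    """A hypothetical, simplified scan record; real scans carry far more data."""
    project: str
    tags: FrozenSet[str]
    failed: bool
    started: datetime


def filter_scans(
    scans: Iterable[Scan],
    project: Optional[str] = None,
    tag: Optional[str] = None,
    failed: Optional[bool] = None,
    since: Optional[datetime] = None,
    until: Optional[datetime] = None,
) -> List[Scan]:
    """Keep only scans matching every criterion that was given."""
    result = []
    for scan in scans:
        if project is not None and scan.project != project:
            continue
        if tag is not None and tag not in scan.tags:
            continue
        if failed is not None and scan.failed != failed:
            continue
        if since is not None and scan.started < since:
            continue
        if until is not None and scan.started >= until:
            continue
        result.append(scan)
    return result
```

Counting `filter_scans(scans, project="app", tag="CI", failed=True, since=..., until=...)` for last week and again for this week gives the before-and-after failure numbers mentioned above.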

Etienne: Yes, and that gets us to the last point: if there are any questions, feel free to ask them now, and we’ll try to address them.

Stefan: I’ll just read a few off my screen here. “Wow, this is fancy.” Thank you. “Do you have any way to see the classpath of an execution?” We will. “Can you break down the time per component built, if you have multiple components?” Yes, we can break down the time per project.

Etienne: Yes, that’s actually something that is still coming for this first release: different kinds of breakdowns. For what we showed on the goal execution tab, we have two views for breaking it down. We’ll have more: the one per project that you already have at the beginning, by phase, or whatever makes the most sense.

Stefan: I have one question: “Gradle Enterprise uses caching at the task level, as far as I understand. How will the same task-level caching be achieved in Maven?

Also, how will compile avoidance and incremental compilation be achieved with Gradle Enterprise and Maven?” So we hook in at the mojo execution level, which is the Maven equivalent of a task. We wrap the mojo, look at all its inputs, calculate a hash from them, and then look in the cache.

Have we seen this before? If yes, we unpack it into the output directories and it’s done. We don’t need to call the actual mojo. So our extension just replaces a Maven service that is responsible for executing mojos, and instead we say: well, let’s try caching it first.
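The flow just described (hash the inputs, look in the cache, unpack on a hit, otherwise run the mojo and store the result) can be sketched as follows. This is a simplified illustration, not the extension’s code: `cache_key` and `run_with_cache` are hypothetical names, inputs and outputs are in-memory byte strings instead of files, and the cache is a plain dictionary.

```python
import hashlib
from typing import Callable, Dict, Tuple

Files = Dict[str, bytes]


def cache_key(goal: str, inputs: Files) -> str:
    """Hash the goal name plus every input file, in a stable order."""
    digest = hashlib.md5()
    digest.update(goal.encode("utf-8"))
    for name in sorted(inputs):
        digest.update(b"\0")
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        # Hash each file separately so contents cannot blur into names.
        digest.update(hashlib.md5(inputs[name]).digest())
    return digest.hexdigest()


def run_with_cache(
    goal: str,
    inputs: Files,
    execute: Callable[[Files], Files],
    cache: Dict[str, Files],
) -> Tuple[Files, bool]:
    """Return (outputs, served_from_cache) for one goal execution."""
    key = cache_key(goal, inputs)
    if key in cache:
        # Seen before: reuse the stored outputs and skip the real work.
        return cache[key], True
    outputs = execute(inputs)
    cache[key] = outputs
    return outputs, False
```

Because the key depends only on the inputs, reverting a change and building again produces the earlier key and therefore a hit, which matches the branch-switching behaviour discussed earlier in the session.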

Stefan: On dependencies, will you also show if resolution failed?

Etienne: We do show failed dependency resolution, yes. OK.

Stefan: Show to build again from the rehearsal I did. Thank you for suggesting that. “Does Gradle Enterprise for Maven also feature a daemon process?” No, it does not. Although that is something that we’re thinking about for the longer term, because you might have noticed that, especially at the beginning of the build, things were really slow, and that’s the JVM warming up. Actually, I’ve done a lot of profiling lately to reduce the overhead of the cache, and when I look at my profiles now, the CPU samples are 70% just-in-time compilation and 30% actually doing stuff. Of course, this is happening on different cores, so if you’re doing a single-threaded build it’s not really slowing down builds per se, but it could be so much faster if that code was just hot from the beginning, if we didn’t have to go through the interpreter. So yes, this is a very interesting idea. But not yet widely used.

Etienne: There are actually quite a few concepts we’d like to have in Maven that we can’t do yet, that would allow us to make things even faster.

Stefan: We would also love to have incremental builds and have the Gradle incremental Java compiler in Maven, which is actually safe, as opposed to Maven’s incremental compiler, which, well, takes too many shortcuts, basically. But supporting a proper incremental compiler in Maven is actually quite tough, because in Maven a lot of mojos write to the same directory. So figuring out what is supposed to be there and what is stale, what somebody else wrote that should still be there, or what I wrote that should no longer be there, and so on, that’s quite tough. So it’s not going to happen right away. Maybe someday, we’ll see. But, you know, lots of ideas, definitely.

“Why use a cryptographic hash instead of something like Murmur?” Because it’s more important to avoid collisions: you really don’t want two different projects to hash to the same key. We want this to be as unlikely as possible. So, I mean, yes, there will be collisions, but your project would have to be basically garbage data in order to collide with a real project. We never want two real projects to collide. We’ve tested other, non-cryptographic hashes, and unfortunately they did sometimes collide for two different projects that were realistic. So yeah.

Etienne: We have to use a cryptographic hash to be sure. It was quite a systematic, almost academic approach: we investigated the different options in terms of speed and reliability, that is, avoiding collisions.

Stefan: Exactly. So we did try Murmur, for instance, and many others. But we did settle; I think by now we use MD5, which is fast enough but avoids collisions as far as possible. “How should distributed teams use the remote cache?” I think you already answered that with the pushing to different nodes.
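One part of the collision argument can be put into numbers with the standard birthday approximation, p ≈ 1 − exp(−n(n−1) / (2·2^b)) for n keys and a b-bit hash. The sketch below assumes an ideal, uniformly distributed hash, so it only shows the effect of output size (MD5 produces 128 bits; Murmur exists in 32-bit and 128-bit variants). It does not model the weaker distribution on realistic inputs that the speakers say they observed with non-cryptographic hashes. `collision_probability` is a hypothetical helper name.

```python
import math


def collision_probability(num_keys: int, hash_bits: int) -> float:
    """Approximate chance that at least two of num_keys collide.

    Uses the birthday approximation and assumes an ideal hash whose
    outputs are uniformly distributed over 2**hash_bits values.
    """
    space = 2.0 ** hash_bits
    exponent = -num_keys * (num_keys - 1) / (2.0 * space)
    # -expm1(x) equals 1 - e**x and stays accurate for tiny probabilities.
    return -math.expm1(exponent)
```

Under these assumptions, a million cache keys are practically certain to contain a collision with a 32-bit hash, while with 128 bits the probability is vanishingly small.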

Etienne: Using multiple nodes, where you talk to the nearest node, possibly even enabling the pre-emptive replication.

Stefan: “Is enabling Gradle Enterprise for Maven as simple as adding a plugin to the POM?” Pretty much. There’s a file called .mvn/extensions.xml, because the cache and also the scans are a so-called Maven core extension: they enhance or replace some core Maven services. But yes, it is as simple as adding a little snippet to your project. And, of course, having a Gradle Enterprise license for the cache.

Etienne: So we’re not a traditional plugin, because that would be too late; it’s an extension.

Stefan: Well, yeah. And plugins are also project-scoped, and we want to be able to hook in as soon as Maven starts, not just for some project. That’s why it’s a core extension. “Will it work with the public scan site?”

Etienne: Yes, that’s a good question. And yes, it will. For Gradle, we already have a public place where you can publish scans to, which is called scans.gradle.com; that’s actually an external URL. So for any Gradle build you run, if you apply the build scan plugin or you say --scan, it publishes the scan there.

Nobody can see it except yourself, unless they know that URL. It’s a very random URL, so it’s pretty much impossible to guess. And we will offer the same for free for Maven. So if you run a Maven build and you have the extension enabled, we will publish a scan by default to scans.gradle.com, and you will see exactly what we’ve shown today and what we’re singling out for the first two years.

What you don’t get on scans.gradle.com is the caching. The caching is a pure Gradle Enterprise feature, in the Develocity you install on-premises, as are the scan list and the comparison. But anything about a single scan you can use. You can share it with others, which is good. You can post a link in a forum saying, “I cannot run my build, can you please help me,” and post the build scan from your Maven build there to be reviewed. That’s all going to be available there.

Stefan: And the other part of the question was: will it be able to work with Artifactory as a build cache node?

Etienne: It will not work with Artifactory; it will be really optimized to work with the Develocity build cache.

Stefan: “Can you fix the shitty Maven way (trademark)?” No, we’re really going to work with how Maven works. We’re going to accept the way it works. There’s good and bad about it, and some things, especially its restrictive nature, actually make certain things easier for us. With the caching, for example, there are way fewer things that you can do wrong. So, you know, it’s not all bad.

Stefan: What is the price of Gradle Enterprise?

Etienne: So that’s a question for sales. But we’re happy to give you a personal demo; we’ll see that on the last slide.

We’re also going to do a training, a real training that goes deeper, and there those questions can be asked, or in contact with sales. We want to make sure that we have a pretty standardized process so that you get the most out of using Develocity, and we help you see the difference between before and after. So it’s really worth taking that into account.

Stefan: And the last question I have here is: “Can you compare Develocity caching with the Takari lifecycle plugin?” So the Takari lifecycle plugin, for those who don’t know it, is basically a drop-in replacement for the compiler, Surefire, JAR, and so on: it replaces all those plugins with one that has been optimized, and I think it also comes with its own incremental compiler.

So it doesn’t do caching. It does some incremental compilation, although, given how Maven works, I’m very skeptical that it’s actually correct. I haven’t tried it out. But given all the overlapping outputs, it’s actually very unlikely, unless they put a major, major engineering effort into it.

And yeah, it won’t do caching. Even if it does incremental builds, as soon as you switch branches, or you make a change and go back to a previous change, you don’t get the hits. There’s no remote cache, so you don’t get all the things that your CI server already built for you. Or, one thing that we didn’t talk about: CI servers benefiting in pipelines.

You know, there’s one stage in your pipeline where you compile everything. And then maybe you have 10 or 20 parallel test builds, or, you know, that one person who has 700 modules probably has 700 parallel test builds.

And they all don’t have to recompile the code, and you also don’t have to jump through any hoops, like, “Oh, now I’m going to copy the output from that stage to that stage.” No, the build cache just takes care of it. Everything just comes from the cache, except the things that you haven’t run yet.

Which saves a huge amount of time. So all of these things are not really handled by the Takari lifecycle plugin. It is probably an improvement over vanilla Maven. We might be able to make it cacheable and, you know, make it even better. So yeah, it’s kind of orthogonal, I would say.

Oh, those were all the questions we have. Thanks also.

Let me go to the last slide.

Etienne:

So, we’ll send everybody the video of the recording, for those who weren’t able to watch it or want to share it with somebody else. Like I said, if you want to have a personal demo, you can also get in touch with us. And we’ll do a live training on using Develocity for Maven in the March timeframe. You can sign up, and we’ll go into more detail to show you how you enable the extension (by then the release, 2019.1, will be out) and how you configure the cache. We didn’t show any kind of configuration today. I’ll show you how you add custom values, all those things, and how you work with the data; that will all be covered in that training. So that was it from our side.

Stefan: All right.

Thank you, Etienne

Etienne:

Thanks, everyone. All right. We’ll see you next time.
