Improve build & test reliability with Failure Analytics

Unreliable builds cause downtime, waste compute resources, and are a massive distraction. They also negatively affect the quality of the code that is shipped. Builds become unreliable when problems are too expensive to find or too hard to reproduce for root cause analysis, and when fixes cannot be correctly prioritized because their relative impact is unknown. With Develocity you can leverage analytics to proactively find unreliable builds and tests, learn how many people and environments are affected by each problem, share information about it, and understand the root cause efficiently.

The Benefits of Reliable Builds and Tests

Reliable builds and tests ensure that:

End-users and customers have a great experience with better quality and faster updates

Developers have confidence in the test suites, which encourages a culture of accountability and good behavior (like building earlier and more often and fixing broken tests)

Management improves metric and KPI outcomes used to drive business success, such as productivity and efficiency, quality of service, and speed of delivery

Failure Analytics solution components include Test Failure Analytics for flaky and other problematic test failures and Build Failure Analytics for avoidable local and CI failures.

Using Test Failure Analytics for Flaky Test Management

A flaky test is a non-deterministic test: one that produces both a “passing” and a “failing” result for the same code. Develocity detects flakiness within a single build by re-running failed tests and across builds by inspecting test inputs. When a failed test succeeds on retry, it is classified as flaky, but it must still be remedied.
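The single-build retry rule described above can be sketched in a few lines. This is only an illustration of the idea, not Develocity's implementation: the function name `classify_with_retry`, the `max_retries` parameter, and the three outcome labels are assumptions made for the example.

```python
from typing import Callable


def classify_with_retry(run_test: Callable[[], bool], max_retries: int = 2) -> str:
    """Classify one test within one build as 'passed', 'flaky', or 'failed'.

    run_test is a hypothetical callable that executes the test once and
    returns True when it passes.
    """
    if run_test():
        return "passed"
    for _ in range(max_retries):
        if run_test():
            # Failed first, then passed with no code change: non-deterministic.
            return "flaky"
    # Failed on the first run and on every retry.
    return "failed"


# Usage: simulate a test that fails once and then passes.
outcomes = iter([False, True])
print(classify_with_retry(lambda: next(outcomes)))  # flaky
```

Note that a "flaky" result here still marks a problem to fix; the retry only separates non-deterministic failures from consistent ones.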

Flakiness is not necessarily rooted in the test; it can also indicate flaky behavior in your development environment and production code. Without a remedy, you shift the consequences of unreliable code to users and customers, and flakiness accumulates, which increases build and test time and wastes valuable compute resources. A less obvious consequence is the degree to which flaky tests poison your culture by reducing confidence in your test suite. This may result in engineers building less often and paying less attention to writing tests.

To address these pains, Develocity provides Flaky Test Management to help you proactively detect flakiness in your application and tests and to prioritize which flaky tests to fix first.

Specifically, with Develocity you can:

Detect flaky tests reliably for all Maven, Gradle, and Bazel builds, including local ones, by using a retry mechanism and Build Scan®.

Prioritize flaky test remediation based on frequency of occurrence, recency, and impact for a custom set of builds.

Observe flaky test history to measure how their negative impact is increasing or decreasing over time.

Fix flaky tests efficiently using data reported by the Test Failure Dashboard and associated Build Scan®. This includes test history across all local and CI builds, common traits among flaky runs, differences when compared to stable runs, and test methods used for specific runs.
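To make the prioritization criteria in the list above concrete (frequency of occurrence, recency, and impact), here is a minimal sketch of one way such a ranking could be computed. The `FlakyTestStats` record, its field names, the scoring formula, and the seven-day half-life are all hypothetical choices for illustration; they are not taken from Develocity.

```python
from dataclasses import dataclass


@dataclass
class FlakyTestStats:
    # Hypothetical per-test summary over a chosen set of builds.
    name: str
    flaky_runs: int            # runs in which the test was flaky
    total_runs: int            # all runs of the test
    days_since_last_flaky: int
    affected_builds: int       # builds disrupted by the flakiness


def priority_score(stats: FlakyTestStats, half_life_days: float = 7.0) -> float:
    """Combine frequency, recency, and impact into one number (assumed formula)."""
    if stats.total_runs == 0:
        return 0.0
    frequency = stats.flaky_runs / stats.total_runs
    # Recency weight halves every half_life_days since the last flaky run.
    recency = 0.5 ** (stats.days_since_last_flaky / half_life_days)
    return frequency * recency * stats.affected_builds


def rank_flaky_tests(tests):
    """Return tests ordered from most to least urgent."""
    return sorted(tests, key=priority_score, reverse=True)
```

A multiplicative score like this pushes a test down the list when any one factor is low, for example a test that was very flaky but has not flaked in weeks. Other weightings are equally plausible.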

Test Dashboard with Flaky Test Detection Enabled

Develocity screenshots of a Test Dashboard analyzing the most disruptive flaky tests.

Prioritize the most disruptive flaky tests, and use trend analysis to pinpoint the root causes.

Using Test Failure Analytics for Other Problematic Test Failures

Test failures are the most common cause of build failures, and many test failures are avoidable. Further, not all avoidable test failures are caused by flaky tests (e.g., unstable environments also cause a large number of test failures).

Test Failure Analytics addresses the systemic issues that result when these test failures are left unaddressed and allowed to fester. Left alone, they lead to compounding test suite failures and a lack of developer motivation to address the problems because it has become too difficult to determine where to start.

What’s needed is an easy way to triage failed tests: to separate healthy tests that detected a real bug in production code from flaky tests and from completely broken tests that fail frequently and require immediate attention. Test Failure Analytics allows you to:

Determine priorities for remediation by generating a list of the most frequently failing test cases, grouped by class and package.

Get a timeline view of when particular tests started to fail (the “before” fix picture), which establishes your baseline level of test failure volume and can help in root cause determination.

Streamline root cause analysis with a listing of build environment metadata for failed test executions.

View performance characteristics of failed test executions compared to passing ones.

Verify that a new, lower test failure frequency baseline has been established, and monitor it on an ongoing basis with an extended timeline view (the “after” fix picture).

Analyze test failures by Gradle task or Maven goal with a dashboard of Build & Test Failure Trends for local and CI builds.
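The first item in the list above, grouping failing test cases by class or package and ordering them by failure count, is simple to picture in code. The sketch below assumes failed tests are identified by fully qualified names such as `com.example.OrderTest.testCheckout`; that ID format and the function name `failures_by_group` are assumptions for the example, not a Develocity data format.

```python
from collections import Counter


def failures_by_group(failed_test_ids, level="class"):
    """Count failures per class or per package, most frequent first.

    failed_test_ids: iterable of 'package.Class.method' strings (assumed format),
    one entry per failed test execution.
    """
    counts = Counter()
    for test_id in failed_test_ids:
        # Drop the method name to get 'package.Class'.
        class_name = test_id.rpartition(".")[0]
        if level == "package":
            # Drop the class name as well to get 'package'.
            key = class_name.rpartition(".")[0]
        else:
            key = class_name
        counts[key] += 1
    return counts.most_common()
```

Feeding this function one entry per failed execution across many builds gives a rough "where to start" list at either granularity.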

Test Failure Dashboard

Develocity screenshots of test failure and performance trends.

Slice and dice your build performance and test data using dynamic filtering.

Build Failure Analytics

Build Failure Analytics is used to avoid local and CI failures. Specifically, it is used to detect, group, and prioritize avoidable build failures for remediation based on impact, and to determine the root cause of an incident faster. It does this by providing an intuitive dashboard of build and CI failure metrics, including the ability to:

Distinguish non-verification (a.k.a. avoidable) build failures from verification failures (e.g., JUnit, Checkstyle).

Fix non-verification failures efficiently using data reported by the Failure Dashboard and associated Build Scan®. This includes failure history across all local and CI builds, common traits among failed runs, and differences when compared to successful runs.

Prioritize remediation for failures based on the frequency of occurrence and impact

Search build failures using any part of an error message (as you would on Stack Overflow)

Deliver a non-verification failure history to measure how the negative impact of failures is decreasing or increasing over time
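Two of the capabilities listed above, separating verification from non-verification failures and searching failures by a fragment of the error message, can be illustrated with a small sketch. Everything here is hypothetical: the `BuildFailure` record, the marker list, and the naive rule of matching on the failed task's name are stand-ins for illustration and do not describe how Develocity classifies failures.

```python
from dataclasses import dataclass

# Hypothetical markers: task names that start with these run verification tooling.
VERIFICATION_MARKERS = ("test", "checkstyle")


@dataclass
class BuildFailure:
    build_id: str
    failed_task: str  # e.g. ":app:test" or ":app:compileJava"
    message: str


def is_verification_failure(failure: BuildFailure) -> bool:
    """Naive rule: the last segment of the task path starts with a known marker."""
    task_name = failure.failed_task.rsplit(":", 1)[-1].lower()
    return task_name.startswith(VERIFICATION_MARKERS)


def search_failures(failures, query: str):
    """Case-insensitive substring search over error messages."""
    needle = query.lower()
    return [f for f in failures if needle in f.message.lower()]
```

A name-based rule like this is easy to fool (a task named `testClasses` would match, for instance), so treat it as a way to picture the distinction rather than a usable classifier.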

Build & CI Failures Dashboard

Develocity screenshots of a Failure Dashboard analyzing unexpected build failures.

“Am I the only one seeing this failure?” Search Maven and Gradle build failures as you would on Stack Overflow, or identify common yet disruptive build failures using Develocity’s advanced failure analysis features.

Get Started with Develocity

Request a Test Drive