Developer Productivity Engineering Blog

How Coursera Reduced Scala Compilation Times By 125 Hours a Week

With Hydra, we gain 125 hours of development time otherwise lost to long compilation times every week, and countless more on our Continuous Integration infrastructure.

-David Guo, Dev Infrastructure Engineer at Coursera

Coursera is the leading online educational platform, empowering more than 35 million learners across the world to access over 2600 high-quality courses from top educational institutions. The Coursera platform is the central component that enables teachers to share their knowledge, students to uplevel and enhance their skills, and enterprises to continuously train their workforce.

Scala and Play for 80 microservices

Coursera’s platform backend is written entirely in Scala and split into many services. It relies on Play Framework for its scaling performance. Today, there are more than 80 microservices built in Scala. The persistence layer uses Apache Cassandra, AWS Aurora, and MySQL. The front-end stack is based on ReactJS and GraphQL. Mobile development is native: Kotlin for Android and Swift for iOS.

Today, Coursera employs 90 developers, about 60 of whom work on the Scala backend, and the engineering team will continue to grow over the coming years. Our workflow involves careful code reviews and pull requests that are automatically validated on our CI cluster. We regularly see over 20 pull requests per day, with occasional spikes. To handle this load, we automatically scale our AWS cluster up and down based on demand. Moreover, developer machines have to remain available to each engineer.

Speed Up Compiling 2 Million LoC

All backend services live in the same repository (a mono-repo) with almost 2 million lines of Scala code. Although each service is relatively small, compiling the whole backend repository takes about 30 minutes with the vanilla Scala compiler. As these long compile times became unbearable, David Guo, Dev Infrastructure Engineer at Coursera, started to look for a way to alleviate the problem.

We didn’t have to change a single line of code. It just worked!

-David Guo, Dev Infrastructure Engineer at Coursera

Guo evaluated Hydra in August 2017 and was quickly convinced by its immediate and reliable effect on compilation times. Once enabled, it got out of the way and let the engineers focus on the task at hand, reassured that compilation time was under control. Coursera’s development team didn’t have to change a single line of code. Today, with Hydra in place, Coursera is saving 125 hours of development time every week, and countless more on its CI infrastructure.
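To put the weekly figure in perspective, here is a rough back-of-envelope calculation. It is only a sketch: it assumes the 125 hours are spread evenly across the roughly 60 backend developers mentioned earlier and that a working week has five days. Neither assumption comes from Coursera's own measurements.

```python
# Figures quoted in this article.
HOURS_SAVED_PER_WEEK = 125
BACKEND_DEVELOPERS = 60  # the article says "about 60", so results are approximate

# Assumption (not from the article): a five-day working week.
WORKING_DAYS_PER_WEEK = 5


def minutes_saved_per_developer_per_day(hours_per_week, developers, days_per_week):
    """Spread a weekly saving evenly over developers and working days."""
    minutes_per_week = hours_per_week * 60
    return minutes_per_week / developers / days_per_week


per_day = minutes_saved_per_developer_per_day(
    HOURS_SAVED_PER_WEEK, BACKEND_DEVELOPERS, WORKING_DAYS_PER_WEEK
)
print(f"About {per_day:.0f} minutes saved per backend developer per day")
```

Under these assumptions, the saving works out to about 25 minutes per backend developer per working day.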

Guo had evaluated a few alternatives, including the Bazel build tool, but found that at the time these tools were either immature or required non-trivial rewrites. By comparison, Hydra integrated effortlessly, with a setup time of around 5 minutes. In addition, Coursera benefits from the Hydra Dashboard, which lets the team visualize and drill down into compilation metrics to keep everything under control.

When asked if Hydra was of long-term value to the Coursera team, Guo replied: “We adopted Hydra in August 2017, and we are not looking back.”