Taboola Blog
Engineering
Breaking the Scale Barrier – Smarter Test Selection

Posted by Shirel Hadad · Aug 07 · 4 Minutes read

A build process is a critical procedure triggered by new code changes (a Git push). Its primary responsibility is to validate that the new code integrates without breaking the existing code and that all unit tests pass.

However, as the number of unit tests grows, running all of them on every change becomes impractical within the available time and resources, which makes it increasingly hard to keep the code stable.

In this article, we will explore the optimizations we have implemented to achieve a faster and more efficient build process.

The build process challenge

Our development process revolves around a monorepo, a single repository that houses all our code and is deployed to production daily. Optimized build strategies give developers rapid feedback without sacrificing productivity. Fast builds, integrated into our monorepo workflow, let us identify and address issues quickly, so that only validated and reliable code reaches production. This lets our developers work efficiently and deliver high-quality code with confidence.

A build process starts once Git commits are pushed. Every code change must be validated against the project’s test suite. The suite started with a few hundred unit tests and has since grown to approximately 13,000. As the number of unit tests increased, we needed a more efficient way to determine which tests must be rerun, instead of executing all of them indiscriminately.

The build process is divided into two main types:

Build of the Main branch.
Build of a Feature Branch (any branch other than the Main branch).

For the Main branch, the process remains straightforward. When a pull request (PR) is merged into the Main branch, it triggers a build that runs all the existing unit tests as a precautionary measure.

For Feature Branches, a new build starts whenever changes are pushed to Git. In this scenario, the tests to run are selected based on the files changed in the new commits.

The tests-to-run are divided into batches, which are then executed in parallel on many machines. The selection of unit tests for each batch is based on their duration, ensuring the optimal allocation of resources and minimizing the overall time required to complete the unit test runs.
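As a rough illustration of duration-based batching, here is a minimal sketch of a greedy "longest test first" split. The article does not say which algorithm is actually used, so the strategy, the class name TestBatcher, and the assumption that a recorded duration is available for every test are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper: splits tests into batches with similar total duration.
final class TestBatcher {
    /** One batch of tests plus the sum of their recorded durations. */
    static final class Batch {
        final List<String> tests = new ArrayList<>();
        long totalMillis = 0;
    }

    /**
     * Greedy assignment: visit tests from longest to shortest and always add
     * the next test to the batch with the smallest total so far.
     */
    static List<Batch> split(Map<String, Long> durationMillisByTest, int batchCount) {
        if (batchCount < 1) {
            throw new IllegalArgumentException("batchCount must be >= 1");
        }
        PriorityQueue<Batch> lightestFirst =
                new PriorityQueue<>(Comparator.comparingLong((Batch b) -> b.totalMillis));
        List<Batch> batches = new ArrayList<>();
        for (int i = 0; i < batchCount; i++) {
            Batch batch = new Batch();
            batches.add(batch);
            lightestFirst.add(batch);
        }
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(durationMillisByTest.entrySet());
        sorted.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        for (Map.Entry<String, Long> entry : sorted) {
            // Remove, update, and re-add so the queue ordering stays valid.
            Batch lightest = lightestFirst.poll();
            lightest.tests.add(entry.getKey());
            lightest.totalMillis += entry.getValue();
            lightestFirst.add(lightest);
        }
        return batches;
    }
}
```

With one batch per machine, the slowest batch determines the wall-clock time of the test phase, which is why the sketch tries to keep batch totals close to each other.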

In the past, the unit test selection was done at the module level. If a change was detected in a file within Module A, all tests from modules dependent on Module A were executed. However, as the number of tests continued to grow, necessitating more resources, we recognized the need for process optimization.

Our solution, known in Taboola as the Achilles Project, selects unit tests at a finer resolution: the class level. Rather than considering the module to which a Java class belongs, we now look at the unit tests from all modules that depend on the modified class and execute only those tests. This has cut about 3 minutes from an average build time of 23 minutes.

The solution:

The Achilles Project comprises two parts: data collection and determining the tests to run during the build process.

For Main builds we apply only the first part: we collect the test data and still run all the tests. The Main data serves as a baseline.

Data Collection

To collect the necessary data, we use a Java Agent. A Java Agent is a specially crafted JAR file that uses the Instrumentation API provided by the JVM, which can modify bytecode as it is loaded and inspect the classes the JVM has loaded. The agent lets us collect the classes that are loaded while unit tests run.
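To make the agent idea concrete, here is a minimal sketch of an agent class that keeps the Instrumentation handle passed to premain and lists loaded classes under a package prefix. The class name LoadedClassAgent, the prefix filter, and the helper methods are assumptions for illustration; the article does not show the real agent.

```java
import java.lang.instrument.Instrumentation;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical agent class. In a real agent JAR this would typically be a
// public class in its own file, named by the Premain-Class manifest attribute.
final class LoadedClassAgent {
    private static volatile Instrumentation instrumentation;

    /** Called by the JVM before main() when started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        instrumentation = inst;
    }

    /** Names of currently loaded classes whose name starts with the prefix. */
    static Set<String> loadedClassNames(String packagePrefix) {
        if (instrumentation == null) {
            throw new IllegalStateException("agent was not attached");
        }
        return filterByPrefix(instrumentation.getAllLoadedClasses(), packagePrefix);
    }

    /** Pure helper, kept separate so it can be exercised without an agent. */
    static Set<String> filterByPrefix(Class<?>[] classes, String packagePrefix) {
        Set<String> names = new TreeSet<>();
        for (Class<?> loaded : classes) {
            if (loaded.getName().startsWith(packagePrefix)) {
                names.add(loaded.getName());
            }
        }
        return names;
    }
}
```

Filtering by prefix is one way to ignore JDK and third-party classes, on the assumption that only the project's own classes matter for test selection.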

Then, we implemented a listener that extends org.junit.runner.notification.RunListener, which listens to each running test. Once a test finishes, the listener utilizes Java Instrumentation to collect the loaded classes for that test. The data is then saved as a map of tests to loaded classes.
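The per-test bookkeeping could look roughly like the sketch below. To stay dependency-free it does not extend RunListener; its testStarted and testFinished methods mirror the RunListener hooks of the same names but take a plain test name. The snapshot-and-diff strategy, the class name PerTestClassRecorder, and the Supplier that stands in for the agent are all assumptions, since the article does not describe how classes are attributed to a test.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

// Hypothetical recorder: attributes to each test the classes that were
// loaded between its start and its finish.
final class PerTestClassRecorder {
    private final Supplier<Set<String>> loadedClasses; // e.g. backed by a Java agent
    private final Map<String, Set<String>> classesByTest = new LinkedHashMap<>();
    private Set<String> snapshotAtStart = new HashSet<>();

    PerTestClassRecorder(Supplier<Set<String>> loadedClasses) {
        this.loadedClasses = loadedClasses;
    }

    /** Would be called from the listener's testStarted hook. */
    void testStarted(String testName) {
        snapshotAtStart = new HashSet<>(loadedClasses.get());
    }

    /** Would be called from the listener's testFinished hook. */
    void testFinished(String testName) {
        Set<String> newlyLoaded = new TreeSet<>(loadedClasses.get());
        newlyLoaded.removeAll(snapshotAtStart);
        classesByTest.put(testName, newlyLoaded);
    }

    /** The collected map of test name to loaded classes. */
    Map<String, Set<String>> classesByTest() {
        return classesByTest;
    }
}
```

One limitation of this simple diff: a class already loaded by an earlier test in the same JVM will not show up for later tests that also use it, so a production implementation would need a stronger attribution mechanism.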

After running all the tests, we gather the test data (a map of test to loaded classes) and invert it into a map of class to the tests that should run when that class changes. This information is stored as a JSON file in our storage layer (HDFS in our case). We can then select which tests to run based on the data stored in HDFS, instead of executing all tests.
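The transformation from "test to loaded classes" into "class to tests" is a plain map inversion. A minimal sketch, assuming both sides are identified by simple string names (the class name TestIndex is hypothetical):

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper that inverts the collected data.
final class TestIndex {
    /** Turns test -> loaded classes into class -> tests that loaded it. */
    static Map<String, Set<String>> invert(Map<String, Set<String>> classesByTest) {
        Map<String, Set<String>> testsByClass = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : classesByTest.entrySet()) {
            String testName = entry.getKey();
            for (String className : entry.getValue()) {
                testsByClass.computeIfAbsent(className, key -> new TreeSet<>()).add(testName);
            }
        }
        return testsByClass;
    }
}
```

Sorted maps and sets are used here only so that a serialized record comes out in a stable order; that choice is not taken from the article.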

Determining the Tests to Run

During any Feature Branch build, a POST request is sent to the Achilles server with three parameters: the current branch name, the changed files, and the closest Main commit.

The Achilles server retrieves the data previously saved for the branch and the data for the given Main commit. Each record is a JSON file containing a map from class name to the list of tests to run.

Achilles takes the data from the two records, merges it, and returns the tests-to-run according to the given changed files.
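The merge-and-select step can be sketched as follows. The article does not spell out the merge semantics, so this sketch assumes a union (tests recorded for the same class in either record are combined). It also assumes changed files follow a Maven-style layout so that a path can be mapped to a class name. TestSelector and its methods are hypothetical names.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical selector: merges two class -> tests records and picks the
// tests associated with the changed files.
final class TestSelector {
    /** Union of both records; an assumption, not the documented behavior. */
    static Map<String, Set<String>> merge(Map<String, Set<String>> mainRecord,
                                          Map<String, Set<String>> branchRecord) {
        Map<String, Set<String>> merged = new HashMap<>();
        for (Map<String, Set<String>> record : List.of(mainRecord, branchRecord)) {
            record.forEach((className, tests) ->
                    merged.computeIfAbsent(className, key -> new TreeSet<>()).addAll(tests));
        }
        return merged;
    }

    /** Maps e.g. "core/src/main/java/com/example/Foo.java" to "com.example.Foo". */
    static String toClassName(String path) {
        String marker = "/java/"; // assumed Maven-style source root
        int start = path.indexOf(marker);
        String relative = start >= 0 ? path.substring(start + marker.length()) : path;
        if (relative.endsWith(".java")) {
            relative = relative.substring(0, relative.length() - ".java".length());
        }
        return relative.replace('/', '.');
    }

    /** Tests to run for the changed files; files without data add nothing here. */
    static Set<String> select(Map<String, Set<String>> merged, Collection<String> changedFiles) {
        Set<String> testsToRun = new TreeSet<>();
        for (String file : changedFiles) {
            testsToRun.addAll(merged.getOrDefault(toClassName(file), Set.of()));
        }
        return testsToRun;
    }
}
```

How to treat a changed file that has no recorded data (for example, a brand-new class) is not covered in the article; a cautious implementation might fall back to a broader selection in that case.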

Savings

As of this writing, Achilles avoids 55.3% of the tests relative to the module-level solution, roughly 7,150 of 13,000 unit tests.

In terms of time, it saves us about 3 minutes from a build time of about 23 minutes, a significant improvement.

Note: Throughout this article, we have focused on Java classes. However, the Achilles Project can also track non-Java files using the Java ClassLoader. This lets us handle changes and tests associated with file types other than Java classes.

Credits: Alon Pilberg – Project lead, and Maria Saleh Naser.
