This is the fourth post of twelve for Mixmax Advent 2020.
About a year ago, we introduced a suite of end-to-end Puppeteer-based tests to our CI pipeline, in order to automate away the manual smoke testing that was required before changes could be deployed to production. In its infancy, this test suite was notoriously flaky, to the point where engineers were resigned to restarting the test suite a couple of times a day to get the tests to pass on a second attempt (about 1 in every 7 test runs failed for one reason or another). In the past year, we've reduced the rate of these spurious failures by a significant margin (roughly 1 in every 40 test runs fails now), and in the process, picked up a number of tips and tricks for debugging and maintaining Puppeteer-based tests.
Recommendations
- **Automatically retry tests**

  Since you're automating a real(ish) browser, things like the browser being slow or the network having a hiccup can cause integration or end-to-end tests to fail; in practice, we often saw transient failures when one of the microservices that the test suite touched was being deployed to our staging/testing environment. You can manually retry the test when this happens, but we found it more effective to retry tests upon failure automatically. If you're using the `jest-circus` test runner, this can be accomplished with a single line of code:

  ```js
  jest.retryTimes(2); // Retry twice on failure for a total of 3 test runs.
  ```
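  Here's a rough sketch of where that call might live in a test file; the file name and test contents are illustrative rather than our actual suite. On recent Jest versions `jest-circus` is the default runner, while older versions need `testRunner: 'jest-circus/runner'` in the Jest config for `jest.retryTimes` to take effect.

  ```js
  // todo.e2e.test.js (illustrative)
  jest.retryTimes(2); // Applies to every test declared in this file.

  describe('to-do list', () => {
    it('creates a to-do item', async () => {
      // ...Puppeteer interactions that can fail transiently...
    });
  });
  ```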
- **Use a consistent test setup across testing environments**

  If you utilize a consistent test setup, like a Docker container, that can be used with minimal changes both in development and in your CI pipeline, you'll have a higher degree of confidence in your tests passing when they're added to the main test suite. You'll also be able to easily debug errors in your CI pipeline: simply run the tests inside a container from your localhost, but point them at your testing environment to see what's going on (you can also add `console` and `debugger` statements to the tests for further observability).
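  One way to keep the same container portable across environments is to drive anything environment-specific through environment variables. Here's a minimal sketch of that idea; the variable name, URL, and launch options are assumptions for illustration, not our actual setup.

  ```js
  const puppeteer = require('puppeteer');

  // The same test code runs in the container locally and in CI; only the
  // target environment changes. APP_BASE_URL is an illustrative name.
  const BASE_URL = process.env.APP_BASE_URL || 'http://localhost:3000';

  let browser;
  let page;

  beforeAll(async () => {
    browser = await puppeteer.launch({
      // Commonly needed when Chromium runs as root inside a container.
      args: ['--no-sandbox'],
    });
    page = await browser.newPage();
    await page.goto(BASE_URL);
  });

  afterAll(async () => {
    await browser.close();
  });
  ```

  To point the container at your testing environment instead of localhost, you'd just override `APP_BASE_URL` when starting it.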
- **Grab debugging information on failure**

  It's often useful to use Puppeteer's built-in screenshot functionality to see what's happening in your application when a test is running:

  ```js
  // Take a screenshot every second.
  setInterval(async () => {
    await page.screenshot({ path: `./${Date.now()}.png` });
  }, 1000);
  ```

  This works well during development, but often can't be used for debugging problems specific to the CI pipeline, as it requires access to the machine's filesystem. In that scenario, you can instead try:

  ```js
  // Take a low-quality JPEG and dump it to the log in base64 format.
  const data = await this.page.screenshot({ type: 'jpeg', encoding: 'base64', quality: 1 });
  console.log(`data:image/jpeg;base64,${data}`);
  ```

  when the test fails; the resulting image can be viewed by placing it in the `src` attribute of any `img` element, like so:

  ```html
  <div>
    <img src="data:image/jpeg;base64,${data}" />
  </div>
  ```
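  If you'd like that screenshot to be captured automatically whenever a test throws, one option is a small wrapper around the test body. This is a sketch rather than our actual helper, and `withScreenshotOnFailure` is an illustrative name.

  ```js
  // Logs a base64 screenshot if the wrapped test body throws, then re-throws
  // so the test still fails and gets reported as usual.
  async function withScreenshotOnFailure(page, testBody) {
    try {
      await testBody();
    } catch (err) {
      const data = await page.screenshot({ type: 'jpeg', encoding: 'base64', quality: 1 });
      console.log(`data:image/jpeg;base64,${data}`);
      throw err;
    }
  }

  it('renders the to-do list', async () => {
    await withScreenshotOnFailure(page, async () => {
      // ...interactions and assertions that might fail...
    });
  });
  ```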
- **Try not to make too many assumptions**

  This is less of a concrete item and more of a general piece of advice for end-to-end/integration test writing, but one that I've found invaluable for ensuring the integrity of your tests and for avoiding flaky tests in the first place. By "assumptions", I mean assumptions about the initial and end states of your application before and after your test runs. Say, for example, that you want to write a test for your to-do application that creates a new to-do item: you might set up the test to click a "create" button, give the to-do item the title "Pay bills", hit a "save" button, and ensure that the to-do item titled "Pay bills" renders in your list of to-do items. Then, after your test is complete, you'd likely clean up your testing environment by removing the newly created to-do item.

  But what if the cleanup operation fails? Now you have a pre-existing to-do item in your testing environment with the title "Pay bills", and at least one portion of your test suite is (erroneously) guaranteed to succeed. Instead, a better practice here is to make very few assumptions about your testing environment, and attempt to isolate your test as much as possible. For example: one of our test suites sends an email to and from the same test user using the Mixmax Chrome extension, and waits for the email to arrive in the test user's inbox. To ensure that the email we're waiting for is the same one we've sent, each test run generates a UUID and uses that UUID as the email's subject, which allows us to confidently say that our application has sent an email successfully.
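  Here's a rough sketch of that isolation pattern; the helper, selector, and constants (`sendEmailViaExtension`, `TEST_USER_EMAIL`, the `data-test` attribute) are hypothetical stand-ins for your own app's interactions, not our real test code.

  ```js
  const { v4: uuidv4 } = require('uuid');

  // `page` is assumed to be an already-launched Puppeteer page from your test setup.
  it('sends an email that arrives in the inbox', async () => {
    // A unique subject per run means we never match a leftover email from a
    // previous (or concurrent) test run.
    const subject = uuidv4();

    // Stand-in for whatever page interactions compose and send the message.
    await sendEmailViaExtension(page, { to: TEST_USER_EMAIL, subject });

    // Wait until an inbox row with exactly this subject shows up.
    await page.waitForFunction(
      (expectedSubject) =>
        [...document.querySelectorAll('[data-test="inbox-row-subject"]')].some(
          (el) => el.textContent.trim() === expectedSubject
        ),
      { timeout: 60000 },
      subject
    );
  });
  ```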