Wayfair Tech Blog

Testing our Confidence: Scaling Software Quality with Automated Testing


As Wayfair’s business has scaled, so too has the size of our engineering teams – not to mention the complexity of the software features we’re building. At the same time, moving quickly in the global tech landscape has never mattered more. As a modern tech company, one simply cannot afford to slow down the rate of feature delivery.

While it’s fairly easy to roll back and recover from a broken build, it’s not entirely without pain. With more and more teams working on an ever-growing list of features, we wanted a better way to have confidence in what we were building.

Enter automated developer testing. We’ve had automated tests at various levels for quite some time at Wayfair, but historically they haven’t been a big focus. Many others have written about the value of automated tests and how to go about writing them. What is exciting about our approach is the way we’ve gone about introducing developer testing into our engineering culture. At every step, we’ve kept our focus on speed, making iterative improvements in incremental steps that we can validate.

A tale of two teams and their backlog of bugs

Today’s story starts in late 2017 with the Idea Boards and Room Planner teams. These teams own the features that allow our customers to save products they’re considering as a purchase, as well as a tool that allows them to visualize products in a virtual room space, respectively.

Up until about six months ago, developers had focused on writing just the minimum amount of code necessary to get features working and deployed. By late 2017, our backlog had grown to nearly 70 active bugs. Our teams tried various strategies to reduce the number of bugs we were experiencing: we added more manual QA resources, we dedicated an engineer full-time to fixing bugs, and we even stopped all feature work to run periodic team-wide bug sprints. At the end of the day, however, none of these strategies reduced the overall rate of incoming bugs. It was becoming difficult to build on top of our existing code with confidence.

Faced with this dilemma, we decided as a team to take on the challenge of learning how to write automated developer tests, beginning with figuring out the right kind of tests to write. It’s worth pausing here to define what we mean by automated developer tests.

There are many different types of automated developer tests, ranging from web crawlers executing Selenium scripts and acceptance tests against live production systems, all the way down to tiny unit tests focused on a single function or method. We wanted to add tests that would help grow team confidence in software quality, while also helping our developers continue to ship features quickly. We needed tests that could run in seconds as opposed to minutes, and tests that would be natural and intuitive to write.

Embarking on an exciting adventure in confidence

As it happened, our team’s interest in trying out testing as a solution to our bug problem coincided with several other company efforts to keep our software workflow and engineering practices on the cutting edge.

Shortly before we decided to ramp up our efforts to write more developer tests, our Infrastructure and Architecture teams had put build pipelines in place to run all automated tests before any code was merged into our master production build. This infrastructure is an essential support system for any automated test writing. For automated tests to have any value they must be run consistently, and there’s no better way of enforcing this than requiring them to pass before new code merges into production.

So, what kind of code are we talking about? At Wayfair, our frontend consists of React and plain ol’ JavaScript, while our backend server-side code exists delightfully in PHP. We were experiencing the majority of our bugs in JavaScript, and a good 70%+ of the work the Idea Boards and Room Planner teams were doing was in our frontend codebase.

Jasmine, Enzyme, and most recently Jest were the three JavaScript testing libraries that we had chosen for writing developer tests; they provide straightforward and intuitive developer APIs. They also allow us to write tests in the same programming language as the code being tested.

To kick things off, we reviewed the past 3-4 sprints’ worth of bug tickets and identified the places in our app where we had experienced the most serious bugs. We then wrote dedicated tickets to add developer tests for that behavior. We also allowed the scope of these tickets to include, where necessary, light refactoring of the feature codebase while fleshing out tests for it.

The tests that we began writing took two different forms. The smallest, lowest-level tests would cover a single component, JavaScript function, or other single code file to assert that it behaved as intended; you might also know these as unit tests. For the second type, we used the Enzyme testing library to write user-level tests that would exercise many layers of our JavaScript code. These tests would mock out the API/data layer and assert that when a user filled out a certain form, or clicked a certain button, they would see the correct response in the HTML.

After one or two sprints of dedicating a single engineer to focus exclusively on writing test code and learning how to write tests, we engaged additional help from elsewhere in Storefront, the larger engineering “superpod” our two focus teams belong to. A testing expert from one of our Infrastructure teams embedded with the Idea Boards and Room Planner teams for a sprint, spending a dedicated day pair programming with each engineer and helping them learn what kinds of tests to write for their specific feature work.

Consistency breeds success... and good habits

What happened next? Fast forward to March 2018, and we have now had all Idea Boards and Room Planner engineers writing tests for several months. Our team norms have evolved to the point that every feature ticket includes writing (or updating) our automated tests. Below you can see a picture of how our bug backlog has changed over time since we embarked on writing and maintaining a suite of automated developer tests.

While our team has also invested time and effort into fixing bugs as soon as we discover them, the exciting trend here is that the rate of discovering (or creating) new bugs has decreased over a six-month period. This is proof that the engineering time we’re investing into writing tests is paying off. We hadn’t stopped building and shipping new functionality during this time by any means: as I write, we have just launched two large new features (both covered by developer tests) for item saving and room planning that will dramatically revamp the way customers use these tools. As a more anecdotal form of feedback, our Room Planner and Idea Boards engineers believe the tests we now have help prevent bugs they would otherwise have missed, and our QA engineers have remarked on how hard it has been to “break our stuff.”

It’s still early days for automated testing here at Wayfair, but we’ve already begun sharing what we’ve learned with other feature teams. One reason the adoption of automated testing has gone well so far is that it has been a bottom-up movement. By taking a gradual approach within a single team, we’ve been able to gather incremental feedback and show evidence that it works.

As more teams consider adopting this method of feature development, each team can ramp up their engineers in a way that doesn’t disrupt their product or engineering goals. If you want to make big changes to a large engineering organization, it’s critical to start with small and, most importantly, measurable steps. Like so many other challenges in engineering, it’s all about breaking down the problem and thinking creatively about how to tackle it. Long live automated tests!