This post outlines how the Database team at Wayfair took a principles-based approach to transforming how our engineers interact with and manage their database infrastructure, moving from a largely operational world to one characterized by choice and autonomy.
Where we started
A few years ago, Wayfair was largely a SQL Server operation with a highly monolithic architecture characterized by tight controls, a web of nested data dependencies, and long cycle times for introducing relatively simple changes to production. Our historic database delivery model was highly operational and heavily centralized. For an engineer to provision, deploy, or manage a database, they had to submit a ticket, wait in a queue, and engage with a Database Administrator.
As contradictory as it may sound, some of the antipatterns that formed out of a desire to iterate quickly and bring new functionality to market actually began to slow our development velocity. After mounting frustrations from our internal users - Wayfair’s 3,000+ software engineers and data scientists - we hit an inflection point and decided to fundamentally rethink our database delivery model.
The introduction of Database-as-a-Service (DBaaS)
To say we had a mandate may overstate reality, but a few things became clear from talking to our users:
- Engineers wanted the autonomy to control their own database development, from provisioning to deployment to resource allocation and everything in between. Gone are the days of administrators in the driving seat.
- Engineers wanted to be able to choose which database engines are fit for purpose for their use cases.
- On-demand access to infrastructure was critical to enabling velocity and growth.
These needs became our guiding design principles, from which we came up with the concept for the Wayfair Database-as-a-Service Platform (DBaaS). Our general goal with the platform model was to get out of the business of building one-off database instances or extending our various monoliths every time a new use case came up. Instead, we wanted to offer self-service tooling for the provisioning, deployment, and management of databases. We envisioned a catalog of engine choices that went beyond SQL Server, as well as integrations with other internal systems that would otherwise take weeks to set up and configure if an engineer were to spin up their own database instance directly in a cloud console of their choice.
Our DBaaS Platform consists of four conceptual layers: presentation, orchestration, infrastructure, and deployment.
Our presentation layer is what our users primarily interact with. For this, we joined forces with a broader group of platform teams at Wayfair and implemented the Backstage open source project. Within Backstage, users are able to step through workflows for requesting and modifying databases - all through the declaration of metadata. Behind the scenes (yes, we like the Backstage puns), we implemented a service to maintain the context of all of our database workloads.
This backend database management service is core to our orchestration layer where we also implemented supporting services such as an approval engine and task executor.
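To make the flow from declared metadata to orchestration more concrete, here is a minimal sketch in Python. It is not Wayfair's implementation: the `DatabaseRequest` fields, the approval rule, and the task names are all hypothetical, and the mapping of SQL Server to PowerShell and the cloud engines to Terraform is an assumption made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Engine(Enum):
    SQL_SERVER = "sqlserver"
    SPANNER = "spanner"
    CLOUD_SQL = "cloudsql"


@dataclass(frozen=True)
class DatabaseRequest:
    """Metadata a user might declare in the presentation layer (hypothetical fields)."""
    name: str
    engine: Engine
    owner_team: str
    environment: str  # e.g. "dev" or "prod"


def needs_manual_approval(request: DatabaseRequest) -> bool:
    """Hypothetical approval-engine rule: only production requests need sign-off."""
    return request.environment == "prod"


def plan_tasks(request: DatabaseRequest) -> list[str]:
    """Translate a request into ordered task names for a task executor."""
    # Assumption: VM-based SQL Server is provisioned with PowerShell and the
    # managed cloud engines with Terraform. The post does not state this split.
    tool = "powershell" if request.engine is Engine.SQL_SERVER else "terraform"
    return [
        f"{tool}:provision:{request.name}",
        f"register:{request.name}:{request.owner_team}",
    ]


def submit(request: DatabaseRequest) -> dict:
    """Route a request through the approval check and return its planned state."""
    status = "pending_approval" if needs_manual_approval(request) else "approved"
    return {"status": status, "tasks": plan_tasks(request)}
```

Calling `submit(DatabaseRequest("orders", Engine.SPANNER, "checkout", "dev"))` in this sketch yields an approved request with a Terraform provisioning task followed by a registration task. The point is only that the user declares what they want, and the orchestration layer decides how it gets built.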
Our infrastructure layer is a mix of VMs running SQL Server, as well as Google Cloud’s Spanner and CloudSQL database offerings. We interact with the infrastructure layer through a mix of PowerShell and Terraform tooling.
Once a database is provisioned, our deployment layer lets users leverage a Migrations Toolkit built on Buildkite to deploy changes, such as schema updates, to all three database engines in the same fashion.
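One way to picture "the same fashion" across engines is a migration runner that owns ordering and bookkeeping while the engine-specific part is passed in. The sketch below is an illustration under that assumption, not the Migrations Toolkit itself; `apply_migrations` and the `execute` callback are hypothetical names.

```python
from typing import Callable


def apply_migrations(
    migrations: dict[str, str],
    applied: set[str],
    execute: Callable[[str], None],
) -> list[str]:
    """Apply pending migrations in version order and return the versions applied.

    `migrations` maps a version label to a statement, `applied` holds versions
    already deployed, and `execute` is the engine-specific callable that runs
    a single statement (hypothetical; one per engine).
    """
    newly_applied = []
    for version in sorted(migrations):
        if version in applied:
            continue  # already deployed, so reruns are safe
        execute(migrations[version])
        applied.add(version)
        newly_applied.append(version)
    return newly_applied
```

Because the runner only depends on the `execute` callable, the same pipeline step could drive SQL Server, Spanner, or CloudSQL by swapping in a different executor.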
Measuring what we built
Reflecting on the goals we set out to accomplish, we’ve been hyper-focused on measuring our impact during each step of this journey. We’ve focused our measurement strategy on three core metrics: cycle time for core interactions, net promoter score (NPS), and adoption.
While it's hard to tease out strong leading indicators of future success, we’re pretty pleased with what we’ve achieved to date:
- When we started, the NPS for our offerings was -21%. Now, a little over a year and a half later, that score is +27%.
- Database provisioning requests used to take 3-5 days to process. Now, the mean time from “Request” to “Provisioned” is around 10 minutes.
- Adoption is a tough number to contextualize as teams decouple existing systems and build new ones, but we now have more than 375 active users across more than 200 production databases built through the new platform.
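For readers less familiar with these metrics, the sketch below shows how two of them are commonly computed. It uses the standard NPS definition (percentage of promoters rating 9-10 minus percentage of detractors rating 0-6) and a plain mean over request-to-provisioned durations. The function names are hypothetical and any input you pass is your own sample data, not Wayfair's.

```python
from datetime import datetime
from statistics import mean


def net_promoter_score(ratings: list[int]) -> float:
    """NPS = % promoters (9-10) minus % detractors (0-6) on a 0-10 scale."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


def mean_cycle_minutes(events: list[tuple[datetime, datetime]]) -> float:
    """Mean minutes between each (requested, provisioned) timestamp pair."""
    return mean((done - start).total_seconds() / 60 for start, done in events)
```

Note that passives (ratings of 7 or 8) count toward the total but toward neither group, which is why a score can move from negative to positive without every user becoming a promoter.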
Where we’re heading next
At this point in our journey, we feel confident that our product thesis is valid. Our users are engaged, adoption is increasing at a faster pace than ever before, and the intake of new feature requests shows no signs of slowing down. On our roadmap, we’re looking into alternatives to relational engines, such as NoSQL and graph offerings, as well as extending our selection of integrations to further enhance the user experience.
If you’re interested in joining us for this journey, we have both product and engineering roles on our teams and we’d love to hear from you. Head over to our Careers page to see our open roles. Be sure to follow us on LinkedIn to get a look at life at Wayfair and see what it’s like to be part of our growing team.