How Wayfair optimizes content placement considering competing short term marketing and long term business goals
While Wayfair’s various machine learning models provide us an algorithmic way to find the most relevant content for a given customer based on their past behavior, there may be instances where we want to deviate from this. This is often the case with new and emerging product categories, where users have not had a chance to show their preferences. For example, if we optimize our models to select the category our customer is the most likely to purchase, we would show significantly more of the categories we have traditionally sold more of, such as sofas and rugs. This is because our models are trained based on past information rich with data on historically popular products. However, these recommendations might not be optimal for long-term customer needs and company growth.
Most companies and marketing teams overcome these limitations either through promotions (which override personalization and show all customers the same messages for a few weeks) or by randomly splitting incoming traffic so that a chosen percentage of customers sees a specific category of product, like Kitchen Appliances. The latter is what we call Share of Voice: the proportion of customers who are shown each message. For example, if 20% of customers see a Kitchen Essentials banner, the Share of Voice for Kitchen Essentials would be 20%.
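To make the definition concrete, here is a minimal Python sketch that computes Share of Voice from a record of which message each customer saw. The function name and the sample data are made up for illustration, and the sketch assumes one message per customer.

```python
from collections import Counter

def share_of_voice(assignments):
    """Return the fraction of customers shown each message.

    `assignments` maps a customer id to the single message that customer saw.
    """
    counts = Counter(assignments.values())
    total = len(assignments)
    return {message: count / total for message, count in counts.items()}

# Hypothetical data: 10 customers, 2 of whom see the Kitchen Essentials banner.
shown = {
    f"customer_{i}": "Kitchen Essentials" if i < 2 else "Sofas"
    for i in range(10)
}
sov = share_of_voice(shown)
print(sov["Kitchen Essentials"])  # 0.2
```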
These approaches give us a simple lever to quickly adjust what messages customers see based on a desired target. However, such instruments do not explicitly account for what different customers are actually interested in. This is the central problem the Share of Voice Optimization Engine addresses: while directing 20% of traffic to see Appliance messaging, can we also make sure that this 20% is made up of the customers with the highest predicted propensity scores for Appliances? Doing so allows us to maximize the relevance of what we show by using our model outputs that predict a customer’s likelihood to purchase from each category.
The Share of Voice (SoV) optimization engine is designed to maximize customer-level message relevance while still meeting the desired SoV targets. It uses a constrained optimization solver to find the optimal combination of Customer x Message pairs such that we maximize cumulative relevance for all of our customers while meeting SoV targets set by our marketing, brand and other teams at the company.
It’s helpful to look at a concrete application. Let’s take the Homepage (www.wayfair.com), for example, and imagine it has 4 placements where we can display messages to customers. Let’s assume further that we want to populate these placements from a pool of 50+ types of messages. The types of messages could vary from storage (“Storage Solutions for Every Room”) to appliances (“Major Appliances. Major Brands. No Sweat”), or our other services like Wayfair Registry.
The problem of deciding what to show customers can be visualized as a 3-dimensional cube representing Customer x Messages x Placements. For example, the 1 in the top-left position of the cube means Customer #1 will receive F&D (Furniture and Decor) messaging on Placement #4.
The goal of the Share of Voice Optimization Engine is to find the optimal combination of 1s and 0s within this cube - i.e. one that maximizes the cumulative customer-level message relevance while meeting the desired SoV targets set by the business (as well as other logical or business constraints; more details later).
It’s also helpful to understand the key business inputs and model inputs that go into the SoV Engine in order to produce the optimization output (i.e. the optimal 1/0s for the decision-cube). The figure below shows some of the key inputs which we then cover one by one in the next section.
Our Wayfair Brand team and other leaders provide upper and lower bound SoV targets. The table below is an example of what category SoV targets could look like:
Placement x Message Applicability Matrix
In many cases, not every message can be shown in every placement. For example, we might want to show only products in one placement, one of our services in another, and one of our brands in a third. This is handled via a Message x Placement applicability matrix. If the value for the [X, Y] cell in this matrix is:
- 1 = Msg X can be shown in Placement Y
- 0 = Msg X cannot be shown in Placement Y
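As an illustration, the applicability matrix can be held as a nested dictionary and used to filter the candidate messages for each placement. The message names, placement names and helper functions below are hypothetical; this is a sketch of the idea, not the production data structure.

```python
# Hypothetical messages and placements; 1 = allowed, 0 = not allowed.
MESSAGES = ["Rugs", "Appliances", "Registry"]
PLACEMENTS = ["P1", "P2"]

# Rows are messages (X), columns are placements (Y).
APPLICABILITY = {
    "Rugs":       {"P1": 1, "P2": 0},
    "Appliances": {"P1": 1, "P2": 0},
    "Registry":   {"P1": 0, "P2": 1},
}

def allowed_messages(placement):
    """List the messages that may be shown in the given placement."""
    return [m for m in MESSAGES if APPLICABILITY[m][placement] == 1]

def allowed_pairs():
    """List every (message, placement) pair the optimizer may consider."""
    return [
        (m, p)
        for m in MESSAGES
        for p in PLACEMENTS
        if APPLICABILITY[m][p] == 1
    ]

print(allowed_messages("P2"))  # ['Registry']
```

Filtering out the disallowed pairs before building the optimization problem also shrinks the number of decision variables.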
We may have to enforce additional logical or business constraints on the optimization problem. Below are a few examples:
- Ensure that every placement is populated with a message (i.e. we don’t have empty placements)
- Ensure that a given customer doesn’t get the same message across multiple placements (i.e. we don’t want a customer seeing “Rugs” in 2 placements)
- Apply more nuanced rules where needed (e.g. don’t show credit card offers to existing cardholders)
Each of these business or logical constraints must first be translated to mathematical constraints that the solver understands. For example, to ensure that each placement is populated with exactly one message we can enforce the row-wise sum on the yellow side of the cube (placement x message) to be exactly 1. Or similarly, ensuring that a customer doesn’t see the same message across multiple placements can be expressed as a column-wise constraint on the purple side of the cube.
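One way to write these constraints down is sketched below. The notation is our own shorthand rather than anything from the production system: x_{c,m,p} is the binary cell of the cube for customer c, message m and placement p; r_{c,m} is a relevance score; a_{m,p} is the applicability matrix; and L_m, U_m are the lower and upper SoV bounds. The sketch assumes SoV is measured as the share of customers who see message m in any placement.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{c \in C}\sum_{m \in M}\sum_{p \in P} r_{c,m}\, x_{c,m,p} \\
\text{s.t.}\quad
& \sum_{m \in M} x_{c,m,p} = 1 && \forall c \in C,\ p \in P
  && \text{(every placement is filled)} \\
& \sum_{p \in P} x_{c,m,p} \le 1 && \forall c \in C,\ m \in M
  && \text{(no repeated message per customer)} \\
& x_{c,m,p} \le a_{m,p} && \forall c \in C,\ m \in M,\ p \in P
  && \text{(applicability)} \\
& L_m \le \frac{1}{|C|}\sum_{c \in C}\sum_{p \in P} x_{c,m,p} \le U_m && \forall m \in M
  && \text{(SoV targets)} \\
& x_{c,m,p} \in \{0, 1\}
\end{aligned}
```

Because the second constraint lets a customer see a message at most once, the double sum in the SoV constraint simply counts the customers who receive message m.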
Model Scores | Customer x Message Relevancy
To implement the algorithm, we need a way to describe the relevancy of each message to each customer, because this is exactly what the objective function maximizes: showing the most relevant content to our customers under our set of constraints. Currently at Wayfair we have a range of propensity scoring models that predict how likely our customers are to purchase from any of our product classes (e.g. Area Rugs, Fridges, etc.). We can leverage these models to describe the degree to which a given message is relevant to a given customer. For example, to find out how relevant an ad about Renovation is to a specific customer, we can use the scores from our Flooring, Lighting and Vanity propensity models for that customer. In this discussion we reference product-focused models and messages (i.e. should we talk to you about Rugs vs. Appliances). If, however, we want a particular placement to show customers messages about one of our service offerings or brands, the corresponding service or brand model would be most suitable (e.g. the Registry model for Registry messages).
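The Renovation example could be sketched as follows. How the individual propensity scores are combined into one message-level score is not specified here, so the simple average below is an assumption, and the mapping, function name and score values are all hypothetical.

```python
# Hypothetical mapping from a message to the product-class models behind it.
MESSAGE_TO_CLASSES = {
    "Renovation": ["Flooring", "Lighting", "Vanity"],
    "Rugs": ["Area Rugs"],
}

def message_relevance(propensity, message):
    """Average a customer's propensity scores over the message's classes.

    Averaging is an assumed aggregation; a max or a weighted sum would be
    equally plausible choices.
    """
    classes = MESSAGE_TO_CLASSES[message]
    return sum(propensity[c] for c in classes) / len(classes)

# Made-up propensity scores for one customer.
customer_scores = {
    "Flooring": 0.30,
    "Lighting": 0.10,
    "Vanity": 0.20,
    "Area Rugs": 0.05,
}
reno = message_relevance(customer_scores, "Renovation")
rugs = message_relevance(customer_scores, "Rugs")
print(reno > rugs)  # True: Renovation is the more relevant message here
```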
Placement Value Matrix
Given we are optimizing for SoV across multiple placements (4 in this hypothetical example), we need to consider whether all placements on the homepage are worth the same. For example, one could meet the desired SoV target for “Appliances”, but what if the optimizer happened to put all the Appliance messaging in the lowest-positioned placement, which gets the fewest impressions? To prevent this from happening, we would like the SoV Optimizer to understand the relative value of each placement within the page. We do this by providing the SoV Optimizer a placement value matrix. The matrix simply represents how valuable we think a given placement is compared to the others. This can be based on historical impressions, clicks, or other metrics like orders when clicked.
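A small sketch of how placement values might be derived and applied is shown below. It assumes values are historical impressions scaled so the best placement equals 1.0, and that a placement's value multiplies the relevance score in the objective. Both choices, along with the impression counts, are illustrative assumptions.

```python
def placement_values(metric_by_placement):
    """Scale a per-placement metric so the top placement has value 1.0."""
    top = max(metric_by_placement.values())
    return {p: n / top for p, n in metric_by_placement.items()}

def weighted_relevance(relevance, values, placement):
    """Relevance of a message, discounted by where on the page it appears."""
    return relevance * values[placement]

# Made-up historical impressions for four homepage placements.
impressions = {"P1": 1000, "P2": 500, "P3": 250, "P4": 100}
values = placement_values(impressions)

# The same message counts for more in the top placement than the bottom one.
top_slot = weighted_relevance(0.4, values, "P1")
bottom_slot = weighted_relevance(0.4, values, "P4")
print(top_slot > bottom_slot)  # True
```

With weights like these, an optimizer that hides a highly relevant message in the lowest placement scores worse than one that surfaces it at the top.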
Output of Share of Voice Optimization
While the Customer x Message x Placement cube is a helpful way to visualize the decision-space for which we are optimizing, the concrete output produced can also be thought of as a simple table similar to the one below:
When a customer arrives on the Homepage, a call is made to the appropriate endpoint, which returns the list of messages or “topics” to show in each placement for each customer or device we recognize, based on the daily batch output of our algorithm. Unrecognized customers are given a random assignment based on SoV targets. For example, the messages we would like to show customer 123 are Mattress, Reno, Wayfair Credit Card (PLCC), and Kelly Clarkson Home for Placement 1 to Placement 4 respectively. Note that the Share of Voice module only provides the “message” or “topic” the customers should see. A separate downstream application selects and renders the actual content we would like to show for that message.
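The lookup-with-fallback behaviour can be sketched as below. The table contents for customer 123 come from the example above; everything else (the in-memory dictionary, the target values, and sampling without replacement in proportion to the targets) is an assumption made for illustration, since the exact random assignment scheme is not described.

```python
import random

# Hypothetical daily batch output: customer id -> message for Placement 1..4.
BATCH_OUTPUT = {
    "123": ["Mattress", "Reno", "Wayfair Credit Card (PLCC)", "Kelly Clarkson Home"],
}

# Hypothetical SoV targets, used as sampling weights for unrecognized customers.
SOV_TARGETS = {"Mattress": 0.4, "Reno": 0.3, "Appliances": 0.2, "Rugs": 0.1}

def messages_for(customer_id, n_placements=4, rng=random):
    """Return one message per placement for a customer.

    Recognized customers get their precomputed row. Everyone else gets a
    weighted random draw without replacement, so no message repeats.
    """
    if customer_id in BATCH_OUTPUT:
        return BATCH_OUTPUT[customer_id]
    remaining = dict(SOV_TARGETS)
    chosen = []
    for _ in range(n_placements):
        names = list(remaining)
        weights = [remaining[name] for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen

print(messages_for("123")[0])  # Mattress
```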
Engineering The Solution
To develop the solution, we worked with Gurobi, a mathematical optimization solver. This problem can be framed as a variant of the Generalized Assignment Problem, which Gurobi is well equipped to solve. We code each [Customer, Message, Placement] combination as a binary decision variable to optimize over. We then define each constraint inside Gurobi, as well as the objective function as described above. Lastly, we run the solver and produce an output that is uploaded to our internal data stores and services for consumption by other Wayfair teams.
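To show the shape of the problem without requiring a solver license, the toy sketch below replaces Gurobi with plain brute-force enumeration over a 3-customer instance. It is not how the production system works: exhaustive search only scales to a handful of customers, and all names, scores and bounds are invented. The applicability matrix is left out for brevity.

```python
from itertools import permutations, product

CUSTOMERS = ["c1", "c2", "c3"]
MESSAGES = ["Rugs", "Appliances", "Storage"]
PLACEMENTS = ["P1", "P2"]

# Made-up relevance scores (customer x message) and placement values.
RELEVANCE = {
    "c1": {"Rugs": 0.9, "Appliances": 0.2, "Storage": 0.5},
    "c2": {"Rugs": 0.8, "Appliances": 0.4, "Storage": 0.3},
    "c3": {"Rugs": 0.7, "Appliances": 0.1, "Storage": 0.6},
}
PLACEMENT_VALUE = {"P1": 1.0, "P2": 0.5}

# SoV bounds (lower, upper) as the share of customers who see the message.
SOV_BOUNDS = {"Rugs": (0.0, 1.0), "Appliances": (0.5, 1.0), "Storage": (0.0, 1.0)}

def feasible(plan):
    """Check the SoV bounds; plan maps customer -> messages per placement."""
    for message, (low, high) in SOV_BOUNDS.items():
        share = sum(message in msgs for msgs in plan.values()) / len(plan)
        if not (low - 1e-9 <= share <= high + 1e-9):
            return False
    return True

def score(plan):
    """Cumulative relevance, weighted by the value of each placement."""
    return sum(
        RELEVANCE[c][m] * PLACEMENT_VALUE[p]
        for c, msgs in plan.items()
        for m, p in zip(msgs, PLACEMENTS)
    )

def solve():
    # permutations() fills every placement and never repeats a message
    # for the same customer, so those two constraints hold by construction.
    options = list(permutations(MESSAGES, len(PLACEMENTS)))
    best_plan, best_score = None, float("-inf")
    for combo in product(options, repeat=len(CUSTOMERS)):
        plan = dict(zip(CUSTOMERS, combo))
        if feasible(plan) and score(plan) > best_score:
            best_plan, best_score = plan, score(plan)
    return best_plan, best_score

plan, total = solve()
print(plan["c1"])  # ('Rugs', 'Appliances')
```

Without the Appliances lower bound, only c2 would see Appliances. The bound forces a second customer to see it, and the search gives it to c1 in the less valuable placement because that costs the least relevance.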
The number of [Customer, Message, Placement] combinations and the number of constraints can easily reach into the billions, so we batch and parallelize the optimization to speed it up. Rather than optimizing all customers at once, we randomly sample customers into smaller batches and solve each batch separately. Because the batches are random samples, the results are nearly equivalent to optimizing all customers at once, with a large reduction in processing time.
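The batching idea can be sketched in a few lines of Python. The helper names, the thread pool and the stand-in per-batch solver are assumptions for illustration; in practice `solve_batch` would wrap a real optimization run over one batch.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def make_batches(customer_ids, batch_size, seed=0):
    """Shuffle customers and split them into batches of at most batch_size."""
    ids = list(customer_ids)
    random.Random(seed).shuffle(ids)
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]

def solve_in_batches(customer_ids, batch_size, solve_batch, workers=4):
    """Solve every batch independently and merge the per-batch plans."""
    batches = make_batches(customer_ids, batch_size)
    plan = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch_plan in pool.map(solve_batch, batches):
            plan.update(batch_plan)
    return plan

def assign_placeholder(batch):
    """Hypothetical stand-in for a per-batch solver."""
    return {customer: "Rugs" for customer in batch}

batches = make_batches(range(10), batch_size=4)
merged = solve_in_batches(range(10), batch_size=4, solve_batch=assign_placeholder)
print(len(batches))  # 3
```

If each batch applies the same SoV bounds as shares of its own customers, the merged result respects those bounds too, since the overall share is a weighted average of the per-batch shares.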
The Share of Voice algorithm is a powerful tool that allows us to make the optimal tradeoffs between short term profit maximization and long term business objectives. We’ve already had successful applications and are working on enabling future integrations through a robust engineering platform. Share of Voice is just one of many tools we use across Wayfair and Data Science to make the best possible recommendations for our customers! We’re always working on improving existing ideas, algorithms and integrations and developing new ones.