Wayfair works with over 25,000 suppliers to sell more than 30 million products in a variety of styles from classic to contemporary at compelling price points. In order to create a frictionless shopping experience, we need to enable customers to explore our massive catalog efficiently, and find the right products that help them create their unique sense of home.
However, new products face additional challenges because they suffer from the ‘cold start’ problem. Products new to the Wayfair catalog usually need more time to mature and land in the hands of customers.
One way to solve this problem is to jumpstart new products through a complex series of investments that accelerate their success. However, with more than 200,000 new products added each quarter, a number that is still growing, investing equally in all new products doesn’t make economic sense.
Predicted Winners is a suite of machine learning models and modules that are developed to tackle these challenges by predicting a new product’s long-term sales potential. The models help identify new products with high customer engagement and expected return on investment early, and help Wayfair prioritize investment efforts.
Predicted Winners is built around four pillars that work with each other as shown in Figure 1.
Figure 1. Predicted Winners is built around four pillars that build on each other to power early identification of winning products.
The first component of Predicted Winners is the Day Zero model. As the name suggests, Day Zero predicts a new product’s long-term sales potential at product launch or even before. At this stage, only a product’s intrinsic attributes are available as model input, such as wholesale costs, product images, product descriptions and product features. We use deep learning models to extract information from the images, descriptions and features and to generate embeddings. Day Zero takes these embeddings, together with wholesale costs, as model features.
The Day Zero model integrates with the storefront sort algorithm. New products with high Day Zero scores are identified as winners. “Sort” will consequently place them at better starting positions in the storefront, helping customers find these products easily. Early customer engagement is captured by our Continuous Winners model to further accelerate the identification of successful new products that resonate with customers.
The next component in Predicted Winners is the Continuous Winners model, which leverages these early customer signals to predict a new product’s long-term sales potential with a higher degree of accuracy. The Continuous Winners model takes into account a variety of signals related to customer engagement, including the number of visits to a product page, the number of people who add a product to their carts, and the number of orders. Continuous Winners arranges these customer engagement signals as time series. Features extracted from the time series data are used to predict the long-term sales of the product.
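As a rough illustration of what "arranging signals as time series" can look like, the sketch below turns raw engagement events into one row per day and derives a few hand-crafted summary features. The signal names, the event tuple layout, and the chosen features are all assumptions made for this example; they are not the actual Continuous Winners schema.

```python
# Hypothetical signal names; the real feature set is not described in this post.
SIGNALS = ("views", "add_to_carts", "orders")


def build_series(events, num_days):
    """Arrange (day_index, signal, count) events as one row per day.

    Each row holds the daily counts in the order given by SIGNALS.
    Events outside the window or with unknown signal names are ignored.
    """
    column = {name: i for i, name in enumerate(SIGNALS)}
    series = [[0] * len(SIGNALS) for _ in range(num_days)]
    for day, signal, count in events:
        if 0 <= day < num_days and signal in column:
            series[day][column[signal]] += count
    return series


def summary_features(series):
    """Hand-crafted features: per-signal totals and simple conversion ratios."""
    views, carts, orders = (
        sum(row[i] for row in series) for i in range(len(SIGNALS))
    )
    return {
        "total_views": views,
        "total_add_to_carts": carts,
        "total_orders": orders,
        "cart_rate": carts / views if views else 0.0,
        "order_rate": orders / views if views else 0.0,
    }


events = [(0, "views", 10), (0, "add_to_carts", 2), (1, "views", 5), (1, "orders", 1)]
series = build_series(events, num_days=3)
features = summary_features(series)
```

Keeping the full day-by-day matrix, rather than only the totals, is what later allows a sequence model to look at how the signals move together over time.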
Figure 2 shows an overview of the approach for the Continuous Winners model. New products that receive high Continuous Winners scores are identified as winners. These winners become new candidates for Wayfair to negotiate exclusivity partnerships with suppliers. Products that successfully become exclusive receive additional treatment and benefits (e.g. high quality merchandising, better sort positions, flagship brand inclusion etc.). This significantly improves customers’ shopping experience by enabling them to find high quality new products that fit their personal style at competitive price points.
Figure 2. Overview of Continuous Winners model
Because products identified as winners receive increased investment to improve their performance, as shown above, we need to guard against self-fulfilling bias: did we identify winning products, or did the investments create them?
Here’s where Sentinel comes in. Sentinel is a continuous testing framework that controls the exposure variables likely to introduce bias into our identification of winners, as well as into the data used to train Predicted Winners models. This ensures that Predicted Winners models continue to identify the intrinsic potential of winning products, separate from the exposure benefits provided to the winners.
The development of Predicted Winners has been a multi-year effort by our team of scientists, engineers, product managers and analysts. There were many innovations along the way, and this blog post highlights three of them: improved feature engineering, universal model architecture design, and optimized training objectives.
(1). Improved feature engineering
Continuous Winners leverages customer engagement time series data as model input. Extracting representative features from raw time series data is a critical step in building a successful machine learning model. Previously, we used human judgment to design and extract the most relevant features. In recent months, however, we began using Long Short-Term Memory (LSTM) neural networks to learn to extract the most important and representative features automatically.
LSTM networks encode long-term memory (the features related to customer engagement that are learned over a six-month period) and short-term memory (how the system is responding to more recent customer signals). A benefit of using an LSTM is that it can capture multi-dimensional co-movement patterns in multivariate time series features (i.e. how orders change when product views and add-to-carts change), which human judgment usually overlooks. As expected, we found that LSTM networks significantly enhanced predictive power compared to our human-curated features.
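To make the long-term and short-term memory idea concrete, here is a minimal sketch of a single-layer LSTM that reads a multivariate engagement series one day at a time and returns its final hidden state as a feature vector. It uses the standard LSTM gate equations in plain Python with random, untrained weights. The layer size, the initialization, and the choice of the last hidden state as the feature vector are assumptions for illustration; the post does not describe the production architecture, which would normally be built and trained in a deep learning framework.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def init_lstm(input_size, hidden_size, seed=0):
    """Random (untrained) parameters for the four gates: i, f, g, o."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        gate: {"W": matrix(hidden_size, input_size + hidden_size),
               "b": [0.0] * hidden_size}
        for gate in "ifgo"
    }


def lstm_step(params, x, h, c):
    """One time step: update cell state c (long-term) and hidden state h (short-term)."""
    z = list(x) + list(h)  # current input concatenated with previous hidden state

    def affine(gate):
        W, b = params[gate]["W"], params[gate]["b"]
        return [sum(w * v for w, v in zip(row, z)) + bias for row, bias in zip(W, b)]

    i = [sigmoid(v) for v in affine("i")]    # input gate: how much new information to store
    f = [sigmoid(v) for v in affine("f")]    # forget gate: how much old memory to keep
    g = [math.tanh(v) for v in affine("g")]  # candidate values to write into memory
    o = [sigmoid(v) for v in affine("o")]    # output gate: how much memory to expose
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def encode_series(params, series, hidden_size):
    """Run the LSTM over the whole series and return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in series:
        h, c = lstm_step(params, x, h, c)
    return h


# Two days of hypothetical, pre-scaled (views, add-to-carts, orders) signals.
params = init_lstm(input_size=3, hidden_size=4)
embedding = encode_series(params, [[1.0, 0.2, 0.0], [0.5, 0.0, 0.1]], hidden_size=4)
```

Because every gate sees all input signals at once, the learned weights can respond to combinations such as rising views with flat orders, which is the kind of co-movement pattern that per-signal hand-crafted features tend to miss.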
(2). Universal model architecture design
Both Day Zero and Continuous Winners are neural network models that were originally built separately within each product class. Despite strong model performance on average, performance can degrade for classes with fewer products and/or low customer traffic. It is also challenging to cover new classes immediately, because no training data exists for them yet. To tackle this challenge, we redesigned our model architecture to make it more universal, drawing on knowledge sharing among different categories such as couches, beds, and tables. Previously, all of our models operated at the level of a discrete class. Now, a single model serves a number of classes, which enables those classes to transfer knowledge to one another. For example, the learnings from the lawn chair class can be used to help drive more accurate predictions for a related class like outdoor sofas.
Knowledge sharing has also unlocked another key advantage for Wayfair: the ability to identify winners for new offerings that might have limited data. This comes in handy during instances such as when we launch a new product class. To state the obvious, we have limited historical data for the products associated with such a class. Now, we can draw on knowledge shared by richer datasets and use Predicted Winners models to identify winners from this new class.
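One common way to let a single model serve many classes is to feed it a learned class embedding alongside the product features, so that most parameters are shared while the embedding carries class-specific information. The sketch below shows only that input construction and a shared linear scorer. The embedding values, the class names, and the fallback of averaging known class embeddings for a brand-new class are all hypothetical; the post does not say how Predicted Winners implements knowledge sharing.

```python
def build_input(product_features, class_name, class_embeddings):
    """Concatenate product features with a class embedding for a shared model."""
    if class_name in class_embeddings:
        embedding = class_embeddings[class_name]
    else:
        # Hypothetical fallback for a new class with no history:
        # use the average of the embeddings of all known classes.
        known = list(class_embeddings.values())
        dim = len(known[0])
        embedding = [sum(e[d] for e in known) / len(known) for d in range(dim)]
    return list(product_features) + list(embedding)


def shared_score(weights, bias, model_input):
    """A single set of weights scores products from every class."""
    return sum(w * x for w, x in zip(weights, model_input)) + bias


# Illustrative values only.
class_embeddings = {"lawn_chairs": [1.0, 0.0], "outdoor_sofas": [0.0, 1.0]}
known_input = build_input([2.0], "lawn_chairs", class_embeddings)
new_class_input = build_input([2.0], "hammocks", class_embeddings)
score = shared_score([1.0, 2.0, 3.0], 0.5, known_input)
```

In a trained model, related classes would end up with nearby embeddings, which is one mechanism by which what is learned on lawn chairs could inform predictions for outdoor sofas.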
(3). Optimized training objectives
Another innovation that we enabled with Predicted Winners is optimizing the training objectives to align with the distribution of the ground truth values. For revenue-based predictions, we use a combination of Bernoulli and Log-Normal distributions to capture the true characteristics of ground truth revenue data. A negative binomial distribution is used for training models to predict order counts. Optimizing the training objective in this way not only improves model performance but also unlocks the additional capability of producing a measure of uncertainty. Incorporating uncertainty gives our business stakeholders valuable information that they can use to mitigate risks.
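The training objectives described above can be written as negative log-likelihoods. The sketch below assumes one plausible reading of "a combination of Bernoulli and Log-Normal": a Bernoulli variable decides whether a product earns any revenue at all, and positive revenue follows a Log-Normal distribution. It also uses the mean and dispersion parameterization of the negative binomial for order counts. The function and parameter names are hypothetical; in training, a network would output p, mu, sigma (or mean, r) per product and these losses would be averaged over the training set.

```python
import math


def hurdle_lognormal_nll(y, p, mu, sigma):
    """Negative log-likelihood of revenue y >= 0.

    With probability 1 - p the revenue is exactly zero; with probability p
    it is drawn from LogNormal(mu, sigma). Requires 0 < p < 1 and sigma > 0.
    """
    if y == 0:
        return -math.log(1.0 - p)
    log_pdf = (
        -math.log(y)
        - math.log(sigma)
        - 0.5 * math.log(2.0 * math.pi)
        - (math.log(y) - mu) ** 2 / (2.0 * sigma ** 2)
    )
    return -math.log(p) - log_pdf


def negative_binomial_nll(k, mean, r):
    """Negative log-likelihood of order count k with mean > 0 and dispersion r > 0."""
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(r / (r + mean))
        + k * math.log(mean / (r + mean))
    )
    return -log_pmf


def expected_revenue(p, mu, sigma):
    """Mean of the Bernoulli and Log-Normal mixture: p * exp(mu + sigma^2 / 2)."""
    return p * math.exp(mu + 0.5 * sigma ** 2)


def order_count_variance(mean, r):
    """Variance of the negative binomial: mean + mean^2 / r."""
    return mean + mean ** 2 / r
```

Because the model predicts distribution parameters rather than a single number, quantities such as sigma or the negative binomial variance come for free, which is where a measure of uncertainty can be read off.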
Our team executes against a full roadmap to continually improve Predicted Winners and develop new capabilities. We constantly explore new model architectures as well as new features such as neighbor products, product reviews, etc. to improve the accuracy of Predicted Winners models.
We are also scoping out new initiatives to explain why certain products succeed and gain insights to influence the future sales performance of products by making effective suggestions to Wayfair and suppliers. This new capability will position Predicted Winners as Wayfair’s engine to nurture the next generation of bestsellers.