How We Use Machine Learning and Natural Language Processing to Empower Search

Search is critical to the customer experience at Wayfair. What we show on the first page of results is incredibly important for users. The graph below shows what many people might already know when it comes to browsing and search habits: The majority of customers rarely look beyond the products on the first page of search results.

In essence, if customers can’t find something easily, they are probably not going to buy it. The key to delivering relevant results is to understand a customer’s intent. However, not all queries express intent clearly. Queries such as “red leather sofa” and “glass coffee table” are fairly clear, but a query such as “coastal decor” is not, since decorative objects, wall art, rugs, or even pillows could all reasonably be shown under the results for “coastal decor”. To address this, we use a combination of techniques, including Machine Learning and Natural Language Processing (NLP), to surface the right results for customers. Keep reading to get a glimpse into how we use these techniques.

Query Intent Engine

To begin with, we extract useful information from various datasets, including our product catalog, search logs, browsing patterns, clicks, and user-generated content such as product reviews. This data is then fed into our search system and used in several areas of our backend, as outlined in the diagram below.

Queries are analyzed first by the Query Intent Engine. This is a service that parses each query to understand customer intent using semantic knowledge. It also gathers intelligence from a machine learning model before arriving at a decision. It can identify whether any part of a query is a product attribute, class name, category, dimension, or brand. When someone searches for “glass coffee tables”, the Intent Engine determines that the word “glass” likely refers to a value of the attribute ‘Top Material’ in coffee tables. It then directs the Search Engine accordingly, i.e., to show the Coffee Tables category with the “Top Material: Glass” filter applied.
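To make the “glass coffee tables” example concrete, here is a minimal sketch of lexicon-based query parsing. It is an illustration only, not the Intent Engine itself: the LEXICON contents, the Intent class, the crude singularize helper, and the route rule are all hypothetical names and assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical semantic knowledge: phrase -> (kind, value).
# A real system would build this from catalog and log data.
LEXICON = {
    "coffee table": ("category", "Coffee Tables"),
    "sofa": ("category", "Sofas"),
    "glass": ("attribute", ("Top Material", "Glass")),
    "leather": ("attribute", ("Upholstery Material", "Leather")),
}


@dataclass
class Intent:
    category: Optional[str] = None
    filters: Dict[str, str] = field(default_factory=dict)
    unmatched: List[str] = field(default_factory=list)


def singularize(token: str) -> str:
    # Crude plural stripping ("tables" -> "table") that leaves "glass" alone.
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def parse_query(query: str, max_len: int = 2) -> Intent:
    tokens = [singularize(t) for t in query.lower().split()]
    intent = Intent()
    i = 0
    while i < len(tokens):
        # Greedy longest match: try the longest phrase starting at i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in LEXICON:
                kind, value = LEXICON[phrase]
                if kind == "category":
                    intent.category = value
                else:
                    name, val = value
                    intent.filters[name] = val
                i += n
                break
        else:
            # No phrase starting at this token is known.
            intent.unmatched.append(tokens[i])
            i += 1
    return intent


def route(intent: Intent) -> str:
    # Assumed rule for this sketch: without a recognized category,
    # fall back to a plain keyword search.
    return "filtered_category" if intent.category else "keyword_search"


intent = parse_query("glass coffee tables")
print(intent.category, intent.filters, route(intent))
```

In this sketch the query resolves to the Coffee Tables category with a Top Material filter of Glass, while a query with no lexicon matches is routed to keyword search.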

The semantic knowledge required to complete this action is compiled by dedicated NLP jobs that pull data from the various sources mentioned above. These jobs extract semantic information from text and store it in optimized data structures to facilitate fast search. Following this, the NLP jobs apply a series of transformations and cleanup steps, including tokenization, stemming, stopword filtering, and synonym handling. Additional intelligence comes from a machine learning model that predicts the types of products to show, depending on the query. This model is essentially a query classifier, trained on historical click data through the NLP pipeline. Equipped with information from these multiple sources, the Intent Engine then decides on the next best action.
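The cleanup steps named above can be sketched in a few lines of plain Python. This is a toy version under stated assumptions: the STOPWORDS and SYNONYMS tables are hypothetical, and toy_stem is a simple suffix stripper standing in for a real stemmer, not the one used in our pipeline.

```python
import re
from typing import List

# Hypothetical resources; production lists would be far larger.
STOPWORDS = {"a", "an", "the", "for", "of", "with", "in"}
SYNONYMS = {"couch": "sofa", "settee": "sofa"}


def tokenize(text: str) -> List[str]:
    # Lowercase and keep runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_stem(token: str) -> str:
    # Toy suffix stripping: "couches" -> "couch", "tables" -> "table".
    if token.endswith(("ches", "shes", "xes", "sses")):
        return token[:-2]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def normalize_text(text: str) -> List[str]:
    tokens = tokenize(text)
    tokens = [t for t in tokens if t not in STOPWORDS]  # stopword filtering
    tokens = [toy_stem(t) for t in tokens]              # stemming
    return [SYNONYMS.get(t, t) for t in tokens]         # synonym mapping


print(normalize_text("The leather couches for a den"))
```

Applying the same normalization to both queries and catalog text is what lets “couches” in a query line up with “sofa” in the stored semantic data.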

Using this approach, we successfully identify customer intent on a large portion of queries and send many users directly to an appropriate page with filtered results. When the Intent Engine cannot match a query against any of the available semantic information, it triggers a “Keyword Search” on the Search Engine.

Learning To Rank For Keyword Searches

In our Keyword Search approach, we rank products over the whole catalog, without the auto-filtering described in the previous case. Results depend on each product's relevance score and ranking in our Search Engine. The Search Engine runs on the open source Apache Solr Cloud platform, popularly known as Solr. Our team has applied a number of techniques, including Learning To Rank (LTR), to show relevant results. In this technique, we train another machine learning model, used by Solr, to assign a score to individual products. This model is trained on clickstream data and search logs to predict a score for each product. These scores ultimately determine the position at which a product shows up in search results. The model improves over time as it is retrained on the new data that is generated every day. For queries such as “coastal room decor”, the Intent Engine detects multiple intents, as outlined above. In such cases, both machine learning models come into play – the query classifier and the LTR model.
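As a rough illustration of the idea behind Learning To Rank, the sketch below trains a pointwise model on click labels and sorts candidates by predicted score. It is plain Python rather than Solr's LTR integration, and everything in it is an assumption made for the example: the two features (text match and past click-through rate), the tiny training set, and the choice of logistic regression.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, clicked)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_pointwise(examples: List[Example], epochs: int = 200, lr: float = 0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, clicked in examples:
            pred = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = pred - clicked
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def score(model, features: Sequence[float]) -> float:
    weights, bias = model
    return bias + sum(w * x for w, x in zip(weights, features))


def rank(model, candidates):
    # candidates: list of (product_id, features); highest score first.
    return sorted(candidates, key=lambda c: score(model, c[1]), reverse=True)


# Hypothetical click log rows: (text_match, past_ctr) -> clicked or not.
clicks = [
    ((0.9, 0.8), 1), ((0.8, 0.6), 1), ((0.7, 0.9), 1),
    ((0.2, 0.1), 0), ((0.3, 0.2), 0), ((0.1, 0.3), 0),
]
model = train_pointwise(clicks)
candidates = [("C", (0.1, 0.1)), ("A", (0.95, 0.9)), ("B", (0.5, 0.5))]
print([pid for pid, _ in rank(model, candidates)])
```

Retraining such a model on each day's fresh click data is one simple way to get the feedback loop described above.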

This approach works well for the majority of keyword searches we encounter, but certain low-volume queries are tricky to handle. These queries appear somewhat regularly in search logs, but their sheer variety makes it difficult to find a one-stop solution. Such queries make up the so-called Long-Tail of Search.

The Long-Tail Keyword Search Problem

Historically, customers have had a mixed experience with Long-Tail queries, so we are always looking for new ways to improve the search experience on them. For example, consider a query like “Blankets for mom”. Part of the intent here is clear – ‘find a blanket’ – but how do you know which blankets are good for moms? We label searches like these Perception Queries. These queries mainly describe the perception a customer has about a product. You cannot rely solely on the information in the product catalog for such queries, as doing so would typically result in a suboptimal search experience. So, how do we find out what customers actually think about the products they are trying to describe using Perception Queries? It turns out we have a dataset containing exactly this information – Customer Reviews.

Applying NLP on Customer Reviews

In reviews, people describe rather passionately what they like or dislike about a certain product. It's a rich source of information that we rarely see being used for search purposes in the industry. To illustrate how valuable review data is, let's look at the rug review below, from a customer who owns dogs.

The customer is describing why the rug works well for them and their dogs. Now, the product description for this rug may not directly contain the word “dog”, but the same information hidden in a customer review is incredibly relevant if you are searching for something along the lines of “dog safe rugs”.

So, how do we extract this kind of information from reviews? It turns out that it is not so easy, but it is certainly possible. Reviews contain unstructured text; the language used may not always follow the rules of grammar and may contain misspellings. A review will also often contain a lot of irrelevant information. Thankfully, proper use of NLP can help in extracting a significant portion of the hidden information we need, if not all of it. We have built a number of NLP tools to process a large percentage of this unstructured data. We use techniques such as Opinion Extraction and Sentiment Analysis to identify specific keywords in reviews, also known as Product Aspects, that are useful for product search. We store these keywords in a separate database, where they are ranked using a number of factors, such as how often the keyword appears in a review. A handful of these terms are then picked as tags for related products. When a user searches for one of the Perception Queries like “dog safe rugs” or “Blankets for mom”, we show them products that have relevant positive mentions in reviews.
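The tagging flow described above can be sketched as follows. This is a heavily simplified stand-in, not our production tooling: the POSITIVE and NEGATIVE word lists, the ASPECT_FORMS vocabulary, the sample reviews, and the product IDs are all hypothetical, and a word-count sentiment score replaces real Opinion Extraction and Sentiment Analysis.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Toy lexicons, assumed for this example only.
POSITIVE = {"love", "loves", "great", "perfect", "soft", "durable"}
NEGATIVE = {"hate", "terrible", "flimsy", "stains", "ripped"}
# Surface form -> canonical product aspect.
ASPECT_FORMS = {"dog": "dog", "dogs": "dog", "puppy": "dog",
                "mom": "mom", "mother": "mom", "kid": "kid", "kids": "kid"}


def words(text: str) -> List[str]:
    return re.findall(r"[a-z']+", text.lower())


def sentence_sentiment(sentence: str) -> int:
    ws = words(sentence)
    return sum(w in POSITIVE for w in ws) - sum(w in NEGATIVE for w in ws)


def mine_tags(reviews: Iterable[Tuple[str, str]], top_k: int = 2) -> Dict[str, List[str]]:
    """Count aspects mentioned in positive sentences; keep the top_k per product."""
    counts = defaultdict(Counter)
    for product_id, text in reviews:
        for sentence in re.split(r"[.!?]+", text):
            if sentence_sentiment(sentence) <= 0:
                continue  # skip neutral and negative mentions
            for w in words(sentence):
                if w in ASPECT_FORMS:
                    counts[product_id][ASPECT_FORMS[w]] += 1
    return {pid: [a for a, _ in c.most_common(top_k)] for pid, c in counts.items()}


def search_by_perception(query: str, tags_by_product: Dict[str, List[str]]) -> List[str]:
    wanted = {ASPECT_FORMS[w] for w in words(query) if w in ASPECT_FORMS}
    return sorted(pid for pid, tags in tags_by_product.items() if wanted & set(tags))


reviews = [
    ("rug-1", "Our dogs love this rug. It is durable and easy to clean."),
    ("rug-1", "Great for a home with a puppy!"),
    ("rug-2", "Terrible with dogs, it ripped in a week."),
    ("blanket-1", "My mom loves how soft it is."),
]
tags = mine_tags(reviews)
print(search_by_perception("dog safe rugs", tags))
```

Note that the second rug mentions dogs but only in a negative sentence, so it receives no “dog” tag and is not returned for “dog safe rugs”.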

These techniques have improved the conversion rate of some of the long-tail queries at Wayfair. It’s worth noting that the usefulness of customer insights generated from reviews is not only limited to search. These insights are helpful in many other ways, such as the improvement of product offerings by learning about what customers like or dislike about a product, improvements for packaging of items that are often damaged when shipping, etc.

Since implementing Machine Learning and NLP into our Search Engine data processing pipeline at Wayfair, we've had great results with customers finding more of what they're looking for. Utilizing Product Reviews in this way is rather new for our team, and the results have been incredibly promising so far. We are always trying to improve and would love to hear your comments if you are also using NLP in your search ecosystem. Wayfair will be giving a talk on “Using Opinion Mining and Sentiment Analysis to Discover Hidden Product Features for E-Commerce Search” at the upcoming Search & AI Conference in Montreal, happening from October 15-18. For those unable to be there, watch out for a video of the talk to be published on the Activate YouTube Channel in the coming months.