Earlier this month, the Pricing Algorithms team held a Community Service Hack Day here at our office, dedicating an afternoon to working on simple tools that could be used as part of any project. The day was an opportunity to fill knowledge gaps in how our team overcomes common data problems, improve documentation and testing, and ensure that everyone on our rapidly growing team can learn from the technical strengths of other team members.
Internal hackathons are great for lots of reasons. They promote collaboration and engagement across teams. They provide a time for creative exploration and a time to try out new and untested tools and methods. Wayfair hosts a company-wide Hackathon every year, but our team is growing large enough and fast enough to warrant a hackathon of our own.
Here was our sales pitch to encourage folks to opt in to the event:
Are there code snippets in your repositories that could be helpful to other people on the team? Do you have a Jupyter notebook that could serve as a template for some routine task? Do you want to learn how to turn your Python code into a library? Interested in expanding your programming toolkit to include new and amazing things?
And of course, the shared document of everyone’s planned potluck contributions didn’t hurt:
Some of our crowdsourced food options
Two noteworthy projects that came out of our hack day were an XGBoost wrapper, which adds some new features to this popular library, and a new Python library for conducting permutation tests (also known as randomization tests).
A New Modeling Package for XGBoost
by Tom Wentworth
As part of Algo Dev Day, we released the beta version of a new modeling package that automates the engineering and data preparation required to create a machine-learning model using XGBoost. This frees up time for data scientists to focus on less mechanical aspects of their job. Its setup is simple, and it adds new features beyond what XGBoost provides.
The idea came about during an upgrade to LineShip, our shipping cost estimation model. LineShip V4 uses output from five different algorithms (three of which use XGBoost) and each needs to be run separately (split) for different subsets of orders (large parcel, small parcel, B2B, B2C, etc.). In the end, we train about 80 separate sub-algorithms, including 48 trained instances of XGBoost. With all these model variations, we want to reuse code as much as possible and make sure we have a clean log of settings used at each split.
Using our XGBoost wrapper, a user only needs to specify model features (or perhaps functions to create features) and label variables. Base_model takes care of the rest, providing a pre-coded structure to store the model, enable saving and loading to disk, efficiently perform one-hot-encoding, create the DMatrix for XGBoost and more. Additional features include model splitting, wherein one’s data is split into groups that are independently trained and evaluated, and context-based parameter overrides, allowing one to override model settings (such as features used) based on various conditions or data splits. Additionally, all features are controlled by one settings file, allowing the user to quickly iterate through different ideas without modifying their code.
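The context-based override idea can be sketched in a few lines: a base settings dict plus per-split overrides, merged at training time. Everything below is illustrative, not the internal Base_model API — the names `BASE_SETTINGS`, `OVERRIDES`, and `resolve_settings` are hypothetical.

```python
# Illustrative sketch of context-based parameter overrides: a base settings
# dict plus per-split overrides, resolved when a given split is trained.
BASE_SETTINGS = {
    "features": ["weight", "zip_distance", "carrier"],
    "xgb_params": {"max_depth": 6, "eta": 0.1},
}

OVERRIDES = {
    # Hypothetical example: large-parcel orders get a deeper tree
    # and an extra feature.
    "large_parcel": {
        "features": ["weight", "zip_distance", "carrier", "freight_class"],
        "xgb_params": {"max_depth": 8},
    },
}

def resolve_settings(split_name):
    """Merge split-specific overrides onto copies of the base settings."""
    settings = {k: (dict(v) if isinstance(v, dict) else list(v))
                for k, v in BASE_SETTINGS.items()}
    for key, value in OVERRIDES.get(split_name, {}).items():
        if isinstance(value, dict):
            settings[key].update(value)  # shallow-merge nested dicts
        else:
            settings[key] = value        # replace lists/scalars outright
    return settings

print(resolve_settings("large_parcel")["xgb_params"])
# {'max_depth': 8, 'eta': 0.1}
```

Because the resolved settings are plain data, every split's effective configuration can be logged verbatim, which is what gives a clean record of the settings used at each of the 48 XGBoost instances.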
A New Library for Permutation Tests
by Linjia Li and Tim Scully
Teams hard at work
At Wayfair, we frequently use experiments to validate the effects of new pricing strategies and algorithms. Designing and evaluating these pricing tests is more challenging than a typical customer-based A/B test, since the latter would require offering different prices to different customers. To ensure a fair and seamless customer experience, we instead run tests on sets of similar products that are randomly assigned to test and control groups. We use Principal Component Analysis (PCA) and K-means to cluster similar products, then randomly divide each cluster into test and control groups. We can then measure the effect of a new pricing strategy by comparing the two groups.
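The within-cluster random split can be sketched with pandas, assuming products have already been clustered (e.g. via PCA + K-means). The column names and `assign_groups` helper below are illustrative, not our actual schema:

```python
import numpy as np
import pandas as pd

# Hypothetical product table where clustering has already been run.
products = pd.DataFrame({
    "sku": [f"SKU{i}" for i in range(8)],
    "cluster": [0, 0, 0, 0, 1, 1, 1, 1],
})

rng = np.random.default_rng(42)

def assign_groups(df):
    """Randomly split each cluster's products into test and control."""
    out = df.copy()
    out["group"] = "control"
    for _, idx in out.groupby("cluster").groups.items():
        shuffled = rng.permutation(idx.to_numpy())
        test_idx = shuffled[: len(shuffled) // 2]  # half of each cluster
        out.loc[test_idx, "group"] = "test"
    return out

assigned = assign_groups(products)
print(assigned.groupby(["cluster", "group"]).size())
```

Randomizing within each cluster keeps the test and control groups balanced on whatever product similarity the clustering captured.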
Permutation tests allow us to determine whether a treatment effect is statistically significant or simply the result of noise from the random assignment of groups. To gauge how extreme the observed effect is, we re-randomize (permute) the group assignment many times and compare the actual observed effect with the resulting null distribution. These tests are common in Pricing Algorithms, so one Hack Day project was to build a central code base for performing them. The project saves our team time and reduces the chance of bugs.
The final result, “DataPermuter,” uses pandas, numpy, and multiprocessing to simulate the random assignment procedure. The user inputs a dataframe of experiment outcomes and a response metric of interest. The package returns the estimated null distribution of the response metric, a “power” score indicating how extreme the observed effect is, and a plot showing where the observed response lies under that distribution. The package applies to any test built on the random assignment framework, whether the assigned units are products, customers, product classes, etc.
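The core resampling loop behind such a package can be sketched with numpy. DataPermuter itself is internal, so this is a stand-alone illustrative version; it reports the conventional two-sided p-value for the difference in means, rather than DataPermuter's exact outputs:

```python
import numpy as np

def permutation_test(test_vals, control_vals, n_iter=10_000, seed=0):
    """Two-sample permutation test on the difference in means.

    Returns the observed effect and the fraction of shuffled assignments
    that produce an effect at least as extreme (a two-sided p-value).
    """
    rng = np.random.default_rng(seed)
    combined = np.concatenate([test_vals, control_vals])
    n_test = len(test_vals)
    observed = test_vals.mean() - control_vals.mean()

    null_effects = np.empty(n_iter)
    for i in range(n_iter):
        # Re-randomize the group assignment and recompute the effect.
        shuffled = rng.permutation(combined)
        null_effects[i] = shuffled[:n_test].mean() - shuffled[n_test:].mean()

    p_value = np.mean(np.abs(null_effects) >= abs(observed))
    return observed, p_value

# A clear treatment effect should yield a small p-value.
test = np.array([10.2, 11.1, 10.8, 11.5, 10.9])
control = np.array([8.1, 7.9, 8.4, 8.0, 8.3])
observed, p = permutation_test(test, control)
```

In production the inner loop is the natural place for multiprocessing, since each shuffled assignment is independent of the others.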
And Many Other Projects
Altogether, we received eight group projects covering all levels of technical difficulty. Other projects included centralized repositories for configuration files, template notebooks for onboarding new hires, and suites of unit tests for more mature data science projects (we learned a lot about pytest and hypothesis).
Our Pricing Algorithms hackathon was a tremendous success. Not only did we build some cool products for our team, but we also learned a lot from each other and had a lot of fun in the process. Interested in joining our team? Click here for a list of Data Science openings.