If you live in the web space and are moderately interested in optimizing your user experience through A/B testing or exploration-based methods, you've probably seen a dramatic increase in chatter in the Twittersphere lately. [And if you're living in the web space and you aren't interested in user experience optimization, then you're leaving a ton on the table. I'll not preach, but really, get with it!] In the past few weeks there have been a number of thoughtful posts from independent web developers as well as from established startups in the space. In fact, they string together into a pretty interesting dialogue, and aside from the fun of the tit-for-tat tone and the little jabs here and there, there is a lot of great information being shared. It is information you should be aware of if you are working solo or in an environment where you don't have a dedicated analytics team or data analyst to save the day and tell you what the data is really saying.
A/B Testing and Multi-Armed Bandit Dialogue
Here is how things have shaped up to date. All are good, quick reads that I highly suggest you read now or, at the very least, save to Instapaper for later.
- 2012-May-28: 20 lines of code that will beat A/B testing every time [Steve Hanov]
- 2012-Jun-01: Why multi-armed bandit algorithm is not “better” than A/B testing [team at Visual Website Optimizer]
- 2012-Jun-03: Why Multi-armed Bandit algorithms are superior to A/B testing [Chris Stucchio]
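If "multi-armed bandit" is a new term for you, one common strategy is epsilon-greedy: most of the time show the variation with the best observed conversion rate, and a small fraction of the time show a random one. The snippet below is only a minimal sketch of that idea and is not the code from any of the posts above; the class name, the method names, and the 10% exploration rate are my own assumptions for illustration.

```python
import random


class EpsilonGreedy:
    """Hypothetical epsilon-greedy chooser for page variations ("arms")."""

    def __init__(self, arms, epsilon=0.1, rng=None):
        self.epsilon = epsilon              # assumed exploration rate
        self.rng = rng or random.Random()
        self.shows = {arm: 0 for arm in arms}
        self.wins = {arm: 0 for arm in arms}

    def rate(self, arm):
        # Observed conversion rate; arms never shown are rated 0.0,
        # which is a simplification made for this sketch.
        return self.wins[arm] / self.shows[arm] if self.shows[arm] else 0.0

    def choose(self):
        arms = list(self.shows)
        if self.rng.random() < self.epsilon:
            arm = self.rng.choice(arms)      # explore: pick any arm
        else:
            arm = max(arms, key=self.rate)   # exploit: best rate so far
        self.shows[arm] += 1
        return arm

    def reward(self, arm):
        # Call this when the visitor who saw `arm` converts.
        self.wins[arm] += 1
```

In use, you would call `choose()` when rendering the page and `reward(arm)` when that visitor converts. Note that nothing here checks statistical significance, which is a large part of what the posts above argue about.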
Related References
There are a number of great references called out in these posts, including:
- How Not To Run An A/B Test - the trappings of false significance analysis
- What you really need to know about mathematics of A/B split testing
- Algorithms for the Multi-Armed Bandit Problem
And a couple of recent posts on large-scale A/B and multivariate tests at leading web companies and in the political realm:
- The A/B Test: Inside the Technology That’s Changing the Rules of Business [Wired, Apr 2012]
- Reverse engineering targeted emails from the 2012 Obama campaign [FlowingData]
The simple truths of A/B testing
- Testing, whether through A/B, multivariate, or multi-armed bandit approaches, is always* better than not testing/optimizing. Get off your duff and learn enough to make it happen, or hire someone if you're not a technical person.
- *Drawing incorrect conclusions from a test is always worse than not testing. If you don't plan on validating your results using statistically sound methods (e.g. chi-squared analysis) with a known margin of error, then don't make any big bets on your results. You can continue to fly by the seat of your pants day to day and expect that you're working to your cause's benefit, but don't expect to find an "answer" and lock in that approach.
- Not all platforms are built the same. I don't care what Visual Website Optimizer or Google Website Optimizer tell you. Most consumer platforms are built to increase the conversion rates of simple funnels. For anything more complex, e.g. the long-term impact of major site redesign A vs. B, you are going to have to dig in and build your own solution to properly cohort your web visitors and gather the non-conversion metrics that will help you judge success over time (e.g. Time on Site, Page Views, engagement rate of X, etc., whatever it is that makes your business move).
- Not all tests yield significant results. Most will not. After running hundreds of tests at Ask.com and, more recently, working on dynamic A/B testing of headlines while at SAY Media, I can say without fear of contradiction that more often than not, the time you spend testing will yield nearly no gain. Don't let it get you down. It's part of the game. Look at the results, infer some causal elements for those results being similar, even choose some random variations that you can't back up with simple logic, and get back to testing. You don't know your users as well as you think. It's a hard lesson, but the sooner you learn it, the sooner you'll start making changes that will truly move the needle for your business. Unburden yourself from the fallacy that you are like the majority of your users.
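To make the chi-squared check mentioned above a bit more concrete, here is a minimal sketch of a Pearson chi-squared test on a two-variation result, using only the Python standard library. The function name, the argument names, and the choice to skip a continuity correction are my own assumptions; treat it as a starting point rather than a finished analysis tool.

```python
import math


def chi_squared_ab(conv_a, n_a, conv_b, n_b):
    """Return (chi2, p_value) for a 2x2 table of conversions vs. non-conversions.

    conv_a / conv_b: conversions seen for variations A and B (hypothetical names)
    n_a / n_b:       visitors shown variations A and B
    """
    observed = [
        [conv_a, n_a - conv_a],
        [conv_b, n_b - conv_b],
    ]
    row_totals = [n_a, n_b]
    col_totals = [conv_a + conv_b, (n_a - conv_a) + (n_b - conv_b)]
    total = n_a + n_b

    # With no conversions at all (or nothing but conversions) there is
    # nothing to compare, so report "no evidence of a difference".
    if 0 in col_totals or 0 in row_totals:
        return 0.0, 1.0

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, where the chi-squared
    # survival function reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, `chi_squared_ab(100, 1000, 150, 1000)` compares a 10% conversion rate against a 15% one on 1,000 visitors each. A small p-value (a common but arbitrary cutoff is 0.05) suggests the difference is unlikely to be noise. It says nothing about whether you stopped the test too early, which is the trap the "How Not To Run An A/B Test" reference covers.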
