
Beyond A/B Testing: Understanding Simpson's Paradox

Winston Li

Marketers know A/B testing is a go-to method for determining the effectiveness of different strategies. Whether you're comparing two media channels, creative formats, or ad placements, understanding how to interpret the results is crucial. However, there’s a hidden statistical phenomenon that can trip up even seasoned marketers: Simpson's Paradox (and no, it’s not a new plot twist on The Simpsons).

Simpson's Paradox happens when trends appear in different groups of data but disappear or reverse when the data is aggregated. It’s a reminder that data, when not carefully analyzed, can be misleading.

Let’s break it down with an example.

The Case of Media A and Media B

Imagine you’re running an A/B test to compare the performance of two media channels—let’s call them Media A and Media B. The goal is to determine which channel drives higher conversions. You test both media in three different cities and record the results.

[Table: Media A vs. Media B, conversions by city]

Here’s what the data shows:

At first glance, it seems obvious: Media A outperforms Media B in every city. But when you aggregate conversions across all three cities, Media B emerges as the better performer overall, with a total conversion rate of 7.3% compared to Media A’s 5.6%.
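To make the arithmetic concrete, here’s a minimal Python sketch. The per-city visitor and conversion counts are illustrative assumptions, chosen only so that Media A wins every city while the totals land on the 5.6% and 7.3% figures above:

```python
# Illustrative A/B results: (conversions, visitors) per media channel per
# city. These counts are hypothetical; they are picked so that Media A
# wins every city while Media B wins the aggregate (Simpson's Paradox).
data = {
    "City 1": {"A": (4_000, 80_000), "B": (450, 10_000)},
    "City 2": {"A": (700, 10_000), "B": (650, 10_000)},
    "City 3": {"A": (900, 10_000), "B": (6_200, 80_000)},
}

totals = {"A": [0, 0], "B": [0, 0]}
for city, results in data.items():
    per_city = {}
    for media, (conversions, visitors) in results.items():
        totals[media][0] += conversions
        totals[media][1] += visitors
        per_city[media] = conversions / visitors
    print(f"{city}: A = {per_city['A']:.1%}, B = {per_city['B']:.1%}")

for media, (conversions, visitors) in totals.items():
    print(f"Overall {media}: {conversions / visitors:.1%}")
```

Every per-city line prints A ahead of B, yet the overall lines print 5.6% for A and 7.3% for B.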

What just happened?

The Power of Weighted Averages

This statistical oddity arises from the arithmetic of weighted averages: each channel’s overall conversion rate is an average of its per-city rates, weighted by the share of its traffic that each city contributes. In this case, most of Media A’s traffic comes from City 1, which has the lowest conversion rate among the three cities. Conversely, most of Media B’s traffic comes from City 3, the city with the highest conversion rate.

The result? Despite Media A performing better in each individual city, Media B wins out overall when you consider the total traffic distribution across all locations.
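In weighted-average terms, each channel’s overall rate is its per-city rates weighted by the share of its traffic in each city. A short sketch, reusing the illustrative rates and traffic shares from the example above:

```python
# Overall rate = sum over cities of (per-city rate * share of the
# channel's traffic in that city). Numbers continue the illustrative
# example: 80% of A's traffic is in City 1, 80% of B's is in City 3.
rates_a = [0.050, 0.070, 0.090]   # Media A rates in City 1, 2, 3
shares_a = [0.80, 0.10, 0.10]

rates_b = [0.045, 0.065, 0.0775]  # Media B rates in City 1, 2, 3
shares_b = [0.10, 0.10, 0.80]

overall_a = sum(r * s for r, s in zip(rates_a, shares_a))
overall_b = sum(r * s for r, s in zip(rates_b, shares_b))
print(f"A: {overall_a:.1%}, B: {overall_b:.1%}")  # A: 5.6%, B: 7.3%
```

Media A’s weight sits on its weakest city; Media B’s sits on its strongest. The per-city comparisons and the overall comparison are answering different questions.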

What Marketers Can Learn from Simpson's Paradox

Simpson's Paradox is more than just an interesting quirk—it’s a valuable lesson in how we interpret A/B test results. Here’s what marketers need to keep in mind:

1. Design Your Experiments Carefully

A/B testing is a powerful tool, but it must be designed thoughtfully. Ensure you’re not just looking at aggregate data but also drilling down into subgroups, whether by geography, demographic, or behavior.
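For instance, if your test results live in a pandas DataFrame (a hypothetical layout with one row per visitor and columns for city, media channel, and conversion), checking both views takes two group-bys:

```python
import pandas as pd

# Hypothetical layout: one row per visitor, recording the city they were
# in, the media channel they saw, and whether they converted (1 or 0).
df = pd.DataFrame({
    "city": ["City 1", "City 1", "City 1", "City 3", "City 3", "City 3"],
    "media": ["A", "A", "B", "A", "B", "B"],
    "converted": [1, 0, 0, 1, 1, 0],
})

# Aggregate view: one conversion rate per channel.
print(df.groupby("media")["converted"].mean())

# Subgroup view: the same metric, broken down by city.
print(df.groupby(["city", "media"])["converted"].mean())
```

If the two views disagree about which channel is winning, that disagreement is the finding to investigate, not a nuisance to average away.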

2. Look from Multiple Perspectives

The same dataset can tell different stories depending on how it’s sliced. Don’t limit your analysis to just one viewpoint. Examine the data from different angles to uncover hidden patterns or anomalies.

3. Sample Size Matters

In the example, the traffic distribution across cities is unbalanced, and that imbalance is what produces the paradox. Uneven sample sizes can skew aggregate results, so account for differences in traffic volume when interpreting A/B tests.
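One simple guard is to standardize the comparison: score both channels against the same set of city weights rather than each channel’s own traffic mix. A minimal sketch, reusing the illustrative rates from earlier and making the arbitrary choice of equal weight per city:

```python
# Standardized comparison: apply the SAME weighting to both channels so
# that differences in traffic mix cannot drive the result. Equal weight
# per city is an arbitrary illustrative choice.
rates_a = [0.050, 0.070, 0.090]   # Media A per-city conversion rates
rates_b = [0.045, 0.065, 0.0775]  # Media B per-city conversion rates

std_a = sum(rates_a) / len(rates_a)
std_b = sum(rates_b) / len(rates_b)
print(f"Standardized A: {std_a:.1%}")  # about 7.0%
print(f"Standardized B: {std_b:.1%}")  # about 6.3%
```

On a common weighting, Media A comes out ahead, consistent with the per-city results. The weighting itself is a modeling decision, so choose it deliberately and state it when reporting results.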

4. Think Holistically

A/B tests provide a snapshot of how one variable performs in a controlled environment, but the real world is complex. Don’t assume that patterns seen at the local level will always scale up to the aggregate level—or vice versa.

Simpson's Paradox is a reminder that numbers can be deceiving if not properly analyzed. As a marketer, taking a holistic approach to data analysis is critical, especially when running A/B tests. Be mindful of potential pitfalls in your experiment design, always question what the data tells you, and remember that aggregated data doesn’t always align with what’s happening on a granular level.

In a world where data drives decisions, understanding statistical anomalies like Simpson’s Paradox can make all the difference in making informed, effective marketing choices.

By keeping these lessons in mind, marketers can navigate the complexities of data analysis and A/B testing with greater confidence and clarity.

Enjoyed this article? Dive deeper into statistics to boost your marketing campaigns.