A/B Testing for Mid-Market Sites: When You Have the Traffic, When You Don’t

A/B testing is the most misapplied tool in mid-market CRO. Below roughly 30K sessions/month, the math rarely works for a primary conversion. Most agencies sell it anyway. Here’s the actual sample-size math, when you should test, and what to do when you can’t.

№ 01 The sample size math, plain

The basic sample size formula for a two-proportion test depends on baseline conversion rate, minimum detectable effect (MDE), confidence level, and statistical power. The shorthand most CRO practitioners use:

  • Baseline 2%, MDE 20% relative, 95% confidence, 80% power: ~21,000 sessions per variant
  • Baseline 3%, MDE 20% relative, 95% confidence, 80% power: ~14,000 per variant
  • Baseline 3%, MDE 10% relative, 95% confidence, 80% power: ~53,000 per variant
  • Baseline 5%, MDE 10% relative, 95% confidence, 80% power: ~31,000 per variant

These figures use the standard normal-approximation formula for a two-sided test; calculators differ by a few percent. For two variants (control + treatment), double those numbers for total sessions. Then add a 30-50% buffer for noise from day-of-week and traffic-source mix. A mid-market site doing 8K sessions/month would need roughly 8-12 months for the 10% MDE test at a 5% baseline, and longer at 3%. Over a window that long, seasonality kills the result.
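To check per-variant figures for your own baseline, here is a minimal sketch of the normal-approximation formula, assuming a two-sided test with equal traffic per variant. The function name and defaults are illustrative and not taken from any particular tool. Dedicated calculators may differ slightly depending on the variance term they use.

```python
import math
from statistics import NormalDist


def sessions_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate sessions per variant for a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)  # treatment rate if the lift equals the MDE
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 at 95% confidence
    z_power = NormalDist().inv_cdf(power)          # about 0.84 at 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # unpooled variance (an assumption)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


scenarios = [(0.02, 0.20), (0.03, 0.20), (0.03, 0.10), (0.05, 0.10)]
for baseline, mde in scenarios:
    n = sessions_per_variant(baseline, mde)
    print(f"baseline {baseline:.0%}, MDE {mde:.0%} relative: {n:,} per variant")
```

Halving the MDE roughly quadruples the requirement, because the detectable difference is squared in the denominator. That is why small lifts are out of reach for thin-traffic sites.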

№ 02 The 30K threshold explained

Why 30K sessions/month as the practical floor? Because below that, a properly powered test of a meaningful lift (10-20% relative) on a primary conversion takes longer than the 4-week window where day-of-week patterns repeat cleanly. Run a test for 6 weeks or more and you’ve likely crossed a holiday, a payroll cycle change, or a Google algorithm update. At that point, confounding variables dominate the signal.

30K isn’t a magic number — it’s the level at which most mid-market sites can run a clean 14-21 day test. If your site is at 60K+, you can test more aggressively. If you’re at 5K, you need different tools entirely.
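The relationship between traffic and test length is simple arithmetic, sketched below. The 10,000-per-variant requirement is a hypothetical input rather than a figure from this article, and so are the function name, the 30% buffer default, and the assumption that traffic is spread evenly across a 30-day month. Rounding up to whole weeks is a choice made here so every weekday is represented equally.

```python
import math


def estimated_test_days(per_variant, monthly_sessions, variants=2, buffer=0.30):
    """Days needed to collect the sample, rounded up to whole weeks."""
    total_sessions = per_variant * variants * (1 + buffer)
    daily_sessions = monthly_sessions / 30  # assumes evenly spread traffic
    raw_days = total_sessions / daily_sessions
    return math.ceil(raw_days / 7) * 7  # whole weeks keep the day-of-week mix balanced


# 10,000 per variant is a hypothetical requirement used only for illustration
for monthly in (5_000, 30_000, 60_000):
    days = estimated_test_days(10_000, monthly)
    print(f"{monthly:,} sessions/month: {days} days")
```

Under these assumptions the same test takes 14 days at 60K sessions/month, 28 days at 30K, and 161 days at 5K, which is well past any window where seasonality can be ignored.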

№ 03 Sequential / Bayesian testing for thin-traffic sites

Sequential testing (often lumped together with Bayesian A/B testing in CRO tools, though the two are distinct methods) allows continuous monitoring and earlier stopping without inflating false-positive rates the way peeking at a fixed-horizon frequentist test does. Tools that implement one approach or the other well: Optimizely (Stats Engine), VWO (SmartStats), Convert.com, GrowthBook (open source).

The trade-off: sequential tests still need volume, just less of it. You might cut required sample size by 30-50%, not 90%. A 30K-floor site becomes a 15-20K-floor site. That still won’t make 5K-session sites testable for primary conversions, but it can make microconversion tests viable.
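To make the Bayesian flavor concrete, here is a minimal Beta-Binomial sketch that estimates the probability that the treatment’s true rate exceeds the control’s. It is not how any of the tools named above implement their engines. The uniform Beta(1, 1) prior, the function name, and the counts are all assumptions chosen for illustration.

```python
import random


def prob_treatment_beats_control(conv_c, n_c, conv_t, n_t, draws=20_000, seed=0):
    """Monte Carlo estimate of P(treatment rate > control rate) with Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Each posterior is Beta(1 + conversions, 1 + non-conversions)
        rate_c = rng.betavariate(1 + conv_c, 1 + n_c - conv_c)
        rate_t = rng.betavariate(1 + conv_t, 1 + n_t - conv_t)
        wins += rate_t > rate_c
    return wins / draws


# Hypothetical counts: 60 of 3,000 for control, 81 of 3,000 for treatment
p = prob_treatment_beats_control(60, 3_000, 81, 3_000)
print(f"P(treatment beats control) = {p:.3f}")
```

The decision threshold (for example, shipping only above 95%) should still be fixed before the test starts. The posterior narrows only as data arrives, which is why volume remains the constraint.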

№ 04 Pre/post analysis: the underrated alternative

If you ship a change and conversion goes from 2.1% to 2.7% across 8,000 sessions over the next 6 weeks, was the change responsible? Maybe. Pre/post analysis requires honest baselining: 8-12 weeks of pre-change data, control for seasonality, control for traffic source mix.

Pre/post isn’t as clean as A/B — you can’t rule out external factors. But for thin-traffic sites where A/B is impossible, it’s the honest answer. Document the limitations, name the confounders, and don’t over-claim. A directional result with named caveats beats a fake A/B test with falsified math.
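One way to avoid over-claiming is to report an interval for the change rather than a single number. The sketch below computes a simple Wald confidence interval for the absolute difference in conversion rate. The post-change counts follow the 2.7% across 8,000 sessions example. The pre-change sample of 12,000 sessions is an assumption, since the example doesn’t state one, and the function name is illustrative.

```python
import math
from statistics import NormalDist


def lift_confidence_interval(conv_pre, n_pre, conv_post, n_post, confidence=0.95):
    """Wald interval for the absolute change in conversion rate (post minus pre)."""
    p_pre = conv_pre / n_pre
    p_post = conv_post / n_post
    se = math.sqrt(p_pre * (1 - p_pre) / n_pre + p_post * (1 - p_post) / n_post)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p_post - p_pre
    return diff - z * se, diff + z * se


# Post period: 2.7% of 8,000 sessions = 216 conversions.
# Pre period (assumed): 2.1% of a hypothetical 12,000 sessions = 252 conversions.
low, high = lift_confidence_interval(252, 12_000, 216, 8_000)
print(f"absolute change: {low:+.2%} to {high:+.2%}")
```

The interval captures sampling noise only. It says nothing about seasonality or a shift in traffic-source mix, so those confounders still have to be named alongside the number.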

№ 05 What to test if you can test, and what not to

If you have the traffic, test things that change the page’s function, not its appearance. Worth testing: form length, form steps, pricing presentation, primary CTA copy, hero section structure. Not worth testing: button colors, font weights, image swaps (unless the image carries primary information).

The high-leverage tests are usually 3-7 per year on a mid-market site, not 12 per month. The work between tests is instrumentation and analysis, not constant variation.

What to avoid

  • Running an A/B test for “a week” with no sample-size calculation. The test is decorative. The result is noise.
  • Calling a 2% absolute lift “significant” without reporting the confidence interval. Significance is a calculated result, not a vibe.
  • Testing two color variants of the same button. Even if you have the traffic, the expected lift is tiny and not worth the engineering hours. Test function, not finish.