How to use this calculator
Before you launch an A/B test, the single most important question is: do I have enough traffic to detect a meaningful difference? Running a test with too few visitors is a waste of time — you'll either never reach significance, or (worse) you'll declare a false winner and ship a change that doesn't actually work.
This calculator answers that question. You plug in your current conversion rate and the minimum improvement you care about detecting, and you get the number of visitors per variant you'll need before trusting the result.
Choosing your baseline conversion rate
Look at your analytics for the page, button, or funnel step you want to optimize. Use the conversion rate from the last 30-90 days as your baseline. Avoid seasonal spikes (Black Friday, launch weeks) — they're not representative of normal traffic.
Choosing your minimum detectable effect (MDE)
MDE is expressed as a relative change. If your baseline is 2% and you set MDE at 10%, you're saying "I want to detect a lift from 2% to 2.2% or bigger."
- 5-10% — large sites (100k+ visitors/month). Detects small, subtle wins.
- 10-20% — most businesses. Standard industry setting.
- 20-40% — low-traffic sites. You can only reliably detect big swings.
Smaller MDE needs much more traffic. Cutting MDE in half requires roughly 4x the visitors. If your math says you need 50,000 visitors per variant and you don't have that traffic in a reasonable timeframe, test bigger changes instead — aim for 20%+ improvements that don't need a microscope to see.
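To make the relative-MDE arithmetic concrete, here is a minimal sketch (the helper name is illustrative, not part of Testio's codebase):

```typescript
// Convert a baseline rate and a relative MDE into the target variant rate.
// Illustrative helper only; not from Testio's codebase.
function targetRate(baseline: number, relativeMde: number): number {
  return baseline * (1 + relativeMde);
}

// A 2% baseline with a 10% relative MDE means detecting 2% -> 2.2%.
console.log(targetRate(0.02, 0.10)); // ≈ 0.022
// The same baseline with a 20% relative MDE targets 2.4%.
console.log(targetRate(0.02, 0.20)); // ≈ 0.024
```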
Significance level and power explained
The significance level (confidence) controls your false-positive rate: at 95% confidence, a test comparing two identical variants will still declare a "winner" about 5% of the time. 95% is the academic standard and what most tools default to. Testio's engine defaults to 90% because it pairs that threshold with a Bayesian stopping rule that cross-validates results, giving you equivalent rigor, faster.
Statistical power is the opposite question: if there really is a difference, how likely is the test to detect it? 80% is the industry default. Higher power means lower false-negative rate, but needs more traffic.
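Confidence and power settings translate into standard-normal critical values. A minimal lookup (values rounded from the standard normal table) might look like:

```typescript
// Two-tailed critical values for common confidence levels
// (standard normal table, rounded to three decimals).
const zAlphaTwoTailed: Record<number, number> = {
  0.90: 1.645, // 90% confidence
  0.95: 1.960, // 95% confidence
  0.99: 2.576, // 99% confidence
};

// One-tailed critical values for common power levels.
const zBeta: Record<number, number> = {
  0.80: 0.842, // 80% power
  0.90: 1.282, // 90% power
};
```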
The formula under the hood
n per variant = ( zα·√(2·p̄·(1 − p̄)) + zβ·√(p₁·(1 − p₁) + p₂·(1 − p₂)) )² / (p₁ − p₂)²

Where p₁ is the baseline rate, p₂ is the expected variant rate (baseline + MDE), and p̄ = (p₁ + p₂)/2 is the pooled average. zα and zβ come from the standard normal distribution based on your chosen significance level and power.
This is the exact formula implemented in Testio's backend at apps/api/src/lib/statistics.ts — no approximations, no rounding tricks.
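The standard two-proportion calculation can be sketched in TypeScript like this (function and parameter names are illustrative, not the actual statistics.ts implementation):

```typescript
// Sample size per variant for a two-proportion z-test.
// Illustrative sketch; not the actual apps/api/src/lib/statistics.ts code.
function sampleSizePerVariant(
  baseline: number,       // p1, e.g. 0.02 for a 2% conversion rate
  relativeMde: number,    // e.g. 0.10 for a 10% relative lift
  zAlpha: number = 1.96,  // 95% confidence, two-tailed
  zBeta: number = 0.842   // 80% power
): number {
  const p1 = baseline;
  const p2 = baseline * (1 + relativeMde);
  const pBar = (p1 + p2) / 2;
  const numerator =
    zAlpha * Math.sqrt(2 * pBar * (1 - pBar)) +
    zBeta * Math.sqrt(p1 * (1 - p1) + p2 * (1 - p2));
  return Math.ceil(numerator ** 2 / (p2 - p1) ** 2);
}

// 2% baseline, 10% MDE, 95%/80% defaults: roughly 80,000 per variant.
const n10 = sampleSizePerVariant(0.02, 0.10);
// Halving the MDE to 5% roughly quadruples the requirement.
const n5 = sampleSizePerVariant(0.02, 0.05);
console.log(n10, n5);
```

This also demonstrates the "cut MDE in half, need ~4x the traffic" rule: the denominator (p₂ − p₁)² shrinks by a factor of four while the numerator barely moves.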
Common mistakes to avoid
- Peeking at results early. Checking before reaching sample size and stopping when you see a "significant" result inflates your false positive rate dramatically.
- Running too many simultaneous tests on the same page. Variants interact, making results unreliable.
- Ignoring novelty and day-of-week effects. Run tests for at least a full business cycle (7+ days) even if you reach sample size earlier.
- Optimizing for the wrong metric. A higher click-through rate on a CTA means nothing if downstream revenue stays flat.
Frequently asked questions
Do I really need this much traffic? Other calculators give smaller numbers.
Most "smaller number" calculators sacrifice rigor for marketing appeal. They use generous assumptions (one-tailed tests, no correction for multiple variants, lower power). This calculator uses the same formula and defaults as Testio's real winner-detection engine, so the number matches what you'll actually need in production.
What if I have multiple variants (A/B/C or A/B/C/D)?
Each variant needs the full sample size. If you need 10,000 per variant and you're running 4 variants, you'll need 40,000 total visitors. More variants also slightly increase false-positive risk unless you apply a correction — Testio handles this automatically in production.
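The arithmetic above, plus one common multiple-comparison correction (Bonferroni — shown here as an illustration; Testio's exact correction method is not documented in this article), looks like:

```typescript
// Total visitors needed when every variant requires the full sample size.
function totalVisitors(perVariant: number, variantCount: number): number {
  return perVariant * variantCount;
}

// Bonferroni correction: divide the overall alpha across the
// treatment-vs-control comparisons. One common approach; Testio's
// production correction may differ.
function bonferroniAlpha(alpha: number, comparisons: number): number {
  return alpha / comparisons;
}

console.log(totalVisitors(10_000, 4));  // 40000 visitors for an A/B/C/D test
console.log(bonferroniAlpha(0.05, 3)); // ~0.0167 per comparison, 3 treatments
```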
What's the difference between frequentist and Bayesian A/B testing?
Frequentist (the method this calculator uses) requires a pre-committed sample size. Bayesian methods allow "early stopping" once the posterior probability crosses a threshold. Testio uses Bayesian decisions in production (probability-to-be-best ≥ 95% with expected loss < 0.001), which often declares winners faster than this calculator predicts — but this calculator gives you the upper bound safe number.
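The Bayesian "probability to be best" idea can be illustrated with a Monte Carlo sketch over Beta posteriors (this is a toy illustration of the concept, not Testio's production engine):

```typescript
// Monte Carlo estimate of P(variant beats control) under Beta posteriors.
// Toy illustration of the Bayesian approach; not Testio's production code.

// Standard normal sample via Box-Muller.
function randNormal(): number {
  let u = 0, v = 0;
  while (u === 0) u = Math.random();
  while (v === 0) v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Gamma(shape, 1) sample via Marsaglia-Tsang (valid for shape >= 1,
// which holds here because Beta parameters are counts + 1).
function randGamma(shape: number): number {
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    const x = randNormal();
    const v = (1 + c * x) ** 3;
    if (v <= 0) continue;
    const u = Math.random();
    if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) return d * v;
  }
}

// Beta(a, b) as a ratio of Gamma samples.
function randBeta(a: number, b: number): number {
  const ga = randGamma(a);
  const gb = randGamma(b);
  return ga / (ga + gb);
}

// P(variant's true rate > control's) with uniform Beta(1, 1) priors.
function probabilityToBeBest(
  controlConv: number, controlTotal: number,
  variantConv: number, variantTotal: number,
  draws = 20_000
): number {
  let wins = 0;
  for (let i = 0; i < draws; i++) {
    const pControl = randBeta(1 + controlConv, 1 + controlTotal - controlConv);
    const pVariant = randBeta(1 + variantConv, 1 + variantTotal - variantConv);
    if (pVariant > pControl) wins++;
  }
  return wins / draws;
}

// A clearly winning variant (150 vs 100 conversions out of 5,000 each)
// should show a probability-to-be-best near 1.
console.log(probabilityToBeBest(100, 5000, 150, 5000));
```

A Bayesian engine stops once this probability crosses its threshold (e.g. 95%), which is why it can call winners before the frequentist sample size is reached.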
Can I reduce the sample size I need?
Three ways: (1) test bigger changes that produce larger lifts, (2) optimize a page with a higher baseline conversion rate, (3) accept lower statistical power (risking more false negatives). Never just "wait until it looks significant" — that's not a valid reduction.
Ready to run your A/B test?
Visual editor, automatic winner detection, and real-time results. 3-day free trial, from $9/mo.
Start free trial →