All examples for Power Analysis Sample Size Calculator

A/B Test Sample Size for Signup Conversion

Calculate required sample size and test duration for an A/B test on a signup page with 4.2% baseline conversion, targeting a 15% relative improvement with 750 daily visitors per variant.

Input

What you provide to the skill

A/B test for signup page. Baseline 4.2% conversion, want to detect 15% relative increase. 750 daily visitors per variant.

Output

What the skill generates for you

A/B TEST POWER ANALYSIS RESULTS

Your Test Parameters

  • Baseline conversion rate: 4.20%
  • Minimum detectable effect: 15% relative increase
  • Target conversion rate: 4.83% (absolute increase of 0.63%)
  • Daily traffic per variant: 750 visitors
  • Statistical power: 80%
  • Significance level (alpha): 5%

Required Sample Size

  • 17,031 visitors per variant (34,062 total)

Test Duration

  • ~23 days (22.7 days at current traffic)
  • Recommended: 4 weeks to reach full sample

Sensitivity Analysis

How sample size and duration change with different effect sizes:

Relative Effect Target Rate Sample/Variant Total Sample Duration
10% 4.62% 37,493 74,986 50 days
15% 4.83% 17,031 34,062 23 days
20% 5.04% 9,784 19,568 13 days
25% 5.25% 6,391 12,782 9 days
30% 5.46% 4,527 9,054 6 days
40% 5.88% 2,644 5,288 4 days

Critical Stopping Rules

⚠️ DO NOT STOP EARLY

  • Wait the full 23 days until BOTH variants reach 17,031 visitors
  • Do NOT peek at results and stop when p < 0.05
  • Make ONE decision at the end of the test period

Why this matters: Early stopping when results “look good” inflates your false positive rate from 5% to 26%+. You risk shipping a worse variant.

Recommendations

  1. Plan for 4 weeks: This gives buffer for traffic fluctuations
  2. Sample size is moderate: At your traffic level, this is a realistic test duration
  3. If you need faster results: Consider testing larger effects (20%+) or increasing traffic allocation
  4. Document stopping criteria upfront: Commit to the 17,031 sample size before starting

What This Means

With 17,031 visitors per variant, you’ll have an 80% chance of detecting a 15% relative improvement (from 4.20% to 4.83%) if it truly exists. This is the standard threshold for well-powered experiments.