Statistical Significance
Know when your results are real, not noise.
A/B testing without statistical significance is just guessing with extra steps. HolyShift tracks significance automatically for every test, tells you when you have enough data to trust the results, and declares a winner when the evidence is strong enough to act on.
What statistical significance means
When you run an A/B test, one variant will always be "ahead" — even after just 10 visitors. The question is whether that lead represents a real difference or random chance. Statistical significance answers that question.
A result is statistically significant when there's enough data to confidently say the difference between variants isn't due to random variation. HolyShift uses a 95% confidence threshold by default — meaning there's less than a 5% chance the observed difference is just noise.
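Under the hood, a check like this is typically a two-proportion z-test. Here's a minimal sketch of that test in Python; it's a standard textbook illustration, not HolyShift's exact method:

```python
from math import sqrt, erf

def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test: how likely is a gap this large
    if the variants actually convert at the same underlying rate?"""
    p1, p2 = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)        # combined conversion rate
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = abs(p2 - p1) / se
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))   # normal-CDF tail, both sides

# 10 visitors each: B "leads" 3 conversions to 1, but it's still noise
print(p_value(1, 10, 3, 10))          # ~0.26 -> far from significant
# Same rates at 1,000 visitors each: now the lead is decisive
print(p_value(100, 1000, 300, 1000))  # effectively 0 -> significant
```

A p-value at or below 0.05 is what clears the 95% confidence threshold described above.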
How HolyShift tracks it
You don't need to understand the math. HolyShift handles the calculations and shows you a clear status for each test:
Needs more data
The test has just started or doesn't have enough traffic yet. The current leader might change. Don't make decisions based on this data.
Trending
One variant is pulling ahead, but the result isn't conclusive yet. Keep the test running: the data is moving toward a conclusion, but it isn't there yet.
Statistically significant
The result is reliable. The winning variant converts better than the others, and there's enough data to trust that conclusion. HolyShift notifies you when a test reaches this stage.
Winner declared
HolyShift has identified a winner with high confidence. You can deploy the winning variant as your primary page — manually or automatically if auto-deploy is enabled.
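Roughly speaking, these statuses map onto sample size and confidence like the sketch below. The thresholds are hypothetical, chosen purely for illustration; HolyShift doesn't publish its internal rules:

```python
def test_status(p_value: float, visitors_per_variant: int,
                min_sample: int = 100) -> str:
    """Map raw test statistics to a dashboard status.

    All thresholds here are illustrative assumptions, not HolyShift's
    published logic."""
    if visitors_per_variant < min_sample:
        return "Needs more data"            # too little traffic to judge
    if p_value <= 0.05:
        return "Statistically significant"  # 95% confidence cleared
    if p_value <= 0.20:
        return "Trending"                   # a leader is emerging; keep running
    return "Needs more data"                # no meaningful separation yet

print(test_status(0.26, 10))    # Needs more data
print(test_status(0.03, 1000))  # Statistically significant
```

"Winner declared" would then follow once significance holds up under the final checks.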
What you'll see
For each active test, the dashboard shows:
- Conversion rate per variant — the percentage of visitors who completed your primary goal (form submission, CTA click)
- Confidence level — how confident the system is in the current leader (displayed as a percentage)
- Sample size per variant — how many visitors each variant has received
- Estimated time to significance — how much longer the test needs to run based on current traffic levels (see the sketch after this list)
- Winner notification — an alert when significance is reached
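As a rough sketch of how a time-to-significance estimate can work (a hypothetical helper, not HolyShift's API), you project how many more visitors you need and divide by your current traffic:

```python
def days_to_significance(visitors_so_far: int, visitors_needed: int,
                         daily_visitors: int) -> float:
    """Estimate remaining test duration from current traffic.

    `visitors_needed` comes from a sample-size calculation (see the
    FAQ below); all names here are illustrative."""
    remaining = max(0, visitors_needed - visitors_so_far)
    return remaining / daily_visitors

# 120 of ~434 required visitors so far, at 40 visitors/day per variant
print(days_to_significance(120, 434, 40))  # ~7.9 more days
```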
When to act on results
Act when HolyShift says significant. Not before. Not when one variant "looks like it's winning." Not after a day. Not because you feel like the test has been running long enough.
The most common A/B testing mistake is stopping early. A variant that's winning after 100 visitors might lose after 500. HolyShift's significance tracking protects you from this by telling you exactly when the data supports a conclusion.
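You can watch this happen in a quick simulation. Both variants below share the exact same true conversion rate, yet the apparent "leader" drifts from checkpoint to checkpoint purely by chance:

```python
import random

def simulate_identical_variants(total=2000, rate=0.05,
                                checkpoints=(100, 500, 2000)):
    """Two variants with the SAME true rate; the apparent leader
    still changes as traffic accumulates."""
    a = b = 0
    for n in range(1, total + 1):
        a += random.random() < rate   # conversion for variant A
        b += random.random() < rate   # conversion for variant B
        if n in checkpoints:
            print(f"{n:5d} visitors/variant: A={a} B={b} "
                  f"leader={'A' if a >= b else 'B'}")

simulate_identical_variants()  # run it a few times; the leader often flips
```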
Auto-deploy winners
When a test reaches statistical significance, HolyShift can automatically deploy the winning variant as your primary landing page. This closes the optimization loop: test, learn, deploy — without manual intervention.
Auto-deploy is optional. You can review the results and deploy manually if you prefer.
FAQ
What confidence level does HolyShift use?
95% by default. This is the industry standard for A/B testing. It means there's a 5% or lower probability that the observed difference is due to random chance.
How much traffic do I need?
It depends on your base conversion rate and the size of the difference between variants. A rough guide:
- Large differences (e.g., 5% vs 10% conversion) need ~200 visitors per variant
- Small differences (e.g., 5% vs 6% conversion) need ~3,000+ visitors per variant
HolyShift estimates the time to significance based on your current traffic, so you don't need to calculate this yourself.
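If you're curious where figures like these come from, below is the textbook two-proportion sample-size formula (a standard approximation, not HolyShift's internal calculation). The required sample grows with the inverse square of the difference you're trying to detect, and it also depends on how much statistical power you demand:

```python
from math import sqrt

def visitors_per_variant(p1, p2, z_conf=1.96, z_power=0.84):
    """Approximate sample size per variant for a two-proportion test.

    z_conf=1.96 -> 95% confidence (two-sided); z_power=0.84 -> 80% power."""
    p_bar = (p1 + p2) / 2
    numerator = (z_conf * sqrt(2 * p_bar * (1 - p_bar))
                 + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return numerator / (p1 - p2) ** 2

print(round(visitors_per_variant(0.05, 0.10)))  # ~434 at 80% power
print(round(visitors_per_variant(0.05, 0.06)))  # ~8,150 at 80% power
# With z_power=0 (catching the difference only half the time), these
# drop to ~213 and ~3,993, in line with the rough guide above.
```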
What if no variant wins?
If the test runs long enough and no variant achieves significance, any difference between the variants is too small to detect at that traffic volume, and in practice too small to matter. That's a valid result: it tells you this particular variable doesn't meaningfully affect conversion, and you should test something else.
Can I adjust the confidence threshold?
Not currently. The 95% threshold is fixed to prevent premature conclusions. Advanced confidence settings are on the roadmap.
Does traffic source affect significance?
Traffic composition matters. If one variant gets mostly organic traffic and another gets mostly paid traffic, the comparison isn't fair. HolyShift randomizes visitor assignment across all traffic sources to prevent this. UTM data is tracked per variant so you can verify the traffic mix.
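A common way to implement source-independent assignment is hash-based bucketing, sketched below. This illustrates the general pattern, not HolyShift's actual implementation:

```python
import hashlib

def assign_variant(visitor_id: str, variants=("A", "B")) -> str:
    """Deterministic bucketing: the same visitor always gets the same
    variant, and the split ignores traffic source entirely, so every
    source ends up divided evenly across variants. Illustrative only."""
    digest = hashlib.sha256(visitor_id.encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

print(assign_variant("visitor-42"))  # same id -> same variant, every visit
```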
What's next
- Creating variants — set up your A/B test
- A/B testing overview — the full closed-loop system
- UTM tracking — understand traffic sources per variant
