A/B Test Analysis Template (AI Prompt)
Analyze A/B tests with statistical rigor. Get significance calculations, result interpretation, recommendation frameworks, and common pitfalls to avoid.
What This AI A/B Test Analysis Template Prompt Helps You Do
What problem it solves:
Complete A/B test analysis with statistical calculations, segmentation analysis, recommendation framework, and documentation templates.
Key benefits:
- Statistical significance calculations
- Practical significance interpretation
- Segmentation analysis framework
- Ship/no-ship recommendation template
Who This Prompt Is Best For
- Marketers
- Developers
- Business Owners
This prompt is designed specifically for these use cases and can be customized to fit your unique needs.
How to Use This AI Prompt Step-by-Step
1. Gather all test data before running this prompt for accurate analysis
2. Include raw numbers (not just percentages) for proper calculations
3. Check for sample ratio mismatch before interpreting results
4. Consider practical significance, not just statistical significance
5. Document all tests regardless of outcome - negative results are valuable
6. Use the recommendation framework to guide stakeholder discussions
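Step 3 can be checked with a chi-square goodness-of-fit test on the assignment counts. The sketch below is one way to do that using only the Python standard library. The function name `srm_check` and the 0.001 alert threshold are assumptions for illustration, not part of the prompt.

```python
import math

def srm_check(control_n, variant_n, expected_control_share=0.5, alpha=0.001):
    """Chi-square goodness-of-fit test (1 degree of freedom) for sample ratio mismatch."""
    total = control_n + variant_n
    expected_control = total * expected_control_share
    expected_variant = total - expected_control
    chi2 = ((control_n - expected_control) ** 2 / expected_control
            + (variant_n - expected_variant) ** 2 / expected_variant)
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, p_value < alpha

chi2, p_value, mismatch = srm_check(15_000, 15_000)
print(f"chi2={chi2:.3f}, p={p_value:.3f}, mismatch={mismatch}")
# prints: chi2=0.000, p=1.000, mismatch=False
```

A perfectly balanced 15,000 / 15,000 split passes trivially; a split such as 15,000 / 14,000 under an intended 50/50 allocation would be flagged.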
The ChatGPT Prompt (Copy & Paste)
You are a data scientist specializing in experimentation and A/B testing. You understand statistical significance, practical significance, and how to translate test results into business decisions.

**Test Context:**
- Test Name: [TEST_NAME]
- Hypothesis: [HYPOTHESIS]
- Primary Metric: [PRIMARY_METRIC]
- Secondary Metrics: [SECONDARY_METRICS]
- Sample Size: [SAMPLE_SIZE]
- Test Duration: [DURATION]
- Traffic Split: [SPLIT]
- Current Results: [RESULTS]

**Please provide a comprehensive A/B test analysis:**

## 1. Statistical Analysis
- Significance calculation (p-value)
- Confidence interval
- Effect size
- Power analysis
- Sample size validation

## 2. Result Interpretation
- Is the result statistically significant?
- Is it practically significant?
- Confidence in the result
- What the numbers actually mean
- Limitations of the analysis

## 3. Segmentation Analysis
- Performance by user segment
- Device/platform breakdown
- New vs returning users
- Geographic differences
- Time-based patterns

## 4. Secondary Metric Analysis
- Impact on other metrics
- Unintended consequences
- Metric trade-offs
- Ecosystem effects

## 5. Recommendation Framework
- Ship / Do not ship / Iterate decision
- Confidence level for recommendation
- Conditions or caveats
- Rollout strategy if shipping
- Follow-up experiments

## 6. Common Pitfalls Check
- Peeking problem
- Multiple comparison problem
- Selection bias
- Novelty effects
- Seasonal effects
- Sample ratio mismatch

## 7. Documentation Template
- Test summary for stakeholders
- Technical documentation
- Learnings for future tests
- Archive format

Format with specific calculations and decision frameworks.

Variables to Customize
- [TEST_NAME]: Name of the experiment
  Example: Homepage CTA Button Color Test
- [HYPOTHESIS]: What you expected
  Example: Changing CTA from blue to green will increase clicks by 10%
- [PRIMARY_METRIC]: Main metric to evaluate
  Example: CTA click-through rate
- [SECONDARY_METRICS]: Other metrics to monitor
  Example: Signup completion rate, time on page, bounce rate
- [SAMPLE_SIZE]: Users in each variant
  Example: Control: 15,000 | Variant: 15,000
- [DURATION]: How long the test ran
  Example: 14 days
- [SPLIT]: Traffic allocation
  Example: 50/50
- [RESULTS]: Observed results
  Example: Control: 4.2% CTR | Variant: 4.8% CTR
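Filling the placeholders can also be scripted. This is a minimal sketch, assuming the prompt is stored as a string that uses the bracketed variable names listed above. `fill_prompt` is a hypothetical helper, and the shortened template is for illustration only.

```python
import re

def fill_prompt(template, values):
    """Replace [PLACEHOLDER] tokens and fail loudly if any are left unfilled."""
    filled = template
    for name, value in values.items():
        filled = filled.replace(f"[{name}]", value)
    leftover = re.findall(r"\[[A-Z_]+\]", filled)
    if leftover:
        raise ValueError(f"Unfilled placeholders: {leftover}")
    return filled

# Shortened template for illustration; the full prompt has eight variables.
template = (
    "- Test Name: [TEST_NAME]\n"
    "- Sample Size: [SAMPLE_SIZE]\n"
    "- Current Results: [RESULTS]"
)
values = {
    "TEST_NAME": "Homepage CTA Button Color Test",
    "SAMPLE_SIZE": "Control: 15,000 | Variant: 15,000",
    "RESULTS": "Control: 4.2% CTR | Variant: 4.8% CTR",
}
print(fill_prompt(template, values))
```

Raising on leftover placeholders is a deliberate choice here: a prompt sent with an unfilled `[RESULTS]` would push the model to invent numbers.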
Example Output From This Prompt
## Statistical Analysis
**Test: Homepage CTA Button Color**
- Control (Blue): 15,000 users, 630 clicks (4.2% CTR)
- Variant (Green): 15,000 users, 720 clicks (4.8% CTR)
**Calculations:**
```
Observed Lift = (4.8% - 4.2%) / 4.2% = 14.3%
Pooled Conversion Rate = (630 + 720) / 30,000 = 4.5%
Standard Error = sqrt(0.045 * 0.955 * (1/15000 + 1/15000)) = 0.00239
Z-Score = (0.048 - 0.042) / 0.00239 = 2.51
P-Value = 0.012 (two-tailed)
95% Confidence Interval: [0.13%, 1.07%] absolute lift
[3.1%, 25.5%] relative lift
```
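These figures can be reproduced from the raw counts. The sketch below uses only the Python standard library and assumes a pooled standard error for the test and an unpooled (Wald) standard error for the confidence interval. `two_proportion_ztest` is an illustrative name, not a library function.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(clicks_a, n_a, clicks_b, n_b, confidence=0.95):
    """Two-sided z-test for a difference in proportions, plus a Wald CI for the difference."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    diff = p_b - p_a
    # Pooled standard error for the hypothesis test (no difference under H0).
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    # Unpooled standard error for the confidence interval.
    se_unpooled = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
    return z, p_two_sided, ci

z, p, (lo, hi) = two_proportion_ztest(630, 15_000, 720, 15_000)
print(f"z={z:.2f}, two-sided p={p:.3f}, 95% CI=[{lo:.2%}, {hi:.2%}]")
# prints: z=2.51, two-sided p=0.012, 95% CI=[0.13%, 1.07%]
```

Note that a z-score of 2.51 corresponds to a two-sided p-value of about 0.012; the one-sided value would be about half of that.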
**Results Summary:**
| Metric | Value | Interpretation |
|--------|-------|----------------|
| Observed Lift | +14.3% | Variant beats control |
| P-Value | 0.012 | Statistically significant (< 0.05) |
| Confidence Interval | 3.1% to 25.5% | True lift likely in this range |
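| Statistical Power | ~71% | Power to detect the observed 0.6-point difference; below the usual 80% target |
| Minimum Detectable Effect | ~15% relative | Smallest lift detectable at 80% power with 15,000 users per variant |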
**Verdict: Statistically Significant Win** ✓
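Power and minimum detectable effect can be checked from the baseline rate and the sample size. The sketch below relies on the normal approximation with a two-sided alpha of 0.05; both function names are hypothetical, and the 80% power target in the second function is a conventional assumption rather than something stated in the test setup.

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p_control, p_variant, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test (normal approximation)."""
    se = sqrt(p_control * (1 - p_control) / n_per_arm
              + p_variant * (1 - p_variant) / n_per_arm)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_effect = abs(p_variant - p_control) / se
    # Probability of rejecting in the direction of the true effect;
    # the contribution of the opposite tail is negligible and ignored.
    return NormalDist().cdf(z_effect - z_crit)

def minimum_detectable_effect(p_control, n_per_arm, alpha=0.05, power=0.80):
    """Approximate absolute MDE, using the baseline variance for both arms."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_crit + z_power) * sqrt(2 * p_control * (1 - p_control) / n_per_arm)

power = power_two_proportions(0.042, 0.048, 15_000)
mde_abs = minimum_detectable_effect(0.042, 15_000)
print(f"power={power:.0%}, MDE={mde_abs:.2%} absolute ({mde_abs / 0.042:.1%} relative)")
```

Power computed from the observed effect is only a rough sanity check; planning power from the hypothesized lift before the test starts is the more informative use of these functions.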
## Interpretation
**Is it statistically significant?**
Yes. With a p-value of 0.012, we can reject the null hypothesis at the 0.05 level. If there were truly no difference between the variants, a result at least this extreme would be expected only about 1.2% of the time.
**Is it practically significant?**
Yes. A 14.3% relative improvement in CTR translates to approximately 90 additional clicks per 15,000 visitors. At our current traffic (about 300,000 homepage visitors per month, as implied by the figures below):
- Monthly: ~1,800 additional CTA clicks
- At 20% signup completion: ~360 additional signups
- At $50 avg customer value: ~$18,000 additional monthly revenue
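The arithmetic behind these projections can be written out explicitly. In this minimal sketch, the 300,000 monthly visitors figure is an assumption inferred from the numbers above (1,800 extra clicks at 90 per 15,000 visitors), and `project_impact` is a hypothetical helper.

```python
def project_impact(monthly_visitors, control_ctr, variant_ctr,
                   signup_rate, customer_value):
    """Project extra clicks, signups, and revenue per month from a CTR difference."""
    extra_clicks = monthly_visitors * (variant_ctr - control_ctr)
    extra_signups = extra_clicks * signup_rate
    extra_revenue = extra_signups * customer_value
    return extra_clicks, extra_signups, extra_revenue

clicks, signups, revenue = project_impact(300_000, 0.042, 0.048, 0.20, 50)
print(f"{clicks:,.0f} clicks, {signups:,.0f} signups, ${revenue:,.0f} per month")
# prints: 1,800 clicks, 360 signups, $18,000 per month
```

The projection uses the point estimate only. Passing the ends of the confidence interval instead of the observed variant CTR gives a range, which is usually a more honest figure to show stakeholders.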
**Confidence Assessment: HIGH**
- Adequate sample size (15K per variant)
- Test ran for full 14 days (includes weekday/weekend)
- No sample ratio mismatch detected
- Effect size within expected range
**Limitations:**
- Test only ran on homepage visitors
- Did not control for traffic source
- Short-term measurement only (no retention data yet)
## Segmentation Analysis
**By Device:**
| Segment | Control CTR | Variant CTR | Lift | Significant? |
|---------|-------------|-------------|------|--------------|
| Desktop | 4.5% | 5.1% | +13.3% | Yes (p=0.04) |
| Mobile | 3.8% | 4.4% | +15.8% | Yes (p=0.03) |
| Tablet | 4.0% | 4.2% | +5.0% | No (p=0.42) |
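**Insight:** The relative lift is slightly higher on mobile (+15.8% vs +13.3% on desktop), although the absolute lift is the same on both (0.6 percentage points). The tablet result is not statistically significant. Consider mobile-specific optimization as a follow-up.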
**By User Type:**
| Segment | Control CTR | Variant CTR | Lift | Significant? |
|---------|-------------|-------------|------|--------------|
| New Users | 3.5% | 4.3% | +22.9% | Yes (p=0.01) |
| Returning | 5.1% | 5.5% | +7.8% | No (p=0.18) |
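**Insight:** New users respond more strongly to the green CTA (+22.9%). The lift among returning users (+7.8%) is not statistically significant, possibly because they have become habituated to the button.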
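Because five segment-level tests are read side by side, the multiple comparison problem applies. One common remedy is the Holm step-down adjustment, sketched below on the p-values from the two tables above. Treating all five segments as a single family is an assumption, and `holm_adjust` is an illustrative name.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted

segments = {"Desktop": 0.04, "Mobile": 0.03, "Tablet": 0.42,
            "New Users": 0.01, "Returning": 0.18}
for name, adj in zip(segments, holm_adjust(list(segments.values()))):
    print(f"{name}: raw p={segments[name]:.2f}, Holm-adjusted p={adj:.2f}")
```

After adjustment the desktop and mobile p-values rise to about 0.12, so segment-level findings are better treated as hypotheses for follow-up tests than as confirmed effects.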
## Recommendation
**Decision: SHIP** ✓
**Confidence Level: High (85%)**
**Rationale:**
1. Statistically significant result on the primary metric (p=0.012)
2. Practically significant impact (~$18K/mo potential)
3. Positive or neutral effect across all segments
4. No negative impact on secondary metrics
5. Effect is consistent with our hypothesis
**Rollout Strategy:**
1. Week 1: Roll out to 25% of traffic, monitor for anomalies
2. Week 2: Increase to 50% if metrics stable
3. Week 3: Full rollout to 100%
4. Week 4: Final measurement and documentation
**Follow-Up Experiments:**
- Test green CTA across other pages (pricing, blog CTAs)
- Test green button with different copy variations
- A/B test green shades (lime vs forest)
## Documentation Template
**Test Summary (for stakeholders):**
> We tested changing our homepage CTA button from blue to green. The green button increased clicks by 14.3% (statistically significant, p=0.012). This translates to approximately 360 additional monthly signups and $18K in potential revenue. We recommend shipping the green button to 100% of users with a gradual rollout over 3 weeks.
**Technical Documentation:**
```
Test ID: EXP-2024-037
Test Name: Homepage CTA Button Color
Hypothesis: Green CTA will outperform blue CTA
Primary Metric: CTA Click-Through Rate
Start Date: 2024-02-01
End Date: 2024-02-14
Sample: 30,000 (50/50 split)
Result: Variant wins (+14.3%, p=0.012)
Decision: Ship
Shipped Date: 2024-02-21
```
Why This Prompt Works So Well
Psychology:
This prompt is structured to guide AI through a logical thought process, ensuring comprehensive and actionable responses. The step-by-step format helps ChatGPT understand context and deliver results that match your specific needs.
Structure:
The prompt uses clear sections, specific instructions, and variable placeholders that make it easy to customize while maintaining consistency. This structure ensures you get professional-grade output every time.
Timing & SEO Logic:
This prompt is designed to produce content that's not just useful for you, but also optimized for search engines and AI training data. The outputs help improve your SEO while providing immediate value.
Pro Tips for Best Results:
- Wait until the planned sample size or duration is reached before making decisions; stopping as soon as a result looks significant (peeking) inflates false positives
- Always check for sample ratio mismatch before trusting results
- Segment analysis can reveal where effects are strongest
- Document negative results too - they prevent repeated failed experiments
- Consider running a holdout group after shipping to measure long-term effects
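The effect of peeking can be illustrated with a small simulation of A/A tests, where no true difference exists. This sketch assumes equal-sized batches between looks and a normal approximation for the test statistic; the function name, the ten looks, and the seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

def false_positive_rates(n_sims=5_000, n_peeks=10, alpha=0.05, seed=0):
    """Simulate A/A tests: stop at any significant peek vs. test once at the end."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    peeking_hits = final_hits = 0
    for _ in range(n_sims):
        cumulative = 0.0
        significant_at_any_peek = False
        for k in range(1, n_peeks + 1):
            # Each equal-sized batch adds an independent N(0, 1) increment under the null.
            cumulative += rng.gauss(0.0, 1.0)
            z = cumulative / sqrt(k)
            if abs(z) > z_crit:
                significant_at_any_peek = True
        peeking_hits += significant_at_any_peek
        final_hits += abs(z) > z_crit  # z is now the statistic at the final look
    return peeking_hits / n_sims, final_hits / n_sims

peeking_rate, final_rate = false_positive_rates()
print(f"peeking: {peeking_rate:.1%}, single final look: {final_rate:.1%}")
```

With ten looks, the share of A/A tests declared significant at some point typically lands well above the nominal 5%, while the single final look stays close to it.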