Measure and Learn

With AB Tasty, you can get the most out of your campaigns at every stage:

  • During the campaign: analyze your live hits to monitor real-time user interactions

  • At the end of the campaign: wait for Readiness to make sure your results are statistically significant and actionable

  • After the campaign: analyze your reports, learn from them, and iterate to keep improving your CRO

Access your reporting

1. Go to your campaigns dashboard.

2. Click to open your campaign reporting.

Monitor live hits

The Live Hits feature in AB Tasty allows you to monitor real-time user interactions on your campaigns. This is especially useful for quality assurance (QA) and for verifying that your campaign is tracking the right events before you start collecting data for analysis.

1. On the reporting page, at the bottom of the left side panel, click the Live hits button (thunderbolt icon).

The request can take up to 30 seconds to be approved. The button then changes to Live hits ready.

2. Click "View live hits" to access the window displaying the current hits.

Hits are displayed in the window for 10 minutes, after which they disappear.

3. Monitor your live hits.

Read our Live hits on the reporting article for more details on this specific reporting view.

Ensure your campaign report readiness

The campaign readiness indicator is based on the campaign’s primary goal performance. When the primary goal is ready, meaning it has reached the required number of days, conversions, and visitors, the campaign is considered ready and its results reliable.

When the campaign results are reliable, the reporting button in the campaign dashboard changes to display a green check mark instead of grey or orange graphic bars.

Read our Reporting Readiness article for more details on this feature and to learn how to read each goal’s readiness.

Analyze your reports: our best practices

1. Check experiment health before you interpret results

  • Status and scope: included pages, targeted audience, devices, traffic allocated per variant

  • Volume and coverage: enough users/events in the analysis window; include at least one full business cycle (often one to two weeks with a weekend)

  • Data quality

    • Primary objective is tracked correctly (for example, purchase, lead, key click)

    • No tracking degradation during the experiment (release, consent, tag changes)

    • No obvious anomalies (bot spikes, errors)

  • Sample ratio mismatch (SRM): the traffic share per variant should be close to the planned allocation. If there is a large gap, investigate before deciding
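
If you want to sanity-check the split yourself, a chi-square goodness-of-fit test against the planned allocation is one common approach. The sketch below is a minimal Python illustration rather than an AB Tasty feature; the visitor counts reuse the Airvoyage figures from later in this article, and the 50/50 allocation is an assumption.

```python
# Hypothetical SRM check: compare observed visitors per variant against
# the planned 50/50 allocation with a chi-square goodness-of-fit test.
from scipy.stats import chisquare

observed = [50_000, 50_200]           # visitors actually bucketed per variant
planned_split = [0.5, 0.5]            # allocation configured for the campaign

total = sum(observed)
expected = [total * share for share in planned_split]

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")

# A very small p-value (for example < 0.001) suggests a sample ratio
# mismatch: investigate targeting, redirects, or tracking before deciding.
if p_value < 0.001:
    print("Possible SRM: investigate before trusting the results.")
```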

2. Read the summary card: five numbers that matter (a worked example follows the list)

  • Exposures and conversions: users (or sessions) exposed per variant and conversions on the primary KPI

  • Conversion rate (CR): conversions / exposures

  • Relative uplift

  • Absolute impact (for clarity): extra conversions per 1,000 visitors

  • Business value (if available): revenue per visitor, average order value, margin
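
As a concrete illustration of these five numbers, here is a short Python sketch using the Airvoyage figures from later in this article; the raw conversion counts are hypothetical values chosen to match the stated rates.

```python
# Illustrative computation of the summary-card numbers (hypothetical counts
# chosen to match the Airvoyage example: 5.0% vs 5.6% conversion rate).
exposures_a, conversions_a = 50_000, 2_500   # control
exposures_b, conversions_b = 50_200, 2_811   # Social Proof variant

cr_a = conversions_a / exposures_a           # conversion rate, control
cr_b = conversions_b / exposures_b           # conversion rate, variant

relative_uplift = (cr_b - cr_a) / cr_a               # about +0.12, i.e. +12%
absolute_impact_per_1000 = (cr_b - cr_a) * 1_000     # about +6 conversions

print(f"CR A: {cr_a:.1%}  CR B: {cr_b:.1%}")
print(f"Relative uplift: {relative_uplift:+.0%}")
print(f"Absolute impact: {absolute_impact_per_1000:+.1f} per 1,000 visitors")
```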

3. Read the statistics (a sketch follows the list):

  • Probability to be best (chance to win)

  • Credible interval around the uplift

  • Simple rule: probability ≥ 95% and stable data → you can consider a decision
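
For intuition only, the sketch below estimates a probability to be best and a credible interval with a simple Beta-Binomial model and Monte Carlo sampling. It is an illustrative approximation, not AB Tasty’s statistical engine, and the conversion counts are hypothetical.

```python
# Minimal Bayesian sketch: a Beta(1, 1) prior plus binomial data gives a
# Beta posterior on each variant's conversion rate; sampling the posteriors
# approximates "probability to be best" and a credible interval.
import numpy as np

rng = np.random.default_rng(42)

conv_a, n_a = 2_500, 50_000    # hypothetical control counts
conv_b, n_b = 2_811, 50_200    # hypothetical variant counts

samples_a = rng.beta(1 + conv_a, 1 + n_a - conv_a, size=200_000)
samples_b = rng.beta(1 + conv_b, 1 + n_b - conv_b, size=200_000)

prob_b_best = (samples_b > samples_a).mean()
uplift_samples = (samples_b - samples_a) / samples_a
low, high = np.percentile(uplift_samples, [2.5, 97.5])

print(f"Probability B is best: {prob_b_best:.1%}")
print(f"95% credible interval for uplift: [{low:+.1%}, {high:+.1%}]")
```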

4. Monitor guardrail metrics before you ship

  • Check that secondary indicators do not degrade (for example, error rate, performance, bounce, complaints, margin)

  • If the primary KPI is positive but a guardrail is significantly negative, prefer iterating over a blunt rollout

5. Review segments for consistency

  • Look at two to three key cuts: device (mobile/desktop), new vs returning users, traffic source

  • Do not decide based only on a segment if your targeting was not segmented from the start

  • If a major segment reacts differently, capture the insight for a future targeted personalization

6. Apply simple decision rules (a sketch follows the risk tip below)

  • Ship: ask your team to develop it or turn your test into a personalization

    • Positive uplift and statistical threshold met (probability ≥ 95% or p ≤ 0.05)

    • Sufficient duration (at least one business cycle)

    • Guardrails are fine

  • Hold

    • Result is near the threshold, trend is not stable, or volume is still low

  • Iterate

    • No detectable difference and/or the minimum detectable effect (MDE) is not reachable soon; revisit the hypothesis, design, or targeting

  • Roll back or stop

    • Negative impact with threshold met, or a data issue (SRM, tracking)

Risk tip: for sensitive changes, use a progressive rollout with a feature flag and monitor guardrails with our FE&R solution.
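
To make these rules concrete, here is a hedged sketch that encodes them as a small helper function. The thresholds and parameter names are assumptions for illustration, not an AB Tasty API.

```python
# Illustrative decision helper; thresholds are assumptions, adjust to your
# own risk tolerance and agree on them before looking at the results.
def decide(prob_to_be_best: float,
           uplift: float,
           full_business_cycle: bool,
           guardrails_ok: bool,
           data_issue: bool = False) -> str:
    """Map experiment results to ship / hold / iterate / roll back."""
    if data_issue:
        return "roll back or stop: fix the SRM or tracking issue first"
    if prob_to_be_best >= 0.95 and uplift < 0:
        return "roll back or stop"
    if prob_to_be_best >= 0.95 and uplift > 0 and full_business_cycle and guardrails_ok:
        return "ship"
    if 0.90 <= prob_to_be_best < 0.95 or not full_business_cycle:
        return "hold: keep collecting data"
    return "iterate: revisit the hypothesis, design, or targeting"

# Example: a clear winner after a full business cycle with healthy guardrails.
print(decide(prob_to_be_best=0.97, uplift=0.12,
             full_business_cycle=True, guardrails_ok=True))   # -> ship
```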

7. Document the learning so you can reuse it

Document:

  • Hypothesis, winning variant, uplift, interval, notable segments

  • Decision taken, and the next experiment to run

Both the Learning Library and Ideas Backlog help you document hypotheses, results, and next steps for each experiment. For more details, see our documentation on experiment learnings and backlog management.

Airvoyage's example

After running the A/B test for a full week, Amari opens the AB Tasty reporting dashboard to analyze the results of the Social Proof widget experiment on the flight booking page.

The data looks promising.

Metric           | Variant A (Control) | Variant B (Social Proof)
Visitors         | 50,000              | 50,200
Conversion Rate  | 5.0%                | 5.6%
Uplift           |                     | +12%
Absolute Impact  |                     | +6 conversions per 1,000 visitors

In practical terms, this means that for every 1,000 visitors, the Social Proof widget generated 6 additional bookings compared to the original page.

Statistical Summary

Amari reviews the statistical confidence of the results directly within AB Tasty:

  • Chances to win: 97%

  • Confidence interval for uplift: [+3%, +20%]

Guardrail metrics (such as bounce rate and page load time) remained stable throughout the experiment.

“That’s great news. The widget boosted bookings without any negative side effects on performance or engagement.”

Amari’s Decision

Based on the data, Amari confidently decides to ship the winning variation across all flight booking pages. The uplift is significant, the impact clear, and the experience stable.

“With this kind of lift, I can demonstrate tangible ROI for our optimization efforts — and make a strong case for scaling behavioral design tactics.”

Encouraged by the results, Amari also plans a follow-up mobile personalization experiment, since early data suggests the gain was even stronger among mobile users.

“Next step: adapt the Social Proof message for mobile travelers. Quick wins like this can add up fast.”

Common pitfalls to avoid

  • Stopping too early (peeking) before the trend stabilizes

  • Ignoring an SRM or a tracking issue

  • Concluding from an unplanned micro-segment

  • Confusing relative and absolute uplift

  • Ignoring quality or service metrics (guardrails)

Quick glossary

  • Exposure: a user who saw a variant

  • Conversion: completion of the objective (purchase, lead, key click)

  • Primary KPI: main success metric

  • Secondary KPI or guardrail: safety metrics to monitor for side effects

  • Uplift: relative improvement of the primary KPI

  • Interval (confidence or credible): plausible range of the effect

  • p-value or probability to be best: indicators of statistical strength

  • SRM: abnormal traffic split between variants

  • MDE (minimum detectable effect): smallest effect you plan to detect given your volume
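
To make the MDE concrete, the sketch below estimates roughly how many visitors per variant are needed to detect a given relative uplift, using statsmodels. The baseline rate, target uplift, significance level, and power are assumptions for illustration.

```python
# Rough sample-size estimate for a conversion-rate test (two-sided z-test).
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_cr = 0.05            # assumed 5% baseline conversion rate
mde_relative = 0.10           # smallest uplift worth detecting: +10% relative
target_cr = baseline_cr * (1 + mde_relative)

effect_size = proportion_effectsize(target_cr, baseline_cr)
n_per_variant = NormalIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.80, alternative="two-sided"
)
print(f"About {n_per_variant:,.0f} visitors per variant")
```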
