Understand the Practical Use of RevenueIQ
Discover how to accurately measure and optimize revenue in your experiments thanks to our patented feature. For a deeper dive, download our whitepaper.
The most important KPI in e-commerce is revenue. In an optimization context, this means optimizing two axes:
Conversion: “Turning as many visitors as possible into customers.”
Average Order Value (AOV): “Generating as much value as possible per customer.”
However, CRO often remains focused on optimizing conversion. AOV is often neglected in analysis because of its statistical complexity: it is very difficult to estimate correctly with classic tests (t-test, Mann-Whitney), since purchase distributions are highly skewed and have no upper bound. RevenueIQ offers a robust test that directly estimates the distribution of the effect on revenue (via a refined estimation of AOV), providing both a probability of gain (“chance to win”) and consistent confidence intervals. In benchmarks, RevenueIQ maintains a correct false positive rate, has power close to that of Mann-Whitney, and produces confidence intervals about four times narrower than those of the t-test. By combining the effects of AOV and conversion rate (CR), it delivers an impact on revenue per visitor (RPV) and then an actionable revenue projection.

To learn more, read our RevenueIQ whitepaper.
Context & Problem
In CRO, we often optimize CR due to a lack of suitable tools for revenue. Yet, Revenue = Visitors × CR × AOV; ignoring AOV distorts the view.
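To see why ignoring AOV distorts the view, the decomposition above can be sketched in a few lines. All figures below (traffic, rates, order values) are hypothetical and chosen only for illustration.

```python
def revenue(visitors: int, conversion_rate: float, aov: float) -> float:
    """Revenue = Visitors x CR x AOV."""
    return visitors * conversion_rate * aov


# Hypothetical figures, not taken from any real experiment.
visitors = 100_000
baseline = revenue(visitors, conversion_rate=0.030, aov=80.0)
# Variant: CR improves by 10% (relative), but AOV drops by 10%.
variant = revenue(visitors, conversion_rate=0.033, aov=72.0)

print(f"baseline revenue: {baseline:,.0f}")
print(f"variant revenue:  {variant:,.0f}")
print(f"revenue change despite the CR win: {variant - baseline:,.0f}")
```

Judged on CR alone, this variant would look like a winner, even though revenue falls once AOV is taken into account.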
AOV is misleading:
Unbounded (someone can buy many items).
Highly right-skewed (many small orders, a few very large ones).
A few “large and rare” values can dominate the average.
In random A/B splits, these large orders can be unevenly distributed → huge variance in observed AOV.
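The volatility described above can be reproduced with a small simulation. This is a minimal sketch that assumes order values follow a log-normal distribution; the parameters `mu`, `sigma`, the sample sizes, and the function name are illustrative assumptions, not values from the benchmark.

```python
import random
import statistics


def simulate_aa_aov_gaps(n_orders=2000, n_splits=200, mu=4.0, sigma=1.2, seed=42):
    """Split the same orders randomly in two, many times, and record the AOV gap.

    Both halves come from one population, so any gap is pure noise (an AA test).
    """
    rng = random.Random(seed)
    # Assumed log-normal order values: many small orders, a few very large ones.
    orders = [rng.lognormvariate(mu, sigma) for _ in range(n_orders)]
    half = n_orders // 2
    gaps = []
    for _ in range(n_splits):
        rng.shuffle(orders)
        a, b = orders[:half], orders[half:]
        gaps.append(statistics.fmean(b) - statistics.fmean(a))
    return orders, gaps


orders, gaps = simulate_aa_aov_gaps()
print(f"mean order:   {statistics.fmean(orders):.2f}")
print(f"median order: {statistics.median(orders):.2f}")
print(f"largest AOV gap between two random halves: {max(abs(g) for g in gaps):.2f}")
```

The mean sits well above the median (right skew), and the observed AOV gap between two identical "variants" can reach several currency units purely by chance.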
Limitations of Classic Tests
t-test: Assumes normality (or relies on the Central Limit Theorem for the mean). On highly skewed e-commerce data, the CLT variance formula is unreliable at realistic volumes. The result is very low power (it detects ~15% of true winners in the benchmark) and very wide confidence intervals, hence slow and imprecise decisions.
Mann-Whitney (MW): Robust to non-normality (it works on ranks), so much more powerful (~80% detection in the benchmark). However, it only provides a p-value, which indicates a trend but not an effect size: there is no confidence interval, so the business case cannot be quantified.
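The contrast between the two classic tests can be illustrated with a stdlib-only sketch. It makes two simplifications that should be read as assumptions: the t-test interval uses a large-sample normal quantile instead of a Student quantile, and the Mann-Whitney p-value uses the normal approximation without a tie correction. The simulated log-normal data and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def welch_ci(a, b, level=0.95):
    """Large-sample (normal quantile) CI for mean(b) - mean(a)."""
    diff = fmean(b) - fmean(a)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


def mann_whitney_p(a, b):
    """Two-sided Mann-Whitney p-value (normal approximation, assumes no ties)."""
    ranked = sorted([(x, 0) for x in a] + [(x, 1) for x in b])
    rank_sum_b = sum(i + 1 for i, (_, group) in enumerate(ranked) if group == 1)
    n1, n2 = len(a), len(b)
    u = rank_sum_b - n2 * (n2 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical skewed order values with a true effect of +5 on variant B.
rng = random.Random(7)
a = [rng.lognormvariate(4.0, 1.2) for _ in range(1000)]
b = [rng.lognormvariate(4.0, 1.2) + 5.0 for _ in range(1000)]

lo, hi = welch_ci(a, b)
p = mann_whitney_p(a, b)
print(f"t-test style CI on the AOV difference: [{lo:.1f}; {hi:.1f}]")
print(f"Mann-Whitney p-value (no effect size attached): {p:.4f}")
```

The first function returns an interval, which is quantitative but wide on skewed data; the second returns only a p-value, with no estimate of how large the effect is.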
RevenueIQ: Principle
RevenueIQ combines two innovative approaches:
A bootstrap technique, used to study the variability of a measure whose statistical behavior is unknown.
An average of basket differences rather than a difference in average baskets. It compares sorted order differences between variants (A and B), weighted by density (approximately log-normal) to favor “comparable” pairs. This bypasses the problem of the very large differences between observed values in such data.
And it deduces:
The Chance to win (probability that the effect is > 0), readable for decision-makers.
Narrow and reliable confidence intervals on the AOV effect as well as on revenue.
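To make the bootstrap idea concrete, here is a minimal sketch of a generic percentile bootstrap on the difference in mean order value. It is not the RevenueIQ algorithm: it does not implement the sorted-difference comparison or the density weighting described above, and the function name, parameters, and simulated data are assumptions for illustration only.

```python
import random
from statistics import fmean


def bootstrap_effect(a, b, n_boot=1000, level=0.95, seed=0):
    """Generic percentile bootstrap for mean(b) - mean(a).

    Returns (chance_to_win, (ci_low, ci_high)), where chance_to_win is the
    share of resampled differences above zero (a bootstrap proxy).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resampled_a = rng.choices(a, k=len(a))  # sample with replacement
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(fmean(resampled_b) - fmean(resampled_a))
    diffs.sort()
    tail = (1 - level) / 2
    ci_low = diffs[int(tail * n_boot)]
    ci_high = diffs[int((1 - tail) * n_boot) - 1]
    chance_to_win = sum(d > 0 for d in diffs) / n_boot
    return chance_to_win, (ci_low, ci_high)


# Hypothetical skewed order values with a true effect of +5 on variant B.
rng = random.Random(3)
orders_a = [rng.lognormvariate(4.0, 1.2) for _ in range(500)]
orders_b = [rng.lognormvariate(4.0, 1.2) + 5.0 for _ in range(500)]

ctw, (ci_low, ci_high) = bootstrap_effect(orders_a, orders_b)
print(f"chance to win: {ctw:.1%}, 95% CI: [{ci_low:.1f}; {ci_high:.1f}]")
```

On skewed data this plain version typically still yields wide intervals, which is the limitation the comparison of “comparable” pairs is meant to address.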
Benchmarks (AOV)
Alpha validity (on AA tests): good control of false positives. With the typical 95% threshold, the false positive risk is 5%, as expected.
Statistical power measurement: 1,000 AB tests with a known effect of +€5.
MW test: 796/1000 winners, ~80% power.
t-test: 146/1000 winners, only ~15% power.
RevenueIQ: 793/1000 winners, ~80% power (roughly equivalent to MW).
Confidence interval (CI): RevenueIQ produces CIs €8 wide, which is usable when the real effect is €5. With an average CI width of €34, the t-test is far too imprecise to support a decision.
CI coverage: The validity of the confidence intervals was verified. A 95% CI indeed has a 95% chance of containing the true effect value (i.e., €0 for AA tests and €5 for AB tests).
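The benchmark quantities above (detection rate, CI width, coverage) can be computed from a set of simulated tests as sketched below. The helper name and the toy intervals are hypothetical, and counting a test as a "winner" when its CI lower bound is above zero is one possible convention assumed here, not necessarily the one used in the benchmark.

```python
def summarize_benchmark(intervals, true_effect):
    """Summarize simulated tests, each given as a (ci_low, ci_high) pair.

    coverage:   share of intervals that contain the true effect
    detection:  share of intervals entirely above zero (assumed winner rule)
    mean_width: average interval width
    """
    n = len(intervals)
    coverage = sum(lo <= true_effect <= hi for lo, hi in intervals) / n
    detection = sum(lo > 0 for lo, _ in intervals) / n
    mean_width = sum(hi - lo for lo, hi in intervals) / n
    return coverage, detection, mean_width


# Toy intervals for a known effect of +5 (illustrative, not benchmark data).
toy_intervals = [(1.0, 9.0), (-2.0, 6.0), (6.0, 12.0), (0.5, 8.5)]
coverage, detection, mean_width = summarize_benchmark(toy_intervals, true_effect=5.0)
print(f"coverage={coverage:.2f}, detection={detection:.2f}, mean width={mean_width:.1f}")
```

For AA tests, the same helper would be called with `true_effect=0`, in which case the detection share estimates the false positive rate on the winning side.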
From AOV KPI to Revenue
Beyond techniques and formulas, the key point is that RevenueIQ uses a Bayesian method for AOV analysis, which allows this metric to be merged with conversion. Our competitors use frequentist methods, at least for AOV, which makes any such combination of results impossible. Under the hood, RevenueIQ combines conversion and AOV results into a central metric: revenue per visitor (RPV). With precise knowledge of RPV, revenue (in € or another currency) is then projected by multiplying by the targeted traffic for a given period.
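The combination step can be pictured as multiplying draws of CR and AOV, scenario by scenario. The sketch below is an assumption-laden illustration, not RevenueIQ's implementation: CR draws use a Beta posterior with a uniform prior, the AOV draws are Gaussian stand-ins for whatever AOV posterior an actual analysis would produce, and all counts, traffic figures, and names are hypothetical.

```python
import random
from statistics import fmean


def combine_cr_and_aov(conv_a, n_a, conv_b, n_b, aov_a_draws, aov_b_draws, seed=0):
    """Combine CR and AOV draws into RPV uplift draws (B minus A).

    CR draws come from a Beta posterior with a uniform prior (an assumption).
    AOV draws are supplied by the caller, one per simulated scenario.
    """
    rng = random.Random(seed)
    uplifts = []
    for aov_a, aov_b in zip(aov_a_draws, aov_b_draws):
        cr_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        cr_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        uplifts.append(cr_b * aov_b - cr_a * aov_a)  # RPV = CR x AOV
    chance_to_win = sum(u > 0 for u in uplifts) / len(uplifts)
    return chance_to_win, uplifts


# Hypothetical inputs: 10,000 visitors per variant, 300 vs 330 conversions.
rng = random.Random(1)
aov_a = [rng.gauss(80.0, 2.0) for _ in range(5000)]  # stand-in AOV draws
aov_b = [rng.gauss(82.0, 2.0) for _ in range(5000)]
ctw, uplifts = combine_cr_and_aov(300, 10_000, 330, 10_000, aov_a, aov_b)

targeted_visitors = 500_000  # hypothetical traffic for the projection period
projected_gain = fmean(uplifts) * targeted_visitors
print(f"RPV chance to win: {ctw:.1%}")
print(f"projected revenue gain: {projected_gain:,.0f}")
```

Because each scenario carries both a CR draw and an AOV draw, two moderately favorable signals can add up to a stronger one at the RPV level.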
Real Case (excerpt)
Here is a textbook case for RevenueIQ:
The conversion gain has a 92% chance to win (CTW): encouraging, but not “significant” by the standard 95% threshold.
The AOV gain has an 80% CTW. Taken separately, this is likewise not enough to declare a winner.
The combination of these two metrics gives a CTW of 95.9% for revenue, enabling a simple and immediate decision, where a classic approach would have required additional data collection while waiting for one of the two KPIs (CR or AOV) to become significant.
For an advanced business decision, RevenueIQ provides an estimated average gain of +€50k with a confidence interval of [-€6,514; +€107,027], which shows a limited downside risk against a substantial potential gain.
What This Changes for Experimentation
Without RevenueIQ: “inconclusive” results (or endless tests) ⇒ missed opportunities.
With RevenueIQ: faster, quantified decisions (probability, effect, CI), at the revenue level (RPV then projected revenue).
Practical Recommendations
Stop interpreting observed AOV without safeguards: it is highly volatile.
Avoid filtering/Winsorizing “extreme values”: arbitrary thresholds ⇒ bias.
Measure CR & AOV jointly and reason in RPV to reflect business reality.
Use RevenueIQ to obtain chance to win + CI on AOV, RPV, and revenue projection.
Decide via projected revenue (average gain, lower CI bound) rather than isolated p-values.
Conclusion
RevenueIQ brings a robust and quantitative statistical test to monetary metrics (AOV, RPV, revenue), where:
t-test is weak and imprecise on e-commerce data,
Mann-Whitney is powerful but not quantitative.
RevenueIQ enables faster detection, quantification of business impact, and prioritization of deployments with explicit confidence levels.