Guides
Experimentation
Measurement is where most lifecycle programs fool themselves. Running tests without sample-size math. Declaring winners from noise. Confusing last-click revenue with incremental revenue. These guides cover the discipline that separates real learning from confirmation theatre.
A lifecycle team that runs 20 A/B tests a year at a 0.05 significance threshold should expect roughly one false-positive winner from pure noise alone. Most teams don't track how many tests they've run, so the false winners become 'learnings', propagate through the playbook, and quietly underperform. The gap between the claimed lifts and the aggregate program improvement is the tax of undisciplined experimentation.
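That arithmetic is worth making explicit. A minimal sketch, assuming every one of the 20 tests is evaluating a change with no real effect:

```python
# Expected false-positive "winners" from running n null tests at a given
# significance threshold. Numbers match the scenario above.
alpha = 0.05    # per-test significance threshold
n_tests = 20    # tests run in a year

expected_false_winners = n_tests * alpha             # = 1.0
prob_at_least_one = 1 - (1 - alpha) ** n_tests       # ~ 0.64

print(f"Expected false winners: {expected_false_winners:.1f}")
print(f"Chance of at least one: {prob_at_least_one:.0%}")
```

Even with nothing but null tests, the chance of at least one 'winner' is close to two in three.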
The guides in this category cover the full testing stack. Sample size calculation — the 5-minute math that tells you whether a test can detect the effect you're looking for before you run it. The holdout group pattern — randomly suppressing a small population from a program so you can see its real incremental lift, not just its last-click attributed revenue. A/B testing structure — one primary metric, pre-registered, sized for a realistic effect, read at the end, not during.
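To make the sample-size step concrete, here is a minimal sketch using statsmodels; the baseline and target rates are placeholder assumptions, not recommendations:

```python
# Pre-test sample size for a two-arm email test on a conversion rate.
# baseline_rate and target_rate are illustrative assumptions.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_rate = 0.030   # current conversion rate of the email
target_rate = 0.033     # smallest lift worth acting on (+10% relative)

effect = proportion_effectsize(target_rate, baseline_rate)  # Cohen's h
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.80, alternative="two-sided"
)
print(f"Recipients needed per arm: {n_per_arm:,.0f}")
```

For these example rates the answer lands in the tens of thousands of recipients per arm, which is exactly the kind of number worth knowing before the send rather than after.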
Then the measurement stack. Cohort retention analysis — the one chart that tells you if retention is actually improving, stratified by cohort week or signup channel. Attribution models and which one to use for which question (first-touch for acquisition, last-click for transactional, multi-touch for anything in between, holdout for the honest incrementality answer). Send-time optimisation and the gap between vendor-claimed and measured lift. False-positive prevention and how to spot a 'winning' test that will not replicate.
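To make the cohort chart concrete: a minimal sketch with pandas, assuming a signups table and an activity-events table keyed by user_id (the frame and column names are assumptions):

```python
# Weekly cohort retention: one row per signup cohort, one column per
# week since signup, values = share of the cohort active that week.
import pandas as pd

def cohort_retention(signups: pd.DataFrame, events: pd.DataFrame) -> pd.DataFrame:
    # signups: user_id, signup_date; events: user_id, event_date (both datetime)
    df = events.merge(signups, on="user_id")
    df["cohort_week"] = df["signup_date"].dt.to_period("W")
    df["weeks_since_signup"] = (df["event_date"] - df["signup_date"]).dt.days // 7

    active = (
        df.groupby(["cohort_week", "weeks_since_signup"])["user_id"]
        .nunique()
        .unstack(fill_value=0)
    )
    cohort_sizes = (
        signups.assign(cohort_week=signups["signup_date"].dt.to_period("W"))
        .groupby("cohort_week")["user_id"]
        .nunique()
    )
    return active.div(cohort_sizes, axis=0)  # each row is one cohort's curve
```

Swap the cohort key for signup channel and the same shape answers the stratification question.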
Read these before you run the next test. Running an underpowered test isn't neutral — it spends the audience and produces conclusions that range from useless to actively wrong.
Most email A/B tests produce winners that don't reproduce. Three reasons keep showing up: underpowered samples, the novelty effect, and weak readout discipline. This guide is about designing tests that actually drive decisions instead of theatre.
10 min read
Advanced
Email is the fastest place to try a new price, and the easiest place to learn the wrong lesson. What you can test cleanly, what you can't, and the measurement traps that quietly turn price tests into expensive false positives.
9 min read
Intermediate
Without a holdout, lifecycle ROI is attribution-model guesswork with a spreadsheet. With one, you get a defensible number you can actually put in front of finance. Here's how to size, run, and read a holdout — and the three mistakes that quietly invalidate the result. The core readout math is sketched below.
9 min read
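For orientation, the readout from a holdout reduces to a few lines. A minimal sketch, assuming you logged converters and population sizes for both groups; the counts are placeholders and the significance check is one reasonable choice, not the only one:

```python
# Incremental lift read from a randomly suppressed holdout.
# All counts are placeholder assumptions.
from statsmodels.stats.proportion import proportions_ztest

treated_converters, treated_size = 4_200, 95_000   # received the program
holdout_converters, holdout_size = 380, 10_000     # randomly suppressed

treated_rate = treated_converters / treated_size
holdout_rate = holdout_converters / holdout_size
incremental_lift = (treated_rate - holdout_rate) / holdout_rate

_, p_value = proportions_ztest(
    [treated_converters, holdout_converters], [treated_size, holdout_size]
)
print(f"Incremental lift: {incremental_lift:.1%} (p = {p_value:.3f})")
```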
Advanced
Attribution debates are half epistemology, half politics. Last-touch is wrong but defensible. Multi-touch is more accurate but less defensible. Incrementality is the only one that answers the causal question — and it's the slowest. Here's which model to use for which question, and why.
10 min read
Advanced
A cohort retention curve is the single most useful analytical artefact in lifecycle marketing. It isolates real program impact from the compounding noise that every other metric hides, and it's the one view that survives every limitation of the simpler numbers. Here's how to build one and how to read it without kidding yourself.
9 min read
Intermediate
Most email A/B tests are powered only to detect effects far larger than the change being tested could plausibly produce. The result: false positives and false nulls, with confident conclusions in both directions. Sample size calculation fixes this before you send. Here's the 5-minute version.
8 min read
Intermediate
Every ESP markets an STO feature and every vendor deck shows lift. The honest version: STO moves open rate 3–8%, rarely revenue, and only for certain program types. Here's when it's worth turning on.
7 min read
Advanced
Run enough A/B tests and some will show 'significant' lift from pure noise. Programs that ship every significant winner end up with a collection of imaginary improvements they can't tell apart from real ones. Here's how to spot the fakes and avoid the trap.
8 min read
Advanced
Last-click attribution makes lifecycle look bigger than it is. Incrementality testing strips out users who would have converted anyway and surfaces the real number. This is how to design a test that produces a figure you can defend in front of a CFO.
9 min read
Intermediate
A winning A/B test with 4% aggregate lift might be a 20% win in one segment and a 10% loss in another. The aggregate is an average of opposing effects. Segment analysis catches it — and lets you ship the win to the segments that benefit while not shipping the loss to the ones that don't. A per-segment readout is sketched below.
8 min read
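The check is mechanical once results are logged per recipient. A minimal sketch, assuming a results frame with variant, segment, and converted columns (names are assumptions):

```python
# Per-segment lift for a two-variant test; a positive aggregate can hide
# segments with negative relative_lift in this table.
import pandas as pd

def segment_lift(results: pd.DataFrame) -> pd.DataFrame:
    # results: one row per recipient with columns variant ("control" /
    # "treatment"), segment, converted (0/1)
    rates = results.pivot_table(
        index="segment", columns="variant", values="converted", aggfunc="mean"
    )
    rates["relative_lift"] = rates["treatment"] / rates["control"] - 1
    return rates.sort_values("relative_lift")
```

Segment cells are smaller than the aggregate, so each per-segment read still needs its own sample-size sanity check before anything ships or gets pulled.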
Advanced
Every vendor case study shows AI personalisation moving the numbers. Most internal post-mortems show the lift evaporating once a proper holdout is in place. The gap between the two is the measurement methodology. Here's the framework for proving — to yourself, your CFO, and the auditor — whether AI personalisation is actually earning its place.
8 min read