Beyond AB Testing: How I Analyze Real Product Impact Without an Experiment

When people think of product analytics, the first thing that usually comes to mind is AB testing. And sure — AB tests are powerful. But let’s be real: they don’t solve everything. Much of the time we’re working in a world without experiments, and even when we do have test results, leaning on them without deeper validation can be misleading.

In this article, I’ll walk through a real-world-style product analytics project from start to finish. The goal is to show how I approach problems when AB testing isn’t available — using a combination of exploratory data analysis (EDA), regression modeling, propensity score matching, and business scenario modeling.

Case Setup: Focus Mode

Let’s say we’re working at a productivity app — something like Notion or Evernote. The product team recently launched a new Focus Mode that hides distracting UI elements, helping users concentrate better while writing.

They come to us with a question:

“We want to know if Focus Mode helps users stick around — not just in the first week, but over the long term. Can you figure that out?”

Sounds like a retention question. Let’s break it down step-by-step.

Step 1: Clarification

Always start by clarifying the problem. It seems obvious, but trust me — assumptions kill good analysis.

I’d ask:

  • What exactly do you mean by retention? Is it 7-day, 30-day, or something else?
  • How was the feature launched? Was it gradually rolled out? Randomized? Or launched to everyone?

Let’s assume the team responds:

“We’re interested in 30-day retention. And no, we didn’t run an AB test. We launched it to everyone.”

Okay, so this is a classic observational study scenario. No experiment, no randomization — meaning we’ll have to untangle causality the hard way.

Step 2: Exploratory Data Analysis (EDA)

Before diving into modeling, I’d run some quick EDA to understand the data structure and early signals.

Simulated Dataset:

We’d use user-level data with the following columns (simulated in a short code sketch after the list):

  • user_id
  • region
  • device
  • marketing_channel (acquisition source)
  • user_tenure_days
  • focus_mode_enabled (1 if user used focus mode at least once in the last 30 days)
  • retention_30d (1 if user retained after 30 days)
  • session_duration
  • session_frequency
  • time_to_first_use_focus_mode
  • focus_mode_opt_out (1 if user opted out)
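
To make the later steps concrete, here’s a minimal sketch of how a table like this could be simulated with pandas. The column names follow the list above, but every distribution and effect size is a made-up assumption for illustration (and I’m skipping a couple of columns for brevity):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 100_000  # assumed number of users in the simulation

users = pd.DataFrame({
    "user_id": np.arange(n),
    "region": rng.choice(["NA", "EU", "APAC"], size=n),
    "device": rng.choice(["iOS", "Android", "Desktop"], size=n),
    "marketing_channel": rng.choice(["organic", "paid", "referral"], size=n),
    "user_tenure_days": rng.integers(1, 720, size=n),
    "session_duration": rng.gamma(shape=2.0, scale=10.0, size=n),  # minutes per session
    "session_frequency": rng.poisson(lam=5, size=n),               # sessions per week
    "focus_mode_enabled": rng.binomial(1, 0.4, size=n),
})

# Made-up retention process: retention depends on engagement and Focus Mode usage
logit = (-1.2
         + 0.18 * users["focus_mode_enabled"]
         + 0.02 * users["session_frequency"]
         + 0.01 * users["session_duration"])
users["retention_30d"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))
```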

Initial Cuts:

Let’s say:

  • Focus Mode users have a 35% 30-day retention rate
  • Non-users have a 29% rate

That’s a 6-point gap. Promising, but not necessarily causal.

Now segment by device:

  • iOS: 39% retention
  • Desktop: 34%
  • Android: 31%

This tells us iOS users retain better overall — maybe due to UX differences. Important context for our next step.
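
In code, these cuts are just a couple of group-bys on the simulated table from above (the exact numbers from the simulation won’t match the illustrative figures, of course):

```python
# 30-day retention by Focus Mode usage
print(users.groupby("focus_mode_enabled")["retention_30d"].mean().round(3))

# 30-day retention by device
print(users.groupby("device")["retention_30d"].mean().sort_values(ascending=False).round(3))

# Two-way cut: does the Focus Mode gap hold within each device?
print(users.pivot_table(index="device",
                        columns="focus_mode_enabled",
                        values="retention_30d",
                        aggfunc="mean").round(3))
```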

Step 3: Logistic Regression

To estimate the impact of Focus Mode on retention while controlling for other observable differences between users, I’d build a logistic regression model (a code sketch follows the setup).

Setup:

  • Target: retention_30d (binary)
  • Features: focus_mode_enabled, device, region, marketing_channel, user_tenure_days, session_duration, session_frequency
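
Here’s a minimal sketch of fitting that model with statsmodels on the simulated frame from Step 2; the formula interface takes care of encoding the categorical features:

```python
import numpy as np
import statsmodels.formula.api as smf

model = smf.logit(
    "retention_30d ~ focus_mode_enabled + C(device) + C(region)"
    " + C(marketing_channel) + user_tenure_days + session_duration + session_frequency",
    data=users,
).fit()

print(model.summary())

# Coefficients are on the log-odds scale; exponentiate to read them as odds ratios
print(np.exp(model.params).round(3))
```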

Example Outputs:

Let’s say:

  • focus_mode_enabled has an odds ratio of 1.2 → users who use Focus Mode have roughly 20% higher odds of retaining, holding the other features constant
  • device = Android has an odds ratio of 0.85 → Android users have roughly 15% lower odds of retaining

Multicollinearity:

Session frequency and session duration may be highly correlated (engaged users tend to score high on both). I’d check the VIF (Variance Inflation Factor) for each feature, as sketched below:

  • VIF = 1 / (1 – R²), where R² comes from regressing that feature on the other features
  • VIF > 5 → high multicollinearity
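
In practice I’d lean on statsmodels’ built-in VIF helper rather than computing it by hand; the feature list here is just the numeric columns from the model above:

```python
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

numeric_features = ["user_tenure_days", "session_duration", "session_frequency"]
X = sm.add_constant(users[numeric_features])

for i, col in enumerate(X.columns):
    if col != "const":
        print(col, round(variance_inflation_factor(X.values, i), 2))
```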

If VIF is high, I might:

  1. Ignore it (reasonable when prediction accuracy matters more than coefficient interpretation, since multicollinearity inflates standard errors but leaves predictions largely intact)
  2. Combine metrics into an engagement index
  3. Drop the less relevant variable

Step 4: Propensity Score Matching (PSM)

Regression is good — but let’s validate it.

Focus Mode users may be fundamentally different from non-users. They could be more tech-savvy, engaged, or motivated. That’s where PSM helps.

Process:

  1. Build a logistic regression to predict probability of using Focus Mode, using covariates like region, device, tenure, session behavior, etc.
  2. Calculate a propensity score for each user.
  3. Do 1:1 nearest-neighbor matching between Focus Mode users and similar non-users.
  4. Calculate the ATT (Average Treatment Effect on the Treated) — the difference in 30-day retention after matching (see the code sketch below).
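
Here’s one way steps 1–4 could be sketched with scikit-learn, again on the simulated frame. It’s a bare-bones 1:1 nearest-neighbor match with replacement, without the caliper or other refinements a production analysis would add:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# 1. Propensity model: probability of using Focus Mode given observed covariates
covariates = pd.get_dummies(
    users[["region", "device", "marketing_channel",
           "user_tenure_days", "session_duration", "session_frequency"]],
    drop_first=True,
)
ps_model = LogisticRegression(max_iter=1000).fit(covariates, users["focus_mode_enabled"])

# 2. Propensity score for every user
users["pscore"] = ps_model.predict_proba(covariates)[:, 1]

treated = users[users["focus_mode_enabled"] == 1]
control = users[users["focus_mode_enabled"] == 0]

# 3. For each treated user, find the closest non-user on the propensity score
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_control = control.iloc[idx.ravel()]

# 4. ATT: retention gap between treated users and their matched controls
att = treated["retention_30d"].mean() - matched_control["retention_30d"].mean()
print(f"ATT (30-day retention lift): {att:.3f}")
```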

Let’s say:

  • After matching, Focus Mode users still show a 4.0 percentage-point lift in 30-day retention

Balance Check:

  • Use Standardized Mean Difference (SMD)
  • SMD < 0.1 → good balance between treatment and control groups (a quick computation is sketched below)
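
A quick way to compute the SMD per covariate, continuing from the matching sketch above (treated and matched_control are the matched samples):

```python
import numpy as np

def smd(treated_col, control_col):
    """Standardized mean difference between two samples of one covariate."""
    pooled_sd = np.sqrt((treated_col.var() + control_col.var()) / 2)
    return (treated_col.mean() - control_col.mean()) / pooled_sd

for col in ["user_tenure_days", "session_duration", "session_frequency"]:
    print(col, round(abs(smd(treated[col], matched_control[col])), 3))
```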

Step 5: Scenario Modeling

Now let’s translate that 4.0 percentage-point lift into business terms.

Assume:

  • 1 million users per month
  • $10 ARPU (average revenue per user)

Then:

  • +4 percentage points of retention = 40,000 more retained users
  • 40,000 × $10 = $400K incremental revenue per month
  • Annually: $4.8M incremental revenue

Even if we apply a 50% decay and take the bounds of a confidence interval on the lift (say 3.5–5.5 percentage points), we’re still looking at roughly $2.1M–$3.3M per year. Not bad.
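
The same arithmetic as a tiny, tweakable model; the decay factor and confidence bounds are the assumptions worth stress-testing with stakeholders:

```python
monthly_users = 1_000_000
arpu = 10                # dollars per retained user per month
lift = 0.04              # 4.0 percentage-point retention lift from matching
decay = 0.5              # haircut for effect decay / estimation optimism
ci = (0.035, 0.055)      # assumed confidence interval on the lift

base_annual = monthly_users * lift * arpu * 12
print(f"Base case: ${base_annual:,.0f} per year")                         # $4.8M

low, high = (monthly_users * l * decay * arpu * 12 for l in ci)
print(f"With 50% decay applied to the CI: ${low:,.0f} to ${high:,.0f}")   # ~$2.1M-$3.3M
```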

Step 6: Recommendations

Based on everything above, here’s what I’d suggest:

Short-term:

  • Run retargeting campaigns to non-users, especially the iOS segment, where baseline retention is already highest

Long-term:

  • Personalize the onboarding for Focus Mode based on user type
  • Monitor guardrail metrics: churn, bounce rate, opt-out rate
  • If issues arise, propose AB tests to refine or roll back the feature

Final Thoughts

This is a full walk-through of how I would approach a product analytics problem when there’s no AB test available.

AB testing is great — but real-world analytics often needs more. When experimentation is off the table, we can still uncover insights using observational methods, careful validation, and business modeling.