Cohort Churn Anomaly Investigation
Systematic investigation framework for identifying why a specific cohort has elevated churn, with SQL queries, statistical validation, root cause decision tree, and week-by-week action plan.
Input
What you provide to the skill
October 2024 cohort has 12% churn versus normal 5%. Same size cohorts. Need to identify what went wrong with that specific month.
Output
What the skill generates for you
Cohort Churn Anomaly Investigation
Investigation Framework
October 2024 Cohort: 12% churn
Normal Baseline: 5% churn
Difference: +7 percentage points (2.4x the baseline rate)
Cohort sizes: Equal (like-for-like comparison)
When one cohort churns significantly more, investigate in this order:
1. Acquisition Source Changes
Hypothesis: Different traffic sources attracted users with different intent
Check:
- Marketing channel mix by cohort
- Campaign changes in October
- Traffic source breakdown
- Any new ad campaigns or partnerships launched
SQL Query:
-- Churn rate by signup cohort and acquisition source (Sep-Nov 2024 cohorts)
SELECT
    DATE_TRUNC('month', signup_date) AS cohort,
    acquisition_source,
    COUNT(*) AS users,
    SUM(CASE WHEN cancel_date IS NOT NULL THEN 1 ELSE 0 END) AS churned,
    ROUND(100.0 * SUM(CASE WHEN cancel_date IS NOT NULL THEN 1 ELSE 0 END) / COUNT(*), 1) AS churn_rate
FROM subscriptions
WHERE signup_date >= '2024-09-01' AND signup_date < '2024-12-01'
GROUP BY 1, 2
ORDER BY 1, 2;
What to look for:
- Did October have a higher share from a specific channel (e.g., paid ads, affiliate)?
- Did any single source show a >15% churn rate?
- Did a new marketing campaign launch in October?
Common finding: New ad campaign or channel brought low-intent users (40% of cases)
2. Product Changes
Hypothesis: A release affected the new-user experience
Check:
- Releases deployed in late September or October
- Onboarding flow changes
- Feature flags affecting new users
- A/B tests running for new signups
- Pricing page or sign-up flow changes
Comparison Table:
| Metric | Baseline Cohorts | Oct 2024 Cohort | Difference |
|---|---|---|---|
| Onboarding completion % | ? | ? | ? |
| Core feature adoption (D7) | ? | ? | ? |
| Time to first value | ? | ? | ? |
| Account setup completion | ? | ? | ? |
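To start filling in the table, here is a sketch for the onboarding-completion row. It assumes subscriptions carries a user_id key and a hypothetical events table with user_id and event_name columns; the event name 'onboarding_completed' is a placeholder - rename everything to match your instrumentation:
-- Onboarding completion rate by signup cohort (hypothetical events table)
SELECT
    DATE_TRUNC('month', s.signup_date) AS cohort,
    COUNT(DISTINCT s.user_id) AS users,
    ROUND(100.0 * COUNT(DISTINCT o.user_id) / COUNT(DISTINCT s.user_id), 1) AS onboarding_completion_pct
FROM subscriptions s
LEFT JOIN (
    SELECT DISTINCT user_id
    FROM events
    WHERE event_name = 'onboarding_completed'
) o ON o.user_id = s.user_id
WHERE s.signup_date >= '2024-08-01' AND s.signup_date < '2024-11-01'
GROUP BY 1
ORDER BY 1;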
Common finding: Onboarding redesign or feature change broke first-time experience (30% of cases)
3. Pricing/Offer Changes
Hypothesis: Different trial terms or promotional pricing attracted a different mix of users
Check:
- Trial length by cohort (any October changes?)
- Discount codes used in October
- Plan tier distribution
- Any October promotions or sales
- Price changes implemented
SQL Query:
-- Churn rate by cohort, plan tier, trial length, and promo code (Aug-Nov 2024)
SELECT
    DATE_TRUNC('month', signup_date) AS cohort,
    plan_tier,
    trial_length_days,
    COALESCE(promo_code, 'none') AS promo_code,
    COUNT(*) AS users,
    SUM(CASE WHEN cancel_date IS NOT NULL THEN 1 ELSE 0 END) AS churned,
    ROUND(100.0 * SUM(CASE WHEN cancel_date IS NOT NULL THEN 1 ELSE 0 END) / COUNT(*), 1) AS churn_rate
FROM subscriptions
WHERE signup_date >= '2024-08-01' AND signup_date < '2024-12-01'
GROUP BY 1, 2, 3, 4
ORDER BY 1, 2, 3, 4;
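If a promo code stands out, a follow-up sketch (assuming Postgres-style date arithmetic on date columns) checks whether October cancellations cluster just after the discounted period ends:
-- Average days from signup to cancellation by promo code, October signups only
SELECT
    COALESCE(promo_code, 'none') AS promo_code,
    COUNT(*) AS churned_users,
    ROUND(AVG(cancel_date - signup_date), 1) AS avg_days_to_cancel
FROM subscriptions
WHERE signup_date >= '2024-10-01' AND signup_date < '2024-11-01'
  AND cancel_date IS NOT NULL
GROUP BY 1
ORDER BY 2 DESC;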
Common finding: Aggressive discount attracted price-sensitive users who churn after discount ends (10% of cases)
4. Seasonality & External Events
Hypothesis: Time-based or market factors
Check:
- Q4 budget freezes (if B2B product)
- Beginning of holiday season behavior
- Competitor launches or announcements
- Industry news in October
Common finding: Q4 budget freezes or holiday distractions (20% of cases)
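The seasonality hypothesis is easy to test if the subscriptions table covers prior years - compare every fall cohort year over year and see whether October spikes each time:
-- Churn rate for Sep/Oct/Nov cohorts across all years in the table
SELECT
    DATE_TRUNC('month', signup_date) AS cohort,
    COUNT(*) AS users,
    ROUND(100.0 * SUM(CASE WHEN cancel_date IS NOT NULL THEN 1 ELSE 0 END) / COUNT(*), 1) AS churn_rate
FROM subscriptions
WHERE EXTRACT(MONTH FROM signup_date) IN (9, 10, 11)
GROUP BY 1
ORDER BY 1;
If October churn is elevated every year, treat it as seasonal variance rather than a 2024-specific failure.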
5. Statistical Validation
Use a chi-square test to confirm the difference is real, not noise:
from scipy.stats import chi2_contingency

# Build the 2x2 contingency table of [retained, churned] counts per cohort.
# Rates are converted to counts here because cohort sizes are equal; with
# unequal cohorts, use the raw counts directly.
cohort_size = 100  # Replace with actual cohort size
baseline = [cohort_size * 0.95, cohort_size * 0.05]  # [retained, churned]
october = [cohort_size * 0.88, cohort_size * 0.12]
observed = [baseline, october]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"Chi-square statistic: {chi2:.4f}")
print(f"P-value: {p_value:.4f}")

# Note: significance depends on cohort size. At ~100 users per cohort this
# 7-point gap may not clear p < 0.05; at several hundred it typically does.
if p_value < 0.05:
    print("\n✓ Difference is statistically significant (p < 0.05)")
    print("  This is NOT random noise - there's a real issue to investigate")
else:
    print("\n⚠ Difference may be random noise (p >= 0.05)")
    print("  Monitor November cohort before making major changes")
Root Cause Framework
Once data is collected, use this decision tree:
Did acquisition mix change significantly?
├─ YES -> Likely cause: Low-intent traffic from new source
│ Action: Adjust targeting or pause channel
└─ NO -> Continue
Did product experience change for new users?
├─ YES -> Likely cause: Broken onboarding or feature change
│ Action: Rollback or fix the experience
└─ NO -> Continue
Did pricing/offers differ?
├─ YES -> Likely cause: Price-sensitive users churning post-promo
│ Action: Reconsider discount strategy
└─ NO -> Continue
Does October historically show elevated churn?
├─ YES -> Likely cause: Seasonal pattern
│ Action: Accept variance, prepare for next October
└─ NO -> Continue
Are engagement metrics significantly lower in Oct cohort?
├─ YES -> Likely cause: User quality or experience issue
│ Action: Deep dive on specific engagement gaps
└─ NO -> May be statistical noise, monitor next cohort
Action Plan for October 2024 Investigation
Week 1 (Immediate):
1. Run acquisition source SQL query -> Identify if traffic mix changed
2. Review October release notes -> Find product changes affecting new users
3. Check pricing/offer log -> Spot any promotional differences
4. Run behavioral deep dive query (sketched below) -> Quantify engagement differences
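A sketch of the behavioral deep dive query, assuming the same hypothetical events table as above (user_id, event_name, event_date); swap in whatever activity table you actually log to:
-- First-week engagement depth by signup cohort
SELECT
    DATE_TRUNC('month', s.signup_date) AS cohort,
    COUNT(DISTINCT s.user_id) AS users,
    ROUND(1.0 * COUNT(e.user_id) / COUNT(DISTINCT s.user_id), 1) AS events_per_user_d7
FROM subscriptions s
LEFT JOIN events e
    ON e.user_id = s.user_id
    AND e.event_date >= s.signup_date
    AND e.event_date < s.signup_date + INTERVAL '7 days'
WHERE s.signup_date >= '2024-08-01' AND s.signup_date < '2024-11-01'
GROUP BY 1
ORDER BY 1;
A markedly lower events_per_user_d7 for October points to an engagement or user-quality gap worth drilling into.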
Week 2 (Analysis):
5. Fill in comparison table with query results
6. Identify the metric with biggest difference between Oct and baseline
7. Form specific hypothesis: “October cohort churned more because [X]”
8. Validate hypothesis by checking correlation within October cohort
Week 3 (Action):
9. Implement fix based on root cause:
- If acquisition: Pause or retarget low-quality channels
- If product: Rollback problematic change or add user guidance
- If pricing: Adjust promotional strategy for price-sensitive users
- If seasonal: Accept as expected variance, monitor Nov/Dec
Week 4 (Measure):
10. Monitor November cohort for improvement (using the tenure-matched query below)
11. Track metrics that showed biggest Oct gap
12. Document learnings for future cohort analysis
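One caveat for Week 4: the November cohort is younger than the baseline cohorts, so compare churn at equal tenure rather than churn to date. A sketch using a fixed 30-day window (Postgres-style interval syntax; adjust to your dialect):
-- Churn within 30 days of signup, so every cohort is measured at the same tenure
SELECT
    DATE_TRUNC('month', signup_date) AS cohort,
    COUNT(*) AS users,
    ROUND(100.0 * SUM(CASE WHEN cancel_date IS NOT NULL
                            AND cancel_date < signup_date + INTERVAL '30 days'
                       THEN 1 ELSE 0 END) / COUNT(*), 1) AS churn_rate_30d
FROM subscriptions
WHERE signup_date >= '2024-09-01'
GROUP BY 1
ORDER BY 1;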
Investigation Checklist
- Acquisition source query run -> Any channel had disproportionate representation?
- Release timeline reviewed -> Any changes deployed 9/15 - 10/31?
- Pricing/promotions checked -> Different offers in October?
- A/B tests reviewed -> Any experiments affecting new users?
- Onboarding metrics compared -> Lower completion in October?
- Support ticket patterns -> More tickets from October cohort?
- Competitor activity researched -> New launches or campaigns?
- Seasonal patterns checked -> October historically different?
- Statistical test run -> Difference confirmed as significant?
Probability Analysis
Based on industry patterns, most likely culprits:
- 40% probability: Acquisition source change bringing lower-intent users
- 30% probability: Product/onboarding change that degraded new user experience
- 20% probability: Seasonal/external factor (Q4 timing, competitor launch)
- 10% probability: Pricing/promotional difference attracting wrong users
Start with acquisition source analysis - it’s the fastest to validate and the most common root cause.
About This Skill
Transform reactive churn panic into systematic analysis with exit surveys, pre-churn behavior patterns, and intervention playbooks.