All examples for Correlation Matrix Explorer

Marketing Metrics with Derived Variables

Demonstrates proper handling of derived metrics (CTR = clicks/impressions) in correlation analysis. Separates mathematical artifacts from empirical findings, warns against misreporting formula-guaranteed relationships, and provides modeling recommendations for campaign data.

Input

What you provide to the skill

Create marketing data (150 campaigns: impressions, clicks, conversions, CTR=clicks/impressions). Flag derived metric correlations
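A minimal sketch of how such a dataset could be simulated with the standard library. The function name `make_campaigns`, the seed, and every range below are hypothetical choices for illustration, not the values the skill used:

```python
import random

def make_campaigns(n=150, seed=42):
    """Generate n synthetic campaigns. All ranges are illustrative assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        impressions = rng.randint(5_000, 100_000)
        click_rate = rng.uniform(0.01, 0.05)   # assumed per-campaign click propensity
        clicks = round(impressions * click_rate)
        conv_rate = rng.uniform(0.02, 0.10)    # assumed share of clicks that convert
        conversions = round(clicks * conv_rate)
        rows.append({
            "impressions": impressions,
            "clicks": clicks,
            "conversions": conversions,
            "ctr": clicks / impressions,       # derived metric, not an independent measurement
        })
    return rows

campaigns = make_campaigns()
```

Because ctr is computed from two other columns, it should be recorded as derived at creation time so later steps can flag it.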

Output

What the skill generates for you

CORRELATION MATRIX ANALYSIS

Dataset Overview:

  • Observations: 150
  • Variables: 4
  • Variable names: impressions, clicks, conversions, ctr
  • Missing values: 0

Potential outliers detected:
conversions 2
dtype: int64
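The report does not say which rule produced the outlier count. One common choice is the 1.5 × IQR fence, sketched here as an assumption; `count_iqr_outliers` and the sample values are hypothetical:

```python
import statistics

def count_iqr_outliers(values, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < low or v > high)

# Made-up conversions column with two unusually large campaigns.
sample_conversions = [12, 15, 14, 13, 16, 15, 14, 13, 90, 120]
n_outliers = count_iqr_outliers(sample_conversions)
```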

============================================================
⚠️ DERIVED METRIC DETECTED

ctr = clicks / impressions

Correlations involving CTR and its components (clicks, impressions)
are MATHEMATICAL ARTIFACTS, not empirical findings.

✓ Heatmap saved: correlation_heatmap.png

✓ Results saved: correlation_results.csv

============================================================
SEPARATING MATHEMATICAL VS EMPIRICAL CORRELATIONS

🚨 MATHEMATICAL ARTIFACTS (do not report as insights):
These are guaranteed by the formula: ctr = clicks / impressions

  • clicks ↔ ctr: r = 0.644 [clicks in numerator] ***
  • impressions ↔ ctr: r = 0.110 [impressions in denominator]
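To see why a ratio correlates with its own components, the following sketch draws clicks and impressions independently and still finds clear correlations with the derived ctr. The distributions, sample size, and seed are assumptions chosen for illustration and are unrelated to the 150-campaign dataset above:

```python
import math
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)
clicks = [rng.uniform(50, 500) for _ in range(2000)]
impressions = [rng.uniform(5_000, 50_000) for _ in range(2000)]  # independent of clicks
ctr = [c / i for c, i in zip(clicks, impressions)]

r_clicks_imp = pearson_r(clicks, impressions)  # near zero: the inputs are unrelated
r_clicks_ctr = pearson_r(clicks, ctr)          # positive: clicks is the numerator
r_imp_ctr = pearson_r(impressions, ctr)        # negative: impressions is the denominator
```

In real campaign data the components are themselves correlated, so the size and even the sign of these artifact correlations can differ from this idealized case.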

✓ EMPIRICAL FINDINGS (real relationships):
These reflect actual behavioral patterns in the data

  1. clicks ↔ conversions
    r = 0.849 (p = 0.0000) *** STRONG POSITIVE

  2. impressions ↔ clicks
    r = 0.752 (p = 0.0000) *** STRONG POSITIVE

  3. impressions ↔ conversions
    r = 0.610 (p = 0.0000) *** MODERATE POSITIVE

  4. conversions ↔ ctr
    r = 0.569 (p = 0.0000) *** MODERATE POSITIVE
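The split between artifact and empirical pairs can be sketched as a lookup against the declared formula. The helper `split_pairs` and the `derived` mapping are hypothetical names, not the skill's actual code:

```python
from itertools import combinations

def split_pairs(variables, derived):
    """Separate variable pairs into formula artifacts and empirical pairs.

    `derived` maps each derived column to the set of columns in its formula.
    """
    artifacts, empirical = [], []
    for a, b in combinations(variables, 2):
        is_artifact = b in derived.get(a, set()) or a in derived.get(b, set())
        (artifacts if is_artifact else empirical).append((a, b))
    return artifacts, empirical

variables = ["impressions", "clicks", "conversions", "ctr"]
derived = {"ctr": {"clicks", "impressions"}}
artifacts, empirical = split_pairs(variables, derived)
```

With these inputs, two pairs land in `artifacts` and four in `empirical`, matching the counts in the report.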

============================================================
MULTICOLLINEARITY ALERTS (|r| > 0.8):

🚨 clicks ↔ conversions: r = 0.849
→ These variables are highly redundant
→ Do NOT include both in predictive models
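The alert rule is a simple threshold on |r|. A minimal sketch using the empirical correlations reported above; the function name `flag_collinear` is an assumption:

```python
# Empirical correlations copied from the report above.
reported_r = {
    ("clicks", "conversions"): 0.849,
    ("impressions", "clicks"): 0.752,
    ("impressions", "conversions"): 0.610,
    ("conversions", "ctr"): 0.569,
}

def flag_collinear(corrs, threshold=0.8):
    """Return the pairs whose absolute correlation exceeds the threshold."""
    return [(pair, r) for pair, r in corrs.items() if abs(r) > threshold]

alerts = flag_collinear(reported_r)
```

Only clicks ↔ conversions crosses the 0.8 threshold. impressions ↔ clicks at 0.752 falls just under it.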

============================================================
KEY INSIGHTS

  1. STRONG EMPIRICAL RELATIONSHIPS:
    • clicks ↔ conversions (r = 0.849)
    → Campaigns that generate more clicks tend to record more conversions
    → This is the fundamental marketing funnel relationship

    • impressions ↔ clicks (r = 0.752)
    → Higher reach is associated with more engagement
    → Consistent with the importance of broad campaign reach

  2. MATHEMATICAL ARTIFACTS TO IGNORE:
    • ctr ↔ clicks (r = 0.644)
    → DO NOT REPORT as ‘higher clicks correlate with higher CTR’
    → This is mathematically guaranteed by the formula ctr = clicks/impressions

    • ctr ↔ impressions (r = 0.110)
    → Weak positive relationship is also an artifact of the ratio formula

  3. MULTICOLLINEARITY WARNING:
    • clicks and conversions are highly correlated (r = 0.849)
    → For predictive modeling, include ONLY ONE of these variables
    → Including both causes redundancy and unstable coefficient estimates
    → Recommendation: Keep clicks (it’s the earlier funnel stage)

  4. CTR AS OUTCOME METRIC:
    • conversions ↔ ctr (r = 0.569) is empirical and meaningful
    → Higher conversion campaigns tend to have better CTR
    → Both reflect campaign quality/relevance
    → This relationship is NOT forced by formula

============================================================
RECOMMENDATIONS

✓ FOR EXPLORATORY ANALYSIS:

  • Focus on empirical relationships: impressions → clicks → conversions
  • Treat CTR as dependent outcome, not independent driver
  • Investigate what campaign characteristics drive both CTR and conversions

✓ FOR PREDICTIVE MODELING:

  • Include: impressions
  • Choose ONE of: clicks OR conversions (not both)
  • Exclude: ctr (it’s derived from included variables)
  • Add campaign features if available (ad copy, targeting, creative, etc.)

✓ WHAT NOT TO REPORT:
✗ ‘Higher CTR correlates with more clicks’ (mathematical artifact)
✗ ‘CTR drives conversions’ (correlation ≠ causation)
✗ Include both clicks and conversions in same model (multicollinearity)

============================================================
FILES GENERATED

  • marketing_campaigns.csv (150 campaigns)
  • correlation_heatmap.png (visualization)
  • correlation_results.csv (detailed results)

============================================================
⚠️ CRITICAL REMINDER

Correlation ≠ Causation

These correlations show associations but cannot establish
causal relationships. To make causal claims:
• Run A/B tests (vary impressions, measure clicks)
• Use time-series analysis (track campaigns over time)
• Control for confounders (campaign type, audience, timing)

Without experimental or quasi-experimental design,
these are descriptive patterns only.