
ANOVA with Multiple Assumption Violations

Decision guidance for cases where standard ANOVA assumptions are violated. Demonstrates an outlier investigation protocol, compares alternative methods (Kruskal-Wallis vs Welch's ANOVA), and provides implementation code for data with skewness, outliers, and heterogeneous variances.

Input

What you provide to the skill

My dataset has 25 students with reaction time measurements that look skewed. Detected 2 extreme outliers (likely errors). Three groups with very different variances. Need to decide: standard ANOVA or alternatives?

Output

What the skill generates for you

STATISTICAL ASSUMPTION CHECKER & DIAGNOSTIC GUIDE

ANALYSIS CONTEXT:
Statistical test: One-way ANOVA
Dependent variable: reaction_time
Groups: 3 levels (n = 8-9 per group)
Total sample: N = 25

STATED CONCERNS:
• “Skewed” distributions
• 2 extreme outliers (likely errors)
• Very different variances across groups

================================================================================
ASSUMPTION VIOLATIONS IDENTIFIED

  1. NORMALITY: ✗ VIOLATED
    Evidence: Skewed distributions, likely driven in part by the outliers
    With n=8-9: One outlier = 11-13% of group data (1/9 ≈ 11.1%, 1/8 = 12.5%)

  2. HOMOGENEITY: ✗ VIOLATED
    Evidence: “Very different variances”
    Likely cause: Outliers can inflate a small group's variance enormously

  3. INDEPENDENCE: ✓ LIKELY MET
    Assuming between-subjects design

  4. OUTLIERS: ✗ PRESENT
    2 extreme outliers detected
    “Likely errors” - requires investigation!
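
The violations above are taken from the user's description. A minimal sketch of how they could be checked formally, assuming SciPy is available; the three groups are hypothetical reaction times in milliseconds, invented for illustration and not the user's data:

```python
from scipy import stats

# Hypothetical reaction times in ms (illustrative only, not the user's data).
groups = {
    "A": [312, 298, 305, 321, 290, 334, 301, 980],
    "B": [345, 352, 338, 360, 341, 355, 349, 347],
    "C": [401, 388, 395, 1450, 410, 379, 392, 405, 399],
}

# Shapiro-Wilk per group: a small p-value suggests a departure from normality.
shapiro_p = {name: stats.shapiro(values).pvalue for name, values in groups.items()}

# Levene's test centred on the median (the Brown-Forsythe variant),
# which is less sensitive to skew than centring on the mean.
levene_stat, levene_p = stats.levene(*groups.values(), center="median")

for name, p in shapiro_p.items():
    print(f"Shapiro-Wilk, group {name}: p = {p:.4f}")
print(f"Levene (median-centred): W = {levene_stat:.3f}, p = {levene_p:.4f}")
```

With n=8-9 per group these tests have low power, so a non-significant result is weak evidence that an assumption holds; plots (Q-Q, boxplots) are a useful complement.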

================================================================================
CRITICAL: OUTLIER INVESTIGATION

You said “likely errors” - THIS IS THE KEY DECISION!

INVESTIGATION STEPS:

  1. Check data entry (typos, decimal errors)
  2. Check measurement validity (equipment, conditions)
  3. Check plausibility (is value physiologically possible?)
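
Before working through these steps, it helps to pin down which values count as extreme. One common convention (an assumption here, not something the guide prescribes) is Tukey's fences with a multiplier of 3 for "extreme" outliers. The helper name `flag_extreme` and the data are hypothetical:

```python
import numpy as np

def flag_extreme(values, k=3.0):
    """Return values outside Tukey's fences (Q1 - k*IQR, Q3 + k*IQR)."""
    arr = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(arr, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return arr[(arr < lower) | (arr > upper)].tolist()

# Hypothetical reaction times in ms for one group (illustrative only).
group_a = [312, 298, 305, 321, 290, 334, 301, 980]

flagged = flag_extreme(group_a)
print(flagged)  # [980.0]
```

Flagging only marks a value for investigation; whether it is corrected, removed, or kept still follows the decision rules.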

DECISION RULES:

SCENARIO A: Confirmed data entry error
→ CORRECT if true value known
→ REMOVE if uncorrectable
→ RERUN ALL ASSUMPTION CHECKS

SCENARIO B: Confirmed measurement error
→ REMOVE the observation
→ Document reason
→ RERUN ASSUMPTION CHECKS

SCENARIO C: Legitimate extreme value
→ KEEP the observation
→ Use ROBUST methods

SCENARIO D: Uncertain
→ KEEP (err on side of inclusion)
→ Use robust methods
→ SENSITIVITY ANALYSIS

CRITICAL ETHICAL POINT:
✗ NEVER remove data just to get significance
✓ ONLY remove with legitimate methodological reason
✓ ALWAYS document removals
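
For Scenario D, the sensitivity analysis can be sketched as running the same test with and without the suspect observations and checking whether the conclusion changes. This is an illustrative sketch with hypothetical data and a hypothetical `suspect` set, assuming SciPy:

```python
from scipy import stats

# Hypothetical reaction times in ms (illustrative only, not the user's data).
group_a = [312, 298, 305, 321, 290, 334, 301, 980]
group_b = [345, 352, 338, 360, 341, 355, 349, 347]
group_c = [401, 388, 395, 1450, 410, 379, 392, 405, 399]

# Values flagged during the investigation (hypothetical).
suspect = {980, 1450}

def drop(values, flagged):
    """Return the values with flagged observations removed."""
    return [v for v in values if v not in flagged]

# Same test on the full data and on the data without the suspect values.
full = stats.kruskal(group_a, group_b, group_c)
trimmed = stats.kruskal(
    drop(group_a, suspect), drop(group_b, suspect), drop(group_c, suspect)
)

alpha = 0.05
same_conclusion = (full.pvalue < alpha) == (trimmed.pvalue < alpha)

print(f"With suspect values:    H = {full.statistic:.2f}, p = {full.pvalue:.4f}")
print(f"Without suspect values: H = {trimmed.statistic:.2f}, p = {trimmed.pvalue:.4f}")
print("Conclusion unchanged" if same_conclusion else "Conclusion depends on the outliers")
```

Report both runs regardless of which one is treated as primary, so readers can see how much the outliers matter.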

================================================================================
ANSWER TO YOUR QUESTION

“Need to decide: standard ANOVA or alternatives?”

→ USE ALTERNATIVES, NOT STANDARD ANOVA

WHY:
✗ Normality violated (skewed)
✗ Homogeneity violated (different variances)
✗ Outliers present
✗ Small sample (n=8-9 → CLT not protective)

================================================================================
RECOMMENDED ALTERNATIVES

❌ DO NOT USE: Standard One-Way ANOVA
Results would be unreliable

✓ PRIMARY: KRUSKAL-WALLIS TEST (RECOMMENDED)

Why best for your data:
• NO normality assumption ✓
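• NO equal-variance assumption for the overall test ✓ (reading it as a comparison of medians does assume similarly shaped distributions)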
• Robust to outliers ✓
• Works with small samples ✓

How it works:
• Ranks all observations (1, 2, 3… 25)
• Tests if mean ranks differ across groups
• Outliers become “just another rank”
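
The ranking idea above can be sketched by computing H by hand from the pooled ranks and comparing it with SciPy's result. The data are hypothetical; the hand formula below omits the tie correction, so it matches `scipy.stats.kruskal` only when there are no tied values (as in this example):

```python
import numpy as np
from scipy import stats

# Hypothetical reaction times in ms (illustrative only, no tied values).
groups = [
    [312, 298, 305, 321, 290, 334, 301, 980],
    [345, 352, 338, 360, 341, 355, 349, 347],
    [401, 388, 395, 1450, 410, 379, 392, 405, 399],
]

# Rank all observations together: 1 = smallest, N = largest.
pooled = np.concatenate(groups)
ranks = stats.rankdata(pooled)
N = len(pooled)

# Split the pooled ranks back into their groups.
sizes = [len(g) for g in groups]
rank_groups = np.split(ranks, np.cumsum(sizes)[:-1])

# H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), with R_i the rank sum of group i.
H_manual = 12 / (N * (N + 1)) * sum(r.sum() ** 2 / len(r) for r in rank_groups) - 3 * (N + 1)

H_scipy, p = stats.kruskal(*groups)

print(f"Rank of the largest value: {ranks[pooled.argmax()]:.0f} of {N}")
print(f"H by hand = {H_manual:.4f}, H from SciPy = {H_scipy:.4f}, p = {p:.4f}")
```

The value 1450 receives rank 25 whether it is 1450 or 14500, which is why an extreme value cannot dominate the statistic the way it dominates a mean or variance.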

Post-hoc: Dunn’s test with Bonferroni correction

Python:
from scipy import stats
# group_A, group_B, group_C: sequences of reaction times, one per group
H, p = stats.kruskal(group_A, group_B, group_C)

R:
kruskal.test(reaction_time ~ group, data = df)
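
The Dunn's post-hoc named above is not part of SciPy. A minimal sketch from the textbook formula (z-tests on mean-rank differences with a tie-corrected variance, then Bonferroni adjustment) might look like this. The function name `dunn_bonferroni` and the data are hypothetical; a dedicated package would normally be preferable to a hand-rolled version:

```python
from itertools import combinations

import numpy as np
from scipy import stats

def dunn_bonferroni(groups):
    """Dunn's test for all pairs; returns {(a, b): Bonferroni-adjusted p}."""
    names = list(groups)
    pooled = np.concatenate([np.asarray(groups[n], dtype=float) for n in names])
    ranks = stats.rankdata(pooled)
    N = len(pooled)

    # Tie correction: sum(t^3 - t) / (12 (N - 1)) over groups of tied values.
    _, counts = np.unique(pooled, return_counts=True)
    tie_term = (counts ** 3 - counts).sum() / (12 * (N - 1))

    # Mean rank and size of each group, in the order the data were pooled.
    mean_rank, size, start = {}, {}, 0
    for n in names:
        size[n] = len(groups[n])
        mean_rank[n] = ranks[start:start + size[n]].mean()
        start += size[n]

    pairs = list(combinations(names, 2))
    adjusted = {}
    for a, b in pairs:
        se = np.sqrt((N * (N + 1) / 12 - tie_term) * (1 / size[a] + 1 / size[b]))
        z = (mean_rank[a] - mean_rank[b]) / se
        p = 2 * stats.norm.sf(abs(z))                  # two-sided p-value
        adjusted[(a, b)] = min(1.0, p * len(pairs))    # Bonferroni
    return adjusted

# Hypothetical reaction times in ms (illustrative only).
data = {
    "A": [312, 298, 305, 321, 290, 334, 301, 980],
    "B": [345, 352, 338, 360, 341, 355, 349, 347],
    "C": [401, 388, 395, 1450, 410, 379, 392, 405, 399],
}

adjusted = dunn_bonferroni(data)
for pair, p_adj in adjusted.items():
    print(pair, f"adjusted p = {p_adj:.4f}")
```

Run the post-hoc only after a significant Kruskal-Wallis result; with three groups the Bonferroni factor is 3.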

✓ SECONDARY: WELCH’S ANOVA

When to use:
• If you prefer a parametric test (comparing means)
• Normality is “close enough”

Limitations:
• Still assumes approximate normality
• Less robust to outliers than Kruskal-Wallis

Post-hoc: Games-Howell test (NOT Tukey)

Python:
import pingouin as pg
# df: long-format DataFrame with 'reaction_time' and 'group' columns
pg.welch_anova(data=df, dv='reaction_time', between='group')

R:
oneway.test(reaction_time ~ group, data = df, var.equal = FALSE)
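
Where pingouin is not installed, Welch's F can be sketched directly from its textbook formula using NumPy and SciPy. The function name `welch_anova` below is a hypothetical local helper (not pingouin's), and the data are invented for illustration:

```python
import numpy as np
from scipy import stats

def welch_anova(*groups):
    """Welch's one-way ANOVA; returns (F, df1, df2, p)."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([np.mean(g) for g in groups])
    variances = np.array([np.var(g, ddof=1) for g in groups])

    # Precision weights: groups with smaller variance of the mean count more.
    w = n / variances
    grand_mean = (w * means).sum() / w.sum()

    # Shared term in both the F denominator and the denominator df.
    lam = ((1 - w / w.sum()) ** 2 / (n - 1)).sum()

    numerator = (w * (means - grand_mean) ** 2).sum() / (k - 1)
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    F = numerator / denominator
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    p = stats.f.sf(F, df1, df2)
    return F, df1, df2, p

# Hypothetical reaction times in ms (illustrative only).
group_a = [312, 298, 305, 321, 290, 334, 301, 980]
group_b = [345, 352, 338, 360, 341, 355, 349, 347]
group_c = [401, 388, 395, 1450, 410, 379, 392, 405, 399]

F, df1, df2, p = welch_anova(group_a, group_b, group_c)
print(f"Welch's F({df1}, {df2:.2f}) = {F:.3f}, p = {p:.4f}")
```

With two groups this reduces to the square of Welch's t statistic, which gives a convenient sanity check against `scipy.stats.ttest_ind(..., equal_var=False)`. The outliers still enter the means and variances directly, which is the limitation noted above.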

================================================================================
DECISION TREE

STEP 1: Investigate outliers

Are they errors?
   ↓                    ↓
  YES              NO/UNCERTAIN
   ↓                    ↓
REMOVE & RERUN     USE ROBUST METHODS
   ↓                    ↓
Assumptions        Kruskal-Wallis
now met?           (primary)

  YES → Standard ANOVA
  NO  → Use alternatives

================================================================================
BEST PRACTICE: SENSITIVITY ANALYSIS

Run BOTH methods:

  1. Kruskal-Wallis (primary)
  2. Welch’s ANOVA (comparison)

If both agree → Strong, robust conclusion
If they disagree → Violations matter; report both

================================================================================
WHAT TO REPORT

METHODS:
“Prior to analysis, ANOVA assumptions were evaluated. Shapiro-Wilk tests
indicated departures from normality due to skewed distributions. Levene’s
test indicated significant heterogeneity of variance. Two extreme outliers
were identified; [describe investigation outcome]. Given violations of
normality and homogeneity with small samples (n=8-9 per group), we
employed the Kruskal-Wallis test, a non-parametric alternative robust to
these violations.”

RESULTS:
“The Kruskal-Wallis test revealed [significant/no] differences in reaction
time distributions, H(2) = X.XX, p = .XXX. Post-hoc Dunn’s tests with
Bonferroni correction indicated…”

================================================================================
KEY LEARNING POINTS

  1. Outliers dominate small samples (1 outlier = 11-13% of a group with n=8-9)
  2. Outliers often cause BOTH normality AND homogeneity violations
  3. Small samples (n<30) are less forgiving of violations
  4. Robustness hierarchy for this scenario (outliers plus unequal variances): Standard ANOVA < Welch’s < Kruskal-Wallis
  5. Never auto-remove outliers - investigate first
  6. Sensitivity analysis strengthens conclusions