ANOVA with Multiple Assumption Violations
Decision guidance for cases where standard ANOVA assumptions are violated. Demonstrates an outlier investigation protocol, compares alternative methods (Kruskal-Wallis vs Welch's ANOVA), and provides implementation code for data with skewness, outliers, and heterogeneous variances.
Input
What you provide to the skill
My dataset has 25 students with reaction time measurements that look skewed. Detected 2 extreme outliers (likely errors). Three groups with very different variances. Need to decide: standard ANOVA or alternatives?
Output
What the skill generates for you
STATISTICAL ASSUMPTION CHECKER & DIAGNOSTIC GUIDE
ANALYSIS CONTEXT:
Statistical test: One-way ANOVA
Dependent variable: reaction_time
Groups: 3 levels (n = 8-9 per group)
Total sample: N = 25
STATED CONCERNS:
• “Skewed” distributions
• 2 extreme outliers (likely errors)
• Very different variances across groups
================================================================================
ASSUMPTION VIOLATIONS IDENTIFIED
- NORMALITY: ✗ VIOLATED
  Evidence: Skewed distributions, driven by outliers
  With n=8-9: One outlier = 11-13% of group data
- HOMOGENEITY: ✗ VIOLATED
  Evidence: “Very different variances”
  Likely cause: Outliers inflate the variance of the groups that contain them
- INDEPENDENCE: ✓ LIKELY MET
  Assuming between-subjects design
- OUTLIERS: ✗ PRESENT
  2 extreme outliers detected
  “Likely errors” - requires investigation!
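The checks above can be run formally rather than judged by eye. The following is a minimal sketch, assuming SciPy is available; the reaction times are hypothetical illustrative values (not your data), and `check_assumptions` is a helper name introduced here. It uses Shapiro-Wilk per group and the median-centered (Brown-Forsythe) form of Levene's test, which is a choice made for this sketch because it is less sensitive to skew.

```python
from scipy import stats

# Hypothetical reaction times in ms (illustrative only; N = 25, n = 8-9 per group).
groups = {
    "A": [310, 295, 330, 305, 320, 315, 300, 1500],  # one suspect value
    "B": [350, 365, 340, 372, 358, 361, 345, 355],
    "C": [400, 385, 420, 395, 410, 405, 390, 415, 398],
}


def check_assumptions(groups, alpha=0.05):
    """Shapiro-Wilk per group plus a median-centered Levene test across groups."""
    report = {"normality": {}, "homogeneity": None}
    for name, values in groups.items():
        w, p = stats.shapiro(values)
        report["normality"][name] = {
            "W": float(w),
            "p": float(p),
            "violated": bool(p < alpha),
        }
    # center="median" gives the Brown-Forsythe variant of Levene's test.
    stat, p = stats.levene(*groups.values(), center="median")
    report["homogeneity"] = {
        "stat": float(stat),
        "p": float(p),
        "violated": bool(p < alpha),
    }
    return report


report = check_assumptions(groups)
for name, result in report["normality"].items():
    print(name, result)
print("Levene:", report["homogeneity"])
```

With n=8-9 per group these tests have low power, so a non-significant result does not show that an assumption holds; read them alongside Q-Q plots and the raw values.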
================================================================================
CRITICAL: OUTLIER INVESTIGATION
You said “likely errors” - THIS IS THE KEY DECISION!
INVESTIGATION STEPS:
- Check data entry (typos, decimal errors)
- Check measurement validity (equipment, conditions)
- Check plausibility (is value physiologically possible?)
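Before investigating, it helps to list exactly which observations are being questioned. The sketch below assumes the outliers were screened with Tukey's IQR fences (1.5×IQR for mild, 3×IQR for extreme); the original detection rule is not stated, so this is an assumption, and `flag_outliers` plus the sample values are hypothetical.

```python
import numpy as np


def flag_outliers(values, mild=1.5, extreme=3.0):
    """Return (index, value, label) for points outside Tukey's IQR fences."""
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    flags = []
    for i, v in enumerate(x):
        if v < q1 - extreme * iqr or v > q3 + extreme * iqr:
            flags.append((i, float(v), "extreme"))
        elif v < q1 - mild * iqr or v > q3 + mild * iqr:
            flags.append((i, float(v), "mild"))
    return flags


# Hypothetical reaction times in ms for one group.
group_a = [310, 295, 330, 305, 320, 315, 300, 1500]
print(flag_outliers(group_a))
```

A flag only marks a value for investigation. Whether it is an error still depends on the data-entry, measurement, and plausibility checks listed above.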
DECISION RULES:
SCENARIO A: Confirmed data entry error
→ CORRECT if true value known
→ REMOVE if uncorrectable
→ RERUN ALL ASSUMPTION CHECKS
SCENARIO B: Confirmed measurement error
→ REMOVE the observation
→ Document reason
→ RERUN ASSUMPTION CHECKS
SCENARIO C: Legitimate extreme value
→ KEEP the observation
→ Use ROBUST methods
SCENARIO D: Uncertain
→ KEEP (err on side of inclusion)
→ Use robust methods
→ SENSITIVITY ANALYSIS
CRITICAL ETHICAL POINT:
✗ NEVER remove data just to get significance
✓ ONLY remove with legitimate methodological reason
✓ ALWAYS document removals
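One way to keep removals documented and reversible is to leave the raw data untouched, record each exclusion with its reason, and run the analysis both with and without the excluded values. This is a minimal sketch under those assumptions; the data, the exclusion reason, and the helper `apply_exclusions` are hypothetical.

```python
from scipy import stats

# Hypothetical reaction times in ms (illustrative only).
groups = {
    "A": [310, 295, 330, 305, 320, 315, 300, 1500],
    "B": [350, 365, 340, 372, 358, 361, 345, 355],
    "C": [400, 385, 420, 395, 410, 405, 390, 415, 398],
}

# Exclusion log: every removal carries a documented, methodological reason.
exclusions = [
    {"group": "A", "value": 1500,
     "reason": "hypothetical example: timer malfunction noted in session log"},
]


def apply_exclusions(groups, exclusions):
    """Return a cleaned copy of the data; the original dict is not modified."""
    cleaned = {name: list(values) for name, values in groups.items()}
    for entry in exclusions:
        # list.remove raises ValueError if the logged value is not in the group.
        cleaned[entry["group"]].remove(entry["value"])
    return cleaned


cleaned = apply_exclusions(groups, exclusions)

# Sensitivity analysis: same test with and without the excluded observations.
h_all, p_all = stats.kruskal(*groups.values())
h_clean, p_clean = stats.kruskal(*cleaned.values())
print(f"All data:        H = {h_all:.2f}, p = {p_all:.4f}")
print(f"After exclusion: H = {h_clean:.2f}, p = {p_clean:.4f}")
```

If the conclusion changes between the two runs, report both results along with the exclusion log.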
================================================================================
ANSWER TO YOUR QUESTION
“Need to decide: standard ANOVA or alternatives?”
→ USE ALTERNATIVES, NOT STANDARD ANOVA
WHY:
✗ Normality violated (skewed)
✗ Homogeneity violated (different variances)
✗ Outliers present
✗ Small sample (n=8-9 → CLT not protective)
================================================================================
RECOMMENDED ALTERNATIVES
❌ DO NOT USE: Standard One-Way ANOVA
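Results would be unreliable: with unequal variances and outliers in small groups, the F test's error rates may no longer match the nominal level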
✓ PRIMARY: KRUSKAL-WALLIS TEST (RECOMMENDED)
Why best for your data:
• NO normality assumption ✓
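• NO equal-variance assumption for detecting that groups differ (reading the result as a difference in medians does need similarly shaped distributions) ✓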
• Robust to outliers ✓
• Works with small samples ✓
How it works:
• Ranks all observations (1, 2, 3… 25)
• Tests if mean ranks differ across groups
• Outliers become “just another rank”
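The ranking steps above can be sketched by hand to show why outliers lose their leverage. This is an illustrative reconstruction, assuming SciPy and hypothetical data; `kruskal_h_by_hand` is a name introduced here, and in practice `scipy.stats.kruskal` is the call to use.

```python
import numpy as np
from scipy import stats


def kruskal_h_by_hand(*samples):
    """Compute the tie-corrected Kruskal-Wallis H and its chi-square p-value."""
    sizes = [len(s) for s in samples]
    pooled = np.concatenate([np.asarray(s, dtype=float) for s in samples])
    ranks = stats.rankdata(pooled)  # tied values receive their average rank
    n_total = len(pooled)
    # Split the pooled ranks back into their original groups.
    rank_groups = np.split(ranks, np.cumsum(sizes)[:-1])
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        r.sum() ** 2 / len(r) for r in rank_groups
    ) - 3 * (n_total + 1)
    h /= stats.tiecorrect(ranks)  # equals 1.0 when there are no ties
    p = stats.chi2.sf(h, df=len(samples) - 1)
    return float(h), float(p)


# Hypothetical reaction times in ms (illustrative only).
group_a = [310, 295, 330, 305, 320, 315, 300, 1500]
group_b = [350, 365, 340, 372, 358, 361, 345, 355]
group_c = [400, 385, 420, 395, 410, 405, 390, 415, 398]

h_manual, p_manual = kruskal_h_by_hand(group_a, group_b, group_c)

# Making the outlier ten times larger leaves its rank, and so H, unchanged.
group_a_inflated = [310, 295, 330, 305, 320, 315, 300, 15000]
h_inflated, _ = kruskal_h_by_hand(group_a_inflated, group_b, group_c)
print(h_manual, h_inflated)
```

The outlier still counts as the largest observation, so it is not ignored; it simply cannot pull the statistic further than rank 25 allows.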
Post-hoc: Dunn’s test with Bonferroni correction
Python:
from scipy import stats
H, p = stats.kruskal(group_A, group_B, group_C)
R:
kruskal.test(reaction_time ~ group, data = df)
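For the Dunn's post-hoc named above, the pairwise comparison can be sketched directly from the pooled ranks. This is a minimal illustration, assuming the standard Dunn z statistic with a tie adjustment and a Bonferroni multiplier; `dunn_bonferroni` and the data are hypothetical, and a maintained package implementation is preferable for real analyses.

```python
from itertools import combinations

import numpy as np
from scipy import stats


def dunn_bonferroni(groups):
    """Pairwise Dunn z tests on pooled ranks with Bonferroni-adjusted p-values."""
    names = list(groups)
    sizes = {name: len(groups[name]) for name in names}
    pooled = np.concatenate([np.asarray(groups[n], dtype=float) for n in names])
    ranks = stats.rankdata(pooled)
    n_total = len(pooled)

    # Mean rank of each group, taken from the pooled ranking.
    mean_rank, start = {}, 0
    for name in names:
        mean_rank[name] = ranks[start:start + sizes[name]].mean()
        start += sizes[name]

    # Variance term shared by all pairs, reduced when ties are present.
    _, counts = np.unique(pooled, return_counts=True)
    tie_term = (counts ** 3 - counts).sum() / (12.0 * (n_total - 1))
    base_var = n_total * (n_total + 1) / 12.0 - tie_term

    pairs = list(combinations(names, 2))
    results = {}
    for a, b in pairs:
        se = np.sqrt(base_var * (1.0 / sizes[a] + 1.0 / sizes[b]))
        z = (mean_rank[a] - mean_rank[b]) / se
        p_raw = 2 * stats.norm.sf(abs(z))
        # Bonferroni: multiply by the number of comparisons, cap at 1.
        results[(a, b)] = {"z": float(z), "p_adj": float(min(1.0, p_raw * len(pairs)))}
    return results


# Hypothetical reaction times in ms (illustrative only).
groups = {
    "A": [310, 295, 330, 305, 320, 315, 300, 1500],
    "B": [350, 365, 340, 372, 358, 361, 345, 355],
    "C": [400, 385, 420, 395, 410, 405, 390, 415, 398],
}
posthoc = dunn_bonferroni(groups)
for pair, result in posthoc.items():
    print(pair, result)
```

Run the post-hoc only after a significant omnibus Kruskal-Wallis result, and report the adjusted p-values.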
✓ SECONDARY: WELCH’S ANOVA
When to use:
• If you prefer parametric (test means)
• Normality is “close enough”
Limitations:
• Still assumes approximate normality
• Less robust to outliers than Kruskal-Wallis
Post-hoc: Games-Howell test (NOT Tukey)
Python:
import pingouin as pg
pg.welch_anova(data=df, dv='reaction_time', between='group')
R:
oneway.test(reaction_time ~ group, data = df, var.equal = FALSE)
================================================================================
DECISION TREE
STEP 1: Investigate outliers
              ↓
       Are they errors?
        ↓            ↓
       YES      NO/UNCERTAIN
        ↓            ↓
     REMOVE      USE ROBUST
     & RERUN     METHODS
        ↓            ↓
   Assumptions   Kruskal-Wallis
   now met?      (primary)
        ↓
   YES → Standard ANOVA
   NO  → Use alternatives
================================================================================
BEST PRACTICE: SENSITIVITY ANALYSIS
Run BOTH methods:
- Kruskal-Wallis (primary)
- Welch’s ANOVA (comparison)
If both agree → Strong, robust conclusion
If they disagree → Violations matter; report both
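The two-method comparison can be sketched in one pass. SciPy has no built-in Welch's ANOVA, so the sketch computes Welch's F from its textbook formula; `welch_anova`, `sensitivity`, and the data are hypothetical names and values introduced here, not the implementation used by pingouin or R.

```python
import numpy as np
from scipy import stats


def welch_anova(*samples):
    """Welch's heteroscedastic one-way F test: returns (F, df1, df2, p)."""
    k = len(samples)
    n = np.array([len(s) for s in samples], dtype=float)
    means = np.array([np.mean(s) for s in samples])
    variances = np.array([np.var(s, ddof=1) for s in samples])
    w = n / variances                       # precision weights
    grand = np.sum(w * means) / np.sum(w)   # weighted grand mean
    lam = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    f_stat = (np.sum(w * (means - grand) ** 2) / (k - 1)) / (
        1 + 2 * (k - 2) * lam / (k ** 2 - 1)
    )
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    p = stats.f.sf(f_stat, df1, df2)
    return float(f_stat), float(df1), float(df2), float(p)


def sensitivity(groups, alpha=0.05):
    """Run Kruskal-Wallis and Welch's ANOVA and report whether they agree."""
    h, p_kw = stats.kruskal(*groups.values())
    f_stat, df1, df2, p_welch = welch_anova(*groups.values())
    return {
        "kruskal": {"H": float(h), "p": float(p_kw)},
        "welch": {"F": f_stat, "df1": df1, "df2": df2, "p": p_welch},
        "agree": bool((p_kw < alpha) == (p_welch < alpha)),
    }


# Hypothetical reaction times in ms (illustrative only).
groups = {
    "A": [310, 295, 330, 305, 320, 315, 300, 1500],
    "B": [350, 365, 340, 372, 358, 361, 345, 355],
    "C": [400, 385, 420, 395, 410, 405, 390, 415, 398],
}
summary = sensitivity(groups)
print(summary)
```

Agreement here only means both tests fall on the same side of alpha. The two tests address different questions (rank distributions vs means), so report both statistics rather than the flag alone.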
================================================================================
WHAT TO REPORT
METHODS:
“Prior to analysis, ANOVA assumptions were evaluated. Shapiro-Wilk tests
indicated departures from normality due to skewed distributions. Levene’s
test indicated significant heterogeneity of variance. Two extreme outliers
were identified; [describe investigation outcome]. Given violations of
normality and homogeneity with small samples (n=8-9 per group), we
employed the Kruskal-Wallis test, a non-parametric alternative robust to
these violations.”
RESULTS:
“The Kruskal-Wallis test revealed [significant/no] differences in reaction
time distributions, H(2) = X.XX, p = .XXX. Post-hoc Dunn’s tests with
Bonferroni correction indicated…”
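To fill in the H(2) = X.XX, p = .XXX placeholders consistently, a small formatter can help. This is a sketch assuming APA-style conventions (two decimals for H, no leading zero on p, "p < .001" for very small values); `format_p` and `format_kruskal` are hypothetical helper names, and the numbers passed in are made-up inputs.

```python
def format_p(p):
    """Format a p-value without a leading zero; very small values become '< .001'."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def format_kruskal(h, df, p):
    """Build the 'H(df) = X.XX, p = .XXX' string used in the results template."""
    return f"H({df}) = {h:.2f}, {format_p(p)}"


# Made-up inputs purely to show the output format.
print(format_kruskal(6.25, 2, 0.044))
print(format_kruskal(15.5, 2, 0.0004))
```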
================================================================================
KEY LEARNING POINTS
- Outliers dominate small samples (1 outlier = 11-13% of a group when n=8-9)
- Outliers often cause BOTH normality AND homogeneity violations
- Small samples (n<30) are less forgiving of violations
- Robustness hierarchy: Standard ANOVA < Welch’s < Kruskal-Wallis
- Never auto-remove outliers - investigate first
- Sensitivity analysis strengthens conclusions
About This Skill
Educational guide for graduate statistics students learning to test and interpret parametric test assumptions including normality, homogeneity of variance, linearity, independence, and outliers with clear explanations.
More Examples
ANCOVA Assumption Checking with Covariates
Comprehensive assumption validation for ANCOVA with multiple treatment groups and covariates. Covers missing data handling, normality per group, homogeneity of variance, linearity of covariates, homogeneity of regression slopes, and independence with detailed decision frameworks.
Independent T-Test Assumption Checking
Complete assumption validation for comparing two groups with a t-test. Demonstrates the five key assumptions (missing data, normality, homogeneity of variance, independence, outliers) with educational explanations, Q-Q plot interpretation guidance, and methods section template.