All examples for Statistical Test Selection Guide

Regional Customer Satisfaction Analysis

Shows one-way ANOVA for comparing satisfaction scores across 4 regions, including post-hoc tests, effect size, and both business and academic reporting formats.

Input

What you provide to the skill

We tested customer satisfaction (1-10 scale) across 4 regions: North (n=35), South (n=32), East (n=38), West (n=36). Same survey, different people. Do regions differ significantly?

Output

What the skill generates for you

Recommended Test: One-Way Between-Subjects ANOVA

Why This Test?

Based on your description:

  • ✓ ONE continuous dependent variable (satisfaction scores 1-10)
  • ✓ ONE categorical independent variable (region) with FOUR levels (North, South, East, West)
  • ✓ Independent groups (different people in each region)
  • ✓ Adequate sample sizes (n=32-38 per group, total N=141)

Assumptions to Check

  1. Normality: Satisfaction scores approximately normal within each region

    • Check: Shapiro-Wilk test, Q-Q plots
    • ANOVA is robust to mild violations with balanced groups
    • If violated: Kruskal-Wallis test
  2. Homogeneity of Variance: Similar variance across regions

    • Check: Levene’s test
    • If violated: Welch’s ANOVA
  3. Independence: Each satisfaction score independent of others (satisfied by different people)
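The Welch's ANOVA fallback named in assumption 2 is mentioned but never shown, and `scipy.stats` does not provide it directly. Below is a minimal sketch built from Welch's published formula; the function name `welch_anova` and the simulated demo data are illustrative assumptions, not part of the original guide (dedicated statistics packages also provide maintained implementations):

```python
import numpy as np
from scipy.stats import f as f_dist


def welch_anova(*groups):
    """Welch's one-way ANOVA: does not assume equal variances across groups."""
    k = len(groups)
    n = np.array([len(g) for g in groups])
    means = np.array([np.mean(g) for g in groups])
    variances = np.array([np.var(g, ddof=1) for g in groups])
    w = n / variances                                # precision weights
    grand_mean = np.sum(w * means) / np.sum(w)       # weighted grand mean
    # Between-groups component
    numerator = np.sum(w * (means - grand_mean) ** 2) / (k - 1)
    # Correction term that absorbs unequal variances
    lam = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1)) / (k ** 2 - 1)
    F = numerator / (1 + 2 * (k - 2) * lam)
    df1 = k - 1
    df2 = 1 / (3 * lam)
    p = f_dist.sf(F, df1, df2)                       # upper-tail p-value
    return F, df1, df2, p


# Illustrative call on simulated data (means/SDs are made up, not real results)
rng = np.random.default_rng(1)
groups = [rng.normal(m, s, n) for m, s, n in
          [(7.2, 0.8, 35), (6.8, 1.6, 32), (7.9, 1.1, 38), (7.1, 0.9, 36)]]
F, df1, df2, p = welch_anova(*groups)
print(f"Welch's F({df1}, {df2:.1f}) = {F:.3f}, p = {p:.4f}")
```

Note that the second degrees-of-freedom value is fractional by design; report it to one decimal place.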

Python Code

from scipy.stats import f_oneway, levene, shapiro
import numpy as np

# Your data
north = np.array(<your_data>)  # 35 satisfaction scores (1-10)
south = np.array(<your_data>)  # 32 satisfaction scores (1-10)
east = np.array(<your_data>)   # 38 satisfaction scores (1-10)
west = np.array(<your_data>)   # 36 satisfaction scores (1-10)

# Descriptive statistics
print(f"North (n=35): M = {np.mean(north):.2f}, SD = {np.std(north, ddof=1):.2f}")
print(f"South (n=32): M = {np.mean(south):.2f}, SD = {np.std(south, ddof=1):.2f}")
print(f"East (n=38): M = {np.mean(east):.2f}, SD = {np.std(east, ddof=1):.2f}")
print(f"West (n=36): M = {np.mean(west):.2f}, SD = {np.std(west, ddof=1):.2f}")

# Check normality for each region
for name, group in [("North", north), ("South", south), ("East", east), ("West", west)]:
    stat, p = shapiro(group)
    print(f"{name} normality: p = {p:.4f} {'(normal)' if p > 0.05 else '(non-normal)'}")

# Check equal variance (homogeneity of variance)
stat, p_levene = levene(north, south, east, west)
print(f"\nLevene's test for equal variance: p = {p_levene:.4f}")
if p_levene > 0.05:
    print("Assumption of equal variance satisfied")
else:
    print("WARNING: Unequal variances detected - consider Welch's ANOVA")

# Run one-way ANOVA
F_stat, p_value = f_oneway(north, south, east, west)
print("\n=== ANOVA Results ===")
print(f"F-statistic: F(3, 137) = {F_stat:.3f}")
print(f"P-value: {p_value:.4f}")
if p_value < 0.05:
    print("\n✓ SIGNIFICANT: Regions differ significantly in customer satisfaction")
    print("  → Proceed to post-hoc tests to identify which regions differ")
else:
    print("\n✗ NOT SIGNIFICANT: No evidence of regional differences in satisfaction")

Effect Size

# Eta-squared (proportion of variance explained by region)
all_scores = np.concatenate([north, south, east, west])
grand_mean = np.mean(all_scores)

# Sum of squares between groups
ss_between = sum(len(g) * (np.mean(g) - grand_mean)**2
                 for g in [north, south, east, west])

# Total sum of squares
ss_total = np.sum((all_scores - grand_mean)**2)

eta_squared = ss_between / ss_total
print(f"\nEta-squared = {eta_squared:.3f}")

# Interpretation (Cohen's benchmarks: .01 small, .06 medium, .14 large)
if eta_squared < 0.06:
    print("Small effect (< 6% of variance explained)")
elif eta_squared < 0.14:
    print("Medium effect (6-14% of variance explained)")
else:
    print("Large effect (≥ 14% of variance explained)")

Post-Hoc Tests (If Significant)

If the ANOVA is significant, run Tukey’s HSD to identify which specific regions differ:

from scipy.stats import tukey_hsd

# Run Tukey's HSD post-hoc test
res = tukey_hsd(north, south, east, west)
print("\n=== Tukey's HSD Post-Hoc Comparisons ===")
print(res)

# Manual pairwise comparisons with labels
import itertools
from scipy.stats import ttest_ind

regions = {'North': north, 'South': south, 'East': east, 'West': west}
print("\nPairwise comparisons:")
for reg1, reg2 in itertools.combinations(regions, 2):
    t_stat, p_val = ttest_ind(regions[reg1], regions[reg2])
    diff = np.mean(regions[reg1]) - np.mean(regions[reg2])
    print(f"{reg1} vs {reg2}: Diff = {diff:+.2f}, p = {p_val:.4f} {'*' if p_val < 0.05 else ''}")

print("\nNote: Apply Bonferroni correction (α = 0.05/6 = 0.0083) for multiple comparisons")
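The Bonferroni correction in the note above can also be applied to the p-values themselves rather than the alpha threshold: multiply each raw p-value by the number of tests and cap at 1. A quick sketch (the six raw p-values here are hypothetical placeholders, not real results):

```python
# Manual Bonferroni adjustment: multiply each raw p-value by the number of tests
raw_p = [0.003, 0.041, 0.220, 0.008, 0.510, 0.090]  # hypothetical pairwise p-values
m = len(raw_p)                                      # 6 comparisons among 4 groups
adjusted = [min(p * m, 1.0) for p in raw_p]         # cap adjusted p at 1.0
significant = [p_adj < 0.05 for p_adj in adjusted]
for raw, adj, sig in zip(raw_p, adjusted, significant):
    print(f"raw p = {raw:.3f} → adjusted p = {adj:.3f} {'*' if sig else ''}")
```

Comparing adjusted p-values against .05 is equivalent to comparing raw p-values against .05/6; both control the familywise error rate.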

Non-Parametric Alternative (If Assumptions Violated)

from scipy.stats import kruskal

# Kruskal-Wallis H-test (non-parametric alternative)
H_stat, p_value = kruskal(north, south, east, west)
print("\nKruskal-Wallis H-test:")
print(f"H-statistic: {H_stat:.3f}")
print(f"P-value: {p_value:.4f}")
if p_value < 0.05:
    print("Significant difference between regions (non-parametric test)")

Business Reporting Template

Executive Summary:

We analyzed customer satisfaction scores (1-10 scale) across four regions: North (n=35), South (n=32), East (n=38), and West (n=36).

Key Finding:

  • [If p < 0.05]: Regional differences in satisfaction are statistically significant, F(3, 137) = X.XX, p = .XXX.

    • Effect size (eta-squared = .XX) indicates that region accounts for X% of satisfaction variance.
    • Highest satisfaction: [Region] (M = X.XX)
    • Lowest satisfaction: [Region] (M = X.XX)
    • Actionable insight: Investigate practices in top-performing region for potential rollout.
  • [If p > 0.05]: No statistically significant regional differences detected (p = .XXX).

    • All regions show similar satisfaction levels (overall M = X.XX).
    • Actionable insight: Regional strategy can be standardized; differences are likely due to chance.

Academic Reporting Template (APA Format)

“A one-way between-subjects ANOVA was conducted to compare customer satisfaction scores across four geographic regions (North, South, East, West). The analysis revealed [a significant/no significant] effect of region on satisfaction scores, F(3, 137) = X.XX, p = .XXX, η² = .XX. [If significant: Post-hoc comparisons using Tukey’s HSD indicated that…]”

Common Pitfalls to Avoid

  1. Don’t run six separate t-tests (North vs South, North vs East, etc.)

    • This inflates Type I error rate to ~26% instead of 5%
    • Use ANOVA first, then post-hoc tests if significant
  2. Don’t ignore effect size

    • p < 0.05 tells you the observed differences are unlikely to be due to chance alone; it says nothing about their size
    • Eta-squared tells you whether the differences are large enough to be meaningful
    • With large samples, tiny differences can be “significant” but unimportant
  3. Don’t confuse statistical and practical significance

    • A difference of 0.3 points on a 10-point scale might be statistically significant but practically irrelevant
    • Consider: What size difference would actually change business decisions?
  4. Don’t forget assumption checks

    • ANOVA is fairly robust, but severely violated assumptions require alternatives
    • With nearly equal sample sizes (32-38), homogeneity violations are less concerning
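The ~26% figure in pitfall 1 comes from the familywise error formula 1 − (1 − α)^m for m independent tests; a quick check:

```python
# Familywise Type I error rate for m independent tests at alpha = .05
alpha = 0.05
m = 6  # number of pairwise comparisons among 4 groups: C(4, 2) = 6
familywise = 1 - (1 - alpha) ** m
print(f"P(at least one false positive) = {familywise:.3f}")  # → 0.265
```

This is why an omnibus ANOVA (or a corrected post-hoc procedure) is needed instead of six uncorrected t-tests.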

Next Steps

  1. Run assumption checks first (normality, equal variance)
  2. Execute ANOVA with your actual data
  3. If significant: Run Tukey’s HSD to identify which regions differ
  4. Calculate effect size to assess practical importance
  5. Interpret in business context: What would you do differently based on results?
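The five steps above can be sketched as one end-to-end script. The data here are simulated with made-up group means so the sketch runs standalone; substitute your real arrays before drawing any conclusions:

```python
import numpy as np
from scipy.stats import shapiro, levene, f_oneway, tukey_hsd

rng = np.random.default_rng(42)
# Simulated satisfaction scores (illustrative means/SDs - replace with real data)
groups = {
    "North": rng.normal(7.2, 1.1, 35),
    "South": rng.normal(6.8, 1.2, 32),
    "East":  rng.normal(7.9, 1.0, 38),
    "West":  rng.normal(7.1, 1.1, 36),
}
samples = list(groups.values())

# 1. Assumption checks: normality per group, homogeneity of variance
normal_ok = all(shapiro(g).pvalue > 0.05 for g in samples)
_, p_levene = levene(*samples)
print(f"Normality OK in all groups: {normal_ok}; Levene p = {p_levene:.4f}")

# 2. Omnibus one-way ANOVA
F, p = f_oneway(*samples)

# 3. Post-hoc comparisons only if the omnibus test is significant
if p < 0.05:
    print(tukey_hsd(*samples))

# 4. Effect size (eta-squared)
all_scores = np.concatenate(samples)
grand = np.mean(all_scores)
ss_between = sum(len(g) * (np.mean(g) - grand) ** 2 for g in samples)
ss_total = np.sum((all_scores - grand) ** 2)
print(f"F = {F:.3f}, p = {p:.4f}, eta^2 = {ss_between / ss_total:.3f}")

# 5. Interpret in context: is the largest group difference big enough to act on?
```

If the normality or variance checks in step 1 fail badly, branch to the Kruskal-Wallis or Welch alternatives shown earlier instead of the standard ANOVA.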