All examples for Categorical Variable Profiler

Customer Satisfaction Regional Analysis

A complete analysis of a 30-customer survey dataset with Region, Age_Group, Satisfaction, and Product_Category variables. It demonstrates the full workflow: frequency tables, chi-square tests with multiple-testing correction, effect-size calculation (Cramér's V), and visualization generation. The analysis identifies a critical dissatisfaction issue in the West region (57% vs 0% in South) and strong statistical associations between variables.

Input

What you provide to the skill

Profile the categorical variables in /tmp/test_customer_data.csv

Output

What the skill generates for you

Analysis Complete! 🎉

I’ve successfully executed the Categorical Variable Profiler skill on your customer data. Here’s what was delivered:

📊 Key Insights

Critical Findings:

  1. West Region Crisis: 57% dissatisfaction rate vs 0% in all other regions; all 4 dissatisfied customers are from West
  2. Clothing Problem: 100% of dissatisfied customers purchased Clothing products
  3. South Region Excellence: Perfect satisfaction record (100% satisfied/very satisfied)
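
A per-region rate like the 57% figure above can be reproduced with a simple grouped count. The sketch below is illustrative only: the rows are hypothetical toy data, not the actual survey file, and `rate_by_group` is a made-up helper name rather than part of the skill.

```python
from collections import Counter

# Hypothetical toy rows (region, satisfaction); NOT the actual survey data.
rows = [
    ("West", "Dissatisfied"), ("West", "Dissatisfied"),
    ("West", "Dissatisfied"), ("West", "Dissatisfied"),
    ("West", "Neutral"), ("West", "Satisfied"), ("West", "Satisfied"),
    ("South", "Satisfied"), ("South", "Very Satisfied"), ("South", "Satisfied"),
]

def rate_by_group(rows, target):
    """Return {group: share of that group's rows whose value equals target}."""
    totals = Counter(group for group, _ in rows)
    hits = Counter(group for group, value in rows if value == target)
    # Counter returns 0 for groups with no hits, so every group gets a rate.
    return {group: hits[group] / totals[group] for group in totals}

rates = rate_by_group(rows, "Dissatisfied")
for region, rate in sorted(rates.items()):
    print(f"{region}: {rate:.0%}")
# prints:
# South: 0%
# West: 57%
```

With only 7 rows in a group, a single response shifts the rate by about 14 percentage points, which is worth keeping in mind when reading the percentages.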

Statistical Associations (FDR-corrected p-values; all three pairs below are significant):

  • Region × Satisfaction (p=0.033, Cramér’s V=0.474): moderate-to-large effect
  • Region × Product_Category (p<0.001, Cramér’s V=0.780): large effect
  • Satisfaction × Product_Category (p=0.011, Cramér’s V=0.566): large effect
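
For readers who want to see how effect sizes and corrected p-values of this kind are computed, here is a minimal standard-library sketch. It assumes Cramér's V is derived from the Pearson chi-square statistic as sqrt(chi2 / (n * (min(rows, cols) - 1))) and that the FDR correction is Benjamini-Hochberg; the output above does not name the correction method, so that part is an assumption. The function names and the toy inputs are hypothetical.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic and total count for a contingency table.

    `table` is a list of rows of observed counts. Assumes every row and
    column total is positive (otherwise an expected count would be zero).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v(table):
    """Cramér's V effect size, between 0 (no association) and 1 (perfect)."""
    chi2, n = chi_square_statistic(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical toy inputs, not taken from the survey data.
perfect = [[10, 0], [0, 10]]
independent = [[5, 5], [5, 5]]
v_perfect = cramers_v(perfect)
v_independent = cramers_v(independent)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Computing a chi-square p-value itself requires the chi-square distribution, which the standard library does not provide; a statistics package would normally supply it.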

📁 Generated Files (11 total)

Visualizations:

  • 4 distribution bar charts (Region, Age_Group, Satisfaction, Product_Category)
  • 6 cross-tabulation heatmaps showing relationships between variable pairs

Data:

  • chi_square_results.csv - Statistical test results with effect sizes

All files are saved in /tmp/

⚠️ Important Notes

  • The sample size is small (n=30), which limits statistical power
  • Chi-square assumptions are violated because expected cell frequencies are small
  • Results should be validated with a larger sample (recommend n≥100)
  • Despite the small sample, the large effect sizes suggest the associations are genuine
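
The note about expected cell frequencies can be checked directly. The sketch below assumes the common rule of thumb that a chi-square test becomes unreliable when more than about 20% of cells have an expected count below 5; the skill's own criterion is not stated, and both the table and the helper names are hypothetical.

```python
def expected_counts(table):
    """Expected counts under independence: row_total * col_total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def small_expected_share(table, threshold=5.0):
    """Fraction of cells whose expected count falls below `threshold`."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < threshold for e in cells) / len(cells)

# Hypothetical 4x3 table with 30 observations; NOT the actual survey counts.
toy_small = [[3, 2, 2], [0, 3, 5], [0, 4, 4], [0, 3, 4]]
toy_large = [[10, 10], [10, 10]]

share_small = small_expected_share(toy_small)
share_large = small_expected_share(toy_large)
print(f"cells with expected count < 5: {share_small:.0%} vs {share_large:.0%}")
```

With 30 observations spread over a 4x3 table, the average expected count per cell is only 2.5, so a violation of this kind is close to unavoidable at that sample size.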

🎯 Top Recommendations

  1. Immediate: Investigate West region operations and Clothing product line
  2. Short-term: Replicate South region best practices, target 25-34 age group
  3. Long-term: Increase data collection, add more variables for deeper analysis

The analysis revealed a clear pattern: West region + Clothing = Dissatisfaction, while other regions with Electronics/Home products show high satisfaction.