Customer Demographics Dataset
Analyzing a customer dataset with age, income, and spending score columns. Demonstrates basic distribution analysis on symmetric distributions, with standardization recommendations for variables whose tails are lighter than normal (negative excess kurtosis).
Input
What you provide to the skill
Analyze distributions in a sample customer dataset with age, income, and spending_score columns (500 rows)
Output
What the skill generates for you
Distribution Analysis Report
Summary Statistics
| Variable | Count | Mean | Median | Std | Min | Max | Q1 | Q3 | IQR | Skewness | Kurtosis | Missing | Missing% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| age | 427 | 40.1101 | 40 | 10.1431 | 22 | 57 | 31 | 49 | 18 | -0.0416 | -1.215 | 0 | 0 |
| income | 427 | 69543.3 | 70000 | 22232.3 | 33000 | 108000 | 49000 | 89000 | 40000 | -0.0046 | -1.2596 | 0 | 0 |
| spending_score | 427 | 54.192 | 52 | 16.3012 | 28 | 82 | 40 | 70 | 30 | 0.1548 | -1.3137 | 0 | 0 |
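The columns in this table can be reproduced with the Python standard library alone. The following is a minimal sketch, not the skill's actual implementation: the function name `summarize` is hypothetical, and the quartile method (linear interpolation, `method="inclusive"`) and the sample standard deviation (n - 1 denominator) are assumptions about how the report computes its values.

```python
import statistics


def summarize(values):
    """Return count, center, spread, and quartiles for a list of numbers.

    Assumes at least two non-missing numeric values.
    """
    data = sorted(values)
    # Quartiles by linear interpolation between order statistics (assumed method).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "count": len(data),
        "mean": statistics.fmean(data),
        "median": median,
        "std": statistics.stdev(data),  # sample standard deviation (n - 1)
        "min": data[0],
        "max": data[-1],
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }


# Toy data for illustration only, not taken from the customer dataset.
stats = summarize([1, 2, 3, 4, 5])
```

Calling `summarize` once per numeric column would yield one row of the table above, apart from the skewness, kurtosis, and missing-value columns.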
Distribution Shapes
| Variable | Distribution | Skewness | Kurtosis |
|---|---|---|---|
| age | Symmetric, light-tailed | -0.0416 | -1.215 |
| income | Symmetric, light-tailed | -0.0046 | -1.2596 |
| spending_score | Symmetric, light-tailed | 0.1548 | -1.3137 |
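As a rough illustration of how a shape label can be derived from skewness and kurtosis, the sketch below computes both from central moments and maps them to a label. By convention, excess kurtosis below zero means tails lighter than a normal distribution, and above zero means heavier tails. The function names and the cutoffs (0.5 for skewness, 1.0 for excess kurtosis) are assumptions chosen for illustration, and the moment estimator (no bias correction) may differ from the one the skill uses.

```python
def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, excess_kurt


def describe_shape(skew, excess_kurt, skew_cut=0.5, kurt_cut=1.0):
    """Map skewness and excess kurtosis to a label (cutoffs are assumptions)."""
    if abs(skew) < skew_cut:
        symmetry = "Symmetric"
    elif skew > 0:
        symmetry = "Right-skewed"
    else:
        symmetry = "Left-skewed"

    if excess_kurt > kurt_cut:
        tails = "heavy-tailed"
    elif excess_kurt < -kurt_cut:
        tails = "light-tailed"
    else:
        tails = "moderate tails"
    return f"{symmetry}, {tails}"


# Using the age row from the table: skewness -0.0416, kurtosis -1.215.
label = describe_shape(-0.0416, -1.215)
```

An excess kurtosis near -1.2 is what a uniform distribution produces, which is consistent with values spread fairly evenly between the minimum and maximum.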
Normality Tests
| Variable | Test | Statistic | p-value | Result |
|---|---|---|---|---|
| age | Shapiro-Wilk | 0.9522 | < 0.0001 | Non-normal |
| income | Shapiro-Wilk | 0.9455 | < 0.0001 | Non-normal |
| spending_score | Shapiro-Wilk | 0.9329 | < 0.0001 | Non-normal |
Transformation Recommendations
| Variable | Recommendation |
|---|---|
| age | Z-score standardization if needed |
| income | Z-score standardization if needed |
| spending_score | Z-score standardization if needed |
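Z-score standardization rescales a variable to mean 0 and standard deviation 1 without changing the shape of its distribution, which is why it suits variables that are already symmetric. A minimal sketch, assuming the population standard deviation is used (the skill may use the sample version instead); `z_scores` is a hypothetical helper name:

```python
import statistics


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumed)
    if std == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(x - mean) / std for x in values]


# Toy data for illustration only: mean 5, population std 2.
scaled = z_scores([2, 4, 4, 4, 5, 5, 7, 9])
```

Standardization matters mainly when variables on very different scales, such as age and income here, feed into a distance-based or gradient-based method.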
Saved: /workspace/distribution_plots.png
About This Skill
Automated distribution analysis for numeric dataset variables with statistics, visualizations, and transformation recommendations.
More Examples
IoT Sensor Data with Large Sample
Analyzing sensor readings with 6000 observations to trigger the Kolmogorov-Smirnov test instead of Shapiro-Wilk. Demonstrates correct test selection for large samples, handling of zero-inflated error counts, and identification of normally distributed sensor measurements alongside skewed operational metrics.
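The sample-size rule described here can be sketched as a small selection function. The cutoff of 5000 is an assumption: the description only establishes that 6000 observations trigger Kolmogorov-Smirnov, and 5000 is a common limit because Shapiro-Wilk p-values become less reliable beyond that size. The function name is hypothetical.

```python
def choose_normality_test(n, max_shapiro_n=5000):
    """Pick a normality test by sample size (cutoff is an assumption)."""
    if n < 3:
        raise ValueError("at least 3 observations are needed")
    if n <= max_shapiro_n:
        return "Shapiro-Wilk"
    return "Kolmogorov-Smirnov"


small = choose_normality_test(427)   # customer dataset row count from the report
large = choose_normality_test(6000)  # sensor dataset size from the description
```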
Sales Data with Skewed Distributions
Analyzing sales metrics including revenue, units sold, profit margin, and discount rate. Demonstrates handling of mixed distribution types: right-skewed revenue requiring a log transformation, normally distributed profit margins, and moderately skewed variables requiring a square root transformation.
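A skewness-based recommendation rule of this kind might look like the sketch below. The cutoffs (1.0 for strong skew, 0.5 for moderate skew), the function names, and the "review manually" fallback for left-skewed data are all assumptions for illustration, not the skill's documented behavior. The log transform uses `log1p` so that zero values are handled.

```python
import math


def recommend_transformation(skew, strong=1.0, moderate=0.5):
    """Suggest a transformation from skewness (cutoffs are assumptions)."""
    if skew >= strong:
        return "log"
    if skew >= moderate:
        return "sqrt"
    if skew <= -moderate:
        return "review manually"  # left skew is not covered by this sketch
    return "Z-score standardization if needed"


def apply_transformation(values, name):
    """Apply 'log' or 'sqrt' to non-negative values; otherwise return a copy."""
    if name == "log":
        return [math.log1p(x) for x in values]  # log(1 + x), defined at x = 0
    if name == "sqrt":
        return [math.sqrt(x) for x in values]
    return list(values)


# Toy values for illustration only.
choice = recommend_transformation(0.7)
transformed = apply_transformation([0, 4, 9], choice)
```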