Distribution Analyzer

Pro v1.0.0

Automated distribution analysis for numeric dataset variables with statistics, visualizations, and transformation recommendations.

What You Get

Automate hours of exploratory data analysis work. Get comprehensive distribution insights, normality tests, and actionable transformation recommendations in seconds.

The Problem

Data analysts and scientists spend significant time manually analyzing distributions across multiple variables before modeling. This includes calculating statistics, creating visualizations, running normality tests, and choosing appropriate transformations, all of which is repetitive work that delays the actual analysis.

The Solution

Run a single command to analyze all numeric variables in your dataset. The script calculates comprehensive statistics (mean, median, std, quartiles, skewness, kurtosis), classifies distribution shapes, runs appropriate normality tests (Shapiro-Wilk or Kolmogorov-Smirnov based on sample size), recommends transformations, and optionally generates a visualization grid with histograms and box plots.
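
As a rough illustration of the statistics listed above, the sketch below computes them for a single column using only Python's standard library. It is not the script's actual code: the function name `summarize` is hypothetical, and the moment-based skewness and excess-kurtosis formulas are one common choice that the tool may or may not use.

```python
import statistics


def summarize(values):
    """Summary statistics for one numeric column (hypothetical helper)."""
    n = len(values)
    mean = statistics.fmean(values)
    # quantiles() with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)

    # Population central moments, used for moment-based shape measures.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n

    if m2 == 0:
        # A constant column has no defined skewness or kurtosis.
        skewness = excess_kurtosis = float("nan")
    else:
        skewness = m3 / m2 ** 1.5
        excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution

    return {
        "count": n,
        "mean": mean,
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
        "q1": q1,
        "q3": q3,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }
```

Calling `summarize([1, 2, 3, 4, 5])` gives a skewness of zero because the values are symmetric around the mean. At least two values are required, since both `stdev` and `quantiles` raise an error on shorter input.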

How It Works

  1. Load CSV or Excel file and identify all numeric columns for analysis
  2. Calculate comprehensive summary statistics including central tendency, spread, and shape measures
  3. Classify distribution shapes based on skewness and kurtosis thresholds
  4. Run appropriate normality tests (Shapiro-Wilk for n < 5000, K-S for larger samples)
  5. Generate transformation recommendations based on distribution characteristics
  6. Optionally create visualization grid with histograms (KDE overlay) and box plots
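
Steps 3 to 5 could look roughly like the sketch below, which assumes NumPy and SciPy are installed. Only the n < 5000 cutoff comes from the description above. The function names, the skewness and kurtosis thresholds, the 0.05 significance level, and the recommendation rules are all hypothetical placeholders, not the script's actual values.

```python
import numpy as np
from scipy import stats

SHAPIRO_MAX_N = 5000  # from the step list: Shapiro-Wilk for n < 5000


def classify_shape(skewness, excess_kurtosis, skew_cut=0.5, kurt_cut=1.0):
    """Label a distribution from its shape measures (assumed thresholds)."""
    if skewness > skew_cut:
        return "right-skewed"
    if skewness < -skew_cut:
        return "left-skewed"
    if excess_kurtosis > kurt_cut:
        return "heavy-tailed"
    if excess_kurtosis < -kurt_cut:
        return "light-tailed"
    return "approximately symmetric"


def normality_test(values, alpha=0.05):
    """Pick a normality test by sample size and run it."""
    x = np.asarray(values, dtype=float)
    x = x[~np.isnan(x)]  # ignore missing values
    if x.size < SHAPIRO_MAX_N:
        name = "shapiro-wilk"
        statistic, p_value = stats.shapiro(x)
    else:
        name = "kolmogorov-smirnov"
        # Standardize, then compare against N(0, 1). Because the mean and
        # std are estimated from the same data, the p-value is approximate.
        z = (x - x.mean()) / x.std(ddof=1)
        statistic, p_value = stats.kstest(z, "norm")
    return {
        "test": name,
        "statistic": float(statistic),
        "p_value": float(p_value),
        "looks_normal": bool(p_value > alpha),
    }


def recommend_transformation(shape, minimum):
    """Map a shape label to a transformation (illustrative rules only)."""
    if shape == "right-skewed":
        if minimum > 0:
            return "log"
        if minimum == 0:
            return "log1p"
        return "yeo-johnson"  # handles zero and negative values
    if shape == "left-skewed":
        return "yeo-johnson"
    if shape == "heavy-tailed":
        return "winsorize or robust scaling"
    return "none"
```

Switching tests by sample size reflects a known limitation: SciPy warns that the Shapiro-Wilk p-value may be inaccurate for samples above 5000, so a different test is used for large columns.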

What You'll Need

  • CSV or Excel file with numeric data
  • Python 3.9+ (auto-handled by uv)
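
For a sense of what "numeric data" means here, the sketch below shows one way an input file might be loaded and filtered to its numeric columns. It assumes pandas is installed; the function name `load_numeric_columns` and the accepted file extensions are hypothetical, and the actual script may handle input differently.

```python
from pathlib import Path

import pandas as pd


def load_numeric_columns(path):
    """Load a CSV or Excel file and keep only its numeric columns."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        frame = pd.read_csv(path)
    elif suffix in {".xlsx", ".xls"}:
        # read_excel needs an Excel engine such as openpyxl to be installed.
        frame = pd.read_excel(path)
    else:
        raise ValueError(f"Unsupported file type: {suffix}")
    # Integer and float columns are kept; text and date columns are dropped.
    return frame.select_dtypes(include="number")
```

Columns that store numbers as text (for example, values with currency symbols) would be skipped by this approach and would need cleaning first.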