Data Type Validator & Fixer

Pro v1.0.0

Validate and fix data types in datasets: convert dates stored as strings, numbers with currency symbols, percentage strings, and text booleans to proper types, with error handling.

What You Get

Get properly typed, analysis-ready datasets with detailed validation reports showing what was converted and what failed.

The Problem

Data imported from spreadsheets, CSVs, or external sources often has type issues: dates stored as text, numbers with currency symbols or commas, percentages as strings, and inconsistent missing value representations. These issues cause analysis failures, broken calculations, and visualization errors.

The Solution

Automatically inspect column content patterns to infer correct types. Convert date strings to datetime, remove currency symbols and parse as numeric, transform percentage strings to decimals, map text booleans to proper boolean types, and standardize missing value representations. Generate detailed validation reports showing original vs new types, success/failure counts, and problematic values. Export cleaned datasets ready for analysis.
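The conversions described above could look roughly like the following pandas sketch. The helper names (clean_missing, to_currency, to_percent, to_boolean, to_date) and the token lists are assumptions made for illustration, not part of the tool itself. Unparseable values are coerced to NaN/NaT rather than raising an error.

```python
import pandas as pd

# Assumed token lists for illustration; extend them for the dataset at hand.
MISSING_TOKENS = ["", "N/A", "n/a", "NA", "null", "NULL", "None", "-", "--"]
BOOL_MAP = {"yes": True, "no": False, "true": True, "false": False,
            "y": True, "n": False}


def clean_missing(s: pd.Series) -> pd.Series:
    """Strip whitespace and turn known missing-value tokens into NaN."""
    s = s.astype("object")
    # Keep existing NaN/None as-is; strip whitespace from everything else.
    stripped = s.where(s.isna(), s.astype(str).str.strip())
    return stripped.mask(stripped.isin(MISSING_TOKENS))


def to_currency(s: pd.Series) -> pd.Series:
    """Remove currency symbols and thousands separators, then parse as numbers."""
    cleaned = clean_missing(s).str.replace(r"[$€£,\s]", "", regex=True)
    return pd.to_numeric(cleaned, errors="coerce")


def to_percent(s: pd.Series) -> pd.Series:
    """Convert strings such as '12.5%' to decimal fractions (0.125)."""
    cleaned = clean_missing(s).str.replace("%", "", regex=False)
    return pd.to_numeric(cleaned, errors="coerce") / 100


def to_boolean(s: pd.Series) -> pd.Series:
    """Map text booleans to pandas' nullable boolean dtype; unknowns become NA."""
    cleaned = clean_missing(s).str.lower()
    return cleaned.map(BOOL_MAP).astype("boolean")


def to_date(s: pd.Series, fmt: str) -> pd.Series:
    """Parse dates with an explicit format; values that do not match become NaT."""
    return pd.to_datetime(clean_missing(s), format=fmt, errors="coerce")
```

Passing an explicit format to to_date is a deliberate choice in this sketch: a string such as "03/04/2024" is 4 March under "%m/%d/%Y" and 3 April under "%d/%m/%Y", so the format should be confirmed rather than guessed.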

How It Works

  1. Load the dataset and display current data types with sample values for each column
  2. Identify columns that should be preserved as strings (IDs, ZIP codes, values with leading zeros)
  3. Detect type patterns: dates, currency, percentages, booleans, mixed missing value formats
  4. Confirm ambiguous formats with the user (US vs European dates, columns with leading zeros)
  5. Apply type conversions with error handling, coercing unconvertible values to NaN
  6. Generate a validation report with a conversion summary and a list of problematic values
  7. Export the cleaned dataset and an optional data dictionary documenting all changes
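As a rough illustration of steps 2, 3, and 6, the sketch below guesses a column's pattern from its values and summarises a conversion afterwards. The regular expressions, the 0.8 threshold, and the function names (detect_pattern, conversion_report) are assumptions chosen for this example; a real dataset will likely need its own patterns.

```python
import re
import pandas as pd

# Assumed patterns for illustration only; real data will need tuning.
LEADING_ZERO = re.compile(r"^0\d+$")
PATTERNS = {
    "currency": re.compile(r"^[$€£]\s?-?[\d,]+(\.\d+)?$"),
    "percentage": re.compile(r"^-?\d+(\.\d+)?\s?%$"),
    "boolean": re.compile(r"^(yes|no|true|false|y|n)$", re.IGNORECASE),
    "date": re.compile(r"^\d{1,4}[-/]\d{1,2}[-/]\d{1,4}$"),
}


def detect_pattern(s: pd.Series, threshold: float = 0.8) -> str:
    """Guess a column's intended type from the share of values matching a pattern."""
    values = s.dropna().astype(str).str.strip()
    values = values[values != ""]
    if values.empty:
        return "empty"
    # Any all-digit value with a leading zero suggests an ID or ZIP code.
    if values.map(lambda v: bool(LEADING_ZERO.match(v))).any():
        return "keep_as_string"
    for name, pattern in PATTERNS.items():
        share = values.map(lambda v: bool(pattern.match(v))).mean()
        if share >= threshold:
            return name
    return "unknown"


def conversion_report(original: pd.Series, converted: pd.Series) -> dict:
    """Summarise one column: dtypes, success/failure counts, problem values."""
    # A value counts as failed if it was non-empty before and is NaN afterwards.
    was_present = original.notna() & (original.astype(str).str.strip() != "")
    failed = was_present & converted.isna()
    return {
        "original_dtype": str(original.dtype),
        "new_dtype": str(converted.dtype),
        "converted": int((was_present & converted.notna()).sum()),
        "failed": int(failed.sum()),
        "problem_values": original[failed].unique().tolist(),
    }
```

Note that this report treats missing-value tokens such as "N/A" as failures unless they are normalised to NaN in the original column first, so it is worth standardising missing values before comparing. A "date" result says nothing about day/month order, which is why step 4 asks the user.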

What You'll Need

  • Python with pandas library
  • Dataset file (CSV, Excel) or inline pasted data
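One way to load a file or pasted data without losing information is to read every column as text first, so that leading zeros and tokens like "N/A" survive until the types have been reviewed. The helper name load_raw and the sample data below are made up for illustration.

```python
import io
import pandas as pd


def load_raw(source) -> pd.DataFrame:
    """Read CSV data (a path or file-like object) with every column kept as text."""
    # dtype=str prevents pandas from guessing numeric types;
    # keep_default_na=False leaves tokens such as "N/A" untouched for later review.
    return pd.read_csv(source, dtype=str, keep_default_na=False)


# Hypothetical pasted data, wrapped in StringIO so it can be read like a file.
pasted = """id,zip,amount
001,02134,"$1,200.50"
002,10001,N/A
"""
df = load_raw(io.StringIO(pasted))
```

For Excel files, pd.read_excel also accepts dtype=str, though it needs an Excel engine such as openpyxl installed.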