All examples for Data Type Validator & Fixer

Survey Data with Mixed Missing Values

This example cleans survey response data that contains several spellings of a missing value (N/A, n/a), percentage strings such as 85%, and two date formats (YYYY-MM-DD and MM/DD/YYYY). It shows the skill standardizing missing values to NaN, converting percentages to decimal fractions (85% becomes 0.85), and normalizing every date to YYYY-MM-DD.
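The two conversions described above can be sketched in pandas as follows. This is a minimal illustration, not the skill's actual code: the token list MISSING_TOKENS and the helpers standardize_missing and percent_to_decimal are hypothetical names introduced here, and the CSV is read as raw strings so that every conversion is explicit.

```python
import io

import numpy as np
import pandas as pd

# The sample rows from this example.
RAW = """ID,Score,Completion,Response_Rate
R001,4,2024-01-10,85%
R002,N/A,2024-01-11,n/a
R003,5,01/12/2024,90%
"""

# Hypothetical token list (compared case-insensitively); extend it for your data.
MISSING_TOKENS = {"n/a", "na", "null", "none", ""}


def standardize_missing(series: pd.Series) -> pd.Series:
    """Replace any spelling of a missing-value token with NaN."""
    is_missing = series.str.strip().str.lower().isin(MISSING_TOKENS)
    return series.mask(is_missing, np.nan)


def percent_to_decimal(series: pd.Series) -> pd.Series:
    """Turn strings such as '85%' into 0.85; unparseable values become NaN."""
    stripped = series.str.strip().str.rstrip("%")
    return pd.to_numeric(stripped, errors="coerce") / 100


# Read everything as text and disable pandas' built-in NA detection,
# so the missing-value handling below is the only one in play.
df = pd.read_csv(io.StringIO(RAW), dtype=str, keep_default_na=False)

df["Score"] = pd.to_numeric(standardize_missing(df["Score"]), errors="coerce")
df["Response_Rate"] = percent_to_decimal(standardize_missing(df["Response_Rate"]))

print(df[["ID", "Score", "Response_Rate"]])
```

Using errors="coerce" means a value that still cannot be parsed becomes NaN instead of raising, which is what makes a "Successful / Failed" count like the one in the report below possible.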

Input

What you provide to the skill

check data types in this survey data:
ID,Score,Completion,Response_Rate
R001,4,2024-01-10,85%
R002,N/A,2024-01-11,n/a
R003,5,01/12/2024,90%

Output

What the skill generates for you

============================================================
INITIAL INSPECTION
============================================================
Current data types:
ID object
Score float64
Completion object
Response_Rate object
dtype: object
Sample values:
     ID  Score  Completion  Response_Rate
0  R001    4.0  2024-01-10            85%
1  R002    NaN  2024-01-11            NaN
2  R003    5.0  01/12/2024            90%
============================================================
STANDARDIZING MISSING VALUES
============================================================
============================================================
TYPE VALIDATION REPORT
============================================================
1. Converting 'Score' column:
Original type: object
New type: float64
Successful: 2 | Failed: 1
Note: 1 values could not be converted (now NaN)
2. Converting 'Completion' column:
Original type: object
Detected formats: YYYY-MM-DD, MM/DD/YYYY
Using: format='mixed' for auto-detection
New type: datetime64[ns]
Successful: 3 | Failed: 0
3. Converting 'Response_Rate' column:
Original type: object (percentage strings)
New type: float64
Transformations: removed %, divided by 100
Successful: 2 | Failed: 1
============================================================
FINAL DATA TYPES
============================================================
ID object
Score float64
Completion datetime64[ns]
Response_Rate float64
dtype: object
Cleaned data preview:
     ID  Score  Completion  Response_Rate
0  R001    4.0  2024-01-10           0.85
1  R002    NaN  2024-01-11            NaN
2  R003    5.0  2024-01-12           0.90
============================================================
DATA SUMMARY
============================================================
Total records: 3
Date range: 2024-01-10 00:00:00 to 2024-01-12 00:00:00
Score: Mean=4.50, Range=[4-5]
Response Rate: Mean=87.5%
Exported: /tmp/survey_data_cleaned.csv

Cleaned CSV:

ID,Score,Completion,Response_Rate
R001,4.0,2024-01-10,0.85
R002,,2024-01-11,
R003,5.0,2024-01-12,0.9
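
The report above parses the Completion column with format='mixed', an option that pandas.to_datetime accepts from pandas 2.0 onward. As a hedged alternative that also runs on older pandas and avoids guessing between day-first and month-first dates, the sketch below tries an explicit list of formats in order. CANDIDATE_FORMATS and parse_mixed_dates are hypothetical names used only for illustration; this is not the skill's own implementation.

```python
import pandas as pd

# Completion values from the example input, in their original mixed formats.
completion = pd.Series(["2024-01-10", "2024-01-11", "01/12/2024"])

# Hypothetical format list, tried in order; the first format that parses
# a given value wins.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y"]


def parse_mixed_dates(series: pd.Series) -> pd.Series:
    """Parse each value with the first matching format; the rest stay NaT."""
    parsed = pd.to_datetime(series, format=CANDIDATE_FORMATS[0], errors="coerce")
    for fmt in CANDIDATE_FORMATS[1:]:
        attempt = pd.to_datetime(series, format=fmt, errors="coerce")
        # Only fill positions that earlier formats failed to parse.
        parsed = parsed.fillna(attempt)
    return parsed


dates = parse_mixed_dates(completion)
print(dates.dt.strftime("%Y-%m-%d").tolist())
# ['2024-01-10', '2024-01-11', '2024-01-12']
print("Failed:", int(dates.isna().sum()))
# Failed: 0
```

Listing the formats explicitly trades convenience for predictability: a value such as 01/12/2024 is always read as January 12, matching the MM/DD/YYYY format named in the report.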