Survey Data with Mixed Missing Values
Handles survey response data with multiple missing-value representations (N/A, n/a), percentage strings, and mixed date formats (YYYY-MM-DD and MM/DD/YYYY). Shows how the skill standardizes missing values to NaN and converts percentages to decimals.
Input
What you provide to the skill
check data types in this survey data:
ID,Score,Completion,Response_Rate
R001,4,2024-01-10,85%
R002,N/A,2024-01-11,n/a
R003,5,01/12/2024,90%
Output
What the skill generates for you
============================================================
INITIAL INSPECTION
============================================================
Current data types:
ID object
Score float64
Completion object
Response_Rate object
dtype: object
Sample values:
     ID  Score  Completion Response_Rate
0  R001    4.0  2024-01-10           85%
1  R002    NaN  2024-01-11           NaN
2  R003    5.0  01/12/2024           90%
============================================================
STANDARDIZING MISSING VALUES
============================================================
============================================================
TYPE VALIDATION REPORT
============================================================
1. Converting 'Score' column:
Original type: object
New type: float64
Successful: 2 | Failed: 1
Note: 1 values could not be converted (now NaN)
2. Converting 'Completion' column:
Original type: object
Detected formats: YYYY-MM-DD, MM/DD/YYYY
Using: format='mixed' for auto-detection
New type: datetime64[ns]
Successful: 3 | Failed: 0
3. Converting 'Response_Rate' column:
Original type: object (percentage strings)
New type: float64
Transformations: removed %, divided by 100
Successful: 2 | Failed: 1
============================================================
FINAL DATA TYPES
============================================================
ID object
Score float64
Completion datetime64[ns]
Response_Rate float64
dtype: object
Cleaned data preview:
     ID  Score  Completion  Response_Rate
0  R001    4.0  2024-01-10           0.85
1  R002    NaN  2024-01-11            NaN
2  R003    5.0  2024-01-12           0.90
============================================================
DATA SUMMARY
============================================================
Total records: 3
Date range: 2024-01-10 00:00:00 to 2024-01-12 00:00:00
Score: Mean=4.50, Range=[4-5]
Response Rate: Mean=87.5%
Exported: /tmp/survey_data_cleaned.csv
Cleaned CSV:
ID,Score,Completion,Response_Rate
R001,4.0,2024-01-10,0.85
R002,,2024-01-11,
R003,5.0,2024-01-12,0.9
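The conversions reported above can be approximated in a few lines of pandas. This is a minimal sketch, not the skill's actual code: it assumes pandas 2.0 or later (needed for format="mixed"), and the variable names RAW, df, and rate are illustrative.

```python
import io

import pandas as pd

# The survey input from this example, inlined so the sketch is self-contained.
RAW = """ID,Score,Completion,Response_Rate
R001,4,2024-01-10,85%
R002,N/A,2024-01-11,n/a
R003,5,01/12/2024,90%
"""

# read_csv treats both "N/A" and "n/a" as missing by default,
# so the two spellings load as NaN without extra configuration.
df = pd.read_csv(io.StringIO(RAW))

# Score: coerce anything non-numeric to NaN instead of raising.
df["Score"] = pd.to_numeric(df["Score"], errors="coerce")

# Completion: format="mixed" infers the format of each value individually,
# which handles YYYY-MM-DD and MM/DD/YYYY in the same column.
df["Completion"] = pd.to_datetime(df["Completion"], format="mixed")

# Response_Rate: strip the trailing "%" and scale to a 0-1 fraction.
rate = df["Response_Rate"].str.rstrip("%")
df["Response_Rate"] = pd.to_numeric(rate, errors="coerce") / 100

print(df.dtypes)
print(df)
```

Note that format="mixed" reads an ambiguous value such as 01/12/2024 month-first unless dayfirst=True is passed, so it is worth confirming which convention the source data uses before relying on auto-detection.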
About This Skill
Validate and fix data types in datasets: convert dates stored as strings, numbers with currency symbols, percentages, and booleans to proper types, with error handling.
More Examples
European Number Format Conversion
Demonstrates handling of European number formats where periods are thousands separators and commas are decimal separators (1.234,56 = 1234.56). Also shows ID column detection and preservation logic for columns containing 'ID' in the name.
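As a rough illustration of the European-format rule described here, the sketch below drops the period thousands separators and then swaps the decimal comma for a period. The function names parse_european_number and is_id_column are hypothetical and are not taken from the skill; the ID check simply mirrors the "contains 'ID' in the name" description.

```python
def parse_european_number(text: str) -> float:
    """Convert a string like '1.234,56' to the float 1234.56."""
    # Remove thousands separators first, then turn the decimal comma into a period.
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return float(cleaned)


def is_id_column(name: str) -> bool:
    """Treat any column whose name contains 'ID' as an identifier to leave untouched."""
    return "ID" in name


print(parse_european_number("1.234,56"))
print(is_id_column("Customer_ID"))
```

The order of the two replacements matters: swapping the comma first would leave two periods in the string and the thousands separator could no longer be told apart from the decimal point.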
Sales Data with Currency and Dates
Validates a simple sales dataset with date strings in MM/DD/YYYY format, currency values with dollar signs, and Yes/No boolean text. Demonstrates core type conversions: dates to datetime, currency to numeric, and text booleans to proper bool type.
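The three core conversions named here can be sketched with the standard library alone. The helper names and the sample values in the usage lines are illustrative assumptions, not data or code from the skill.

```python
from datetime import datetime


def parse_currency(text: str) -> float:
    """Strip a leading dollar sign and thousands commas, e.g. '$1,234.50'."""
    return float(text.strip().replace("$", "").replace(",", ""))


def parse_yes_no(text: str) -> bool:
    """Map Yes/No text (case-insensitive) to a bool; other values raise KeyError."""
    return {"yes": True, "no": False}[text.strip().lower()]


def parse_us_date(text: str):
    """Parse an MM/DD/YYYY string into a date object."""
    return datetime.strptime(text.strip(), "%m/%d/%Y").date()


# Illustrative values only.
print(parse_currency("$1,234.50"))
print(parse_yes_no("Yes"))
print(parse_us_date("01/15/2024"))
```

Raising on an unrecognized boolean value is a deliberate choice in this sketch; a pipeline that prefers to continue could catch the error and record the value as missing instead.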