Data Quality Validator
Validates CSV and Excel files before analysis. Detects missing values, duplicates, outliers, and format errors. Generates a quality score, severity-classified issues, and actionable remediation recommendations.
What You Get
Catch data quality issues before analysis with automated validation and severity-classified remediation guidance.
The Problem
The Solution
How It Works
1. Request data access via file path, pasted sample, or dataset description with context about intended use
2. Load and profile the dataset using pandas to understand structure, data types, and basic statistics
3. Run comprehensive validation checks detecting missing values, duplicates, outliers, format errors, and range violations
4. Classify all findings by severity level (Critical/High/Medium/Low) and calculate overall quality score from 0-100
5. Generate text-based distribution summaries with quartile analysis and completeness bars for each column
6. Compile comprehensive quality report with executive summary, detailed issues by severity, and column statistics
7. Provide prioritized action recommendations ordered by urgency and offer re-validation after corrections
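The profiling, checking, scoring, and completeness-bar steps above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the penalty weights in `SEVERITY_PENALTY`, the 20% missing-value threshold, the 1.5 × IQR outlier rule, and the function names are all assumptions, since the page does not publish its scoring formula.

```python
import io

import pandas as pd

# Hypothetical penalty per finding; the real scoring weights are not documented.
SEVERITY_PENALTY = {"Critical": 25, "High": 10, "Medium": 5, "Low": 1}


def find_issues(df: pd.DataFrame) -> list[dict]:
    """Run a few basic checks and return severity-tagged findings."""
    issues = []
    # Missing values, reported per column.
    for col, n_missing in df.isna().sum().items():
        if n_missing:
            share = n_missing / len(df)
            severity = "High" if share > 0.2 else "Medium"  # assumed threshold
            issues.append({"check": "missing", "column": col,
                           "count": int(n_missing), "severity": severity})
    # Fully duplicated rows.
    n_dupes = int(df.duplicated().sum())
    if n_dupes:
        issues.append({"check": "duplicate_rows", "column": None,
                       "count": n_dupes, "severity": "High"})
    # Outliers by the 1.5 * IQR rule on numeric columns (assumed rule).
    for col in df.select_dtypes("number").columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        spread = 1.5 * (q3 - q1)
        mask = (df[col] < q1 - spread) | (df[col] > q3 + spread)
        if mask.any():
            issues.append({"check": "outlier", "column": col,
                           "count": int(mask.sum()), "severity": "Low"})
    return issues


def quality_score(issues: list[dict]) -> int:
    """Start at 100 and subtract a penalty per finding, floored at 0."""
    penalty = sum(SEVERITY_PENALTY[i["severity"]] for i in issues)
    return max(0, 100 - penalty)


def completeness_bar(series: pd.Series, width: int = 20) -> str:
    """Render the share of non-missing values as a text bar."""
    filled = int(round(float(series.notna().mean()) * width))
    return "#" * filled + "." * (width - filled)


# Made-up sample: one duplicated row, one missing amount, two extreme amounts.
sample = io.StringIO(
    "order_id,amount\n"
    "1,10\n2,12\n3,11\n4,13\n5,12\n6,11\n7,500\n7,500\n8,\n"
)
df = pd.read_csv(sample)
issues = find_issues(df)
score = quality_score(issues)
for issue in issues:
    print(issue["severity"], issue["check"], issue["column"], issue["count"])
print("score:", score, "| amount completeness:", completeness_bar(df["amount"]))
```

A subtractive score keeps the 0-100 range easy to explain, but any weighting scheme is a judgment call; a real report would likely also scale penalties by the share of rows affected.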
What You'll Need
- A CSV or Excel file of up to 500 MB and 1 million rows
- Python environment with pandas, numpy, and scipy libraries
- Context about intended use of the data (analysis, migration, reporting, audit)
- Optional: Custom business rules for domain-specific validation
- Optional: Columns that should be unique identifiers (e.g., order_id, patient_mrn)
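As an illustration of the two optional inputs, the sketch below shows one way unique-identifier columns and custom business rules could be expressed and checked with pandas. The function names, the rule format (a callable returning a boolean Series), and the sample `age` rule are hypothetical; the skill's own input format is not described on this page.

```python
import pandas as pd


def check_unique_ids(df, id_columns):
    """Return, per identifier column, the values that occur more than once."""
    repeated = {}
    for col in id_columns:
        values = df[col].dropna()
        dupes = values[values.duplicated(keep=False)].unique().tolist()
        if dupes:
            repeated[col] = dupes
    return repeated


def apply_rules(df, rules):
    """Count rows that violate each named rule.

    Each rule is a function that takes the DataFrame and returns a boolean
    Series that is True where the row passes.
    """
    return {name: int((~rule(df)).sum()) for name, rule in rules.items()}


# Made-up sample data; patient_mrn is the identifier column from the list above.
patients = pd.DataFrame({
    "patient_mrn": ["A1", "A2", "A2", "A3"],
    "age": [34, 129, 51, -2],
})
# Hypothetical domain rule: ages must fall between 0 and 120 inclusive.
rules = {"age_between_0_and_120": lambda d: d["age"].between(0, 120)}

repeated = check_unique_ids(patients, ["patient_mrn"])
violations = apply_rules(patients, rules)
print(repeated)
print(violations)
```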
Get This Skill
Requires Pro subscription ($9/month)
Examples
Clean Customer Master Data - CRM Migration Ready
Validates a customer master file with excellent data quality. Demonstrates handling of a clean dataset that scores 100/100, with positive confirmation messaging and a migration approval process. Illustrates full completeness, valid formats, and no critical issues.
Inventory Data with Stock and Date Issues
Validates inventory management data containing negative stock levels, out-of-stock conditions, missing quantities, and future restock dates. Demonstrates range validation capabilities and business-context remediation recommendations for operations scenarios requiring immediate action on stock discrepancies.
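The four conditions in this example could be detected along these lines. This is only a sketch: the column names (`sku`, `stock_qty`, `restock_date`), the sample rows, and the fixed reference date are assumptions made for illustration.

```python
import pandas as pd


def check_inventory(df, today):
    """Flag the four conditions named above; column names are assumed."""
    qty = df["stock_qty"]
    # Unparseable dates become NaT and are never flagged as "future".
    restock = pd.to_datetime(df["restock_date"], errors="coerce")
    return {
        "negative_stock": df.loc[qty < 0, "sku"].tolist(),
        "out_of_stock": df.loc[qty == 0, "sku"].tolist(),
        "missing_quantity": df.loc[qty.isna(), "sku"].tolist(),
        "future_restock_date": df.loc[restock > pd.Timestamp(today), "sku"].tolist(),
    }


# Made-up sample rows, one per condition plus one clean row.
inventory = pd.DataFrame({
    "sku": ["A", "B", "C", "D"],
    "stock_qty": [12, -3, 0, None],
    "restock_date": ["2024-01-10", "2024-02-01", "2031-05-01", "2024-03-15"],
})
findings = check_inventory(inventory, today="2024-06-01")
for name, skus in findings.items():
    print(name, skus)
```

Passing the reference date in as a parameter, rather than reading the system clock, keeps the check reproducible when the file is re-validated after corrections.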
Sales Data with Revenue Reporting Issues
Validates sales transaction data containing critical issues including negative amounts, duplicate order IDs, missing values, and invalid email formats. Demonstrates severity classification, revenue impact quantification, and prioritized remediation recommendations for business reporting scenarios.
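A rough sketch of the checks in this example follows. The column names, the sample rows, and the deliberately loose email pattern are assumptions, and "revenue impact" is interpreted here as the amount counted more than once because of repeated order IDs, which may differ from how the skill quantifies it.

```python
import pandas as pd

# Deliberately simple pattern (something@something.tld); not a full RFC 5322 check.
EMAIL_PATTERN = r"[^@\s]+@[^@\s]+\.[^@\s]+"


def check_sales(df):
    """Count the issue types named above; column names are assumed."""
    negative = df["amount"] < 0
    # Every occurrence after the first of a given order_id counts as a duplicate.
    dup_ids = df["order_id"].duplicated(keep="first")
    present = df["email"].notna()
    bad_email = present & ~df["email"].fillna("").str.fullmatch(EMAIL_PATTERN)
    return {
        "negative_amounts": int(negative.sum()),
        "duplicate_order_ids": int(dup_ids.sum()),
        "missing_emails": int((~present).sum()),
        "invalid_emails": int(bad_email.sum()),
        # Revenue that would be counted twice if duplicates were left in place.
        "revenue_double_counted": float(df.loc[dup_ids, "amount"].sum()),
    }


# Made-up sample transactions.
sales = pd.DataFrame({
    "order_id": [1001, 1002, 1002, 1003],
    "amount": [250.0, 80.0, 80.0, -40.0],
    "email": ["a@example.com", "b@example", "b@example", None],
})
report = check_sales(sales)
for name, value in report.items():
    print(name, value)
```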