Data Quality
Quality Score Calculation
Quality scores (0.0 to 1.0) are calculated based on:
Quality Weights:
MANDATORY fields: 2.0 weight
NESTED datasets: 1.5 weight
OPTIONAL fields: 1.0 weight
Quality Values:
COMPLETE (1.0): Field present and valid
NOT_COMPLETE (0.0): Field missing but allowed
ERROR (-1.0): Field invalid or violates rules
Formula:
Quality = (Sum of weighted values) / (Total weight)
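As a minimal sketch of the formula above, the weights and quality values can be modeled as enums and combined per record. The names FieldKind, QualityValue, and quality_score are hypothetical and not part of any documented API. The sketch applies the formula literally, so ERROR values can pull the raw result below 0.0; the source does not say whether the result is clamped to the 0.0 to 1.0 range, and no clamping is applied here.

```python
from enum import Enum


class FieldKind(Enum):
    """Weights from the "Quality Weights" list (names are illustrative)."""
    MANDATORY = 2.0
    NESTED = 1.5
    OPTIONAL = 1.0


class QualityValue(Enum):
    """Values from the "Quality Values" list (names are illustrative)."""
    COMPLETE = 1.0
    NOT_COMPLETE = 0.0
    ERROR = -1.0


def quality_score(fields):
    """Return (sum of weight * value) / (sum of weights).

    `fields` is an iterable of (FieldKind, QualityValue) pairs.
    An empty record returns 0.0; that choice is an assumption.
    """
    fields = list(fields)
    total_weight = sum(kind.value for kind, _ in fields)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(kind.value * value.value for kind, value in fields)
    return weighted_sum / total_weight


# One mandatory field present, one nested dataset complete,
# one optional field missing: (2.0 + 1.5 + 0.0) / 4.5
record = [
    (FieldKind.MANDATORY, QualityValue.COMPLETE),
    (FieldKind.NESTED, QualityValue.COMPLETE),
    (FieldKind.OPTIONAL, QualityValue.NOT_COMPLETE),
]
score = quality_score(record)
print(round(score, 3))
```

Because mandatory fields carry twice the weight of optional ones, an invalid mandatory field lowers the score more than a missing optional field does.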
Quality Facts
The quality_facts metadata field tracks specific issues:
Examples:
"Column email is mandatory but value is invalid"
"Column phone is optional but value is not present"
"Column address is nested but value is incomplete"
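The example messages above share one pattern: column name, field kind, then the problem found. The following sketch shows how such messages might be assembled. The helpers quality_fact and collect_quality_facts are hypothetical, and the actual mechanism that populates quality_facts is not described in the source.

```python
def quality_fact(column, kind, problem):
    """Format one message in the pattern used by the examples above."""
    return f"Column {column} is {kind} but value is {problem}"


def collect_quality_facts(checks):
    """Build messages for every check that reported a problem.

    `checks` is an iterable of (column, kind, problem) tuples, where
    `problem` is None when the field passed (an assumed convention).
    """
    return [
        quality_fact(column, kind, problem)
        for column, kind, problem in checks
        if problem is not None
    ]


checks = [
    ("email", "mandatory", "invalid"),
    ("phone", "optional", "not present"),
    ("name", "mandatory", None),  # passed, so no fact is recorded
    ("address", "nested", "incomplete"),
]
facts = collect_quality_facts(checks)
for fact in facts:
    print(fact)
```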
Improving Quality
Complete mandatory fields - Ensure required data is present
Validate data formats - Use correct formats for email, phone, dates
Fill nested structures - Complete hierarchical data
Fix validation errors - Address issues in the _errors metadata