Data & Infrastructure

Data Quality Score

📖

Definition

A data quality score is a quantitative metric — or set of metrics — that summarizes how well a dataset meets defined standards across multiple quality dimensions. Common dimensions include completeness (what percentage of expected records and fields are present), accuracy (how closely values match the real-world entities they represent), consistency (whether the same fact is recorded identically across systems), timeliness (how current the data is relative to when it was captured), and uniqueness (absence of duplicates). Scores are typically computed by data quality frameworks or observability tools and can be tracked over time, compared across datasets, or used as gatekeeping criteria for pipeline promotion or model training eligibility.

In AI and commerce operations, data quality scores serve as both a diagnostic tool and an accountability mechanism. A model trained on low-quality data will inherit its defects — products with inaccurate attributes will be mis-categorized; customers with incomplete profiles will receive poor recommendations; inventory records with staleness will cause fulfillment failures. By making quality scores visible, persistent, and tied to business outcomes, organizations create the conditions for treating data quality as a continuous engineering discipline rather than a periodic cleanup project. Embedding quality thresholds as automated gates in ML pipelines — refusing to retrain a model if input quality falls below a defined threshold — is an increasingly common practice in mature data organizations.

🔗

AI-Ready DataBig dataCustomer Data Platform (CDP)Data Lineage

Last updated: May 12, 2026

Definition

Related Terms