New Report from QUANTUM Reveals Fragmented Data Quality Landscape
A new landscape analysis report, published by the EU-funded QUANTUM project, provides the first comprehensive mapping of data quality (DQ) assessment practices across European health data holders. The study combines a systematic review of 66 open-source data quality tools with survey responses from 27 institutions across 13 European countries, offering critical insights for the European Health Data Space (EHDS).
QUANTUM (Quality, Utility and Maturity Measured) is an EU-funded project with 36 partners, including CESSDA. Its goal is to create a common label system for Europe that assesses and communicates the quality and utility of datasets for scientific and health innovation purposes. These labels will help researchers, policymakers, and healthcare professionals quickly identify trustworthy, high-quality data for decision-making and research.
The report uncovers a highly diverse landscape. While most institutions (72-79%) use tools that allow custom metrics and result exports, there is no single standard approach. Programming languages like R and Python dominate in-house data quality assessments, alongside a mix of commercial products, manual procedures, and open-source solutions. Notably, 53% of identified DQ tools are tailored for specific data types (such as omics or imaging), while 47% are general-purpose. The authors recommend a non-invasive, self-assessment labelling tool that works with existing workflows rather than against them.
“The wide variety of tools and practices uncovered emphasizes the need for an agnostic, flexible approach to data quality labelling for the EHDS. This suggests that a self-assessment tool with a common set of metrics would be the most feasible solution, as part of the ongoing effort to build a standardized framework for the future European Health Data Space,” conclude the authors.