Insights

A Vision for Incrementally Improving Health Care Data Quality

Background and Problem Statement

As a consequence of recent progress in interoperability and healthcare data exchange, the data quality problem is no longer isolated to local organizations and businesses; it is systemic and more evident than ever. It negatively affects patient care, patient safety, public health, clinical research, insurance claims processing, and the administrative cost of care. Likewise, the potential benefits of AI will not be realized in healthcare unless the quality of clinical and claims data is significantly improved. Patient data quality must be improved at its point of collection as well as when it is exchanged between organizations.

In August 2024, a group of public and private sector leaders met to discuss this topic, recommend a strategy to address this systemic problem in U.S. healthcare, and raise awareness and seek support at the federal level. This group has emphasized the need for a strategic approach and incremental enhancement, focusing on practical implementation and future progress. This includes leveraging existing EHR technology to standardize API pipelines, implementing data quality control actions, and strengthening feedback loops to ensure two-way exchanges. The framework aims to address the most prevalent and highest-risk data quality deficiencies (i.e., errors, incomplete or missing data critical to quality assessment) by employing a risk-based approach to identify and prioritize these deficiencies. By targeting these critical areas first, the initiative seeks to create a robust foundation for ongoing improvements in data quality.

Healthcare data quality is affected at three levels. Each level has its own unique data quality problems. In addition, the levels are interdependent; data quality problems start at Level 1 and cascade upward through Levels 2 and 3. While we recognize the steps below won’t completely solve the problems associated with data quality, we believe the initial steps outlined will go a long way toward improving data quality for patients, providers, and payers in the years ahead.

The group focused on improving data quality using this three-level framework and is seeking industry commitments in the following areas.

Level 1: Atomic Data

These are specific data points that describe a patient’s vital signs, diagnoses, lab tests and results, medications, allergies, gender, race, ethnicity, immunizations, genome, and more. These data points are typically recorded in electronic health records, public health case reports, insurance claims, and research repositories using healthcare-specific standard terminologies such as SNOMED CT, LOINC, RxNorm, and ICD.

Data quality problems at Level 1: The “standard” terminologies are not human-friendly and their flexible structure allows for creating thousands of data synonyms. For example, there are at least 20 different ways to describe and code “systolic blood pressure,” and 47 different ways to code “COVID positive test result.” These thousands of synonyms have a cascading effect on data models and algorithmic computing errors. In addition to this synonym problem, other data quality problems at this level include data format errors, missing data, duplicate records, and data entry errors in unconstrained fields where the entered data is not checked for required entry or validated against master references.

Examples of validations at this level might include:

  • Data type validation: Ensuring the data provided matches the type of data that is expected. For example, the data in a field should be numeric, and field lengths should be appropriate for the data type being captured.
  • Data format validation: Ensuring the data provided is consistent with an expected data format (e.g., ICD-10 codes must begin with a letter, LOINC code max length of 7, etc.).
  • Data reasonableness validation: Ensuring the data provided matches a set of predefined accepted patterns where industry consensus has already been settled (e.g., USCDI) or investigating “outliers” (e.g., values that are outside the range of what is reasonable, like 01/01/1000 as a date of birth). 
  • Data completeness: Measuring nulls in fields where data is expected.
  • Duplication: Detecting and minimizing duplicate records in submitted data.
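The validations above can be sketched in code. This is a minimal, hypothetical example; the field names and the exact rules are illustrative, not drawn from any specific implementation guide:

```python
import re
from datetime import date

def validate_record(record: dict) -> list[str]:
    """Return a list of Level 1 data quality issues found in one record.

    Field names are hypothetical; a real implementation would validate
    against the applicable standard (e.g., USCDI data elements).
    """
    issues = []

    # Data type validation: a vital-sign field must parse as a number.
    try:
        float(record.get("systolic_bp", ""))
    except (TypeError, ValueError):
        issues.append("systolic_bp is not numeric")

    # Data format validation: ICD-10 codes begin with a letter;
    # LOINC codes have a max length of 7.
    icd10 = record.get("diagnosis_code", "")
    if not re.fullmatch(r"[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?", icd10):
        issues.append(f"diagnosis_code {icd10!r} is not a valid ICD-10 format")
    loinc = record.get("lab_code", "")
    if len(loinc) > 7:
        issues.append(f"lab_code {loinc!r} exceeds LOINC max length of 7")

    # Data reasonableness validation: date of birth in a plausible range.
    dob = record.get("date_of_birth")
    if dob is None or not (date(1900, 1, 1) <= dob <= date.today()):
        issues.append("date_of_birth is missing or implausible")

    # Data completeness: nulls where data is expected.
    for field in ("patient_id", "systolic_bp", "diagnosis_code"):
        if not record.get(field):
            issues.append(f"{field} is missing")

    return issues
```

Duplication checks operate across records rather than within one, so they would typically run as a separate pass (e.g., grouping on patient identifiers and encounter timestamps).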

Opportunities for Level 1 industry commitments 

  • We believe all providers should commit to implementing and exchanging clinical data using the USCDI v1 data model and HL7 FHIR US Core 4.0.0.
    • Within 12 months, we believe at least 80% of all contracts, procurements, and vendor agreements should include this provision.  
  • We believe all providers should commit to implementing and exchanging clinical data using USCDI v3 data model and HL7 FHIR US Core 6.1.0 in advance of 1/1/2026 (as required by HTI-1).
    • Within 24 months, we believe at least 80% of all contracts, procurements, and vendor agreements should include this provision. 
  • We believe health plans should commit to implementing and exchanging claims data using HL7 FHIR CARIN IG for Blue Button STU 2.0 with at least 80% of their trading partners within 12 months. 

Level 2: Data Models and Schemas

This level represents healthcare-specific, standard data models and schemas in which atomic data points are stored for exchange and later retrieval and analysis. Examples of data models and schemas in this layer include OMOP, USCDI, PCORnet, FHIR, HL7, CDISC, and CDA.

Data quality problems at Level 2: Exchanging data between healthcare entities that use different data models and schemas introduces translation and mapping errors. Also, because the majority of EHRs and other data collection systems were designed with proprietary data models years before or independent of the “standard” data models and schemas listed above, mapping errors between the data models are inherently introduced at the origin of the patient’s data lifecycle, when data is first collected.
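A minimal sketch of how a model-to-model translation step might surface mapping gaps rather than silently dropping or mistranslating data. The proprietary field names and the mapping table are hypothetical:

```python
# Hypothetical sketch: translating a proprietary field layout into a
# standard schema while logging unmapped fields, so translation gaps are
# fed back to mapping maintainers instead of being lost.

PROPRIETARY_TO_STANDARD = {
    "sex": "gender",
    "dx_code": "diagnosis_code",
    "bp_sys": "systolic_bp",
}

def translate(record: dict) -> tuple[dict, list[str]]:
    """Map a proprietary record to the standard schema.

    Returns the translated record plus a list of fields with no mapping.
    """
    translated, unmapped = {}, []
    for field, value in record.items():
        target = PROPRIETARY_TO_STANDARD.get(field)
        if target is None:
            unmapped.append(field)  # surface the gap; do not guess
        else:
            translated[target] = value
    return translated, unmapped
```

The design choice worth noting is the explicit `unmapped` return: mapping errors introduced at the origin of the data lifecycle are far easier to correct when they are reported at translation time than when they surface later as analytic discrepancies.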

Opportunities for Level 2 industry commitments 

  • We would encourage both providers and payers to validate their conformance using ONC’s Inferno test kit (https://inferno.healthit.gov/) and work together over the next 12 months to develop a standard open testing methodology.
  • In addition, we would recommend that providers and payers prove their conformance using tactical, open source scorecards, usable by all stakeholders, that measure both data validity and completeness for specific data domains.

Level 3: Algorithms

Algorithms, including AI models, are used to analyze the atomic data in Level 1, which is stored in the data models in Level 2. The output accuracy and quality of these algorithms can be no better than the quality of the data and data models they depend on; in other words, garbage in, garbage out. Examples of algorithms at this level include those used to calculate and submit quality measures, identify and manage patients in clinical trials, monitor patient safety, identify and manage public health cases and outbreaks, process insurance claims, and more.

Data quality problems at Level 3: There are two sources of inaccuracies at this level: (1) programming errors and (2) the cascading impact of poor data quality in Level 1 combined with inconsistent use of data models in Level 2. Quite often, tasking three data analysts to independently produce the same report results in three different outputs. The differences can come from programming errors but, more often than not, are attributable to data quality problems in Level 1 combined with data model mapping problems in Level 2.

This level of data quality involves validating whether the data is suitable for its intended use (other terms used by members of the group included the “use case” or “algorithm” level) and is “fit for purpose.” Validations at this level might include:

  • Use Case Specific Validation: Is the data provided complete enough for a valid HEDIS submission?
  • Statistical Validation: Is the dataset representative and robust enough for statistical analysis of the topic being researched? 
  • Fitness of Data for Replication of Existing Study Designs: Is the data provided complete enough for it to be used to replicate existing observational research design (e.g., OHDSI)?
  • Evaluation of Surveillance and Disease Detection Efforts: Is the dataset representative and robust enough for use in disease surveillance for epidemiological and population health use cases? Additionally, can it be leveraged for infectious disease, pandemic surveillance and detection, and adverse event surveillance, detection, and reporting?
  • Suitability and Fitness for Measure Development: Is the dataset representative and robust enough to support research on measure development and validation?
  • Suitability and Fitness for Evidence Generation and 21st Century Cures Act Regulatory Uses: Is the dataset representative and robust enough for real world evidence generation for regulatory use cases?
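A use-case-specific fitness check of the kind described above can be sketched as a per-field completeness report against a required threshold. The field names and the 90% threshold here are illustrative assumptions, not drawn from any HEDIS or regulatory specification:

```python
# Hypothetical sketch: is this dataset complete enough for a given use
# case? Computes per-field completeness rates and compares each against
# a required threshold. Fields and threshold are illustrative only.

def fitness_report(records: list[dict], required_fields: list[str],
                   threshold: float = 0.90) -> dict:
    """Return per-field completeness rates and an overall fitness flag."""
    n = len(records)
    rates = {
        field: sum(1 for r in records if r.get(field) not in (None, "")) / n
        for field in required_fields
    }
    return {
        "rates": rates,
        "fit_for_use": all(rate >= threshold for rate in rates.values()),
    }
```

Each use case in the list above would supply its own required fields and thresholds; the same report structure can then feed a shared, open scorecard.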

Opportunities for Level 3 industry commitments

  • Within 12 months, we believe providers and payers should commit to submitting clinical data using USCDI v1/HL7 FHIR US Core 4.0.0 and claims data using the HL7 FHIR CARIN IG for Blue Button STU 2.0 to test these data quality frameworks and/or scorecards with digital HEDIS content as the use case.
    • In conjunction with this commitment, NCQA has agreed to evolve their products and services (e.g., DAV) which currently rely on more manual and burdensome approaches to data quality such as primary source validation.
  • We believe the industry should commit to implementing the recommendations made by SNOMED CT and LOINC regarding their 2022 collaboration.

More details will be forthcoming at the 2024 ASTP/ONC Annual Meeting in Washington, D.C. If you would like more information on this initiative or have comments on this approach, please contact [email protected].