Chief Scientific Officer, IQVIA RWS IQVIA UNC-Chapel Hill Chapel Hill, United States
Background: As efforts to assess the regulatory acceptability of real-world data (RWD) submissions increase, the number of guidelines on RWD quality considerations has grown. Globally, these guidelines align in emphasizing the need for transparency in RWD quality assessments; they focus on the “what” and leave room for the “how”. The Kahn data quality framework harmonizes assessment methods across standards, defining three assessment categories (conformance, completeness, plausibility) within two contexts (verification, validation). Standardized tools can operationalize RWD guidance in a transparent, reproducible manner.
Objectives: (1) Translate the Kahn framework into actionable business rules and thresholds defined in accordance with regulatory guidance; (2) deploy the resulting metrics through an interactive OMOP-compatible dashboard, enabling determination of whether data are fit for purpose for generating real-world evidence.
Methods: We developed a data quality dashboard that operationalizes Kahn’s framework by defining business rules, constraints, and thresholds that measure the conformance, completeness, and plausibility of data at the field, table, and concept levels. The dashboard’s assessment of patient ethnicity within an electronic medical record (EMR) OMOP data asset is provided as an example. This work was conducted in partnership with the Observational Health Data Sciences and Informatics (OHDSI) community.
Results: A total of 3,312 data quality checks were deployed within the dashboard at the field, table, and concept levels. Thresholds ranged from 0% to 100% and defined the allowable proportion of unexpected values that determines a pass/fail result. For example, capture of “ethnicity” in this EMR data asset returned 12 field-level checks across the conformance and completeness categories of the Kahn framework. Eleven (92%) passed within allowable thresholds; 1 (8%) failed because of mismatched field-level values between tables where high concordance was expected.
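The threshold logic described above can be illustrated with a minimal Python sketch. It is not the dashboard's implementation: the names `CheckResult`, `evaluate_check`, and `summarize`, the check names, the row counts, and the 5% threshold are all hypothetical, and the rule that a check passes when the percentage of unexpected values is at or below its threshold is an assumption. Only the 11-of-12 pass count is taken from the ethnicity example, and it is used solely to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    category: str        # Kahn category: conformance, completeness, or plausibility
    pct_violated: float  # percentage of rows with unexpected values
    threshold: float     # maximum allowable percentage of unexpected values
    passed: bool


def evaluate_check(name, category, num_violated, num_rows, threshold_pct):
    """Compare the share of unexpected values against a threshold.

    Assumption: a check passes when the violation percentage is at or
    below the threshold; the dashboard's exact rule is not stated in the text.
    """
    pct = 100.0 * num_violated / num_rows if num_rows else 0.0
    return CheckResult(name, category, pct, threshold_pct, pct <= threshold_pct)


def summarize(results):
    """Count passes and failures and report each as a rounded percentage."""
    total = len(results)
    passed = sum(r.passed for r in results)
    failed = total - passed
    return {
        "total": total,
        "passed": passed,
        "failed": failed,
        "pct_passed": round(100.0 * passed / total) if total else 0,
        "pct_failed": round(100.0 * failed / total) if total else 0,
    }


# Hypothetical data: 11 checks within threshold and 1 cross-table
# concordance check that exceeds it (counts are invented for illustration).
results = [
    evaluate_check(f"ethnicity_check_{i}", "completeness", 2, 1000, 5.0)
    for i in range(11)
]
results.append(
    evaluate_check("ethnicity_cross_table_match", "conformance", 120, 1000, 5.0)
)
summary = summarize(results)
print(summary)
```

Keeping each check as a small record with its own threshold lets the same evaluation run unchanged at the field, table, or concept level; only the counts and threshold passed in differ.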
Conclusions: Implementing these guidelines within a data quality dashboard enables researchers and regulators to evaluate and communicate the quality of a dataset rapidly and consistently, using a shared vocabulary. Real-time results facilitate transparent assessment of whether data are fit for purpose.