Identifying pediatric patients for an external control arm for a set of rare disease trials using multiple administrative healthcare data sources: lessons from the road
Background: Rare diseases pose unique challenges to clinical trial implementation, frequently necessitating an external control arm. As a result, the FDA recently released a draft guidance on the use of externally controlled trials. We present a case study of the use of multiple real-world healthcare data sources to identify patients qualifying for an external control arm to contextualize outcomes from two single-arm trials for a rare pediatric cancer.
Objectives: To share practical lessons from an effort to use contemporaneous, deidentified administrative healthcare data to contextualize treatment safety and effectiveness results from two single-arm trials in a rare pediatric cancer.
Methods: The study was conducted using two U.S. administrative healthcare databases with access to deidentified medical records: (1) an insurance claims database covering >50 million patients, through which researchers can request records directly from providers, and (2) an electronic health records (EHR) database covering >110 million patients. A screening algorithm was applied to structured data elements in each database to identify potentially eligible patients from the same calendar time period as the trials. Medical records were then abstracted to confirm patient eligibility, with EHR review expedited through natural language processing. Careful consideration was given to aligning variable definitions between the trials and the healthcare data.
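As an illustration only, a structured-data screen of the kind described above might be sketched as follows. The abstract does not report the actual algorithm, so every specific here is an assumption: the record fields, the placeholder diagnosis-code prefix `DX1`, the calendar window, and the under-18 age cutoff are hypothetical stand-ins, not the study's criteria.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DiagnosisRecord:
    """One structured diagnosis row (hypothetical schema)."""
    patient_id: str
    code: str              # diagnosis code as recorded in claims or EHR data
    diagnosis_date: date
    birth_date: date


# All values below are placeholders, not the study's actual criteria.
TARGET_CODE_PREFIXES = ("DX1",)      # hypothetical code prefix for the indication
WINDOW_START = date(2000, 1, 1)      # hypothetical start of the trial period
WINDOW_END = date(2000, 12, 31)      # hypothetical end of the trial period
MAX_AGE_YEARS = 18                   # assumed pediatric cutoff (exclusive)


def age_in_years(birth: date, on: date) -> int:
    """Completed years of age on a given date."""
    had_birthday = (on.month, on.day) >= (birth.month, birth.day)
    return on.year - birth.year - (0 if had_birthday else 1)


def screen(records: list[DiagnosisRecord]) -> list[str]:
    """Return sorted IDs of patients with at least one qualifying record."""
    eligible = set()
    for r in records:
        in_window = WINDOW_START <= r.diagnosis_date <= WINDOW_END
        is_target = r.code.startswith(TARGET_CODE_PREFIXES)
        is_pediatric = age_in_years(r.birth_date, r.diagnosis_date) < MAX_AGE_YEARS
        if in_window and is_target and is_pediatric:
            eligible.add(r.patient_id)
    return sorted(eligible)
```

A screen like this only produces candidates. In the study, eligibility still had to be confirmed through medical record abstraction.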
Results: Across the two data sources, we initially identified 134 potentially eligible patients using readily available structured data. Of the 74 patients with available medical records, <5 qualified for the study, and the study did not proceed. We identified three general, and often interrelated, classes of challenges: (1) substantive challenges related to the disease studied (e.g., lack of genotype-specific subclasses in ICD-9-CM compared with ICD-10-CM; rarity of the disease), (2) methodologic challenges (e.g., trial data not including all confounders; outcome data not always available in observational data), and (3) administrative challenges (e.g., legal and ethical implications arising from the rarity of the disease and the pediatric population; unifying medical record abstraction approaches between EHR and non-EHR records).
Conclusions: While the methodologic and administrative challenges were surmountable, the substantive challenges prevented the study from moving forward because too few patients qualified. Because structured data alone were inadequate to identify patients with the indication of interest, considerable effort was spent using unstructured data to verify the sample size. Improved capture of rare and complex diseases in structured data would reduce the time and cost of finding eligible patients for rare disease trials.