(234) An iterative phenotype development and evaluation process with medical specialist input: the experience defining progressive multifocal leukoencephalopathy

Sunday, August 27, 2023

8:00 AM - 1:30 PM ADT

Location: Convention Hall

Publication Number: 1225

Presenting Author(s)

Darmendra Ramcharran, PhD, MPH

Sr Director, Safety Innovation and Analytics
GSK, United States

Background: Approaches to developing phenotypes for safety outcomes and diseases in administrative claims and electronic health record databases range from data-driven to clinically derived.

Objectives: To define progressive multifocal leukoencephalopathy (PML) using a new framework to iteratively develop and evaluate phenotypes.

Methods: Data from a large US claims database (2000-2022) were analyzed to define PML. Phenotypes were developed, evaluated, and refined through iterative reviews of interactive patient-level profiles by a neurologist. The profiles included the type and timing of treatments, location of care, diagnoses, occurrence of death, procedures, and tests relative to the time of the PML diagnosis code(s). Two definitions were sought, one sensitive and one specific (chosen from several candidates).
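As a rough illustration of what a patient-level profile aligned to the PML diagnosis might look like, the following Python sketch orders a patient's events by day offset from an index date. It is not the tool used in this work. The `Event` type, the `relative_timeline` function, and the event categories are hypothetical names introduced here for illustration only.

```python
from datetime import date
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    """One record in a hypothetical patient-level profile."""
    when: date
    kind: str    # e.g., "diagnosis", "treatment", "procedure", "test", "death"
    detail: str  # free-text label for the reviewer


def relative_timeline(events: List[Event], index_date: date) -> List[Tuple[int, str, str]]:
    """Return (day_offset, kind, detail) tuples sorted chronologically.

    day_offset is the number of days from index_date (assumed here to be
    the first PML diagnosis date); negative values fall before it.
    """
    return sorted(((e.when - index_date).days, e.kind, e.detail) for e in events)
```

A reviewer-facing display could then group the sorted tuples by `kind`, so that treatments, imaging, and diagnoses appear on a common time axis.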

Results: The sensitive phenotype consisted of at least one PML diagnosis, 14 months of continuous prior enrolment, and use of an in-scope immunosuppressive treatment 38 to 15 months before the diagnosis. The specific phenotype had the same criteria as the sensitive phenotype, but also required at least one brain CT or MRI within one month of the diagnosis or at least one of the following: at least two PML diagnoses within 3 months, last prescription of an in-scope immunosuppressive treatment within 90 days before or 30 days after the diagnosis, death or inferred death 90 days after diagnosis, hospitalization with the diagnosis, or at least one PML diagnosis by a neurologist, internist/general internist, or infectious disease specialist. After applying these criteria, 118 and 89 patients were identified using the sensitive and specific definitions, respectively. The iterative reviews of patient-level profiles enabled empirical validation and minimized sample size loss for the specific definition by including patients whose clinical and treatment characteristics were consistent with clinical PML presentations. In contrast, the more restrictive alternative candidates for the specific phenotype (e.g., requiring brain imaging, n=35) resulted in substantial sample size loss.
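The two definitions above can be read as rules applied to each patient. The Python sketch below is one possible encoding, not the authors' implementation. The `PatientProfile` fields and function names are hypothetical. Several conventions are assumptions: a month is treated as 30 days, the index date is taken as the first PML diagnosis, the specialist criterion is reduced to a precomputed flag, and the death criterion is interpreted as death within 90 days after diagnosis. The time windows are function parameters whose defaults follow the values stated in the text.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple

DAYS_PER_MONTH = 30  # assumption: the abstract does not say how months are counted


@dataclass
class PatientProfile:
    """Hypothetical per-patient summary derived from claims data."""
    pml_dx_dates: List[date]
    enrolled_since: date                 # start of continuous enrolment
    treatment_dates: List[date]          # in-scope immunosuppressive prescriptions
    imaging_dates: List[date] = field(default_factory=list)  # brain CT or MRI
    death_date: Optional[date] = None    # observed or inferred death
    hospitalized_with_dx: bool = False
    dx_by_listed_specialist: bool = False


def meets_sensitive(
    p: PatientProfile,
    min_enrol_days: int = 14 * DAYS_PER_MONTH,
    treatment_window_days: Tuple[int, int] = (38 * DAYS_PER_MONTH, 15 * DAYS_PER_MONTH),
) -> bool:
    """At least one PML diagnosis, prior enrolment, and treatment in the window."""
    if not p.pml_dx_dates:
        return False
    index = min(p.pml_dx_dates)  # assumption: index date = first PML diagnosis
    if (index - p.enrolled_since).days < min_enrol_days:
        return False
    far, near = treatment_window_days
    return any(near <= (index - t).days <= far for t in p.treatment_dates)


def meets_specific(p: PatientProfile) -> bool:
    """Sensitive criteria plus imaging or at least one supporting criterion."""
    if not meets_sensitive(p):
        return False
    index = min(p.pml_dx_dates)
    dx = sorted(p.pml_dx_dates)
    imaging = any(abs((d - index).days) <= DAYS_PER_MONTH for d in p.imaging_dates)
    repeat_dx = any((b - a).days <= 3 * DAYS_PER_MONTH for a, b in zip(dx, dx[1:]))
    last_rx_offset = (max(p.treatment_dates) - index).days
    recent_rx = -90 <= last_rx_offset <= 30
    # Assumption: "90 days after diagnosis" is read as within 90 days after.
    died = p.death_date is not None and 0 <= (p.death_date - index).days <= 90
    return (imaging or repeat_dx or recent_rx or died
            or p.hospitalized_with_dx or p.dx_by_listed_specialist)
```

Keeping the windows as parameters reflects the iterative nature of the process: a candidate definition can be re-run with different thresholds and the resulting cohort sizes compared before another round of profile review.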

Conclusions: Our new framework developed and evaluated phenotypes using iterative reviews of patient-level profiles, which resulted in a novel specific PML phenotype. This iterative review process may be used to create other (including non-PML) phenotypes, as well as to revalidate phenotypes over time and/or assess whether phenotypes developed and validated in one data source are transportable to others. Further research may examine how this framework could be expanded to electronic health record databases with unstructured data and laboratory test results, and may identify areas where advanced analytics could be employed.