Vice President, Research, OM1, Inc.; Tufts University School of Medicine, Boston, United States
Background: Validated measures of disease activity and symptom severity are important tools for monitoring disease progression and patient outcomes over time. Although these measures are routinely recorded in clinical trials, they are often missing or recorded at inconsistent intervals in real-world data (RWD) sources.
Objectives: To assess the feasibility of developing machine learning models to estimate scores for validated measures in four chronic conditions using clinical notes from RWD sources.
Methods: Machine learning methods were used to develop estimation models for the Systemic Lupus Erythematosus Disease Activity Index (SLEDAI) for systemic lupus erythematosus; the Expanded Disability Status Scale (EDSS) for multiple sclerosis; the Patient Health Questionnaire-9 (PHQ-9) for depression; and the New York Heart Association (NYHA) classification for heart failure. RWD sources were used to create training and validation cohorts for each model. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), calculated after binarizing each outcome as low versus high at a clinically meaningful threshold.
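As an illustration of the evaluation step described in the Methods, the following is a minimal sketch of computing an AUC against a binarized outcome. It is not the authors' implementation: the function names, the example scores, and the cutoff of 10 are hypothetical, and treating scores at or above the cutoff as "high" is an assumption. The AUC is computed directly as the proportion of (high, low) patient pairs in which the high patient receives the larger estimated score, with ties counted as half.

```python
def binarize(scores, threshold):
    """Label each collected score as high (1) if it is at or above the threshold, else low (0)."""
    return [1 if s >= threshold else 0 for s in scores]


def auc(labels, estimates):
    """AUC as the fraction of (high, low) pairs ranked correctly by the estimates.

    A pair counts 1 when the high case has the larger estimate and 0.5 on a tie.
    """
    high = [e for l, e in zip(labels, estimates) if l == 1]
    low = [e for l, e in zip(labels, estimates) if l == 0]
    if not high or not low:
        raise ValueError("AUC needs at least one low and one high case")
    wins = sum(
        1.0 if h > lo_est else 0.5 if h == lo_est else 0.0
        for h in high
        for lo_est in low
    )
    return wins / (len(high) * len(low))


# Hypothetical data: clinician-recorded scores and model-estimated scores
# for eight patients. None of these values come from the study.
collected = [2, 4, 12, 7, 15, 9, 3, 11]
estimated = [3.1, 5.0, 10.2, 9.5, 14.0, 8.0, 2.5, 7.5]

THRESHOLD = 10  # illustrative cutoff only; the source does not state its thresholds

labels = binarize(collected, THRESHOLD)
result = auc(labels, estimated)
print(f"AUC = {result:.3f}")
```

In this toy example, 13 of the 15 (high, low) pairs are ordered correctly, so the AUC is 13/15. Because the AUC depends only on how the estimates rank patients, the choice of binarization threshold on the collected score changes which patients count as high, not how the estimates themselves are scaled.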
Results: The estimated scores produced by the machine learning models were highly correlated with collected disease severity scores. The AUC was 0.91 for the SLEDAI model, 0.91 for the EDSS model, 0.81 for the PHQ-9 model, and 0.85 for the NYHA model. Developing the models across a range of chronic conditions yielded two findings that are informative for future efforts to build machine learning estimation models. First, in each condition area, clinicians documented important information about the range of symptoms, symptom severity, disease progression, and medication needs in the clinical notes. This information was sufficient for the models to generate estimated scores, even for patient-reported measures such as the PHQ-9. Second, model features were reviewed for clinical relevance but were not restricted to features that approximated items on the validated instrument. This allowed the models to consider other features, such as medication needs, which improved model performance.
Conclusions: The successful development of estimation models for four different chronic conditions suggests that this approach may be useful in other clinical areas. Applying the models to RWD sources may increase the utility of these data sources for research on disease progression, treatment response, and patient outcomes.