Background: The use of external control arms (ECAs) has gained significant attention recently due to the increasing availability of regulatory guidance and the potential to accelerate drug development. ECAs are particularly useful in rare diseases, where single-arm trials (SATs) are commonplace and concurrent control groups are often neither feasible nor ethical. For an ECA to be successful, the design and statistical analyses of the data must be carefully planned.
Objectives: To evaluate and describe common statistical approaches for ECAs in rare disease.
Methods: We performed a targeted literature review of statistical methodologies in ECAs in rare disease studies and synthesized our findings to provide an overview of the strengths and limitations of these statistical approaches.
Results: A major concern with the use of ECAs is the potential non-comparability of patients from external controls and SATs. Using real-world data (RWD) to construct an ECA can introduce bias into the treatment comparisons, but this bias can be at least partially mitigated by using appropriate statistical methods.
Propensity score methods adjust for baseline imbalances, to the extent that explanatory factors are available in the data, and estimate average treatment effects. Both propensity score methods and multivariable regression require all confounders to be observed in the dataset, but the former can accommodate a larger number of linear and non-linear relationships than multivariable regression. In addition, regression models that include covariates may extrapolate beyond the observed data even where the populations do not overlap.
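As an illustration of the propensity score approach described above, the following minimal sketch weights external controls so that their baseline covariate distribution resembles that of the SAT patients. Everything in it is an assumption made for illustration: the data are simulated with a single confounder, the propensity model is a plain logistic regression fitted by Newton-Raphson, and the weighting scheme targets the average treatment effect in the treated (SAT) population, which is only one of several possible estimands. The function names are hypothetical and are not taken from any reviewed study.

```python
import numpy as np

def fit_logistic(X, z, n_iter=25, ridge=1e-6):
    """Fit a logistic regression by Newton-Raphson (intercept first)."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        w = p * (1.0 - p)
        # A tiny ridge term keeps the Hessian invertible.
        grad = Xd.T @ (z - p) - ridge * beta
        hess = (Xd * w[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def propensity_scores(X, z):
    """Estimated probability of belonging to the SAT given covariates."""
    beta = fit_logistic(X, z)
    Xd = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xd @ beta))

def att_weights(ps, z):
    """Weight 1 for SAT patients, odds ps / (1 - ps) for external controls."""
    return np.where(z == 1, 1.0, ps / (1.0 - ps))

def weighted_mean_difference(y, z, w):
    """Weighted difference in mean outcome, SAT minus external control."""
    treated = np.average(y[z == 1], weights=w[z == 1])
    control = np.average(y[z == 0], weights=w[z == 0])
    return treated - control

# Simulated example (assumption): one confounder, true effect of 1.0.
rng = np.random.default_rng(0)
n = 2000
age = rng.normal(0.0, 1.0, n)
z = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.8 * age)))  # 1 = SAT, 0 = external
y = 1.0 * z + 2.0 * age + rng.normal(0.0, 1.0, n)

ps = propensity_scores(age.reshape(-1, 1), z)
w = att_weights(ps, z)
naive = y[z == 1].mean() - y[z == 0].mean()
adjusted = weighted_mean_difference(y, z, w)
print(f"naive difference: {naive:.2f}, weighted difference: {adjusted:.2f}")
```

The sample size here is far larger than is typical in rare disease and is chosen only to make the contrast between the naive and weighted estimates visible. In practice, covariate balance and overlap of the propensity score distributions would need to be checked before any weighted estimate is interpreted.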
Bayesian hierarchical modelling is used to restore balance between study arms and weight the contribution of multiple data sources. While it provides a mathematically rigorous methodology for making decisions in intricate scenarios, it can be complex to implement and difficult to interpret, because it requires skill in translating subjective prior beliefs into a mathematically formulated prior.
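To make the idea of weighting multiple data sources concrete, the sketch below uses a deliberately simplified normal-normal hierarchical model. It assumes each external source supplies an estimate with a known standard error, a flat prior on the common mean, and a between-source standard deviation `tau` that is fixed by the analyst; a full Bayesian hierarchical analysis would place a prior on `tau` and would usually be fitted by simulation. The input values and the function name are hypothetical.

```python
from math import sqrt

def pooled_mean(estimates, std_errors, tau):
    """Conditional posterior of the common mean for a fixed tau.

    Each source is weighted by 1 / (se^2 + tau^2), so a larger tau
    (more between-source heterogeneity) flattens the weights and
    widens the posterior.
    """
    weights = [1.0 / (se ** 2 + tau ** 2) for se in std_errors]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, estimates)) / total
    post_sd = sqrt(1.0 / total)
    return mean, post_sd, [w / total for w in weights]

# Hypothetical response-rate estimates from three external sources.
estimates = [0.30, 0.22, 0.35]
std_errors = [0.05, 0.10, 0.08]

for tau in (0.0, 0.05, 0.20):
    mean, post_sd, rel_w = pooled_mean(estimates, std_errors, tau)
    shares = ", ".join(f"{w:.2f}" for w in rel_w)
    print(f"tau={tau:.2f}: mean={mean:.3f}, sd={post_sd:.3f}, weights=[{shares}]")
```

Varying `tau` in this way is one simple means of showing how strongly the pooled result depends on the assumed heterogeneity between sources, which is part of what makes prior specification difficult.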
Finally, performance criteria based on aggregate-level estimands from historical cohorts, or selected based on clinical judgement, can be set as benchmarks and compared to outcomes from the SAT. Patients from the external comparator are not directly compared with the patients in the SAT. The results are easy to interpret, but this approach cannot address differences in patient populations when the benchmark is generated from RWD with small sample sizes.
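A benchmark comparison of this kind can be sketched as a one-sample test of the SAT result against a fixed performance criterion. The example below assumes a binary response endpoint, an exact one-sided binomial test, and a one-sided significance level of 0.025; the responder count, sample size, and benchmark rate are hypothetical numbers chosen for illustration, not values from any reviewed study.

```python
from math import comb

def binom_upper_tail(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0 ** i * (1 - p0) ** (n - i)
               for i in range(k, n + 1))

def exceeds_benchmark(responders, n, benchmark_rate, alpha=0.025):
    """Exact one-sided test that the SAT response rate exceeds the benchmark."""
    p_value = binom_upper_tail(responders, n, benchmark_rate)
    return p_value, p_value < alpha

# Hypothetical SAT: 14 responders out of 30, benchmark response rate 20%.
p_value, success = exceeds_benchmark(responders=14, n=30, benchmark_rate=0.20)
print(f"one-sided p-value: {p_value:.4g}, benchmark exceeded: {success}")
```

The test treats the benchmark as a known constant, so it reflects neither the uncertainty in the benchmark itself nor any difference in patient mix between the historical cohort and the SAT.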
Conclusions: Several statistical approaches can be used to provide informative analyses regarding treatment effects from ECAs in rare diseases. The suitability of these statistical methods warrants a study-by-study assessment, informed by study design, disease characteristics, assessment of outcomes, and type of RWD source.