Associate Director of Pharmacoepidemiology, Regeneron Pharmaceuticals, Chapel Hill, United States
Background: When conducting comparative effectiveness or safety studies using propensity score (PS) matching, exact matching on key covariates is often required. Recent applied papers have used different approaches to exactly balance key covariates between study arms, but it is not yet established how these approaches perform.
Objectives: To compare stratified PS matching (STPSM) with a two-stage PS matching (2SPSM) approach in terms of covariate balance, covariate distribution, matched sample size, and bias.
Methods: A dataset of 10,000 individuals was simulated. Variables included a binary treatment, a binary key covariate, two additional covariates (one binary, one continuous), and a continuous outcome. By design, all covariates were both confounders and effect measure modifiers. Two matching approaches from the applied literature were compared. In 2SPSM, 1:4 exact matching on the key covariate was followed by 1:1 PS matching on the remaining covariates within the exact-matched cohort. In STPSM, 1:1 PS matching on the two additional covariates was performed within strata of the key covariate. STPSM was conducted without calipers, while 2SPSM was conducted both without a caliper and with calipers on the standardized PS ranging from 0.1 to 1.
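The two matching procedures described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the data-generating coefficients, the greedy nearest-neighbour matcher, the random selection of up to four controls per treated individual in the first stage, and the reading of the caliper as a multiple of the standard deviation of the logit PS are all hypothetical choices made for the example. The function names (`simulate`, `fit_ps`, `match_1to1`, `stpsm`, `two_stage`, `effect`) are likewise invented here.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n=10_000):
    """Hypothetical data-generating process; all coefficients are illustrative."""
    key = rng.binomial(1, 0.3, n)   # binary key covariate
    xb = rng.binomial(1, 0.5, n)    # binary additional covariate
    xc = rng.normal(0, 1, n)        # continuous additional covariate
    logit = -2.0 + 1.0 * key + 0.8 * xb + 0.6 * xc
    t = rng.binomial(1, 1 / (1 + np.exp(-logit)))
    # Covariates act as confounders and as effect measure modifiers.
    y = 0.5 * t + key + xb + xc + 0.3 * t * (key + xb + xc) + rng.normal(0, 1, n)
    return key, np.column_stack([xb, xc]), t, y

def fit_ps(X, t, iters=25):
    """Logistic regression by Newton's method; returns the logit of the PS."""
    A = np.column_stack([np.ones(len(t)), X])
    beta = np.zeros(A.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-A @ beta))
        W = p * (1 - p)
        H = A.T @ (A * W[:, None]) + 1e-8 * np.eye(A.shape[1])
        beta += np.linalg.solve(H, A.T @ (t - p))
    return A @ beta

def match_1to1(score, t, idx, caliper=None):
    """Greedy 1:1 nearest-neighbour matching without replacement.
    `caliper` is in SD units of `score` (an assumption); `idx` maps to original rows."""
    avail = t == 0
    width = np.inf if caliper is None else caliper * score.std()
    pairs = []
    for i in np.flatnonzero(t == 1):
        if not avail.any():
            break
        d = np.where(avail, np.abs(score - score[i]), np.inf)
        j = int(np.argmin(d))
        if d[j] <= width:
            avail[j] = False
            pairs.append((int(idx[i]), int(idx[j])))
    return pairs

def stpsm(key, X, t):
    """Stratified PS matching: fit the PS and match 1:1 within each key stratum."""
    pairs = []
    for k in np.unique(key):
        idx = np.flatnonzero(key == k)
        pairs += match_1to1(fit_ps(X[idx], t[idx]), t[idx], idx)
    return pairs

def two_stage(key, X, t, ratio=4, caliper=None):
    """Stage 1: keep up to `ratio` controls per treated within each key level.
    Stage 2: 1:1 PS matching on the remaining covariates in that cohort."""
    keep = []
    for k in np.unique(key):
        tr = np.flatnonzero((key == k) & (t == 1))
        co = np.flatnonzero((key == k) & (t == 0))
        n_co = min(len(co), ratio * len(tr))
        keep += list(tr) + list(rng.choice(co, n_co, replace=False))
    idx = np.array(sorted(keep))
    return match_1to1(fit_ps(X[idx], t[idx]), t[idx], idx, caliper)

def effect(pairs, y):
    """Mean within-pair outcome difference (treated minus control)."""
    p = np.array(pairs)
    return float(np.mean(y[p[:, 0]] - y[p[:, 1]]))

key, X, t, y = simulate()
st_pairs = stpsm(key, X, t)
ts_pairs = two_stage(key, X, t, caliper=0.5)
print(effect(st_pairs, y), effect(ts_pairs, y))
```

In this sketch, STPSM guarantees that each matched pair shares the same key covariate value, whereas the two-stage approach only equalizes the key covariate at the cohort level before the PS match. The numbers it prints depend entirely on the illustrative coefficients and are not comparable to the estimates reported in this abstract.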
Results: The true risk difference in the simulated data was 0.68. STPSM achieved adequate balance on all covariates, with similar covariate distributions in the matched and target treated populations, and yielded an estimate of 0.62. With no caliper, 2SPSM achieved adequate balance on the key covariate but poor balance on the two non-key covariates, with an estimate of 0.09. The 2SPSM estimates with calipers ranged from 0.44 to 1.20. Covariate balance was achieved with the most restrictive caliper (0.1), but the non-key covariate distributions differed from those of the target treated population. Bias was negligible with a caliper of 0.7; however, the sample size was reduced by 25%, covariate balance was not achieved for the non-key covariates, and their distributions differed from those of the target treated population.
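Covariate balance and similarity to the target treated population, as discussed above, are commonly summarized with the standardized mean difference (SMD). The abstract does not state which balance metric or threshold was used, so the sketch below is an assumption: it uses the pooled-standard-deviation form of the SMD, and the helper name `smd` is hypothetical.

```python
import numpy as np

def smd(a, b):
    """Standardized mean difference between two samples, using the pooled SD."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

# Toy values for illustration only (not study data).
matched_treated = [2.0, 3.0, 4.0]
matched_control = [1.0, 2.0, 3.0]
balance = smd(matched_treated, matched_control)
```

The same function can serve two purposes: comparing matched treated with matched controls (balance), and comparing matched treated with all treated individuals (similarity to the target population). An absolute SMD below 0.1 is a frequently used convention for adequate balance, though whether that threshold applies here is not stated in the source.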
Conclusions: Although it was theoretically possible to obtain unbiased estimates using either STPSM or 2SPSM in this simulation, 2SPSM required precise knowledge of the optimal caliper, that is, the caliper at which the bias from residual covariate imbalance offsets the bias from differences in covariate distributions between the matched sample and the target treated population. Absent such knowledge, these results suggest that STPSM may be preferred when exact balance on key covariates is required in PS-matched analyses. Further studies are needed to assess these approaches under additional causal and population structures.