Background: To maximize sample sizes in most real-world data studies using health insurance claims, researchers allow small “administrative” health insurance coverage gaps when defining continuous enrollment. Despite their near-ubiquitous use, no studies have assessed data completeness in the gap period or the impact of different gap thresholds on study outcomes.
Objectives: To characterize enrollment gaps within a large commercial health plan’s claims data and evaluate the impacts of varying gap lengths on sample sizes of a prior study.
Methods: The Aetna Sentinel Common Data Model was used for this study. Three commonly used gap thresholds were characterized: < 32 days, < 63 days, and < 93 days. Prevalence of gaps; prevalence and rate of claims within and surrounding gaps; and time-to-claim within gaps were assessed. As an example, we ran an updated version of a prior study among patients diagnosed with diabetes mellitus (DM) to assess how varying the allowable gap length affected sample sizes.
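As an illustration of the gap logic described in the Methods, the following is a minimal sketch and not the study's actual code. It assumes each patient's enrollment is a list of inclusive (start, end) date pairs; the function names `coverage_gaps` and `bridge_spans`, and the convention that a gap is the count of uncovered days between two spans, are assumptions made for this example.

```python
from datetime import date

def coverage_gaps(spans):
    """Return the length in days of each gap between enrollment spans.

    Assumes spans are inclusive (start, end) date pairs (hypothetical layout).
    """
    spans = sorted(spans)
    gaps = []
    covered_through = spans[0][1]
    for start, end in spans[1:]:
        # Uncovered days strictly between the last covered day and the next start.
        gap_days = (start - covered_through).days - 1
        if gap_days > 0:
            gaps.append(gap_days)
        covered_through = max(covered_through, end)
    return gaps

def bridge_spans(spans, max_gap_days):
    """Merge spans separated by fewer than max_gap_days uncovered days.

    max_gap_days=0 merges only contiguous or overlapping spans (no gap allowed).
    """
    spans = sorted(spans)
    merged = [list(spans[0])]
    for start, end in spans[1:]:
        gap_days = (start - merged[-1][1]).days - 1
        if gap_days <= 0 or gap_days < max_gap_days:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(s) for s in merged]

# Hypothetical patient with a 30-day gap and a 365-day gap.
example = [
    (date(2020, 1, 1), date(2020, 3, 31)),
    (date(2020, 5, 1), date(2020, 12, 31)),
    (date(2022, 1, 1), date(2022, 6, 30)),
]
gap_lengths = coverage_gaps(example)
bridged_32 = bridge_spans(example, 32)
bridged_0 = bridge_spans(example, 0)
```

In this made-up example, the 30-day gap is bridged under the < 32-day threshold but not when no gap is allowed, which is the mechanism by which longer thresholds admit more patients who meet a minimum enrollment requirement.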
Results: There were 23,633,037 eligible patients with ≥1 day of medical coverage between 1/2011 and 7/2022, of whom 8.9% (n=2,103,942) had at least one gap in their coverage. The median gap was 366 days, and gaps of < 32, < 63, and < 93 days represented 7.8% (n=187,404), 13.7% (n=330,79), and 21.3% (n=513,079) of the gaps, respectively. A total of 32,124 (9.7%) patients with gaps < 63 days had at least 1 diagnosis encounter during their coverage gap, with a mean of 13.9 days (SD=13.7, median=10) from the start of the gap to the next visit. The incidence rate of diagnosis visits was 6.0 per 1,000 gap days, vs 26.6 per 1,000 days in patient-matched surrounding non-gap periods. Our real-world use case evaluated 24,898,117 patients and required ≥183 days of pre-index enrollment. The sample size using gaps of 0, < 32, and < 63 days was 45,533, 45,576 (0.1% increase), and 45,613 (0.2% increase) for the Type 1 DM cohort, and 1,510,329, 1,511,896 (0.1% increase), and 1,512,940 (0.2% increase) for the Type 2 DM cohort, respectively.
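The arithmetic behind the reported comparisons can be checked with a short sketch. It uses only figures stated in the Results; the helper name `pct_increase` is hypothetical, and the rate ratio is a derived quantity computed here for illustration rather than a value reported by the study.

```python
def pct_increase(baseline, value):
    """Percent change of value relative to baseline."""
    return 100.0 * (value - baseline) / baseline

# Diagnosis-visit rates per 1,000 days, as reported in the Results.
gap_rate = 6.0
non_gap_rate = 26.6
rate_ratio = gap_rate / non_gap_rate  # gap vs surrounding non-gap periods

# Sample sizes for allowable gaps of 0, < 32, and < 63 days.
t1dm = [45_533, 45_576, 45_613]
t2dm = [1_510_329, 1_511_896, 1_512_940]

t1dm_increase = [round(pct_increase(t1dm[0], n), 1) for n in t1dm[1:]]
t2dm_increase = [round(pct_increase(t2dm[0], n), 1) for n in t2dm[1:]]
```

Under these figures the derived rate ratio is roughly 0.23, and the rounded percent increases match the 0.1% and 0.2% values given for both cohorts.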
Conclusions: Medical encounters were far less frequent in gaps than in continuously enrolled periods, so allowing gaps adds excess at-risk time in time-dependent studies. However, in our real-world use case, the minimal increase in sample size with longer allowable gaps suggests that the potential impact on study results may be marginal. To determine a suitable enrollment gap, researchers should consider the monitoring period, minimum enrollment length, and prevalence of the health outcome of interest. In studies with short enrollment requirements and highly prevalent outcomes, allowing gaps is unlikely to change outcomes, so allowing no gap should be considered despite a slight reduction in sample size.