Background: Use of linked clinical and genomic real-world data (cgRWD) can be powerful for hypothesis generation and validation. Given the typical time delay in clinical practice between diagnosis and genomic testing, cgRWD by design are subject to left truncation as only patients who lived long enough to be sequenced are included in the cohort.
When calculating overall survival (OS), left truncation can lead to systematic overestimation of survival if it is not accounted for by risk-set adjustment (RA) [Tsai et al, 1987]. RA relies on the assumption that delayed entry and event times are independent in the observable part of the distribution. However, the timing of the biopsy often reflects a worsening clinical course and is therefore correlated with survival, which violates the independence assumption and can bias the resulting analysis [Kehl et al, 2020; Backenroth et al, 2022]. As such, the independence assumption should be assessed whenever RA is applied to endpoints in cgRWD.
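As an illustration of the risk-set adjustment described above, the following minimal sketch computes a Kaplan-Meier curve with and without RA. It is not the authors' implementation: the function name `km_curve` and the toy data are hypothetical, and the code assumes every patient enters the cohort strictly before their death or censoring time. With RA, a patient counts as at risk at time t only if they had already entered the cohort (entry < t) and were still under observation (t <= time).

```python
def km_curve(entry, time, event, risk_set_adjust=True):
    """Kaplan-Meier estimate as a list of (event_time, survival) pairs.

    entry: time from diagnosis to cohort entry (e.g. genomic test)
    time:  time from diagnosis to death or censoring
    event: 1 if death was observed, 0 if censored
    Assumes entry[i] < time[i] for every patient (hypothetical toy setup).
    """
    event_times = sorted({t for t, e in zip(time, event) if e == 1})
    surv, curve = 1.0, []
    for t in event_times:
        deaths = sum(1 for ti, e in zip(time, event) if ti == t and e == 1)
        if risk_set_adjust:
            # Only patients who have already entered the cohort are at risk.
            at_risk = sum(1 for a, ti in zip(entry, time) if a < t <= ti)
        else:
            # Naive risk set: ignores delayed entry entirely.
            at_risk = sum(1 for ti in time if ti >= t)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical toy data (months since diagnosis), for illustration only.
entry = [0.5, 1, 2, 4, 6, 8]
time = [3, 5, 7, 9, 12, 15]
event = [1, 1, 1, 0, 1, 1]

adjusted = km_curve(entry, time, event, risk_set_adjust=True)
naive = km_curve(entry, time, event, risk_set_adjust=False)
```

Because the adjusted risk set is always a subset of the naive one, the adjusted survival estimate can never exceed the naive estimate at any event time, which is the direction of the overestimation described above. The adjustment removes this bias only under the independence assumption.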
Objectives: The objectives were threefold: 1) to assess standard methods for handling delayed cohort entry, 2) to introduce a framework for evaluating informative left truncation, and 3) to apply diagnostics for informative left truncation to two cgRWD cohorts and evaluate the results.
Methods: Quantitative assessments of independence based on conditional Kendall tau and Cox proportional hazards models were validated in a simulation study. Resampling experiments and data visualizations were implemented to assess a potential dependency and, if present, its direction and effect. All assessments were then applied to two cgRWD oncology cohorts.
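To make the conditional Kendall tau assessment concrete, the sketch below computes a point estimate only. It is a simplified illustration under stated assumptions, not the authors' code: the function name `conditional_kendall_tau` is hypothetical, pairs with tied follow-up times are skipped, and no variance or p-value is computed. A pair of patients is treated as comparable when both had entered the cohort before the earlier of their two follow-up times, and that earlier time is an observed death rather than a censoring time.

```python
import math


def conditional_kendall_tau(entry, time, event):
    """Point estimate of conditional Kendall tau between entry and survival time.

    Returns math.nan when no pair is comparable.
    Simplification (assumption): pairs with tied follow-up times are skipped.
    """
    score, comparable = 0, 0
    n = len(time)
    for i in range(n):
        for j in range(i + 1, n):
            if time[i] == time[j]:
                continue  # ordering of survival times is ambiguous
            shorter = i if time[i] < time[j] else j
            # Both patients must have entered before the earlier follow-up time.
            if max(entry[i], entry[j]) > time[shorter]:
                continue
            # The earlier time must be an observed death to order the pair.
            if event[shorter] != 1:
                continue
            comparable += 1
            prod = (entry[i] - entry[j]) * (time[i] - time[j])
            score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return score / comparable if comparable else math.nan
```

A value near zero is consistent with independence between entry and survival times in the observable region, whereas a positive value indicates that later cohort entry tends to go with longer survival.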
Results: The simulation study showed that conditional Kendall tau and Cox proportional hazards models can confirm independence when the data are truly independent, but do not reflect the magnitude of the dependency when the data are dependent.
For both cgRWD cohorts, delayed entry due to biopsy timing had a strong influence on the OS endpoint (median survival without vs. with RA: 20.6 vs. 12.7 months for cohort 1 and 45.1 vs. 22.6 months for cohort 2). Our assessments confirmed independent delayed entry for cohort 1, indicating that OS is unbiased after RA. For cohort 2, our assessments showed an increased survival probability for patients with later cohort entry; interpretation of results should therefore consider that bias remains after RA.
Conclusions: Assessment of OS in cgRWD can be heavily biased by delayed cohort entry, but this bias can be corrected with RA. Visual and numerical assessments of dependent truncation are needed to determine whether, and to what approximate magnitude, the resulting analysis may still be biased.