PhD Student, University of North Carolina at Chapel Hill, United States
Background: Often, the first step in conducting pharmacoepidemiologic research is understanding patterns of treatment use in a target population. We can utilize Sankey diagrams to visualize complex treatment switching and discontinuation. However, longitudinal treatment data, especially in administrative claims, can be subject to extensive (and potentially informative) right censoring.
Objectives: To develop a method for visualizing treatment patterns with Sankey diagrams in data subject to right censoring.
Methods: We used logistic regression to estimate the probability that an individual is uncensored and in a particular treatment state, conditional on the individual’s prior treatment state. We estimated these transitions at each time point under the Markov assumption that the probabilities of being (1) uncensored and (2) in a particular treatment state are independent of all prior treatment states other than the immediately preceding one. Assuming that transitions are independent of each other, we obtained the size of each longitudinal flow through all time points by multiplying the weighted transition probabilities along the flow and then multiplying by the total population size. We conducted 1,000 simulations of a population of 12,000 individuals initiating osteoporosis treatment following a fracture, with their one-year treatment patterns evaluated at 0, 6, and 12 months post-initiation. We induced loss to follow-up in this population by assigning each individual a time to censoring drawn from an exponential distribution based on their treatment state at each time point. If the subsequent time point occurred before their censoring time, we reassigned their time to censoring based on their treatment state at that time point. We investigated one differential censoring scenario, in which censoring depended on prior treatment, and two non-differential censoring scenarios. Finally, we compared the Sankey diagram with complete follow-up to both a complete-case Sankey diagram and a Sankey diagram estimated using the above methods from the cohort with induced censoring.
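The flow calculation described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `estimate_flows`, the state labels, and the toy cohort are hypothetical. In place of fitted logistic regressions it uses stratified proportions among individuals still under observation, which coincide with the fitted probabilities of a model whose only predictor is the categorical prior state. It also assumes that censoring depends only on the prior treatment state.

```python
from collections import Counter, defaultdict
from itertools import product


def estimate_flows(paths, states, n_total=None):
    """Estimate the size of every longitudinal flow under a first-order
    Markov assumption.

    paths   : list of per-person state sequences, one entry per time point;
              None marks a time point at which the person is censored.
    states  : list of possible treatment states.
    n_total : population size to scale to (defaults to len(paths)).
    """
    n_total = len(paths) if n_total is None else n_total
    n_times = max(len(p) for p in paths)

    # Baseline distribution: everyone is observed at the first time point.
    start = Counter(p[0] for p in paths)
    p_start = {s: start[s] / len(paths) for s in states}

    # Transition probabilities at each step, estimated only from people
    # observed at both the previous and the current time point.
    transitions = []
    for t in range(1, n_times):
        counts = defaultdict(Counter)
        for p in paths:
            if p[t - 1] is not None and p[t] is not None:
                counts[p[t - 1]][p[t]] += 1
        step = {}
        for prev in states:
            total = sum(counts[prev].values())
            step[prev] = {
                cur: (counts[prev][cur] / total if total else 0.0)
                for cur in states
            }
        transitions.append(step)

    # Flow size = population size x baseline probability x product of the
    # transition probabilities along the path.
    flows = {}
    for path in product(states, repeat=n_times):
        prob = p_start[path[0]]
        for t in range(1, n_times):
            prob *= transitions[t - 1][path[t - 1]][path[t]]
        flows[path] = n_total * prob
    return flows


# Hypothetical toy cohort: states "A" and "B" at 0, 6, and 12 months.
toy_paths = [
    ["A", "A", "A"],
    ["A", "A", "B"],
    ["A", "B", "B"],
    ["A", None, None],  # censored before 6 months
    ["B", "B", "B"],
    ["B", "B", None],   # censored before 12 months
]
toy_flows = estimate_flows(toy_paths, states=["A", "B"])
for path, size in sorted(toy_flows.items()):
    if size > 0:
        print(" -> ".join(path), round(size, 2))
```

In this toy cohort the estimated flows sum to the full population of six even though only four individuals have complete follow-up, whereas a complete-case diagram would be drawn from those four alone.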
Results: With minimal, non-differential censoring, the weighted Sankey reduced the mean squared error (MSE) by 18% compared with the complete-case Sankey; when non-differential censoring was heavier, the weighted Sankey reduced the MSE by 54%. When censoring depended on treatment state, the complete-case Sankey had an MSE nearly 83 times greater than that of the weighted Sankey.
Conclusions: Using Sankey diagrams to visualize treatment patterns without addressing informative censoring can produce biased or uninterpretable estimates. Including other predictors of censoring in the censoring model could further improve the estimation of treatment paths.