Associate Professor, Department of Population Health Sciences, Duke University School of Medicine, Durham, NC, United States
Background: Following expansions over the last decade, the U.S. Medicaid and Children’s Health Insurance Programs (CHIP) insure over 25% of the population and cover about 40-50% of pregnancies. Adding this low-income population to the U.S. FDA’s Sentinel System is a high priority, but variability in data quality among the 50+ contributing jurisdictions requires rigorous preliminary review.
Objectives: To establish jurisdiction-level, beneficiary-level, and record-level criteria for inclusion of Medicaid/CHIP data into a Sentinel Common Data Model (SCDM)-compliant database and to document the fit-for-purpose requirements of the Medicaid/CHIP data for Sentinel System regulatory needs.
Methods: We used Transformed Medicaid Statistical Information System (T-MSIS) analytic files (TAF) from 2014 to 2018. We prioritized TAF variables relevant to the needs of drug safety analyses and identified jurisdictions (states, territories, and the District of Columbia) with acceptable quality on these variables by year and by plan type, as reported by Medicaid’s Data Quality (DQ) Atlas. Based on TAF documentation, we devised beneficiary-level inclusion rules to identify individuals for whom Medicaid/CHIP was the primary insurer and established record-level criteria to exclude records related only to administrative payments.
Results: From over 80 variables available within the DQ Atlas, we selected 12 high-priority variables to identify jurisdiction-year-plan combinations with fit-for-purpose data of acceptable quality. These variables covered different topics but prioritized complete capture of healthcare utilization and included enrollment, eligibility, claims file completeness, and service use. In 2018, data from 44 jurisdictions met these standards (9 of which met these standards for all 5 years of data included), while 9 jurisdictions were excluded because at least one of the selected high-priority variables had poor data quality. We did not exclude source data based on the quality of demographic or provider information since data deficiencies in these domains can be handled during data analysis. Data from about 20% of beneficiaries were excluded due to either dual Medicare/Medicaid eligibility (since Medicare is the primary payer) or eligibility for only partial benefits. The record-level criteria identified a substantial number of capitated payment records to be deleted.
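The three tiers of criteria described above (jurisdiction-year-plan, beneficiary, and record) can be illustrated with a minimal Python sketch. This is not the authors' implementation: every field name, the quality labels "acceptable" and "poor", and the structure of the quality ratings are hypothetical stand-ins chosen for illustration, and they do not correspond to actual TAF or DQ Atlas identifiers.

```python
from dataclasses import dataclass

# Hypothetical quality label; the real DQ Atlas rating scheme may differ.
ACCEPTABLE = "acceptable"


@dataclass(frozen=True)
class Beneficiary:
    # All field names are illustrative assumptions, not TAF variable names.
    bene_id: str
    jurisdiction: str
    year: int
    plan_type: str
    dual_eligible: bool   # also enrolled in Medicare (Medicare is primary payer)
    full_benefits: bool   # False when eligible for partial benefits only


@dataclass(frozen=True)
class Record:
    bene_id: str
    jurisdiction: str
    year: int
    plan_type: str
    capitated_payment: bool  # administrative payment only


def acceptable_combinations(dq_ratings, priority_vars):
    """Tier 1: keep (jurisdiction, year, plan_type) keys for which every
    priority variable is rated acceptable. A missing rating counts as a failure."""
    return {
        key
        for key, ratings in dq_ratings.items()
        if all(ratings.get(var) == ACCEPTABLE for var in priority_vars)
    }


def eligible_beneficiaries(beneficiaries, combos):
    """Tier 2: keep beneficiaries in an acceptable combination who are not
    dually eligible and who have full benefits."""
    return {
        (b.bene_id, b.year)
        for b in beneficiaries
        if (b.jurisdiction, b.year, b.plan_type) in combos
        and not b.dual_eligible
        and b.full_benefits
    }


def retained_records(records, combos, eligible):
    """Tier 3: keep records for eligible beneficiaries in acceptable
    combinations, dropping capitated payment records."""
    return [
        r
        for r in records
        if (r.jurisdiction, r.year, r.plan_type) in combos
        and (r.bene_id, r.year) in eligible
        and not r.capitated_payment
    ]
```

Applying the tiers in this order means a single poor rating on any priority variable removes the whole jurisdiction-year-plan combination before any beneficiary or record is examined, which mirrors the abstract's point that this screening precedes routine data characterization.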
Conclusions: The U.S. Medicaid/CHIP data are a rich resource of variable quality because they are aggregated across 50+ jurisdictions with differing rules and standards. A rigorous process to determine initial fit-for-purpose criteria among these different data sources is needed as a formal step prior to routine data characterization and quality review to optimize resource management.