(204) Predicting Drug Consumption in Relation to Respiratory Diseases: An Analysis of the National Organisation for the Provision of Health Services (EOPYY) Dataset in Greece
PhD Candidate, Democritus University of Thrace, Greece
Background: Respiratory diseases are a major cause of morbidity and mortality worldwide. Accurate predictions of drug usage are important for ensuring an adequate supply of medications and better health outcomes. In this study, Azure Machine Learning was used to predict the total number of prescriptions of drugs used for respiratory diseases.
Objectives: To compare the effectiveness of boosted decision forest regression and decision forest regression in building a prediction model for the total prescriptions of drugs used for respiratory diseases.
Methods: The EOPYY dataset was used in this study. It contains information on year, ICD-10 code, ATC code, age group, gender and total prescriptions for 2015 to 2020. The dataset was split into training and testing sets; two machine learning models, decision forest regression and boosted decision forest regression, were trained on the training set and evaluated on the testing set.
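The workflow described in Methods can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the study's Azure Machine Learning pipeline: scikit-learn's RandomForestRegressor and GradientBoostingRegressor are swapped in as rough stand-ins for the two Azure models, the rows are synthetic placeholders rather than EOPYY data, and the column names and the 70/30 split ratio are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic placeholder rows, NOT the EOPYY data. Column names are assumed.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "year": rng.integers(2015, 2021, n),  # 2015..2020 inclusive
    "icd10": rng.choice(["J44", "J45", "J20"], n),
    "atc3": rng.choice(["R03A", "R03B", "R05C"], n),
    "age_group": rng.choice(["50-59", "60-69", "70-79"], n),
    "gender": rng.choice(["male", "female"], n),
})
df["total_prescriptions"] = rng.poisson(200, n)

# One-hot encode the categorical columns; year stays numeric.
X = pd.get_dummies(df.drop(columns="total_prescriptions"), dtype=float)
y = df["total_prescriptions"]

# Hold out part of the data for testing (the 70/30 ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# scikit-learn stand-ins for the two Azure Machine Learning models.
models = {
    "decision forest (stand-in)": RandomForestRegressor(
        n_estimators=100, random_state=0
    ),
    "boosted trees (stand-in)": GradientBoostingRegressor(random_state=0),
}

# Train each model on the training set and score it on the testing set.
mae = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    mae[name] = mean_absolute_error(y_test, model.predict(X_test))
    print(f"{name}: MAE = {mae[name]:.2f}")
```

Because the placeholder target is random noise, the printed errors say nothing about the real dataset; the sketch only shows the split, train and evaluate sequence.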
Results: Based on the comparison of their performance, the decision forest regression model was found to be the better choice for predicting total prescriptions. Compared with the boosted decision forest regression model, it showed a lower Mean Absolute Error (235.22 vs 245.12) and a lower Relative Absolute Error (0.724631 vs 0.755122), but a higher Root Mean Squared Error (1334.83 vs 1006.87), a higher Relative Squared Error (0.998941 vs 0.568382) and a much lower Coefficient of Determination (0.001059 vs 0.431618). The decision forest regression model was then applied to a set of test examples, producing predicted "Scored Labels" (the number of prescriptions) for various combinations of variables. For instance, for the year 2026, the model predicted 25821.25 prescriptions for 60-69-year-old males with ICD-10 code J44 and ATC_3 code R03A. For the year 2023, it predicted 233.825 prescriptions for 50-59-year-old females with ICD-10 code U07.1 and ATC_3 code A02B. These examples demonstrate the ability of the decision forest regression model to provide predictions of the number of prescriptions based on various parameters.
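The five metrics reported in Results can be computed as in the sketch below, assuming their standard definitions (the abstract does not state the formulas used). Under these definitions the Coefficient of Determination equals one minus the Relative Squared Error, which agrees with the reported pairs (0.998941 + 0.001059 = 1 and 0.568382 + 0.431618 = 1). The function name and the sample values are hypothetical and are not taken from the EOPYY dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, RAE, RSE and R2 using their standard definitions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    sq_err = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    # Deviations from the mean: the error of a baseline that always predicts mean_y.
    abs_dev = [abs(t - mean_y) for t in y_true]
    sq_dev = [(t - mean_y) ** 2 for t in y_true]
    rse = sum(sq_err) / sum(sq_dev)
    return {
        "MAE": sum(abs_err) / n,
        "RMSE": math.sqrt(sum(sq_err) / n),
        "RAE": sum(abs_err) / sum(abs_dev),
        "RSE": rse,
        "R2": 1.0 - rse,
    }


# Made-up values for illustration only.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 320.0, 380.0]
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The relative metrics compare the model with a baseline that always predicts the mean, so a Relative Squared Error close to 1 means the squared error is about the same as that baseline's. The squared metrics also weight large individual errors more heavily than the absolute ones, which is one way a model can rank better on Mean Absolute Error yet worse on Root Mean Squared Error.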
Conclusions: Azure Machine Learning was used to predict the total number of prescriptions of respiratory drugs using decision forest regression. The model outperformed the boosted decision forest regression on the absolute error metrics (Mean Absolute Error and Relative Absolute Error), although the boosted model achieved the higher coefficient of determination. These results have significant implications for healthcare professionals and policy makers in ensuring an adequate drug supply and improved health outcomes for patients. However, it is important to note that the dataset covers a specific time period and that omitted factors could affect the model's accuracy.