Machine learning prediction models for clinical management of blood-borne viral infections: a systematic review of current applications and future impact.
Ajuwon BI, Awotundun ON, Richardson A, Roper K, Sheel M, Rahman N, Salako A, Lidbury BA.
Background: Machine learning (ML) prediction models to support the clinical management of blood-borne viral infections are becoming increasingly abundant in the medical literature, with a number of competing models developed for the same outcome or target population. However, evidence on the quality of these ML prediction models is limited.

Objective: This study aimed to evaluate the development and quality of reporting of ML prediction models that could facilitate timely clinical management of blood-borne viral infections.

Methods: We conducted a narrative evidence synthesis following the Synthesis Without Meta-analysis (SWiM) guidelines. We searched PubMed and the Cochrane Central Register of Controlled Trials for all studies applying ML models to predict clinical outcomes associated with hepatitis B virus (HBV), human immunodeficiency virus (HIV), or hepatitis C virus (HCV).

Results: We found 33 unique ML prediction models aiming to support clinical decision making. Overall, 12 (36.4%) focused on HBV, 10 (30.3%) on HCV, 10 (30.3%) on HIV, and two (6.1%) on co-infection. Among these, six (18.2%) addressed the diagnosis of infection, 16 (48.5%) the prognosis of infection, eight (24.2%) the prediction of treatment response, two (6.1%) progression through the cascade of care, and one (3.0%) the choice of antiretroviral therapy (ART). Nineteen prediction models (57.6%) were developed using data from high-income countries. Evaluation of the prediction models was limited to measures of performance; detailed information on software code accessibility was often missing, and independent validation on new datasets and/or in other institutions was rarely done.

Conclusion: Promising approaches for ML prediction models in blood-borne viral infections were identified, but the lack of robust validation, interpretability/explainability, and poor quality of reporting hampered their clinical relevance. Our findings highlight important considerations that can inform future standard reporting guidelines for ML prediction models (e.g., TRIPOD-AI) and provide critical data to inform robust evaluation of these models.
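To illustrate the distinction the review draws between reporting internal performance measures and the independent (external) validation that was rarely performed, a minimal sketch follows. It is not taken from any of the reviewed models: the cohorts are synthetic placeholders generated with scikit-learn, and the logistic regression model stands in for an arbitrary ML prediction model.

```python
# Minimal sketch (assumption: scikit-learn available; data are synthetic,
# not from any study in the review). Contrasts internal test-set AUC with
# scoring the same frozen model on a separate "external" cohort.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic "development cohort" standing in for one institution's data.
X_dev, y_dev = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X_dev, y_dev, test_size=0.3, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Internal performance: the kind of measure most reviewed models reported.
internal_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# External validation: the same fitted model scored on a different
# (here, simulated) cohort -- the step the review found was rarely done.
X_ext, y_ext = make_classification(
    n_samples=500, n_features=10, shift=0.5, random_state=1
)
external_auc = roc_auc_score(y_ext, model.predict_proba(X_ext)[:, 1])

print(f"Internal test AUC:   {internal_auc:.3f}")
print(f"External cohort AUC: {external_auc:.3f}")
```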