Robust, reproducible clinical patterns in hospitalised patients with COVID-19
Millar JE., Neyton L., Seth S., Dunning J., Merson L., Murthy S., Russell CD., Keating S., Swets M., Sudre C., Spector T., Ourselin S., Steves C., Wolf J., Docherty A., Harrison E., Openshaw P., Semple C., Baillie JK.
Abstract

Background
Severe COVID-19 is characterised by fever, cough, and dyspnoea. Symptoms affecting other organ systems have also been reported. However, it is the clinical associations of different patterns of symptoms that influence diagnostic and therapeutic decision-making. In this study, we applied simple machine learning techniques to a large prospective cohort of hospitalised patients with COVID-19 to identify clinically meaningful sub-groups.

Methods
We obtained structured clinical data on 59 011 patients in the UK (the ISARIC Coronavirus Clinical Characterisation Consortium, 4C) and used a principled, unsupervised clustering approach to partition the first 25 477 cases according to symptoms reported at recruitment. We validated our findings in a second group of 33 534 cases recruited to ISARIC-4C, and in 4 445 cases recruited to a separate study of community cases.

Findings
Unsupervised clustering identified distinct sub-groups. The first was a core symptom set of fever, cough, and dyspnoea, which co-occurred with additional symptoms in three further patterns: fatigue and confusion, diarrhoea and vomiting, or productive cough. Presentations with a single reported symptom of dyspnoea or confusion were common, and a sub-group of patients reported few or no symptoms. Patients presenting with gastrointestinal symptoms were more commonly female, had a longer duration of symptoms before presentation, and had lower 30-day mortality. Patients presenting with confusion, with or without core symptoms, were older and had a higher unadjusted mortality. Symptom clusters were highly consistent in replication analysis using a further 35 446 individuals subsequently recruited to ISARIC-4C. Similar patterns were externally verified in 4 445 patients from a study of self-reported symptoms of mild disease.

Interpretation
The large scale of the ISARIC-4C study enabled robust, granular discovery and replication of patient clusters. Clinical interpretation is necessary to determine which of these observations have practical utility. We propose that four patterns are usefully distinct from the core symptom groups: gastrointestinal disease, productive cough, confusion, and pauci-symptomatic presentations. Importantly, each is associated with an in-hospital mortality that differs from that of patients with core symptoms. These observations deepen our understanding of COVID-19 and will influence clinical diagnosis, risk prediction, and future mechanistic and clinical studies.

Funding
Medical Research Council; National Institute for Health Research; Wellcome Trust; Department for International Development; Bill and Melinda Gates Foundation; Liverpool Experimental Cancer Medicine Centre.
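
Illustrative sketch (not part of the study)
The abstract does not name the clustering algorithm used, so the following is only a minimal sketch of how binary symptom records might be partitioned without labels. It assumes a k-modes style procedure with Hamming distance, which is a stand-in technique and not necessarily the authors' method. The symptom list, the nine patient records, and the starting modes are all invented toy values chosen for illustration; none of them come from ISARIC-4C.

```python
# Toy k-modes clustering of binary symptom vectors (pure standard library).
# Assumption: k-modes with Hamming distance is a stand-in, not the study's method.
# Assumption: all records below are invented for illustration only.

SYMPTOMS = ["fever", "cough", "dyspnoea", "confusion", "diarrhoea", "vomiting"]


def hamming(a, b):
    """Number of symptoms on which two patients (or a patient and a mode) differ."""
    return sum(x != y for x, y in zip(a, b))


def mode_vector(rows):
    """Per-symptom majority vote across rows; ties resolve to 0 (absent)."""
    n = len(rows)
    return tuple(1 if 2 * sum(col) > n else 0 for col in zip(*rows))


def k_modes(rows, init_modes, max_iter=20):
    """Alternate between assigning rows to the nearest mode and updating modes."""
    modes = list(init_modes)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(modes)), key=lambda j: hamming(row, modes[j]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the procedure has converged
        labels = new_labels
        for j in range(len(modes)):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old mode if a cluster becomes empty
                modes[j] = mode_vector(members)
    return labels, modes


# Hypothetical patients: 1 = symptom reported at recruitment, 0 = not reported.
patients = [
    (1, 1, 1, 0, 0, 0),
    (1, 1, 1, 0, 0, 0),
    (1, 1, 0, 0, 0, 0),
    (1, 1, 1, 0, 1, 1),
    (1, 0, 1, 0, 1, 1),
    (0, 1, 1, 0, 1, 1),
    (0, 0, 0, 1, 0, 0),
    (0, 0, 1, 1, 0, 0),
    (0, 0, 0, 1, 0, 0),
]

# Deterministic starting modes (an arbitrary choice made for reproducibility).
init = [patients[0], patients[3], patients[6]]
labels, modes = k_modes(patients, init)

for j, mode in enumerate(modes):
    present = [name for name, flag in zip(SYMPTOMS, mode) if flag]
    print(f"cluster {j}: {labels.count(j)} patients, typical symptoms: {present}")
```

On this toy input the three recovered modes resemble a core pattern, a core-plus-gastrointestinal pattern, and an isolated-confusion pattern. A replication check in the spirit of the study could fit the modes on one group of records and compare the assignments they produce on a second, independently collected group.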