Oxford University lead: Sarah Walker; PHE co-lead: Susan Hopkins

We would like to better understand who is affected by antimicrobial resistance and healthcare-associated infections, and why, including the impact of inequalities and aging, and how we can monitor these conditions. Our strategy is to exploit large-scale linked electronic health record (EHR) data from multiple sources to answer the following key questions:

  • How can routine surveillance be automated optimally?
  • What populations are at greatest risk of different healthcare-associated infections (HAI) and antimicrobial resistance (AMR)?

By automating surveillance using EHR we hope to improve the monitoring and management of infectious diseases, reduce the burden of data collection in the NHS and better predict future trends in antimicrobial usage, HAI and AMR.


Project Title 01:

Assess the ability of routine diagnostic coding data from secondary care to monitor a range of Healthcare Associated Infections (HAI) and pathogens. *New Project* This replaces the project in the application titled “To enhance current national reporting frameworks by investigating whether linked data can provide routine surveillance and near-complete epidemiological information”.

PHE leads: Susan Hopkins, Russell Hope; OU leads: Sarah Walker, David Eyre; Researchers: OU/PHE HPRU vacancies (statisticians/machine learning)

Project start: April 2020 

Pathogen data for COVID, Gram-negative/Gram-positive bacteria, and viral causes of hospital-onset respiratory tract infection/diarrhoea will be linked at local and national levels. The linked data will be used to determine the effectiveness of such methods for national surveillance, reducing requirements for costly nurse-led approaches.

We will:

  • Develop and validate data linkages with the Emergency Care Data Set (ECDS) and Secondary Uses Service (SUS) datasets, in addition to traditional Hospital Episode Statistics (HES) datasets from hospitals, to determine the number and frequency of HAI COVID cases, using novel case definitions.
  • Extend this dataset to determine risk adjusted morbidity and mortality of HAI COVID compared to community onset hospitalised COVID. 
  • Develop novel dashboards and assess their utility to inform the management of HAI COVID. 
  • Extend these novel linkages to Carbapenemase Producing Enterobacteriaceae (CPE) and validate these datasets against traditionally captured pathogens in mandatory surveillance. 
  • Investigate whether subgroups at higher risk of these HAI can be identified and monitored using such routinely collected data, particularly considering healthcare inequalities such as deprivation, frailty, aging, hospital specialities and care home residence.
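
As a rough illustration of the kind of case definition that linked admission and pathogen data can support, the sketch below classifies a linked record by the number of days between hospital admission and the first positive specimen. The function name, the field choices and the day thresholds are assumptions made for illustration only; they are not the project's case definitions.

```python
from datetime import date


def classify_onset(admission: date, first_positive: date,
                   community_max_day: int = 2,
                   hospital_min_day: int = 8) -> str:
    """Classify onset from the admission-to-specimen interval.

    The day of admission counts as day 1. Both thresholds are
    hypothetical parameters, not published case definitions.
    """
    day = (first_positive - admission).days + 1
    if day <= community_max_day:
        # Positive on or before the early days of the stay
        return "community-onset"
    if day >= hospital_min_day:
        # Positive late enough in the stay to suggest in-hospital acquisition
        return "hospital-onset"
    # Anything in between cannot be assigned with confidence
    return "indeterminate"


if __name__ == "__main__":
    print(classify_onset(date(2020, 4, 1), date(2020, 4, 10)))
```

Keeping the thresholds as parameters makes it straightforward to compare alternative definitions on the same linked dataset.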

Key milestones and timescales: 
January 2021: - Develop a dashboard for HAI-COVID and submit a publication on morbidity and mortality of hospital-onset versus community-onset COVID for wave 1

June 2021: -  Determine whether there are differences in the impact of HAI-COVID in different waves of COVID  

December 2021: - Utilise these data linkages and determine whether they are effective in monitoring HAI-CPE


Project Title 02:

Develop a national healthcare worker cohort study to investigate the impact of COVID-19 - *New project*. Replaces the project titled: To enhance current national reporting frameworks by investigating whether linked data can provide routine surveillance.

PHE leads: Susan Hopkins, Meera Chand, Victoria Hall; OU lead: Sarah Walker; Researchers: OU/PHE HPRU vacancies (statisticians)

Project start: April 2020 

  • To determine whether prior infection protects against future infection through regular COVID antibody and PCR monitoring of health care workers (HCW) 
  • To determine which behaviours (both at work and socially) and demographics predict new COVID infections
  • To determine the impact of COVID on HCW quality of life and healthcare utilisation in the subsequent 12 months
  • Improve the understanding of the immunology of re-infections through detailed immunophenotyping 

Key milestones and timescales: 

March 2021: - Submit the protocol for monitoring re-infections January 2021 

October 2021: -  Submit an interim analysis from the wave 2 infections and the impact of immunity  


Project Title 03:

Surveillance of COVID in the OUH staff population - *New Project* 

OU leads: David Eyre, Sarah Walker; PHE lead: Susan Hopkins; Researchers: Sheila Lumley, Tim Walker

Project start: April 2020 

  • Describe the epidemiology of COVID infection in a staff cohort of circa 12,000 
  • Investigate the effect of prior infection with SARS-CoV-2 on preventing reinfection 
  • Determine the serological impact and durability of COVID vaccine on a population of naïve and previously infected staff. 

Key milestones and timescales: 

December 2020: – Publish on the descriptive epidemiology of COVID infection 

January 2021: - Publish on the protective effect of previous COVID infection against reinfection.

October 2021: - Publish on the serological impact of COVID vaccination on naïve and previously infected subjects 


Project Title 04:

Identifying “at-risk” populations in the COVID-19 Infection Survey *New project* 

OU lead: Sarah Walker; PHE lead: Susan Hopkins; Researchers: Emma Pritchard (HPRU PhD student), Koen Pouwels (epidemiologist/health economist)

Project start: April 2020 

  • The national COVID-19 infection survey is currently conducting PCR-based swab tests for SARS-CoV-2 on approximately 150,000 individuals every two weeks, and collecting sociodemographic, work/school and behavioural data at each timepoint.  
  • HPRU-funded statisticians investigated the initial rises in positivity in mid-August and explored the demographic and epidemiological factors accounting for this rise.  
  • The optimal approach to systematically monitoring the relatively large number of variables collected, in order to identify subgroups in whom positivity rates are changing fastest, is unclear. The problem is effectively an interaction with time, and is complicated by inevitable missing data and by interactions with other factors (particularly age and region, given their early influence on increases in positivity).
  • We will develop and test a screening approach based on quantile-quantile plots for main effects and multiple different types of interaction models, and test this over time in terms of ability to identify at-risk sub-populations. We will particularly investigate the contribution of deprivation. 
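
One way to picture a quantile-quantile screen is sketched below. It assumes that each candidate main effect or interaction has already been fitted in a separate model and summarised as a p-value; the test names, the `excess` cut-off and the function itself are hypothetical and are not taken from the survey analysis. Observed values of -log10(p) are compared with the quantiles expected if every p-value were uniformly distributed, and tests lying well above that expectation are flagged for follow-up.

```python
import math


def qq_screen(p_values: dict[str, float],
              excess: float = 1.0) -> list[tuple[str, float, float]]:
    """Flag tests whose observed -log10(p) exceeds the expected
    uniform quantile by more than `excess`.

    Returns (name, expected, observed) tuples, smallest p-value first.
    All p-values must be strictly greater than zero.
    """
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    n = len(ranked)
    flagged = []
    for rank, (name, p) in enumerate(ranked, start=1):
        # Expected quantile of the rank-th smallest of n uniform p-values
        expected = -math.log10((rank - 0.5) / n)
        observed = -math.log10(p)
        if observed - expected > excess:
            flagged.append((name, expected, observed))
    return flagged


if __name__ == "__main__":
    # Made-up p-values for illustration only
    example = {"age_x_time": 1e-6, "region_x_time": 0.2,
               "sex_x_time": 0.5, "work_x_time": 0.8}
    for name, expected, observed in qq_screen(example):
        print(f"{name}: expected {expected:.2f}, observed {observed:.2f}")
```

Plotting the (expected, observed) pairs for all tests gives the usual quantile-quantile plot; the flagged list is simply the set of points far above the diagonal.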

Key milestones and timescales: 

December 2020: - initial screening approach developed and implemented  

June 2021: – evaluation and demonstration of impact (translational milestone identifying how real time surveillance could be used to identify changing at-risk populations) 


Project Title 05:

Investigation of transmissibility of SARS-CoV-2 as a function of viral load  

OU leads: Tim Peto, Sarah Walker, David Eyre; PHE lead: Susan Hopkins; Researcher: Lennard Lee

Project start: October 2020 

  • Detection of antigen by lateral flow devices is highly correlated with viral load, with a limit of detection of 10³ viral RNA copies/ml
  • The COVID positive data from the UK lighthouse labs held in Second Generation Surveillance System (SGSS) and NHS Digital has been sourced. We will carry out analyses focusing on the relationship between the viral load of an index case and detection of SARS-CoV-2 in contacts. 
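
A minimal sketch of how such an analysis could be tabulated is shown below, assuming each linked record pairs an index case's viral load (in RNA copies/ml) with whether a given contact later tested positive. The record layout, the function name and the choice of log10 bands are illustrative assumptions, not the project's method, and a real analysis would also adjust for confounders.

```python
import math
from collections import defaultdict


def contact_positivity_by_band(pairs):
    """Summarise contact positivity by index-case viral load band.

    `pairs` is an iterable of (viral_load_copies_per_ml, contact_positive).
    Each band is floor(log10(viral load)), so band 6 covers 10^6 to <10^7.
    Returns {band: (n_contacts, proportion_positive)} ordered by band.
    """
    counts = defaultdict(lambda: [0, 0])  # band -> [contacts, positives]
    for load, positive in pairs:
        band = math.floor(math.log10(load))
        counts[band][0] += 1
        counts[band][1] += int(positive)
    return {band: (n, pos / n) for band, (n, pos) in sorted(counts.items())}


if __name__ == "__main__":
    # Made-up records for illustration only
    records = [(5e3, False), (2e3, False), (4e6, True), (7e6, False)]
    print(contact_positivity_by_band(records))
```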

Key milestones and timescales: 

December 2020: – report compiled and shared with DHSC 

January 2021: – publication submitted  


Project Title 06:

Exploiting increasingly rich data on antimicrobial usage in primary/secondary care for monitoring and prediction  

PHE lead: Sarah Gerver; OU lead: David Eyre; Researchers: HPRU-funded OU/PHE machine learning researchers and statisticians

Project start: December 2020 

  • To determine the impact of COVID on the outcomes of antimicrobial prescribing and antimicrobial resistance (AMR). 
  • Assess whether machine learning methods can predict future use of different antimicrobials, individually, by class and using WHO AWaRe (Access/Watch/Reserve) classifications, from historical use, and the consequent impact on AMR and HAI (C. difficile in particular)

Key milestones and timescales: 

December 2021: – Publication 


Project Title 07:

Methods to automate monitoring of quality of big data  

OU leads: Sarah Walker, Tim Peto; PHE leads: Sarah Gerver, Russell Hope, Susan Hopkins; Researcher: Phuong Quan (statistician)

Project start: April 2020 

  • Work through the preceding HPRU has identified some specific issues with Electronic Health Record (EHR) data quality, such as systematic errors that can arise through events such as system glitches (e.g. if the default value of a dropdown menu changes without users noticing), or when the meaning or content of a particular data field changes (e.g. when laboratory methods and/or specimen types are altered). In the age of big data, and with the prospect of machine learning being applied to live data streams, it is not clear who is going to monitor all this data or how.
  • Tools will be developed to identify these sorts of temporal artefacts, distinguishing them from real changes in the presenting population, and to do it in a timely fashion so that any errors can be corrected.  
  • Based on a previous literature review, we will test methods (heuristic/statistical/machine-learning/other) to assess data quality and validity of EHR for research.
  • We will set up a project on Zooniverse to use volunteers to review large numbers of anonymised data feeds from the Infections in Oxfordshire Research Database (IORD) to generate a “gold standard” set of data changes which automated systems should be able to identify. 
  • Develop open-source R-packages and recommendations for researchers. 
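
To make the idea of a temporal artefact concrete, the sketch below applies one simple heuristic (a rolling-median comparison) to a per-period summary of a data field, for example the monthly proportion of records carrying a particular dropdown value. It is written in Python purely for illustration; the function name, window length and threshold are assumptions, and it does not represent the tools the project will develop. On its own, such a check cannot tell an artefact from a real change in the presenting population, so flagged periods still need review.

```python
from statistics import median


def flag_step_changes(proportions: list[float],
                      window: int = 6,
                      threshold: float = 0.2) -> list[int]:
    """Return indices of periods that differ from the median of the
    preceding `window` periods by more than `threshold`.

    The first `window` periods are never flagged because they have
    no complete baseline.
    """
    flagged = []
    for i in range(window, len(proportions)):
        baseline = median(proportions[i - window:i])
        if abs(proportions[i] - baseline) > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Made-up monthly proportions with an abrupt shift at index 7
    series = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.10, 0.45, 0.47, 0.46]
    print(flag_step_changes(series))
```

The first flagged index marks where the shift begins; later periods keep being flagged until the new level dominates the baseline window.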

Key milestones and timescales: 

December 2020: - Zooniverse project set up and a set of gold standard data changes identified to test automated systems against

June 2021: - initial comparison of methods and prototype R tools developed 

March 2022: - recommendations drafted and submitted for publication (translational milestones identifying how big data can be optimally managed) 




Populations Theme Publications

Healthcare-associated COVID-19 in England: a national data linkage study

Bhattacharya A, Collin S M, Stimson J, et al.


SARS-CoV-2 infection rates of antibody-positive compared with antibody-negative health-care workers in England: a large, multicentre, prospective cohort study (SIREN)

Hall V J, Foulkes S, Charlett A, et al.


Antibody Status and Incidence of SARS-CoV-2 Infection in Health Care Workers

Lumley S F, O’Donnell D, Stoesser N, et al.


The Duration, Dynamics, and Determinants of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Antibody Responses in Individual Healthcare Workers

Lumley S, Wei J, O’Donnell D, et al.