Papers - Lay Summary
Sequence Independent Amplification and Sequencing of Respiratory Viruses Direct from Clinical Sample
Viruses are a common cause of respiratory infections. Symptoms range from a mild cold to serious illness or death due to their effect on the breathing passages and lungs. Different age groups may be at greater risk from certain viruses. Examples include coronavirus or flu infection in the elderly and viruses associated with “croup” and breathing difficulties in the young.
Respiratory viruses spread easily, often causing local outbreaks. However, if a new virus arises, it may cause a global public health emergency. Examples include the recent coronavirus pandemic and historic flu pandemics.
Tests performed in a laboratory are our first line of defence against viruses. They aim to tell us which virus is present in a given patient. This information can tell doctors the best way to treat a particular patient and detect outbreaks if many people catch the same virus. Current laboratory tests detect a handful of specific viruses by using “primers” to amplify small fragments of each virus’s genetic material. We need to make these “primers” in advance based on already knowing the virus’s genetic material. New technology now allows us to read the virus genetic material directly from a patient’s sample without having “primers”. This means we don’t need to know which virus we’re looking for in advance and makes it easier to detect both common and new viruses.
Unfortunately, direct-from-sample sequencing of virus genetic material can be very challenging. This is because viruses make up only a tiny amount of the total genetic material present on patients’ nose or throat swabs.
This work was done in two parts. In the first two years, we tested many different ideas for improving the detection of virus genetic material from clinical samples. Success was measured by adding known viruses to negative throat swabs. We tested methods to concentrate viruses from the clinical sample and to extract the virus genetic material efficiently. Finally, we increased (amplified) the genetic material, so that we had enough to sequence. We combined these methods to produce our new test. The test starts with a clinical sample (patient’s throat swab) and ends with the sequence of the virus genetic material as the final product.
The second part of the work is now underway. Test performance is currently being studied using several hundred discarded and anonymised respiratory samples. We are collecting these from the Clinical Microbiology Laboratory, John Radcliffe Hospital. We have had an ethically approved research partnership with this laboratory for over 20 years. Most of the samples are positive for different respiratory viruses (sometimes more than one) by routine clinical laboratory testing, but some are negative. We aim to find out how the new test compares to the current tests performed in the Clinical Microbiology Laboratory.
The longer-term aim is to move virus genetic material sequencing direct-from-sample into busy clinical laboratories. We hope that eventually both common and new viruses will be sequenced, diagnosing patients and providing early warning of pandemics.
Predicting rifampicin resistance in M. tuberculosis using machine learning informed by protein structural and chemical features
Rifampicin is a cornerstone antibiotic in tuberculosis (TB) treatment, but its efficacy is increasingly threatened by mutations in its protein target, RNA polymerase. The drug works by blocking the extension of the mRNA chain during DNA transcription, with most resistance-associated mutations clustering in a specific region of the rpoB subunit. This genetic localisation enables high sensitivity in molecular diagnostic tests like GeneXpert and next-generation sequencing. However, these methods rely on existing mutation catalogues and struggle to infer resistance for rare or novel mutations. In our study, we explored a machine learning approach, training models on the structural features of the rpoB gene product to directly predict resistance to rifampicin.
We trained four machine learning models: logistic regression, decision tree, random forest, and gradient-boosted decision tree. The best-performing model was a decision tree that achieved 87.6 ± 0.028% sensitivity and 96.3 ± 0.008% specificity when cross-validated. Because resistance mutations cluster tightly around the rifampicin binding site, the most discriminatory feature was distance to the binding site: mutations outside this region are typically predicted to be susceptible. Being able to read off rules like this is a major benefit of tree-based methods, which are easy to interpret. We also investigated whether features derived from molecular dynamics simulations offered any advantage over those based on static crystal structures. Our results showed that both approaches performed equally well.
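To make the terms sensitivity and specificity concrete, here is a minimal sketch of a single distance-based rule, similar in spirit to one split of a decision tree. It is not the model from the study: the distances, labels, the 7.0 cut-off and all function names are invented for illustration only.

```python
from typing import List, Tuple

# Hypothetical toy data, NOT from the study:
# (distance from the mutated residue to the binding site, is the sample resistant?)
TOY_MUTATIONS: List[Tuple[float, bool]] = [
    (3.5, True), (4.2, True), (5.0, True), (6.8, True), (14.0, True),
    (7.5, False), (12.0, False), (18.3, False), (25.1, False), (30.7, False),
]


def predict_resistant(distance: float, threshold: float) -> bool:
    """Single rule: call a mutation resistant if it lies close to the binding site."""
    return distance <= threshold


def sensitivity_specificity(
    data: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of the rule on labelled data."""
    tp = fn = tn = fp = 0
    for distance, resistant in data:
        predicted = predict_resistant(distance, threshold)
        if resistant and predicted:
            tp += 1  # resistant, correctly flagged
        elif resistant:
            fn += 1  # resistant, but missed by the rule
        elif predicted:
            fp += 1  # susceptible, wrongly flagged
        else:
            tn += 1  # susceptible, correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


# The 7.0 cut-off is an arbitrary assumption chosen for this toy example.
sens, spec = sensitivity_specificity(TOY_MUTATIONS, threshold=7.0)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy data, the one resistant mutation far from the binding site (distance 14.0) is missed. This is the kind of error a rule based mainly on proximity to the binding site will tend to make.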
Although the model performs well, its reliance on proximity to the binding site means that predicting rare resistance-associated variants outside the rifampicin resistance-determining region (RRDR) is challenging. Additionally, comparisons with similar models underscore the importance of avoiding overfitting and highlight that, to be clinically useful, the model must also accurately identify mutations that have no effect on resistance. While structure-based models show promise, we expect the approach to be more useful for drugs whose target genes are not essential, as this usually leads to more genetic variation and more diffuse mechanisms of resistance.
High precision Neisseria gonorrhoeae variant and antimicrobial resistance calling from metagenomic Nanopore sequencing
Gonorrhoea is the second most common sexually transmitted infection. We have developed a new approach, using an exciting new technology, which promises to detect infections and provide the best treatment faster.
Gonorrhoea is caused by a bacterium called Neisseria gonorrhoeae. It can be treated with antibiotics, but the bacteria have developed resistance to many of them, so that now only one antibiotic works reliably.
Gonorrhoea is usually detected from a urine sample, but the bacteria can also be grown in a lab from swabs taken in a sexual health clinic. If bacteria do grow, further tests can be done to see which antibiotics will kill them. Scientists also do what are called ‘molecular’ tests, which involve trying to detect DNA from the gonorrhoea bacteria. It can take a while to get the results back from all of these tests, because bacteria often take a few days to grow. A single test that could be done much faster, and which would find out both whether gonorrhoea bacteria are there and which antibiotics would kill them best, would let patients start the right treatment sooner. In turn, this would help stop gonorrhoea being passed on.
A new molecular method, called metagenomic sequencing, identifies DNA from bacteria directly from patient samples, and is showing potential as a new diagnostic test. The bacterial DNA can be compared to a database of many known bacteria, to work out the cause of an infection - like a ‘paternity test’ for bugs. If we can get enough bacterial DNA from a sample, not only can we work out which type of bacteria is causing the infection, but we can also find specific parts of its DNA that we know lead to antibiotic resistance. Metagenomic sequencing can also be faster than current tests, often finding out the cause of an infection and which treatments will work within a few hours.
In our recent study we have been testing whether metagenomic sequencing can accurately find gonorrhoea infections in urine samples. We were trying to choose the best method for getting bacterial genetic material out of these urine samples, and we did many experiments in the laboratory to work this out. At first, we used artificial infections - taking uninfected urine and putting in known amounts of Neisseria gonorrhoeae bacteria - so we could measure how successful our experiments were at getting the bacterial genetic material (DNA) back. Once we had chosen the best method, we tested it on 10 urine samples collected from men with suspected gonorrhoea infections. We found that it was possible to detect almost all the complete genetic sequence (>90%) of gonorrhoea in all 10 of our samples. We were also able to find the parts of the genetic sequence that we know are involved in antibiotic resistance.
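As a rough illustration of what a figure such as ">90% of the genetic sequence" can mean, the sketch below computes the fraction of positions in a reference genome that are covered by at least one sequencing read. The summary does not say exactly how the percentage was calculated, so this definition, the function name and the toy read depths are all assumptions made for illustration.

```python
from typing import Sequence


def breadth_of_coverage(depth_per_position: Sequence[int], min_depth: int = 1) -> float:
    """Fraction of reference positions covered by at least `min_depth` reads."""
    if not depth_per_position:
        raise ValueError("depth_per_position must not be empty")
    covered = sum(1 for depth in depth_per_position if depth >= min_depth)
    return covered / len(depth_per_position)


# Hypothetical read depths for a 20-position toy "genome" (not real data):
# one position has no reads, so 19 of 20 positions are covered.
toy_depths = [3, 5, 2, 4, 1, 6, 2, 0, 3, 4, 5, 1, 2, 7, 3, 2, 4, 1, 2, 3]

breadth = breadth_of_coverage(toy_depths)
print(f"{breadth:.0%} of positions covered")
```

A real bacterial genome has millions of positions rather than twenty, and a stricter `min_depth` is often needed before a resistance-associated change can be called with confidence.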
We have shown that with our optimised laboratory method we can detect gonorrhoea DNA directly in urine from men with suspected infections. These results provide a solid foundation on which to build, and we plan to test our method on a larger number of urine samples. We also plan to develop computer analysis methods that will allow us to take the metagenomic sequencing results and detect more genetic regions involved in antibiotic resistance.
https://genome.cshlp.org/content/early/2020/09/01/gr.262865.120
DNA Thermo-Protection Facilitates Whole-Genome Sequencing of Mycobacteria Direct from Clinical Samples
Mycobacterium tuberculosis, the bug that causes tuberculosis or ‘TB’, is one of the leading causes of death due to infection. The World Health Organization estimates that 10 million new infections with TB bacteria and 1.2 million deaths occurred worldwide in 2018.
Luckily, TB can be cured with antibiotics. Getting a quick diagnosis, so that patients receive the right antibiotics as soon as possible, is really important, both for the patient and to stop them spreading the infection to other people. Choosing the right antibiotics has become trickier in recent years, because many TB bacteria have become resistant. This means that antibiotics which used to work have become less effective or have stopped working altogether.
Fortunately, thanks to recent advances in DNA sequencing technology, we can quickly sequence the whole genome of the TB bacteria infecting an individual patient. The aim of our study was to develop a simple, rapid, and cheap method of preparing TB bug DNA, directly from patient samples (sputum), for sequencing. This would avoid the slow process of growing TB bugs, which takes weeks.
In our study, we invented a special solution, which is added to the patient’s sputum sample before the sample is heated for half an hour to kill the TB bugs. During heating, the solution protects the TB DNA from falling apart. Broken DNA is no good for sequencing! An essential ingredient in the solution is a high concentration of the salt potassium chloride.
This breakthrough allowed us to sequence complete TB bacteria genomes from 15/20 sputum samples tested and to find out the right antibiotics for the patients. The method is currently being tested in labs based in India and Madagascar, two locations with a high number of people suffering from TB infections.
https://journals.asm.org/doi/10.1128/JCM.00670-20