Abstract

Hybrid assemblies are highly valuable for studies of Enterobacteriaceae because they can fully resolve the structure of mobile genetic elements, such as plasmids, which carry clinically important genes (e.g. those involved in AMR/virulence). The widespread application of this technique is currently limited primarily by cost. Recent data suggest that non-inferior, and even superior, hybrid assemblies can be produced using a fraction of the total output from a multiplexed nanopore (Oxford Nanopore Technologies [ONT]) flowcell run. In this study we sought to determine the optimal minimal running time for flowcells when acquiring reads for hybrid assembly. We then evaluated whether the ONT wash kit might allow users to exploit shorter running times by sequencing multiple libraries per flowcell. After 24 hours of sequencing, most chromosomes and plasmids had circularised and there was no benefit associated with longer running times. Quality was similar at 12 hours, suggesting that shorter running times are likely to be acceptable for certain applications (e.g. plasmid genomics). The ONT wash kit was highly effective in removing DNA between libraries. Contamination between libraries did not appear to affect subsequent hybrid assemblies, even when the same barcodes were used successively on a single flowcell. Utilising shorter run-times in combination with between-library nuclease washes allows at least 36 Enterobacteriaceae isolates to be sequenced per flowcell, significantly reducing the per-isolate sequencing cost. Ultimately this will facilitate large-scale studies utilising hybrid assembly, advancing our understanding of the genomics of key human pathogens.

Data Summary

Raw sequencing data are available via NCBI under project accession number PRJNA604975.
Sample accession numbers are provided in Table S1. Assemblies are available via Figshare (https://doi.org/10.6084/m9.figshare.11816532.v1).

Impact Statement

Most existing sequencing data has been acquired from short-read platforms (e.g. Illumina). For some species of bacteria, clinically important genes, such as those involved in antibiotic resistance and/or virulence, are carried on plasmids. Whilst Illumina sequencing is highly accurate, it is generally unable to resolve complete genomic structures because of repetitive regions. Hybrid assembly uses long reads to scaffold together short-read contigs, maximising the benefits of both technologies. A major limiting factor to using hybrid assemblies at scale is the cost of sequencing the same isolate with two different technologies. Here we show that high-quality hybrid assemblies can be created for most isolates using significantly shorter run-times than are currently standard. We demonstrate that a simple washing step allows several libraries to be run on the same flowcell, making it possible to take advantage of shorter running times. Adding nuclease means that contamination between libraries is minimal and has no significant effect on the quality of subsequent hybrid assemblies. This approach reduces the cost of acquiring long reads by >30%, paving the way for large-scale studies utilising hybrid assemblies, which will likely significantly enhance our understanding of the genomics of important human pathogens.
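The cost argument above is simple arithmetic, and a minimal sketch may make it easier to adapt to local pricing. The function and parameter names below are hypothetical, and every price is a placeholder rather than a figure from this study. The sketch also assumes 12 barcoded isolates per library (so three successive libraries give the 36 isolates mentioned in the abstract) and one nuclease wash between consecutive libraries.

```python
def per_isolate_cost(flowcell_cost, library_cost, wash_cost,
                     barcodes_per_library, libraries_per_flowcell):
    """Long-read cost per isolate when several libraries share one flowcell.

    All inputs are user-supplied assumptions; none are taken from the study.
    """
    if barcodes_per_library < 1 or libraries_per_flowcell < 1:
        raise ValueError("need at least one barcode and one library")
    # One wash is needed between each pair of consecutive libraries.
    washes = libraries_per_flowcell - 1
    total = (flowcell_cost
             + library_cost * libraries_per_flowcell
             + wash_cost * washes)
    isolates = barcodes_per_library * libraries_per_flowcell
    return total / isolates


def relative_saving(baseline, alternative):
    """Fractional cost reduction of `alternative` relative to `baseline`."""
    return 1.0 - alternative / baseline


# Placeholder prices in arbitrary currency units (illustrative only).
prices = dict(flowcell_cost=600.0, library_cost=120.0, wash_cost=30.0,
              barcodes_per_library=12)

single = per_isolate_cost(libraries_per_flowcell=1, **prices)
triple = per_isolate_cost(libraries_per_flowcell=3, **prices)
print(f"one library per flowcell:   {single:.2f} per isolate")
print(f"three libraries per flowcell: {triple:.2f} per isolate")
print(f"relative saving: {relative_saving(single, triple):.0%}")
```

The saving printed here depends entirely on the placeholder prices and should not be read as a reproduction of the >30% figure reported above. Substituting real consumable costs shows how the fixed flowcell cost is spread across more isolates as libraries are added.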
Cold Spring Harbor Laboratory