Streamlining CRISPR spacer-based bacterial host predictions to decipher the viral dark matter

Authors: Dion, Moïra; Plante, Pier-Luc; Zufferey, Edwige; Shah, Shiraz A.; Corbeil, Jacques; Moineau, Sylvain
Abstract: Thousands of new phages have recently been discovered thanks to viral metagenomics. These phages are extremely diverse and their genome sequences often do not resemble those of any known phages. To appreciate their ecological impact, it is important to determine their bacterial hosts. CRISPR spacers can be used to predict hosts of unknown phages, as spacers represent biological records of past phage-bacteria interactions. However, no guidelines have been established to standardize host prediction based on CRISPR spacers. Additionally, there are no tools that use spacers to perform host predictions on large viral datasets. Here, we developed a set of tools that includes all the necessary steps for predicting the hosts of uncharacterized phages. We created a database of more than 11 million spacers and a program to execute host predictions on large viral datasets. Our host prediction approach uses biological criteria inspired by how CRISPR-Cas systems naturally work as adaptive immune systems, which makes the results easy to interpret. We evaluated the performance using 9,484 phages with known hosts and obtained a recall of 49% and a precision of 70%. We also found that this host prediction method yielded higher performance for phages that infect gut-associated bacteria, suggesting it is well suited for gut-virome characterization.
Document Type: Research article
Issue Date: 2 March 2021
Open Access Date: 15 April 2021
Document version: VoR
Creative Commons Licence: https://creativecommons.org/licenses/by/4.0
Permalink: http://hdl.handle.net/20.500.11794/68819
This document was published in: Nucleic acids research (2021)
https://doi.org/10.1093/nar/gkab133
Information Retrieval Limited
Alternative version: 10.1093/nar/gkab133
Collection: Articles published in peer-reviewed journals

Files in this item:
NAR-03454-H-2020.R1_Proof_hi copie.pdf (1.51 MB, Adobe PDF)
All documents in CorpusUL are protected by the Copyright Act of Canada.