Algorithmes d'apprentissage automatique pour la conception de composés pharmaceutiques et de vaccins [Machine learning algorithms for the design of pharmaceutical compounds and vaccines]
|Advisor:||Marchand, Mario; Corbeil, Jacques|
|Abstract:||The discovery of pharmaceutical compounds is currently too time-consuming and too expensive, and its failure rate is too high. Biochemical and genomic databases continue to grow, and it is no longer practical to interpret these data manually. A radical change is needed: some steps in this process must be automated. Peptides are molecules that play an important role in the immune system and in cell signaling. Their favorable properties make them prime candidates for initiating the design of new drugs and for assisting in the design of vaccines. In addition, modern synthesis techniques can generate these molecules quickly and at low cost. Statistical learning algorithms are well suited to handling large amounts of data and to learning models in an automated fashion. Together, these methods and peptides offer a compelling solution to the challenges facing pharmaceutical research. We propose a kernel for learning statistical models of biochemical phenomena involving peptides. Among other things, this kernel makes it possible to learn a universal model that can reasonably quantify the binding energy between any peptide sequence and any binding site of a protein. In addition, it unifies the theory of many existing string kernels while maintaining a low computational complexity. This kernel is particularly suitable for quantifying the interaction between antigens and proteins of the major histocompatibility complex. We provide a tool to predict peptides that are likely to be processed by the antigen presentation pathway. This tool won an international competition and has several applications in immunology, including vaccine design. Ultimately, a peptide should maximize the interaction with a target protein or maximize bioactivity in the host. We formalize this problem as a structured prediction problem. We then propose an algorithm that exploits the longest paths in a graph to identify the peptides that maximize the bioactivity predicted by a previously learned model. 
We validate this new approach in the laboratory with the discovery of new antimicrobial peptides. Finally, we provide PAC-Bayes bounds for two structured prediction algorithms, one of which is new.|
|Document Type:||Doctoral thesis|
|Open Access Date:||23 April 2018|
|Collection:||Theses and dissertations|
All documents in CorpusUL are protected by the Copyright Act of Canada.