Développement d'outils bioinformatiques et de méthodologies d'apprentissage machine pour une meilleure compréhension des éléments génétiques sous-jacents à la susceptibilité au cancer du sein

Authors: Lemaçon, Audrey
Advisor: Droit, Arnaud; Simard, Jacques
Abstract: Breast cancer is one of the leading causes of cancer death among Canadian women (about 1 in 8 Canadian women will develop breast cancer during their lifetime and 1 in 31 will die from the disease). Evidence suggests that most breast cancer cases develop in a small proportion of women with a genetic susceptibility to the disease. Since personalized assessment of this risk rests on the premise that women can be divided into several groups according to their inherent genetic risk, it is essential to identify the factors responsible for this genetic susceptibility to breast cancer in order to offer these at-risk women personalized preventive measures. Thus, since the discovery of the associated genes BRCA1 in 1994 and BRCA2 in 1995, tremendous efforts have been made to identify the genetic components underlying breast cancer risk, and many other deleterious mutations have been uncovered in susceptibility genes such as PTEN, PALB2 or CHEK2. Unfortunately, despite these efforts, the susceptibility genes and loci known to date explain only about half of the genetic risk associated with this disease. Acknowledging the challenges, many international groups have partnered in consortia such as the Breast Cancer Association Consortium (BCAC) or the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA) to pool their resources for the identification of what has been called the "missing heritability" of breast cancer. Several hypotheses have been formulated as to the sources of this missing heritability and, among these hypotheses, we have explored two. First, we tested the hypothesis that many common low-penetrance genetic variants remain to be discovered, through a large genome-wide association study conducted within the OncoArray Network.
In a second step, we tested the hypothesis that rarer variants of higher penetrance could be discovered in the coding regions of the genome, by evaluating the predictive power of these variants with an innovative approach to exome data analysis. We were able to support the first hypothesis through the discovery of 65 new loci associated with overall breast cancer susceptibility. In addition, because these studies highlighted the need for tools to assist prioritization analysis, we developed two software tools to help prioritize human genetic variants. Finally, we developed a new multi-step methodology combining the analysis of genotypes and haplotypes in order to assess the predictive power of coding variants. This approach, taking advantage of the power of machine learning, enabled the identification of new credible coding markers (variants alone or combined into haplotypes) significantly associated with the phenotype. For the susceptibility loci as well as for the candidate genes identified during the analysis of exome data, it will be essential to confirm their involvement and effect size on large external sample sets and then to perform their functional characterization. If they are validated, their integration into current risk prediction tools could help promote early management and well-calibrated therapeutic interventions for at-risk women.
Document Type: Doctoral thesis
Issue Date: 2019
Open Access Date: 10 July 2019
Permalink: http://hdl.handle.net/20.500.11794/35418
Grantor: Université Laval
Collection: Thèses et mémoires

Files in this item:
File                Size        Format
35148_Annexe.zip    22 MB       ZIP archive
35148.pdf           21.65 MB    Adobe PDF
All documents in CorpusUL are protected by the Copyright Act of Canada.