Agnostic Bayes

Authors: Lacoste, Alexandre
Advisor: Marchand, MarioLaviolette, François
Abstract: Machine learning is the science of learning from examples. Algorithms based on this approach are now ubiquitous. While there has been significant progress, this field presents important challenges. Namely, simply selecting the function that best fits the observed data was shown to have no statistical guarantee on the examples that have not yet been observed. There are a few learning theories that suggest how to address this problem. Among these, we present the Bayesian modeling of machine learning and the PAC-Bayesian approach to machine learning in a unified view to highlight important similarities. The outcome of this analysis suggests that model averaging is one of the key elements to obtain a good generalization performance. Specifically, one should perform predictions based on the outcome of every model instead of simply the one that best fits the observed data. Unfortunately, this approach comes with a high computational cost problem, and finding good approximations is the subject of active research. In this thesis, we present an innovative approach that can be applied with a low computational cost on a wide range of machine learning setups. In order to achieve this, we apply the Bayes’ theory in a different way than what is conventionally done for machine learning. Specifically, instead of searching for the true model at the origin of the observed data, we search for the best model according to a given metric. While the difference seems subtle, in this approach, we do not assume that the true model belongs to the set of explored model. Hence, we say that we are agnostic. An extensive experimental setup shows a significant generalization performance gain when using this model averaging approach during the cross-validation phase. Moreover, this simple algorithm does not add a significant computational cost to the conventional search of hyperparameters. Finally, this probabilistic tool can also be used as a statistical significance test to evaluate the quality of learning algorithms on multiple datasets.
Document Type: Thèse de doctorat
Issue Date: 2015
Open Access Date: 20 April 2018
Grantor: Université Laval
Collection:Thèses et mémoires

Files in this item:
31271.pdf6.57 MBAdobe PDFView/Open
All documents in CorpusUL are protected by Copyright Act of Canada.