Apprentissage automatique avec garanties de généralisation à l'aide de méthodes d'ensemble maximisant le désaccord
|Advisor:||Laviolette, François; Marchand, Mario|
|Abstract:||We focus on machine learning, a branch of artificial intelligence. When solving a classification problem, a learning algorithm is provided labelled data and has the task of learning a function that will be able to automatically classify future, unseen data. Many classical learning algorithms are designed to combine simple classifiers by building a weighted majority vote classifier out of them. In this thesis, we extend the usage of the C-bound, bound on the risk of the majority vote classifier. This bound is defined using two quantities : the individual performance of the voters, and the correlation of their errors (their disagreement). First, we design majority vote generalization bounds based on the C-bound. Then, we extend this bound from binary classification to generalized majority votes. Finally, we develop new learning algorithms with state-of-the-art performance, by constructing majority votes that maximize the voters’ disagreement, while controlling their individual performance. The generalization guarantees that we develop in this thesis are in the family of PAC-Bayesian bounds. We generalize the PAC-Bayesian theory by introducing a general theorem, from which the classical bounds from the literature can be recovered. Using this same theorem, we introduce generalization bounds based on the C-bound. We also simplify the proof process of PAC-Bayesian theorems, easing the development of new families of bounds. We introduce two new families of PAC-Bayesian bounds. One is based on a different notion of complexity than usual bounds, the Rényi divergence, instead of the classical Kullback-Leibler divergence. The second family is specialized to transductive learning, instead of inductive learning. The two learning algorithms that we introduce, MinCq and CqBoost, output a majority vote classifier that maximizes the disagreement between voters. An hyperparameter of the algorithms gives a direct control over the individual performance of the voters. These two algorithms being designed to minimize PAC-Bayesian generalization bounds on the risk of the majority vote classifier, they come with rigorous theoretical guarantees. By performing an empirical evaluation, we show that MinCq and CqBoost perform as well as classical stateof- the-art algorithms.|
|Document Type:||Thèse de doctorat|
|Open Access Date:||3 May 2018|
|Collection:||Thèses et mémoires|
All documents in CorpusUL are protected by Copyright Act of Canada.