On the equivalence of weak learnability and linear separability: new relaxations and efficient boosting algorithms

Shalev-Shwartz, Shai; Singer, Yoram
Machine Learning, Sep 2010, Vol. 80 Issue 2/3, p141
Academic Journal
Boosting algorithms build highly accurate prediction mechanisms from a collection of low-accuracy predictors. To do so, they employ the notion of weak-learnability. The starting point of this paper is a proof which shows that weak learnability is equivalent to linear separability with ℓ1 margin. The equivalence is a direct consequence of von Neumann’s minimax theorem. Nonetheless, we derive the equivalence directly using Fenchel duality. We then use our derivation to describe a family of relaxations to the weak-learnability assumption that readily translates to a family of relaxations of linear separability with margin. This alternative perspective sheds new light on known soft-margin boosting algorithms and also enables us to derive several new relaxations of the notion of linear separability. Last, we describe and analyze an efficient boosting framework that can be used for minimizing the loss functions derived from our family of relaxations. In particular, we obtain efficient boosting algorithms for maximizing hard and soft versions of the ℓ1 margin.
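To make the abstract's central objects concrete — weak hypotheses combined into an ℓ1-weighted majority vote — here is a minimal sketch of classic AdaBoost over 1-D threshold stumps. This is not the framework derived in the paper (which generalizes it via Fenchel duality), just the standard boosting instance it relaxes; the function names and the toy dataset are illustrative assumptions.

```python
import numpy as np

def adaboost_stumps(X, y, rounds=20):
    """Toy AdaBoost over 1-D threshold stumps (illustrative sketch only)."""
    n = len(X)
    w = np.full(n, 1.0 / n)                      # distribution over examples
    xs = np.sort(X)
    thetas = np.concatenate(([xs[0] - 1.0], (xs[:-1] + xs[1:]) / 2.0))
    ensemble = []                                # list of (alpha, theta, sign)
    for _ in range(rounds):
        # Weak learner: exhaustively pick the stump of least weighted error.
        best = None
        for theta in thetas:
            for s in (+1, -1):
                pred = s * np.sign(X - theta)
                pred[pred == 0] = s              # break ties at the threshold
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, theta, s, pred)
        err, theta, s, pred = best
        if err >= 0.5:                           # no weak edge left: stop
            break
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))
        ensemble.append((alpha, theta, s))
        w = w * np.exp(-alpha * y * pred)        # up-weight the mistakes
        w /= w.sum()
    return ensemble

def predict(ensemble, X):
    score = sum(a * s * np.sign(X - t) for a, t, s in ensemble)
    return np.where(score >= 0, 1, -1)

# Toy data: an interval pattern no single stump can separate.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([-1, -1, 1, 1, -1, -1])
ensemble = adaboost_stumps(X, y, rounds=3)
```

Normalizing the coefficients alpha to sum to one exhibits the combined classifier as a point in the ℓ1 ball over the stump class, which is exactly the sense in which weak learnability yields linear separability with ℓ1 margin.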


Related Articles

  • A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data. Ando, Rie Kubota; Tong Zhang; Bartlett, Peter // Journal of Machine Learning Research;11/1/2005, Vol. 6 Issue 11, p1817 

    One of the most important issues in machine learning is whether one can improve the performance of a supervised learning algorithm by including unlabeled data. Methods that use both labeled and unlabeled data are generally referred to as semi-supervised learning. Although a number of such...

  • On boosting kernel density methods for multivariate data: density estimation and classification. Marzio, Marco; Taylor, Charles // Statistical Methods & Applications;2005, Vol. 14 Issue 2, p163 

    Statistical learning is emerging as a promising field where a number of algorithms from machine learning are interpreted as statistical methods and vice-versa. Due to good practical performance, boosting is one of the most studied machine learning techniques. We propose algorithms for...

  • Three New MDL-Based Pruning Techniques for Robust Rule Induction. Pham, D. T.; Afify, A. A. // Proceedings of the Institution of Mechanical Engineers -- Part C;Apr2006, Vol. 220 Issue 4, p553 

    Overfitting the training data is a major problem in machine learning, particularly when noise is present. Overfitting increases learning time and reduces both the accuracy and the comprehensibility of the generated rules, making learning from large data sets more difficult. Pruning is a...

  • Preference-based learning to rank. Ailon, Nir; Mohri, Mehryar // Machine Learning;Sep2010, Vol. 80 Issue 2/3, p189 

    This paper presents an efficient preference-based ranking algorithm running in two stages. In the first stage, the algorithm learns a preference function defined over pairs, as in a standard binary classification problem. In the second stage, it makes use of that preference function to produce...

  • Hedging Predictions in Machine Learning. Gammerman, Alexander // Computer Journal;2007, Vol. 50 Issue 2, p151 

    Recent advances in machine learning make it possible to design efficient prediction algorithms for data sets with huge numbers of parameters. This article describes a new technique for ‘hedging’ the predictions output by many such algorithms, including support vector machines,...

  • Data Preprocessing for Supervised Leaning. Kotsiantis, S. B.; Kanellopoulos, D.; Pintelas, P. E. // Enformatika;2006, Vol. 12, p277 

    Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present or noisy and unreliable data, then knowledge discovery during the training phase is...

  • Inferring latent task structure for Multitask Learning by Multiple Kernel Learning. Widmer, Christian; Toussaint, Nora C.; Altun, Yasemin; Ratsch, Gunnar // BMC Bioinformatics;2010 Supplement 8, Vol. 11, p1 

    Background: The lack of sufficient training data is the limiting factor for many Machine Learning applications in Computational Biology. If data is available for several different but related problem domains, Multitask Learning algorithms can be used to learn a model based on all available...

  • A study of the effect of different types of noise on the precision of supervised learning techniques. Nettleton, David F.; Orriols-Puig, Albert; Fornells, Albert // Artificial Intelligence Review;Apr2010, Vol. 33 Issue 4, p275 

    Machine learning techniques often have to deal with noisy data, which may affect the accuracy of the resulting data models. Therefore, effectively dealing with noise is a key aspect in supervised learning to obtain reliable models from data. Although several authors have studied the effect of...

  • Efficient greedy feature selection for unsupervised learning. Farahat, Ahmed; Ghodsi, Ali; Kamel, Mohamed // Knowledge & Information Systems;May2013, Vol. 35 Issue 2, p285 

    Reducing the dimensionality of the data has been a challenging task in data mining and machine learning applications. In these applications, the existence of irrelevant and redundant features negatively affects the efficiency and effectiveness of different learning algorithms. Feature selection...

