Regret bounds for sleeping experts and bandits

Kleinberg, Robert; Niculescu-Mizil, Alexandru; Sharma, Yogeshwer
September 2010
Machine Learning;Sep2010, Vol. 80 Issue 2/3, p245
Academic Journal
We study on-line decision problems where the set of actions available to the decision algorithm varies over time. With a few notable exceptions, such problems have remained largely unaddressed in the literature, despite their applicability to a large number of practical problems. Departing from previous work on this “Sleeping Experts” problem, we compare algorithms against the payoff obtained by the best ordering of the actions, which is a natural benchmark for this type of problem. We study both the full-information (best expert) and partial-information (multi-armed bandit) settings, and consider both stochastic and adversarial reward models. For all settings we give algorithms achieving (almost) information-theoretically optimal regret bounds (up to a constant or a sub-logarithmic factor) with respect to the best-ordering benchmark.
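To make the setting concrete, here is a minimal sketch of a Hedge-style full-information strategy restricted each round to the currently "awake" (available) actions. This is an illustration of the sleeping-experts setting only, not the paper's algorithms (which compete with the best ordering of actions); the awake-set probability and the random payoffs are stand-in assumptions.

```python
import math
import random

def sleeping_hedge(n_experts, rounds, eta=0.5, seed=0):
    """Multiplicative-weights play over a time-varying awake set.

    Illustrative sketch with assumed availability/payoff oracles; the
    paper's benchmark (best ordering of actions) and algorithms differ.
    """
    rng = random.Random(seed)
    w = [1.0] * n_experts
    total_payoff = 0.0
    for _ in range(rounds):
        # The set of available ("awake") actions varies over time.
        awake = [i for i in range(n_experts) if rng.random() < 0.7]
        if not awake:
            continue  # no action available this round
        # Renormalize weights over the awake experts only.
        z = sum(w[i] for i in awake)
        probs = [w[i] / z for i in awake]
        choice = rng.choices(awake, weights=probs)[0]
        # Full information: payoffs of all awake experts are revealed.
        payoffs = {i: rng.random() for i in awake}  # stand-in rewards in [0, 1]
        total_payoff += payoffs[choice]
        # Multiplicative update applied to awake experts only.
        for i in awake:
            w[i] *= math.exp(eta * payoffs[i])
    return total_payoff
```

In the bandit variant studied in the paper, only the chosen action's payoff would be observed, so the update would need an importance-weighted or confidence-bound estimate instead of the full payoff vector.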


Related Articles

  • Classification with guaranteed probability of error. Campi, Marco // Machine Learning;Jul2010, Vol. 80 Issue 1, p63 

    We introduce a general-purpose classifier that we call the Guaranteed Error Machine, or GEM, and two learning algorithms that are used for the training of GEM, a real GEM algorithm and an ideal GEM algorithm. The real GEM algorithm is for use in real applications, while the ideal GEM algorithm...

  • Extracting certainty from uncertainty: regret bounded by variation in costs. Hazan, Elad; Kale, Satyen // Machine Learning;Sep2010, Vol. 80 Issue 2/3, p165 

    Prediction from expert advice is a fundamental problem in machine learning. A major pillar of the field is the existence of learning algorithms whose average loss approaches that of the best expert in hindsight (in other words, whose average regret approaches zero). Traditionally the regret of...

  • Efficient Leave-m-out Cross-Validation of Support Vector Regression by Generalizing Decremental Algorithm. Karasuyama, Masayuki; Takeuchi, Ichiro; Nakano, Ryohei // New Generation Computing;2009, Vol. 27 Issue 4, p307 

    We propose a computationally efficient method for cross-validation of Support Vector Regression (SVR) by generalizing the decremental algorithm of SVR. Incremental and decremental algorithms of Support Vector Machines (SVM) efficiently update the trained SVM model when a single data point is...

  • Bounded Kernel-Based Online Learning. Orabona, Francesco; Keshet, Joseph; Caputo, Barbara // Journal of Machine Learning Research;11/1/2009, Vol. 10 Issue 11, p2643 

    A common problem of kernel-based online algorithms, such as the kernel-based Perceptron algorithm, is the amount of memory required to store the online hypothesis, which may increase without bound as the algorithm progresses. Furthermore, the computational load of such algorithms grows linearly...

  • Large Scale Online Learning of Image Similarity Through Ranking. Chechik, Gal; Sharma, Varun; Shalit, Uri; Bengio, Samy // Journal of Machine Learning Research;3/1/2010, Vol. 11 Issue 3, p1109 

    Learning a measure of similarity between pairs of objects is an important generic problem in machine learning. It is particularly useful in large scale applications like searching for an image that is similar to a given image or finding videos that are relevant to a given video. In these tasks,...

  • Machine Learning with Data Dependent Hypothesis Classes. Cannon, Adam; Ettinger, J. Mark; Hush, Don; Scovel, Clint // Journal of Machine Learning Research;Summer2002, Vol. 2 Issue 3, p335 

    We extend the VC theory of statistical learning to data dependent spaces of classifiers. This theory can be viewed as a decomposition of classifier design into two components; the first component is a restriction to a data dependent hypothesis class and the second is empirical risk minimization...

  • Introduction to the Special Issue on Computational Learning Theory. Long, Philip M. // Journal of Machine Learning Research;Apr2003, Vol. 3 Issue 3, p361 

    Introduces articles in the special issue of the 'Journal of Machine Learning Research' on computational learning theory. 'Tracking a Small Set of Experts by Mixing Past Posteriors'; 'Using Confidence Bounds for Exploitation-Exploration Tradeoffs.'

  • On Boosting with Polynomially Bounded Distributions. Bshouty, Nader H.; Gavinsky, Dmitry // Journal of Machine Learning Research;Apr2003, Vol. 3 Issue 3, p483 

    We construct a framework which allows an algorithm to turn the distributions produced by some boosting algorithms into polynomially smooth distributions (w.r.t. the PAC oracle's distribution), with minimal performance loss. Further, we explore the case of Freund and Schapire's AdaBoost...

  • NP-hardness of Euclidean sum-of-squares clustering. Aloise, Daniel; Deshpande, Amit; Hansen, Pierre; Popat, Preyas // Machine Learning;May2009, Vol. 75 Issue 2, p245 

    A recent proof of NP-hardness of Euclidean sum-of-squares clustering, due to Drineas et al. (Mach. Learn. 56:9–33, ), is not valid. An alternate short proof is provided.

