Access the full text.
Sign up today, get DeepDyve free for 14 days.
D. Haussler (1992)
Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning ApplicationsInf. Comput., 100
Jacob Abernethy (2010)
Can we learn to gamble efficiently
Sanjeev Arora, C. Lund, R. Motwani, M. Sudan, M. Szegedy (1992)
Proof verification and hardness of approximation problemsProceedings., 33rd Annual Symposium on Foundations of Computer Science
A. Kalai, Adam Klivans, Y. Mansour, R. Servedio (2005)
Agnostically learning halfspaces46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05)
N. Cesa-Bianchi, G. Lugosi (2006)
Prediction, learning, and games
(1979)
Computers and Intractability
Y. Freund, R. Schapire (1997)
A decision-theoretic generalization of on-line learning and an application to boosting
M. J. Kearns, R. E. Schapire, L. M. Sellie (1994)
Toward efficient agnostic learningMachine Learn., 17
Purushottam Kar, Bharath Sriperumbudur, Prateek Jain, H. Karnick (2013)
On the Generalization Ability of Online Learning Algorithms for Pairwise Loss FunctionsArXiv, abs/1305.2505
Shai Ben-David, D. Pál, S. Shalev-Shwartz (2009)
Agnostic Online Learning
N. Littlestone (1989)
From on-line to batch learning
E. Hazan, S. Kale, S. Shalev-Shwartz (2012)
Near-optimal algorithms for online matrix predictionProceedings of the 25th Annual Conference on Learning Theory, 23
Y. Freund, R. Schapire, Y. Singer, Manfred Warmuth (1997)
Using and combining predictors that specialize
M. Kearns, R. Schapire, Linda Sellie (1992)
Toward efficient agnostic learningMachine Learning, 17
A. Kalai, Varun Kanade, Y. Mansour (2012)
Reliable Agnostic LearningJ. Comput. Syst. Sci., 78
L. G. Valiant (1984)
A theory of the learnableCommun. ACM, 27
Varun Kanade, T. Steinke (2012)
Learning hurdles for sleeping expertsElectron. Colloquium Comput. Complex., TR11
(2012)
ACM Transactions on Computation Theory
Avrim Blum, Y. Mansour (2005)
From External to Internal Regret
J. Langford, Tong Zhang (2007)
The Epoch-Greedy algorithm for contextual multi-armed bandits
N. Cesa-Bianchi, A. Conconi, C. Gentile (2001)
On the generalization ability of on-line learning algorithmsIEEE Transactions on Information Theory, 50
C. Papadimitriou, M. Yannakakis (1991)
Optimization, approximation, and complexity classes
A. T. Kalai, V. Kanade, Y. Mansour (2009)
Reliable agnostic learningProceedings of the 22nd Annual Conference on Learning Theory.
Varun Kanade, H. McMahan, Brent Bryan (2009)
Sleeping Experts and Bandits with Stochastic Action Availability and Adversarial Rewards
S. Shalev-Shwartz, O. Shamir, Karthik Sridharan (2010)
Learning Kernel-Based Halfspaces with the Zero-One LossArXiv, abs/1005.3681
J. Håstad (2001)
Some optimal inapproximability resultsElectron. Colloquium Comput. Complex., TR97
Adam Klivans, R. Servedio (2001)
Learning DNF in time
Miroslav Dudík, Daniel Hsu, Satyen Kale, Nikos Karampatziakis, J. Langford, L. Reyzin, Tong Zhang (2011)
Efficient Optimal Learning for Contextual Bandits
Devdatt Dubhashi, A. Panconesi (2009)
Concentration of Measure for the Analysis of Randomized Algorithms
A. Beygelzimer, J. Langford, Lihong Li, L. Reyzin, R. Schapire (2010)
Contextual Bandit Algorithms with Supervised Learning Guarantees
Adam Klivans, Alexander Sherstov (2007)
A Lower Bound for Agnostically Learning Disjunctions
Adam Klivans, R. Servedio (2001)
Learning DNF in time 2 Õ(n 1/3 ) .
Robert Kleinberg, Alexandru Niculescu-Mizil, Yogeshwer Sharma (2010)
Regret bounds for sleeping experts and banditsMachine Learning, 80
A. T. Kalai, A. R. Klivans, Y. Mansour, R. A. Servedio (2005)
Agnostically learning halfspacesProceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. IEEE
David Editor
Artificial Intelligence and Language Processing a Theory of the Learnable
Elad Hazan, Satyen Kale, S. Shalev-Shwartz (2012)
Near-Optimal Algorithms for Online Matrix PredictionSIAM J. Comput., 46
J. Abernethy (2010)
Can we learn to gamble efficiently? (open problem)Proceedings of the 23rd Annual Conference on Learning Theory
We study the online decision problem in which the set of available actions varies over time, also called the sleeping experts problem. We consider the setting in which the performance comparison is made with respect to the best ordering of actions in hindsight. In this article, both the payoff function and the availability of actions are adversarial. Kleinberg et al. [2010] gave a computationally efficient no-regret algorithm in the setting in which payoffs are stochastic. Kanade et al. [2009] gave an efficient no-regret algorithm in the setting in which action availability is stochastic. However, the question of whether there exists a computationally efficient no-regret algorithm in the adversarial setting was posed as an open problem by Kleinberg et al. [2010]. We show that such an algorithm would imply an algorithm for PAC learning DNF, a long-standing important open problem. We also consider the setting in which the number of available actions is restricted and study its relation to agnostic-learning monotone disjunctions over examples with bounded Hamming weight.
ACM Transactions on Computation Theory (TOCT) – Association for Computing Machinery
Published: Jul 1, 2014
Keywords: Online learning
Read and print from thousands of top scholarly journals.
Already have an account? Log in
Bookmark this article. You can see your Bookmarks on your DeepDyve Library.
To save an article, log in first, or sign up for a DeepDyve account if you don’t already have one.
Copy and paste the desired citation format or use the link below to download a file formatted for EndNote
Access the full text.
Sign up today, get DeepDyve free for 14 days.
All DeepDyve websites use cookies to improve your online experience. They were placed on your computer when you launched this website. You can change your cookie settings through your browser.