Exploration Through Reward Biasing: Reward-Biased Maximum Likelihood Estimation in Stochastic Multi-Armed Bandits

ICML 2020