Upper Confidence Reinforcement Learning with Value Targeted Regression

ICML 2020