Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation

ICML 2020