A new convergent variant of Q-learning with linear function approximation

NeurIPS 2020