On the Noisy Gradient Descent that Generalizes as SGD

ICML 2020