This episode is an interview with Daniel Levy from Stanford University, discussing highlights from his paper, "Necessary and Sufficient Geometries for Gradient Methods," accepted as an oral presentation at the NeurIPS 2019 conference.
Daniel is a second-year Ph.D. student in Computer Science at Stanford University, advised by John Duchi. His research interests are in optimization and machine learning. Previously, he was an M.S. student at Stanford University advised by Stefano Ermon, working on probabilistic models and reinforcement learning. He completed his undergraduate studies at Lycée Louis-le-Grand and École Polytechnique, from which he obtained a B.S. and an M.S. in 2014 and 2015, respectively. He interned at Google Brain in 2017, where he worked with Jascha Sohl-Dickstein and Matt Hoffman.
Paper At A Glance:
We study the impact of the constraint set and gradient geometry on the convergence of online and stochastic methods for convex optimization, providing a characterization of the geometries for which stochastic gradient and adaptive gradient methods are (minimax) optimal. In particular, we show that when the constraint set is quadratically convex, diagonally pre-conditioned stochastic gradient methods are minimax optimal. We further provide a converse showing that when the constraints are not quadratically convex—for example, any ℓp-ball for p < 2—these methods are far from optimal. Based on this, we can provide concrete recommendations for when one should use adaptive, mirror, or stochastic gradient methods.
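To make the "diagonally pre-conditioned stochastic gradient" idea concrete, here is a minimal sketch of an AdaGrad-style diagonal pre-conditioner on a simple stochastic least-squares problem. This is an illustration of the general technique, not the paper's algorithm or experiments; all names (A, x_true, eta, eps) and problem parameters are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem: minimize f(x) = 0.5 * ||A x - b||^2,
# sampled one row at a time to produce stochastic gradients.
A = rng.normal(size=(100, 10))
x_true = rng.normal(size=10)
b = A @ x_true + 0.01 * rng.normal(size=100)

def stochastic_grad(x, i):
    # Gradient of the single-row loss 0.5 * (a_i . x - b_i)^2.
    a = A[i]
    return (a @ x - b[i]) * a

x = np.zeros(10)
G = np.zeros(10)        # running sum of squared gradients, per coordinate
eta, eps = 0.5, 1e-8    # illustrative step size and numerical floor

for t in range(5000):
    i = rng.integers(len(b))
    g = stochastic_grad(x, i)
    G += g * g
    # Diagonal pre-conditioning: each coordinate gets its own step size,
    # scaled by the history of that coordinate's gradient magnitudes.
    x -= eta * g / (np.sqrt(G) + eps)

print(np.linalg.norm(x - x_true))  # distance to the (near-)optimum shrinks
```

The per-coordinate step sizes are what distinguish this from plain SGD with a scalar step size; the paper's result can be read as characterizing when this diagonal scaling suffices for minimax optimality (quadratically convex constraint sets) and when it does not (e.g., ℓp-balls with p < 2).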