Gradients are Not All You Need (Machine Learning Research Paper Explained)

Gradients are Not All You Need (Machine Learning Research Paper Explained)

Nov 22, 2021
|
33 views
Details
#deeplearning #backpropagation #simulation More and more systems are made differentiable, which means that accurate gradients of these systems' dynamics can be computed exactly. While this development has led to a lot of advances, there are also distinct situations where backpropagation can be a very bad idea. This paper characterizes a few such systems in the domain of iterated dynamical systems, often including some source of stochasticity, resulting in chaotic behavior. In these systems, it is often better to use black-box estimators for gradients than computing them exactly. OUTLINE: 0:00 - Foreword 1:15 - Intro & Overview 3:40 - Backpropagation through iterated systems 12:10 - Connection to the spectrum of the Jacobian 15:35 - The Reparameterization Trick 21:30 - Problems of reparameterization 26:35 - Example 1: Policy Learning in Simulation 33:05 - Example 2: Meta-Learning Optimizers 36:15 - Example 3: Disk packing 37:45 - Analysis of Jacobians 40:20 - What can be done? 45:40 - Just use Black-Box methods Paper: https://arxiv.org/abs/2111.05803 Abstract: Differentiable programming techniques are widely used in the community and are responsible for the machine learning renaissance of the past several decades. While these methods are powerful, they have limits. In this short report, we discuss a common chaos based failure mode which appears in a variety of differentiable circumstances, ranging from recurrent neural networks and numerical physics simulation to training learned optimizers. We trace this failure to the spectrum of the Jacobian of the system under study, and provide criteria for when a practitioner might expect this failure to spoil their differentiation based optimization algorithms. Authors: Luke Metz, C. Daniel Freeman, Samuel S. Schoenholz, Tal Kachman Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yannic-kilcher LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

0:00 - Foreword 1:15 - Intro & Overview 3:40 - Backpropagation through iterated systems 12:10 - Connection to the spectrum of the Jacobian 15:35 - The Reparameterization Trick 21:30 - Problems of reparameterization 26:35 - Example 1: Policy Learning in Simulation 33:05 - Example 2: Meta-Learning Optimizers 36:15 - Example 3: Disk packing 37:45 - Analysis of Jacobians 40:20 - What can be done? 45:40 - Just use Black-Box methods
Comments
loading...