On the training dynamics of deep networks with L2 regularization