Lipschitz Constrained Parameter Initialization for Deep Transformers

ACL 2020