How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?

ACL 2020