Understanding the Impact of Experiment Design for Evaluating Dialogue System Output

ACL 2020