The Lab vs The Crowd: An Investigation into Data Quality for Neural Dialogue Models