Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning