Evaluating a Bi-LSTM Model for Metaphor Detection in TOEFL Essays