Adaptation of Word-Level Benchmark Datasets for Relation-Level Metaphor Identification