Inspecting state-of-the-art performance and NLP metrics in image-based medical report generation