OPtions as REsponses: Grounding behavioural hierarchies in multi-agent reinforcement learning