Efficiently Breaking the Curse of Horizon with Double Reinforcement Learning