Contributed talk: Self-Imitation Learning via Trajectory-Conditioned Policy for Hard-Exploration Tasks

NeurIPS 2019