Abstract: Deep Reinforcement Learning has proven successful on many difficult control problems by automatically learning policies represented as neural networks. However, understanding the decision rules of a learned policy is limited by the inability to interpret the black-box policy model. We propose the Deep Symbolic Policy (DSP) approach to learn interpretable symbolic policies in a given environment. We use an RNN to generate the mathematical expression of the policy and employ reinforcement learning to train the RNN to maximize the performance of the generated policy, evaluated by rolling out episodes with that policy. Additionally, we propose a method to reliably scale DSP to environments with multiple actions by combining generated symbolic policies with policies distilled from a pre-trained deep-RL algorithm.
We compare the performance of the generated policies with state-of-the-art deep-RL algorithms and analyze interpretability and stability on each control task. By applying our method to several classical control tasks, we show that it produces fully explainable and stable symbolic policies that reliably capture the mathematical dynamics of the tasks.
Symbolic policies discovered by our approach outperform, or perform at least as well as, the deep-RL baselines.
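The outer loop described above can be sketched in miniature. This is a hypothetical illustration, not the paper's implementation: the RNN generator is replaced by a fixed pool of candidate expressions, and the environment is a toy 1-D stabilization task; in DSP, the rollout return of each candidate would serve as the reward signal for training the generator.

```python
import math

# Hypothetical sketch of the DSP outer loop: candidate symbolic policies
# (stand-ins for RNN-generated expressions) are each scored by rolling out
# episodes, and the episode return would reward the generator.

def rollout(policy, steps=200):
    """Roll out a toy 1-D stabilization task: drive state x toward 0."""
    x, total = 1.0, 0.0
    for _ in range(steps):
        a = max(-1.0, min(1.0, policy(x)))  # clip action to [-1, 1]
        x = x + 0.1 * a                     # simple linear dynamics
        total += -abs(x)                    # reward: stay near the origin
    return total

# Candidate symbolic policies, written as closed-form expressions of x.
candidates = {
    "-x":        lambda x: -x,
    "-tanh(2x)": lambda x: -math.tanh(2 * x),
    "sin(x)":    lambda x: math.sin(x),     # destabilizing candidate
}

scores = {name: rollout(f) for name, f in candidates.items()}
best = max(scores, key=scores.get)
```

After scoring, `best` names the highest-return expression; a full DSP loop would feed these returns back to train the RNN so that better expressions become more likely to be sampled.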