Authors: Songan Zhang, Huei Peng, Subramanya Nageshrao, H. Eric Tseng Description: Deep reinforcement learning methods have been considered and implemented for autonomous vehicle's decision-making in recent years. A key issue is that deep neural networks can be fragile to adversarial attacks through unseen inputs, and thus the reinforcement learning policy, that uses deep neural networks would be also fragile to malicious attacks or benign but out of distribution perturbations. In this paper, we address the latter issue: we focus on generating socially acceptable perturbations (SAP), so that the autonomous vehicle (AV agent under evaluation), instead of the challenging vehicle (challenger), is primarily responsible for the crash. In our process, one challenger is added to the environment and trained by deep reinforcement learning to generate the desired perturbation. The reward is designed so that the challenger aims to fail the AV agent in a socially acceptable way. After training the challenger, the AV agent policy is evaluated in both the original naturalistic environment and the environment with one challenger. The results show that the AV agent policy which is safe in the naturalistic environment has many crashes in the perturbed environment.