Authors: Naman Bansal, Chirag Agarwal, Anh Nguyen
Description: Attribution methods can provide powerful insights into the reasons for a classifier's decision. We argue that a key desideratum of an explanation is its robustness to changes in input hyperparameters, which are often randomly set or empirically tuned. High sensitivity to arbitrary hyperparameter choices not only impedes reproducibility but also calls into question the correctness of an explanation and impairs end-users' trust. In this paper, we provide a thorough empirical study of the sensitivity of existing attribution methods. We find an alarming trend: many methods are highly sensitive to changes in their common hyperparameters; for example, even changing a random seed can yield a different explanation. In contrast, explanations generated for robust classifiers, which are trained to be invariant to pixel-wise perturbations, are surprisingly more robust. Interestingly, such sensitivity is not reflected in the average explanation correctness scores over the entire dataset, as commonly reported in the literature.