Although convolutional neural networks (CNNs) are inspired by the mechanisms behind human visual systems, they diverge from human vision on many measures, such as ambiguity or hardness. In this paper, we make a surprising discovery: there exists a (nearly) universal score function for CNNs whose correlation with human visual hardness is statistically significantly stronger than that of the widely used model confidence. We term this function angular visual hardness (AVH); it is given by the normalized angular distance between a feature embedding and the classifier weights of the corresponding target category in a CNN. We conduct an in-depth scientific study of AVH. We observe that CNN models with the highest accuracy also have the best AVH scores. This agrees with an earlier finding that state-of-the-art models tend to improve on the classification of harder training examples. We find that AVH displays interesting dynamics during training: it quickly reaches a plateau even though the training loss keeps improving. This suggests the need for designing better loss functions that can target harder examples more effectively. Finally, we empirically show a significant improvement in performance when AVH is used as a measure of hardness in self-training methods for domain adaptation.
Speakers: Beidi Chen, Weiyang Liu, Zhiding Yu, Jan Kautz, Anshumali Shrivastava, Animesh Garg, Anima Anandkumar
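
To make the definition of AVH in the abstract concrete, the following is a minimal sketch in plain Python. The abstract does not spell out the normalization, so this sketch assumes the angle to the target class weights is divided by the sum of the angles to all class weights. The function names `angle` and `avh` and the toy two-class weights are hypothetical and chosen only for illustration.

```python
import math


def angle(u, v):
    """Return the angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] so floating-point drift cannot break acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def avh(embedding, class_weights, target):
    """Angular distance to the target class, normalized over all classes.

    Assumption: the normalizer is the sum of angles between the embedding
    and every class weight vector. Lower values mean an "easier" example.
    """
    angles = [angle(embedding, w) for w in class_weights]
    return angles[target] / sum(angles)


# Toy classifier with two classes in a 2-D feature space (illustrative only).
weights = [[1.0, 0.0], [0.0, 1.0]]

easy = avh([1.0, 0.1], weights, target=0)  # embedding close to class 0
hard = avh([1.0, 0.9], weights, target=0)  # embedding near the class boundary
```

Under this assumed normalization the score depends only on directions, not on the norm of the embedding, which is what distinguishes it from a softmax confidence computed on unnormalized logits.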