Authors: Amir Hossein Farzaneh, Xiaojun Qi Description: Facial Expression Recognition (FER) has demonstrated remarkable progress due to the advancement of deep Convolutional Neural Networks (CNNs). FER's goal as a visual recognition problem is to learn a mapping from the facial embedding space to a set of fixed expression categories using a supervised learning algorithm. Softmax loss as the de facto standard in practice fails to learn discriminative features for efficient learning. Center loss and its variants as promising solutions increase deep feature discriminability in the embedding space and enable efficient learning. They fundamentally aim to maximize intra-class similarity and inter-class separation in the embedding space. However, center loss and its variants ignore the underlying extreme class imbalance in challenging wild FER datasets. As a result, they lead to a separation bias toward majority classes and leave minority classes overlapped in the embedding space. In this paper, we propose a novel Discriminant Distribution-Agnostic loss (DDA loss) to optimize the embedding space for extreme class imbalance scenarios. Specifically, DDA loss enforces inter-class separation of deep features for both majority and minority classes. Any CNN model can be trained with the DDA loss to yield well separated deep feature clusters in the embedding space. We conduct experiments on two popular large-scale wild FER datasets (RAF-DB and AffectNet) to show the discriminative power of the proposed loss function.