Authors: Xinting Hu, Yi Jiang, Kaihua Tang, Jingyuan Chen, Chunyan Miao, Hanwang Zhang Description: Real-world visual recognition requires handling the extreme sample imbalance in large-scale long-tailed data. We propose a “divide&conquer” strategy for the challenging LVIS task: divide the whole data into balanced parts and then apply incremental learning to conquer each one. This derives a novel learning paradigm: class-incremental few-shot learning, which is especially effective for the challenge evolving over time: 1) the class imbalance among the old class knowledge review and 2) the few-shot data in new-class learning. We call our approach Learning to Segment the Tail (LST). In particular, we design an instance-level balanced replay scheme, which is a memory-efficient approximation to balance the instance-level samples from the old-class images. We also propose to use a meta-module for new-class learning, where the module parameters are shared across incremental phases, gaining the learning-to-learn knowledge incrementally, from the data-rich head to the data-poor tail. We empirically show that: at the expense of a little sacrifice of head-class forgetting, we can gain a significant 8.3% AP improvement for the tail classes with less than 10 instances, achieving an overall 2.0% AP boost for the whole 1,230 classes.