Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels
Lu Jiang, Di Huang, Mason Liu, Weilong Yang
ICML 2020
https://arxiv.org/abs/1911.09781
Based on our findings, we have the following practical recommendations for training deep neural networks on noisy data:
1. A simple way to deal with noisy labels is fine-tuning a pretrained model. The better the pretrained model is, the better it may generalize on the downstream noisy training task.
2. Early stopping may not be effective on the real-world label noise from the web.
3. The real-world label noise from the web appears to be less harmful, yet it is more difficult to tackle.
4. Methods that perform well on synthetic noise may not work as well on the real-world noisy labels from the web.
5. The proposed MentorMix overcomes both synthetic and real-world noisy labels.
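
For readers unfamiliar with the "synthetic noise" that recommendations 4 and 5 contrast with web label noise, the sketch below shows one common way such noise is generated: a fixed fraction of labels is flipped to a different class chosen uniformly at random (often called symmetric noise). This is an illustration only. The function name `inject_symmetric_noise`, its parameters, and the toy data are assumptions made here, and the paper's own noise-construction protocol is not described in this summary.

```python
import random


def inject_symmetric_noise(labels, num_classes, noise_level, seed=0):
    """Return a copy of `labels` with a fixed fraction flipped to a wrong class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    num_flips = round(noise_level * len(noisy))
    # Pick which examples to corrupt, without replacement.
    flip_indices = rng.sample(range(len(noisy)), num_flips)
    for i in flip_indices:
        # Draw the replacement uniformly from the classes other than the true one,
        # so every selected example really ends up mislabeled.
        wrong_classes = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(wrong_classes)
    return noisy


# Toy data: 1000 examples spread evenly over 10 classes (an assumption).
clean = [i % 10 for i in range(1000)]
noisy = inject_symmetric_noise(clean, num_classes=10, noise_level=0.4)
observed = sum(c != n for c, n in zip(clean, noisy)) / len(clean)
print(f"observed noise level: {observed:.2f}")
```

Because the flipped examples are drawn uniformly and independently of the image content, this kind of noise is easy to control but may behave differently from mislabeled web images, which is consistent with recommendation 4.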