Data Parameters: A New Family of Parameters for Learning a Differentiable Curriculum
Authors: Shreyas Saxena, Oncel Tuzel, Dennis DeCoste
Recent work has shown that learning from easier instances first can help deep neural networks (DNNs) generalize better. However, knowing which data to present at different stages of training is a challenging problem. In this work, we address this problem by introducing data parameters. More specifically, we equip each sample and class in a dataset with a learnable parameter (a data parameter), which governs its importance in the learning process. During training, at each iteration, as we update the model parameters, we also update the data parameters. These updates are done by gradient descent and do not require hand-crafted rules or design. When applied to image classification on the CIFAR10, CIFAR100, WebVision and ImageNet datasets, and to object detection on the KITTI dataset, learning a dynamic curriculum via data parameters leads to consistent gains, without any increase in model complexity or training time. When applied to a noisy dataset, the proposed method learns to learn from clean images and improves over state-of-the-art methods by 14%. To the best of our knowledge, our work is the first curriculum learning method to show gains on large-scale image classification and detection tasks.
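To make the idea of a learnable per-sample parameter concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes one possible parameterization that the abstract does not spell out: each sample owns a log-temperature `log_sigma`, the logits are divided by `sigma = exp(log_sigma)` before the softmax cross-entropy loss, and `log_sigma` is updated by gradient descent. The function names, the toy logits, and the learning rate are all hypothetical. The model is held fixed here (the logits never change) to isolate the data-parameter update, and class-level parameters, which would be handled analogously, are omitted.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def loss_and_grad(logits, label, log_sigma):
    """Cross-entropy of temperature-scaled logits, plus d(loss)/d(log_sigma).

    Assumption: the data parameter acts as a temperature, sigma = exp(log_sigma).
    """
    sigma = math.exp(log_sigma)
    probs = softmax([z / sigma for z in logits])
    loss = -math.log(probs[label])
    # Analytic gradient: (z_label - E_p[z]) / sigma.
    expected_logit = sum(p * z for p, z in zip(probs, logits))
    grad = (logits[label] - expected_logit) / sigma
    return loss, grad

def train_data_params(batch, steps=50, lr=0.1):
    # One learnable log-temperature per sample, initialised to 0 (sigma = 1).
    log_sigmas = [0.0] * len(batch)
    for _ in range(steps):
        for i, (logits, label) in enumerate(batch):
            _, grad = loss_and_grad(logits, label, log_sigmas[i])
            log_sigmas[i] -= lr * grad  # plain gradient descent step
    return log_sigmas

# Two toy samples with identical (fixed) logits but different labels.
batch = [
    ([4.0, 0.5, -1.0], 0),  # label agrees with the highest logit
    ([4.0, 0.5, -1.0], 2),  # label disagrees, e.g. a noisy annotation
]
log_sigmas = train_data_params(batch)
```

Under this assumed parameterization, the gradient of the loss with respect to the logits is scaled by `1 / sigma`. The sample whose label disagrees with the model ends up with a larger temperature, so it contributes a smaller gradient to the model; the sample the model already fits gets a smaller temperature. No hand-written scheduling rule is involved, only the gradient of the loss.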
Apple attended the 33rd Conference and Workshop on Neural Information Processing Systems (NeurIPS) in December. The conference took place in Vancouver, Canada from December 8th to 14th.
Apple product teams are engaged in state-of-the-art research in machine hearing, speech recognition, natural language processing, machine translation, text-to-speech, and artificial intelligence, improving the lives of millions of customers every day.
Most successful examples of neural nets today are trained with supervision. However, to achieve high accuracy, the training sets need to be large, diverse, and accurately annotated, which is costly. An alternative to labeling huge amounts of data is to use synthetic images from a simulator. This is cheap because there is no labeling cost, but the synthetic images may not be realistic enough, resulting in poor generalization on real test images. To help close this performance gap, we've developed a method for refining synthetic images to make them look more realistic. We show that training models on these refined images leads to significant improvements in accuracy on various machine learning tasks.