
Recent work (Feng et al., 2018) establishes the presence of short, uninterpretable input fragments that yield high confidence and accuracy in neural models. We refer to these as Minimal Prediction Preserving Inputs (MPPIs). In the context of question answering, we investigate competing hypotheses for the existence of MPPIs, including poor posterior calibration of neural models, lack of pretraining, and "dataset bias" (where a model learns to attend to spurious, non-generalizable cues in the training data). We discover a perplexing invariance of MPPIs to random training seed, model architecture, pretraining, and training domain. MPPIs demonstrate remarkable transferability across domains, achieving significantly higher performance than comparably short queries. Additionally, penalizing over-confidence on MPPIs fails to improve either generalization or adversarial robustness. These results suggest that the interpretability of MPPIs is insufficient to characterize the generalization capacity of these models. We hope this focused investigation encourages more systematic analysis of model behavior outside of the human-interpretable distribution of examples.
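
To make the notion of an MPPI concrete, the following is a minimal sketch of a greedy input-reduction loop: tokens are removed one at a time as long as the model's predicted label does not change. It is an illustration under stated assumptions, not the procedure used in the paper. The names `reduce_input` and `toy_predict` are hypothetical, the leave-one-out confidence criterion is a simple stand-in for whatever importance score a real implementation would use, and the toy predictor stands in for a trained question answering model.

```python
from typing import Callable, List, Tuple

# A predictor maps a token list to (predicted label, confidence).
Predictor = Callable[[List[str]], Tuple[str, float]]


def reduce_input(tokens: List[str], predict: Predictor) -> List[str]:
    """Greedily drop tokens while the predicted label is preserved.

    Assumption: at each step we remove the token whose deletion keeps the
    original label with the highest confidence (leave-one-out criterion).
    """
    original_label, _ = predict(tokens)
    current = list(tokens)
    while len(current) > 1:
        best = None  # (confidence, index) of the best removable token
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            label, confidence = predict(candidate)
            if label == original_label and (best is None or confidence > best[0]):
                best = (confidence, i)
        if best is None:
            # Every single-token removal changes the prediction: stop here.
            break
        del current[best[1]]
    return current


def toy_predict(tokens: List[str]) -> Tuple[str, float]:
    """Hypothetical stand-in model that relies on a single cue word."""
    if "france" in (t.lower() for t in tokens):
        return "Paris", 0.9
    return "unknown", 0.5


if __name__ == "__main__":
    question = "What is the capital of France ?".split()
    print(reduce_input(question, toy_predict))
```

With this toy predictor, the question reduces to the single token `France`: the prediction is preserved, but the remaining input is no longer a recognizable question.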

*Equal Contributions
