Exploring Text-Guided Synthetic Distribution Shifts for Robust Image Classification

Document Type

Conference Proceeding

Publication Date



The empirical risk minimization approach of contemporary machine learning can lead to failures under distribution shift. While out-of-distribution data can be used to probe for robustness issues, collecting such data at scale in the wild is difficult. We propose a novel method to generate this data using pretrained foundation models. We train a language model to generate class-conditioned image captions that minimize the cosine similarity between their embeddings and those of the corresponding class images from the original distribution. We then use these captions to synthesize new images with off-the-shelf text-to-image generative models. We demonstrate our method’s ability to generate samples from shifted distributions, and we evaluate the quality of the resulting data both for robustness testing and as additional training data to improve generalization.
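
The caption objective described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract says a language model is trained against this objective, whereas the sketch only scores a fixed set of candidate captions and picks the least similar one. The embeddings are hand-written toy vectors standing in for the outputs of some joint text-image encoder (the abstract does not name one), comparing against the mean class image embedding is an assumption, and the names `shift_score` and `most_shifted_caption` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def shift_score(caption_emb, class_image_embs):
    """Similarity between a caption embedding and the mean class image
    embedding. Lower means further from the original distribution.
    Using the mean is an assumption, not something the abstract states."""
    return cosine_similarity(caption_emb, mean_vector(class_image_embs))


def most_shifted_caption(candidates, class_image_embs):
    """Return the candidate caption with the lowest similarity score."""
    return min(
        candidates,
        key=lambda caption: shift_score(candidates[caption], class_image_embs),
    )


# Toy 2-D embeddings (made up for illustration only).
class_image_embs = [[1.0, 0.0], [0.8, 0.2]]
candidates = {
    "a photo of a dog": [1.0, 0.1],
    "a dog sketched in charcoal": [0.2, 1.0],
}

best = most_shifted_caption(candidates, class_image_embs)
print(best)
```

In the pipeline the abstract describes, the selected or generated caption would then be passed to an off-the-shelf text-to-image model; that step is omitted here because the abstract does not say which model is used.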