Transformers generalize differently from in-context and in-weights information

Transformer models have a powerful dual ability to utilize two kinds of information: information stored in the weights during training, and information provided in the context at inference time (known as "in-context learning"). However, it has been unknown whether generalization from in-weights vs. in-context information exhibits similar inductive biases. In this work, we show that transformers exhibit different inductive biases in these two modes. When transformers are meta-trained for few-shot learning from context, they are biased towards exemplar-based generalization from in-context information; in contrast, they are biased towards sparse, rule-based extrapolation when generalizing from in-weights information. However, large transformer models pretrained on language exhibit partially rule-based generalization even from novel in-context information. Finally, we show that in-context learning can be pushed towards rule-based generalization by changing the training data, providing a potential explanation for the behavior of language models. In-context learning is now ubiquitously used to efficiently impart task specifications to large pretrained models; understanding how such models generalize from context (and how to shape that generalization through the training data) is therefore of significant practical consequence.
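The exemplar-based vs. rule-based distinction can be made concrete with a toy sketch (not from the paper, and not its actual stimuli or models): stimuli with two binary features, `shape` and `color`, where only `shape` fully predicts the label during training and one feature combination is held out. A rule-based learner extrapolates along the predictive feature; an exemplar-based learner generalizes by similarity to stored training items, so the non-predictive feature still pulls its prediction.

```python
# Toy illustration of exemplar-based vs. rule-based generalization under a
# partial-exposure setup (hypothetical stimuli, chosen for illustration).
# Only `shape` fully predicts the label; `color` is correlated with it.

TRAIN = {          # (shape, color) -> label
    (0, 0): 0,
    (0, 1): 0,     # shape=0 is seen with both colors, so shape is the rule
    (1, 1): 1,     # shape=1 is only ever seen with color=1
}

def rule_predict(x):
    """Rule-based learner: extrapolate along the predictive feature."""
    shape, _ = x
    return shape   # label equals shape on every training point

def exemplar_predict(x):
    """Exemplar-based learner: inverse-distance-weighted vote over stored
    training exemplars; similarity on ALL features counts, including color."""
    scores = {0: 0.0, 1: 0.0}
    for ex, label in TRAIN.items():
        dist = sum(a != b for a, b in zip(x, ex))  # Hamming distance
        if dist == 0:
            return label                           # exact match wins outright
        scores[label] += 1.0 / dist
    return max(scores, key=scores.get)

# Held-out combination: shape=1 paired with the never-seen color=0.
held_out = (1, 0)
print(rule_predict(held_out))      # -> 1 (follows the shape rule)
print(exemplar_predict(held_out))  # -> 0 (pulled toward similar exemplars)
```

On the held-out item the two learners disagree: the exemplar vote scores label 0 at 1/1 + 1/2 = 1.5 (similarity to the color-matching exemplars) against 1.0 for label 1, while the rule learner outputs 1. Measuring which prediction a trained transformer makes on such held-out combinations is one way to probe its inductive bias.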