Representations are crucial to the successful generalization of machine learning methods. In self-supervised learning, representations are learned without labels for use on unknown downstream tasks. What exactly constitutes a useful representation in this setting is difficult to specify formally. We show how ideas from causality allow us to formalize these intuitions. To this end, we provide a principled, theoretical treatment of data augmentations and proxy tasks in self-supervised learning. Using our causal representation learning framework, we relate several recent approaches to self-supervised learning, including CPC, AMDIM, and SimCLR. Further, we derive novel objectives based on invariant proxy prediction across augmentations and discuss their theoretical properties. Empirically, we find that our method performs competitively with recent leading approaches on image classification tasks on CIFAR100 and ImageNet.