Linear Mode Connectivity in Multitask and Continual Learning

Catastrophic forgetting is the phenomenon in which the solution found for a subsequent task performs poorly on the previous ones. It arises, for example, in continual (sequential) training of neural networks, in contrast to multitask training, for which there exist minima that perform well on all tasks. Recent work has shown that, under appropriate conditions, minima are connected by very simple curves of low error, such as a polygonal chain. In this work, we investigate this connectivity and its implications for continual learning, where multiple tasks are involved. We show that multitask and continual minima are linearly connected by a path along which the loss remains roughly constant. Furthermore, we exploit this property to design an effective algorithm that constrains the sequential minima to lie on these low-loss paths. We show that our method outperforms several state-of-the-art continual learning algorithms on various computer vision benchmarks.
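
To make the notion of linear connectivity concrete, the following minimal sketch evaluates a loss along the straight line between two parameter vectors and reports the largest rise above the endpoints. It is an illustration only and not the algorithm proposed in this work: the helper names (`interpolate`, `loss_along_path`, `barrier`) and the two toy losses are assumptions chosen for the example, with plain NumPy vectors standing in for network weights.

```python
import numpy as np

def interpolate(w_a, w_b, alpha):
    """Point on the straight line from w_a (alpha=0) to w_b (alpha=1)."""
    return (1.0 - alpha) * w_a + alpha * w_b

def loss_along_path(loss_fn, w_a, w_b, num_points=11):
    """Evaluate loss_fn at evenly spaced points on the linear path."""
    alphas = np.linspace(0.0, 1.0, num_points)
    losses = np.array([loss_fn(interpolate(w_a, w_b, a)) for a in alphas])
    return alphas, losses

def barrier(loss_fn, w_a, w_b, num_points=11):
    """Largest excess of the path loss over the linearly interpolated
    endpoint losses; a value near zero suggests linear connectivity."""
    alphas, losses = loss_along_path(loss_fn, w_a, w_b, num_points)
    baseline = (1.0 - alphas) * losses[0] + alphas * losses[-1]
    return float(np.max(losses - baseline))

# Hypothetical toy losses, NOT taken from the paper.
def valley_loss(w):
    # Flat along the first coordinate: every point with w[1] == 0 is a minimum.
    return w[1] ** 2

def double_well_loss(w):
    # Two isolated minima at (-1, 0) and (1, 0) with a bump in between.
    return (w[0] ** 2 - 1.0) ** 2 + w[1] ** 2

w_a = np.array([-1.0, 0.0])
w_b = np.array([1.0, 0.0])

connected_barrier = barrier(valley_loss, w_a, w_b)          # flat path
disconnected_barrier = barrier(double_well_loss, w_a, w_b)  # bump at midpoint
```

In this toy setting the two minima of `valley_loss` are linearly connected (no barrier), whereas the same endpoints under `double_well_loss` are separated by a loss bump at the midpoint. For real networks, `loss_fn` would evaluate the model on held-out data for each task of interest.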
