How aligned are different alignment metrics?

Published: 2 March 2024

Abstract

In recent years, various methods and benchmarks have been proposed to empirically evaluate the alignment of artificial neural networks to human neural and behavioral data. But how aligned are different alignment metrics? To answer this question, we analyze visual data from Brain-Score (Schrimpf et al., 2018), including metrics from the model-vs-human toolbox (Geirhos et al., 2021), together with human feature alignment (Linsley et al., 2018; Fel et al., 2022) and human similarity judgements (Muttenthaler et al., 2022). We find that pairwise correlations between neural scores and behavioral scores are quite low and sometimes even negative. For instance, for the 95 models on Brain-Score that were fully evaluated on all 51 alignment metrics, the average pairwise correlation is only 0.161. Assuming that all of the employed metrics are sound, this implies that alignment with human perception may best be thought of as a multidimensional concept, with different methods measuring fundamentally different aspects. Our results underline the importance of integrative benchmarking, but also raise questions about how to correctly combine and aggregate individual metrics. Aggregating by taking the arithmetic average, as done in Brain-Score, leads to the overall performance currently being dominated by behavior (81.24% explained variance), while neural predictivity plays a less important role (only 67.31% explained variance). As a first step towards making sure that different alignment metrics all contribute towards aggregated scores, we therefore conclude by comparing three different aggregation options.
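The two quantities the abstract reports (an average pairwise correlation between metrics, and the share of variance in an averaged overall score explained by each group of metrics) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the scores are synthetic random numbers, not Brain-Score data; the group sizes are arbitrary; and "explained variance" is taken to be the squared Pearson correlation between a group mean and the overall score, which may differ from the computation used in the paper.

```python
import numpy as np


def mean_pairwise_correlation(scores: np.ndarray) -> float:
    """Average Pearson correlation over all distinct pairs of metric columns.

    scores has shape (n_models, n_metrics).
    """
    corr = np.corrcoef(scores, rowvar=False)      # (n_metrics, n_metrics)
    upper = np.triu_indices_from(corr, k=1)       # each pair once, no diagonal
    return float(corr[upper].mean())


def explained_variance(component: np.ndarray, aggregate: np.ndarray) -> float:
    """Squared Pearson correlation between a component score and the aggregate
    (assumed definition of 'explained variance' for this sketch)."""
    r = np.corrcoef(component, aggregate)[0, 1]
    return float(r ** 2)


# Hypothetical stand-in data: 95 models, 4 "neural" and 4 "behavioral" metrics.
# Behavioral scores are given a wider spread on purpose.
rng = np.random.default_rng(0)
n_models = 95
neural = rng.normal(loc=0.3, scale=0.02, size=(n_models, 4))
behavior = rng.normal(loc=0.4, scale=0.10, size=(n_models, 4))
scores = np.hstack([neural, behavior])

# Arithmetic-mean aggregation: average within each group, then across groups.
neural_mean = neural.mean(axis=1)
behavior_mean = behavior.mean(axis=1)
overall = (neural_mean + behavior_mean) / 2

avg_corr = mean_pairwise_correlation(scores)
ev_neural = explained_variance(neural_mean, overall)
ev_behavior = explained_variance(behavior_mean, overall)

print(f"mean pairwise correlation: {avg_corr:.3f}")
print(f"explained variance, neural:   {ev_neural:.3f}")
print(f"explained variance, behavior: {ev_behavior:.3f}")
```

In this toy setup the group with the wider spread across models accounts for most of the variance of the arithmetic mean, which is one way an averaged score can end up dominated by a single group of metrics.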

Authors

Jannis Ahlert, Thomas Klein, Felix A. Wichmann, Robert Geirhos

Venue

ICLR Workshop 2024
