Should I tear down this wall? Optimizing social metrics by evaluating novel actions

One of the fundamental challenges of governance is deciding when and how to intervene in multi-agent systems in order to improve group-wide metrics of success. This is particularly challenging when the proposed interventions are novel and expensive. For example, one may wish to modify a building's layout to improve the efficiency of its escape route. Evaluating such interventions would generally require access to an elaborate simulator, which must be constructed ad hoc for each environment and can be prohibitively costly or inaccurate. Here we examine a simple alternative: Optimize By Observational Extrapolation (OBOE). The idea is to use observed behavioural trajectories, collected without any interventions, to learn predictive models mapping environment states to individual agent outcomes, and then to use these models to evaluate and select among candidate changes. We evaluate OBOE in socially complex gridworld environments and consider novel physical interventions that our models were not trained on. We show that neural network models trained to predict agent returns on baseline environments are effective at selecting among the interventions. Thus, OBOE can provide guidance for challenging questions like: "which wall should I tear down in order to minimize the Gini index of this group?"
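The selection step described above can be illustrated with a minimal sketch. It assumes a predictor is already available that maps an environment state to one predicted return per agent. The names `gini_index`, `select_intervention`, and `predict_returns`, as well as the representation of a candidate intervention as a modified state, are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import Any, Callable, Dict, Sequence, Tuple


def gini_index(returns: Sequence[float]) -> float:
    """Gini index of non-negative per-agent returns (0 means perfect equality)."""
    x = sorted(float(r) for r in returns)
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-based formula over the sorted values.
    weighted = sum(rank * value for rank, value in enumerate(x, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def select_intervention(
    predict_returns: Callable[[Any], Sequence[float]],
    candidates: Dict[str, Any],
    metric: Callable[[Sequence[float]], float] = gini_index,
) -> Tuple[str, Dict[str, float]]:
    """Score each candidate state with the metric and pick the lowest score.

    `predict_returns` is a hypothetical stand-in for a model trained only on
    observed baseline trajectories; `candidates` maps an intervention name to
    the (assumed) environment state that results from applying it.
    """
    scores = {name: metric(predict_returns(state)) for name, state in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Toy predictor (an assumption): each row of the state is one agent's
    # payoffs, and the predicted return is the row sum.
    toy_predictor = lambda state: [sum(row) for row in state]
    toy_candidates = {
        "remove_wall_a": [[1, 1], [1, 1]],
        "remove_wall_b": [[0, 0], [2, 2]],
    }
    print(select_intervention(toy_predictor, toy_candidates))
```

Here the metric is minimized because a lower Gini index means a more equal distribution of returns; for a metric that should be maximized, such as total return, the selection would use `max` instead.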

Authors' notes