Emergent Bartering Behaviour in Multi-Agent Reinforcement Learning

Advances in artificial intelligence often stem from the development of new environments that abstract real-world situations into a form where research can be done conveniently. This paper contributes such an environment based on ideas inspired by elementary microeconomics. Agents learn to produce resources in a spatially complex world, trade them with one another, and consume those that they prefer. We show that the emergent production, consumption, and pricing behaviors respond to environmental conditions in the directions predicted by supply and demand shifts in microeconomics. We also demonstrate settings where the agents' emergent prices for goods vary over space, reflecting the local abundance of goods in each region. After the price disparities emerge, some agents discover a niche of transporting goods between regions with different prevailing prices, a profitable strategy because they can buy goods where they are cheap and sell them where they are expensive. Finally, in a series of ablation experiments, we investigate how choices in the bartering actions, the agent architecture, and the ability to consume tradable goods can either aid or inhibit the emergence of this economic behavior. This work is part of the environment development branch of a research program that aims to build human-like artificial general intelligence through multi-agent interactions in simulated societies. By considering which environment features were needed for the basic phenomena of elementary microeconomics to emerge automatically from learning, we arrived at an environment that differs from those studied in prior multi-agent reinforcement learning work along several dimensions. For instance, the model incorporates heterogeneous tastes and physical abilities, and agents negotiate with one another as a grounded form of communication. To facilitate further work in this vein, we will release an open-source implementation of the environment as part of Melting Pot (Leibo 2021).