Human-centred mechanism design with Democratic AI

Building artificial intelligence (AI) that aligns with human values is an unsolved problem. Here, we developed a human-in-the-loop research pipeline called Democratic AI, in which deep reinforcement learning is used to design a social mechanism that humans will vote for by majority. A large group of humans played an online investment game that involved deciding whether to keep an endowment or share it for collective benefit. Shared revenue was returned to players under two different redistribution mechanisms, one designed by the AI and the other by humans. The AI discovered a mechanism that redressed initial wealth imbalance, sanctioned free riders, and successfully won the majority vote. By optimizing for human preference directly, Democratic AI may be a promising method for value alignment.
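To make the game structure concrete, the following is a minimal sketch of one round of an investment game with a pluggable redistribution mechanism and a majority vote between two mechanisms. Everything specific here is an assumption introduced for illustration and is not taken from the text above: the pool multiplier, the two example mechanisms (`equal_split` and `relative_contribution`), and the rule that each player votes for whichever mechanism paid them more. Neither mechanism is the one the AI discovered.

```python
from typing import Callable, List

# Hypothetical growth factor applied to the shared pool (illustrative value only).
MULTIPLIER = 1.6

# A mechanism maps (endowments, contributions, pool) to per-player payouts.
Mechanism = Callable[[List[float], List[float], float], List[float]]


def equal_split(endowments: List[float], contributions: List[float], pool: float) -> List[float]:
    """Illustrative mechanism: every player receives the same share of the pool."""
    n = len(contributions)
    return [pool / n] * n


def relative_contribution(endowments: List[float], contributions: List[float], pool: float) -> List[float]:
    """Illustrative mechanism: payout proportional to the fraction of endowment shared."""
    fractions = [c / e if e > 0 else 0.0 for c, e in zip(contributions, endowments)]
    total = sum(fractions)
    if total == 0:
        return [0.0] * len(contributions)
    return [pool * f / total for f in fractions]


def play_round(endowments: List[float], contributions: List[float],
               mechanism: Mechanism, multiplier: float = MULTIPLIER) -> List[float]:
    """Payoff for each player: what they kept plus what the mechanism returns."""
    for e, c in zip(endowments, contributions):
        if not 0 <= c <= e:
            raise ValueError("each contribution must lie between 0 and the endowment")
    pool = multiplier * sum(contributions)
    payouts = mechanism(endowments, contributions, pool)
    return [e - c + p for e, c, p in zip(endowments, contributions, payouts)]


def majority_vote(payoffs_a: List[float], payoffs_b: List[float]) -> str:
    """Assumed voting rule: each player backs the mechanism that paid them more."""
    votes_a = sum(a > b for a, b in zip(payoffs_a, payoffs_b))
    votes_b = sum(b > a for a, b in zip(payoffs_a, payoffs_b))
    if votes_a > votes_b:
        return "A"
    if votes_b > votes_a:
        return "B"
    return "tie"


# One wealthy player and three poorer players; all share 2 units.
endowments = [10.0, 2.0, 2.0, 2.0]
contributions = [2.0, 2.0, 2.0, 2.0]

payoffs_equal = play_round(endowments, contributions, equal_split)
payoffs_relative = play_round(endowments, contributions, relative_contribution)
winner = majority_vote(payoffs_equal, payoffs_relative)  # "A" = equal split, "B" = relative
```

In this toy setup the three poorer players shared their whole endowment, so a mechanism that rewards relative rather than absolute contribution pays them more and wins the vote. It illustrates how redressing unequal starting wealth could attract majority support, without claiming anything about the mechanism actually learned.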

Authors' notes