Silly Rules Enhance Learning of Compliance and Enforcement Behavior in Artificial Agents

Raphael Koster (DeepMind)

Dylan Hadfield-Menell (University of California, Berkeley)

Richard Everett (DeepMind)

Laura Weidinger (DeepMind)

Gillian Hadfield (University of Toronto)

Joel Leibo (DeepMind)

Abstract: How do societies learn and maintain social norms? Here we use multi-agent reinforcement learning to investigate the learning dynamics of enforcement and compliance behaviors. Artificial agents populate a foraging environment and need to learn to avoid a poisonous berry. Agents learn to avoid eating poisonous berries better when doing so is taboo, meaning the behavior is punished by other agents. The taboo helps overcome a credit-assignment problem in discovering delayed health effects. By probing what individual agents have learned, we demonstrate that normative behavior relies on a sequence of learned skills. Learning rule compliance builds upon prior learning of rule enforcement by other agents. Critically, introducing an additional taboo, which results in punishment for eating a harmless berry, further improves overall returns. This “silly rule” counterintuitively has a positive effect because it gives agents more practice in learning rule enforcement. Our results highlight the benefit of employing a computational model focused on learning to implement complex actions.
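The credit-assignment point in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' model, which uses multi-agent deep reinforcement learning. It is a hypothetical tabular TD(0) learner on a short chain of states that follow eating a poisonous berry. The function name `episodes_until_credit`, the reward magnitudes, the learning rate and the discount factor are all assumptions chosen for illustration. The function counts how many episodes pass before the state right after eating acquires a negative value estimate, with and without an immediate punishment standing in for the taboo.

```python
def episodes_until_credit(delay, taboo, alpha=0.5, gamma=1.0, max_episodes=1000):
    """Return the first episode in which the post-eating state gets a negative value.

    Toy model (an assumption, not the paper's environment):
    - State 0 is the step right after eating the berry.
    - The health cost (-1) arrives only on the last of `delay` steps.
    - If `taboo` is True, another agent punishes (-1) immediately at state 0.
    """
    # values[delay] is the terminal state and stays at 0.
    values = [0.0] * (delay + 1)
    for episode in range(1, max_episodes + 1):
        for s in range(delay):
            reward = 0.0
            if s == delay - 1:
                reward -= 1.0  # delayed health cost
            if taboo and s == 0:
                reward -= 1.0  # immediate punishment for breaking the taboo
            # Standard one-step TD(0) update toward the bootstrapped target.
            target = reward + gamma * values[s + 1]
            values[s] += alpha * (target - values[s])
        if values[0] < 0.0:
            return episode
    return None


if __name__ == "__main__":
    print(episodes_until_credit(delay=5, taboo=False))
    print(episodes_until_credit(delay=5, taboo=True))
```

In this toy setting the delayed cost has to propagate backward one state per episode before it reaches the eating step, whereas the immediate punishment is felt in the first episode. This mirrors, in a much simplified form, why a punished taboo could make the harmful behavior easier to learn to avoid.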
