Silly Rules Enhance Learning of Compliance and Enforcement Behavior in Artificial Agents

Raphael Koster (DeepMind)
Dylan Hadfield-Menell (University of California, Berkeley)
Richard Everett (DeepMind)
Laura Weidinger (DeepMind)
Gillian Hadfield (University of Toronto)
Joel Leibo (DeepMind)

Abstract: How do societies learn and maintain social norms? Here we use multi-agent reinforcement learning to investigate the learning dynamics of enforcement and compliance behaviors. Artificial agents populate a foraging environment and need to learn to avoid a poisonous berry. Agents learn to avoid eating poisonous berries more reliably when doing so is taboo, that is, when the behavior is punished by other agents. The taboo helps overcome a credit-assignment problem in discovering delayed health effects. By probing what individual agents have learned, we demonstrate that normative behavior relies on a sequence of learned skills. Learning rule compliance builds upon prior learning of rule enforcement by other agents. Critically, introducing an additional taboo, which results in punishment for eating a harmless berry, further improves overall returns. This “silly rule” counterintuitively has a positive effect because it gives agents more practice in learning rule enforcement. Our results highlight the benefit of employing a computational model focused on learning to implement complex actions.
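The credit-assignment problem the abstract describes can be illustrated with a toy model. The sketch below is not the paper's actual environment or learning algorithm; it is a minimal two-armed bandit under assumed parameters (a +1 reward for any berry, a hypothetical -3 health cost, a simple tabular value update), where a poisonous berry's cost either arrives immediately (a taboo, enforced by punishment) or arrives 10 steps later and is naively credited to whatever action was taken then.

```python
import random

def run(immediate_punishment, steps=500, delay=10, seed=0):
    """Toy two-armed bandit: arm 0 = safe berry (+1), arm 1 = poisonous
    berry (+1 now, -3 health cost). With immediate_punishment the cost
    arrives on the same step; otherwise it arrives `delay` steps later
    and is credited to whatever arm is chosen at that later step,
    producing a credit-assignment failure. All numbers are illustrative
    assumptions, not values from the paper."""
    rng = random.Random(seed)
    q = [0.0, 0.0]          # estimated value of each berry
    alpha, eps = 0.1, 0.1   # learning rate, exploration rate
    pending = []            # (due_step, cost) for delayed poison effects
    for t in range(steps):
        # epsilon-greedy choice between the two berries
        a = rng.randrange(2) if rng.random() < eps else (0 if q[0] >= q[1] else 1)
        r = 1.0
        if a == 1:
            if immediate_punishment:
                r -= 3.0                       # taboo: punished right away
            else:
                pending.append((t + delay, -3.0))  # poison: cost lands later
        # delayed costs land on the *current* action, misattributing blame
        r += sum(c for (d, c) in pending if d == t)
        pending = [(d, c) for (d, c) in pending if d != t]
        q[a] += alpha * (r - q[a])
    return q

q_taboo = run(immediate_punishment=True)
q_no_taboo = run(immediate_punishment=False)
```

With immediate punishment the learner cleanly attributes the cost to the poisonous berry and its estimated value goes negative; with a 10-step delay the cost smears across whichever actions happen to follow, so the poisonous berry's estimated value stays inflated. This is only the credit-assignment piece of the story; the paper's further claim, that enforcement itself must be learned and that "silly rules" provide extra practice at it, requires the full multi-agent setting.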


Download the paper