Spurious normativity enhances learning of compliance and enforcement behavior in artificial agents


Mar 11, 2022
#deepmind #rl #society

This is an in-depth paper review, followed by an interview with the paper's authors! Society is ruled by norms, and most of these norms are very useful, such as washing your hands before cooking. However, plenty of social norms are essentially arbitrary, such as which hairstyles are acceptable or which words count as rude. These are called "silly rules". This paper uses multi-agent reinforcement learning to investigate why such silly rules exist. The results suggest a plausible mechanism: the presence of silly rules drastically speeds up the agents' acquisition of rule-enforcement skill, which generalizes well. A society with silly rules is therefore better at enforcing rules in general, and adapts faster when genuinely useful norms arise.

OUTLINE:
0:00 - Intro
3:00 - Paper Overview
5:20 - Why are some social norms arbitrary?
11:50 - Reinforcement learning environment setup
20:00 - What happens if we introduce a "silly" rule?
25:00 - Experimental Results: how silly rules help society
30:10 - Isolated probing experiments
34:30 - Discussion of the results
37:30 - Start of Interview
39:30 - Where does the research idea come from?
44:00 - What is the purpose behind this research?
49:20 - Short recap of the mechanics of the environment
53:00 - How much does such a closed system tell us about the real world?
56:00 - What do the results tell us about silly rules?
1:01:00 - What are these agents really learning?
1:08:00 - How many silly rules are optimal?
1:11:30 - Why do you have separate weights for each agent?
1:13:45 - What features could be added next?
1:16:00 - How sensitive is the system to hyperparameters?
1:17:20 - How to avoid confirmation bias?
1:23:15 - How does this play into progress towards AGI?
1:29:30 - Can we make real-world recommendations based on this?
1:32:50 - Where do we go from here?
Paper: https://www.pnas.org/doi/10.1073/pnas.2106028118
Blog: https://deepmind.com/research/publications/2021/Spurious-normativity-enhances-learning-of-compliance-and-enforcement-behavior-in-artificial-agents

Abstract: The fact that humans enforce and comply with norms is an important reason why humans enjoy higher levels of cooperation and welfare than other animals. Some norms are relatively easy to explain; they may prohibit obviously harmful or uncooperative actions. But many norms are not easy to explain. For example, most cultures prohibit eating certain kinds of foods, and almost all societies have rules about what constitutes appropriate clothing, language, and gestures. Using a computational model focused on learning, we show that apparently pointless rules can have an indirect effect on welfare. They can help agents learn how to enforce and comply with norms in general, improving the group's ability to enforce norms that have a direct effect on welfare.

Authors: Raphael Köster, Dylan Hadfield-Menell, Richard Everett, Laura Weidinger, Gillian K. Hadfield, Joel Z. Leibo

Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/2017636191

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
