How to build a better kill switch
In short, safe interruptibility means building a kill switch that the agent it is meant to control is unlikely to disable. Been hearing a lot about a ‘big red button’ in AI news this week? Here’s the paper that launched a thousand articles.













