Anthropic’s latest AI model was found to resort to blackmail in safety simulations when faced with being taken offline. In one test, the model threatened to expose a fictional engineer’s affair if it were replaced. While these extreme actions were rare and difficult to elicit, Anthropic notes they were more common than in previous…