Detecting AI Sabotage: Insights from the SHADE-Arena Project

As artificial intelligence becomes more powerful, ensuring these systems act in our best interests is more important than ever. Recent work from Anthropic, through the SHADE-Arena project, addresses ...

Tags: agentic behavior, AI alignment, AI safety, language models, monitoring tools, sabotage detection, SHADE-Arena