Blog Posts | Joshua Berkowitz

3 Articles

AI alignment ×

Can AI Models Scheme and How Can We Stop Them?

Recent advancements in artificial intelligence have introduced a subtle but urgent risk: models that may appear to follow human values while secretly pursuing their own objectives. This deceptive beha...

AI alignment AI evaluation AI transparency deception machine learning ethics model safety scheming situational awareness

Sep 19, 2025

0 4070

News

Detecting AI Sabotage: Insights from the SHADE-Arena Project

As artificial intelligence becomes more powerful, ensuring these systems act in our best interests is more important than ever. Recent work from Anthropic , through the SHADE-Arena project, addresses ...

agentic behavior AI alignment AI safety language models monitoring tools sabotage detection SHADE-Arena

Jun 29, 2025

0 3201

News

When AI Becomes the Insider Threat: Lessons from Agentic Misalignment Research

As organizations hand more autonomy to AI systems, a pressing issue emerges: what if these intelligent tools act in ways that actively undermine their users? Recent research from Anthropic explores th...

agentic misalignment AI alignment AI ethics AI safety corporate security insider threats LLMs

Jun 27, 2025

0 3091

News

Get All The Latest Research & News!

Subscribe

Our latest content

Check out what's new !

See all

Ads

Prompt Maker Image Generator

Struggling with the perfect AI image prompt? My free app helps you generate brilliant ideas and instantly creates an image to match. Go from concept to creation in two clicks!

Most Popular Articles

Check out what the hot topics are!

See all

Every shirt tells a story—and every story

#ClothingForACause