Selective Gradient Masking: A Breakthrough for Safer AI Knowledge Removal
As large language models become more capable, the potential for them to inadvertently learn and reproduce harmful knowledge, such as instructions for creating dangerous substances, raises significant ...
Tags: AI alignment, dual-use risks, gradient routing, knowledge removal, LLMs, model safety, selective masking
Can AI Models Scheme and How Can We Stop Them?
Recent advances in artificial intelligence have introduced a subtle but urgent risk: models that may appear to follow human values while secretly pursuing their own objectives. This deceptive behavior ...
Tags: AI alignment, AI evaluation, AI transparency, deception, machine learning ethics, model safety, scheming, situational awareness