Unlocking Speed, Smarts, and Savings: Claude Haiku 4.5 Raises the Bar for Small AI Models
AI is entering a new era where speed and affordability no longer come at the expense of intelligence. Anthropic’s Claude Haiku 4.5 exemplifies this shift, making high-level AI capabilities accessible ...
Tags: AI models, AI safety, Claude Haiku, coding performance, cost efficiency, developer tools, real-time AI

How a Handful of Malicious Documents Can Backdoor Massive AI Models
It might seem that poisoning a huge AI model would require corrupting a substantial portion of its training data. However, groundbreaking research reveals this isn’t the case. Experts from Anthropic, ...
Tags: adversarial machine learning, AI safety, AI security, backdoor attacks, data poisoning, large language models, model robustness, research

Rubrics as Rewards: A New Paradigm for Training Reliable AI
AI models face significant challenges when applied to nuanced, high-stakes fields like medicine and science. Standard training techniques, such as Reinforcement Learning from Human Feedback (RLHF), of...
Tags: AI safety, AI training, expert guidance, language models, model evaluation, RLHF, rubrics

New OpenAI Realtime API Now Available for Voice Agents
OpenAI's latest release, gpt-realtime, together with a revamped Realtime API, is redefining what's possible for voice agents. By directly processing and generating audio in a single model, this techn...
Tags: AI safety, developer tools, function calling, GPT-4o, OpenAI Realtime API, speech-to-speech, voice agents

Claude for Chrome: Anthropic’s Bold Step Toward Secure, Browser-Based AI
Anthropic is piloting Claude for Chrome, promising to streamline daily tasks while keeping safety at the forefront. By enabling Claude to interact with web pages, users could see major productivity boo...
Tags: AI safety, beta testing, browser security, Chrome extension, Claude AI, prompt injection, user permissions

Gemini Robotics On-Device: Bringing Advanced AI Directly to Robots
Imagine a world where robots react instantly, adapt to changing tasks, and operate independently of the cloud. Google DeepMind is turning this vision into reality with Gemini Robotics On-Devic...
Tags: AI safety, developer tools, Gemini Robotics, machine learning, on-device AI, robotic dexterity, robotics

Detecting AI Sabotage: Insights from the SHADE-Arena Project
As artificial intelligence becomes more powerful, ensuring these systems act in our best interests is more important than ever. Recent work from Anthropic, through the SHADE-Arena project, addresses ...
Tags: agentic behavior, AI alignment, AI safety, language models, monitoring tools, sabotage detection, SHADE-Arena

When AI Becomes the Insider Threat: Lessons from Agentic Misalignment Research
As organizations hand more autonomy to AI systems, a pressing issue emerges: what if these intelligent tools act in ways that actively undermine their users? Recent research from Anthropic explores th...
Tags: agentic misalignment, AI alignment, AI ethics, AI safety, corporate security, insider threats, LLMs

Ether0 Is Transforming Chemistry with AI-Powered Scientific Reasoning
ether0, FutureHouse's new open-source, 24-billion-parameter model, hints at a future where scientific breakthroughs are achieved faster thanks to AI models that excel at complex reasoning in fields li...
Tags: AI chemistry, AI safety, drug discovery, FutureHouse, molecular design, open source AI, reinforcement learning, scientific reasoning

Anthropic Launches Bug Bounty Program to Strengthen AI Safety Defenses
As artificial intelligence grows more advanced, ensuring its safe and ethical use is crucial. Anthropic is taking a bold step by launching a new bug bounty program, inviting top security experts to fi...
Tags: AI safety, bug bounty, Claude 3.7 Sonnet, Constitutional Classifiers, HackerOne, Responsible Scaling Policy, security research

Jailbreaking AI Chatbots: Understanding the Flaw and the Path to Safer AI
Imagine asking an AI chatbot for dangerous instructions and having it comply simply by rephrasing your request. This alarming scenario is all too real, as Princeton engineers have discovered a fundame...
Tags: AI ethics, AI safety, chatbots, cybersecurity, deep alignment, jailbreaking, large language models, Princeton research

Unlocking Accuracy in RAG: The Crucial Role of Sufficient Context
When it comes to reducing hallucinations and improving accuracy in large language models (LLMs), the focus is shifting from mere relevance to the concept of sufficient context. Rather than simply ret...
Tags: AI safety, Google Research, hallucinations, LLMs, RAG, retrieval systems, sufficient context