Blog Posts | Joshua Berkowitz

29 Articles

reinforcement learning

Agent Lightning: Decoupled RL Training for Any AI Agent

Agent Lightning is a Microsoft Research project that turns existing agents into trainable systems with minimal code changes. Instead of rewriting your agent to fit a trainer loop, you attach a lightwe...

AI agents AutoGen DPO LangGraph OpenAI Agents reinforcement learning RLHF VERL vLLM

Oct 8, 2025

Github Repos

Code World Model: A 32B Agentic Coding LLM Grounded In Execution Traces

This article analyzes a Meta FAIR technical report introducing the Code World Model (CWM), a 32-billion-parameter decoder-only transformer trained to model program execution and agentic software engin...

agents code generation execution traces LLM reinforcement learning software engineering

Oct 7, 2025

Papers

DeepSeek-R1 Is Redefining AI Reasoning Through Reinforcement Learning

Reasoning underpins complex tasks like solving math problems, writing code, and making logical deductions. While recent LLMs have made headlines with their reasoning skills, these advances typically d...

AI DeepSeek-R1 language models machine learning reasoning reinforcement learning safety STEM

Sep 26, 2025

News

AI Is Powering Gravitational Wave Detection and Cosmic Discovery

Thanks to breakthrough advances in artificial intelligence, we are beginning to “hear” the universe’s faintest secrets. Google DeepMind’s Deep Loop Shaping method is now helping astronomers ...

AI astrophysics DeepMind gravitational waves LIGO noise reduction reinforcement learning scientific discovery

Sep 20, 2025

News

Gemini 2.5 Deep Think: AI Achieves Gold-Level Performance at the ICPC World Finals

Artificial intelligence continues to break new ground, and Gemini 2.5 Deep Think’s gold-level performance at the 2025 ICPC World Finals is a testament to how far machine problem-solving has come. This...

AI breakthroughs artificial intelligence collaborative AI competitive programming Gemini ICPC problem solving reinforcement learning

Sep 17, 2025

Gemini

Rethinking AI Collaboration: How CollabLLM Trains LLMs for Real Conversations

While large language models (LLMs) have recently achieved remarkable feats in solving complex tasks, they often stumble in genuine, multi-turn conversations. Their typical training on isolated prompts...

AI training collaboration human-AI interaction LLMs multi-turn dialogue reinforcement learning user-centric AI

Sep 3, 2025

News

PASS Puts Probabilities on Agentic Workflows for Safer, Adaptive Chest X-ray AI

Chest X-rays are fast, cheap, and ubiquitous, but reading them well demands careful multi-structure reasoning. The paper PASS introduces a multimodal agentic system that treats chest X-ray (CXR) analy...

agentic systems CXR medical AI multimodal radiology reinforcement learning

Aug 19, 2025

Papers

Jules: Google’s AI Code Reviewer Setting a New Standard for Quality

Google is bringing you an AI collaborator that not only crafts code but also rigorously critiques its own output before you even see it. Google Developers have unveiled Jules, featuring a groundbreak...

AI coding automated testing code review Google Developers Jules machine learning reinforcement learning software quality

Aug 18, 2025

News

SmallThinker: Bringing Powerful Language Models to Local Devices

Researchers from Shanghai Jiao Tong University’s Institute of Parallel and Distributed Systems, the School of Artificial Intelligence, and Zenergize AI introduced SmallThinker: a family of large lang...

AI Models AI training reinforcement learning

Aug 16, 2025

Papers

Gemini 2.5 Deep Think: The Next Leap in AI Problem Solving

Artificial intelligence is evolving from simply providing answers to actively reasoning through complex problems. Google's latest Gemini 2.5 Deep Think update exemplifies this shift, offering Google A...

AI AI safety coding Deep Think Gemini problem solving reinforcement learning research tools

Aug 1, 2025

Gemini

Z.AI GLM-4.5: Redefining Unified AI Reasoning and Coding

Innovation in artificial intelligence continues at an unprecedented pace, and GLM-4.5 is at the forefront of this evolution. Designed to unify reasoning, coding, and agentic functionalities, GLM-4.5 b...

agentic AI AI benchmarks coding language models model architecture reasoning reinforcement learning

Jul 30, 2025

News

TextArena Uses Competitive Gameplay to Advance AI

As language models quickly catch up with and surpass traditional benchmarks, the need for more effective measurement tools becomes urgent. TextArena steps in as an innovative, open-source platf...

agentic AI AI benchmarking LLM evaluation open source reinforcement learning soft skills text-based games TrueSkill

Jul 29, 2025

Papers
