Papers | Joshua Berkowitz

7 Articles

reinforcement learning

Rubrics As Rewards: Reinforcement Learning Beyond Verifiable Domains

When AI Doctors Need Better Report Cards. A future where AI is designed to help improve diagnostic medicine and even find rare diseases may be very close thanks to research from Scale AI. But what does ...

AI training healthcare AI interpretability machine learning reinforcement learning rubrics

Nov 2, 2025

How Direct Reasoning Optimization Teaches LLMs to Grade Their Own Thinking

Large language models have learned to reason well in math and coding thanks to reinforcement learning with verifiable rewards, where an answer can be checked automatically. Open-ended tasks like rewri...

chain-of-thought FinQA GRPO ParaRev R3 reinforcement learning RLVR

Nov 1, 2025

Code World Model: A 32B Agentic Coding LLM Grounded In Execution Traces

This article analyzes a Meta FAIR technical report introducing the Code World Model (CWM), a 32-billion-parameter decoder-only transformer trained to model program execution and agentic software engin...

agents code generation execution traces LLM reinforcement learning software engineering

Oct 7, 2025

PASS Puts Probabilities on Agentic Workflows for Safer, Adaptive Chest X-ray AI

Chest X-rays are fast, cheap, and ubiquitous, but reading them well demands careful multi-structure reasoning. The paper PASS introduces a multimodal agentic system that treats chest X-ray (CXR) analy...

agentic systems CXR medical AI multimodal radiology reinforcement learning

Aug 19, 2025

SmallThinker: Bringing Powerful Language Models to Local Devices

Researchers from Shanghai Jiao Tong University’s Institute of Parallel and Distributed Systems, the School of Artificial Intelligence, and Zenergize AI introduced SmallThinker: a family of large lang...

AI Models AI training reinforcement learning

Aug 16, 2025

TextArena Uses Competitive Gameplay to Advance AI

As language models quickly catch up with and surpass traditional benchmarks, the need for more effective measurement tools becomes urgent. TextArena steps in as an innovative, open-source platf...

agentic AI AI benchmarking LLM evaluation open source reinforcement learning soft skills text-based games TrueSkill

Jul 29, 2025

MiroMind-M1: Redefining Open-Source Mathematical Reasoning for AI

Open-source AI is entering a new phase, with MiroMind-M1 leading the charge in mathematical reasoning. This project goes beyond simply releasing models by offering full transparency: every model, data...

AI transparency CAMPO chain-of-thought large language models mathematical reasoning open-source AI reinforcement learning token efficiency

Jul 23, 2025
