DeepSeek-R1 Is Redefining AI Reasoning Through Reinforcement Learning Reasoning underpins complex tasks like solving math problems, writing code, and making logical deductions. While recent LLMs have made headlines with their reasoning skills, these advances typically d... AI DeepSeek-R1 language models machine learning reasoning reinforcement learning safety STEM
AI Is Powering Gravitational Wave Detection and Cosmic Discovery Thanks to breakthrough advances in artificial intelligence, we are starting to be able to “hear” the universe’s faintest secrets. Google DeepMind’s Deep Loop Shaping method is now helping astronomers ... AI astrophysics DeepMind gravitational waves LIGO noise reduction reinforcement learning scientific discovery
Gemini 2.5 Deep Think: AI Achieves Gold-Level Performance at the ICPC World Finals Artificial intelligence continues to break new ground, and Gemini 2.5 Deep Think’s gold-level performance at the 2025 ICPC World Finals is a testament to how far machine problem-solving has come. This... AI breakthroughs artificial intelligence collaborative AI competitive programming Gemini ICPC problem solving reinforcement learning
Rethinking AI Collaboration: How CollabLLM Trains LLMs for Real Conversations While large language models (LLMs) have recently achieved remarkable feats in solving complex tasks, they often stumble in genuine, multi-turn conversations. Their typical training on isolated prompts... AI training collaboration human-AI interaction LLMs multi-turn dialogue reinforcement learning user-centric AI
PASS Puts Probabilities on Agentic Workflows for Safer, Adaptive Chest X-ray AI Chest X-rays are fast, cheap, and ubiquitous, but reading them well demands careful multi-structure reasoning. The paper PASS introduces a multimodal agentic system that treats chest X-ray (CXR) analy... agentic systems CXR medical AI multimodal radiology reinforcement learning
Jules: Google’s AI Code Reviewer Setting a New Standard for Quality Google is bringing you an AI collaborator that not only crafts code but also rigorously critiques its own output before you even see it. Google Developers have unveiled Jules, featuring a groundbreak... AI coding automated testing code review Google Developers Jules machine learning reinforcement learning software quality
SmallThinker: Bringing Powerful Language Models to Local Devices Researchers from Shanghai Jiao Tong University’s Institute of Parallel and Distributed Systems, the School of Artificial Intelligence, and Zenergize AI introduced SmallThinker: a family of large lang... AI Models AI training reinforcement learning
Gemini 2.5 Deep Think: The Next Leap in AI Problem Solving Artificial intelligence is evolving from simply providing answers to actively reasoning through complex problems. Google's latest Gemini 2.5 Deep Think update exemplifies this shift, offering Google A... AI AI safety coding Deep Think Gemini problem solving reinforcement learning research tools
Z.AI GLM-4.5: Redefining Unified AI Reasoning and Coding Innovation in artificial intelligence continues at an unprecedented pace, and GLM-4.5 is at the forefront of this evolution. Designed to unify reasoning, coding, and agentic functionalities, GLM-4.5 b... agentic AI AI benchmarks coding language models model architecture reasoning reinforcement learning
TextArena Uses Competitive Gameplay to Advance AI As language models quickly catch up with and surpass traditional benchmarks, the need for more effective measurement tools becomes urgent. TextArena steps in as an innovative, open-source platf... agentic AI AI benchmarking LLM evaluation open source reinforcement learning soft skills text-based games TrueSkill
New Qwen3-Coder Thrives in Agentic Coding and Developer Workflows Qwen3-Coder, the newest release from the Qwen team, is redefining what’s possible for agentic code models. Its flagship variant, Qwen3-Coder-480B-A35B-Instruct, leverages an impressive 480-billion par... AI coding APIs developer tools machine learning open source reinforcement learning software engineering
MiroMind-M1: Redefining Open-Source Mathematical Reasoning for AI Open-source AI is entering a new phase, with MiroMind-M1 leading the charge in mathematical reasoning. This project goes beyond simply releasing models by offering full transparency: every model, data... AI transparency CAMPO chain-of-thought large language models mathematical reasoning open-source AI reinforcement learning token efficiency