Is In-Context Learning - Learning? Evidence From 1.89M Predictions

In-context learning (ICL) is the claim that an autoregressive large language model can learn a task from a handful of examples in its prompt, then generalize without updating weights. The paper Is In-...

Tags: chain-of-thought, formal languages, generalization, ICL, in-context learning, OOD, prompting
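The teaser's core claim is easiest to see in prompt form: the "training set" is just a few input-output pairs placed in the context window, and no gradient update ever happens. A minimal sketch of such a prompt builder, with an illustrative toy task that is not from the paper:

```python
# Few-shot in-context learning: the "training set" is a handful of
# input/output demonstrations placed directly in the prompt. No weights
# are updated; the model only sees this string at inference time.
def build_icl_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Format demonstrations plus a held-out query for the model to complete."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

# Illustrative task: string reversal.
demos = [("cat", "tac"), ("drift", "tfird"), ("lemon", "nomel")]
print(build_icl_prompt(demos, "planet"))
```

Whether completing such a prompt on held-out or out-of-distribution queries counts as learning, rather than retrieval of patterns absorbed during pretraining, is the question the 1.89M-prediction study probes.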
How Direct Reasoning Optimization Teaches LLMs to Grade Their Own Thinking

Large language models have learned to reason well in math and coding thanks to reinforcement learning with verifiable rewards, where an answer can be checked automatically. Open-ended tasks like rewri...

Tags: chain-of-thought, FinQA, GRPO, ParaRev, R3, reinforcement learning, RLVR
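The distinction the post turns on is between tasks whose answers can be graded mechanically and open-ended tasks with no ground truth to diff against. A minimal sketch of a verifiable reward for a short numeric answer (the normalization rules here are illustrative assumptions, not from the post):

```python
# A verifiable reward checks a model's final answer mechanically against a
# known ground truth. This is what makes RLVR workable for math and code.
def verifiable_reward(model_answer: str, ground_truth: str) -> float:
    """Return 1.0 on a match after light normalization, else 0.0."""
    def normalize(s: str) -> str:
        return s.strip().lower().replace(",", "").rstrip(".")
    return 1.0 if normalize(model_answer) == normalize(ground_truth) else 0.0

print(verifiable_reward("1,024", "1024"))       # 1.0: checkable automatically
print(verifiable_reward("about 1000", "1024"))  # 0.0
```

For open-ended outputs such as paragraph revision, no such checker exists, which is the gap that having the model grade its own reasoning is meant to fill.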
MiroMind-M1: Redefining Open-Source Mathematical Reasoning for AI

Open-source AI is entering a new phase, with MiroMind-M1 leading the charge in mathematical reasoning. This project goes beyond simply releasing models by offering full transparency: every model, data...

Tags: AI transparency, CAMPO, chain-of-thought, large language models, mathematical reasoning, open-source AI, reinforcement learning, token efficiency
Large Reasoning Models: Breakthroughs and Breaking Points in AI Problem-Solving

Artificial intelligence has made remarkable strides, and Large Reasoning Models (LRMs) are at the forefront of this revolution. These models promise to deliver more than just answers: they aim to repl...

Tags: AI research, artificial intelligence, benchmarking, chain-of-thought, large language models, model limitations, problem complexity, reasoning