Is In-Context Learning - Learning? Evidence From 1.89M Predictions
In-context learning (ICL) is the claim that an autoregressive large language model can learn a task from a handful of examples in its prompt, then generalize without updating weights. The paper Is In-...
Tags: chain-of-thought, formal languages, generalization, ICL, in-context learning, OOD, prompting
How Direct Reasoning Optimization Teaches LLMs to Grade Their Own Thinking
Large language models have learned to reason well in math and coding thanks to reinforcement learning with verifiable rewards, where an answer can be checked automatically. Open-ended tasks like rewri...
Tags: chain-of-thought, FinQA, GRPO, ParaRev, R3, reinforcement learning, RLVR
MiroMind-M1: Redefining Open-Source Mathematical Reasoning for AI
Open-source AI is entering a new phase, with MiroMind-M1 leading the charge in mathematical reasoning. This project goes beyond simply releasing models by offering full transparency: every model, data...
Tags: AI transparency, CAMPO, chain-of-thought, large language models, mathematical reasoning, open-source AI, reinforcement learning, token efficiency