Solving Tool Overload in AI Agents with Semantic Selection
Modern AI agents are integrating with rapidly expanding tool catalogs, sometimes numbering in the hundreds or thousands. This growth, while promising, introduces a substantial challenge: how can agent...
AI agents cost efficiency LLM performance open-source models scalability semantic selection tool routing

HELMET: Raising the Bar for Long-Context Language Model Evaluation
The rapid advancement of long-context language models (LCLMs) is transforming what AI can do, from digesting entire books to managing vast swaths of information in a single pass. Despite this progress...
AI benchmarks evaluation long-context models model-based evaluation open-source models retrieval-augmented generation summarization