Docker cagent: The Open-Source Multi-Agent AI Runtime In the rapidly evolving landscape of AI, building and deploying AI agents has often required navigating a maze of complex frameworks, managing multiple API integrations, and wrestling with configurati... AI Agents Artificial Intelligence Developer Tools DevOps Docker Enterprise AI Go LLM Marketing Automation MCP Multi-Agent Systems Open Source
QuArch Puts AI Agents to the Test on Computer Architecture Computer architecture is having an AI moment. Yet despite rapid progress in agentic tooling for coding and verification, hardware-centric knowledge remains stubbornly hard for language models to maste... AI benchmarking Artificial Intelligence computer architecture Computer Science
Introducing LiveMCPBench: Evaluating Models on Large Tool Set Usage A new arXiv preprint, LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools, from the Chinese Academy of Sciences and UCAS, introduces a benchmark to test AI agents in realistic tool-rich environme... AI benchmarking AI tools Artificial Intelligence MCP MCP Server
From Lab to Live: FunASR, the Open-Source Toolkit Bridging the Speech Recognition Gap In the vast landscape of open-source projects, some stand out not just for their technical elegance, but for their ambitious goal of democratizing a complex technology. FunASR is a "Fundamental End-to... Artificial Intelligence Open Source Voice technology
Introducing SyntheMol-RL: An Antibiotic Discovery Powered by Artificial Intelligence The escalating crisis of antibiotic resistance presents one of the most urgent threats to global public health, with millions of deaths linked to drug-resistant bacterial infections annually. Among th... Artificial Intelligence Chemistry drug discovery
Open Deep Search: An Open-Source Framework for Advanced AI Search In the rapidly evolving landscape of artificial intelligence, search technologies powered by large language models (LLMs) have become increasingly sophisticated, offering users more contextually relev... AI benchmarks Artificial Intelligence
HELMET: A Comprehensive Benchmark for Evaluating Long-Context Language Models Language models that can process and understand increasingly long texts, known as long-context language models (LCLMs), are unlocking a wide range of potential applications, from summarizing... AI benchmarks Artificial Intelligence
Microsoft Research Proposes Deep Integration of Computer Use Agents in AgentOS for Windows The automation of desktop applications has long been a goal for improving productivity, traditionally relying on rigid and fragile script-based Robotic Process Automation (RPA) systems. The emergence ... Artificial Intelligence Computer Science
Empowering LLMs with Tools For Drug Discovery This research introduces ChemCrow, an advanced Large Language Model (LLM) integrated with chemistry-specific computational tools designed to enhance the capability of handling complex chemistry tasks... Artificial Intelligence Chemistry