Unsloth Dynamic GGUFs: How Extreme Model Compression Outperforms AI Giants Compressing a large language model by 75% and still outperforming the latest releases from OpenAI and Anthropic is the promise of Unsloth Dynamic GGUFs. Their integration with the Aider Polyglot bench... Aider Polyglot benchmarking DeepSeek LLMs model compression open-source AI quantization Unsloth
Unleashing On-Device Agentic Power: How Fara-7B Transforms Human-Computer Interaction Microsoft Research’s Fara-7B is a small, open-weight agentic model that interacts with your device in a human-like way. It looks to fulfil the promise of having a digital assistant that doesn’t just un... agentic AI AI safety benchmarking on-device AI open source small language models synthetic data web automation
Toucan Dataset: Transforming AI Agents Into Digital Doers Toucan, a groundbreaking open-source dataset from IBM and the University of Washington, is crafted to propel tool-calling capabilities in large language models (LLMs) to new heights. For AI to move bey... AI agents API integration benchmarking large language models machine learning open source tool-calling Toucan dataset
OpenAI's GPT-5-Codex: The Next Evolution in AI-Powered Coding OpenAI has taken a bold step forward in the AI coding space by introducing GPT-5-Codex. This new release redefines what developers can expect from AI-powered coding assistants, offering new levels of ... AI coding benchmarking code review Codex GPT-5 OpenAI software development
AssetOpsBench Sets New Standards for AI in Industrial Asset Management Industrial asset management is undergoing a transformation as artificial intelligence agents are poised to take on complex tasks, from predictive maintenance to troubleshooting intricate machinery. At... AI agents asset management benchmarking failure analysis industrial automation LLM evaluation multi-agent systems open source
SciArena: Transforming How We Evaluate AI Models in Scientific Research Researchers face a growing challenge: staying current with the ever-expanding body of scientific literature. Foundation models offer promise in helping synthesize and analyze this vast information, bu... AI evaluation benchmarking crowdsourcing data quality foundation models leaderboard research tools scientific literature