Smolagents is Hugging Face's library that proves you can build sophisticated AI agents with just a few lines of code. The name says it all: this is a "smol" (small) library with an outsized impact on how we think about agentic AI systems.
The concept of AI agents has captured the imagination of the world for personal and professional applications. In these agentic programs, large language model (LLM) outputs control the workflow, enabling systems that can reason, plan, and take actions autonomously.
However, the journey from concept to implementation has been fraught with complexity and design considerations. Smolagents changes this narrative by distilling the essence of agent-building into approximately 1,000 lines of core logic, making it accessible to developers who want results without wrestling with unnecessary abstractions.
The Challenge of Building AI Agents
Building effective AI agents presents a fundamental tension. On one side, you need enough structure to ensure reliable, secure execution. On the other, you need the flexibility to handle the unpredictable nature of real-world tasks.
Traditional approaches often resolve this dichotomy poorly, either restricting agents to rigid workflows or exposing developers to security risks through unrestricted code execution. The industry standard of encoding agent actions as JSON blobs, while perhaps safer, limits composability and expressiveness.
Smolagents takes a bold stance: it allows agents to write their actions as Python code. This approach, called "code agents," leverages the expressiveness of programming languages rather than fighting against it.
Research has demonstrated that code-based actions use 30% fewer steps than JSON-based alternatives and achieve higher performance on difficult benchmarks. The reasoning is intuitive: programming languages were designed specifically to express computational actions precisely. Why reinvent the wheel with JSON or another paradigm?
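To make the difference concrete, here is an illustrative sketch in plain Python (not smolagents itself; the get_population tool is made up): a JSON-style agent needs one model round-trip per tool call, while a code agent can compose the whole plan in a single action.

```python
# Hypothetical tool for illustration only
def get_population(city: str) -> int:
    return {"Paris": 2_100_000, "Lyon": 520_000, "Nice": 340_000}[city]

# A JSON-based agent emits one call per step, each requiring a model round-trip:
#   {"tool": "get_population", "args": {"city": "Paris"}}
#   {"tool": "get_population", "args": {"city": "Lyon"}}
#   ... then yet another step to compare the observations.

# A code agent expresses the entire plan as one action, using native
# loops and function composition:
cities = ["Paris", "Lyon", "Nice"]
largest = max(cities, key=get_population)
print(largest)  # Paris
```

The nesting and iteration in the last two lines are exactly the kind of composition a flat JSON schema cannot express.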
A Breath of Fresh Air
What makes smolagents genuinely refreshing is its philosophical commitment to simplicity without sacrificing capability. The entire agent logic fits in the agents.py file. This small but powerful library embraces the principle that abstractions should clarify, not mystify. When you read smolagents code, you understand what's happening. There's no magic, no hidden layers of indirection - just clean, readable Python that does exactly what it says.
The hub integrations deserve special mention. Being able to share tools and agents directly to the Hugging Face Hub transforms collaboration. You can push a custom tool with a single method call and have colleagues using it within minutes. This ecosystem thinking, where components are naturally shareable and reusable, reflects Hugging Face's broader vision of democratizing AI.
Capabilities That Matter to AI Developers
- Code Agents: The flagship CodeAgent class lets your LLM write Python code to call tools, enabling natural composability through function nesting, loops, and conditionals - capabilities impossible with JSON-based approaches.
- Model Agnosticism: Whether you prefer open models through Hugging Face's Inference Providers, commercial APIs like OpenAI and Anthropic via LiteLLM, or local models through Transformers and Ollama, smolagents adapts easily.
- Sandboxed Execution: Security-conscious developers can execute agent code in isolated environments using Blaxel, E2B, Modal, Docker, or a Pyodide+Deno WebAssembly sandbox.
- Tool Ecosystem: Import tools from MCP servers, LangChain, or even use Hugging Face Spaces as tools directly.
- Multimodal Support: Beyond text, agents can process images, video, and audio inputs, opening doors to sophisticated multimodal applications.
Architecture and Implementation
The smolagents architecture follows a ReAct (Reasoning and Acting) pattern, where agents alternate between reasoning about their situation and taking actions. The CodeAgent class orchestrates this loop, maintaining a memory of previous actions and observations while generating new Python code snippets at each step. When the agent determines it has solved the task, it calls the final_answer tool to return results.
from smolagents import CodeAgent, WebSearchTool, InferenceClientModel

model = InferenceClientModel()  # a model served through Hugging Face's inference providers
agent = CodeAgent(tools=[WebSearchTool()], model=model, stream_outputs=True)
result = agent.run("How many seconds would it take for a leopard at full speed to run through Pont des Arts?")
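The loop driving that run is conceptually simple. The sketch below is a simplified illustration of the ReAct pattern, not the actual smolagents implementation: the "model" is a scripted stand-in that first gathers data, then calls final_answer.

```python
# Simplified ReAct-style loop: alternate model-generated code actions with
# observations until the final_answer tool is called. Illustrative only.

def run_react_loop(model_step, tools, max_steps=5):
    memory = []   # transcript of actions and observations
    answer = None

    def final_answer(value):
        nonlocal answer
        answer = value

    namespace = {**tools, "final_answer": final_answer}
    for _ in range(max_steps):
        code = model_step(memory)            # model writes a Python snippet
        exec(code, namespace)                # run it against the tools
        memory.append((code, namespace.get("observation")))
        if answer is not None:               # agent decided it is done
            return answer
    return answer

# A scripted stand-in for the LLM: first act, then conclude.
steps = iter([
    "observation = add(2, 3)",
    "final_answer(observation * 10)",
])
result = run_react_loop(lambda memory: next(steps), {"add": lambda a, b: a + b})
print(result)  # 50
```

The real library adds planning steps, streaming, and a secured executor around this skeleton, but the alternation of action and observation is the same.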
The code execution pipeline is particularly elegant: the local_python_executor.py module implements a secure Python interpreter that can run generated code while restricting dangerous operations.
For higher security requirements, the remote_executors.py module provides integrations with cloud-based sandboxing services. This layered approach lets developers choose their security-convenience tradeoff.
The tool system is defined in tools.py and demonstrates thoughtful API design. Creating a custom tool requires only defining a class with name, description, inputs, and output_type attributes, plus a forward method. The @tool decorator simplifies this further for simple functions. Tools validate their arguments automatically and can be pushed to the Hub for sharing.
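The shape of that interface can be sketched in plain Python. This is a simplified stand-in for the real Tool base class, for illustration only; the toy validation below mimics what the library automates.

```python
# Minimal stand-in mimicking the tool interface described above: a name,
# description, typed inputs, an output_type, and a forward method.
# The argument check is a toy version of the library's automatic validation.

TYPE_MAP = {"string": str, "integer": int}

class SketchTool:
    name = "greeter"
    description = "Greets a person by name"
    inputs = {"person": {"type": "string", "description": "Who to greet"}}
    output_type = "string"

    def __call__(self, **kwargs):
        # Validate each argument against the declared input schema
        for arg, spec in self.inputs.items():
            if not isinstance(kwargs.get(arg), TYPE_MAP[spec["type"]]):
                raise TypeError(f"{arg} must be a {spec['type']}")
        return self.forward(**kwargs)

    def forward(self, person: str) -> str:
        return f"Hello, {person}!"

tool = SketchTool()
print(tool(person="Ada"))  # Hello, Ada!
```

Declaring inputs as data rather than code is what lets the framework validate calls, generate prompts describing the tool, and serialize it for sharing on the Hub.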
Model support spans the entire spectrum of LLM availability. The models.py module provides InferenceClientModel for Hugging Face's inference providers, LiteLLMModel for accessing 100+ cloud LLMs, TransformersModel for local execution, OpenAIModel for OpenAI-compatible APIs, and specialized classes for Azure and Amazon Bedrock.
Real-World Applications
The flexibility of smolagents enables diverse applications across industries and use cases. The examples directory showcases production-ready implementations that demonstrate the framework's versatility. Let's explore the most compelling applications that developers and researchers are building today.
Text-to-SQL and Database Intelligence: One of the most immediately practical applications involves natural language interfaces for databases. The text-to-SQL example demonstrates agents that can query complex database schemas using conversational prompts. What makes smolagents particularly effective here is its self-correcting capability: when a SQL query fails, the agent can analyze the error, understand the schema constraints, and iteratively refine its approach. This handles the messy reality of databases, where table relationships and column types aren't always obvious.
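The retry pattern itself is easy to see in miniature. The sketch below uses an in-memory SQLite database and a scripted stand-in for the model (it "refines" a bad column name after seeing the error); a real agent would instead prompt the LLM with the schema and the error text.

```python
import sqlite3

# Sketch of the self-correcting loop: run a generated query, and on failure
# feed the error message back so the next attempt can be refined.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE receipts (customer TEXT, total REAL)")
conn.execute("INSERT INTO receipts VALUES ('Alice', 12.5), ('Bob', 20.0)")

attempts = iter([
    "SELECT name, total FROM receipts",      # wrong column name: fails
    "SELECT customer, total FROM receipts",  # "refined" after the error
])

def next_query(error):
    # Stand-in for the LLM; a real agent would condition on `error`
    return next(attempts)

error = None
for _ in range(3):
    try:
        rows = conn.execute(next_query(error)).fetchall()
        break
    except sqlite3.OperationalError as exc:
        error = str(exc)  # e.g. "no such column: name"

print(rows)  # [('Alice', 12.5), ('Bob', 20.0)]
```

The loop terminates either when a query succeeds or when the retry budget runs out, which is the same guardrail the agent's max-steps limit provides.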
Agentic RAG (Retrieval-Augmented Generation): Traditional RAG systems retrieve documents and stuff them into context, but agentic RAG takes this further. Smolagents enables multi-step retrieval where the agent can refine its queries based on initial results, employ techniques like HyDE (Hypothetical Document Embeddings) to generate better search queries, and reason about whether retrieved information actually answers the question. Using BM25 retrievers with LangChain integrations, developers have built knowledge systems that dynamically explore document collections rather than relying on single-shot retrieval.
from smolagents import CodeAgent, Tool, InferenceClientModel
from langchain_community.retrievers import BM25Retriever

class RetrieverTool(Tool):
    name = "retriever"
    description = "Retrieves relevant document chunks for a query"
    inputs = {"query": {"type": "string", "description": "The search query"}}
    output_type = "string"

    def __init__(self, docs):
        super().__init__()  # required by the Tool base class
        self.retriever = BM25Retriever.from_documents(docs, k=5)

    def forward(self, query: str) -> str:
        return "\n".join(doc.page_content for doc in self.retriever.invoke(query))

# `docs` is a list of LangChain Document objects prepared beforehand
agent = CodeAgent(tools=[RetrieverTool(docs)], model=InferenceClientModel())
Vision-Based Web Browser Automation: Perhaps the most futuristic application involves agents that can see and interact with web pages like humans do. The web browser example combines smolagents with Selenium and Helium for browser control, using vision-language models like Qwen2-VL to interpret screenshots and decide on actions. These agents can navigate complex websites, fill forms, handle pop-ups, and extract specific information - tasks that previously required brittle, site-specific scrapers. The vision_web_browser.py module provides ready-to-use tools including go_back, close_popups, and search_item_ctrl_f for common web interactions.
Multi-Agent Orchestration: Complex tasks often benefit from specialized agents working together. Smolagents supports hierarchical multi-agent systems where a manager CodeAgent coordinates specialized subordinates. A typical pattern involves a ToolCallingAgent for web search that reports to a CodeAgent for planning and computation. The manager can delegate research tasks, aggregate findings, perform calculations with imported libraries like NumPy and Pandas, and synthesize final answers. This divide-and-conquer approach handles questions that require both real-time information gathering and analytical processing.
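The delegation pattern can be sketched without any framework at all. The classes below are illustrative stand-ins (not the smolagents managed-agents API): a manager routes subtasks to specialists and synthesizes their results.

```python
# Illustrative manager/worker delegation: a "search" specialist gathers
# facts, a "math" specialist computes, and the manager combines the two.
# All values here are hard-coded stand-ins for real tool calls.

class SearchAgent:
    def run(self, task: str) -> str:
        # Stand-in for a ToolCallingAgent doing web search
        return "Paris has ~2.1 million inhabitants"

class MathAgent:
    def run(self, task: str) -> float:
        # Stand-in for a CodeAgent doing computation (area of Paris ~105.4 km^2)
        return 2_100_000 / 105.4

class ManagerAgent:
    def __init__(self, workers):
        self.workers = workers

    def run(self, task: str) -> str:
        facts = self.workers["search"].run("population of Paris")
        density = self.workers["math"].run("density from " + facts)
        return f"{facts}; density ~ {density:,.0f} people per km^2"

manager = ManagerAgent({"search": SearchAgent(), "math": MathAgent()})
print(manager.run("How dense is Paris?"))
```

In smolagents the manager's "routing" is itself LLM-generated code, so the delegation order and the synthesis step are decided at run time rather than hard-coded as above.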
Open Deep Research: Hugging Face's own Open Deep Research implementation attempts to replicate OpenAI's Deep Research capability using smolagents. This agent achieves 55% pass@1 on the GAIA validation benchmark - remarkably close to the 67% achieved by the original Deep Research. The implementation uses a sophisticated text-based web browser with tools for archival search, page navigation, and visual inspection. It demonstrates how smolagents can tackle research-grade problems requiring sustained investigation across multiple sources. The GAIA benchmark includes questions that require multi-step reasoning, file processing (PDFs, spreadsheets, images), and web research - exactly the kind of complex, real-world tasks where agentic approaches excel.
Production Async Deployments: For production systems, the async agent example shows how to wrap smolagents in a Starlette web application using anyio for thread management. This pattern enables building REST APIs that serve agent capabilities at scale, with proper handling of concurrent requests and background execution. Combined with observability integrations like Langfuse (which provides native smolagents support), teams can deploy production-grade agent services with full tracing and debugging capabilities.
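Starlette and anyio aside, the core of that pattern - keeping a blocking agent call off the event loop - can be sketched with the standard library alone. The agent here is a stand-in that just sleeps.

```python
import asyncio
import time

# Sketch of serving a blocking agent from async code: offload agent.run()
# to a worker thread so the event loop stays free for other requests.
# (The repository's Starlette example uses anyio for the same purpose.)

def blocking_agent_run(task: str) -> str:
    time.sleep(0.1)  # stand-in for a slow LLM-driven agent loop
    return f"answer to: {task}"

async def handle_request(task: str) -> str:
    # asyncio.to_thread keeps the event loop responsive while the agent works
    return await asyncio.to_thread(blocking_agent_run, task)

async def main():
    # Two "requests" served concurrently despite the blocking agent
    return await asyncio.gather(
        handle_request("first question"),
        handle_request("second question"),
    )

print(asyncio.run(main()))
```

The same offloading idea applies whether the HTTP layer is Starlette, FastAPI, or anything else async: the agent's synchronous loop runs in a thread, and the server awaits its result.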
Enterprise users particularly benefit from the sandboxed execution options and model flexibility. Organizations can run agents entirely on-premises using local Transformers models, maintaining data privacy while still leveraging agentic capabilities.
The ability to route between multiple LLM providers using LiteLLMRouterModel enables cost optimization and failover strategies. The CLI tools (smolagent and webagent) provide convenient entry points for quick prototyping, while the Gradio UI integration offers instant visualization of agent reasoning chains for debugging and demonstration.
A Thriving Ecosystem
The smolagents community reflects Hugging Face's collaborative ethos. The project welcomes contributions beyond code, recognizing that documentation improvements, bug reports, and community support are equally valuable. The contribution guide provides clear pathways for involvement, from fixing outstanding issues to adding examples.
GitHub activity shows healthy engagement with regular releases and responsive issue handling. The project maintains a code of conduct based on the Contributor Covenant, fostering an inclusive environment. Discussions around the project span use cases, model comparisons, and architectural decisions, indicating a mature and thoughtful community.
Usage and License Terms
Smolagents is released under the Apache License 2.0, one of the most permissive open-source licenses available. This means you can use, modify, and distribute the code freely, even in commercial applications, as long as you include the original copyright notice and license. The Apache 2.0 license also provides an express grant of patent rights from contributors, offering additional legal protection for users.
Shaping the Future of AI Agents
By demonstrating that effective agents don't require complex frameworks, smolagents challenges assumptions about how agentic AI should be built. The benchmarking results are particularly compelling: in tests comparing various models, open-source alternatives like DeepSeek-R1 matched or exceeded proprietary models on agentic tasks. This democratization of agent capabilities aligns with Hugging Face's mission of making AI accessible to all.
Looking ahead, smolagents positions itself as a foundation for increasingly sophisticated agentic applications. The modular design facilitates extension without modification, meaning new capabilities can be added through tools rather than framework changes. As LLMs continue improving at code generation, the code agent paradigm will likely become increasingly powerful, and smolagents will be ready.
About Hugging Face
Hugging Face has established itself as the central hub for machine learning, hosting over a million models, datasets, and demo applications. Founded in 2016, the company began as a chatbot startup before pivoting to become the GitHub of machine learning. Their Transformers library revolutionized how developers work with pre-trained models, and their Hub has become the default destination for sharing and discovering ML artifacts. Smolagents represents their continued commitment to making advanced AI capabilities accessible through well-designed, open-source tools.
Embrace the Smol
Smolagents embodies a powerful idea: that simplicity and capability are not opposing forces. In a field often characterized by complexity creep, this library demonstrates that thoughtful design can deliver sophisticated functionality through minimal, readable code. Whether you're building a simple chatbot or orchestrating multi-agent systems, smolagents provides the foundation without imposing unnecessary constraints. The invitation is open - explore the repository, experiment with the examples, and discover what you can build when agents are finally within reach.

Smolagents: Hugging Face's Minimalist Framework for Building Powerful AI Agents