Uber’s Genie Achieves Near-Human Precision with Enhanced Agentic RAG

AI chatbots are rapidly evolving, but can they deliver the same precision as skilled engineers, especially in high-stakes domains like security and privacy? Uber’s Genie is leading the charge by implementing an Enhanced Agentic Retrieval-Augmented Generation (EAg-RAG) system that brings chatbots closer to near-human accuracy in technical Q&A tasks.
Rising to the Challenge of Technical Precision
Genie empowers Uber teams to create LLM-powered Slack bots connected to a vast range of internal documents. However, standard RAG pipelines often produced incomplete, irrelevant, or incorrect answers, which is unacceptable in mission-critical areas such as security and privacy engineering. To address this, Uber transitioned Genie from conventional RAG to an agentic architecture, aiming for responses engineers can trust.
Innovations Driving EAg-RAG
- Enriched Document Processing: Retaining complex document formatting is crucial for accuracy. Uber moved from unreliable PDF loaders to exporting Google Docs as HTML, building a custom loader with the Google Python API. This preserved tables, structure, and metadata, dramatically improving document context and retrieval relevance (a loader sketch follows this list).
- Metadata Enrichment: Every document chunk now carries LLM-generated summaries, keywords, and FAQs in addition to basic metadata. This enrichment helps the chatbot capture document nuances and return more relevant answers (an enrichment sketch follows this list).
- Agentic Pre- and Post-Processing: A suite of LLM-powered agents manages the flow at key stages:
  - Query Optimizer: Refines and clarifies user questions for better retrieval.
  - Source Identifier: Narrows retrieval to the most relevant document subset using metadata.
  - Hybrid Retrieval: Merges vector search and BM25 rankings, leveraging the enriched metadata for relevance (a fusion sketch follows this list).
  - Post-Processor: Removes duplicates and structures the retrieved chunks so answer generation preserves document coherence.
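To make the document-processing step concrete, here is a minimal sketch of a Google Docs loader that exports HTML so tables and headings survive chunking. It is a sketch under assumptions: the `load_doc_as_html` helper, the service-account setup, and the pipe-delimited table flattening are illustrative choices, not Uber’s published loader.

```python
# Minimal sketch: export a Google Doc as HTML via the Drive API, then pull out
# plain text and flattened tables with BeautifulSoup. Helper names are illustrative.
from bs4 import BeautifulSoup
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]

def load_doc_as_html(doc_id: str, credentials_path: str) -> dict:
    """Export one Google Doc as HTML and keep its text, title, and tables."""
    creds = service_account.Credentials.from_service_account_file(
        credentials_path, scopes=SCOPES
    )
    drive = build("drive", "v3", credentials=creds)

    # Exporting as HTML (rather than scraping a PDF) keeps table and heading structure.
    html = drive.files().export(fileId=doc_id, mimeType="text/html").execute()
    soup = BeautifulSoup(html, "html.parser")

    # Flatten each table into pipe-delimited rows so row/column context survives chunking.
    tables = []
    for table in soup.find_all("table"):
        rows = [
            " | ".join(cell.get_text(strip=True) for cell in row.find_all(["td", "th"]))
            for row in table.find_all("tr")
        ]
        tables.append("\n".join(rows))

    return {
        "doc_id": doc_id,
        "title": soup.title.get_text() if soup.title else "",
        "text": soup.get_text("\n", strip=True),
        "tables": tables,
    }
```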
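The metadata-enrichment step can be pictured as one LLM call per chunk. The sketch below assumes an OpenAI-compatible client; the prompt wording, the `gpt-4o-mini` model name, and the `enrich_chunk` helper are illustrative assumptions rather than Uber’s actual enrichment pipeline.

```python
# Minimal sketch: attach an LLM-generated summary, keywords, and FAQs to a chunk's metadata.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

ENRICH_PROMPT = """Summarize the passage, list 5 keywords, and write 2 FAQs it answers.
Return JSON with keys: summary, keywords, faqs.

Passage:
{chunk}"""

def enrich_chunk(chunk_text: str, base_metadata: dict) -> dict:
    """Merge LLM-generated enrichment fields into the chunk's existing metadata."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": ENRICH_PROMPT.format(chunk=chunk_text)}],
        response_format={"type": "json_object"},
    )
    enrichment = json.loads(response.choices[0].message.content)
    return {**base_metadata, **enrichment}
```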
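For the hybrid-retrieval and post-processing steps, one common way to merge BM25 and vector rankings is reciprocal rank fusion, which also removes duplicate chunks as a side effect. The sketch below assumes generic `vector_search` and `bm25_search` callables and an RRF constant of 60; the article does not specify how Genie actually fuses or orders the two result lists.

```python
# Minimal sketch: fuse BM25 and vector-search rankings with reciprocal rank fusion (RRF).
from typing import Callable

def hybrid_retrieve(
    query: str,
    vector_search: Callable[[str, int], list[str]],  # chunk IDs ranked by embedding similarity
    bm25_search: Callable[[str, int], list[str]],    # chunk IDs ranked by BM25 score
    top_k: int = 10,
    rrf_k: int = 60,  # standard RRF damping constant; an assumption here
) -> list[str]:
    """Merge two ranked lists with RRF; duplicates collapse because IDs are dict keys."""
    scores: dict[str, float] = {}
    for ranking in (vector_search(query, top_k * 2), bm25_search(query, top_k * 2)):
        for rank, chunk_id in enumerate(ranking):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (rrf_k + rank + 1)

    # Highest fused score first.
    fused = sorted(scores, key=scores.get, reverse=True)
    return fused[:top_k]
```

In a full pipeline, the fused chunk IDs would then be mapped back to their documents and re-ordered so the answer-generation prompt preserves each document’s original reading order.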
Automating Quality Control
Manually evaluating chatbot performance had been a bottleneck. Uber’s solution was an “LLM-as-a-Judge” system that automatically scores responses and provides actionable feedback. This innovation cut assessment times from weeks to minutes, allowing rapid iteration and more sophisticated experimentation.
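As a rough illustration of the LLM-as-a-Judge idea, the sketch below grades a bot answer against a reference excerpt and returns structured scores plus feedback. The rubric, the model name, and the `judge_answer` helper are assumptions for illustration; Uber’s actual evaluation prompts and criteria are not published.

```python
# Minimal sketch: score a chatbot answer against a reference excerpt with an LLM judge.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

JUDGE_PROMPT = """You are grading a chatbot answer against a reference document excerpt.
Score 1-5 for correctness and completeness, then give one sentence of feedback.
Return JSON with keys: correctness, completeness, feedback.

Question: {question}
Answer: {answer}
Reference: {reference}"""

def judge_answer(question: str, answer: str, reference: str) -> dict:
    """Return structured scores and feedback for one (question, answer, reference) triple."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            question=question, answer=answer, reference=reference)}],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)
```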
Results: Precision That Scales
Genie’s EAg-RAG architecture delivered a 27% increase in acceptable answers and a 60% reduction in incorrect advice. This success has enabled wider deployment across Uber’s engineering teams, freeing experts for more strategic work and fostering robust documentation practices. Its modular design also means EAg-RAG can be adapted for other complex domains, expanding the reach of high-precision AI assistance.
The Future: Smarter, More Flexible AI Agents
- Multi-modal Enrichment: Plans are underway to include images and diverse content types in document enrichment.
- Iterative Reasoning: Introducing chain-of-RAG and self-critique agents promises greater accuracy for complex queries.
- Tool Selection: Future agents might dynamically select the best tools for each query, further enhancing flexibility and precision.
Uber’s advancements lay the groundwork for robust agentic RAG systems, bringing more intelligent automation and reliable technical support to the organization.
Takeaway
By integrating advanced document processing, agent-driven workflows, and automated evaluation, Uber’s Genie proves that AI chatbots can become trusted partners in technical domains. As these innovations mature, expect even more human-like and reliable AI copilots to transform enterprise support and knowledge management.