The rapid advancement of AI has made context engineering a game-changer, especially for companies striving to build intelligent, user-friendly agents. Asana stands out for its innovative approach, prioritizing not just more data, but more meaningful data. Their experience reveals that making every byte count is far more effective than simply feeding large language models (LLMs) with endless information.
The Pitfalls of Overloading AI Models
Many developers assume that expanding an AI's context window and stuffing it with data will improve performance, but in practice the opposite often holds. Asana and many others have found that this approach yields diminishing returns.
Overloaded models often miss key details, respond more slowly, and drive up costs. The core issue is that critical information becomes lost in a sea of less-relevant details, a problem known in research as "lost in the middle."
Why Standard Retrieval Approaches Miss the Mark
Traditional Retrieval-Augmented Generation (RAG) systems retrieve documents based on how closely they match a user's query. While this semantic similarity is helpful, it’s often insufficient.
For instance, an old but similar document might overshadow a new, urgent task that is less similar on the surface but far more relevant. Expanding context windows can make this worse by indiscriminately pulling in more data, diluting the relevance of what the AI receives.
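The failure mode can be sketched in a few lines. The toy example below (all documents and scores are hypothetical; real RAG systems use learned embeddings rather than bag-of-words counts) shows how a similarity-only ranker can place a stale but wordy match above a newer, more actionable task:

```python
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity over bag-of-words term counts (a stand-in
    # for embedding similarity in a real retriever).
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# (text, age_in_days) -- hypothetical task snippets
docs = [
    ("marketing launch plan overdue review marketing tasks", 400),
    ("urgent overdue task in marketing project due this week", 1),
]

query = "overdue tasks marketing project"
q = Counter(query.split())

# Rank purely by surface similarity, ignoring recency and urgency.
ranked = sorted(docs, key=lambda d: cosine(q, Counter(d[0].split())), reverse=True)
```

Here the 400-day-old document wins the ranking simply because it repeats the query's vocabulary, which is exactly the relevance gap intent-aware retrieval is meant to close.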
Intent-Augmented Retrieval: Asana's Breakthrough
To overcome these challenges, Asana shifted to intent-augmented retrieval. The key difference? Understanding user intent before fetching data. This method unfolds in two critical steps:
- Filter First: Asana’s system analyzes the user's request, converting it into structured filters before any search begins. For example, a query like "Show me overdue tasks in the marketing project" is broken down into project scope, due dates, and completion status, ensuring only the most relevant data is retrieved.
- Sort and Summarize with Intent: Once data is fetched, it's reranked and distilled based on the original query. Cross-encoders weigh both the query and the content, prioritizing relevance. Summarization techniques further refine the material, so only the most important details, such as risks or timelines, reach the AI model.
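The "filter first" step can be sketched as follows. This is a minimal illustration, not Asana's implementation: in production the intent parser would presumably be an LLM or trained classifier, and the `Task` fields and keyword rules below are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Task:
    title: str
    project: str
    due: date
    completed: bool

def parse_intent(query: str) -> dict:
    # Hypothetical rule-based stand-in for an LLM that converts a raw
    # query into structured filters before any search runs.
    intent = {}
    if "overdue" in query:
        intent["due_before"] = date.today()
        intent["completed"] = False
    for project in ("marketing", "engineering"):
        if project in query:
            intent["project"] = project
    return intent

def filter_first(tasks: list[Task], intent: dict) -> list[Task]:
    # Step 1: narrow the candidate set with structured filters,
    # so only plausibly relevant objects reach ranking/summarization.
    out = tasks
    if "project" in intent:
        out = [t for t in out if t.project == intent["project"]]
    if "completed" in intent:
        out = [t for t in out if t.completed == intent["completed"]]
    if "due_before" in intent:
        out = [t for t in out if t.due < intent["due_before"]]
    return out

tasks = [
    Task("Write blog post", "marketing", date(2020, 1, 1), False),
    Task("Fix login bug", "engineering", date(2020, 1, 1), False),
    Task("Old campaign", "marketing", date(2019, 1, 1), True),
]
intent = parse_intent("show me overdue tasks in the marketing project")
candidates = filter_first(tasks, intent)  # only the overdue marketing task
```

Only the surviving candidates would then be passed to the cross-encoder reranking and summarization stage described above.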
Efficiency and Real-World Impact
Implementing intent-driven context engineering in production has brought significant gains to Asana’s AI features, including AI Chat. Notable results include:
- 35% reduction in input tokens, cutting operational costs.
- 24% faster response times at the 95th percentile, leading to a smoother user experience.
- 30% lower cost per API call, proving that efficiency and quality are not mutually exclusive.
- Techniques such as cross-encoder reranking and object field filtering achieved up to 40% token savings and raised answer accuracy from 92-94% to 95-96%.
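Object field filtering, one of the techniques credited with the token savings above, amounts to serializing only the fields a given question needs. A minimal sketch (the field names and task object are hypothetical, not Asana's actual API schema):

```python
import json

# A raw task object as it might arrive from an API; most fields are
# irrelevant to a question about risks and deadlines.
raw_task = {
    "id": "12345",
    "title": "Launch landing page",
    "due_on": "2024-06-01",
    "completed": False,
    "notes": "Blocked on design review; risk of slipping a week.",
    "permalink_url": "https://example.com/task/12345",
    "followers": ["u1", "u2", "u3"],
    "custom_fields": {"priority": "high", "sprint": "23"},
}

# Fields worth keeping for a deadline/risk question (hypothetical allowlist).
FIELDS_FOR_DEADLINE_QUESTIONS = ("title", "due_on", "completed", "notes")

def project_fields(obj: dict, fields) -> dict:
    # Keep only the allowlisted fields before the object is
    # serialized into the model's context.
    return {k: obj[k] for k in fields if k in obj}

slim = project_fields(raw_task, FIELDS_FOR_DEADLINE_QUESTIONS)
full_len = len(json.dumps(raw_task))
slim_len = len(json.dumps(slim))  # substantially shorter serialization
```

The shorter serialization is where the token savings come from: fewer irrelevant fields in the prompt also means less noise for the model to sift through.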
Key Takeaways and the Road Ahead
Asana’s experience underscores the power of putting user intent at the heart of context engineering. This focus results in AI systems that are faster, more accurate, and more cost-effective. While occasional edge cases remain, intent-driven strategies, supported by robust fallbacks and post-processing, keep solutions scalable and reliable.
Looking forward, Asana is exploring hybrid solutions that combine intent-augmented retrieval with knowledge graphs, like GraphRAG. This evolving approach promises even greater scalability and efficiency for enterprise AI applications, making context engineering a cornerstone of future advancements.

How Asana Transformed AI Agents with Intent-Driven Context Engineering