Intent-Aware LLM Gateways: A Practical Review of vLLM Semantic Router

Large language model applications rarely involve a single task or a single best model. A customer conversation can shift from arithmetic to code to creative writing within minutes. The vLLM Semantic R...

envoy extproc llm gateway modernbert openai-compatible api owasp llm top 10 semantic caching semantic routing