
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
Documentation | Playground | Blog | Publications | Hugging Face
In the LLM era, the number of models is exploding. Models differ in capability, scale, cost, and privacy boundaries. Choosing and connecting the right models to build semantic AI infrastructure is a system problem.
vLLM Semantic Router is a signal-driven intelligent router for that problem. It helps teams build model systems that are more efficient, safer, and more adaptive across cloud, data center, and edge environments.

It delivers three core values:
curl -fsSL https://vllm-semantic-router.com/install.sh | bash
For platform notes, detailed setup options, and troubleshooting, see the Installation Guide.
[!IMPORTANT] Online playground default credentials:
- username: love@vllm-sr.ai
- password: vllm-sr
More announcements are available on the Blog and Publications pages.
For questions, feedback, or to contribute, please join the #semantic-router channel in vLLM Slack.
We host community meetings on the first and third Tuesday of each month to sync with contributors across different time zones.
If you want to contribute, start with CONTRIBUTING.md.
For repository-native development workflow and validation commands, use AGENTS.md as the entrypoint and docs/agent/README.md as the canonical index.
If you find Semantic Router helpful in your research or projects, please consider citing it:
@misc{semanticrouter2025,
  title={vLLM Semantic Router},
  author={vLLM Semantic Router Team},
  year={2025},
  howpublished={\url{https://github.com/vllm-project/semantic-router}},
}
We are grateful to our sponsors who support us:
AMD provides us with GPU resources and ROCm™ software for training and researching frontier router models, enhancing E2E testing, and building the online models playground.
$ claude mcp add semantic-router \
-- python -m otcore.mcp_server <graph>