
github.com/vllm-project/semantic-router @v0.3.0


21,671 symbols · 75,868 edges · 2,186 files · 8,083 documented (37%) · updated today · v0.3.0 · 2026-06-05 · ★ 4,789 · 77 open issues


README

System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

Documentation | Playground | Blog | Publications | Hugging Face

About

In the LLM era, the number of models is exploding. Models differ in capability, scale, cost, and privacy boundaries. Choosing and connecting the right models to build semantic AI infrastructure is a systems problem.

vLLM Semantic Router is a signal-driven intelligent router for that problem. It helps teams build model systems that are more efficient, safer, and more adaptive across cloud, data center, and edge environments.


It delivers three core values:

Token economics: reduce wasted tokens, increase effective output, and maximize the value of every token.
LLM safety: detect jailbreaks, sensitive leakage, and hallucinations so agents remain controllable, trustworthy, and auditable.
Fullmesh intelligence: build personal AI at the edge and intelligent MaaS in the cloud by coordinating local, private, and frontier models across cost, privacy, and capability boundaries.

Getting Started

Install

curl -fsSL https://vllm-semantic-router.com/install.sh | bash

For platform notes, detailed setup options, and troubleshooting, see the Installation Guide.

Important: online playground default credentials:

username: love@vllm-sr.ai

password: vllm-sr

Latest News

[2026/03/24] Vision Paper Released: The Workload-Router-Pool Architecture for LLM Inference Optimization
[2026/03/10] v0.2 Released: vLLM Semantic Router v0.2 Athena Release
[2026/02/27] White Paper Released: Signal Driven Decision Routing for Mixture-of-Modality Models
[2026/01/05] Iris v0.1 Released: vLLM Semantic Router v0.1 Iris: The First Major Release
[2025/12/16] Collaboration: AMD × vLLM Semantic Router: Building the System Intelligence Together
[2025/11/19] New Blog: Signal-Decision Driven Architecture: Reshaping Semantic Routing at Scale
[2025/11/03] Paper Published: Category-Aware Semantic Caching for Heterogeneous LLM Workloads
[2025/10/12] Paper Accepted: When to Reason: Semantic Router for vLLM

Earlier announcements

[2025/12/15] New Blog: Token-Level Truth: Real-Time Hallucination Detection for Production LLMs
[2025/10/27] New Blog: Scaling Semantic Routing with Extensible LoRA
[2025/10/08] Collaboration: vLLM Semantic Router with vLLM Production Stack Team.
[2025/09/01] Released the project: vLLM Semantic Router: Next Phase in LLM inference.

More announcements are available on the Blog and Publications pages.

Community

For questions, feedback, or to contribute, please join the #semantic-router channel in vLLM Slack.

Community Meetings

We host community meetings on the first and third Tuesday of each month to sync with contributors across different time zones:

First Tuesday of the month: 9:00-10:00 AM EST (accommodates US EST, EU, and Asia Pacific contributors)
  Zoom Link
  Google Calendar Invite
  ics file
Third Tuesday of the month: 1:00-2:00 PM EST (accommodates US EST and California contributors)
  Zoom Link
  Google Calendar Invite
  ics file
Meeting recordings: YouTube

Contributing

If you want to contribute, start with CONTRIBUTING.md.

For repository-native development workflow and validation commands, use AGENTS.md as the entrypoint and docs/agent/README.md as the canonical index.

Citation

If you find Semantic Router helpful in your research or projects, please consider citing it:

@misc{semanticrouter2025,
  title={vLLM Semantic Router},
  author={vLLM Semantic Router Team},
  year={2025},
  howpublished={\url{https://github.com/vllm-project/semantic-router}},
}


Extension points (exported contracts: how you extend this code)

ToolRetriever (Interface)

ToolRetriever is the pluggable interface for tool-candidate retrieval. Implementations are registered by name on Registr [11 …

src/semantic-router/pkg/tools/retriever.go
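The description above is truncated, so the real method set and registry type are not visible here. As a rough illustration of the general pattern it describes (a pluggable interface whose implementations are registered by name), the following Go sketch may help. Every name in it (`Retriever`, `Registry`, `Register`, `Lookup`, `keywordRetriever`) is hypothetical and is not taken from `retriever.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// Retriever is a hypothetical stand-in for a pluggable retrieval interface.
// The real ToolRetriever method set is not shown in this document.
type Retriever interface {
	Retrieve(query string, tools []string) []string
}

// Registry maps a name to a Retriever implementation (assumed design).
type Registry struct {
	byName map[string]Retriever
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{byName: make(map[string]Retriever)}
}

// Register stores impl under name and rejects duplicate names.
func (r *Registry) Register(name string, impl Retriever) error {
	if _, exists := r.byName[name]; exists {
		return fmt.Errorf("retriever %q already registered", name)
	}
	r.byName[name] = impl
	return nil
}

// Lookup returns the implementation registered under name, if any.
func (r *Registry) Lookup(name string) (Retriever, bool) {
	impl, ok := r.byName[name]
	return impl, ok
}

// keywordRetriever keeps the tools whose name contains the query,
// ignoring case. It is only a toy implementation for the sketch.
type keywordRetriever struct{}

func (keywordRetriever) Retrieve(query string, tools []string) []string {
	var out []string
	q := strings.ToLower(query)
	for _, t := range tools {
		if strings.Contains(strings.ToLower(t), q) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	reg := NewRegistry()
	if err := reg.Register("keyword", keywordRetriever{}); err != nil {
		panic(err)
	}
	impl, ok := reg.Lookup("keyword")
	if !ok {
		panic("retriever not found")
	}
	fmt.Println(impl.Retrieve("search", []string{"web_search", "calculator", "SearchDocs"}))
}
```

Looking implementations up by name keeps the choice of retriever a configuration concern rather than a compile-time one, which is one plausible reason for a design like this.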

Profile (Interface)

Profile defines the interface that all test profiles must implement [22 implementers]

e2e/pkg/framework/types.go

GoInstance (Interface)

(no doc) [4 implementers]

dashboard/frontend/src/lib/wasm.ts

StreamCallback (FuncType)

StreamCallback is called for each chunk of streamed content

dashboard/backend/handlers/openclaw_stream.go

Node (Interface)

(no doc)

website/src/components/NeuralNetworkBackground.tsx

SearchBarStoreValue (Interface)

(no doc)

dashboard/wizmap/src/stores.ts

SaveableSelector (Interface)

SaveableSelector interface for selectors that can persist their state [6 implementers]

src/semantic-router/pkg/modelselection/persistence.go

RequestFunc (FuncType)

RequestFunc is a function that executes a single request

e2e/pkg/performance/load_generator.go

Core symbols (most depended-on inside this repo)

Fatalf

called by 4444

src/semantic-router/pkg/config/reference_config_test_helpers_test.go

get

called by 1237

dashboard/frontend/src/tools/registry.ts

Error

called by 1210

src/semantic-router/pkg/extproc/utils_fast.go

Run

called by 905

e2e/profiles/llm-d/profile.go

get

called by 861

src/semantic-router/pkg/apiserver/kb_map_store.go

Set

called by 569

src/semantic-router/pkg/selection/lookuptable/table.go

Error

called by 535

dashboard/backend/handlers/runtime_config_apply.go

Get

called by 441

src/semantic-router/pkg/memory/store.go

Shape

Function 12,623

Method 5,596

Struct 1,973

Interface 822

Class 517

Route 64

TypeAlias 56

FuncType 19

Enum 1

Languages

Go 68%

Python 20%

TypeScript 12%

Modules by API surface

deploy/operator/api/v1alpha1/zz_generated.deepcopy.go · 184 symbols

src/semantic-router/pkg/dsl/dsl_test.go · 176 symbols

dashboard/frontend/src/pages/configPageSupport.ts · 145 symbols

candle-binding/semantic-router.go · 139 symbols

src/fleet-sim/tests/test_api.py · 124 symbols

src/semantic-router/pkg/memory/milvus_store_test.go · 116 symbols

candle-binding/semantic-router_mock.go · 113 symbols

onnx-binding/semantic-router.go · 100 symbols

deploy/operator/api/v1alpha1/semanticrouter_types.go · 92 symbols

src/semantic-router/pkg/extproc/processor_req_body_streamed_test.go · 87 symbols

src/semantic-router/pkg/dsl/ast.go · 83 symbols

candle-binding/semantic-router_test.go · 81 symbols

Dependencies (from manifests, versioned)

github.com/alecthomas/participle/v2 v2.1.4 · 1×

github.com/anthropics/anthropic-sdk-go v1.19.0 · 1×

github.com/bahlo/generic-list-go v0.2.0 · 1×

github.com/beorn7/perks v1.0.1 · 1×

github.com/buger/jsonparser v1.1.1 · 1×

github.com/cenkalti/backoff/v4 v4.3.0 · 1×

github.com/cespare/xxhash/v2 v2.3.0 · 1×

github.com/cncf/xds/go v0.0.0-2025102218044 · 1×

github.com/cockroachdb/errors v1.9.1 · 1×

github.com/cockroachdb/logtags v0.0.0-2021111810474 · 1×

github.com/cockroachdb/redact v1.1.3 · 1×

github.com/davecgh/go-spew v1.1.2-0.20180830191 · 1×

For agents

$ claude mcp add semantic-router \
  -- python -m otcore.mcp_server <graph>
