github.com/evalstate/fast-agent @v0.8.3 sqlite

21,907 symbols · 99,258 edges · 1,396 files · 4,507 documented (21%)

README

Start Here

[!TIP] Please see https://fast-agent.ai for latest documentation.

fast-agent is a flexible way to interact with LLMs, excellent for use as a Coding Agent, Development Toolkit, Evaluation or Workflow platform.

To start an interactive session with shell support, install uv and run

uvx fast-agent-mcp@latest -x

To start coding with Hugging Face inference providers or use your OpenAI Codex plan:

# Code with Hugging Face Inference Providers
uvx fast-agent-mcp@latest --pack hf-dev

# Code with Codex (agents optimized for OpenAI)
uvx fast-agent-mcp@latest --pack codex

Enter a shell with !, or run shell commands e.g. ! cd web && npm run build.

Manage skills with the /skills command, and connect to MCP Servers with /connect. The default fast-agent registry contains skills to let you set up LSP, Agent and Tool Hooks, Compaction strategies, Automation and more.

# /connect supports stdio or streamable http (with OAuth)

# Start a STDIO server
/connect @modelcontextprotocol/server-everything

# Connect to a Streamable HTTP Server
/connect https://huggingface.co/mcp

It's recommended to install fast-agent to set up the shell aliases and other tooling.

# Install fast-agent
uv tool install -U fast-agent-mcp

# Run fast-agent with opus, shell support and subagent/smart mode
fast-agent --model opus -x --smart

Use local models with the generic provider, or automatically create the correct configuration for llama.cpp:

fast-agent model llamacpp

Any fast-agent setup or program can be used with any ACP client - the simplest way is to use fast-agent-acp:

# Run fast-agent inside Toad
toad acp "fast-agent-acp -x --model sonnet"

fast-agent enables you to create and interact with sophisticated multimodal Agents and Workflows in minutes. It is the first framework with complete, end-to-end tested MCP Feature support including Sampling and Elicitations.

fast-agent is CLI-first, with an optional prompt_toolkit-powered interactive terminal prompt (TUI-style input, completions, and in-terminal menus); responses can stream live to the terminal via rich without relying on full-screen curses UIs or external GUI overlays.

The simple declarative syntax lets you concentrate on composing your Prompts and MCP Servers to build effective agents.

Model support is comprehensive with native support for Anthropic, OpenAI and Google providers as well as Azure, Ollama, Deepseek and dozens of others via TensorZero. Structured Outputs, PDF and Vision support is simple to use and well tested. Passthrough and Playback LLMs enable rapid development and test of Python glue-code for your applications.

Recent features include:

Agent Skills (SKILL.md)
MCP-UI Support
OpenAI Apps SDK (Skybridge)
Shell Mode
Advanced MCP Transport Diagnostics
MCP Elicitations

MCP Transport Diagnostics

fast-agent is the only tool that allows you to inspect Streamable HTTP Transport usage - a critical feature for ensuring reliable, compliant deployments. OAuth is supported with KeyRing storage for secrets. Use the fast-agent auth command to manage stored secrets.

[!IMPORTANT]

Documentation is included in this repository under docs/. Use the docs helper script from the repository root to install, generate, build, serve, screenshot, and assess the site.

Agent Application Development

Prompts and configurations that define your Agent Applications are stored in simple files, with minimal boilerplate, enabling simple management and version control.

Chat with individual Agents and Components before, during and after workflow execution to tune and diagnose your application. Agents can request human input to get additional context for task completion.

Simple model selection makes testing Model <-> MCP Server interaction painless. You can read more about the motivation behind this project here

Get started:

Start by installing the uv package manager for Python. Then:

uv pip install fast-agent-mcp          # install fast-agent!
fast-agent go                          # start an interactive session
fast-agent go --url https://hf.co/mcp  # with a remote MCP
fast-agent go --model=generic.qwen2.5  # use ollama qwen 2.5
fast-agent go --pack analyst --model haiku  # install/reuse a card pack and launch it
fast-agent scaffold                    # create an example agent and config files
uv run agent.py                        # run your first agent
uv run agent.py --model='gpt-5.4-mini?reasoning=low'    # specify a model
uv run agent.py --transport http --port 8001  # expose as MCP server (server mode implied)
fast-agent quickstart workflow  # create "building effective agents" examples

For packaged starter agents, use fast-agent go --pack <name> --model <model>. This installs the pack into the selected fast-agent environment if needed, then starts go normally. --model is a fallback for cards without an explicit model setting; a model declared directly in an AgentCard still wins.
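The precedence rule above can be sketched in a few lines. This is an illustration only: resolve_model and its arguments are hypothetical names chosen for the example, not part of fast-agent's API.

```python
from typing import Optional


def resolve_model(card_model: Optional[str], cli_model: Optional[str]) -> Optional[str]:
    """Pick the model for one card (hypothetical helper, not fast-agent code).

    A model declared on the AgentCard wins; the --model value is only a fallback.
    """
    return card_model if card_model else cli_model


print(resolve_model("sonnet", "haiku"))  # the card's own setting is used
print(resolve_model(None, "haiku"))      # no card setting, so --model applies
```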

Other quickstart examples include a Researcher Agent (with Evaluator-Optimizer workflow) and Data Analysis Agent (similar to the ChatGPT experience), demonstrating MCP Roots support.

[!TIP] Windows Users - there are a couple of configuration changes needed for the Filesystem and Docker MCP Servers - necessary changes are detailed within the configuration files.

Basic Agents

Defining an agent is as simple as:

@fast.agent(
  instruction="Given an object, respond only with an estimate of its size."
)

We can then send messages to the Agent:

async with fast.run() as agent:
  moon_size = await agent("the moon")
  print(moon_size)

Or start an interactive chat with the Agent:

async with fast.run() as agent:
  await agent.interactive()

Here is the complete sizer.py Agent application, with boilerplate code:

import asyncio
from fast_agent import FastAgent

# Create the application
fast = FastAgent("Agent Example")


@fast.agent(
    instruction="Given an object, respond only with an estimate of its size."
)
async def main():
    async with fast.run() as agent:
        await agent.interactive()


if __name__ == "__main__":
    asyncio.run(main())

The Agent can then be run with uv run sizer.py.

Specify a model with the --model switch - for example uv run sizer.py --model sonnet.

Model strings also accept query overrides. For example:

uv run sizer.py --model "gpt-5?reasoning=low"
uv run sizer.py --model "claude-sonnet-4-6?web_search=on"
uv run sizer.py --model "claude-sonnet-4-5?context=1m"

For Anthropic models, ?context=1m is only needed for earlier Sonnet 4 / Sonnet 4.5 models that still require the explicit 1M context opt-in. Claude Sonnet 4.6 and Claude Opus 4.6 already use their long context window by default, so ?context=1m is accepted for backward compatibility but is unnecessary there.
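The override syntax has the same shape as a URL query string. As a rough illustration of how such a string decomposes (this is not fast-agent's own parser, and split_model_string is a hypothetical helper), the standard library can split it into a model name and a dictionary of overrides:

```python
from urllib.parse import parse_qsl


def split_model_string(spec: str) -> tuple[str, dict[str, str]]:
    """Split 'name?key=value&...' into (name, overrides). Illustrative only."""
    name, _, query = spec.partition("?")
    # parse_qsl returns an empty list when there is no query part
    return name, dict(parse_qsl(query))


print(split_model_string("gpt-5?reasoning=low"))
print(split_model_string("sonnet"))
```

Which keys are accepted, and what they mean for each provider, is decided by fast-agent; the sketch only shows the syntax.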

Combining Agents and using MCP Servers

To generate examples use fast-agent quickstart workflow. This example can be run with uv run workflow/chaining.py. Place fast-agent.yaml in the active fast-agent home, or pass an explicit config path when needed.

Agents can be chained to build a workflow, using MCP Servers defined in the fast-agent.yaml file:

@fast.agent(
    "url_fetcher",
    "Given a URL, provide a complete and comprehensive summary",
    servers=["fetch"], # Name of an MCP Server defined in fast-agent.yaml
)
@fast.agent(
    "social_media",
    """
    Write a 280 character social media post for any given text.
    Respond only with the post, never use hashtags.
    """,
)
@fast.chain(
    name="post_writer",
    sequence=["url_fetcher", "social_media"],
)
async def main():
    async with fast.run() as agent:
        # using chain workflow
        await agent.post_writer("http://llmindset.co.uk")
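Conceptually, a chain feeds each agent's output to the next agent in the sequence. The sketch below shows that hand-off with plain coroutines standing in for agents. It assumes the simplest behaviour, where only the previous output is passed on; run_chain and the fake_* functions are hypothetical and not fast-agent APIs.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

# A "step" is any async function from text to text (stand-in for an agent).
Step = Callable[[str], Awaitable[str]]


async def run_chain(steps: Sequence[Step], message: str) -> str:
    """Run the steps in order, passing each output to the next step."""
    for step in steps:
        message = await step(message)
    return message


async def fake_url_fetcher(url: str) -> str:
    return f"Summary of {url}"  # a real agent would call the fetch server


async def fake_social_media(text: str) -> str:
    return text[:280]  # keep the post within 280 characters


if __name__ == "__main__":
    steps = [fake_url_fetcher, fake_social_media]
    print(asyncio.run(run_chain(steps, "http://llmindset.co.uk")))
```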

All Agents and Workflows respond to .send("message") or .prompt() to begin a chat session.

Saved as workflow/chaining.py, this workflow can now be run from the command line with:

uv run workflow/chaining.py --agent post_writer --message "<url>"

Add the --quiet switch to disable progress and message display and return only the final response - useful for simple automations.

MAKER

MAKER (“Massively decomposed Agentic processes with K-voting Error Reduction”) wraps a worker agent and samples it repeatedly until a response achieves a k-vote margin over all alternatives (“first-to-ahead-by-k” voting). This is useful for long chains of simple steps where rare errors would otherwise compound.

Reference: Solving a Million-Step LLM Task with Zero Errors
Credit: Lucid Programmer (PR author)

@fast.agent(
  name="classifier",
  instruction="Reply with only: A, B, or C.",
)
@fast.maker(
  name="reliable_classifier",
  worker="classifier",
  k=3,
  max_samples=25,
  match_strategy="normalized",
  red_flag_max_length=16,
)
async def main():
  async with fast.run() as agent:
    await agent.reliable_classifier.send("Classify: ...")
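The "first-to-ahead-by-k" rule can be sketched independently of fast-agent. The function below is a minimal illustration, not the library's implementation. It assumes that "normalized" matching means trimming whitespace and ignoring case, and that replies longer than red_flag_max_length are discarded but still count towards max_samples; neither detail is confirmed by the text above.

```python
from collections import Counter
from typing import Callable, Optional


def first_to_ahead_by_k(
    sample: Callable[[], str],
    k: int = 3,
    max_samples: int = 25,
    red_flag_max_length: Optional[int] = 16,
) -> Optional[str]:
    """Sample until one answer leads the runner-up by k votes, else None."""
    votes: Counter = Counter()
    for _ in range(max_samples):
        raw = sample()
        if red_flag_max_length is not None and len(raw) > red_flag_max_length:
            continue  # assumption: over-long replies are red-flagged and dropped
        votes[raw.strip().casefold()] += 1  # assumption: "normalized" matching
        ranked = votes.most_common(2)
        leader, lead_count = ranked[0]
        runner_up_count = ranked[1][1] if len(ranked) > 1 else 0
        if lead_count - runner_up_count >= k:
            return leader
    return None  # no answer reached the required margin


replies = iter(["A", "b", "a ", "A", "A"])
print(first_to_ahead_by_k(lambda: next(replies), k=3))  # prints: a
```

In this run the single dissenting "b" delays the decision: "a" needs four votes before its margin over "b" reaches three.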

Agents As Tools

The Agents As Tools workflow takes a complex task, breaks it into subtasks, and calls other agents as tools based on the main agent instruction.

This pattern is inspired by the OpenAI Agents SDK Agents as tools feature.

With child agents exposed as tools, you can implement routing, parallelization, and orchestrator-workers decomposition directly in the instruction (and combine them). Multiple tool calls per turn are supported and executed in parallel.

Common usage patterns may combine:

Routing: choose the right specialist tool(s) based on the user prompt.
Parallelization: fan out over independent items/projects, then aggregate.
Orchestrator-workers: break a task into scoped subtasks (often via a simple JSON plan), then coordinate execution.

@fast.agent(
    name="NY-Project-Manager",
    instruction="Return NY time + timezone, plus a one-line project status.",
    servers=["time"],
)
@fast.agent(
    name="London-Project-Manager",
    instruction="Return London time + timezone, plus a one-line news update.",
    servers=["time"],
)
@fast.agent(
    name="PMO-orchestrator",
    instruction=(
        "Get reports. Always use one tool call per project/news. "  # parallelization
        "Responsibilities: NY projects: [OpenAI, Fast-Agent, Anthropic]. London news: [Economics, Art, Culture]. "  # routing
        "Aggregate results and add a one-line PMO summary."
    ),
    default=True,
    agents=["NY-Project-Manager", "London-Project-Manager"],  # orchestrator-workers
)
async def main() -> None:
    async with fast.run() as agent:
        await agent("Get PMO report. Projects: all. News: Art, Culture")
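The fan-out and aggregate pattern behind parallel tool calls can be shown with asyncio alone. This is a sketch under stated assumptions: call_tool is a hypothetical stand-in for a child agent exposed as a tool, and orchestrate plays the part of the parent agent issuing several tool calls in one turn.

```python
import asyncio


async def call_tool(name: str, topic: str) -> str:
    """Stand-in for a child agent exposed as a tool (hypothetical)."""
    await asyncio.sleep(0.01)  # simulate model or tool latency
    return f"{name}: report on {topic}"


async def orchestrate(requests: list[tuple[str, str]]) -> str:
    # Fan out: one call per (tool, topic) pair, all running concurrently.
    reports = await asyncio.gather(*(call_tool(n, t) for n, t in requests))
    # Aggregate: gather returns results in the order the calls were given.
    return "\n".join(reports)


if __name__ == "__main__":
    plan = [
        ("NY-Project-Manager", "OpenAI"),
        ("NY-Project-Manager", "Anthropic"),
        ("London-Project-Manager", "Art"),
    ]
    print(asyncio.run(orchestrate(plan)))
```

Because the calls overlap, total latency is close to that of the slowest call rather than the sum of all calls.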

An extended example that exercises all parameters is available in the repository as examples/workflows/agents_as_tools_extended.py.

MCP OAuth (v2.1)

For SSE and HTTP MCP servers, OAuth is enabled by default with minimal configuration. A local callback server is used to capture the authorization code, with a paste-URL fallback if the port is unavailable.

Minimal per-server settings in fast-agent.yaml:

mcp:
  servers:
    myserver:
      transport: http # or sse
      url: http://localhost:8001/mcp # or /sse for SSE servers
      auth:
        oauth: true # default: true
        redirect_port: 3030 # default: 3030
        redirect_path: /callback # default: /callback
        # scope: "user"       # optional; if omitted, server defaults are used

The OAuth client uses PKCE.
Token persistence: by default, tokens are stored securely in your OS keychain via keyring. If a keychain is unavailable (e.g., headless container), in-memory storage is used for the session and no tokens are written to disk.
To force in-memory only per server, set:

mcp:
  servers:
    myserver:
      transport: http
      url: http://localhost:8001/mcp
      auth:
        oauth: true
        persist: memory

To disable OAuth for a specific server, set auth.oauth: false for that server.
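For readers unfamiliar with PKCE, the sketch below shows how a code verifier and its challenge are derived as described in RFC 7636. It assumes the S256 challenge method, which the text above does not specify, and the function names are hypothetical rather than fast-agent internals.

```python
import base64
import hashlib
import secrets


def pkce_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def new_pkce_pair() -> tuple[str, str]:
    """Create a random verifier (RFC 7636 allows 43 to 128 characters)."""
    verifier = secrets.token_urlsafe(64)  # 64 random bytes -> 86 characters
    return verifier, pkce_challenge(verifier)


verifier, challenge = new_pkce_pair()
print(len(verifier), len(challenge))
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code alone cannot be redeemed.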

MCP Ping (optional)

The MCP ping utility can be enabled by either peer (client or server). See the Ping overview.

Client-side pinging is configured per server (default: 30s interval, 3 missed pings):

mcp:
  servers:
    myserver:
      ping_interval_seconds: 30 # optional

Core symbols (most depended-on inside this repo)

append · called by 2324 · src/fast_agent/llm/memory.py
get · called by 1490 · src/fast_agent/llm/memory.py
write_text · called by 766 · src/fast_agent/eval/artifacts.py
get · called by 610 · src/fast_agent/core/harness.py
append · called by 547 · src/fast_agent/ui/stream_segments.py
add_message · called by 362 · src/fast_agent/commands/results.py
resolve · called by 351 · src/fast_agent/llm/tool_tracking.py
pop · called by 350 · src/fast_agent/llm/memory.py

Shape

Function 11,721

Method 7,715

Class 2,355

Route 116

Languages

Python 100%

TypeScript 1%

Modules by API surface

tests/unit/fast_agent/ui/test_agent_completer.py · 178 symbols
tests/unit/fast_agent/llm/providers/test_responses_helpers.py · 166 symbols
src/fast_agent/mcp/mcp_aggregator.py · 160 symbols
tests/unit/fast_agent/llm/providers/test_responses_websocket.py · 149 symbols
src/fast_agent/ui/prompt/completer.py · 148 symbols
src/fast_agent/llm/provider/anthropic/llm_anthropic.py · 135 symbols
src/fast_agent/agents/mcp_agent.py · 134 symbols
src/fast_agent/agents/smart_agent.py · 132 symbols
tests/unit/fast_agent/llm/test_model_factory.py · 129 symbols
src/fast_agent/config.py · 128 symbols
src/fast_agent/llm/fastagent_llm.py · 126 symbols
src/fast_agent/llm/provider/bedrock/llm_bedrock.py · 124 symbols

Dependencies (from manifests, versioned)

fast-agent-mcp 0.6.10 · 1×
fastapi 0.136.3 · 1×
fastmcp 3.4.2 · 1×
google-api-python-client · 1×
google-cloud-aiplatform · 1×
huggingface_hub 1.11.0 · 1×
mcp 1.27.2 · 1×
pydantic 2.13.4 · 1×
pydantic-settings 2.14.2 · 1×
pyyaml 6.0.3 · 1×
rich 15.0.0 · 1×
starlette 0.46.2 · 1×

For agents

$ claude mcp add fast-agent \
  -- python -m otcore.mcp_server <graph>
