A powerful, feature-rich command-line interface for interacting with Model Context Protocol servers. This client enables seamless communication with LLMs through integration with the CHUK Tool Processor and CHUK-LLM, providing tool usage, conversation management, and multiple operational modes.
Default Configuration: MCP CLI defaults to using Ollama with the gpt-oss reasoning model for local, privacy-focused operation without requiring API keys.
- `--vm` flag: Enable OS-style virtual memory for conversation context management, powered by `chuk-ai-session-manager`
- `--vm-budget`: Control token budget for conversation events (system prompt is uncapped on top), forcing earlier eviction and page creation
- `--vm-mode`: Choose VM mode: `passive` (runtime-managed, default), `relaxed` (VM-aware conversation), or `strict` (model-driven paging with tools)
- `/memory` command: Visualize VM state during conversations: page table, working set utilization, eviction metrics, TLB stats (aliases: `/vm`, `/mem`)
- `/memory page <id> --download`: Export page content to local files with modality-aware extensions (`.txt`, `.json`, `.png`)
- `/plan` command: Create, inspect, and execute reproducible tool call graphs: `create`, `list`, `show`, `run`, `delete`, `resume`
- `--plan-tools`: The LLM autonomously creates and executes plans during conversation, with no `/plan` command needed. It calls `plan_create_and_execute` when multi-step orchestration is required, and uses regular tools for simple tasks. Each step renders with real-time progress in the terminal
- `max_concurrency`
- `${var}`, `${var.field}` nested access, and template strings like `"https://${api.host}/users"`; type-preserving for single refs
- `/plan resume <id>`
- `enable_replan=True`
- `_meta.ui` annotations automatically open in the browser when called
- `--log-file` flag enables rotating JSON log files (10MB, 3 backups) at DEBUG level
- `tool_timeout` and `init_timeout` overrides, resolved per-server → global → default
- `asyncio.Lock` and copy-on-write header mutation
- `/health` command, health-check-on-failure diagnostics, optional `--health-interval` background polling
- `/usage` command (aliases: `/tokens`, `/cost`)
- `/sessions`
- `/export`
- `--dashboard` flag: Launch a real-time browser dashboard alongside chat mode
- `/attach` command: Stage files for the next message: images, text/code, and audio (aliases: `/file`, `/image`)
- `--attach` CLI flag: Attach files to the first message (repeatable: `--attach img.png --attach code.py`)
- `@file:` references: Mention `@file:path/to/file` anywhere in a message to attach it
- `logging` only, no UI imports

The MCP CLI is built on a modular architecture with clean separation of concerns:
- `/usage` command
- `/attach`, `--attach`, `@file:` refs, or browser upload

MCP CLI supports all providers and models from CHUK-LLM, including cutting-edge reasoning models:
| Provider | Key Models | Special Features |
|---|---|---|
| Ollama (Default) | 🧠 gpt-oss, llama3.3, llama3.2, qwen3, qwen2.5-coder, deepseek-coder, granite3.3, mistral, gemma3, phi3, codellama | Local reasoning models, privacy-focused, no API key required |
| OpenAI | 🚀 GPT-5 family (gpt-5, gpt-5-mini, gpt-5-nano), GPT-4o family, O3 series (o3, o3-mini) | Advanced reasoning, function calling, vision |
| Anthropic | 🧠 Claude 4.5 family (claude-4-5-opus, claude-4-5-sonnet), Claude 3.5 Sonnet | Enhanced reasoning, long context |
| Azure OpenAI 🏢 | Enterprise GPT-5, GPT-4 models | Private endpoints, compliance, audit logs |
| Google Gemini | Gemini 2.0 Flash, Gemini 1.5 Pro | Multimodal, fast inference |
| Groq ⚡ | Llama 3.1 models, Mixtral | Ultra-fast inference (500+ tokens/sec) |
| Perplexity 🌐 | Sonar models | Real-time web search with citations |
| IBM watsonx 🏢 | Granite, Llama models | Enterprise compliance |
| Mistral AI 🇪🇺 | Mistral Large, Medium | European, efficient models |
- `_meta.ui` annotations automatically launch browser apps on tool call
- `--plan-tools`, the LLM autonomously decides when to plan: calls `plan_create_and_execute` for complex multi-step tasks, uses regular tools for simple ones
- `result_variable`, referenced by later steps as `${var}` or `${var.field}`
- `~/.mcp-cli/plans/`

$ claude mcp add mcp-cli \
-- python -m otcore.mcp_server <graph>
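
The plan features above mention `${var}` and `${var.field}` references, template strings such as `"https://${api.host}/users"`, and type preservation for single references. The following is a minimal sketch of how such a resolver could behave, not MCP CLI's actual implementation: the names `resolve` and `_lookup` and the regular expression are assumptions made for illustration.

```python
import re
from typing import Any, Mapping

# Hypothetical pattern: matches ${name} or ${name.field.subfield}.
_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)\}")


def _lookup(path: str, variables: Mapping[str, Any]) -> Any:
    """Walk a dotted path such as 'api.host' through nested mappings."""
    value: Any = variables
    for key in path.split("."):
        value = value[key]  # KeyError if the reference is undefined
    return value


def resolve(value: Any, variables: Mapping[str, Any]) -> Any:
    """Substitute ${...} references inside strings, dicts, and lists."""
    if isinstance(value, str):
        whole = _REF.fullmatch(value)
        if whole:
            # Single reference: return the stored object itself so that
            # ints, lists, and dicts keep their type.
            return _lookup(whole.group(1), variables)
        # Template string: interpolate each reference as text.
        return _REF.sub(lambda m: str(_lookup(m.group(1), variables)), value)
    if isinstance(value, dict):
        return {k: resolve(v, variables) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, variables) for v in value]
    return value


# Example variable store, as earlier plan steps might have populated it.
variables = {"api": {"host": "example.com", "port": 8080}, "ids": [1, 2, 3]}

args = resolve(
    {"url": "https://${api.host}/users", "port": "${api.port}", "ids": "${ids}"},
    variables,
)
print(args)
```

In this sketch a string that consists of exactly one reference resolves to the stored object, so `"${api.port}"` becomes the integer `8080`, while a reference embedded in a longer string is converted to text.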