Witsy
Desktop AI Assistant
Universal MCP Client
Download Witsy from the releases page.
On macOS you can also run `brew install --cask witsy`.
Witsy is a BYOK (Bring Your Own Keys) AI application: you need API keys for the LLM providers you want to use. Alternatively, you can use Ollama to run models locally on your machine for free and use them in Witsy.
It is the first of very few (only?) universal MCP clients:
Witsy allows you to run MCP servers with virtually any LLM!
| Capability | Providers |
|---|---|
| Chat | OpenAI, Anthropic, Google (Gemini), xAI (Grok), Meta (Llama), Ollama, LM Studio, MistralAI, DeepSeek, OpenRouter, Groq, Cerebras, Azure OpenAI, any provider that supports the OpenAI API standard (together.ai for instance) |
| Image Creation | OpenAI, Google, xAI, Replicate, fal.ai, HuggingFace, Stable Diffusion WebUI |
| Video Creation | OpenAI, Google, Replicate, fal.ai |
| Text-to-Speech | OpenAI, ElevenLabs, Groq, fal.ai |
| Speech-to-Text | OpenAI (Whisper), fal.ai, Fireworks.ai, Gladia, Groq, nVidia, Speechmatics, Local Whisper, Soniox (realtime and async), any provider that supports the OpenAI API standard |
| Search Engines | Perplexity, Tavily, Brave, Exa, Local Google Search |
| MCP Repositories | Smithery.ai |
| Embeddings | OpenAI, Ollama |
Non-exhaustive feature list:
- Chat completion with vision model support (describe an image)
- Text-to-image and text-to-video
- Image-to-image (image editing) and image-to-video
- LLM plugins to augment the LLM: execute Python code, search the Internet...
- Anthropic MCP server support
- Scratchpad to interactively create the best content with any model!
- Prompt Anywhere lets you generate content directly in any application
- AI commands runnable on highlighted text in almost any application
- Expert prompts to specialize your bot on a specific topic
- Long-term memory plugin to increase relevance of LLM answers
- Read aloud of assistant messages
- Read aloud of any text in other applications
- Chat with your local files and documents (RAG)
- Transcription/Dictation (Speech-to-Text)
- Realtime Chat aka Voice Mode
- Anthropic Computer Use support
- Local history of conversations (with automatic titles)
- Formatting and copy to clipboard of generated code
- Conversation PDF export
- Image copy and download

You can download a binary from the releases page or build it yourself:
```bash
npm ci
npm start
```
To use models from a hosted provider such as OpenAI, Anthropic, Google or Mistral AI, you need to enter your API key for that provider:
- OpenAI
- Anthropic
- Google
- xAI
- Meta
- MistralAI
- DeepSeek
- OpenRouter
- Groq
- Cerebras
To use Ollama models, you need to install Ollama and download some models.
To use text-to-speech, you need one of the following:
- OpenAI API key
- Fal.ai API key
- Fireworks.ai API key
- Groq API key
- Speechmatics API key
- Gladia API key
To use Internet search you need a Tavily API key.
Generate content in any application:
- From any editable content in any application
- Hit the Prompt Anywhere shortcut (Shift+Control+Space / ⌃⇧Space)
- Enter your prompt in the window that pops up
- Watch Witsy enter the text directly in your application!
On macOS, you can define an expert that is triggered automatically depending on the foreground application. For instance, if you have an expert that generates Linux commands, you can have it selected whenever you trigger Prompt Anywhere from the Terminal application!
AI commands are quick helpers, accessible from a shortcut, that leverage LLMs to boost your productivity:
- Select any text in any application
- Hit the AI command shortcut (Alt+Control+Space / ⌃⌥Space)
- Select one of the commands and let the LLM do its magic!
You can also create custom commands with the prompt of your liking!

Commands inspired by https://the.fibery.io/@public/Public_Roadmap/Roadmap_Item/AI-Assistant-via-ChatGPT-API-170.
From https://github.com/f/awesome-chatgpt-prompts.
https://www.youtube.com/watch?v=czcSbG2H-wg
You can connect each chat with a document repository: Witsy will first search for relevant documents in your local files and provide this info to the LLM.
You can transcribe audio recorded from the microphone to text. Transcription can be done using a variety of state-of-the-art speech-to-text models (which require an API key) or using a local Whisper model (which requires downloading large files).
Currently Witsy supports the following speech-to-text models:
- GPT4o-Transcribe
- Gladia
- Speechmatics (Standards + Enhanced)
- Groq Whisper V3
- Fireworks.ai Realtime Transcription
- fal.ai Wizper V3
- fal.ai ElevenLabs
- nVidia Microsoft Phi-4 Multimodal
Witsy supports quick shortcuts, so your transcript is always only one button press away.
Once the text is transcribed you can:
https://www.youtube.com/watch?v=vixl7I07hBk
Witsy provides a local HTTP API that allows external applications to trigger various commands and features. The API server runs on localhost by default on port 8090 (or the next available port if 8090 is in use).
Security Note: The HTTP server runs on localhost only by default. If you need external access, consider using a reverse proxy with proper authentication.
The current HTTP server port is displayed in the tray menu below the Settings option:
- macOS/Linux: Check the fountain pen icon in the menu bar
- Windows: Check the fountain pen icon in the system tray
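Because the server may fall back to the next available port when 8090 is taken, a script can also discover the port by probing the health endpoint. The following is a minimal sketch under assumptions: the function names, the scan range of ten ports, and the idea that an HTTP 200 from `/api/health` identifies Witsy are all illustrative and not part of Witsy itself.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable, Optional


def probe_health(port: int, timeout: float = 0.5) -> bool:
    """Return True if GET /api/health on localhost:port answers with HTTP 200."""
    url = f"http://localhost:{port}/api/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Nothing listening, a timeout, or a non-HTTP service on that port.
        return False


def find_witsy_port(start: int = 8090, attempts: int = 10,
                    probe: Callable[[int], bool] = probe_health) -> Optional[int]:
    """Scan start, start + 1, ... and return the first port whose probe succeeds."""
    for port in range(start, start + attempts):
        if probe(port):
            return port
    return None
```

Note that any other service answering 200 on `/api/health` would also match, so the tray menu remains the authoritative place to check the port.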
All endpoints support both GET (with query parameters) and POST (with JSON or form-encoded body) requests.
| Endpoint | Description | Optional Parameters |
|---|---|---|
| `GET /api/health` | Server health check | - |
| `GET/POST /api/chat` | Open main window in chat view | `text` - Pre-fill chat input |
| `GET/POST /api/scratchpad` | Open scratchpad | - |
| `GET/POST /api/settings` | Open settings window | - |
| `GET/POST /api/studio` | Open design studio | - |
| `GET/POST /api/forge` | Open agent forge | - |
| `GET/POST /api/realtime` | Open realtime chat (voice mode) | - |
| `GET/POST /api/prompt` | Trigger Prompt Anywhere | `text` - Pre-fill prompt |
| `GET/POST /api/command` | Trigger AI command picker | `text` - Pre-fill command text |
| `GET/POST /api/transcribe` | Start transcription/dictation | - |
| `GET/POST /api/readaloud` | Start read aloud | - |
| `GET /api/engines` | List available AI engines | Returns configured chat engines |
| `GET /api/models/:engine` | List models for an engine | Returns available models for specified engine |
| `POST /api/complete` | Run chat completion | `stream` (default: true), `engine`, `model`, `thread` (Message[]) |
| `GET/POST /api/agent/run/:token` | Trigger agent execution via webhook | Query params passed as prompt inputs |
| `GET /api/agent/status/:token/:runId` | Check agent execution status | Returns status, output, and error |
```bash
# Health check
curl http://localhost:8090/api/health

# Open chat with pre-filled text (GET with query parameter)
curl "http://localhost:8090/api/chat?text=Hello%20World"

# Open chat with pre-filled text (POST with JSON)
curl -X POST http://localhost:8090/api/chat \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello World"}'

# Trigger Prompt Anywhere with text
curl "http://localhost:8090/api/prompt?text=Write%20a%20poem"

# Trigger AI command on selected text
curl -X POST http://localhost:8090/api/command \
  -H "Content-Type: application/json" \
  -d '{"text":"selected text to process"}'

# Trigger agent via webhook with parameters
curl "http://localhost:8090/api/agent/run/abc12345?input1=value1&input2=value2"

# Trigger agent with POST JSON
curl -X POST http://localhost:8090/api/agent/run/abc12345 \
  -H "Content-Type: application/json" \
  -d '{"input1":"value1","input2":"value2"}'

# Check agent execution status
curl "http://localhost:8090/api/agent/status/abc12345/run-uuid-here"

# List available engines
curl http://localhost:8090/api/engines

# List models for a specific engine
curl http://localhost:8090/api/models/openai

# Run non-streaming chat completion
curl -X POST http://localhost:8090/api/complete \
  -H "Content-Type: application/json" \
  -d '{
    "stream": "false",
    "engine": "openai",
    "model": "gpt-4",
    "thread": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'

# Run streaming chat completion (SSE)
curl -X POST http://localhost:8090/api/complete \
  -H "Content-Type: application/json" \
  -d '{
    "stream": "true",
    "thread": [
      {"role": "user", "content": "Write a short poem"}
    ]
  }'
```
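The agent webhook and status endpoints can be combined into a small polling loop from a script. The sketch below rests on assumptions: the table only says the status endpoint returns status, output, and error, so the `"running"` status value, the helper names, and the polling interval are hypothetical and should be adapted to what your Witsy version actually returns.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict

BASE_URL = "http://localhost:8090"  # adjust if Witsy picked another port


def agent_run_url(token: str, inputs: Dict[str, str], base_url: str = BASE_URL) -> str:
    """Build the GET webhook URL; inputs are sent as query parameters."""
    url = f"{base_url}/api/agent/run/{urllib.parse.quote(token, safe='')}"
    query = urllib.parse.urlencode(inputs)
    return f"{url}?{query}" if query else url


def agent_status_url(token: str, run_id: str, base_url: str = BASE_URL) -> str:
    """Build the status URL for a given webhook token and run id."""
    quote = urllib.parse.quote
    return f"{base_url}/api/agent/status/{quote(token, safe='')}/{quote(run_id, safe='')}"


def get_json(url: str) -> Dict[str, Any]:
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def wait_for_agent(token: str, run_id: str,
                   fetch: Callable[[str], Dict[str, Any]] = get_json,
                   is_running: Callable[[Dict[str, Any]], bool] =
                   lambda s: s.get("status") == "running",  # assumed status value
                   interval: float = 1.0, max_polls: int = 60) -> Dict[str, Any]:
    """Poll the status endpoint until is_running() is False or max_polls is hit."""
    for _ in range(max_polls):
        status = fetch(agent_status_url(token, run_id))
        if not is_running(status):
            return status
        time.sleep(interval)
    raise TimeoutError(f"agent run {run_id} still running after {max_polls} polls")
```

Passing `fetch` and `is_running` as parameters keeps the loop independent of the exact response shape, which is not spelled out here.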
Witsy includes a command-line interface that allows you to interact with the AI assistant directly from your terminal.
Installation
The CLI is automatically installed when you launch Witsy for the first time:
- macOS: Creates a symlink at /usr/local/bin/witsy (requires admin password)
- Windows: Adds the CLI to your user PATH (restart terminal for changes to take effect)
- Linux: Creates a symlink at /usr/local/bin/witsy (uses pkexec if needed)
Usage
Once installed, you can use the witsy command from any terminal:
witsy
The CLI will connect to your running Witsy application and provide an interactive chat interface. It uses the same configuration (engine, model, API keys) as your desktop application.
Available Commands
- /help - Show available commands
- /model - Select engine and model
- /port - Change server port (default: 4321)
- /clear - Clear conversation history
- /history - Show conversation history
- /exit - Exit the CLI
Requirements
- Witsy desktop application must be running
- HTTP API server enabled (default port: 4321)
The /api/complete endpoint provides programmatic access to Witsy's chat completion functionality, enabling command-line tools and scripts to interact with any configured LLM.
Endpoint: POST /api/complete
Request body:
```json
{
  "stream": "true",    // Optional: "true" (default) for SSE streaming, "false" for JSON response
  "engine": "openai",  // Optional: defaults to configured engine in settings
  "model": "gpt-4",    // Optional: defaults to configured model for the engine
  "thread": [          // Required: array of messages
    {"role": "user", "content": "Your prompt here"}
  ]
}
```
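The same request body can be assembled from a script. This is a minimal sketch using only the Python standard library; the function names are hypothetical, the base URL assumes the port used in the curl examples, and `stream` is sent as a string because that is how the request body above documents it.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional


def build_complete_request(thread: List[Dict[str, str]],
                           engine: Optional[str] = None,
                           model: Optional[str] = None,
                           stream: bool = False,
                           base_url: str = "http://localhost:8090") -> urllib.request.Request:
    """Build a POST /api/complete request; engine and model are omitted when not given."""
    payload: Dict[str, Any] = {"stream": "true" if stream else "false", "thread": thread}
    if engine:
        payload["engine"] = engine
    if model:
        payload["model"] = model
    return urllib.request.Request(
        f"{base_url}/api/complete",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(thread: List[Dict[str, str]],
             engine: Optional[str] = None,
             model: Optional[str] = None,
             base_url: str = "http://localhost:8090") -> Dict[str, Any]:
    """Send a non-streaming completion and return the decoded JSON response."""
    req = build_complete_request(thread, engine, model, stream=False, base_url=base_url)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

With Witsy running, `complete([{"role": "user", "content": "Hello"}])` would use the engine and model configured in the settings, since both fields are optional.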
Response (non-streaming, stream: "false"):
```json
{
"success": true,
"response": {