github.com/speaches-ai/speaches @v0.8.3 sqlite

repository ↗ · DeepWiki ↗ · release v0.8.3 ↗
512 symbols 1,984 edges 71 files 30 documented · 6%
README

[!NOTE] This project was previously named faster-whisper-server. I changed the name because the project has evolved to support more than just ASR.

Speaches

speaches is an OpenAI API-compatible server supporting streaming transcription, translation, and speech generation. Speech-to-Text is powered by faster-whisper, and Text-to-Speech uses piper and Kokoro. This project aims to be Ollama, but for TTS/STT models.

See the documentation for installation instructions and usage: speaches.ai

Features:

  • OpenAI API compatible. All tools and SDKs that work with OpenAI's API should work with speaches.
  • Audio generation (chat completions endpoint) | OpenAI Documentation
      • Generate a spoken audio summary of a body of text (text in, audio out)
      • Perform sentiment analysis on a recording (audio in, text out)
      • Async speech to speech interactions with a model (audio in, audio out)
  • Streaming support: the transcription is sent via SSE as the audio is transcribed, so you don't need to wait for the audio to be fully transcribed before receiving results.
  • Dynamic model loading / offloading. Just specify which model you want to use in the request and it will be loaded automatically. It will then be unloaded after a period of inactivity.
  • Text-to-Speech via kokoro (ranked #1 in the TTS Arena) and piper models.
  • GPU and CPU support.
  • Deployable via Docker Compose / Docker
  • Realtime API
  • Highly configurable
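
Because the server follows OpenAI's API shape, a speech-generation call can be sketched with nothing but the Python standard library. The snippet below only builds the HTTP request for the OpenAI-style `/v1/audio/speech` route. The base URL, model name, and voice name are placeholders assumed for illustration, not values taken from this project; check speaches.ai for the real ones.

```python
import json
import urllib.request


def build_speech_request(base_url: str, model: str, voice: str, text: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style text-to-speech request."""
    # Field names follow OpenAI's /v1/audio/speech request body.
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values: adjust to your own deployment and installed models.
request = build_speech_request(
    base_url="http://localhost:8000",  # assumed address of a local server
    model="example-tts-model",         # hypothetical model id
    voice="example-voice",             # hypothetical voice id
    text="Hello from speaches!",
)

# To actually send it against a running server, something like:
#     with urllib.request.urlopen(request) as response:
#         audio_bytes = response.read()
```

The same request can also be issued through any OpenAI SDK by pointing its base URL at the server, which is what the compatibility claim above implies.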

Please create an issue if you find a bug, have a question, or a feature suggestion.

Demos

Realtime API

https://github.com/user-attachments/assets/457a736d-4c29-4b43-984b-05cc4d9995bc

(Excuse the breathing lol. Didn't have enough time to record a better demo)

Streaming Transcription

TODO
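
Until a recorded demo is available, here is a minimal sketch of how a client might consume a streamed transcription. It is a generic parser for the server-sent events (SSE) wire format and assumes each event's `data` field carries a piece of the transcript. The exact payload shape speaches sends is not specified here, so treat that as an assumption.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each server-sent event in a line stream."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            # Comment line, often used as a keep-alive; ignore it.
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:
        # Lenient choice: also emit a final event that lacks a closing blank line.
        yield "\n".join(buffer)


# Simulated response lines; a real client would iterate over the HTTP body.
sample = ["data: Hello\n", "\n", ": keep-alive\n", "data: world\n", "\n"]
for chunk in iter_sse_data(sample):
    print(chunk)
```

Each yielded chunk can be shown to the user as soon as it arrives, without waiting for the whole recording to finish.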

Speech Generation

https://github.com/user-attachments/assets/0021acd9-f480-4bc3-904d-831f54c4d45b

Core symbols (most depended-on inside this repo)

publish_nowait · called by 44 · src/speaches/realtime/pubsub.py
append · called by 31 · src/speaches/realtime/input_audio_buffer.py
include_router · called by 16 · src/speaches/realtime/event_router.py
extend · called by 14 · src/speaches/audio.py
srt_format_timestamp · called by 12 · src/speaches/text_utils.py
vtt_format_timestamp · called by 12 · src/speaches/text_utils.py
list_local_models · called by 10 · src/speaches/executors/piper/utils.py
strip_markdown_emphasis · called by 9 · src/speaches/text_utils.py

Shape

Function 197
Class 143
Method 136
Route 36

Languages

Python 99%
TypeScript 1%

Modules by API surface

src/speaches/types/realtime.py · 57 symbols
src/speaches/types/chat.py · 30 symbols
src/speaches/text_utils.py · 21 symbols
tests/api_chat_test.py · 16 symbols
src/speaches/routers/models.py · 16 symbols
src/speaches/realtime/response_event_router.py · 16 symbols
src/speaches/routers/realtime/rtc.py · 14 symbols
src/speaches/routers/chat.py · 14 symbols
src/speaches/realtime/message_manager.py · 14 symbols
src/speaches/hf_utils.py · 14 symbols
src/speaches/realtime/input_audio_buffer_event_router.py · 13 symbols
tests/conftest.py · 12 symbols

Dependencies (from manifests, versioned)

ctranslate2 4.5.0 · 1×
fastapi 0.115.6 · 1×
faster-whisper 1.1.1 · 1×
httpx 0.27.2 · 1×
typer 0.12.5 · 1×

For agents

$ claude mcp add speaches \
  -- python -m otcore.mcp_server <graph>
