
github.com/abus-aikorea/voice-pro @v3.2.0 sqlite

2,038 symbols 6,610 edges 235 files 344 documented · 17%
README

Voice-Pro

The best AI speech recognition, translation, and multilingual dubbing solution 🚀


<img src="https://github.com/abus-aikorea/voice-pro/raw/v3.2.0/docs/images/main_page_crop.eng.jpg?raw=true" alt="Dubbing Studio"/>

🎙️ An AI-powered web application for speech recognition, translation, and dubbing


Voice-Pro is a state-of-the-art web app that transforms multimedia content creation. It integrates YouTube video downloading, voice separation, speech recognition, translation, and text-to-speech into a single, powerful tool for creators, researchers, and multilingual professionals.

  • 🔊 Top-tier speech recognition: Whisper, Faster-Whisper, Whisper-Timestamped, WhisperX
  • 🎤 Zero-shot voice cloning: F5-TTS, E2-TTS, CosyVoice
  • 📢 Multilingual text-to-speech: Edge-TTS, kokoro (Paid version includes Azure TTS)
  • 🎥 YouTube processing & audio extraction: yt-dlp
  • 🌍 Instant translation for 100+ languages: Deep-Translator (Paid version includes Azure Translator)

A robust alternative to ElevenLabs, Voice-Pro empowers podcasters, developers, and creators with advanced voice solutions.

⚠️ Please Note

  • Due to WeConnect development work, Voice-Pro development and updates are not possible for the time being.
  • We have made all Voice-Pro code open source and completely free. Voice-Pro can now be freely distributed and modified by anyone.
  • It works well on Windows with an NVIDIA GPU. Operation on Mac and Linux has not been verified.
  • Please leave your requests on the GitHub Issues or GitHub Discussions pages.
  • Troubleshooting: In most cases, issues can be resolved by deleting the installer_files folder and then running configure.bat followed by start.bat.
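
The troubleshooting step above can also be scripted. The following is a minimal Python sketch, assuming it is pointed at the folder that contains `installer_files`; `reset_install` is a hypothetical helper name and is not part of Voice-Pro. It only deletes the folder and reports which scripts to run next; it does not launch the `.bat` files itself.

```python
import shutil
from pathlib import Path


def reset_install(repo_dir: str) -> list[str]:
    """Delete installer_files under repo_dir and return the scripts to run next.

    Hypothetical helper for illustration; Voice-Pro does not ship this function.
    """
    installer_dir = Path(repo_dir) / "installer_files"
    if installer_dir.is_dir():
        # Remove the folder and everything inside it.
        shutil.rmtree(installer_dir)
    # The troubleshooting note says to run these two scripts, in this order.
    return ["configure.bat", "start.bat"]


# Example use (run from a terminal, then start the listed scripts by hand):
#   for script in reset_install(r"C:\path\to\voice-pro"):
#       print("next, run:", script)
```

The function is safe to call when `installer_files` is already gone, so it can be rerun without side effects.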

📰 News & History

version 3.2

  • We have been focusing on WeConnect development for the past few months and have not been able to manage Voice-Pro at all.
  • We have decided to open source all Voice-Pro code.
  • Voice-Pro is completely free and supports Windows, Mac, Linux.
  • WeConnect is an application for global cultural exchange.
  • Connect with people from all over the world for meaningful cultural exchanges, language learning, and international friendships.


version 3.1

version 3.0

  • 🔥 Removed the AI Cover feature.
  • 🚀 Added support for m-bain/whisperX.

version 2.0

  • 🐍 Built with Python 3.10.15, Torch 2.5.1+cu124, and Gradio 5.14.0.
  • 🆓 Free trial supports media up to 60 seconds in length.
  • 🔥 Added the AI Cover feature.
  • 🎤 Introduced support for CosyVoice and kokoro.
  • ⏳ Initial run downloads CosyVoice2-0.5B (9GB), which may take over an hour depending on network speed.
  • 🎧 Voice samples for cloning will be continuously updated.
  • 📝 Added spaCy for natural sentence-by-sentence translation and TTS.
  • ☁️ Subscription version includes Microsoft Azure Translator and TTS.
  • 🏪 Subscription offers unlimited usage (no 60-second limit) during the subscription period, available via Shopify.
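
The sentence-by-sentence translation mentioned in the version 2.0 notes can be pictured with a small sketch. How Voice-Pro wires spaCy into its pipeline is not shown in this README, so the code below swaps in a naive regular-expression splitter as a stand-in for spaCy, and `translate` is a caller-supplied placeholder for whatever translator is used (for example Deep-Translator). The function names are hypothetical.

```python
import re
from typing import Callable


def split_sentences(text: str) -> list[str]:
    """Naive stand-in for spaCy: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def translate_by_sentence(
    text: str, translate: Callable[[str], str]
) -> list[tuple[str, str]]:
    """Translate each sentence on its own and pair it with its source sentence."""
    return [(s, translate(s)) for s in split_sentences(text)]


# Example with a dummy "translator" that only upper-cases its input:
#   translate_by_sentence("Hello there. How are you?", str.upper)
```

Keeping each source sentence paired with its translation makes it straightforward to hand the pairs to a TTS step one sentence at a time. A real sentence segmenter handles abbreviations and languages without spaces far better than this regex.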

🎥 YouTube Showcase

  • Demo Video 1: Demo for Voice-Pro (v2.0)
  • Demo Video 2: F5-TTS: Voice Cloning
  • Demo Video 3: Live Transcription & Translation
  • Demo Video 4: Multi-Lingual Voice Cloning: Korean - German
  • Demo Video 5: Multi-Lingual Voice Cloning: English - Korean
  • Demo Video 6: Multi-Lingual Voice Cloning: Korean - Japanese
  • Demo Video 7: NVIDIA RTX Video Super-Resolution
  • Demo Video 8: AI Karaoke

⭐ Key Features

1. Dubbing Studio

  • YouTube video downloads & audio extraction
  • Voice separation with Demucs
  • Supports 100+ languages for speech recognition & translation
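
The first two Dubbing Studio steps, download with audio extraction and voice separation, can be sketched with the command-line tools the README names. This is an assumption-laden illustration: it builds `yt-dlp` and `demucs` command lines from those tools' documented options, and it does not show how Voice-Pro calls them internally. The function names and default folders are hypothetical.

```python
def build_download_command(url: str, out_dir: str = "downloads") -> list[str]:
    """Return a yt-dlp command that downloads a video and keeps WAV audio only."""
    return [
        "yt-dlp",
        "--extract-audio",                        # convert the download to audio only
        "--audio-format", "wav",                  # target format (needs ffmpeg)
        "--output", f"{out_dir}/%(id)s.%(ext)s",  # name files by video id
        url,
    ]


def build_separation_command(audio_path: str, out_dir: str = "separated") -> list[str]:
    """Return a Demucs command that splits a track into vocals and the rest."""
    return [
        "demucs",
        "--two-stems", "vocals",  # write a vocals stem and a no_vocals stem
        "--out", out_dir,         # folder that receives the separated stems
        audio_path,
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)` once `yt-dlp`, `demucs`, and `ffmpeg` are installed. Building the commands as lists rather than shell strings avoids quoting problems with URLs and paths.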

2. Speech Technologies

  • Speech-to-Text: Whisper, Faster-Whisper, Whisper-Timestamped, WhisperX
  • Text-to-Speech:
      • Edge-TTS: 100+ languages, 400+ voices
      • E2-TTS, F5-TTS, CosyVoice: Zero-shot cloning
      • kokoro: Ranked #2 in HuggingFace TTS Arena

3. Real-Time Translation

  • Instant speech recognition
  • Multilingual translation on the fly
  • Customizable audio inputs

🤖 WebUI

Dubbing Studio Tab

  • All-in-one hub: YouTube downloads, noise removal, subtitles, translation, & TTS
  • Supports all ffmpeg-compatible formats
  • Output

Core symbols (most depended-on inside this repo)

  • get: called by 228 (src/config.py)
  • set: called by 101 (src/config.py)
  • info: called by 86 (src/demucs/audio.py)
  • load: called by 79 (cosyvoice/cli/model.py)
  • set_split: called by 70 (app/abus_files.py)
  • get_split: called by 58 (app/abus_files.py)
  • path_add_postfix: called by 56 (app/abus_path.py)
  • update: called by 55 (src/demucs/ema.py)

Shape

Method 1,275
Function 464
Class 299

Languages

Python 99%
TypeScript 1%

Modules by API surface

src/aicover/infer_pack/models.py · 61 symbols
cosyvoice/utils/scheduler.py · 53 symbols
src/aicover/infer_pack/models_onnx_moess.py · 46 symbols
src/aicover/infer_pack/models_onnx.py · 46 symbols
app/gradio_gulliver.py · 39 symbols
src/demucs/transformer.py · 37 symbols
src/aicover/rmvpe.py · 36 symbols
src/aicover/infer_pack/modules.py · 35 symbols
app/abus_path.py · 31 symbols
third_party/Matcha-TTS/matcha/models/components/text_encoder.py · 28 symbols
src/demucs/repo.py · 28 symbols
third_party/Matcha-TTS/matcha/hifigan/models.py · 27 symbols

Dependencies from manifests, versioned

HyperPyYAML 1.2.2 · 1×
WeTextProcessing 1.0.3 · 1×
azure-ai-translation-text 1.0.0b1 · 1×
cached_path 1.6.7 · 1×
conformer 0.3.2 · 1×
demucs 4.0.1 · 1×
diffusers 0.29.0 · 1×
f5-tts 1.0.8 · 1×
fastapi-cli 0.0.4 · 1×
faster-whisper 1.1.0 · 1×
ffmpeg-python 0.2.0 · 1×
gdown 5.1.0 · 1×

For agents

$ claude mcp add voice-pro \
  -- python -m otcore.mcp_server <graph>
