
github.com/DrewThomasson/ebook2audiobook @v26.6.26 sqlite

repository ↗ · DeepWiki ↗ · release v26.6.26 ↗
1,888 symbols 7,166 edges 293 files 259 documented · 14%
README

📚 ebook2audiobook (E2A)

CPU/GPU converter from eBook to audiobook with chapters and metadata, using advanced TTS engines and much more.

Supports voice cloning and 1158 languages!

[!IMPORTANT] This tool is intended for use with non-DRM, legally acquired eBooks only.

The authors are not responsible for any misuse of this software or any resulting legal consequences.

Use this tool responsibly and in accordance with all applicable laws.

Discord

Thank you for supporting the ebook2audiobook developers!

Ko-Fi

Run locally

Quick Start

Docker Build Download

Platform Docker Pull Count

Run Remotely

Hugging Face Free Google Colab Kaggle

Demos

New Default Voice Demo

https://github.com/user-attachments/assets/750035dc-e355-46f1-9286-05c1d9e88cea

More Demos

ASMR Voice

https://github.com/user-attachments/assets/68eee9a1-6f71-4903-aacd-47397e47e422

Rainy Day Voice

https://github.com/user-attachments/assets/d25034d9-c77f-43a9-8f14-0d167172b080

Scarlett Voice

https://github.com/user-attachments/assets/b12009ee-ec0d-45ce-a1ef-b3a52b9f8693

David Attenborough Voice

https://github.com/user-attachments/assets/81c4baad-117e-4db5-ac86-efc2b7fea921

Features

  • 🔧 TTS Engines supported: XTTSv2, Bark, Fairseq, VITS, Tacotron2, Tortoise, GlowTTS, YourTTS
  • 📚 Convert multiple file formats: .epub, .mobi, .azw3, .fb2, .lrf, .rb, .snb, .tcr, .pdf, .txt, .rtf, .doc, .docx, .html, .odt, .azw, .tiff, .tif, .png, .jpg, .jpeg, .bmp, .zip
  • 💻 Text area for converting a short text directly to audio
  • 🔍 OCR scanning for files with text pages as images
  • 🔊 High-quality text-to-speech, ranging from near-real-time speed to near-human voice quality
  • 🗣️ Optional voice cloning using your own voice file
  • 🌐 Supports 1158 languages (supported languages list)
  • 💻 Low-resource friendly — runs on 2 GB RAM / 1 GB VRAM (minimum)
  • 🎵 Audiobook output formats: mono or stereo aac, flac, mp3, m4b, m4a, mp4, mov, ogg, wav, webm
  • 🧠 SML tags supported — fine-grained control of breaks, pauses, voice switching and more (see below)
  • 🧩 Optional custom model using your own trained model (XTTSv2, VITS, FAIRSEQ, PIPER, others on request)
  • 🎛️ Fine-tuned preset models trained by the E2A Team

    (Contact us if you need additional fine-tuned models, or if you’d like to add yours to the official preset list.)

Hardware Requirements

  • 2GB RAM min, 8GB recommended.
  • 1GB VRAM min, 4GB recommended.
  • Virtualization enabled if running on Windows (Docker only).
  • CPU, XPU (Intel, AMD, ARM)*.
  • CUDA, ROCm, JETSON
  • MPS (Apple Silicon GPU)

* Modern TTS engines are very slow on CPU, so use a lower-quality TTS engine such as YourTTS or Tacotron2.

Supported Languages

Arabic (ar) Chinese (zh) English (en) Spanish (es)
French (fr) German (de) Italian (it) Portuguese (pt)
Polish (pl) Turkish (tr) Russian (ru) Dutch (nl)
Czech (cs) Japanese (ja) Hindi (hi) Bengali (bn)
Hungarian (hu) Korean (ko) Vietnamese (vi) Swedish (sv)
Persian (fa) Yoruba (yo) Swahili (sw) Indonesian (id)
Slovak (sk) Croatian (hr) Tamil (ta) Danish (da)
+1130 more languages and dialects (see the supported languages list)

Supported eBook Formats

  • .epub, .pdf, .mobi, .txt, .html, .rtf, .chm, .lit, .pdb, .fb2, .odt, .cbr, .cbz, .prc, .lrf, .pml, .snb, .cbc, .rb, .tcr
  • Best results: .epub or .mobi for automatic chapter detection
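
As a rough illustration of the list above, a script could check a file's extension before starting a conversion. This is only a sketch: `check_ebook_format` is a hypothetical helper, not part of ebook2audiobook, and it covers only the extensions named in this section (the Features list mentions further input types).

```python
from pathlib import Path

# Extensions copied from the "Supported eBook Formats" list above.
SUPPORTED_EBOOK_EXTS = {
    ".epub", ".pdf", ".mobi", ".txt", ".html", ".rtf", ".chm", ".lit",
    ".pdb", ".fb2", ".odt", ".cbr", ".cbz", ".prc", ".lrf", ".pml",
    ".snb", ".cbc", ".rb", ".tcr",
}
# Formats the README recommends for automatic chapter detection.
BEST_CHAPTER_EXTS = {".epub", ".mobi"}


def check_ebook_format(path: str) -> str:
    """Return 'best', 'supported' or 'unsupported' (hypothetical helper)."""
    ext = Path(path).suffix.lower()
    if ext in BEST_CHAPTER_EXTS:
        return "best"
    if ext in SUPPORTED_EBOOK_EXTS:
        return "supported"
    return "unsupported"
```

For example, `check_ebook_format("novel.EPUB")` returns `"best"`, while a plain `.txt` file is reported as `"supported"`.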

Output and process Formats

  • .m4b, .m4a, .mp4, .webm, .mov, .mp3, .flac, .wav, .ogg, .aac
  • Process format can be changed in lib/conf.py

SML tags available

  • [break] — silence (random range 0.3–0.6 sec.)
  • [pause] — silence (random range 1.0–1.6 sec.)
  • [pause:N] — fixed pause (N sec.)
  • [voice:/path/to/voice/file]...[/voice] — switch voice from default or selected voice from GUI/CLI
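
To make the tag semantics above concrete, here is a minimal Python sketch of how such tags could be turned into a sequence of text and silence events. It is not the project's parser: `parse_sml`, the event tuples, and the assumption that `[/voice]` returns to the default voice are all illustrative.

```python
import random
import re

# Matches [break], [pause], [pause:N], [voice:/path] and [/voice].
SML_TAG = re.compile(
    r"\[(break|pause(?::(\d+(?:\.\d+)?))?|voice:([^\]]+)|/voice)\]"
)


def parse_sml(text, default_voice=None, rng=random):
    """Split text into ("text", chunk, voice) and ("silence", seconds) events."""
    events = []
    voice = default_voice
    pos = 0
    for m in SML_TAG.finditer(text):
        chunk = text[pos:m.start()].strip()
        if chunk:
            events.append(("text", chunk, voice))
        tag = m.group(1)
        if tag == "break":
            # Short silence, random within the documented 0.3-0.6 s range.
            events.append(("silence", rng.uniform(0.3, 0.6)))
        elif tag == "pause":
            # Longer silence, random within the documented 1.0-1.6 s range.
            events.append(("silence", rng.uniform(1.0, 1.6)))
        elif tag.startswith("pause:"):
            # Fixed pause of N seconds.
            events.append(("silence", float(m.group(2))))
        elif tag.startswith("voice:"):
            voice = m.group(3)
        else:
            # "[/voice]": assumed to switch back to the default voice.
            voice = default_voice
        pos = m.end()
    tail = text[pos:].strip()
    if tail:
        events.append(("text", tail, voice))
    return events
```

Calling `parse_sml("Hello.[pause:2][voice:/v/a.wav]Hi[/voice] Bye", "default.wav")` yields a text event for each spoken chunk, tagged with the voice in effect, plus one fixed two-second silence.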

See our companion repo, E2A-SML, which adds SML tags to your ebook automatically.

[!IMPORTANT] **Before posting an install or bug issue, search the open and closed issues carefully to make sure your issue has not already been reported.**

[!NOTE] **The EPUB format has no standard way to mark what is a chapter, paragraph, preface, etc., so you should first manually remove any text you don't want converted to audio.**

Instructions

  1. Clone the repo:
     git clone https://github.com/DrewThomasson/ebook2audiobook.git
     cd ebook2audiobook

  2. Install / run ebook2audiobook:

     • Linux/MacOS:
       ./ebook2audiobook.command
       Note for MacOS users: Homebrew is installed automatically to provide any missing programs.

     • Mac Launcher:
       Double-click Mac Ebook2Audiobook Launcher.command

     • Windows:
       ebook2audiobook.cmd
       or double-click ebook2audiobook.cmd
       Note for Windows users: Scoop is installed automatically to provide any missing programs without administrator privileges.

  3. Open the web app: click the URL shown in the terminal to access the web app and convert eBooks. http://localhost:7860/

  4. For a public link:
       ./ebook2audiobook.command --share (Linux/MacOS)
       ebook2audiobook.cmd --share (Windows)
       python app.py --share (all OS)
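
For anyone wrapping the launchers in their own tooling, the per-OS choice above could be sketched in Python as follows. `launcher_command` is a hypothetical helper, not something shipped with the project; it only assembles the command list and does not run anything.

```python
import platform
from typing import List, Optional


def launcher_command(share: bool = False, system: Optional[str] = None) -> List[str]:
    """Pick the launcher named in the instructions for the given OS."""
    system = system or platform.system()
    if system == "Windows":
        cmd = ["ebook2audiobook.cmd"]
    elif system in ("Linux", "Darwin"):
        cmd = ["./ebook2audiobook.command"]
    else:
        # Fallback listed in the README as working on all OSes.
        cmd = ["python", "app.py"]
    if share:
        # Request a public Gradio link.
        cmd.append("--share")
    return cmd
```

The result can be passed to `subprocess.run(...)` from the repository root, for example `launcher_command(share=True)`.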

[!IMPORTANT] **If the script is stopped and run again, refresh the Gradio GUI in your browser so the web page can reconnect to the new connection socket.**

Basic Usage

  • Linux/MacOS: ./ebook2audiobook.command --headless --ebook <path_to_ebook_file> --voice <path_to_voice_file> --language <language_code>
  • Windows: ebook2audiobook.cmd --headless --ebook <path_to_ebook_file> --voice <path_to_voice_file> --language <language_code>

  • [--ebook]: Path to your eBook file

  • [--voice]: Voice cloning file path (optional)
  • [--language]: Language code in ISO-639-3 (e.g., ita for Italian, eng for English, deu for German).

    The default language is eng, set in ./lib/lang.py, so --language is optional when converting in the default language.

    Two-letter ISO-639-1 codes are also supported.
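
The three headless options above can be assembled programmatically. The sketch below is an illustration only: `headless_args` is a hypothetical helper, and `ISO1_TO_ISO3` is a tiny hand-written sample rather than the project's table in ./lib/lang.py. Since the CLI is documented as accepting two-letter codes too, the normalization step is optional.

```python
from typing import List, Optional

# Small illustrative sample; the full table lives in ./lib/lang.py.
ISO1_TO_ISO3 = {"en": "eng", "it": "ita", "de": "deu", "fr": "fra"}


def headless_args(
    ebook: str,
    voice: Optional[str] = None,
    language: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a headless conversion (hypothetical helper)."""
    args = ["--headless", "--ebook", ebook]
    if voice:
        # Voice cloning file is optional.
        args += ["--voice", voice]
    if language:
        # Map known two-letter codes to ISO-639-3; pass anything else through.
        code = language.lower()
        args += ["--language", ISO1_TO_ISO3.get(code, code)]
    return args
```

On Linux/MacOS the list could then be appended to the launcher, for example `subprocess.run(["./ebook2audiobook.command", *headless_args("book.epub", language="it")])`.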

Example of Custom Model Zip Upload

(Must be a .zip file containing the mandatory model files. Example for XTTSv2: config.json, model.pth, vocab.json and ref.wav)

  • Linux/MacOS: ./ebook2audiobook.command --headless --ebook <ebook_file_path> --language <language> --custom_model <custom_model_path>
  • Windows: ebook2audiobook.cmd --headless --ebook <ebook_file_path> --language <language> --custom_model <custom_model_path>

Note: the ref.wav of your custom model is always the voice selected for the conversion.

  • <custom_model_path>: Path to the model_name.zip file, which must contain all the mandatory files for the chosen TTS engine (see ./lib/models.py).
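
Before uploading, it can be handy to verify that a zip really contains the mandatory files. The following is a minimal sketch using Python's standard `zipfile` module; `missing_model_files` is a hypothetical helper, the required set is the XTTSv2 example given above, and matching by base name (so files inside a subfolder also count) is an assumption about what the tool accepts.

```python
import zipfile
from pathlib import PurePosixPath

# Mandatory files for XTTSv2 as listed above; other engines differ
# (see ./lib/models.py in the repository).
XTTSV2_REQUIRED = {"config.json", "model.pth", "vocab.json", "ref.wav"}


def missing_model_files(zip_path, required=XTTSV2_REQUIRED):
    """Return the set of required file names not found in the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        present = {
            PurePosixPath(name).name
            for name in zf.namelist()
            if not name.endswith("/")  # skip directory entries
        }
    return set(required) - present
```

An empty result means every required file name was found; otherwise the returned set names what is missing.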

For a detailed guide with the list of all parameters

  • Linux/MacOS: ./ebook2audiobook.command --help
  • Windows: ebook2audiobook.cmd --help
  • Or, for all OS: python app.py --help

```bash
usage: app.py [-h] [--session SESSION] [--share] [--headless] [--ebook EBOOK]
              [--ebooks_dir EBOOKS_DIR] [--language LANGUAGE] [--voice VOICE]
              [--voice_map VOICE_MAP] [--device {CPU,CUDA,MPS,ROCM,XPU,JETSON}]
              [--tts_engine {XTTS,BARK,VITS,FAIRSEQ,TACOTRON,YOURTTS,xtts,bark,vits,fairseq,tacotron,yourtts}]
              [--custom_model CUSTOM_MODEL] [--fine_tuned FINE_TUNED]
              [--output_format OUTPUT_FORMAT] [--output_channel OUTPUT_CHANNEL]
              [--temperature TEMPERATURE] [--length_penalty LENGTH_PENALTY]
              [--num_beams NUM_BEAMS] [--repetition_penalty REPETITION_PENALTY]
              [--top_k TOP_K] [--top_p TOP_P] [--speed SPEED]
              [--enable_text_splitting] [--text_temp TEXT_TEMP]
              [--waveform_temp WAVEFORM_TEMP] [--output_dir OUTPUT_DIR] [--version]

Convert eBooks to Audiobooks using a Text-to-Speech model. You can either launch
the Gradio interface or run the script in headless mode for direct conversion.

options:
  -h, --help            show this help message and exit
  --session SESSION     Session to resume the conversion in case of interruption,
                        crash, or reuse of custom models and custom cloning voices.

**** The following option is for gradio/gui mode only:
  --share               (Optional) Enable a public shareable Gradio link.

**** The following options are for --headless mode only:
  --headless            Run the script in headless mode
  --ebook EBOOK         Path to the ebook file for conversion. Cannot be used
                        when --ebooks_dir is present.
  --ebooks_dir EBOOKS_DIR
                        Relative or absolute path of the directory containing the
                        files to convert. Cannot be used when --ebook is present.
  --text TEXT           Raw text for conversion. Cannot be used when --ebook or
                        --ebooks_dir is present.
  --language LANGUAGE   Language of the e-book. Default language is set in
                        ./lib/lang.py and used as default if not present. All
                        compatible language codes are in ./lib/lang.py

optional parameters:
  --translate ISO3      (Optional) Translate ebook to a target language (ISO 639-3
                        code, e.g. eng, fra, deu) before TTS synthesis. Uses
                        argostranslate. The target language becomes the effective
                        TTS language for the run. A copy of the source ebook is
                        made with the _ suffix so translated and non-translated
                        outputs stay isolated (independent process folder, audio
                        chunks, and final file).
  --voice VOICE         (Optional) Path to the voice cloning file for TTS engine.
                        Uses the default voice if not present.
  --voice_map VOICE_MAP
                        (Optional, --ebooks_dir only) Path to a JSON file mapping
                        ebook path -> voice path. Each entry overrides --voice for
                        that specific ebook. Missing/null entries fall back to
                        --voice. Keys may be absolute paths or basenames. Example:
```
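
The --voice_map rules described in the help text (per-ebook override, keys given as absolute paths or basenames, missing or null entries falling back to --voice) could be sketched in Python as below. This is not taken from the project: `resolve_voice` is a hypothetical helper, the JSON content is invented for illustration, and the lookup order (absolute path first, then basename) is an assumption.

```python
import json
import os
from typing import Dict, Optional


def resolve_voice(
    ebook_path: str,
    voice_map: Dict[str, Optional[str]],
    default_voice: Optional[str],
) -> Optional[str]:
    """Return the voice for one ebook, falling back to the --voice value."""
    for key in (os.path.abspath(ebook_path), os.path.basename(ebook_path)):
        voice = voice_map.get(key)
        if voice:
            return voice
    # Missing or null entry: use the global --voice (which may be None).
    return default_voice


# Invented example mapping, as it might be loaded from a JSON file.
voice_map = json.loads('{"a.epub": "voices/a.wav", "b.epub": null}')
voice_for_a = resolve_voice("books/a.epub", voice_map, "default.wav")
voice_for_b = resolve_voice("books/b.epub", voice_map, "default.wav")
```

Here `a.epub` gets its own voice, while `b.epub` (null entry) and any unlisted ebook use the default.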

Core symbols most depended-on inside this repo

join · called by 579 · ext/py/num2words/num2words/lang_ID.py
update · called by 356 · ext/py/demucs/demucs/ema.py
get_session · called by 82 · lib/core.py
read · called by 72 · ext/py/demucs/demucs/audio.py
write · called by 63 · lib/classes/std_filter.py
update · called by 60 · components/E2A-SML/booknlp/english/gender_inference_model_1.py
show_alert · called by 56 · lib/core.py
load · called by 55 · components/Universal_TTS_Finetune/piper/piper/voice.py

Shape

Method 1,027
Function 643
Class 217
Route 1

Languages

Python 100%

Modules by API surface

lib/core.py · 109 symbols
lib/gradio.py · 72 symbols
components/Universal_TTS_Finetune/utils/pipeline.py · 52 symbols
lib/classes/device_installer.py · 37 symbols
ext/py/demucs/demucs/transformer.py · 37 symbols
components/Universal_TTS_Finetune/web_gui.py · 35 symbols
components/Universal_TTS_Finetune/piper/piper_train/vits/modules.py · 34 symbols
components/Universal_TTS_Finetune/piper/piper_train/vits/models.py · 33 symbols
components/Universal_TTS_Finetune/utils/tokenizer.py · 30 symbols
ext/py/demucs/demucs/repo.py · 28 symbols
ext/py/num2words/num2words/base.py · 27 symbols
lib/classes/tts_engines/common/utils.py · 26 symbols

Dependencies from manifests, versioned

Cython 0.29.0 · 1×
accelerate 0.33.0 · 1×
argostranslate 1.11.0 · 1×
audiocraft 1.3.0 · 1×
av 11.0.0 · 1×
blis 0.7.11 · 1×
catalogue 2.0.10 · 1×
cutlet
faster_whisper 1.0.2 · 1×
ffmpeg-python 0.2.0 · 1×
gradio 5.49.1 · 1×
huggingface-hub 0.25.2 · 1×

For agents

$ claude mcp add ebook2audiobook \
  -- python -m otcore.mcp_server <graph>