
github.com/YaoFANGUK/video-subtitle-extractor @2.2.0 sqlite

486 symbols 1,590 edges 40 files 144 documented · 30%
README


Introduction


Video-subtitle-extractor (VSE) is a free, open-source tool that extracts hard-coded subtitles from videos and automatically generates a corresponding srt file for each video. It provides the following:

  • Detect and extract subtitle frames (using traditional graphics methods)
  • Detect subtitle areas (i.e., coordinates), and optionally scene text (using deep learning algorithms)
  • Convert graphic text into plain text (using deep learning algorithms)
  • Filter non-subtitle text (e.g., logos and watermarks)
  • Remove watermarks, logo text, and original hard-coded subtitles from video; see video-subtitle-remover (VSR).
  • Remove duplicated subtitle lines and generate an srt file (by calculating text similarity)
  • Batch extraction. You can select multiple video files at once, and the tool generates subtitles for each video.
  • Multiple language support. You can extract subtitles in 87 languages, such as Simplified Chinese, English, Japanese, Korean, Arabic, Traditional Chinese, French, German, Russian, Spanish, Portuguese, and Italian.
  • Multi-mode:
    • fast: (Recommended) Uses a lightweight model for quick subtitle extraction, though it might miss a small number of subtitles and produce a few typos.
    • auto: (Recommended) Automatically selects the model. It uses the lightweight model on the CPU and the precise model on the GPU. Extraction is slower and might miss a small number of subtitles, but there are almost no typos.
    • accurate: (Not Recommended) Uses the precise model with frame-by-frame detection on the GPU. No subtitles are missed and typos are almost non-existent, but it is very slow.
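
The duplicate-removal and srt-generation steps listed above can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the `(start_ms, end_ms, text)` tuple layout, and the 0.8 threshold are assumptions made for illustration, and the standard library's `difflib` stands in for whatever similarity measure the project actually uses.

```python
from difflib import SequenceMatcher


def format_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def merge_similar(lines, threshold=0.8):
    """Merge consecutive (start_ms, end_ms, text) entries whose text is near-identical.

    The threshold is an illustrative assumption, not a value taken from the project.
    """
    merged = []
    for start, end, text in lines:
        if merged and SequenceMatcher(None, merged[-1][2], text).ratio() >= threshold:
            # Same subtitle seen again: keep the first text, extend the end time.
            prev_start, _, prev_text = merged[-1]
            merged[-1] = (prev_start, end, prev_text)
        else:
            merged.append((start, end, text))
    return merged


def to_srt(lines) -> str:
    """Render (start_ms, end_ms, text) entries as SRT blocks separated by blank lines."""
    blocks = [
        f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        for index, (start, end, text) in enumerate(lines, 1)
    ]
    return "\n".join(blocks)


# Hypothetical per-frame OCR results; the third entry contains an OCR typo.
detections = [
    (0, 1000, "Hello world"),
    (1000, 2000, "Hello world"),
    (2000, 3000, "Hello wor1d"),
    (3000, 4000, "Goodbye"),
]
merged = merge_similar(detections)
print(to_srt(merged))
```

Keeping the first text of each run is a simplification; a real pipeline might instead keep the most frequent or highest-confidence reading.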

demo.png

Features

  • You don't need to do any preprocessing (e.g., binarization) or worry about details such as subtitle font and size.
  • This is an offline project. It makes no online API calls, and you don't need an Internet connection to get results.

Usage

  • After clicking "Open", select video file(s), adjust the subtitle area, and then click "Run".
  • Single file extraction: When opening a file, choose a single video.
  • Batch extraction: When opening files, choose multiple videos, and make sure that every video has the same resolution and subtitle area.

  • Remove watermark text/replace specific text:

    To delete or replace specific text in the generated .srt file, edit the backend/configs/typoMap.json file and add the text you want to replace or remove.

{
    "l'm": "I'm",
    "l just": "I just",
    "Let'sqo": "Let's go",
    "Iife": "life",
    "威筋": "threat",
    "性感荷官在线发牌": ""
}

In this way, you can replace all occurrences of "威筋" in the text with "threat" and delete all instances of the text "性感荷官在线发牌".
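
As a rough illustration of how such a map could be applied, here is a minimal sketch. The function name `apply_typo_map` and the longest-key-first ordering are assumptions for this example, not the project's actual code; inline JSON stands in for backend/configs/typoMap.json.

```python
import json

# Inline JSON standing in for backend/configs/typoMap.json (subset of the entries above).
TYPO_MAP_JSON = """
{
    "l'm": "I'm",
    "Iife": "life",
    "性感荷官在线发牌": ""
}
"""


def apply_typo_map(text: str, typo_map: dict) -> str:
    """Replace each key found in text with its value; an empty value deletes the match."""
    # Longest keys first, so a short key cannot break up a longer one (an assumed policy).
    for wrong in sorted(typo_map, key=len, reverse=True):
        text = text.replace(wrong, typo_map[wrong])
    return text


typo_map = json.loads(TYPO_MAP_JSON)
cleaned = apply_typo_map("l'm enjoying Iife性感荷官在线发牌", typo_map)
print(cleaned)
```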

  • Alternatively, download the pre-built package, unzip it, and run it directly. If it does not run, follow the tutorial below to set up the environment and run the program from source code.

Download

Share your suggestions for improving this project in ISSUES & DISCUSSION.

Pre-built Package Comparison:

| Pre-built Package Name | Python | Paddle | Environment | Supported Compute Capability Range |
|---|---|---|---|---|
| vse-windows-cpu.7z | 3.12 | 3.0.0 | No GPU, CPU only | Universal |
| vse-windows-directml.7z | 3.12 | 3.0.0 | Windows without Nvidia GPU | Universal |
| vse-windows-nvidia-cuda-10.2.7z | 3.11 | 2.5.2 | CUDA 10.2 | 3.0 – 7.5 |
| vse-windows-nvidia-cuda-11.8.7z | 3.12 | 3.0.0 | CUDA 11.8 | 3.5 – 8.9 |
| vse-windows-nvidia-cuda-12.6.7z | 3.12 | 3.0.0 | CUDA 12.6 | 5.0 – 9.0 |

NVIDIA publishes the compute capability of each GPU model. See the CUDA GPUs page to check which CUDA version is compatible with your GPU.

NVIDIA 50 series graphics cards require CUDA 12.8.0 or above, but Paddle 3.0.0 does not support it yet, so it is recommended to use the DirectML universal version.
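
The package comparison can be read as a simple lookup. The sketch below encodes the compute capability ranges from the comparison above; the function name and the fallback order (DirectML first, then CPU) are assumptions for illustration, not project tooling.

```python
# (package name, minimum capability, maximum capability), taken from the comparison above.
CUDA_PACKAGES = [
    ("vse-windows-nvidia-cuda-10.2.7z", (3, 0), (7, 5)),
    ("vse-windows-nvidia-cuda-11.8.7z", (3, 5), (8, 9)),
    ("vse-windows-nvidia-cuda-12.6.7z", (5, 0), (9, 0)),
]

# Universal packages used when no CUDA package covers the GPU (assumed fallback order).
FALLBACK_PACKAGES = ["vse-windows-directml.7z", "vse-windows-cpu.7z"]


def compatible_packages(compute_capability):
    """Return the packages whose range includes the (major, minor) compute capability."""
    matches = [
        name
        for name, low, high in CUDA_PACKAGES
        if low <= compute_capability <= high
    ]
    return matches or FALLBACK_PACKAGES


# Compute capability 8.6 (for example, an RTX 3060) is covered by two CUDA packages.
print(compatible_packages((8, 6)))
# A capability above 9.0 falls outside every listed range, so the universal packages apply.
print(compatible_packages((12, 0)))
```

Representing capabilities as `(major, minor)` tuples avoids floating-point comparisons such as `8.9 <= 8.9` going wrong after arithmetic.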

Recognition Mode Selection:

| Mode Name | GPU | OCR Model Size | Subtitle Detection Engine | Notes |
|---|---|---|---|---|
| Fast | Yes/No | Mini | VideoSubFinder | |
| Auto | Yes | Large | VideoSubFinder | Recommended |
| Auto | No | Mini | VideoSubFinder | Recommended |
| Precise | Yes/No | Large | VSE | Very slow |

The subtitle detection engine for both Windows/Linux environments is VideoSubFinder.
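
The mode selection rules can also be written as a small function. This is only a sketch of the rules stated above; `select_pipeline` and its return keys are hypothetical names, not part of the project's API.

```python
def select_pipeline(mode: str, has_gpu: bool) -> dict:
    """Map a recognition mode and GPU availability to OCR model size and detection engine."""
    mode = mode.lower()
    if mode == "fast":
        return {"ocr_model": "Mini", "detector": "VideoSubFinder"}
    if mode == "auto":
        # Auto picks the large model only when a GPU is available.
        return {"ocr_model": "Large" if has_gpu else "Mini", "detector": "VideoSubFinder"}
    if mode == "precise":
        return {"ocr_model": "Large", "detector": "VSE"}
    raise ValueError(f"unknown mode: {mode!r}")


print(select_pipeline("auto", has_gpu=False))
```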

Demo

  • Graphic User Interface (GUI):

demo.gif

  • Command Line Interface (CLI):

Demo Video

Running Online

  • Google Colab Notebook with free GPU: Open In Colab

Note: only the CLI version can run on Google Colab.

Getting Started with Source Code

1. Install Python

Please ensure that you have installed Python 3.12+.

  • Windows users can download and install Python from the official Python website.
  • macOS users can install it using Homebrew:
brew install python@3.12
  • Linux users can install it via the package manager, for example on Ubuntu/Debian:
sudo apt update && sudo apt install python3.12 python3.12-venv python3.12-dev

2. Install Dependencies

It is recommended to use a virtual environment to manage project dependencies to avoid conflicts with the system environment.

(1) Create and activate the virtual environment:

python -m venv videoEnv
  • Windows:
videoEnv\Scripts\activate
  • MacOS/Linux:
source videoEnv/bin/activate
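
Before installing anything, it can help to confirm that the interpreter meets the Python 3.12+ requirement and that the virtual environment is actually active. The following is a minimal sketch using only the standard library; the helper names are made up for this example.

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 12)) -> bool:
    """Return True if the interpreter version is at least the required (major, minor)."""
    return tuple(version_info[:2]) >= minimum


def in_virtualenv() -> bool:
    """Return True when running inside a venv: sys.prefix then differs from sys.base_prefix."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print(f"Python {sys.version_info.major}.{sys.version_info.minor}, "
      f"meets 3.12+: {meets_minimum()}, virtual environment active: {in_virtualenv()}")
```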

3. Change to the Project Directory

Change to the directory where your source code is located:

cd <source_code_directory>

For example, if your source code is in the tools folder on the D drive and the folder name is video-subtitle-extractor-main, use:
cd D:/tools/video-subtitle-extractor-main

4. Install the Appropriate Runtime Environment

This project supports four runtime modes: CUDA (NVIDIA GPU acceleration), CPU (no GPU), DirectML (AMD, Intel, and other GPUs/APUs), and ONNX.

(1) CUDA (For NVIDIA GPU users)

Make sure your NVIDIA GPU driver supports the selected CUDA version.

  • Recommended: CUDA 11.8 with cuDNN 8.6.0.

  • Install CUDA:

    • Windows: Download CUDA 11.8
    • Linux:
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
sudo sh cuda_11.8.0_520.61.05_linux.run
    • CUDA is not supported on macOS.

  • Install cuDNN (CUDA 11.8 corresponds to cuDNN 8.6.0):

    • Windows: cuDNN 8.6.0 Download
    • Linux: cuDNN 8.6.0 Download
    • Follow the installation guide in the official NVIDIA documentation.

  • Install the PaddlePaddle GPU version (CUDA 11.8):
pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
pip install -r requirements.txt

(2) DirectML (For AMD, Intel, and other GPU/APU users)
  • Suitable for Windows devices with AMD/NVIDIA/Intel GPUs.
  • Install ONNX Runtime DirectML version:
pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
pip install -r requirements.txt
pip install -r requirements_directml.txt

(3) ONNX (Suitable for macOS, AMD ROCm, and other hardware-accelerated environments; the basic setup is the same as DirectML, not tested!)

  • If using this method, DO NOT REPORT ISSUES.
  • Suitable for Linux or macOS devices with AMD/Metal GPUs/Apple Silicon GPUs.
  • Install ONNX Runtime with an execution provider suited to your device:

```shell
pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
pip install -r requirements.txt

# Read the documentation: https://onnxruntime.ai/docs/execution-providers/
# Choose the appropriate execution backend for your device and modify the dependencies in requirements_directml.txt accordingly.

# Example:
# requirements_coreml.txt
# paddle2onnx==1.3.1
# onnxruntime-coreml==1.13.1

pip install -r requirements_coreml.txt
```

(4) CPU Only (For systems without GPU or those not wanting to use GPU acceleration)
  • Suitable for systems without GPU or those that do not wish to use GPU.
  • Install the CPU version of PaddlePaddle:
pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
pip install -r requirements.txt
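
After installing one of the runtime environments, a quick check like the following can show which backends are importable. It is a sketch rather than part of the project: `detect_backends` is a hypothetical helper, and it relies on `paddle.device.is_compiled_with_cuda()` and `onnxruntime.get_available_providers()` only when those packages are installed.

```python
import importlib.util


def detect_backends() -> dict:
    """Report installed runtimes; values stay None when a package is missing or fails to import."""
    report = {"paddle": None, "paddle_cuda": None, "onnxruntime_providers": None}

    if importlib.util.find_spec("paddle") is not None:
        try:
            import paddle
            report["paddle"] = paddle.__version__
            # True only for GPU builds of PaddlePaddle.
            report["paddle_cuda"] = paddle.device.is_compiled_with_cuda()
        except Exception as exc:  # a broken install should not crash the check
            print(f"paddle is installed but failed to load: {exc}")

    if importlib.util.find_spec("onnxruntime") is not None:
        try:
            import onnxruntime
            # For example, a DirectML build lists "DmlExecutionProvider" here.
            report["onnxruntime_providers"] = onnxruntime.get_available_providers()
        except Exception as exc:
            print(f"onnxruntime is installed but failed to load: {exc}")

    return report


report = detect_backends()
print(report)
```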

5. Run the Program

  • Run the graphical user interface version (GUI):
python gui.py
  • Run the command-line interface version (CLI):
python ./backend/main.py

Q & A

1. Running Failure or Environment Problem

Solution: If you are using an NVIDIA Ampere-architecture graphics card such as the RTX 3050/3060/3070/3080, use the latest PaddlePaddle version with CUDA 11.6 and cuDNN 8.2.1. Otherwise, check which CUDA and cuDNN versions work with your GPU and install those.

2. For Windows users, if you encounter errors related to "geos_c.dll"

    _lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
  File "C:\Users\Flavi\anaconda3\envs\subEnv\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found.

Solution:

1) Uninstall Shapely

pip uninstall Shapely -y

2) Reinstall Shapely via conda (make sure you have anaconda or miniconda installed)

conda install Shapely             

3. How to generate executables

Using Nuitka version 0.6.19, copy all files from site-packages (under the Lib folder of the conda virtual environment) into the dependencies folder, comment out all subprocess-related code in image.py under the paddle library's dataset directory, and then run the following packaging command:

 python -m nuitka --standalone --mingw64 --include-data-dir=D:\vse\backend=backend --include-data-dir=D:\vse\dependencies=dependencies  --nofollow-imports --windows-icon-from-ico=D:\vse\design\vse.ico --plugin-enable=tk-inter,multiprocessing --output-dir=out .\gui.py

To build a single .exe file (running pip install zstandard first allows the file to be compressed):

 python -m nuitka --standalone --windows-disable-console --mingw64 --lto no --include-data-dir=D:\vse\backend=backend --include-data-dir=D:\vse\dependencies=dependencies  --nofollow-imports --windows-icon-from-ico=D:\vse\design\vse.ico --plugin-enable=tk-inter,multiprocessing --output-dir=out --onefile .\gui.py

Core symbols most depended-on inside this repo

format · called by 80 · backend/sushi/__main__.py
append_output · called by 29 · backend/main.py
format_time · called by 22 · backend/sushi/common.py
width · called by 20 · backend/bean/subtitle_area.py
height · called by 18 · backend/bean/subtitle_area.py
update_preview_with_rect · called by 16 · ui/component/video_display_component.py
start · called by 15 · backend/tools/python_runner.py
instance · called by 13 · backend/tools/process_manager.py

Shape

Method 354
Function 76
Class 55
Route 1

Languages

Python 100%

Modules by API surface

backend/tools/concurrent/future.py · 44 symbols
backend/sushi/demux.py · 44 symbols
backend/sushi/subs.py · 39 symbols
backend/main.py · 39 symbols
ui/component/video_display_component.py · 36 symbols
backend/sushi/__init__.py · 28 symbols
ui/component/task_list_component.py · 27 symbols
ui/home_interface.py · 25 symbols
gui.py · 14 symbols
backend/tools/subtitle_extractor_remote_call.py · 14 symbols
backend/tools/python_runner.py · 14 symbols
backend/tools/concurrent/task_manager.py · 13 symbols

Dependencies from manifests, versioned

Levenshtein 0.26.0 · 1×
imageio-ffmpeg 0.6 · 1×
je-showinfilemanager 1.1.6a4 · 1×
lmdb 1.6 · 1×
numpy 2.2 · 1×
onnxruntime-directml 1.20.1 · 1×
opencv-python 4.11 · 1×
paddle2onnx 1.3.1 · 1×
paddleocr 3.4.0 · 1×
paddlepaddle 3.3 · 1×
pillow 10.4.0 · 1×
pyclipper 1.3 · 1×

For agents

$ claude mcp add video-subtitle-extractor \
  -- python -m otcore.mcp_server <graph>
