
github.com/Lightricks/ComfyUI-LTXVideo @main sqlite

546 symbols 1,300 edges 55 files 92 documented · 17%

README

ComfyUI-LTXVideo

A collection of powerful custom nodes that extend ComfyUI's capabilities for the LTX-2 video generation model.

LTX-2 is built into ComfyUI core (see it here), making it readily accessible to all ComfyUI users. This repository hosts additional nodes and workflows to help you get the most out of LTX-2's advanced features.

To learn more about LTX-2, see the main LTX-2 repository for model details and additional resources.

Prerequisites

Before you begin using an LTX-2 workflow in ComfyUI, make sure you have:

ComfyUI installed (download it from https://www.comfy.org/download)
CUDA-compatible GPU with 32GB+ VRAM
100GB+ free disk space for models and cache
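
The disk-space prerequisite can be checked before downloading anything. The following is a minimal sketch using only the Python standard library; the function names and the choice of 10^9 bytes per GB are illustrative assumptions, not part of this repository.

```python
import shutil

# Threshold taken from the prerequisites list above.
REQUIRED_FREE_GB = 100


def free_disk_gb(path: str) -> float:
    """Return the free space, in GB (10**9 bytes), on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_disk(path: str, required_gb: float = REQUIRED_FREE_GB) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    return free_disk_gb(path) >= required_gb


if __name__ == "__main__":
    # Hypothetical location: point this at the drive that holds your ComfyUI folder.
    target = "."
    status = "OK" if has_enough_disk(target) else "below the recommended minimum"
    print(f"{free_disk_gb(target):.1f} GB free: {status}")
```

If PyTorch is installed, a similar check for VRAM could read torch.cuda.get_device_properties(0).total_memory and compare it against the 32GB recommendation.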

Quick Start 🚀

We recommend using the LTX-2 workflows available in Comfy Manager.

Open ComfyUI
Click the Manager button (or press Ctrl+M)
Select Install Custom Nodes
Search for “LTXVideo”
Click Install
Wait for installation to complete
Restart ComfyUI

The nodes will appear in your node menu under the “LTXVideo” category. Required models will be downloaded on first use.

Example Workflows

The ComfyUI-LTXVideo installation includes several example workflows. You can see them all at:

ComfyUI/custom_nodes/ComfyUI-LTXVideo/example_workflows/
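
To see which workflows shipped with your installation, you can list the folder from a script. This is a small sketch that assumes the workflows are stored as .json files (the usual ComfyUI format) and that the folder layout matches the path above; list_workflows is a hypothetical helper name.

```python
from pathlib import Path


def list_workflows(comfyui_root: str) -> list[str]:
    """Return the example workflow files, relative to the example_workflows folder."""
    workflows_dir = (
        Path(comfyui_root) / "custom_nodes" / "ComfyUI-LTXVideo" / "example_workflows"
    )
    if not workflows_dir.is_dir():
        # The node pack is not installed under this root (or the layout differs).
        return []
    # rglob also picks up workflows kept in subfolders.
    return sorted(
        p.relative_to(workflows_dir).as_posix() for p in workflows_dir.rglob("*.json")
    )


if __name__ == "__main__":
    # Hypothetical root: replace with the path to your ComfyUI folder.
    for name in list_workflows("ComfyUI"):
        print(name)
```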

LTX-2.3 Workflows:

Older Workflows (LTX-2.0):

Union IC-LoRA Model

We introduce a new Union IC-LoRA model that combines depth and edge (canny) control conditions into a single unified LoRA.

Key Features

Unified Control: A single LoRA that supports multiple control conditions (depth or edges).
Downsampled Latent Processing: The union LoRA operates on a downsampled latent size, which reduces memory usage and significantly speeds up inference while maintaining quality.

How It Works

The union LoRA is trained to understand and respond to both control signals (depth maps and edge maps) within a single model. The model learns to:

Parse multiple conditions: Identify which control signals are present in the input
Process at reduced resolution: Work on downsampled latents to improve efficiency

HDR IC-LoRA

We provide an HDR IC-LoRA that generates linear HDR video encoded in ARRI LogC3, enabling workflows that output high-dynamic-range content suitable for grading and EXR export.

Key Features

Linear HDR output: The LoRA produces frames in LogC3-compressed space; the LTXVHDRDecodePostprocess node decodes these back to linear HDR values.
SDR preview + raw HDR: The node outputs both a Reinhard-tonemapped SDR preview and the raw linear HDR tensor for downstream use.
EXR export: Optionally writes the linear HDR frames as a 16/32-bit EXR image sequence. To enable EXR writing, set OPENCV_IO_ENABLE_OPENEXR=1 in the environment before starting ComfyUI. The exported EXR sequence is best viewed in DJV (or DJV for macOS).
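
The decode and preview steps described above can be illustrated with a few lines of scalar Python. This is a sketch under stated assumptions, not the LTXVHDRDecodePostprocess implementation: it uses the published ARRI LogC3 constants for EI 800 and the basic Reinhard operator x / (1 + x), and the node may use a different exposure index, tonemapping variant, or tensor layout.

```python
import math

# ARRI LogC3 constants for EI 800 (assumption: the node may use other settings).
CUT = 0.010591
A = 5.555556
B = 0.052272
C = 0.247190
D = 0.385537
E = 5.367655
F = 0.092809


def logc3_encode(x: float) -> float:
    """Linear scene value -> LogC3 code value."""
    if x > CUT:
        return C * math.log10(A * x + B) + D
    return E * x + F


def logc3_decode(t: float) -> float:
    """LogC3 code value -> linear scene value (inverse of logc3_encode)."""
    if t > E * CUT + F:
        return (10.0 ** ((t - D) / C) - B) / A
    return (t - F) / E


def reinhard(x: float) -> float:
    """Basic Reinhard tonemap: compresses linear values in [0, inf) into [0, 1)."""
    x = max(x, 0.0)
    return x / (1.0 + x)


if __name__ == "__main__":
    code = logc3_encode(0.18)      # 18% gray in LogC3
    linear = logc3_decode(code)    # back to linear HDR
    print(code, linear, reinhard(linear))
```

The linear value is what would be written to EXR, while the Reinhard result is only a display preview.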

Lipdub IC-LoRA

We provide a Lipdub IC-LoRA that dubs or rephrases speech in video. Given a source video and a text prompt containing the desired dialogue, it generates new lip movements and audio that match the target text while preserving the speaker's identity.

Key Features

Multilingual dubbing: Translate speech into another language - the model regenerates lips and audio to match.
Same-language rephrasing: Change what the speaker says while keeping the original language.
Two-stage pipeline: Stage 1 generates the video and audio at base resolution; Stage 2 upscales while freezing the audio.
Speaker identity preservation: Reference audio tokens provide speaker context so the generated voice stays consistent.

Pixel Spatial Upscaler IC-LoRA

We provide Pixel Spatial Upscaler IC-LoRAs that creatively upscale low-resolution video by synthesizing fine detail rather than simply interpolating pixels. Given a low-resolution reference clip, the model re-renders it at 2× or 4× resolution with generative spatial detail — making it a creative upsampler, not a pixel-accurate refiner.

Key Features

2× and 4× variants: Choose the 2× upscaler for moderate upscaling or the 4× upscaler for larger resolution jumps.
Generative detail synthesis: The model synthesizes texture and structure from the reference rather than faithfully preserving every pixel.
Draft-then-upscale workflow: Generate at a low base resolution (e.g. ~280p) to lock in composition and motion, then run the upscaler for the final high-resolution output.
Tunable fidelity: LoRA strength, guidance, and step count control how closely the output follows the reference — lower values stay closer to the source; higher values allow more creative detail.

Text-to-Audio (T2A)

LTX-2 is a single joint audio/video transformer, but it can generate audio on its own. The LTXVAudioOnlyModel node puts the model into audio-only mode for text-to-audio, with no video output.

Key Features

Audio-only sampling: The node sets the model's run_vx, a2v_cross_attn and v2a_cross_attn flags off, so the audio is denoised with no dependence on the video latent and the video stream is skipped. This matches the reference single-stage T2A pipeline's video=None behavior.
Minimal dummy video latent: The model splits its input positionally into [video, audio], so the sampler still needs a video latent at index 0. Use the LTXVAudioOnlyEmptyVideoLatent node (a fixed 64x64 single-frame placeholder, no params to tweak) joined with the audio latent via LTXVConcatAVLatent; with LTXVAudioOnlyModel active it is never attended to and adds negligible cost.
Audio decode: LTXVAudioVAEDecode extracts the audio directly from the joint latent. Save the result with a standard built-in audio node (for example, Save Audio (FLAC)).

Required Models

Download the following models:

LTX-2.3 Model Checkpoint - Choose and download one of the models to the COMFYUI_ROOT_FOLDER/models/checkpoints folder.
ltx-2.3-22b-dev.safetensors
ltx-2.3-22b-distilled-1.1.safetensors

Spatial Upscaler - Required for current two-stage pipeline implementations in this repository. Download to the COMFYUI_ROOT_FOLDER/models/latent_upscale_models folder.
ltx-2.3-spatial-upscaler-x2-1.1.safetensors
ltx-2.3-spatial-upscaler-x1.5-1.0.safetensors

Temporal Upscaler - Required for current two-stage pipeline implementations in this repository. Download to the COMFYUI_ROOT_FOLDER/models/latent_upscale_models folder.
ltx-2.3-temporal-upscaler-x2-1.0.safetensors

Distilled LoRA - Required for current two-stage pipeline implementations in this repository (except DistilledPipeline and ICLoraPipeline). Download to the COMFYUI_ROOT_FOLDER/models/loras folder.
ltx-2.3-22b-distilled-lora-384-1.1.safetensors

Gemma Text Encoder - Download all files from the repository to COMFYUI_ROOT_FOLDER/models/text_encoders/gemma-3-12b-it-qat-q4_0-unquantized.
Gemma 3
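
After downloading, a short script can confirm that the files landed in the expected folders. The sketch below uses the folder and file names listed above. It assumes that one file per group is enough (the text says so only for the checkpoint) and that a non-empty Gemma folder means the encoder was downloaded; missing_model_groups is a hypothetical helper, not part of this repository.

```python
from pathlib import Path

# (label, subfolder under models/, acceptable file names)
GROUPS = [
    ("LTX-2.3 checkpoint", "checkpoints", [
        "ltx-2.3-22b-dev.safetensors",
        "ltx-2.3-22b-distilled-1.1.safetensors",
    ]),
    ("Spatial upscaler", "latent_upscale_models", [
        "ltx-2.3-spatial-upscaler-x2-1.1.safetensors",
        "ltx-2.3-spatial-upscaler-x1.5-1.0.safetensors",
    ]),
    ("Temporal upscaler", "latent_upscale_models", [
        "ltx-2.3-temporal-upscaler-x2-1.0.safetensors",
    ]),
    ("Distilled LoRA", "loras", [
        "ltx-2.3-22b-distilled-lora-384-1.1.safetensors",
    ]),
]
GEMMA_DIR = Path("text_encoders") / "gemma-3-12b-it-qat-q4_0-unquantized"


def missing_model_groups(comfyui_root: str) -> list[str]:
    """Return the labels of model groups with no matching file on disk."""
    models = Path(comfyui_root) / "models"
    missing = [
        label
        for label, subfolder, names in GROUPS
        # Assumption: any one of the listed files satisfies the group.
        if not any((models / subfolder / name).is_file() for name in names)
    ]
    gemma = models / GEMMA_DIR
    if not (gemma.is_dir() and any(gemma.iterdir())):
        missing.append("Gemma text encoder")
    return missing


if __name__ == "__main__":
    # Hypothetical root: replace with your COMFYUI_ROOT_FOLDER.
    for label in missing_model_groups("ComfyUI"):
        print(f"missing: {label}")
```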

Core symbols most depended-on inside this repo

get

called by 74

tricks/utils/attn_bank.py

web/js/sparse_track_editor.js

append_guide_attention_entry

called by 8

iclora_attention.py

Shape

Method 295

Function 134

Class 117

Languages

Python 93%

TypeScript 7%

Modules by API surface

stg.py: 48 symbols

gemma_encoder.py: 35 symbols

latents.py: 28 symbols

web/js/sparse_track_editor.js: 24 symbols

text_embeddings_connectors.py: 21 symbols

latent_norm.py: 21 symbols

tricks/nodes/rf_edit_sampler_nodes.py: 20 symbols

looping_sampler.py: 19 symbols

tricks/nodes/ltx_inverse_model_pred_nodes.py: 18 symbols

easy_samplers.py: 18 symbols

prompt_enhancer_nodes.py: 14 symbols

iclora.py: 14 symbols

Dependencies from manifests, versioned

huggingface_hub 0.25.2 · 1×

ninja 1.11.1.4 · 1×

For agents

$ claude mcp add ComfyUI-LTXVideo \
  -- python -m otcore.mcp_server <graph>
