github.com/PKU-YuanGroup/Helios @main sqlite

971 symbols 3,542 edges 122 files 208 documented · 21%

README

Helios: Real Real-Time Long Video Generation Model

⭐ 14B Real-Time Long Video Generation Model can be Cheaper, Faster but Keep Stronger than 1.3B ones ⭐

[![arXiv](https://img.shields.io/badge/arXiv-2603.04379-b31b1b.svg?logo=arxiv)](https://arxiv.org/abs/2603.04379) [![hf_paper](https://img.shields.io/badge/🤗-Paper%20In%20HF-red.svg)](https://huggingface.co/papers/2603.04379) [![Project Page](https://img.shields.io/badge/Project-Website-2ea44f)](https://pku-yuangroup.github.io/Helios-Page) [![hf_space](https://img.shields.io/badge/🤗-Gradio-00b4d8.svg)](https://huggingface.co/spaces/BestWishYsh/Helios-14B-RealTime) [![HuggingFace](https://img.shields.io/badge/🤗-HuggingFace-blue)](https://huggingface.co/collections/BestWishYsh/helios) [![ModelScope](https://img.shields.io/badge/🤖-ModelScope-purple)](https://modelscope.cn/collections/BestWishYSH/Helios) [![GitHub](https://img.shields.io/badge/GitHub-black?logo=github)](https://github.com/PKU-YuanGroup/Helios) [![GitCode](https://img.shields.io/badge/GitCodes-blue?logo=gitcode)](https://gitcode.com/weixin_47617277/Helios) [![Ascend](https://img.shields.io/badge/Inference-Ascend--NPU-red)](https://www.hiascend.com/) [![Diffusers](https://img.shields.io/badge/Inference-Diffusers-blueviolet)](https://github.com/huggingface/diffusers/pull/13208) [![SGLang Diffusion](https://img.shields.io/badge/Backend-SGLang--Diffusion-yellow)](https://github.com/sgl-project/sglang/pull/19782) [![vLLM-Omni](https://img.shields.io/badge/Backend-vLLM--Omni-orange)](https://github.com/vllm-project/vllm-omni/pull/1604)

This repository is the official implementation of Helios, a breakthrough video generation model that achieves minute-scale, high-quality video synthesis at 19.5 FPS on a single H100 GPU (about 10 FPS on a single Ascend NPU), without relying on conventional long-video anti-drifting strategies or standard video acceleration techniques.

✨ Highlights

Without commonly used anti-drifting strategies (e.g., self-forcing, error-banks, keyframe sampling, or inverted sampling), Helios generates minute-scale videos with high quality and strong coherence.
Without standard acceleration techniques (e.g., KV-cache, causal masking, sparse/linear attention, TinyVAE, progressive noise schedules, hidden-state caching, or quantization), Helios achieves 19.5 FPS in end-to-end inference on a single H100 GPU.
We introduce optimizations that improve both training and inference throughput while reducing memory consumption, enabling image-diffusion-scale batch sizes during training while fitting up to four 14B models within 80 GB of GPU memory.
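To put the 19.5 FPS figure in context, the sketch below compares generation throughput with the two playback rates used in the frame-count table of the Inference section (24 FPS and 16 FPS). It is plain arithmetic that assumes steady throughput for the whole clip; the function names are hypothetical and not part of the Helios codebase.

```python
def real_time_factor(generation_fps: float, playback_fps: float) -> float:
    """Ratio of generated frames per second to played frames per second.

    A value >= 1.0 means frames are produced at least as fast as they are shown.
    """
    return generation_fps / playback_fps


def generation_seconds(num_frames: int, generation_fps: float) -> float:
    """Wall-clock time to produce num_frames at a constant generation rate."""
    return num_frames / generation_fps


if __name__ == "__main__":
    h100_fps = 19.5  # end-to-end figure quoted in the highlights
    for playback_fps in (24, 16):
        factor = real_time_factor(h100_fps, playback_fps)
        print(f"playback {playback_fps} FPS: real-time factor {factor:.2f}")
    # 1452 frames is the adjusted length of a roughly one-minute clip at 24 FPS.
    print(f"1452 frames take about {generation_seconds(1452, h100_fps):.1f} s")
```

Actual throughput also depends on the CPU, memory, and driver version, so measured numbers may differ.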

🎬 Video Demos

You can also click here to get the video. Some of the best prompts are here.

📣 Latest News!!

[2026.03.26] 🔥 Added a summary of FAQs, tips, and tutorials: https://github.com/PKU-YuanGroup/Helios/issues/47.
[2026.03.24] 👋 A community-made, unofficial YouTube tutorial for Helios is available here. It covers installation on a consumer-grade PC and supports 4K video generation.
[2026.03.20] 🚀 Helios now supports Ahead-of-Time Compilation (AOTI) on Spaces, with special thanks to the HuggingFace Team! Please refer to this Space for a usage example.
[2026.03.20] 🔧 Based on issue #38, we've identified several ways to further improve Helios's performance, such as fixing the i2v train-inference inconsistency and fully enabling Easy Anti-Drifting. Please refer to commits and correct.yaml for details.
[2026.03.12] ⚡️ Please note that real-time generation performance depends not only on the GPU, but also on the CPU, memory, CUDA driver version, etc. As tested by a user on better hardware with a single H100, Helios can reach up to 20.89 FPS!
[2026.03.08] 🚀 Helios now fully supports Group Offloading and Context Parallelism! These features significantly reduce VRAM usage (to only ~6 GB) and enable inference across multiple GPUs with Ulysses Attention, Ring Attention, Unified Attention, and Ulysses Anything Attention.
[2026.03.06] 👋 Cache-DiT now supports Helios, offering full cache acceleration and parallelism support! Special thanks to the Cache-DiT Team for their amazing work.
[2026.03.06] 🔧 We fixed the parallel inference logic for Helios and provide an example here.
[2026.03.06] 🚀 We have officially released the Gradio Demo; you are welcome to try it.
[2026.03.05] 🔥 We are excited to announce the release of the Helios technical report on arXiv. We welcome discussions and feedback!
[2026.03.04] 👋 Day-0 support for Ascend-NPU, with sincere gratitude to the Ascend Team for their support.
[2026.03.04] 👋 Day-0 support for Diffusers, with special thanks to the HuggingFace Team for their support.
[2026.03.04] 👋 Day-0 support for SGLang-Diffusion, with huge thanks to the SGLang Team for their support.
[2026.03.04] 👋 Day-0 support for vLLM-Omni, with heartfelt gratitude to the vLLM Team for their support.
[2026.03.04] 🔥 We've released the training/inference code and weights of Helios-Base, Helios-Mid and Helios-Distilled.

🔥 Friendly Links

If your work has improved Helios and you would like more people to see it, please inform us.

Ascend-NPU: Developed by Huawei, this hardware is designed for efficient AI model training and inference, boosting performance in tasks like computer vision, natural language processing, and autonomous driving.
Diffusers: A popular library designed for working with diffusion models and other generative models in deep learning. It supports easy integration and manipulation of a wide range of generative models.
SGLang-Diffusion: An inference framework for accelerated image and video generation using diffusion models. It provides an end-to-end unified pipeline with optimized kernels and an efficient scheduler loop.
vLLM-Omni: A fully disaggregated serving system for any-to-any models. vLLM-Omni breaks complex architectures into a stage-based graph, using a decoupled backend to maximize resource efficiency and throughput.
Cache-DiT: A PyTorch-native and flexible inference engine with hybrid cache acceleration and parallelism for DiTs. It is built on top of the Diffusers library and now supports nearly all DiTs from Diffusers.

⚙️ Requirements and Installation

Video Tutorial

If you prefer a step-by-step walkthrough, check out this community-made YouTube Tutorial. It covers local installation, 4K video generation, and how to run Helios on a consumer-grade PC, along with other practical usage tips.

Prepare Environment

# 0. Clone the repo
git clone --depth=1 https://github.com/PKU-YuanGroup/Helios.git
cd Helios

# 1. Create conda environment
conda create -n helios python=3.11.2
conda activate helios

# 2. Install PyTorch (run only the ONE command below that matches your CUDA version)
# CUDA 12.6
pip install torch==2.10.0 torchvision==0.25.0 torchaudio==2.10.0 --index-url https://download.pytorch.org/whl/cu126
# CUDA 12.8
pip install torch==2.10.0 torchvision==0.25.0 torchaudio==2.10.0 --index-url https://download.pytorch.org/whl/cu128
# CUDA 13.0
pip install torch==2.10.0 torchvision==0.25.0 torchaudio==2.10.0 --index-url https://download.pytorch.org/whl/cu130

# 3. Install dependencies
bash install.sh
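After installation, it can help to confirm that the pinned PyTorch packages are the ones that actually ended up in the environment. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and the expected versions are simply those from the install commands above.

```python
from importlib import metadata

# Versions pinned in the install commands above.
EXPECTED = {"torch": "2.10.0", "torchvision": "0.25.0", "torchaudio": "2.10.0"}


def installed_version(package: str):
    """Return the installed version without a local suffix such as '+cu126'.

    Returns None when the package is not installed.
    """
    try:
        return metadata.version(package).split("+")[0]
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected=EXPECTED):
    """Map each package name to a tuple (expected, installed, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        report[package] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, found, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{package}: expected {wanted}, found {found} [{status}]")
```

A mismatch here usually means a later dependency replaced the pinned wheel, in which case rerunning the matching pip command from step 2 is a reasonable first thing to try.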

Model Download

Models	Download Link	Supports	Notes
Helios-Base	🤗 Huggingface 🤖 ModelScope	T2V ✅ I2V ✅ V2V ✅ Interactive ✅	Best Quality, with v-prediction, standard CFG and custom HeliosScheduler.
Helios-Mid	🤗 Huggingface 🤖 ModelScope	T2V ✅ I2V ✅ V2V ✅ Interactive ✅	Intermediate Ckpt, with v-prediction, CFG-Zero* and custom HeliosScheduler.
Helios-Distilled	🤗 Huggingface 🤖 ModelScope	T2V ✅ I2V ✅ V2V ✅ Interactive ✅	Best Efficiency, with x0-prediction and custom HeliosDMDScheduler.

💡 Note:
All three models share the same architecture, but Helios-Mid and Helios-Distilled use a more aggressive multi-scale sampling pipeline to achieve better efficiency.
Helios-Mid is an intermediate checkpoint produced while distilling Helios-Base into Helios-Distilled, and may not meet expected quality.
Because training is based on Text-to-Video, Image-to-Video and Video-to-Video may be slightly inferior to Text-to-Video. If the first few chunks are static, you may enable is_skip_first_chunk or increase the values of image_noise_sigma_min, image_noise_sigma_max, video_noise_sigma_min, and video_noise_sigma_max.

Download models using huggingface-cli:

pip install "huggingface_hub[cli]"
huggingface-cli download BestWishYSH/Helios-Base --local-dir BestWishYSH/Helios-Base
huggingface-cli download BestWishYSH/Helios-Mid --local-dir BestWishYSH/Helios-Mid
huggingface-cli download BestWishYSH/Helios-Distilled --local-dir BestWishYSH/Helios-Distilled
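The same downloads can also be scripted from Python. The sketch below uses `snapshot_download` from the `huggingface_hub` package installed above and mirrors the `--local-dir` layout of the CLI commands; the helper names `plan_downloads` and `download_all` are hypothetical.

```python
# Repository IDs taken from the CLI commands above.
REPOS = (
    "BestWishYSH/Helios-Base",
    "BestWishYSH/Helios-Mid",
    "BestWishYSH/Helios-Distilled",
)


def plan_downloads(repos=REPOS):
    """Return (repo_id, local_dir) pairs; local_dir mirrors the repo ID."""
    return [(repo_id, repo_id) for repo_id in repos]


def download_all(repos=REPOS):
    """Download every repository and return the local snapshot paths."""
    # Imported lazily so the planning helper works without the package.
    from huggingface_hub import snapshot_download

    paths = []
    for repo_id, local_dir in plan_downloads(repos):
        paths.append(snapshot_download(repo_id=repo_id, local_dir=local_dir))
    return paths
```

Calling `download_all()` fetches all three checkpoints; pass a shorter tuple, for example `download_all(("BestWishYSH/Helios-Distilled",))`, to fetch only one.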

Download models using modelscope-cli:

pip install modelscope
modelscope download BestWishYSH/Helios-Base --local_dir BestWishYSH/Helios-Base
modelscope download BestWishYSH/Helios-Mid --local_dir BestWishYSH/Helios-Mid
modelscope download BestWishYSH/Helios-Distilled --local_dir BestWishYSH/Helios-Distilled

🚀 Inference

Helios uses an autoregressive approach that generates 33 frames per chunk. For optimal performance, num_frames should be set to a multiple of 33. If a non-multiple value is provided, it will be automatically rounded up to the nearest multiple of 33.

Example frame counts for different video lengths:

num_frames	Adjusted Frames	24 FPS	16 FPS
1449	1452 (33×44)	~60s (1min)	~90s (1min 30s)
720	726 (33×22)	~30s	~45s
240	264 (33×8)	~11s	~16s
129	132 (33×4)	~5.5s	~8s
81	99 (33×3)	~4s	~6s
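The rounding rule and the durations in the table can be reproduced with a few lines of arithmetic. This is a minimal sketch of the behavior described above, not the repository's own implementation; the function names are hypothetical.

```python
CHUNK_FRAMES = 33  # frames generated per autoregressive chunk


def adjust_num_frames(num_frames: int, chunk: int = CHUNK_FRAMES) -> int:
    """Round num_frames up to the nearest multiple of chunk."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    # Ceiling division using integers only, to avoid float rounding.
    return -(-num_frames // chunk) * chunk


def duration_seconds(num_frames: int, fps: int) -> float:
    """Playback length of the adjusted clip at the given frame rate."""
    return adjust_num_frames(num_frames) / fps


if __name__ == "__main__":
    for requested in (1449, 720, 240, 129, 81):
        adjusted = adjust_num_frames(requested)
        chunks = adjusted // CHUNK_FRAMES
        print(
            f"{requested} -> {adjusted} (33x{chunks}), "
            f"{adjusted / 24:.1f}s at 24 FPS, {adjusted / 16:.1f}s at 16 FPS"
        )
```

Requesting a multiple of 33 directly (for example 1452 instead of 1449) avoids any surprise in the final clip length.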

Run the model

We provide inference scripts for all models covering text-to-video, image-to-video, and video-to-video in this directory.

cd scripts/inference

# For Helios-Base
bash helios-base_t2v.sh
bash helios-base_i2v.sh
bash helios-base_v2v.sh

# For Helios-Mid
bash helios-mid_t2v.sh
bash helios-mid_i2v.sh
bash helios-mid_v2v.sh

# For Helios-Distilled
bash helios-distilled_t2v.sh
bash helios-distilled_i2v.sh
bash helios-distilled_v2v.sh

# For Interactive
# ⚠️ This feature is still under development; results may not always meet expectations
# (path is relative to the repository root)
cd scripts/inference/experiment_interactive
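The script names above follow a regular `helios-<model>_<task>.sh` pattern, so a small Python wrapper can select and launch one of them. This is a sketch under that naming assumption; `script_name` and `run_inference` are hypothetical helpers, not part of the repository.

```python
import subprocess
from pathlib import Path

MODELS = ("base", "mid", "distilled")
TASKS = ("t2v", "i2v", "v2v")


def script_name(model: str, task: str) -> str:
    """Build the script file name, e.g. 'helios-base_t2v.sh'."""
    if model not in MODELS:
        raise ValueError(f"unknown model {model!r}, expected one of {MODELS}")
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}, expected one of {TASKS}")
    return f"helios-{model}_{task}.sh"


def run_inference(model: str, task: str, scripts_dir: str = "scripts/inference"):
    """Run the matching script with bash from inside scripts_dir."""
    script = Path(scripts_dir) / script_name(model, task)
    if not script.is_file():
        raise FileNotFoundError(script)
    # cwd mirrors the `cd scripts/inference` step shown above.
    return subprocess.run(["bash", script.name], cwd=scripts_dir, check=True)
```

For example, `run_inference("distilled", "t2v")` called from the repository root corresponds to `bash helios-distilled_t2v.sh` above.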

Sanity Check

Before trying your own inputs, we highly recommend running the sanity check to confirm that your hardware and software are set up correctly.

Task	Helios-Base	Helios-Mid	Helios-Distilled

Core symbols most depended-on inside this repo

Symbol	Called by	File
(name not captured)	447	helios/utils/create_ema_zero3.py
update	228	eval/utils/third_party/amt/utils/utils.py
from_pretrained	49	helios/utils/create_ema_zero3.py
load_state_dict	45	helios/utils/create_ema_zero3.py
resize	42	eval/utils/third_party/amt/networks/blocks/ifrnet.py
read	37	eval/utils/third_party/amt/utils/utils.py
encode	29	eval/utils/third_party/ViCLIP/simple_tokenizer.py
img2tensor	26	eval/utils/third_party/amt/utils/utils.py

Shape

Method 467

Function 380

Class 124

Languages

Python 100%

Modules by API surface

helios/modules/transformer_helios.py 52 symbols

helios/diffusers_version/transformer_helios_diffusers.py 30 symbols

eval/utils/third_party/amt/losses/loss.py 30 symbols

helios/utils/utils_helios_post.py 28 symbols

eval/utils/third_party/amt/utils/utils.py 28 symbols

helios/diffusers_version/scheduling_helios_diffusers.py 27 symbols

helios/utils/utils_base.py 24 symbols

helios/pipelines/pipeline_helios_ode.py 24 symbols

helios/pipelines/pipeline_helios.py 24 symbols

helios/diffusers_version/pipeline_helios_diffusers.py 24 symbols

helios/dataset/dataloader_mp4_dist.py 24 symbols

eval/utils/third_party/ViCLIP/viclip_vision.py 22 symbols

Dependencies from manifests, versioned

Brotli 1.1.0 · 1×

Cython 3.1.2 · 1×

Deprecated 1.2.18 · 1×

Flask 2.3.3 · 1×

GPUtil 1.4.0 · 1×

GitPython 3.1.44 · 1×

ImageIO 2.37.2 · 1×

Jinja2 3.1.3 · 1×

Markdown 3.8.2 · 1×

MarkupSafe 2.1.5 · 1×

PyGObject 3.42.2 · 1×

PyJWT 2.10.1 · 1×

For agents

$ claude mcp add Helios \
  -- python -m otcore.mcp_server <graph>
