
github.com/HKUDS/ViMax @v1.1.0

666 symbols · 3,025 edges · 82 files · 38 documented (6%)
README


ViMax: Agentic Video Generation

Python 3.12 · uv ready (https://github.com/astral-sh/uv) · MIT License

Community: Feishu group and WeChat group (https://github.com/HKUDS/ViMax/raw/v1.1.0/Communication.md) · YouTube (https://www.youtube.com/@AI-Creator-is-here)

Languages: English (https://github.com/HKUDS/ViMax/raw/v1.1.0/readme.md) · Chinese (https://github.com/HKUDS/ViMax/raw/v1.1.0/README_ZH.md)

🚨 Current Video Generation Limitations:

  • Limited to Short Clips - Most AI tools generate only seconds of footage.

  • Consistency Chaos - Characters and scenes change unpredictably across frames.

  • Visual-Only Focus - Missing scripts, audio, narrative structure, and storytelling depth.

💡 ViMax Solution:

🎬 Director, Screenwriter, Producer, and Video Generator All-in-One! We're exploring a future where AI becomes a complete creative powerhouse. 💡 Simply input your concept. ViMax autonomously handles the rest. It orchestrates scriptwriting, storyboarding, character creation, and final video generation—all end-to-end. 🚀

https://github.com/user-attachments/assets/5bad46b2-8276-4e1d-9480-3522640744b2



💡 Key Features

🌟 Idea2Video

From Spark to Screen: Transform raw ideas into complete video stories through intelligent multi-agent workflows that automate storytelling, character design, and production.

🎨 Novel2Video

Smart Literary Adaptation Engine: Transform complete novels into episodic video content with intelligent narrative compression, character tracking, and scene-by-scene visual adaptation.

⚙️ Script2Video

Unlimited Screenplay Video Creation: Write any screenplay, from personal stories to epic adventures, and keep complete control over every aspect of your visual storytelling.

🤳 AutoCameo

Generate Video from Your Photo: Create your own cameo video, turning yourself or your pet into a guest star who appears across creative scripts, cinematic sequences, and interactive storylines.

🔮Video Demos Generated from Scratch


🎯 End-to-End Video Creation Engine

The Challenges:

  • 🌅 Reference Images: Time-consuming acquisition, organization, and alignment of reference frames that accurately capture characters, objects, positions, and environments.

  • 🫠 Consistency Check: Even when given the correct character, position, and environment reference images and prompts, the image generator may still produce unusable images.

  • 📄 Scripts Generation: Professional and high-quality videos need to have rich information density and structured design.

  • 📝 Storyboard Design: Converting stories into visual narratives requires expertise in cinematography, scene composition, and visual storytelling that most creators lack.

  • 🎬 Shot Design: Creating coherent camera sequences with proper angles, transitions, and pacing while maintaining narrative flow across complex scenes.

  • 🎨 Development Delays: Ensuring character appearances, environments, and artistic style remain consistent across hundreds of shots in long-form content.

  • ⏱️ Production Efficiency: Traditional video creation involves multiple specialists and lengthy workflows, creating barriers for independent creators and rapid prototyping.

  • 🎥 Scaling AI-Generated Video: AI-generated videos are usually only a few seconds long. High-quality long videos at the minute or even hour level require cross-scene continuity and the ability to design and process multiple storyboards.

ViMax eliminates these production bottlenecks by automating the entire video creation pipeline from narrative input to final video output.


🔥 Why ViMax?

  • 🧠 Effortless Production (One-Prompt to Finished Video): Skip the technical complexity. Just describe your vision and let ViMax handle script generation, storyboarding, shot design, reference management, and consistency validation.

  • 🚀 Complete Creative Freedom (From Any Narrative to Reality): No creative limits. Whether it's a trailer, short story, novel chapter, or original concept, ViMax intelligently structures narratives and designs cinematography to bring any idea to life.

  • 🔊 Audio and Video Binding (Synchronized Storytelling): Seamlessly integrate character voices and sound effects with visual content to create immersive experiences where audio and video work in harmony.

  • 🎨 Professional Quality (Movie-Grade Output): Automated quality control ensures character consistency, proper scene composition, and professional visual standards across every frame of your video.

  • 🤩 Interactive Video (Make Your Own Cameo Video): Appear in your own short stories by uploading your photo. ViMax intelligently integrates you as a character with consistent appearance and natural interactions throughout the entire video.

ViMax now also includes an Agents Loop + TUI workflow for interactive planning, revision, rendering control, session reuse, and context compaction while preserving the original direct pipeline entrypoints.


☄️ Coming Soon

  • 👨‍💻 Google AI Studio API config✅
  • 🤖 Agents Loop + TUI✅
  • 📄 Technical Report☑️

🏗️ Architecture

📊 System Overview

ViMax is a multi-agent video framework that enables automated multi-shot video generation while ensuring character and scene consistency. Our system seamlessly translates your ideas into corresponding videos, allowing you to focus on storytelling rather than technical implementation.

🎯 Technical Capabilities:

🧬 Intelligent Long Script Generation

RAG-based long script design engine that intelligently analyzes lengthy, novel-like stories and automatically segments them into a multi-scene script format. The process meticulously ensures that all key plot developments and character dialogues are accurately retained within the new structure.
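The retrieval step behind a design like this can be illustrated with a minimal sketch. It is not ViMax's implementation: `split_into_chunks`, `retrieve`, and the bag-of-words cosine score are hypothetical stand-ins, and a real system would more likely use learned embeddings with a vector index. The sketch packs a long story into chunks and returns the chunks most relevant to a query, which a script writer agent could then use as context when drafting a scene.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack paragraphs into chunks of up to max_chars characters.

    A single paragraph longer than max_chars becomes its own oversized chunk.
    """
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def _bag(text: str) -> Counter:
    # Lowercased word counts; a toy substitute for an embedding vector.
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    """Return the k chunks whose word counts are most similar to the query."""
    q = _bag(query)
    return sorted(chunks, key=lambda c: cosine(_bag(c), q), reverse=True)[:k]


story = (
    "Mira finds a brass key in the attic.\n\n"
    "The key opens a door under the stairs.\n\n"
    "Far away, a storm sinks the harbor fleet."
)
chunks = split_into_chunks(story, max_chars=50)
context = retrieve(chunks, "brass attic", k=1)
```

Retrieving per scene keeps each prompt small while still letting the writer look up plot points and dialogue from anywhere in the source text.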

🪄 Expressive Storyboard Design

Shot-level storyboard design system that creates expressive storyboards through cinematography language based on user requirements and target audiences, which establishes the narrative rhythm for subsequent video generation.

🔮 Multi-camera Filming Simulation

Simulates multi-camera filming to deliver an immersive viewing experience while maintaining consistent character positioning and backgrounds within the same scene.

🧸 Intelligent Reference Images Selection

Intelligently selects the reference images required for the first frame of the current video, including storyboard frames from earlier in the timeline, so that multiple characters and environmental elements stay accurate as the video becomes longer.
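One simple way to think about this selection is as a coverage problem: pick a small set of reference images that together show every character and environment the next first frame needs. The sketch below is an assumption-laden illustration, not the project's algorithm. `Reference`, `select_references`, and the greedy strategy are all hypothetical, and a real selector might rely on a multimodal model rather than entity tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    name: str
    shows: frozenset[str]  # entities (characters, places) visible in the image


def select_references(
    needed: set[str], pool: list[Reference], limit: int = 3
) -> list[Reference]:
    """Greedily pick references until every needed entity is covered or limit is hit."""
    remaining = set(needed)
    chosen: list[Reference] = []
    while remaining and len(chosen) < limit:
        best = max(pool, key=lambda r: len(r.shows & remaining), default=None)
        if best is None or not (best.shows & remaining):
            break  # nothing in the pool covers what is still missing
        chosen.append(best)
        remaining -= best.shows
    return chosen


pool = [
    Reference("alice_portrait", frozenset({"alice"})),
    Reference("shot_03_frame", frozenset({"alice", "kitchen"})),  # earlier timeline frame
    Reference("bob_portrait", frozenset({"bob"})),
]
picked = select_references({"alice", "bob", "kitchen"}, pool)
```

Here the earlier frame is preferred over the standalone portrait because it covers both a character and the environment in one image.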

⚙️ Automated Images Generation

Automatically generates the image generator's prompt from the selected reference images and the visual order of earlier shots on the timeline, so that characters and the environment are placed in plausible spatial relation to each other.

Automated Image Generation Consistency Check

Generates multiple images in parallel and uses an MLLM/VLM to select the most consistent one as the first frame, imitating the workflow of human creators.
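The generate-then-judge pattern described here can be sketched as a best-of-N selection. This is a minimal illustration under stated assumptions: `pick_best_frame`, `fake_generate`, and `fake_score` are hypothetical names, and the scoring function is a hard-coded placeholder where a real pipeline would call an MLLM/VLM.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class Candidate:
    image_id: str
    score: float


async def pick_best_frame(
    generate: Callable[[int], Awaitable[str]],
    score: Callable[[str], Awaitable[float]],
    n: int = 4,
) -> Candidate:
    """Generate n candidate frames concurrently, score each, return the top one."""
    image_ids = await asyncio.gather(*(generate(i) for i in range(n)))
    scores = await asyncio.gather(*(score(img) for img in image_ids))
    candidates = [Candidate(i, s) for i, s in zip(image_ids, scores)]
    return max(candidates, key=lambda c: c.score)


async def fake_generate(seed: int) -> str:
    # Placeholder for an image generation call.
    await asyncio.sleep(0)
    return f"frame-{seed}"


async def fake_score(image_id: str) -> float:
    # Placeholder for a consistency judgment from a vision-language model.
    return {"frame-0": 0.2, "frame-1": 0.9, "frame-2": 0.5}.get(image_id, 0.0)


best = asyncio.run(pick_best_frame(fake_generate, fake_score, n=3))
```

Running generation and scoring as two gathered batches keeps the wall-clock cost close to one generation plus one judgment, regardless of how many candidates are tried.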

High-efficiency Parallel Shot Generation

Parallel processing for sequential shots captured from the same camera enables highly efficient video production.
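One possible reading of this is that shots are grouped by camera and the shots within a group are rendered concurrently, then reassembled in timeline order. The sketch below follows that reading only as an illustration. `Shot`, `render_shots`, and the `render` callback are hypothetical, and the source does not specify how ViMax schedules the work.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, NamedTuple


class Shot(NamedTuple):
    index: int   # position in the timeline
    camera: str  # hypothetical camera identifier
    prompt: str


def render_shots(
    shots: list[Shot], render: Callable[[Shot], str], max_workers: int = 4
) -> list[str]:
    """Render shots camera by camera, running each camera's shots concurrently."""
    by_camera: dict[str, list[Shot]] = defaultdict(list)
    for shot in shots:
        by_camera[shot.camera].append(shot)

    clips: dict[int, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for group in by_camera.values():
            # pool.map runs the group's shots concurrently and yields results in order.
            for shot, clip in zip(group, pool.map(render, group)):
                clips[shot.index] = clip
    # Reassemble the clips in timeline order.
    return [clips[s.index] for s in sorted(shots, key=lambda s: s.index)]


shots = [Shot(0, "A", "wide"), Shot(1, "B", "close"), Shot(2, "A", "pan")]
clips = render_shots(shots, lambda s: f"{s.camera}:{s.prompt}")
```

Threads suit this sketch because a render call to a remote video model would be I/O bound. A local GPU renderer would need a different scheduling strategy.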

🤖 Multi-Agent Video Generation Pipeline

🧠 INPUT LAYER: 📝 Idea & Scripts & Novels • 💭 Natural Language Prompts • 🖼️ Reference Images

Core symbols most depended-on inside this repo

  • get · called by 200 · agent_runtime/session_index.py
  • _emit_render_progress · called by 54 · pipelines/script2video_pipeline.py
  • split · called by 35 · agents/novel_compressor.py
  • load · called by 32 · agent_runtime/session_index.py
  • create · called by 32 · agent_runtime/session_index.py
  • resolve_chat_model_config · called by 27 · utils/provider_presets.py
  • update_stage · called by 18 · agent_runtime/session_index.py
  • write · called by 18 · agent_runtime/vimax_adapters.py

Shape

Method 367
Function 169
Class 122
Route 8

Languages

Python 94%
TypeScript 6%

Modules by API surface

tests/test_vimax_adapters.py · 45 symbols
agent_runtime/vimax_adapters.py · 45 symbols
tests/test_novel2video_adapter.py · 37 symbols
agent_runtime/tools.py · 35 symbols
tests/test_provider_presets.py · 34 symbols
agent_runtime/session_index.py · 25 symbols
ui/src/cli.tsx · 23 symbols
agent_runtime/context_compactor.py · 22 symbols
agent_runtime/config.py · 21 symbols
tests/test_main_agent_cli.py · 20 symbols
pipelines/script2video_pipeline.py · 20 symbols
tests/test_omni_yunwu_video_generator.py · 18 symbols

Dependencies from manifests, versioned

@types/node 24.5.2 · 1×
@types/react 18.3.12 · 1×
@types/react-dom 18.3.1 · 1×
ink 4.4.1 · 1×
ink-text-input 5.0.1 · 1×
react 18.3.1 · 1×
react-dom 18.3.1 · 1×
tsx 4.20.5 · 1×
typescript 5.6.3 · 1×
aiohttp 3.12.14 · 1×
chardet 5.2.0 · 1×
faiss-cpu 1.12.0 · 1×

For agents

$ claude mcp add ViMax \
  -- python -m otcore.mcp_server <graph>
