
github.com/leofan90/Awesome-World-Models @main sqlite

62 symbols · 270 edges · 5 files · 0 documented (0%)

README

Awesome World Models for Robotics

This repository provides a curated list of papers on World Models for General Video Generation, Embodied AI, and Autonomous Driving. The template is from Awesome-LLM-Robotics and Awesome-World-Model.

Contributions are welcome! Please feel free to submit pull requests or reach out via email to add papers!

If you find this repository useful, please consider citing and giving this list a star ⭐. Feel free to share it with others!

Maintainers can use the arXiv candidate pipeline to discover recent papers for review.

Overview

- Foundation paper of World Model
- Blog or Technical Report
- Surveys
- Benchmarks & Evaluation
- General World Models
- World Models for Embodied AI
- World Models for VLA
- World Models for Visual Understanding
- World Models for Autonomous Driving
  - Refer to https://github.com/LMD0311/Awesome-World-Model
- Citation

Foundation paper of World Model

World Models, NIPS 2018 Oral. [Paper] [Website]

Blog or Technical Report

NVIDIA OmniDreams, NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation. [Paper]
Cosmos 3, Cosmos 3: Omnimodal World Models for Physical AI. [Paper] [Website]
X Square Robot, WALL-WM: Carving World Action Modeling at the Event Joints. [Paper] [Website]
GE-Sim 2.0, GE-Sim 2.0: A Roadmap Towards Comprehensive Closed-loop Video World Simulators for Robotic Manipulation. [Paper] [Website]
Xiaomi EV World Model, Xiaomi EV World Model: A Joint World Model Integrating Reconstruction and Generation for Autonomous Driving. [Paper] [Website]
Coowa, The DAWN of World-Action Interactive Models. [Paper]
Being-H0.7, Being-H0.7: A Latent World-Action Model from Egocentric Videos. [Paper]
MotuBrain, MotuBrain: An Advanced World Action Model for Robot Control. [Paper]
Cortex 2.0, Cortex 2.0: Grounding World Models in Real-World Industrial Deployment. [Paper]
HY-World 2.0, HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D World. [Paper] [Website] [Code]
Helios, Real Real-Time Long Video Generation Model. [Paper] [Website] [Code]
Seedance 2.0, Seedance 2.0: Advancing Video Generation for World Complexity. [Paper] [Website]
Matrix-Game 3.0, Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory. [Paper] [Website]
HY-Embodied-0.5, HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents. [Paper] [Website]
OpenWorldLib, OpenWorldLib: A Unified Codebase and Definition of Advanced World Models. [Paper]
ABot-PhysWorld, ABot-PhysWorld: Interactive World Foundation Model for Robotic Manipulation with Physics Alignment. [Paper]
GigaWorld-Policy, GigaWorld-Policy: An Efficient Action-Centered World-Action Model. [Paper]
GigaBrain-0.5M*, GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning. [Paper] [Website]
ALIVE, ALIVE: Animate Your World with Lifelike Audio-Video Generation. [Paper] [Website]
DreamDojo, DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos. [Paper] [Website]
lingbot-va, Causal World Modeling for Robot Control. [Paper] [Website] [Code]
lingbot-world, Advancing Open-source World Models. [Paper] [Website] [Code]
TARS, World In Your Hands: A Large-Scale and Open-source Ecosystem for Learning Human-centric Manipulation in the Wild. [Paper] [Website]
SIMA 2, SIMA 2: A Generalist Embodied Agent for Virtual Worlds. [Paper]
SimWorld, SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds. [Paper] [Website]
Hunyuan-GameCraft-2, Hunyuan-GameCraft-2: Instruction-following Interactive Game World Model. [Paper] [Website]
GigaWorld-0, GigaWorld-0: World Models as Data Engine to Empower Embodied AI. [Paper] [Website]
PAN, PAN: A World Model for General, Interactable, and Long-Horizon World Simulation. [Paper]
Cosmos-Predict2.5, World Simulation with Video Foundation Models for Physical AI. [Paper] [Code]
Emu3.5, Emu3.5: Native Multimodal Models are World Learners. [Paper] [Website] [Code]
ODesign, ODesign: A World Model for Biomolecular Interaction Design. [Paper] [Website]
GigaBrain-0, GigaBrain-0: A World Model-Powered Vision-Language-Action Model. [Paper] [Website]
CWM, CWM: An Open-Weights LLM for Research on Code Generation with World Models. [Paper] [Website] [Code]
WoW, WoW: Towards a World omniscient World model Through Embodied Interaction. [Paper] [Website]
Matrix-Game 2.0, Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model. [Paper] [Website]
Matrix-3D, Matrix-3D: Omnidirectional Explorable 3D World Generation. [Paper] [Website]
HunyuanWorld 1.0, HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels. [Paper] [Website] [Code]
What Does it Mean for a Neural Network to Learn a "World Model"?. [Paper]
Matrix-Game, Matrix-Game: Interactive World Foundation Model. [Paper] [Code]
Cosmos-Drive-Dreams, Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models. [Paper] [Website]
GAIA-2, GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving. [Paper] [Website]
Cosmos, Cosmos World Foundation Model Platform for Physical AI. [Paper] [Website] [Code]
1X Technologies, 1X World Model. [Blog]
Runway, Introducing General World Models. [Blog]
Wayve, Introducing GAIA-1: A Cutting-Edge Generative AI Model for Autonomy. [Paper] [Blog]
Yann LeCun, A Path Towards Autonomous Machine Intelligence. [Paper]

Surveys

"From World Models to World Action Models: A Concise Tutorial for Robotics", arXiv 2026.07. [Paper]
"Autonomous Video Generation with Counterfactual Controllability for Self-Evolving World Models", arXiv 2026.06. [Paper]
"Medical world models: representing medical states, modelling clinical dynamics and guiding intervention policies", arXiv 2026.06. [Paper]
"Bridging the Agent-World Gap: Text World Models for LLM-based Agents", arXiv 2026.06. [Paper]
"Towards Interactive Video World Modeling: Frontiers, Challenges, Benchmarks, and Future Trends", arXiv 2026.05. [Paper]
"Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses", arXiv 2026.05. [Paper] [Website]
"Why We Need World Models for AGI: Where LLMs Fail and How World Models May Outperform", arXiv 2026.05. [Paper]
"World Action Models: The Next Frontier in Embodied AI", arXiv 2026.05. [Paper]
"Latent State Design for World Models under Sufficiency Constraints", arXiv 2026.05. [Paper]
"World Model for Robot Learning: A Comprehensive Survey", arXiv 2026.05. [Paper] [Website] [Code]
"Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling", arXiv 2026.04. [Paper] [Code]
"Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond", arXiv 2026.04. [Paper]
"Infrastructure-Centric World Models: Bridging Temporal Depth and Spatial Breadth for Roadside Perception", arXiv 2026.04. [Paper]
"Human Cognition in Machines: A Unified Perspective of World Models", arXiv 2026.04. [Paper]
"Video Generation Models as World Models: Efficient Paradigms, Architectures and Algorithms", arXiv 2026.03. [Paper]
"From Digital Twins to World Models: Opportunities, Challenges, and Applications for Mobile Edge General Intelligence", arXiv 2026.03. [Paper]
"The Trinity of Consistency as a Defining Principle for General World Models", arXiv 2026.02. [Paper] [Code]
"A Mechanistic View on Video Generation as World Models: State and Dynamics", arXiv 2026.01. [Paper]
"From Generative Engines to Actionable Simulators: The Imperative of Physical Grounding in World Models", arXiv 2026.01. [Paper]
"Modeling the Mental World for Embodied AI: A Comprehensive Review", arXiv 2026

Core symbols (most depended-on inside this repo)

parse_feed · called by 9 · scripts/arxiv_candidates.py
normalize_arxiv_id · called by 8 · scripts/arxiv_candidates.py
render_report · called by 6 · scripts/arxiv_candidates.py
parse_checked_candidates · called by 6 · scripts/apply_issue_selections.py
required_text · called by 5 · scripts/arxiv_candidates.py
extract_readme_arxiv_ids · called by 5 · scripts/arxiv_candidates.py
apply_candidates · called by 4 · scripts/apply_issue_selections.py
parse_datetime · called by 3 · scripts/arxiv_candidates.py
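
Several of the symbols listed above (`parse_feed`, `normalize_arxiv_id`, `extract_readme_arxiv_ids`) suggest a pipeline that reads an arXiv Atom feed and filters out papers already present in the README. The sketch below is only an illustration of that idea, not the repository's code: the function names are borrowed from the symbol list, but their signatures, return types, and behavior are assumptions, and `new_candidates`, the sample feed, and the sample README line are hypothetical. It handles new-style arXiv identifiers only and runs offline on the inline sample.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds (the arXiv API returns Atom XML).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# New-style arXiv IDs inside abs/pdf URLs, e.g. arxiv.org/abs/2401.00001v2.
ARXIV_URL_RE = re.compile(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})", re.IGNORECASE)


def normalize_arxiv_id(raw: str) -> str:
    """Assumed behavior: drop the URL prefix, a trailing .pdf, and the version suffix."""
    ident = raw.strip().rsplit("/", 1)[-1]
    ident = re.sub(r"\.pdf$", "", ident, flags=re.IGNORECASE)
    return re.sub(r"v\d+$", "", ident)


def extract_readme_arxiv_ids(readme_text: str) -> set[str]:
    """Assumed behavior: collect every version-less arXiv ID linked from the README."""
    return {m.group(1) for m in ARXIV_URL_RE.finditer(readme_text)}


def parse_feed(atom_xml: str) -> list[dict]:
    """Assumed behavior: turn Atom <entry> elements into small dicts."""
    root = ET.fromstring(atom_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        raw_id = entry.findtext("atom:id", default="", namespaces=ATOM_NS)
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        entries.append({
            "id": normalize_arxiv_id(raw_id),
            "title": " ".join(title.split()),  # collapse line breaks in titles
        })
    return entries


def new_candidates(atom_xml: str, readme_text: str) -> list[dict]:
    """Hypothetical helper: keep feed entries whose ID is not yet in the README."""
    known = extract_readme_arxiv_ids(readme_text)
    return [e for e in parse_feed(atom_xml) if e["id"] not in known]


# Placeholder data for illustration only; IDs and titles are made up.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/2401.00001v2</id>
    <title>A Hypothetical
      World Model Paper</title>
  </entry>
  <entry>
    <id>http://arxiv.org/abs/2401.00002v1</id>
    <title>Another Hypothetical Paper</title>
  </entry>
</feed>"""

SAMPLE_README = "Some Paper, arXiv 2024.01. [Paper](https://arxiv.org/abs/2401.00001)"

if __name__ == "__main__":
    print(new_candidates(SAMPLE_FEED, SAMPLE_README))
    # prints [{'id': '2401.00002', 'title': 'Another Hypothetical Paper'}]
```

Comparing on version-less identifiers means a revised submission (`v2`, `v3`) is not reported as a new candidate once any version of the paper is already listed.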

Shape

Function 34

Method 22

Class 5

Route 1

Languages

Python 100%

Modules by API surface

scripts/arxiv_candidates.py · 25 symbols

tests/test_arxiv_candidates.py · 16 symbols

scripts/apply_issue_selections.py · 11 symbols

tests/test_apply_issue_selections.py · 10 symbols

For agents

$ claude mcp add Awesome-World-Models \
  -- python -m otcore.mcp_server <graph>
