github.com/danijar/dreamerv3 @main sqlite

815 symbols 2,025 edges 58 files 2 documented · 0%

README

Mastering Diverse Domains through World Models

A reimplementation of DreamerV3, a scalable and general reinforcement learning algorithm that masters a wide range of applications with fixed hyperparameters.

[Figure: DreamerV3 Tasks]

If you find this code useful, please reference it in your paper:

@article{hafner2025dreamerv3,
  title={Mastering diverse control tasks through world models},
  author={Hafner, Danijar and Pasukonis, Jurgis and Ba, Jimmy and Lillicrap, Timothy},
  journal={Nature},
  pages={1--7},
  year={2025},
  publisher={Nature Publishing Group}
}

DreamerV3

DreamerV3 learns a world model from experiences and uses it to train an actor critic policy from imagined trajectories. The world model encodes sensory inputs into categorical representations and predicts future representations and rewards given actions.
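
The data flow described above can be illustrated with a toy sketch. It is not the repository's implementation: the sizes, the random linear maps standing in for learned networks, and the helper names (`encode`, `predict`, `sample_onehot`) are all assumptions made for illustration. It only shows the shape of the idea: an observation becomes a set of one-hot categorical variables, and the representation plus an action yields a predicted next representation and a reward.

```python
import math
import random

rng = random.Random(0)

# Hypothetical toy sizes, chosen only for illustration.
OBS_DIM, ACT_DIM = 8, 3
GROUPS, CLASSES = 4, 5  # latent = GROUPS categorical variables, CLASSES options each
LATENT_DIM = GROUPS * CLASSES

def random_matrix(rows, cols):
    """Random weights standing in for a learned network."""
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

def linear(x, weights):
    """Multiply vector x by a len(x)-by-cols weight matrix."""
    cols = len(weights[0])
    return [sum(xi * row[j] for xi, row in zip(x, weights)) for j in range(cols)]

def sample_onehot(logits):
    """Sample one one-hot vector per group of CLASSES logits."""
    latent = []
    for g in range(GROUPS):
        group = logits[g * CLASSES:(g + 1) * CLASSES]
        top = max(group)
        weights = [math.exp(v - top) for v in group]  # unnormalized softmax
        idx = rng.choices(range(CLASSES), weights=weights)[0]
        latent.extend(1.0 if c == idx else 0.0 for c in range(CLASSES))
    return latent

W_enc = random_matrix(OBS_DIM, LATENT_DIM)
W_dyn = random_matrix(LATENT_DIM + ACT_DIM, LATENT_DIM)
W_rew = random_matrix(LATENT_DIM + ACT_DIM, 1)

def encode(obs):
    """Sensory input -> categorical representation."""
    return sample_onehot(linear(obs, W_enc))

def predict(z, action):
    """Representation and action -> next representation and reward."""
    x = z + action
    return sample_onehot(linear(x, W_dyn)), linear(x, W_rew)[0]

obs = [rng.gauss(0, 1) for _ in range(OBS_DIM)]
action = [1.0, 0.0, 0.0]  # one-hot action
z = encode(obs)
z_next, reward = predict(z, action)
```

Calling `predict` repeatedly on its own output, with actions chosen by a policy, is the sense in which trajectories can be "imagined" without touching the environment.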

[Figure: DreamerV3 Method Diagram]

DreamerV3 masters a wide range of domains with a fixed set of hyperparameters, outperforming specialized methods. Removing the need for tuning reduces the amount of expert knowledge and computational resources needed to apply reinforcement learning.

[Figure: DreamerV3 Benchmark Scores]

Due to its robustness, DreamerV3 shows favorable scaling properties. Notably, using larger models consistently increases not only its final performance but also its data-efficiency. Increasing the number of gradient steps further increases data efficiency.

[Figure: DreamerV3 Scaling Behavior]

Instructions

The code has been tested on Linux and Mac and requires Python 3.11+.

Docker

Either use the provided Dockerfile, which contains its own instructions, or follow the manual instructions below.

Manual

Install JAX and then the other dependencies:

pip install -U -r requirements.txt

Training script:

python dreamerv3/main.py \
  --logdir ~/logdir/dreamer/{timestamp} \
  --configs crafter \
  --run.train_ratio 32

To reproduce results, train on the desired task using the corresponding config, such as --configs atari --task atari_pong.
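
If you launch several runs from a script, the command can be assembled in Python. The following is a minimal sketch: `build_train_command` is a hypothetical helper, not part of the repository, and it uses only the script path and flags shown above. The `{timestamp}` placeholder is passed through exactly as written in the example command.

```python
import os

def build_train_command(configs, overrides=None,
                        logdir="~/logdir/dreamer/{timestamp}"):
    """Assemble the argument list for dreamerv3/main.py (hypothetical helper)."""
    cmd = [
        "python", "dreamerv3/main.py",
        # Expand "~" here because no shell is involved; the
        # "{timestamp}" placeholder is passed through unchanged.
        "--logdir", os.path.expanduser(logdir),
        "--configs", *configs,
    ]
    # Each override becomes a "--name value" pair after the config blocks.
    for name, value in (overrides or {}).items():
        cmd += [f"--{name}", str(value)]
    return cmd

pong = build_train_command(["atari"], {"task": "atari_pong"})
crafter = build_train_command(["crafter"], {"run.train_ratio": 32})
# To launch a run from the repository root:
#   import subprocess; subprocess.run(crafter, check=True)
```

Building a list rather than a single string avoids shell quoting issues when a flag value contains spaces.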

View results:

pip install -U scope
python -m scope.viewer --basedir ~/logdir --port 8000

Scalar metrics are also written as JSONL files.
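
Because JSONL stores one JSON object per line, the metrics can be loaded with the standard library alone. This is a minimal sketch under assumptions: the file location inside the logdir, the `step` key, and the metric names are not stated here, so treat them as hypothetical and check your own files.

```python
import json

def read_jsonl(path):
    """Return one dict per non-empty line of a JSON Lines file."""
    rows = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                rows.append(json.loads(line))
    return rows

def series(rows, key, step_key="step"):
    """Collect (step, value) pairs from the rows that contain `key`.

    `step_key` is an assumed field name; rows without it yield None.
    """
    return [(row.get(step_key), row[key]) for row in rows if key in row]
```

For example, `series(read_jsonl(path), "some_metric")` returns the points for one curve, skipping lines that log other metrics.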

Tips

All config options are listed in dreamerv3/configs.yaml and you can override them as flags from the command line.
The debug config block reduces the network size, batch size, duration between logs, and so on for fast debugging (but does not learn a good model).
By default, the code tries to run on GPU. You can switch to CPU or TPU using the --jax.platform flag, for example --jax.platform cpu.
You can use multiple config blocks that will override defaults in the order they are specified, for example --configs crafter size50m.
By default, metrics are printed to the terminal, appended to a JSON lines file, and written as Scope summaries. Other outputs like WandB and TensorBoard can be enabled in the training script.
If you get a Too many leaves for PyTreeDef error, it means you're reloading a checkpoint that is not compatible with the current config. This often happens when reusing an old logdir by accident.
If you are getting CUDA errors, scroll up because the cause is often just an error that happened earlier, such as out of memory or incompatible JAX and CUDA versions. Try --batch_size 1 to rule out an out of memory error.
Many environments are included, some of which require installing additional packages. See the Dockerfile for reference.
To continue stopped training runs, simply run the same command line again and make sure that the --logdir points to the same directory.
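
The override order described in the tips above (defaults, then each named config block in the order given, then command-line flags) can be sketched as follows. This is an illustration of the precedence only, not the repository's config loader, and every key and value in `blocks` is made up; the real options live in dreamerv3/configs.yaml.

```python
import copy

def deep_update(base, update):
    """Recursively copy `update` into `base`, overriding existing keys."""
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = copy.deepcopy(value)
    return base

def set_dotted(config, dotted, value):
    """Apply a flag such as 'run.train_ratio' to a nested dict."""
    *parents, leaf = dotted.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value

def resolve(blocks, names, flags=None):
    """Start from defaults, apply blocks in order, then apply flags."""
    config = copy.deepcopy(blocks["defaults"])
    for name in names:
        deep_update(config, blocks[name])
    for dotted, value in (flags or {}).items():
        set_dotted(config, dotted, value)
    return config

# Hypothetical block contents, for illustration only.
blocks = {
    "defaults": {"task": "dummy", "batch_size": 16,
                 "run": {"train_ratio": 32, "steps": 100}},
    "crafter": {"task": "crafter_reward", "run": {"steps": 200}},
    "size50m": {"batch_size": 8},
}

# Mirrors: --configs crafter size50m --run.train_ratio 64
config = resolve(blocks, ["crafter", "size50m"], {"run.train_ratio": 64})
```

Later blocks win over earlier ones only for the keys they define, which is why untouched defaults survive the merge.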

Disclaimer

This repository contains a reimplementation of DreamerV3 based on the open source DreamerV2 code base. It is unrelated to Google or DeepMind. The implementation has been tested to reproduce the official results on a range of environments.

Core symbols most depended-on inside this repo

add · called by 115 · embodied/core/replay.py

append · called by 96 · embodied/core/chunk.py

update · called by 55 · embodied/core/chunk.py

stats · called by 34 · embodied/core/replay.py

reset · called by 28 · embodied/core/driver.py

insert · called by 22 · embodied/core/selectors.py

on_step · called by 22 · embodied/core/driver.py

update · called by 17 · embodied/jax/utils.py

Shape

Method 593

Function 112

Class 110

Languages

Python 100%

Modules by API surface

embodied/jax/outs.py · 66 symbols

embodied/jax/nets.py · 61 symbols

embodied/core/selectors.py · 51 symbols

embodied/core/wrappers.py · 48 symbols

embodied/envs/minecraft_flat.py · 45 symbols

embodied/core/streams.py · 38 symbols

embodied/envs/loconav_quadruped.py · 27 symbols

dreamerv3/rssm.py · 26 symbols

embodied/jax/agent.py · 24 symbols

embodied/core/base.py · 22 symbols

embodied/jax/utils.py · 20 symbols

plot.py · 19 symbols

Dependencies from manifests, versioned

ale_py 0.9.0 · 1×

elements 3.19.1 · 1×

google-resumable-media 2.7.2 · 1×

granular 0.20.3 · 1×

ninjax 3.5.1 · 1×

nvidia-cuda-nvcc-cu12 12.2 · 1×

portal 3.5.0 · 1×

scope 0.4.4 · 1×

For agents

$ claude mcp add dreamerv3 \
  -- python -m otcore.mcp_server <graph>