Official PyTorch implementation of the NeurIPS 2021 paper

Alias-Free Generative Adversarial Networks
Tero Karras, Miika Aittala, Samuli Laine, Erik Härkönen, Janne Hellsten, Jaakko Lehtinen, Timo Aila
https://nvlabs.github.io/stylegan3
Abstract: We observe that despite their hierarchical convolutional nature, the synthesis process of typical generative adversarial networks depends on absolute pixel coordinates in an unhealthy manner. This manifests itself as, e.g., detail appearing to be glued to image coordinates instead of the surfaces of depicted objects. We trace the root cause to careless signal processing that causes aliasing in the generator network. Interpreting all signals in the network as continuous, we derive generally applicable, small architectural changes that guarantee that unwanted information cannot leak into the hierarchical synthesis process. The resulting networks match the FID of StyleGAN2 but differ dramatically in their internal representations, and they are fully equivariant to translation and rotation even at subpixel scales. Our results pave the way for generative models better suited for video and animation.
For business inquiries, please visit our website and submit the form: NVIDIA Research Licensing
This repository is an updated version of stylegan2-ada-pytorch, with several new features:
- Alias-free generator architecture and training configurations (stylegan3-t, stylegan3-r).
- Tools for interactive visualization (visualizer.py), spectral analysis (avg_spectra.py), and video generation (gen_video.py).
- Equivariance metrics (eqt50k_int, eqt50k_frac, eqr50k).
- General improvements: reduced memory usage, slightly faster training, bug fixes.
Compatibility: - Compatible with old network pickles created using stylegan2-ada and stylegan2-ada-pytorch. (Note: running old StyleGAN2 models on StyleGAN3 code will produce the same results as running them on stylegan2-ada/stylegan2-ada-pytorch. To benefit from the StyleGAN3 architecture, you need to retrain.) - Supports old StyleGAN2 training configurations, including ADA and transfer learning. See Training configurations for details. - Improved compatibility with Ampere GPUs and newer versions of PyTorch, CuDNN, etc.
While new generator approaches enable new media synthesis capabilities, they may also present a new challenge for AI forensics algorithms for detection and attribution of synthetic media. In collaboration with digital forensic researchers participating in DARPA's SemaFor program, we curated a synthetic image dataset that allowed the researchers to test and validate the performance of their image detectors in advance of the public release. Please see here for more details.
Access individual networks via
https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/<MODEL>, where<MODEL>is one of:
stylegan3-t-ffhq-1024x1024.pkl,stylegan3-t-ffhqu-1024x1024.pkl,stylegan3-t-ffhqu-256x256.pkl
stylegan3-r-ffhq-1024x1024.pkl,stylegan3-r-ffhqu-1024x1024.pkl,stylegan3-r-ffhqu-256x256.pkl
stylegan3-t-metfaces-1024x1024.pkl,stylegan3-t-metfacesu-1024x1024.pkl
stylegan3-r-metfaces-1024x1024.pkl,stylegan3-r-metfacesu-1024x1024.pkl
stylegan3-t-afhqv2-512x512.pkl
stylegan3-r-afhqv2-512x512.pkl
Access individual networks via
https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan2/versions/1/files/<MODEL>, where<MODEL>is one of:
stylegan2-ffhq-1024x1024.pkl,stylegan2-ffhq-512x512.pkl,stylegan2-ffhq-256x256.pkl
stylegan2-ffhqu-1024x1024.pkl,stylegan2-ffhqu-256x256.pkl
stylegan2-metfaces-1024x1024.pkl,stylegan2-metfacesu-1024x1024.pkl
stylegan2-afhqv2-512x512.pkl
stylegan2-afhqcat-512x512.pkl,stylegan2-afhqdog-512x512.pkl,stylegan2-afhqwild-512x512.pkl
stylegan2-brecahad-512x512.pkl,stylegan2-cifar10-32x32.pkl
stylegan2-celebahq-256x256.pkl,stylegan2-lsundog-256x256.pkl
conda env create -f environment.ymlconda activate stylegan3The code relies heavily on custom PyTorch extensions that are compiled on the fly using NVCC. On Windows, the compilation requires Microsoft Visual Studio. We recommend installing Visual Studio Community Edition and adding it into PATH using "C:\Program Files (x86)\Microsoft Visual Studio\<VERSION>\Community\VC\Auxiliary\Build\vcvars64.bat".
See Troubleshooting for help on common installation and run-time problems.
Pre-trained networks are stored as *.pkl files that can be referenced using local filenames or URLs:
# Generate an image using pre-trained AFHQv2 model ("Ours" in Figure 1, left).
python gen_images.py --outdir=out --trunc=1 --seeds=2 \
--network=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-afhqv2-512x512.pkl
# Render a 4x2 grid of interpolations for seeds 0 through 31.
python gen_video.py --output=lerp.mp4 --trunc=1 --seeds=0-31 --grid=4x2 \
--network=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-afhqv2-512x512.pkl
Outputs from the above commands are placed under out/*.png, controlled by --outdir. Downloaded network pickles are cached under $HOME/.cache/dnnlib, which can be overridden by setting the DNNLIB_CACHE_DIR environment variable. The default PyTorch extension build directory is $HOME/.cache/torch_extensions, which can be overridden by setting TORCH_EXTENSIONS_DIR.
Docker: You can run the above curated image example using Docker as follows:
# Build the stylegan3:latest image
docker build --tag stylegan3 .
# Run the gen_images.py script using Docker:
docker run --gpus all -it --rm --user $(id -u):$(id -g) \
-v `pwd`:/scratch --workdir /scratch -e HOME=/scratch \
stylegan3 \
python gen_images.py --outdir=out --trunc=1 --seeds=2 \
--network=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-afhqv2-512x512.pkl
Note: The Docker image requires NVIDIA driver release r470 or later.
The docker run invocation may look daunting, so let's unpack its contents here:
--gpus all -it --rm --user $(id -u):$(id -g): with all GPUs enabled, run an interactive session with current user's UID/GID to avoid Docker writing files as root.-v `pwd`:/scratch --workdir /scratch: mount current running dir (e.g., the top of this git repo on your host machine) to /scratch in the container and use that as the current working dir.-e HOME=/scratch: let PyTorch and StyleGAN3 code know where to cache temporary files such as pre-trained models and custom PyTorch extension build results. Note: if you want more fine-grained control, you can instead set TORCH_EXTENSIONS_DIR (for custom extensions build dir) and DNNLIB_CACHE_DIR (for pre-trained model download cache). You want these cache dirs to reside on persistent volumes so that their contents are retained across multiple docker run invocations.This release contains an interactive model visualization tool that can be used to explore various characteristics of a trained model. To start it, run:
python visualizer.py
You can use pre-trained networks in your own Python code as follows:
with open('ffhq.pkl', 'rb') as f:
G = pickle.load(f)['G_ema'].cuda() # torch.nn.Module
z = torch.randn([1, G.z_dim]).cuda() # latent codes
c = None # class labels (not used in this example)
img = G(z, c) # NCHW, float32, dynamic range [-1, +1], no truncation
The above code requires torch_utils and dnnlib to be accessible via PYTHONPATH. It does not need source code for the networks themselves — their class definitions are loaded from the pickle via torch_utils.persistence.
The pickle contains three networks. 'G' and 'D' are instantaneous snapshots taken during training, and 'G_ema' represents a moving average of the generator weights over several training steps. The networks are regular instances of torch.nn.Module, with all of their parameters and buffers placed on the CPU at import and gradient computation disabled by default.
The generator consists of two submodules, G.mapping and G.synthesis, that can be executed separately. They also support various additional options:
w = G.mapping(z, c, truncation_psi=0.5, truncation_cutoff=8)
img = G.synthesis(w, noise_mode='const', force_fp32=True)
Please refer to gen_images.py for complete code example.
Datasets are stored as uncompressed ZIP archives containing uncompressed PNG files and a metadata file dataset.json for labels. Custom datasets can be created from a folder containing images; see python dataset_tool.py --help for more information. Alternatively, the folder can also be used directly as a dataset, without running it through dataset_tool.py first, but doing so may lead to suboptimal performance.
FFHQ: Download the Flickr-Faces-HQ dataset as 1024x1024 images and create a zip archive using dataset_tool.py:
# Original 1024x1024 resolution.
python dataset_tool.py --source=/tmp/images1024x1024 --dest=~/datasets/ffhq-1024x1024.zip
# Scaled down 256x256 resolution.
python dataset_tool.py --source=/tmp/images1024x1024 --dest=~/datasets/ffhq-256x256.zip \
--resolution=256x256
See the FFHQ README for information on how to obtain the unaligned FFHQ dataset images. Use the same steps as above to create a ZIP archive for training and validation.
MetFaces: Download the MetFaces dataset and create a ZIP archive:
python dataset_tool.py --source=~/downloads/metfaces/images --dest=~/datasets/metfaces-1024x1024.zip
See the MetFaces README for information on how to obtain the unaligned MetFaces dataset images. Use the same steps as above to create a ZIP archive for training and validation.
AFHQv2: Download the AFHQv2 dataset and create a ZIP archive:
python dataset_tool.py --source=~/downloads/afhqv2 --dest=~/datasets/afhqv2-512x512.zip
Note that the above command creates a single combined dataset using all images of all three classes (cats, dogs, and wild animals), matching the setup used in the StyleGAN3 paper. Alternatively, you can also create a separate dataset for each class:
python dataset_tool.py --source=~/downloads/afhqv2/train/cat --dest=~/datasets/afhqv2cat-512x512.zip
python dataset_tool.py --source=~/downloads/afhqv2/train/dog --dest=~/datasets/afhqv2dog-512x512.zip
python dataset_tool.py --source=~/downloads/afhqv2/train/wild --dest=~/datasets/afhqv2wild-512x512.zip
You can train new networks using train.py. For example:
```.bash
python train.py --outdir=~/training-runs --cfg=stylegan3-t --data=~/datasets/afhqv2-512x512.zip \ --gp
$ claude mcp add stylegan3 \
-- python -m otcore.mcp_server <graph>