github.com/POSTECH-CVLab/PyTorch-StudioGAN @v.0.4.0 sqlite

670 symbols 1,441 edges 56 files 93 documented · 14%
README


StudioGAN is a PyTorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation. StudioGAN aims to offer an identical playground for modern GANs so that machine learning researchers can readily compare and analyze new ideas.

Moreover, StudioGAN provides an unprecedented-scale benchmark for generative models. The benchmark includes results from GANs (BigGAN-Deep, StyleGAN-XL), auto-regressive models (MaskGIT, RQ-Transformer), and diffusion models (LSGM++, CLD-SGM, ADM-G-U).

News

  • Our new paper "StudioGAN: A Taxonomy and Benchmark of GANs for Image Synthesis" is now available on arXiv.
  • StudioGAN provides implementations of 7 GAN architectures, 9 conditioning methods, 4 adversarial losses, 13 regularization modules, 3 differentiable augmentations, 8 evaluation metrics, and 5 evaluation backbones.
  • StudioGAN supports both clean and architecture-friendly metrics (IS, FID, PRDC, IFID) with a comprehensive benchmark.
  • StudioGAN provides wandb logs and pre-trained models (will be ready soon).

Release Notes (v.0.4.0)

  • We checked the reproducibility of implemented GANs.
  • We provide Baby, Papa, and Grandpa ImageNet datasets, in which images are processed using an anti-aliasing, high-quality resizer.
  • StudioGAN provides a dedicated benchmark on standard datasets (CIFAR10, ImageNet, AFHQv2, and FFHQ).
  • StudioGAN supports InceptionV3, ResNet50, SwAV, DINO, and Swin Transformer backbones for GAN evaluation.

Features

  • Coverage: StudioGAN is a self-contained library that provides 7 GAN architectures, 9 conditioning methods, 4 adversarial losses, 13 regularization modules, 6 augmentation modules, 8 evaluation metrics, and 5 evaluation backbones. Among these configurations, we formulate 30 GANs as representatives.
  • Flexibility: Each modularized option is managed through a YAML-based configuration system, so users can train a large combination of GANs by mixing and matching distinct options.
  • Reproducibility: With StudioGAN, users can compare and debug various GANs in a unified computing environment without worrying about hidden details and tricks.
  • Plentifulness: StudioGAN provides a large collection of pre-trained GAN models, training logs, and evaluation results.
  • Versatility: StudioGAN supports 5 types of acceleration methods with synchronized batch normalization for training: single-GPU training, data-parallel training (DP), distributed data-parallel training (DDP), multi-node distributed data-parallel training (MDDP), and mixed-precision training.
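
The mix-and-match idea in the Flexibility item can be illustrated with a short sketch. The option names and values below are hypothetical stand-ins, not StudioGAN's actual YAML keys, and reading the options from a YAML file is left out; the sketch only shows how a few independent option groups multiply into many GAN configurations.

```python
from itertools import product

# Hypothetical option groups; the real YAML keys and values in StudioGAN may differ.
OPTIONS = {
    "backbone": ["resnet", "big_resnet", "stylegan2"],
    "d_cond_mtd": ["PD", "AC", "2C"],
    "adv_loss": ["vanilla", "hinge", "wasserstein"],
}

def enumerate_configs(options):
    """Return every combination of the given option groups as a list of dicts."""
    keys = list(options)
    return [dict(zip(keys, values)) for values in product(*(options[k] for k in keys))]

configs = enumerate_configs(OPTIONS)
print(len(configs))  # 27, i.e. 3 * 3 * 3 combinations
print(configs[0])
```

Not every combination is necessarily meaningful in practice, which is presumably why the library singles out a set of representative GANs.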

Implemented GANs

| Method | Venue | Architecture | GC | DC | Loss | EMA |
|---|---|---|---|---|---|---|
| DCGAN | arXiv'15 | DCGAN/ResNetGAN¹ | N/A | N/A | Vanilla | False |
| InfoGAN | NIPS'16 | DCGAN/ResNetGAN¹ | N/A | N/A | Vanilla | False |
| LSGAN | ICCV'17 | DCGAN/ResNetGAN¹ | N/A | N/A | Least Square | False |
| GGAN | arXiv'17 | DCGAN/ResNetGAN¹ | N/A | N/A | Hinge | False |
| WGAN-WC | ICLR'17 | ResNetGAN | N/A | N/A | Wasserstein | False |
| WGAN-GP | NIPS'17 | ResNetGAN | N/A | N/A | Wasserstein | False |
| WGAN-DRA | arXiv'17 | ResNetGAN | N/A | N/A | Wasserstein | False |
| ACGAN-Mod² | - | ResNetGAN | cBN | AC | Hinge | False |
| PDGAN | ICLR'18 | ResNetGAN | cBN | PD | Hinge | False |
| SNGAN | ICLR'18 | ResNetGAN | cBN | PD | Hinge | False |
| SAGAN | ICML'19 | ResNetGAN | cBN | PD | Hinge | False |
| TACGAN | Neurips'19 | BigGAN | cBN | TAC | Hinge | True |
| LGAN | ICML'19 | ResNetGAN | N/A | N/A | Vanilla | False |
| Unconditional BigGAN | ICLR'19 | BigGAN | N/A | N/A | Hinge | True |
| BigGAN | ICLR'19 | BigGAN | cBN | PD | Hinge | True |
| BigGAN-Deep-CompareGAN | ICLR'19 | BigGAN-Deep CompareGAN | cBN | PD | Hinge | True |
| BigGAN-Deep-StudioGAN | - | BigGAN-Deep StudioGAN | cBN | PD | Hinge | True |
| StyleGAN2 | CVPR'20 | StyleGAN2 | cAdaIN | SPD | Logistic | True |
| CRGAN | ICLR'20 | BigGAN | cBN | PD | Hinge | True |
| ICRGAN | AAAI'21 | BigGAN | cBN | PD | Hinge | True |
| LOGAN | arXiv'19 | ResNetGAN | cBN | PD | Hinge | True |
| ContraGAN | Neurips'20 | BigGAN | cBN | 2C | Hinge | True |
| MHGAN | WACV'21 | BigGAN | cBN | MH | MH | True |
| BigGAN + DiffAugment | Neurips'20 | BigGAN | cBN | PD | Hinge | True |
| StyleGAN2 + ADA | Neurips'20 | StyleGAN2 | cAdaIN | SPD | Logistic | True |
| BigGAN + LeCam | CVPR'21 | BigGAN | cBN | PD | Hinge | True |
| ADCGAN | arXiv'21 | BigGAN | cBN | ADC | Hinge | True |
| ReACGAN | Neurips'21 | BigGAN | cBN | D2D-CE | Hinge | True |
| StyleGAN2 + APA | Neurips'21 | StyleGAN2 | cAdaIN | SPD | Logistic | True |
| StyleGAN3-t | Neurips'21 | StyleGAN3 | cAdaIN | SPD | Logistic | True |
| StyleGAN3-r | Neurips'21 | StyleGAN3 | cAdaIN | SPD | Logistic | True |
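
Most rows in the table above list the hinge loss. As a rough illustration, the commonly used hinge formulation can be written on plain Python lists of discriminator outputs. This is a sketch of the textbook formula, not the code in StudioGAN's src/utils/losses.py, and the function names are chosen here for illustration only.

```python
def d_hinge_loss(real_logits, fake_logits):
    """Discriminator hinge loss: mean(max(0, 1 - D(x))) + mean(max(0, 1 + D(G(z))))."""
    real_term = sum(max(0.0, 1.0 - r) for r in real_logits) / len(real_logits)
    fake_term = sum(max(0.0, 1.0 + f) for f in fake_logits) / len(fake_logits)
    return real_term + fake_term

def g_hinge_loss(fake_logits):
    """Generator hinge loss: -mean(D(G(z)))."""
    return -sum(fake_logits) / len(fake_logits)

# Real samples scored above +1 and fake samples scored below -1 contribute nothing.
print(d_hinge_loss([2.0, 0.5], [-2.0, 0.0]))  # 0.75
print(g_hinge_loss([-2.0, 0.0]))              # 1.0
```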

GC/DC indicates how label information is injected into the Generator or Discriminator.

EMA: Exponential Moving Average update to the generator. cBN: conditional Batch Normalization. cAdaIN: conditional version of Adaptive Instance Normalization. AC: Auxiliary Classifier. PD: Projection Discriminator. TAC: Twin Auxiliary Classifier. SPD: modified PD for StyleGAN. 2C: Conditional Contrastive loss. MH: Multi-Hinge loss. ADC: Auxiliary Discriminative Classifier. D2D-CE: Data-to-Data Cross-Entropy.
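
The EMA update mentioned above keeps a slowly moving copy of the generator weights. A minimal sketch, assuming parameters are stored in a plain dict of floats (real models hold tensors) and an illustrative decay value; the function name is hypothetical and unrelated to the `update` symbol in src/utils/ema.py:

```python
def ema_update(ema_params, params, decay=0.999):
    """Blend the current generator parameters into their moving average, in place."""
    for name, value in params.items():
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * value
    return ema_params

# With decay=0.5 the average moves halfway toward the current value at each step.
ema = {"w": 0.0}
for step in range(3):
    ema_update(ema, {"w": 1.0}, decay=0.5)
print(ema["w"])  # 0.875
```

A decay close to 1 makes the averaged generator change slowly, which smooths out step-to-step fluctuations in the trained weights.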

Evaluation Metrics

| Method | Venue | Architecture |
|---|---|---|
| Inception Score (IS) | Neurips'16 | InceptionV3 |
| Frechet Inception Distance (FID) | Neurips'17 | InceptionV3 |
| Improved Precision & Recall | Neurips'19 | InceptionV3 |
| Classifier Accuracy Score (CAS) | Neurips'19 | InceptionV3 |
| Density & Coverage | ICML'20 | InceptionV3 |
| Intra-class FID | - | InceptionV3 |
| SwAV FID | ICLR'21 | SwAV |
| Clean metrics (IS, FID, PRDC) | CVPR'22 | InceptionV3 |
| Architecture-friendly metrics (IS, FID, PRDC) | arXiv'22 | Not limited to InceptionV3 |
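
To make the first metric concrete, the Inception Score is the exponential of the average KL divergence between each sample's class distribution p(y|x) and the marginal p(y). The sketch below assumes the per-sample class probabilities are already available as plain lists; in practice they would come from a backbone such as InceptionV3. It is an illustration of the formula, not StudioGAN's implementation.

```python
import math

def inception_score(probs):
    """IS = exp(mean over samples of KL(p(y|x) || p(y))).

    `probs` is a list of per-sample class-probability lists.
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y), averaged over all samples.
    marginal = [sum(p[c] for p in probs) / n for c in range(num_classes)]
    kl_sum = 0.0
    for p in probs:
        # Terms with zero probability contribute nothing to the KL divergence.
        kl_sum += sum(pc * math.log(pc / marginal[c]) for c, pc in enumerate(p) if pc > 0.0)
    return math.exp(kl_sum / n)

# Confident and diverse predictions reach the maximum (the number of classes);
# identical uninformative predictions give the minimum of 1.
print(inception_score([[1.0, 0.0], [0.0, 1.0]]))
print(inception_score([[0.5, 0.5], [0.5, 0.5]]))
```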

Training and Inference Techniques

| Method | Venue | Target Architecture |
|---|---|---|
| FreezeD | CVPRW'20 | Except for StyleGAN2 |
| Top-K Training | Neurips'20 | - |
| DDLS | Neurips'20 | - |
| SeFa | CVPR'21 | BigGAN |
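
As an illustration of one entry above, Top-K Training updates the generator using only the fake samples that the discriminator scores highest. The sketch below assumes a hinge-style generator loss and a fixed k on a plain list of scores; the function name is hypothetical, and how k is scheduled during training is not shown.

```python
def topk_generator_loss(fake_logits, k):
    """Average the generator loss over the k fake samples the discriminator scores highest."""
    if not 1 <= k <= len(fake_logits):
        raise ValueError("k must be between 1 and the batch size")
    kept = sorted(fake_logits, reverse=True)[:k]  # discard the lowest-scoring samples
    return -sum(kept) / k

# Only the two best-scored samples (1.5 and 0.4) contribute to the loss.
print(topk_generator_loss([0.1, -2.0, 1.5, 0.4], k=2))
```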

Reproducibility

We check the reproducibility of GANs implemented in StudioGAN by comparing IS and FID with the original papers. We find that our platform successfully reproduces most of the representative GANs, except for PD-GAN, ACGAN, LOGAN, SAGAN, and BigGAN-Deep. FQ means Flickr-Faces-HQ Dataset (FFHQ). The resolutions of the ImageNet, AFHQv2, and FQ datasets are 128, 512, and 1024, respectively.

Requirements

First, install a PyTorch build that matches your environment (at least 1.7, recommended 1.10):

pip3 install torch==1.10.0+cu111 torchvision==0.11.1+cu111 torchaudio==0.10.0+cu111 -f https://download.pytorch.org/whl/cu111/torch_stable.html

Then, use the following command to install the rest of the libraries:

pip3 install tqdm ninja h5py kornia matplotlib pandas scikit-learn scipy seaborn wandb PyYaml click requests pyspng imageio-ffmpeg

With docker, you can use:

docker pull mgkang/studiogan:latest

The following command creates a container named "StudioGAN":

docker run -it --gpus all --shm-size 128g --name StudioGAN -v /home/USER:/root/code --workdir /root/code mgkang/studiogan:latest /bin/bash
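
Whichever installation route is used, it can help to confirm that the installed PyTorch meets the minimum version stated above. Comparing version strings directly is unreliable ("1.10" sorts before "1.7" as text), so the sketch below compares numeric (major, minor) pairs. The helper name is hypothetical.

```python
import re

def meets_minimum(version, minimum=(1, 7)):
    """Return True if a version string such as '1.10.0+cu111' is at least `minimum`."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)

print(meets_minimum("1.10.0+cu111"))  # True
print(meets_minimum("1.6.0"))         # False
```

In an environment where PyTorch is installed, the same check can be run as `meets_minimum(torch.__version__)` after `import torch`.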

Dataset

  • CIFAR10/CIFAR100: StudioGAN will automatically download the dataset once you execute main.py.

  • Tiny ImageNet, ImageNet, or a custom dataset:

  • Download Tiny ImageNet, Baby ImageNet, Papa ImageNet, Grandpa ImageNet, or ImageNet, or prepare your own dataset.
  • Arrange the dataset in the following folder structure:
data
└── ImageNet, Tiny_ImageNet, Baby ImageNet, Papa ImageNet, or Grandpa ImageNet
    ├── train
    │   ├── cls0
    │   │   ├── train0.png
    │   │   ├── train1.png
    │   │   └── ...
    │   ├── cls1
    │   └── ...
    └── valid
        ├── cls0
        │   ├── valid0.png
        │   ├── valid1.png
        │   └── ...
        ├── cls1
        └── ...
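
A layout like the one above can be sanity-checked before training. The following sketch counts image files per class folder for a given split; the helper name and the set of accepted file extensions are assumptions made for illustration, not part of StudioGAN.

```python
from pathlib import Path

def summarize_split(data_dir, split):
    """Map each class folder under data_dir/split to its number of image files."""
    split_dir = Path(data_dir) / split
    if not split_dir.is_dir():
        raise FileNotFoundError(f"missing split folder: {split_dir}")
    image_suffixes = {".png", ".jpg", ".jpeg"}  # assumed set of accepted extensions
    return {
        cls_dir.name: sum(1 for f in cls_dir.iterdir() if f.suffix.lower() in image_suffixes)
        for cls_dir in sorted(split_dir.iterdir())
        if cls_dir.is_dir()
    }
```

For example, `summarize_split("data/ImageNet", "train")` would return a dict such as `{"cls0": ..., "cls1": ...}`, and an empty or missing class folder shows up immediately.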

Quick Start

Before starting, users should log in to wandb using their personal API key.

wandb login PERSONAL_API_KEY

Since release 0.3.0, you can choose which evaluation metrics to compute with the -metrics option. If the option is omitted, only FID is calculated. For example, -metrics is fid calculates only IS and FID, and -metrics none skips evaluation.
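
The behaviour described above (a multi-valued option that defaults to FID and accepts `none`) can be mimicked with `argparse`. This is a sketch of how such an option could be parsed, not the parser in src/main.py; the long option names and the `requested_metrics` helper are assumptions.

```python
import argparse

def build_parser():
    """Build a small parser that mirrors the -t, -metrics and -cfg options."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-t", "--train", action="store_true")
    # One or more metric names; FID alone when the option is omitted.
    parser.add_argument("-metrics", "--eval_metrics", nargs="+", default=["fid"])
    parser.add_argument("-cfg", "--cfg_file", type=str)
    return parser

def requested_metrics(args):
    """Return the metrics to compute; 'none' disables evaluation."""
    return [] if args.eval_metrics == ["none"] else list(args.eval_metrics)

args = build_parser().parse_args(["-t", "-metrics", "is", "fid", "prdc", "-cfg", "CONFIG_PATH"])
print(requested_metrics(args))  # ['is', 'fid', 'prdc']
```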

  • Train (-t) and evaluate IS, FID, Prc, Rec, Dns, Cvg (-metrics is fid prdc) of the model defined in CONFIG_PATH using GPU 0.
CUDA_VISIBLE_DEVICES=0 python3 src/main.py -t -metrics is fid prdc -cfg CONFIG_PATH -data DATA_PATH -save SAVE_PATH
  • Preprocess images for training and evaluation using the PIL.LANCZOS filter (--pre_resizer lanczos). Then, train (-t) and evaluate friendly-IS, friendly-FID, friendly-Prc, friendly-Rec, friendly-Dns, friendly-Cvg (-metrics is fid prdc --post_resizer clean) of the model defined in CONFIG_PATH using GPU 0.
CUDA_VISIBLE_DEVICES=0 python3 src/main.py -t -metrics is fid prdc --pre_resizer lanczos --post_resizer clean -cfg CONFIG_PATH -data DATA_PATH -save SAVE_PATH
  • Train (-t) and evaluate FID of the model defined in CONFIG_PATH through DataParallel using GPUs (0, 1, 2, 3). Evaluating FID does not require the -metrics argument.

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 src/main.py -t -cfg CONFIG_PATH -data DATA_PATH -save
```

Core symbols most depended-on inside this repo

| Symbol | Called by | File |
|---|---|---|
| update | 21 | src/utils/ema.py |
| eval | 18 | src/metrics/preparation.py |
| prepare_generator | 10 | src/utils/misc.py |
| make_resizer | 8 | src/utils/resize.py |
| backward | 7 | src/utils/losses.py |
| matrix | 7 | src/utils/ada_aug.py |
| trunc_normal_ | 7 | src/metrics/vit.py |
| _make_layer | 7 | src/metrics/resnet.py |

Shape

Method 313
Function 238
Class 119

Languages

Python 100%

Modules by API surface

src/utils/misc.py: 58 symbols
src/utils/losses.py: 46 symbols
src/models/stylegan2.py: 39 symbols
src/metrics/swin_transformer.py: 37 symbols
src/metrics/vit.py: 35 symbols
src/utils/simclr_aug.py: 33 symbols
src/utils/style_ops/dnnlib/util.py: 30 symbols
src/models/stylegan3.py: 25 symbols
src/utils/ops.py: 23 symbols
src/worker.py: 20 symbols
src/sync_batchnorm/batchnorm.py: 16 symbols
src/metrics/inception_net.py: 16 symbols

For agents

$ claude mcp add PyTorch-StudioGAN \
  -- python -m otcore.mcp_server <graph>
