
github.com/ludwig-ai/ludwig @v0.17.6 sqlite


8,753 symbols · 34,689 edges · 804 files · 2,439 documented (28%)

README

Declarative deep learning framework for LLMs, multimodal models, and tabular AI.


What is Ludwig?

Ludwig is a declarative deep learning framework that lets you train, fine-tune, and deploy AI models — from LLM fine-tuning to tabular classification — using a YAML config file and zero boilerplate Python.

# Fine-tune Llama-3.1 with LoRA in one config file
model_type: llm
base_model: meta-llama/Llama-3.1-8B
adapter:
  type: lora
trainer:
  type: finetune
  epochs: 3
input_features:
  - name: instruction
    type: text
output_features:
  - name: response
    type: text

ludwig train --config model.yaml --dataset my_data.csv
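The same configuration can also be written as a Python dictionary and passed to Ludwig's Python API. The sketch below assumes `ludwig.api.LudwigModel` is available once Ludwig is installed. `build_llm_config` and `train_from_dict` are illustrative helper names, not part of Ludwig.

```python
import json


def build_llm_config(base_model: str, epochs: int = 3) -> dict:
    """Mirror the YAML config above as a plain dict (hypothetical helper)."""
    return {
        "model_type": "llm",
        "base_model": base_model,
        "adapter": {"type": "lora"},
        "trainer": {"type": "finetune", "epochs": epochs},
        "input_features": [{"name": "instruction", "type": "text"}],
        "output_features": [{"name": "response", "type": "text"}],
    }


def train_from_dict(config: dict, dataset_path: str):
    """Train through the Python API instead of the CLI (needs Ludwig installed)."""
    # Imported lazily so the config helper above works without Ludwig.
    from ludwig.api import LudwigModel

    model = LudwigModel(config=config)
    return model.train(dataset=dataset_path)


config = build_llm_config("meta-llama/Llama-3.1-8B")
print(json.dumps(config, indent=2))
```

Calling `train_from_dict(config, "my_data.csv")` would then be expected to behave like the CLI command above, assuming the dataset has `instruction` and `response` columns.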

Tech stack: Python 3.12 · PyTorch 2.7+ · Pydantic 2 · Transformers 5 · Ray 2.54

Ludwig is hosted by the Linux Foundation AI & Data.

What's New in Ludwig 0.16

Feature	Description
PatchTST & N-BEATS encoders	State-of-the-art timeseries forecasting encoders with MASE/sMAPE metrics
Advanced PEFT adapters	PiSSA, EVA, CorDA/LoftQ initializers; TinyLoRA, OFT, HRA, WaveFT, LN-Tuning, VBLoRA, C3A adapter types
VLM fine-tuning	Train LLaVA, Qwen2-VL, InternVL via `is_multimodal: true` with gated cross-attention
HyperNetwork combiner	Conditioning-based feature fusion — one feature generates weights for others
Nash-MTL & Pareto-MTL	Game-theoretic and preference-based multi-task loss balancing
LLM config generation	`ludwig generate_config "describe your task"` — LLM writes the YAML for you
ModelInspector	Architecture analysis, weight collection, feature importance proxy
Ray Serve & KServe	Distributed and Kubernetes-native model deployment shims
GRPO alignment	Reward-model-free RLHF via Group Relative Policy Optimization
torchao quantization + QAT	PyTorch-native `int4/int8/float8` with Quantization-Aware Training
Multi-adapter PEFT	Multiple named LoRA adapters with weighted merging (TIES, DARE, SVD)
Native Optuna executor	GP/TPE/CMA-ES samplers, pruning, resumable SQLite/PostgreSQL storage
Timeseries forecasting	`model.forecast(dataset, horizon=N)` API with `TimeseriesOutputFeature`
Muon & ScheduleFreeAdamW	New optimizers for large-scale pretraining and fine-tuning
Image segmentation decoders	UNet, SegFormer, FPN decoders for semantic segmentation

Installation

pip install ludwig           # core
pip install ludwig[full]     # all optional dependencies
pip install ludwig[llm]      # LLM fine-tuning only

Requires Python 3.12+. See contributing for a full dependency matrix.

Quick Start

Fine-tune an LLM (instruction tuning)

Ludwig supports the full LLM fine-tuning spectrum:

Technique	Config key
Supervised fine-tuning (SFT)	`trainer.type: finetune`
DPO / KTO / ORPO / GRPO alignment	`trainer.type: dpo` (or `kto`, `orpo`, `grpo`)
LoRA / DoRA / VeRA / PiSSA	`adapter.type: lora` (or `dora`, `vera`, `lora` + `init_weights: pissa`)
4-bit QLoRA (bitsandbytes)	`quantization.bits: 4`
torchao + QAT	`quantization.backend: torchao`
Multi-adapter with merging	`adapters:` dict + `merge:` block
VLM (vision-language)	`is_multimodal: true`

model_type: llm
base_model: meta-llama/Llama-3.1-8B

quantization:
  bits: 4

adapter:
  type: lora

prompt:
  template: |
    ### Instruction: {instruction}
    ### Input: {input}
    ### Response:

input_features:
  - name: prompt
    type: text

output_features:
  - name: output
    type: text

trainer:
  type: finetune
  learning_rate: 0.0001
  batch_size: 1
  gradient_accumulation_steps: 16
  epochs: 3
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.01

backend:
  type: local

export HUGGING_FACE_HUB_TOKEN="<your_token>"
ludwig train --config model.yaml --dataset "ludwig://alpaca"
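To make the config above more concrete, here is a minimal sketch of two things it implies: how a prompt template is filled from dataset columns, and how `batch_size`, `gradient_accumulation_steps`, and `warmup_fraction` combine. It assumes the template placeholders map to columns named `instruction` and `input`, and that warmup is a fraction of total optimizer updates. The helper names and the 1,600-example dataset size are hypothetical, and Ludwig's internal step accounting may differ.

```python
import math

TEMPLATE = (
    "### Instruction: {instruction}\n"
    "### Input: {input}\n"
    "### Response:"
)


def render_prompt(row: dict) -> str:
    """Fill the template placeholders from one dataset row."""
    return TEMPLATE.format(**row)


def schedule_summary(num_examples, batch_size, grad_accum, epochs, warmup_fraction):
    """Derive effective batch size and step counts (counting optimizer updates)."""
    effective_batch = batch_size * grad_accum
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": round(total_steps * warmup_fraction),
    }


prompt = render_prompt({"instruction": "Translate to French", "input": "Good morning"})
summary = schedule_summary(
    num_examples=1600,  # hypothetical dataset size
    batch_size=1,
    grad_accum=16,
    epochs=3,
    warmup_fraction=0.01,
)
print(prompt)
print(summary)
```

With `batch_size: 1` and `gradient_accumulation_steps: 16`, each weight update sees 16 examples, which is why a batch size of 1 is workable on a single GPU.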

Train a multimodal classifier

input_features:
  - name: review_text
    type: text
    encoder:
      type: bert
  - name: star_rating
    type: number
  - name: product_image
    type: image
    encoder:
      type: dinov2

output_features:
  - name: recommended
    type: binary

ludwig train --config model.yaml --dataset reviews.csv

Generate a config from natural language

ludwig generate_config "I have a CSV with age, income, education level, and I want to predict loan default"

Make predictions

ludwig predict --model_path results/experiment_run/model --dataset new_data.csv

Launch a REST API

ludwig serve --model_path results/experiment_run/model
# POST http://localhost:8000/predict
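A client call against that endpoint could look like the sketch below. It assumes the server accepts form-encoded fields named after the model's input features and returns JSON; the `age` and `income` fields and the helper names are hypothetical, so check the `ludwig serve` documentation for the exact request format.

```python
import json
from urllib import parse, request


def build_predict_request(fields: dict, host: str = "http://localhost:8000") -> request.Request:
    """Build (but do not send) a form-encoded POST to the /predict endpoint."""
    body = parse.urlencode(fields).encode("utf-8")
    return request.Request(
        url=f"{host}/predict",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def send(req: request.Request) -> dict:
    """Send the request and decode the response, assuming a JSON body."""
    with request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Field names must match the input features of the served model (assumed here).
req = build_predict_request({"age": 42, "income": 55000})
print(req.get_method(), req.full_url)
```

Calling `send(req)` only works while `ludwig serve` is running; image or audio inputs would need a multipart upload rather than this form encoding.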

Capabilities

LLM Fine-Tuning

Supervised fine-tuning (SFT) on instruction/response pairs
Alignment training: DPO, KTO, ORPO, GRPO (reward-model-free RLHF)
PEFT adapters: LoRA, DoRA, VeRA, LoRA+, TinyLoRA, OFT, HRA, WaveFT, LN-Tuning, VBLoRA, C3A
LoRA initializers: PiSSA, EVA, CorDA, LoftQ for improved convergence
Multi-adapter PEFT: multiple named adapters on one base model, switchable at runtime; merge with TIES, DARE, SVD, magnitude pruning
Quantization: 4-bit/8-bit QLoRA (bitsandbytes), torchao int4/int8/float8 with QAT
VLM fine-tuning: LLaVA, Qwen2-VL, InternVL via is_multimodal: true
Sequence packing for efficient training on variable-length inputs
Paged and 8-bit optimizers for memory-efficient training
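Sequence packing, listed above, can be illustrated with a first-fit-decreasing bin-packing sketch: short sequences are grouped so each packed row wastes less padding. This is a generic illustration under assumed inputs (token lengths and a maximum row length), not Ludwig's actual packing implementation.

```python
def pack_sequences(lengths, max_len):
    """Group sequence indices into rows of at most max_len tokens.

    Uses first-fit-decreasing: visit sequences from longest to shortest and
    place each into the first row that still has room.
    """
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    rows = []  # each entry is [remaining_capacity, [sequence indices]]
    for i in order:
        n = lengths[i]
        if n > max_len:
            raise ValueError(f"sequence {i} has {n} tokens, exceeds max_len={max_len}")
        for row in rows:
            if row[0] >= n:
                row[0] -= n
                row[1].append(i)
                break
        else:
            # No existing row had room, so open a new one.
            rows.append([max_len - n, [i]])
    return [row[1] for row in rows]


lengths = [700, 300, 512, 200, 100, 900]
packed = pack_sequences(lengths, max_len=1024)
print(packed)
```

Here six sequences fit into three rows instead of six padded ones. A real implementation also has to mask attention so packed sequences do not attend to each other.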

Multimodal & Tabular Models

Input modalities: text, numbers, categories, binary, sets, bags, sequences, images, audio, timeseries, vectors, dates
Text encoders: any HuggingFace Transformer (BERT, RoBERTa, ModernBERT, Qwen3, Llama-3.1, etc.), plus Mamba-2, Jamba
Image encoders: DINOv2, ConvNeXt, EfficientNet, ViT, CAFormer, ConvFormer, PoolFormer, TIMM (1000+ models)
Timeseries encoders: PatchTST, N-BEATS, CNN, RNN, Transformer; MASE and sMAPE metrics; model.forecast() API
Combiners: concat, transformer, tab_transformer, FT-Transformer, TabNet, TabPFN v2, HyperNetwork, ProjectAggregate, GatedFusion, Perceiver
Multi-task learning: multiple output features in a single model; Nash-MTL, Pareto-MTL, FAMO, GradNorm, uncertainty loss balancing
Image segmentation: UNet, SegFormer, FPN decoders
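For readers unfamiliar with the timeseries metrics mentioned above, the sketch below computes sMAPE and MASE from their common textbook definitions. Ludwig's own metric modules may handle edge cases (zero denominators, seasonality, scaling) differently, so treat this as an illustration only.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (0 to 200)."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        # Treat a 0/0 term as zero error rather than dividing by zero.
        terms.append(0.0 if denom == 0 else 2.0 * abs(f - a) / denom)
    return 100.0 * sum(terms) / len(terms)


def mase(actual, forecast, train, season=1):
    """Mean absolute scaled error: forecast MAE over naive in-sample MAE."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_errors = [abs(train[t] - train[t - season]) for t in range(season, len(train))]
    return mae / (sum(naive_errors) / len(naive_errors))


train = [10, 12, 14, 16]
actual = [18, 20]
forecast = [17, 22]
print(smape(actual, forecast))
print(mase(actual, forecast, train))
```

A MASE below 1 means the forecast beats the naive "repeat the last value" baseline on average.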

Training Infrastructure

Distributed training: HuggingFace Accelerate with DDP, FSDP, DeepSpeed (zero-code changes)
Ray backend: training across a Ray cluster, larger-than-memory datasets via Ray Data
Automatic batch size selection and learning rate range test
Mixed precision (fp16/bf16), gradient checkpointing, gradient accumulation
Optimizers: AdamW, Adafactor, SGD, Muon, ScheduleFreeAdamW, Lion, paged/8-bit variants
Learning rate schedulers: cosine, linear, polynomial, reduce-on-plateau, OneCycleLR
Model Soup: uniform and greedy checkpoint averaging for better generalization at zero inference cost
Modality dropout for robust multimodal models
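The uniform variant of Model Soup mentioned above amounts to averaging parameters element-wise across checkpoints. The sketch below shows the idea on plain Python lists; a real implementation would operate on tensors from saved state dicts, and the function name is hypothetical.

```python
def uniform_soup(checkpoints):
    """Average parameters element-wise across checkpoints with identical keys."""
    keys = checkpoints[0].keys()
    for ckpt in checkpoints:
        if ckpt.keys() != keys:
            raise ValueError("all checkpoints must share the same parameter names")
    n = len(checkpoints)
    return {
        name: [sum(values) / n for values in zip(*(ckpt[name] for ckpt in checkpoints))]
        for name in keys
    }


checkpoints = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
    {"w": [5.0, 6.0], "b": [2.0]},
]
soup = uniform_soup(checkpoints)
print(soup)
```

The greedy variant would instead add checkpoints one at a time and keep each only if a validation metric improves. Either way the result is a single set of weights, which is why inference cost does not change.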

Hyperparameter Optimization

Executors: Ray Tune (ASHA, PBT, Bayesian) and native Optuna (auto/GP/TPE/CMA-ES)
Optuna persistence: SQLite or PostgreSQL for resumable HPO runs
Pruning with Optuna's MedianPruner and HyperbandPruner
Search spaces: uniform, log-uniform, choice, randint, quantized
Full Ludwig config is searchable — any nested parameter can be a hyperparameter
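The idea that any nested parameter can be searched can be sketched as dotted paths mapped onto a nested config. The example below assumes search-space entries with `space`, `lower`, `upper`, and `categories` keys. These names and the sampling rules are illustrative, and the real executors (Ray Tune, Optuna) do their own sampling.

```python
import copy
import math
import random


def sample(space: dict, rng: random.Random):
    """Draw one value from a single search-space definition."""
    kind = space["space"]
    if kind == "uniform":
        return rng.uniform(space["lower"], space["upper"])
    if kind == "loguniform":
        return math.exp(rng.uniform(math.log(space["lower"]), math.log(space["upper"])))
    if kind == "choice":
        return rng.choice(space["categories"])
    if kind == "randint":
        # Upper bound is exclusive in this sketch.
        return rng.randrange(space["lower"], space["upper"])
    raise ValueError(f"unknown space: {kind}")


def set_by_path(config: dict, dotted: str, value) -> None:
    """Set config['a']['b'] = value for a dotted path 'a.b'."""
    *parents, leaf = dotted.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def sample_trial(base_config: dict, parameters: dict, rng: random.Random) -> dict:
    """Return a copy of base_config with every searched parameter filled in."""
    trial = copy.deepcopy(base_config)
    for path, space in parameters.items():
        set_by_path(trial, path, sample(space, rng))
    return trial


base = {"trainer": {"epochs": 3}}
parameters = {
    "trainer.learning_rate": {"space": "loguniform", "lower": 1e-5, "upper": 1e-3},
    "trainer.batch_size": {"space": "choice", "categories": [8, 16, 32]},
}
trial = sample_trial(base, parameters, random.Random(0))
print(trial)
```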

Production & Deployment

REST API: FastAPI server with Prometheus metrics and structured logging (ludwig serve)
vLLM serving: OpenAI-compatible API with PagedAttention and continuous batching
Ray Serve: distributed deployment with auto-scaling and traffic splitting
KServe: Kubernetes-native deployment with Open Inference Protocol v2
Model export: SafeTensors (default), torch.export .pt2 bundles, ONNX
HuggingFace Hub: ludwig upload hf_hub — push model + auto-generated model card
Docker: prebuilt containers at ludwigai/ludwig

Tooling & Integrations

Experiment tracking: TensorBoard, Weights & Biases, Comet ML, MLflow, Aim Stack
Model inspection: ModelInspector — weight enumeration, architecture summary, feature importance proxy
Visualizations: learning curves, confusion matrices, calibration plots, ROC curves, hyperopt analysis
AutoML: ludwig.automl.auto_train() — give it a dataset and a time budget; the YAML-driven search space samples encoder/combiner/decoder combinations and validates them before training
Dataset quality checks: from ludwig.utils.dataset_quality import check_dataset_quality — validates a DataFrame before training (missing values, class imbalance, near-duplicate columns, ID leakage, …)
OpenML integration: load any OpenML task directly — OpenMLLoader fetches by task ID and caches locally as Parquet
LLM config generation: ludwig generate_config "describe your task" — LLM writes the YAML
K-fold cross-validation: ludwig experiment --k_fold N
Dataset Zoo: 70+ built-in benchmark datasets (ludwig://mnist, ludwig://alpaca, …)
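As background for the `--k_fold N` option above, the sketch below shows how a dataset of `n` rows is conventionally split into `k` train/validation folds. It is a generic, unshuffled illustration; how Ludwig shuffles or stratifies its folds is not covered here.

```python
def k_fold_indices(n: int, k: int):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, validation
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print(len(train_idx), val_idx)
```

Each row appears in exactly one validation fold, so averaging the per-fold metrics uses every example for evaluation once.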

Examples

LLM & Alignment

Use Case	Link
LLM instruction tuning (LoRA + QLoRA)	examples/llm
DPO / GRPO alignment	examples/llm/alignment
Advanced PEFT (PiSSA, OFT, VBLoRA, …)	examples/llms/peft_advanced
VLM fine-tuning (LLaVA, Qwen2-VL)	examples/vlm

Tabular & Multimodal

Use Case	Link
Binary classification (Titanic)	examples/titanic
Tabular classification (census income)	examples/adult_census_income
Multimodal classification

Core symbols most depended-on inside this repo

tests/integration_tests/utils.py

items

called by 253

ludwig/utils/registry.py

tests/integration_tests/utils.py

text_feature

called by 144

tests/integration_tests/utils.py

Shape

Method 4,195

Function 2,907

Class 1,519

Route 132

Languages

Python 100%

Modules by API surface

Module	Symbols
ludwig/utils/tokenizers.py	245
ludwig/encoders/text_encoders.py	151
tests/ultra_slow/test_ultra_slow.py	149
ludwig/modules/metric_modules.py	119
ludwig/schema/llms/peft.py	98
ludwig/features/number_feature.py	84
ludwig/features/image_feature.py	81
ludwig/modules/loss_modules.py	79
ludwig/schema/utils.py	78
ludwig/utils/data_utils.py	74
ludwig/schema/optimizers.py	71
ludwig/modules/convolutional_modules.py	71

Dependencies from manifests, versioned

PyYAML 6.0 · 1×
absl-py · 1×
kaggle · 1×
numpy 1.24 · 1×
pandas 2.0 · 1×
py-cpuinfo · 1×
requests 2.28 · 1×
scikit-learn 1.3 · 1×
scipy 1.10 · 1×
sentencepiece 0.2 · 1×
spacy 2.3 · 1×
tabulate 0.9 · 1×

For agents

$ claude mcp add ludwig \
  -- python -m otcore.mcp_server <graph>
