Declarative deep learning framework for LLMs, multimodal models, and tabular AI.
Ludwig is a declarative deep learning framework that lets you train, fine-tune, and deploy AI models — from LLM fine-tuning to tabular classification — using a YAML config file and zero boilerplate Python.
# Fine-tune Llama-3.1 with LoRA in one config file
model_type: llm
base_model: meta-llama/Llama-3.1-8B
adapter:
  type: lora
trainer:
  type: finetune
  epochs: 3
input_features:
  - name: instruction
    type: text
output_features:
  - name: response
    type: text
ludwig train --config model.yaml --dataset my_data.csv
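Before launching a run, it can help to confirm that the dataset actually contains the columns the config refers to. The following is a minimal sketch using only the Python standard library; `missing_columns` is a hypothetical helper, not part of Ludwig, and the sample rows are made up. Only the column names `instruction` and `response` come from the config above.

```python
import csv
import io

# Column names taken from the config above; everything else is illustrative.
REQUIRED_COLUMNS = {"instruction", "response"}


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names absent from the CSV header, sorted."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    return sorted(set(required) - {name.strip() for name in header})


# In-memory stand-in for my_data.csv (made-up content).
sample = io.StringIO("instruction,response\nSay hi,Hello!\n")
print(missing_columns(sample))  # prints []
```

For a real file, pass `open("my_data.csv", newline="")` instead of the in-memory sample.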
Tech stack: Python 3.12 · PyTorch 2.7+ · Pydantic 2 · Transformers 5 · Ray 2.54
Ludwig is hosted by the Linux Foundation AI & Data.
| Feature | Description |
|---|---|
| PatchTST & N-BEATS encoders | State-of-the-art timeseries forecasting encoders with MASE/sMAPE metrics |
| Advanced PEFT adapters | PiSSA, EVA, CorDA/LoftQ initializers; TinyLoRA, OFT, HRA, WaveFT, LN-Tuning, VBLoRA, C3A adapter types |
| VLM fine-tuning | Train LLaVA, Qwen2-VL, InternVL via is_multimodal: true with gated cross-attention |
| HyperNetwork combiner | Conditioning-based feature fusion — one feature generates weights for others |
| Nash-MTL & Pareto-MTL | Game-theoretic and preference-based multi-task loss balancing |
| LLM config generation | ludwig generate_config "describe your task" — LLM writes the YAML for you |
| ModelInspector | Architecture analysis, weight collection, feature importance proxy |
| Ray Serve & KServe | Distributed and Kubernetes-native model deployment shims |
| GRPO alignment | Reward-model-free RLHF via Group Relative Policy Optimization |
| torchao quantization + QAT | PyTorch-native int4/int8/float8 with Quantization-Aware Training |
| Multi-adapter PEFT | Multiple named LoRA adapters with weighted merging (TIES, DARE, SVD) |
| Native Optuna executor | GP/TPE/CMA-ES samplers, pruning, resumable SQLite/PostgreSQL storage |
| Timeseries forecasting | model.forecast(dataset, horizon=N) API with TimeseriesOutputFeature |
| Muon & ScheduleFreeAdamW | New optimizers for large-scale pretraining and fine-tuning |
| Image segmentation decoders | UNet, SegFormer, FPN decoders for semantic segmentation |
pip install ludwig # core
pip install "ludwig[full]" # all optional dependencies (quoted so shells such as zsh do not expand the brackets)
pip install "ludwig[llm]" # LLM fine-tuning only
Requires Python 3.12+. See contributing for a full dependency matrix.
Ludwig supports the full LLM fine-tuning spectrum:
| Technique | Config key |
|---|---|
| Supervised fine-tuning (SFT) | trainer.type: finetune |
| DPO / KTO / ORPO / GRPO alignment | trainer.type: dpo (or kto, orpo, grpo) |
| LoRA / DoRA / VeRA / PiSSA | adapter.type: lora (or dora, vera, lora + init_weights: pissa) |
| 4-bit QLoRA (bitsandbytes) | quantization.bits: 4 |
| torchao + QAT | quantization.backend: torchao |
| Multi-adapter with merging | adapters: dict + merge: block |
| VLM (vision-language) | is_multimodal: true |
model_type: llm
base_model: meta-llama/Llama-3.1-8B
quantization:
  bits: 4
adapter:
  type: lora
prompt:
  template: |
    ### Instruction: {instruction}
    ### Input: {input}
    ### Response:
input_features:
  - name: prompt
    type: text
output_features:
  - name: output
    type: text
trainer:
  type: finetune
  learning_rate: 0.0001
  batch_size: 1
  gradient_accumulation_steps: 16
  epochs: 3
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.01
backend:
  type: local
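The trainer settings above combine a per-step batch size of 1 with 16 gradient accumulation steps, which gives an effective batch of 16 examples per optimizer update. The sketch below works through that arithmetic. The dataset size is a hypothetical number chosen only to make the result concrete, and the warmup figure is a rough estimate: how Ludwig counts and rounds warmup steps is an assumption here, not something the config states.

```python
import math

# Values copied from the trainer block above.
BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 16
EPOCHS = 3
WARMUP_FRACTION = 0.01

# Hypothetical dataset size, used only to make the arithmetic concrete.
NUM_EXAMPLES = 52_000


def schedule(num_examples, batch_size, accumulation_steps, epochs, warmup_fraction):
    """Return (effective batch size, optimizer updates, estimated warmup updates)."""
    effective_batch = batch_size * accumulation_steps
    updates_per_epoch = math.ceil(num_examples / effective_batch)
    total_updates = updates_per_epoch * epochs
    # Assumption: warmup is a fraction of all optimizer updates, rounded up.
    warmup_updates = math.ceil(total_updates * warmup_fraction)
    return effective_batch, total_updates, warmup_updates


print(schedule(NUM_EXAMPLES, BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS,
               EPOCHS, WARMUP_FRACTION))  # prints (16, 9750, 98)
```

Raising `gradient_accumulation_steps` trades wall-clock time for a larger effective batch without increasing per-step memory, which is why it pairs naturally with 4-bit quantization on a single GPU.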
export HUGGING_FACE_HUB_TOKEN="<your_token>"
ludwig train --config model.yaml --dataset "ludwig://alpaca"
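To see what the `prompt.template` setting does conceptually, the sketch below fills the same placeholders from one dataset row using plain `str.format`. This is an illustration only: `render_prompt` is a hypothetical helper, the row is invented, and Ludwig's own substitution logic may differ in details such as whitespace handling.

```python
# Template text mirrors the prompt.template value in the config above.
TEMPLATE = (
    "### Instruction: {instruction}\n"
    "### Input: {input}\n"
    "### Response:"
)


def render_prompt(row, template=TEMPLATE):
    """Fill the template placeholders from a dataset row (dict of column values)."""
    # Extra keys in the row, such as the target column, are simply ignored.
    return template.format(**row)


# Made-up row with the instruction/input/output columns used in the config.
row = {
    "instruction": "Translate to French.",
    "input": "Good morning",
    "output": "Bonjour",
}
print(render_prompt(row))
```

The rendered string is what feeds the `prompt` input feature, while the `output` column remains the training target.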
input_features:
  - name: review_text
    type: text
    encoder:
      type: bert
  - name: star_rating
    type: number
  - name: product_image
    type: image
    encoder:
      type: dinov2
output_features:
  - name: recommended
    type: binary
ludwig train --config model.yaml --dataset reviews.csv
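The config above expects one CSV column per feature. The sketch below writes a tiny example of such a file with the standard library so the expected layout is concrete. The rows are invented, and storing the image as a file path in the `product_image` column is an assumption about how the image feature is supplied.

```python
import csv
import io

# Column names come from the config above; the rows are invented for illustration.
FIELDNAMES = ["review_text", "star_rating", "product_image", "recommended"]
ROWS = [
    {"review_text": "Sturdy and easy to assemble.", "star_rating": 5,
     "product_image": "images/chair_001.jpg", "recommended": True},
    {"review_text": "Arrived scratched.", "star_rating": 2,
     "product_image": "images/desk_014.jpg", "recommended": False},
]


def write_reviews(handle, rows=ROWS):
    """Write a header plus one line per row in the layout the config expects."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_reviews(buffer)
print(buffer.getvalue())
```

To produce an actual file, call `write_reviews(open("reviews.csv", "w", newline=""))` with real rows.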
ludwig generate_config "I have a CSV with age, income, education level, and I want to predict loan default"
ludwig predict --model_path results/experiment_run/model --dataset new_data.csv
ludwig serve --model_path results/experiment_run/model
# POST http://localhost:8000/predict
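A client can call the prediction endpoint with any HTTP library. The sketch below builds the POST request with the standard library and leaves the actual send commented out, so it runs without a server. The field names are hypothetical, and sending inputs as form fields named after the model's input features is an assumption; check the serving documentation for the exact request format.

```python
import urllib.parse
import urllib.request

# Endpoint shown above; assumes `ludwig serve` is running locally.
PREDICT_URL = "http://localhost:8000/predict"


def build_predict_request(features, url=PREDICT_URL):
    """Build a form-encoded POST request carrying one value per input feature."""
    body = urllib.parse.urlencode(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical feature names and values, for illustration only.
request = build_predict_request({"review_text": "Great chair", "star_rating": 5})
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```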
LLM Fine-Tuning
- VLM fine-tuning via is_multimodal: true

Multimodal & Tabular Models
- Timeseries forecasting via the model.forecast() API

Training Infrastructure

Hyperparameter Optimization

Production & Deployment
- Serving (ludwig serve)
- Export: torch.export .pt2 bundles, ONNX
- ludwig upload hf_hub: push model + auto-generated model card

Tooling & Integrations
- ModelInspector: weight enumeration, architecture summary, feature importance proxy
- ludwig.automl.auto_train(): give it a dataset and a time budget; the YAML-driven search space samples encoder/combiner/decoder combinations and validates them before training
- from ludwig.utils.dataset_quality import check_dataset_quality: validates a DataFrame before training (missing values, class imbalance, near-duplicate columns, ID leakage, …)
- OpenMLLoader fetches by task ID and caches locally as Parquet
- ludwig generate_config "describe your task": LLM writes the YAML
- ludwig experiment --k_fold N
- Built-in datasets (ludwig://mnist, ludwig://alpaca, …)

| Use Case | Link |
|---|---|
| LLM instruction tuning (LoRA + QLoRA) | examples/llm |
| DPO / GRPO alignment | examples/llm/alignment |
| Advanced PEFT (PiSSA, OFT, VBLoRA, …) | examples/llms/peft_advanced |
| VLM fine-tuning (LLaVA, Qwen2-VL) | examples/vlm |
| Use Case | Link |
|---|---|
| Binary classification (Titanic) | examples/titanic |
| Tabular classification (census income) | examples/adult_census_income |
| Multimodal classification |
$ claude mcp add ludwig \
-- python -m otcore.mcp_server <graph>