
github.com/ludwig-ai/ludwig @v0.17.6 sqlite

8,753 symbols · 34,689 edges · 804 files · 2,439 documented (28%)
README

Declarative deep learning framework for LLMs, multimodal models, and tabular AI.




What is Ludwig?

Ludwig is a declarative deep learning framework that lets you train, fine-tune, and deploy AI models — from LLM fine-tuning to tabular classification — using a YAML config file and zero boilerplate Python.

# Fine-tune Llama-3.1 with LoRA in one config file
model_type: llm
base_model: meta-llama/Llama-3.1-8B
adapter:
  type: lora
trainer:
  type: finetune
  epochs: 3
input_features:
  - name: instruction
    type: text
output_features:
  - name: response
    type: text

ludwig train --config model.yaml --dataset my_data.csv
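
The same run can also be driven from Python. The sketch below is a minimal example, assuming the `LudwigModel` class in `ludwig/api.py` accepts a config dictionary and exposes `train(dataset=...)` as in earlier Ludwig releases. The helper names `build_config` and `run_training` are hypothetical, and the return value of `train` may differ between versions.

```python
def build_config() -> dict:
    """Return the YAML config above as a plain dictionary."""
    return {
        "model_type": "llm",
        "base_model": "meta-llama/Llama-3.1-8B",
        "adapter": {"type": "lora"},
        "trainer": {"type": "finetune", "epochs": 3},
        "input_features": [{"name": "instruction", "type": "text"}],
        "output_features": [{"name": "response", "type": "text"}],
    }


def run_training(dataset_path: str):
    """Hypothetical helper: train with the config above (needs ludwig installed)."""
    # Imported lazily so the config can be built and inspected without Ludwig.
    from ludwig.api import LudwigModel

    model = LudwigModel(config=build_config())
    # Assumption: train() accepts a dataset path, as the CLI does.
    return model.train(dataset=dataset_path)
```

Calling `run_training("my_data.csv")` would correspond to the `ludwig train` command above, provided the assumptions hold for the installed version.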

Tech stack: Python 3.12 · PyTorch 2.7+ · Pydantic 2 · Transformers 5 · Ray 2.54

Ludwig is hosted by the Linux Foundation AI & Data.


What's New in Ludwig 0.16

| Feature | Description |
| --- | --- |
| PatchTST & N-BEATS encoders | State-of-the-art timeseries forecasting encoders with MASE/sMAPE metrics |
| Advanced PEFT adapters | PiSSA, EVA, CorDA/LoftQ initializers; TinyLoRA, OFT, HRA, WaveFT, LN-Tuning, VBLoRA, C3A adapter types |
| VLM fine-tuning | Train LLaVA, Qwen2-VL, InternVL via is_multimodal: true with gated cross-attention |
| HyperNetwork combiner | Conditioning-based feature fusion: one feature generates weights for others |
| Nash-MTL & Pareto-MTL | Game-theoretic and preference-based multi-task loss balancing |
| LLM config generation | ludwig generate_config "describe your task": an LLM writes the YAML for you |
| ModelInspector | Architecture analysis, weight collection, feature importance proxy |
| Ray Serve & KServe | Distributed and Kubernetes-native model deployment shims |
| GRPO alignment | Reward-model-free RLHF via Group Relative Policy Optimization |
| torchao quantization + QAT | PyTorch-native int4/int8/float8 with Quantization-Aware Training |
| Multi-adapter PEFT | Multiple named LoRA adapters with weighted merging (TIES, DARE, SVD) |
| Native Optuna executor | GP/TPE/CMA-ES samplers, pruning, resumable SQLite/PostgreSQL storage |
| Timeseries forecasting | model.forecast(dataset, horizon=N) API with TimeseriesOutputFeature |
| Muon & ScheduleFreeAdamW | New optimizers for large-scale pretraining and fine-tuning |
| Image segmentation decoders | UNet, SegFormer, FPN decoders for semantic segmentation |

Installation

pip install ludwig           # core
pip install ludwig[full]     # all optional dependencies
pip install ludwig[llm]      # LLM fine-tuning only

Requires Python 3.12+. See contributing for a full dependency matrix.


Quick Start

Fine-tune an LLM (instruction tuning)


Ludwig supports the full LLM fine-tuning spectrum:

| Technique | Config key |
| --- | --- |
| Supervised fine-tuning (SFT) | trainer.type: finetune |
| DPO / KTO / ORPO / GRPO alignment | trainer.type: dpo (or kto, orpo, grpo) |
| LoRA / DoRA / VeRA / PiSSA | adapter.type: lora (or dora, vera, lora + init_weights: pissa) |
| 4-bit QLoRA (bitsandbytes) | quantization.bits: 4 |
| torchao + QAT | quantization.backend: torchao |
| Multi-adapter with merging | adapters: dict + merge: block |
| VLM (vision-language) | is_multimodal: true |

model_type: llm
base_model: meta-llama/Llama-3.1-8B

quantization:
  bits: 4

adapter:
  type: lora

prompt:
  template: |
    ### Instruction: {instruction}
    ### Input: {input}
    ### Response:

input_features:
  - name: prompt
    type: text

output_features:
  - name: output
    type: text

trainer:
  type: finetune
  learning_rate: 0.0001
  batch_size: 1
  gradient_accumulation_steps: 16
  epochs: 3
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.01

backend:
  type: local

export HUGGING_FACE_HUB_TOKEN="<your_token>"
ludwig train --config model.yaml --dataset "ludwig://alpaca"
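
To make the `prompt.template` block concrete, here is a minimal sketch of how the `{instruction}` and `{input}` placeholders could be filled from one dataset row. It uses plain `str.format` and is only an illustration: Ludwig's own template rendering is not shown in this README and may differ, and `render_prompt` and the sample row are hypothetical.

```python
# Same text as the YAML block scalar above (a `|` scalar keeps the final newline).
TEMPLATE = (
    "### Instruction: {instruction}\n"
    "### Input: {input}\n"
    "### Response:\n"
)


def render_prompt(row: dict) -> str:
    """Fill the template from a row; columns without a placeholder are ignored."""
    return TEMPLATE.format(**row)


# Made-up row with the column names the template refers to.
row = {
    "instruction": "Translate to French",
    "input": "Good morning",
    "output": "Bonjour",
}
print(render_prompt(row))
```

The `output` column is not part of the prompt; in the config above it is the target the model learns to generate.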

Train a multimodal classifier

input_features:
  - name: review_text
    type: text
    encoder:
      type: bert
  - name: star_rating
    type: number
  - name: product_image
    type: image
    encoder:
      type: dinov2

output_features:
  - name: recommended
    type: binary

ludwig train --config model.yaml --dataset reviews.csv
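
The feature names in the config must match column names in the dataset. The sketch below writes a tiny `reviews.csv`-style file with those four columns. The rows are invented for illustration, and storing the image as a file path per row is an assumption about how the image column is laid out.

```python
import csv
import io

# Column names mirror the feature names in the config above.
FIELDS = ["review_text", "star_rating", "product_image", "recommended"]

# Invented rows, for illustration only.
ROWS = [
    {"review_text": "Sturdy and easy to assemble", "star_rating": 5,
     "product_image": "images/chair_001.jpg", "recommended": True},
    {"review_text": "Arrived with a cracked leg", "star_rating": 1,
     "product_image": "images/chair_002.jpg", "recommended": False},
]


def write_reviews(handle) -> None:
    """Write a header row followed by the sample rows."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(ROWS)


buffer = io.StringIO()
write_reviews(buffer)
print(buffer.getvalue())
```

Replacing `buffer` with `open("reviews.csv", "w", newline="")` would write the file to disk.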

Generate a config from natural language

ludwig generate_config "I have a CSV with age, income, education level, and I want to predict loan default"

Make predictions

ludwig predict --model_path results/experiment_run/model --dataset new_data.csv

Launch a REST API

ludwig serve --model_path results/experiment_run/model
# POST http://localhost:8000/predict
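
A client for the endpoint can be sketched with the standard library alone. This is a hedged example: it assumes `/predict` accepts form fields named after the model's input features, as in earlier Ludwig releases, and the field names and `build_predict_request` helper are hypothetical. Check the server's own documentation for the exact request format.

```python
import urllib.parse
import urllib.request


def build_predict_request(fields: dict, base_url: str = "http://localhost:8000"):
    """Build (but do not send) a form-encoded POST to the /predict endpoint."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical field names; use the input feature names of your model.
request = build_predict_request({"review_text": "Great chair", "star_rating": 5})

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```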

Capabilities

LLM Fine-Tuning

  • Supervised fine-tuning (SFT) on instruction/response pairs
  • Alignment training: DPO, KTO, ORPO, GRPO (reward-model-free RLHF)
  • PEFT adapters: LoRA, DoRA, VeRA, LoRA+, TinyLoRA, OFT, HRA, WaveFT, LN-Tuning, VBLoRA, C3A
  • LoRA initializers: PiSSA, EVA, CorDA, LoftQ for improved convergence
  • Multi-adapter PEFT: multiple named adapters on one base model, switchable at runtime; merge with TIES, DARE, SVD, magnitude pruning
  • Quantization: 4-bit/8-bit QLoRA (bitsandbytes), torchao int4/int8/float8 with QAT
  • VLM fine-tuning: LLaVA, Qwen2-VL, InternVL via is_multimodal: true
  • Sequence packing for efficient training on variable-length inputs
  • Paged and 8-bit optimizers for memory-efficient training
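
To give a sense of why LoRA-style adapters are cheap to train, the sketch below counts trainable parameters under the standard LoRA formulation, where a frozen weight of shape d_out x d_in gets two trainable low-rank factors of shapes d_out x r and r x d_in. The layer size and rank are illustrative values, not Ludwig defaults.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA pair (B: d_out x r, A: r x d_in)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix."""
    return d_in * d_out


# Illustrative numbers: a 4096 x 4096 projection with rank 8.
added = lora_trainable_params(4096, 4096, 8)
frozen = full_params(4096, 4096)
print(f"trainable: {added:,} of {frozen:,} ({added / frozen:.4%})")
```

With these numbers the adapter trains 65,536 parameters against 16,777,216 frozen ones, roughly 0.39% of the layer.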

Multimodal & Tabular Models

  • Input modalities: text, numbers, categories, binary, sets, bags, sequences, images, audio, timeseries, vectors, dates
  • Text encoders: any HuggingFace Transformer (BERT, RoBERTa, ModernBERT, Qwen3, Llama-3.1, etc.), plus Mamba-2, Jamba
  • Image encoders: DINOv2, ConvNeXt, EfficientNet, ViT, CAFormer, ConvFormer, PoolFormer, TIMM (1000+ models)
  • Timeseries encoders: PatchTST, N-BEATS, CNN, RNN, Transformer; MASE and sMAPE metrics; model.forecast() API
  • Combiners: concat, transformer, tab_transformer, FT-Transformer, TabNet, TabPFN v2, HyperNetwork, ProjectAggregate, GatedFusion, Perceiver
  • Multi-task learning: multiple output features in a single model; Nash-MTL, Pareto-MTL, FAMO, GradNorm, uncertainty loss balancing
  • Image segmentation: UNet, SegFormer, FPN decoders
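
For readers unfamiliar with the two timeseries metrics named above, here is a minimal sketch using their common textbook definitions (sMAPE with the mean of absolute values in the denominator, MASE scaled by the one-step naive forecast on the training series). Ludwig's implementation may use a different variant; the function names and sample numbers are hypothetical.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2
        # Treat a 0/0 term as zero error to avoid dividing by zero.
        terms.append(0.0 if denom == 0 else abs(f - a) / denom)
    return 100 * sum(terms) / len(terms)


def mase(actual, forecast, train):
    """Mean absolute scaled error against a one-step naive forecast."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_mae = sum(
        abs(train[i] - train[i - 1]) for i in range(1, len(train))
    ) / (len(train) - 1)
    return mae / naive_mae


train_series = [10, 12, 14, 16]
actual = [18, 20]
forecast = [17, 21]
print(smape(actual, forecast), mase(actual, forecast, train_series))
```

A MASE below 1 means the forecast beats the naive "repeat the last value" baseline on average.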

Training Infrastructure

  • Distributed training: HuggingFace Accelerate with DDP, FSDP, DeepSpeed (zero-code changes)
  • Ray backend: training across a Ray cluster, larger-than-memory datasets via Ray Data
  • Automatic batch size selection and learning rate range test
  • Mixed precision (fp16/bf16), gradient checkpointing, gradient accumulation
  • Optimizers: AdamW, Adafactor, SGD, Muon, ScheduleFreeAdamW, Lion, paged/8-bit variants
  • Learning rate schedulers: cosine, linear, polynomial, reduce-on-plateau, OneCycleLR
  • Model Soup: uniform and greedy checkpoint averaging for better generalization at zero inference cost
  • Modality dropout for robust multimodal models
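
The uniform variant of Model Soup mentioned above amounts to averaging each parameter across checkpoints. The sketch below shows that idea on plain Python lists. It is not Ludwig's implementation, and `uniform_soup` and the toy checkpoints are hypothetical.

```python
def uniform_soup(checkpoints):
    """Average each named parameter element-wise across checkpoints."""
    count = len(checkpoints)
    soup = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter from every checkpoint.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        soup[name] = [sum(values) / count for values in columns]
    return soup


# Toy checkpoints: parameter name -> flat list of weights.
checkpoints = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
print(uniform_soup(checkpoints))
```

Because the result is a single set of weights, inference cost is the same as for one checkpoint. A greedy soup would instead add checkpoints one at a time and keep each only if a validation metric improves.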

Hyperparameter Optimization

  • Executors: Ray Tune (ASHA, PBT, Bayesian) and native Optuna (auto/GP/TPE/CMA-ES)
  • Optuna persistence: SQLite or PostgreSQL for resumable HPO runs
  • Pruning with Optuna's MedianPruner and HyperbandPruner
  • Search spaces: uniform, log-uniform, choice, randint, quantized
  • Full Ludwig config is searchable — any nested parameter can be a hyperparameter
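
The idea that any nested parameter can be searched can be sketched as dotted keys mapped to search spaces, sampled and written back into the config. The key names (`space`, `lower`, `upper`, `categories`) and the dotted-path convention are assumptions for illustration, not a statement of Ludwig's hyperopt schema, and the helpers are hypothetical.

```python
import math
import random


def sample(space: dict, rng: random.Random):
    """Draw one value from a search-space description."""
    kind = space["space"]
    if kind == "uniform":
        return rng.uniform(space["lower"], space["upper"])
    if kind == "loguniform":
        low, high = math.log(space["lower"]), math.log(space["upper"])
        return math.exp(rng.uniform(low, high))
    if kind == "choice":
        return rng.choice(space["categories"])
    if kind == "randint":
        # Assumption: upper bound is exclusive.
        return rng.randrange(space["lower"], space["upper"])
    raise ValueError(f"unsupported space: {kind}")


def set_nested(config: dict, dotted_key: str, value) -> None:
    """Write value at a dotted path, creating intermediate dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


parameters = {
    "trainer.learning_rate": {"space": "loguniform", "lower": 1e-5, "upper": 1e-3},
    "adapter.type": {"space": "choice", "categories": ["lora", "dora"]},
}
config = {"trainer": {"epochs": 3}}
rng = random.Random(0)  # fixed seed so the trial is reproducible
for dotted_key, space in parameters.items():
    set_nested(config, dotted_key, sample(space, rng))
print(config)
```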

Production & Deployment

  • REST API: FastAPI server with Prometheus metrics and structured logging (ludwig serve)
  • vLLM serving: OpenAI-compatible API with PagedAttention and continuous batching
  • Ray Serve: distributed deployment with auto-scaling and traffic splitting
  • KServe: Kubernetes-native deployment with Open Inference Protocol v2
  • Model export: SafeTensors (default), torch.export .pt2 bundles, ONNX
  • HuggingFace Hub: ludwig upload hf_hub — push model + auto-generated model card
  • Docker: prebuilt containers at ludwigai/ludwig

Tooling & Integrations

  • Experiment tracking: TensorBoard, Weights & Biases, Comet ML, MLflow, Aim Stack
  • Model inspection: ModelInspector — weight enumeration, architecture summary, feature importance proxy
  • Visualizations: learning curves, confusion matrices, calibration plots, ROC curves, hyperopt analysis
  • AutoML: ludwig.automl.auto_train() — give it a dataset and a time budget; the YAML-driven search space samples encoder/combiner/decoder combinations and validates them before training
  • Dataset quality checks: from ludwig.utils.dataset_quality import check_dataset_quality — validates a DataFrame before training (missing values, class imbalance, near-duplicate columns, ID leakage, …)
  • OpenML integration: load any OpenML task directly — OpenMLLoader fetches by task ID and caches locally as Parquet
  • LLM config generation: ludwig generate_config "describe your task" — LLM writes the YAML
  • K-fold cross-validation: ludwig experiment --k_fold N
  • Dataset Zoo: 70+ built-in benchmark datasets (ludwig://mnist, ludwig://alpaca, …)

Examples

LLM & Alignment

| Use Case | Link |
| --- | --- |
| LLM instruction tuning (LoRA + QLoRA) | examples/llm |
| DPO / GRPO alignment | examples/llm/alignment |
| Advanced PEFT (PiSSA, OFT, VBLoRA, …) | examples/llms/peft_advanced |
| VLM fine-tuning (LLaVA, Qwen2-VL) | examples/vlm |

Tabular & Multimodal

| Use Case | Link |
| --- | --- |
| Binary classification (Titanic) | examples/titanic |
| Tabular classification (census income) | examples/adult_census_income |
| Multimodal classification | |

Core symbols most depended-on inside this repo

get · called by 551 · ludwig/data/types.py
category_feature · called by 265 · tests/integration_tests/utils.py
items · called by 253 · ludwig/utils/registry.py
from_dict · called by 228 · ludwig/data/types.py
generate_data · called by 220 · tests/integration_tests/utils.py
text_feature · called by 144 · tests/integration_tests/utils.py
train · called by 133 · ludwig/api.py
to_dict · called by 126 · ludwig/data/types.py

Shape

Method 4,195
Function 2,907
Class 1,519
Route 132

Languages

Python 100%

Modules by API surface

ludwig/utils/tokenizers.py · 245 symbols
ludwig/encoders/text_encoders.py · 151 symbols
tests/ultra_slow/test_ultra_slow.py · 149 symbols
ludwig/modules/metric_modules.py · 119 symbols
ludwig/schema/llms/peft.py · 98 symbols
ludwig/features/number_feature.py · 84 symbols
ludwig/features/image_feature.py · 81 symbols
ludwig/modules/loss_modules.py · 79 symbols
ludwig/schema/utils.py · 78 symbols
ludwig/utils/data_utils.py · 74 symbols
ludwig/schema/optimizers.py · 71 symbols
ludwig/modules/convolutional_modules.py · 71 symbols

Dependencies from manifests, versioned

PyYAML 6.0 · 1×
absl-py
kaggle
numpy 1.24 · 1×
pandas 2.0 · 1×
py-cpuinfo
requests 2.28 · 1×
scikit-learn 1.3 · 1×
scipy 1.10 · 1×
sentencepiece 0.2 · 1×
spacy 2.3 · 1×
tabulate 0.9 · 1×

For agents

$ claude mcp add ludwig \
  -- python -m otcore.mcp_server <graph>
