20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale.
✅ From scratch implementations ✅ No abstractions ✅ Beginner friendly ✅ Flash attention ✅ FSDP ✅ LoRA, QLoRA, Adapter ✅ Reduce GPU memory (fp4/8/16/32) ✅ 1-1000+ GPUs/TPUs ✅ 20+ LLMs
Over 340,000 developers use Lightning Cloud - purpose-built for PyTorch and PyTorch Lightning.
- GPUs from $0.19.
- Clusters: frontier-grade training/inference clusters.
- AI Studio (vibe train): workspaces where AI helps you debug, tune and vibe train.
- AI Studio (vibe deploy): workspaces where AI helps you optimize and deploy models.
- Notebooks: Persistent GPU workspaces where AI helps you code and analyze.
- Inference: Deploy models as inference APIs.
Every LLM is implemented from scratch with no abstractions, giving you full control and keeping the models fast, minimal, and performant at enterprise scale.
✅ Enterprise ready - Apache 2.0 for unlimited enterprise use.
✅ Developer friendly - Easy debugging with no abstraction layers and single-file implementations.
✅ Optimized performance - Models designed to maximize performance, reduce costs, and speed up training.
✅ Proven recipes - Highly optimized training/finetuning recipes tested at enterprise scale.
Install LitGPT
pip install 'litgpt[extra]'
Load and use any of the 20+ LLMs:
from litgpt import LLM
llm = LLM.load("microsoft/phi-2")
text = llm.generate("Fix the spelling: Every fall, the family goes to the mountains.")
print(text)
# Corrected Sentence: Every fall, the family goes to the mountains.
✅ Optimized for fast inference
✅ Quantization
✅ Runs on low-memory GPUs
✅ No layers of internal abstractions
✅ Optimized for production scale
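The quick-start call above can be wrapped in a small helper so the prompt format lives in one place. This is a minimal sketch, not part of LitGPT: `fix_spelling`, `TextGenerator`, and `EchoLLM` are hypothetical names introduced here, and the helper only assumes that the object passed in has a `generate(prompt)` method returning a string, as `llm.generate` does in the quick start.

```python
from typing import Protocol


class TextGenerator(Protocol):
    """Anything with generate(prompt) -> str, such as the llm object in the quick start."""

    def generate(self, prompt: str) -> str: ...


def fix_spelling(llm: TextGenerator, sentence: str) -> str:
    """Build the 'Fix the spelling' prompt from the quick start and return the reply."""
    prompt = f"Fix the spelling: {sentence}"
    return llm.generate(prompt).strip()


class EchoLLM:
    """Hypothetical offline stand-in for testing; it is NOT a LitGPT class."""

    def generate(self, prompt: str) -> str:
        # Echo the prompt back so the helper can be exercised without a model.
        return f"[echo] {prompt}"


if __name__ == "__main__":
    # Swap EchoLLM() for LLM.load("microsoft/phi-2") to call a real model.
    print(fix_spelling(EchoLLM(), "Every fall, the family goes to the mountains."))
```

Keeping the helper duck-typed means it can be tested without downloading weights, then pointed at a loaded model unchanged.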
Advanced install options
Install from source:
git clone https://github.com/Lightning-AI/litgpt
cd litgpt
# if using uv
uv sync --all-extras
# if using pip
pip install -e ".[extra,compiler,test]"
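After either install route, it may help to confirm which version of the package the current environment sees. The sketch below uses only the Python standard library; `installed_version` is a hypothetical helper name, and the distribution name `litgpt` is taken from the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "litgpt") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("litgpt is not installed in this environment")
    else:
        print(f"litgpt {found} is installed")
```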
Explore the full Python API docs.
Every model is written from scratch to maximize performance and remove layers of abstraction:
| Model | Model size | Author | Reference |
|---|---|---|---|
| Llama 3, 3.1, 3.2, 3.3 | 1B, 3B, 8B, 70B, 405B | Meta AI | Meta AI 2024 |
| Code Llama | 7B, 13B, 34B, 70B | Meta AI | Rozière et al. 2023 |
| CodeGemma | 7B | Google Team, Google DeepMind | |
| Gemma 2 | 2B, 9B, 27B | Google Team, Google DeepMind | |
| Phi 4 | 14B | Microsoft Research | Abdin et al. 2024 |
| Qwen2.5 | 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B | Alibaba Group | Qwen Team 2024 |
| Qwen2.5 Coder | 0.5B, 1.5B, 3B, 7B, 14B, 32B | Alibaba Group | Hui, Binyuan et al. 2024 |
| R1 Distill Llama | 8B, 70B | DeepSeek AI | DeepSeek AI 2025 |
| ... | ... | ... | ... |
See full list of 20+ LLMs
| Model | Model size | Author | Reference |
|---|---|---|---|
| CodeGemma | 7B | Google Team, Google DeepMind | |
| Code Llama | 7B, 13B, 34B, 70B | Meta AI | Rozière et al. 2023 |
| Falcon | 7B, 40B, 180B | TII UAE | TII 2023 |
| Falcon 3 | 1B, 3B, 7B, 10B | TII UAE | TII 2024 |
| FreeWilly2 (Stable Beluga 2) | 70B | Stability AI | Stability AI 2023 |
| Function Calling Llama 2 | 7B | Trelis | Trelis et al. 2023 |
| Gemma | 2B, 7B | Google Team, Google DeepMind | |
| Gemma 2 | 2B, 9B, 27B | Google Team, Google DeepMind | |
| Gemma 3 | 1B, 4B, 12B, 27B | Google Team, Google DeepMind | |
| Llama 2 | 7B, 13B, 70B | Meta AI | Touvron et al. 2023 |
| Llama 3.1 | 8B, 70B, 405B | Meta AI | Meta AI 2024 |
| Llama 3.2 | 1B, 3B | Meta AI | Meta AI 2024 |
| Llama 3.3 | 70B | Meta AI | Meta AI 2024 |
| Mathstral | 7B | Mistral AI | Mistral AI 2024 |
| MicroLlama | 300M | Ken Wang | MicroLlama repo |
| Mixtral MoE | 8x7B | Mistral AI | Mistral AI 2023 |
| Mistral | 7B, 123B | Mistral AI | Mistral AI 2023 |
| Mixtral MoE | 8x22B | Mistral AI | Mistral AI 2024 |
| OLMo | 1B, 7B | Allen Institute for AI (AI2) | Groeneveld et al. 2024 |
| OpenLLaMA | 3B, 7B, 13B | OpenLM Research | Geng & Liu 2023 |
| Phi 1.5 & 2 | 1.3B, 2.7B | Microsoft Research | Li et al. 2023 |
| Phi 3 | 3.8B | Microsoft Research | Abdin et al. 2024 |
| Phi 4 | 14B | Microsoft Research | Abdin et al. 2024 |
| Phi 4 Mini Instruct | 3.8B | Microsoft Research | Microsoft 2025 |
| Phi 4 Mini Reasoning | 3.8B | Microsoft Research | Xu, Peng et al. 2025 |
| Phi 4 Reasoning | 14B | Microsoft Research | Abdin et al. 2025 |
| Phi 4 Reasoning Plus | 14B | Microsoft Research | Abdin et al. 2025 |
| Platypus | 7B, 13B, 70B | Lee et al. | Lee, Hunter, and Ruiz 2023 |
| Pythia | {14,31,70,160,410}M, {1,1.4,2.8,6.9,12}B | EleutherAI | Biderman et al. 2023 |
| Qwen2.5 | 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B | Alibaba Group | Qwen Team 2024 |
| Qwen2.5 Coder | 0.5B, 1.5B, 3B, 7B, 14B, 32B | Alibaba Group | Hui, Binyuan et al. 2024 |
| Qwen2.5 1M (Long Context) | 7B, 14B | Alibaba Group | Qwen Team 2025 |
| Qwen2.5 Math | 1.5B, 7B, 72B | Alibaba Group | An, Yang et al. 2024 |
| QwQ | 32B | Alibaba Group | Qwen Team 2025 |
| QwQ-Preview | 32B | Alibaba Group | Qwen Team 2024 |
| Qwen3 | 0.6B, 1.7B, 4B{Hybrid, Thinking-2507, Instruct-2507}, 8B, 14B, 32B | Alibaba Group | Qwen Team 2025 |
| Qwen3 MoE | 30B{Hybrid, Thinking-2507, Instruct-2507}, 235B{Hybrid, Thinking-2507, Instruct-2507} | Alibaba Group | Qwen Team 2025 |
| R1 Distill Llama | 8B, 70B | DeepSeek AI | DeepSeek AI 2025 |
| SmolLM2 | 135M, 360M, 1.7B | Hugging Face | Hugging Face 2024 |
| Salamandra | 2B, 7B | Barcelona Supercomputing Centre | BSC-LTC 2024 |
| StableCode | 3B | Stability AI | Stability AI 2023 |
| StableLM | 3B, 7B | Stability AI | Stability AI 2023 |
| StableLM Zephyr | 3B | Stability AI | Stability AI 2023 |
| TinyLlama | 1.1B | Zhang et al. | Zhang et al. 2023 |
Tip: You can list all available models by running the litgpt download list command.
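To relate the model sizes in the table to the precisions mentioned in the feature list (fp4/8/16/32), a rough rule of thumb is parameters times bytes per parameter. The sketch below is an approximation for the weights alone: it ignores activations, the KV cache, optimizer state, and any quantization metadata. `weight_memory_gb` and the `BITS_PER_PARAM` mapping are hypothetical names for illustration, not LitGPT APIs.

```python
# Assumed bit widths for the precision labels used in the feature list.
BITS_PER_PARAM = {"fp32": 32, "fp16": 16, "fp8": 8, "fp4": 4}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    bits = BITS_PER_PARAM[precision]
    return num_params * bits / 8 / 1e9


if __name__ == "__main__":
    # Example: a 7B-parameter model at each precision.
    for precision in BITS_PER_PARAM:
        print(f"7B @ {precision}: ~{weight_memory_gb(7e9, precision):.1f} GB")
```

By this estimate a 7B model needs about 14 GB for fp16 weights and about 3.5 GB at 4 bits. Actual usage during training or inference will be higher.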
Finetune • Pretrain • Continued pretraining • Evaluate • Deploy • Test
Use the command line interface to run advanced workflows such as pretraining or finetuning on your own data.
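When the CLI is driven from a script, one option is to assemble the command as an argument list and print it before running it. The sketch below assumes a finetuning run on a JSON dataset; `finetune_command` is a hypothetical helper, the file and directory names are placeholders, and the flags shown (`--data JSON`, `--data.json_path`, `--out_dir`) should be checked against `litgpt finetune --help` for your installed version.

```python
import shlex
from typing import List


def finetune_command(model: str, json_path: str, out_dir: str) -> List[str]:
    """Assemble (but do not run) a litgpt finetune command for a JSON dataset."""
    return [
        "litgpt", "finetune", model,
        "--data", "JSON",
        "--data.json_path", json_path,
        "--out_dir", out_dir,
    ]


if __name__ == "__main__":
    # Placeholder paths; replace with your own dataset and output directory.
    cmd = finetune_command("microsoft/phi-2", "my_custom_dataset.json", "out/custom-model")
    print(shlex.join(cmd))
    # To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell-quoting problems when paths contain spaces.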
After installing LitGPT, select the model and workflow to run (finetune, pretrain, evaluat
$ claude mcp add litgpt \
-- python -m otcore.mcp_server <graph>