
github.com/PriorLabs/TabPFN @v8.0.8 sqlite


2,154 symbols 9,306 edges 203 files 1,444 documented · 67%

README

TabPFN

TabPFN Summary

Quick Start

Interactive Notebook Tutorial

[!TIP]

Dive right in with our interactive Colab notebook! It's the best way to get a hands-on feel for TabPFN, walking you through installation, classification, and regression examples.

Installation

pip install tabpfn

Note: For best performance on Apple Silicon/MPS, consider installing a PyTorch version newer than the nightly "2.13.0.dev20260510". This enables flash attention without relying on MLX, which requires a GPU-CPU-GPU roundtrip.

Basic Usage

⚡ GPU Recommended: For optimal performance, use a GPU (even older ones with ~8GB VRAM work well; 16GB needed for some large datasets). On CPU, only small datasets (≲1000 samples) are feasible. No GPU? Use our free hosted inference via TabPFN Client.

To use our default TabPFN-3 model:

from tabpfn import TabPFNClassifier, TabPFNRegressor

clf = TabPFNClassifier()
clf.fit(X_train, y_train)  # downloads checkpoint on first use
predictions = clf.predict(X_test)

reg = TabPFNRegressor()
reg.fit(X_train, y_train)  # downloads checkpoint on first use
predictions = reg.predict(X_test)

To use other model versions (e.g. the previous default, TabPFN-2.6):

from tabpfn import TabPFNClassifier, TabPFNRegressor
from tabpfn.constants import ModelVersion

classifier = TabPFNClassifier.create_default_for_version(ModelVersion.V2_6)
regressor = TabPFNRegressor.create_default_for_version(ModelVersion.V2_6)

For complete examples, see the tabpfn_for_binary_classification.py, tabpfn_for_multiclass_classification.py, and tabpfn_for_regression.py files.

TabPFN Ecosystem

Choose the right TabPFN implementation for your needs:

TabPFN Client Simple API client for using TabPFN via cloud-based inference.
TabPFN Extensions Community extensions and integrations, including:
interpretability: Gain insights with SHAP-based explanations, feature importance, and selection tools.
unsupervised: Tools for outlier detection and synthetic tabular data generation.
embeddings: Extract and use TabPFN's internal learned embeddings for downstream tasks or analysis.
many_class: Handle multi-class classification problems that exceed TabPFN's built-in class limit.

To install: pip install tabpfn-extensions

TabPFN (this repo) Core implementation for fast and local inference with PyTorch and CUDA support.
TabPFN UX No-code graphical interface to explore TabPFN capabilities—ideal for business users and prototyping.

License

The TabPFN-2.5, TabPFN-2.6, and TabPFN-3 model weights are released under non-commercial licenses (TabPFN-3 license; see the Models page for prior releases). TabPFN-3 is used by default.

The code and TabPFN-2 model weights are licensed under Prior Labs License (Apache 2.0 with additional attribution requirement): here. To use the v2 model weights, instantiate your model as follows:

from tabpfn import TabPFNRegressor
from tabpfn.constants import ModelVersion

tabpfn_v2 = TabPFNRegressor.create_default_for_version(ModelVersion.V2)

Enterprise & Production

For high-throughput or massive-scale production environments, we offer an Enterprise Edition with the following capabilities:
- Fast Inference Mode: A proprietary distillation engine that converts TabPFN into a compact MLP or tree ensemble, delivering orders-of-magnitude lower latency for real-time applications.
- Commercial Support: Includes a Commercial Enterprise License for production use-cases, dedicated integration support, and access to private high-speed inference engines.

To learn more or request a commercial license, please contact us at sales@priorlabs.ai.

Join Our Community

We're building the future of tabular machine learning and would love your involvement:

Connect & Learn:
Join our Discord Community
Read our Documentation
Check out GitHub Issues
Contribute:
Report bugs or request features
Share your research and use cases
Submit pull requests — please open an issue first (see below)
Stay Updated: Star the repo and join Discord for the latest updates

[!IMPORTANT] Open an issue before starting work on a PR.

If there's a feature you'd like to add or a bug you've found, please open a GitHub issue with a high-level sketch of your plan. This lets us give feedback on the approach before you invest the effort, saving everyone time and increasing the chance your change lands.

There are many reasons a PR may not be mergeable — design fit, scope, compatibility, planned refactors, etc. — and these are often hard to spot from the outside, especially for a first-time contributor.

Citation

You can read our paper explaining TabPFNv2 here, and the model report of TabPFN-2.5 here.

BibTeX

@misc{grinsztajn2025tabpfn,
  title={TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models},
  author={Léo Grinsztajn and Klemens Flöge and Oscar Key and Felix Birkel and Philipp Jund and Brendan Roof and
          Benjamin Jäger and Dominik Safaric and Simone Alessi and Adrian Hayler and Mihir Manium and Rosen Yu and
          Felix Jablonski and Shi Bin Hoo and Anurag Garg and Jake Robertson and Magnus Bühler and Vladyslav Moroshan and
          Lennart Purucker and Clara Cornu and Lilly Charlotte Wehrhahn and Alessandro Bonetto and
          Bernhard Schölkopf and Sauraj Gambhir and Noah Hollmann and Frank Hutter},
  year={2025},
  eprint={2511.08667},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2511.08667},
}

@article{hollmann2025tabpfn,
 title={Accurate predictions on small data with a tabular foundation model},
 author={Hollmann, Noah and M{\"u}ller, Samuel and Purucker, Lennart and
         Krishnakumar, Arjun and K{\"o}rfer, Max and Hoo, Shi Bin and
         Schirrmeister, Robin Tibor and Hutter, Frank},
 journal={Nature},
 year={2025},
 month={01},
 day={09},
 doi={10.1038/s41586-024-08328-6},
 publisher={Springer Nature},
 url={https://www.nature.com/articles/s41586-024-08328-6},
}

@inproceedings{hollmann2023tabpfn,
  title={TabPFN: A transformer that solves small tabular classification problems in a second},
  author={Hollmann, Noah and M{\"u}ller, Samuel and Eggensperger, Katharina and Hutter, Frank},
  booktitle={International Conference on Learning Representations 2023},
  year={2023}
}

Usage Tips

Use batch prediction mode: Each predict call re-processes the training set, so calling predict on 100 samples one at a time is almost 100 times slower and more expensive than a single call on all 100. If the test set is very large, split it into chunks of 1000 samples each.
Avoid data preprocessing: Do not apply data scaling or one-hot encoding when feeding data to the model.
Use a GPU: TabPFN is slow to execute on a CPU. Ensure a GPU is available for better performance.
Mind the dataset size: TabPFN works best on datasets within its recommended size limits. The current default (TabPFN-3) supports up to 1,000,000 × 200, 100,000 × 2,000, or 1,000 × 20,000 (rows × features) — larger feature counts trade off against row capacity. See the Models page for the limits of other checkpoints.
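The batch prediction tip above can be sketched as a small helper. This is a minimal illustration, not part of TabPFN: `predict_in_chunks` is a hypothetical name, and it only assumes that the test data supports `len()` and slicing (lists, NumPy arrays) and that the prediction function returns one result per row.

```python
from typing import Callable, List, Sequence


def predict_in_chunks(
    predict_fn: Callable[[Sequence], Sequence],
    X_test: Sequence,
    chunk_size: int = 1000,
) -> List:
    """Call predict_fn once per chunk of rows and join the results in order.

    Hypothetical helper: it is not provided by the tabpfn package.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    predictions: List = []
    for start in range(0, len(X_test), chunk_size):
        # Each call sees up to chunk_size rows instead of a single row.
        chunk = X_test[start : start + chunk_size]
        predictions.extend(predict_fn(chunk))
    return predictions
```

With a fitted estimator this would be called as `predict_in_chunks(clf.predict, X_test)`. A test set of 2,500 rows then costs three predict calls rather than 2,500.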

❓ FAQ

Usage & Compatibility

Q: What dataset sizes work best with TabPFN?

Recommended row and feature limits vary by checkpoint — see the Models page for the per-release limits. As a quick reference, the current default (TabPFN-3) supports up to 1,000,000 × 200, 100,000 × 2,000, or 1,000 × 20,000 (rows × features); larger feature counts trade off against row capacity. The previous default (TabPFN-2.6) is recommended for up to 100,000 rows and 2,000 features. If your dataset exceeds the recommended limits for your checkpoint, you can subsample, set ignore_pretraining_limits=True to push past the size guardrail, or upgrade to a release with a higher limit.
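As a rough pre-flight check, the three TabPFN-3 figures quoted above can be encoded as (rows, features) envelopes. This sketch assumes each figure is an independent rectangular limit, which may be stricter than the library's actual guardrail for shapes that fall between them. The names `TABPFN3_LIMITS` and `within_recommended_limits` are illustrative and not part of the tabpfn package.

```python
from typing import List, Tuple

# (max_rows, max_features) pairs quoted in this README for TabPFN-3.
TABPFN3_LIMITS: List[Tuple[int, int]] = [
    (1_000_000, 200),
    (100_000, 2_000),
    (1_000, 20_000),
]


def within_recommended_limits(
    n_rows: int,
    n_features: int,
    limits: List[Tuple[int, int]] = TABPFN3_LIMITS,
) -> bool:
    """Return True if the shape fits inside at least one envelope.

    Assumption: envelopes are treated as plain rectangles; the real
    size check in TabPFN may be defined differently.
    """
    return any(
        n_rows <= max_rows and n_features <= max_features
        for max_rows, max_features in limits
    )
```

If the check fails, the options listed above apply: subsample, set ignore_pretraining_limits=True, or switch to a checkpoint with higher limits.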

Q: Why can't I use TabPFN with Python 3.9?

TabPFN requires Python 3.10+ due to newer language features. Compatible versions: 3.10, 3.11, 3.12, 3.13, 3.14.
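A quick way to confirm the interpreter meets this requirement before installing is sketched below. `is_supported_python` is a hypothetical helper, and it only checks the 3.10 lower bound stated above.

```python
import sys


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True for Python 3.10 or newer (hypothetical helper)."""
    return tuple(version_info[:2]) >= (3, 10)


if not is_supported_python():
    print("TabPFN needs Python 3.10+, found", sys.version.split()[0])
```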

Installation & Setup

Q: How do I get access to TabPFN-2.5 / TabPFN-2.6 / TabPFN-3?

On first use, TabPFN will automatically open a browser window where you can log in via PriorLabs and accept the license terms. Your authentication token is cached locally so you only need to do this once.

For headless / CI environments where a browser is not available, visit https://ux.priorlabs.ai, go to the License tab to accept the license, and then set the TABPFN_TOKEN environment variable with a token obtained from your account.

If access via the browser-based flow is not an option for you, please contact us at sales@priorlabs.ai.
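For the headless case, the token can also be set from Python rather than the shell. The sketch below uses only the TABPFN_TOKEN variable named above. `configure_headless_auth` is a hypothetical helper, and it assumes the variable must be set before TabPFN reads its settings, so the safest place to call it is before importing tabpfn.

```python
import os


def configure_headless_auth(token: str) -> None:
    """Expose a PriorLabs token to TabPFN through the environment.

    Hypothetical helper; assumes TabPFN picks up TABPFN_TOKEN from the
    process environment, as described in this README.
    """
    if not token:
        raise ValueError("token must be a non-empty string")
    os.environ["TABPFN_TOKEN"] = token
```

In CI, the token would typically come from the platform's secret store rather than being written into the source.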

Q: How do I use TabPFN without an internet connection?

TabPFN automatically downloads model weights when first used. For offline usage:

Using the Provided Download Script

If you have the TabPFN repository, you can use the included script to download all models (including ensemble variants):

# After installing TabPFN
python scripts/download_all_models.py

This script will download the main classifier and regressor models, as well as all ensemble variant models to your system's default cache directory.

Manual Download

1. Download the model files manually from HuggingFace:
   - Classifier: tabpfn-v3-classifier-v3_default.ckpt
   - Regressor: tabpfn-v3-regressor-v3_default.ckpt
2. Place the file in one of these locations:
   - Specify directly: TabPFNClassifier(model_path="/path/to/model.ckpt")
   - Set environment variable: export TABPFN_MODEL_CACHE_DIR="/path/to/dir" (see environment variables FAQ below)
   - Default OS cache directory:
     - Windows: %APPDATA%\tabpfn\
     - macOS: ~/Library/Caches/tabpfn/
     - Linux: ~/.cache/tabpfn/
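To find out where a manually downloaded checkpoint should go on a given machine, the locations listed above can be resolved in code. This sketch mirrors the README text only: `expected_cache_dir` is a hypothetical function, the precedence of TABPFN_MODEL_CACHE_DIR over the OS default is an assumption, and the library's own lookup may consider other settings.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def expected_cache_dir(
    platform: str = sys.platform,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Return the cache directory described in this README.

    Hypothetical helper; it does not call into tabpfn.
    """
    env = os.environ if env is None else env
    override = env.get("TABPFN_MODEL_CACHE_DIR")
    if override:
        # Assumed to take precedence over the OS default.
        return Path(override)
    if platform.startswith("win"):
        return Path(env["APPDATA"]) / "tabpfn"
    if platform == "darwin":
        return Path.home() / "Library" / "Caches" / "tabpfn"
    return Path.home() / ".cache" / "tabpfn"


print(expected_cache_dir())
```

Copying the .ckpt files into the printed directory, or passing model_path explicitly, avoids the download on first use.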

Q: I'm getting a pickle error when loading the model. What should I do?

Try the following:
- Download the newest version of tabpfn: pip install tabpfn --upgrade
- Ensure model files downloaded correctly (re-download if needed)

Q: What environment variables can I use to configure TabPFN?

TabPFN uses Pydantic settings for configuration, supporting environment variables and .env files:

Authentication:
- TABPFN_TOKEN: Provide a PriorLabs authentication token directly (useful for headless/CI environments). Obtain one from https://ux.priorlabs.ai.
- TABPFN_NO_BROWSER: Set to disable automatic browser-based login (e.g. in environments where opening a browser is undesirable).

Model Configuration:
- TABPFN_MODEL_CACHE_DIR: Custom directory for caching downloaded TabPFN models (default: platform-specific user cache directory)
- TABPFN_ALLOW_CPU_LARGE_DATASET: Allow running TabPFN on CPU with large datasets (>1000 samples). Set to true to override the CPU limitation. Note: This will be very slow!

PyTorch Settings:
- PYTORCH_CUDA_ALLOC_CONF: PyTorch CUDA memory allocation configuration to optimize GPU memory usage (default: max_split_size_mb:512). See PyTorch CUDA documentation for more information.

Example:

export TABPFN_MODEL_CACHE_DIR="/path/to/models

Core symbols most depended-on inside this repo

indices_for

called by 107

src/tabpfn/preprocessing/datamodel.py

predict

called by 77

src/tabpfn/regressor.py

fit

called by 72

src/tabpfn/regressor.py

fit_transform

called by 72

src/tabpfn/preprocessing/label_encoder.py

called by 59

src/tabpfn/architectures/kv_cache.py

get

called by 51

src/tabpfn/inference.py

predict_proba

called by 46

src/tabpfn/classifier.py

fit

called by 43

src/tabpfn/classifier.py

Shape

Function 989

Method 874

Class 253

Route 38

Languages

Python 100%

Modules by API surface

src/tabpfn/architectures/tabpfn_v3.py: 83 symbols

tests/test_browser_auth.py: 71 symbols

src/tabpfn/inference.py: 61 symbols

tests/test_preprocessing/test_ensemble.py: 51 symbols

tests/test_classifier_interface.py: 47 symbols

tests/test_regressor_interface.py: 42 symbols

src/tabpfn/classifier.py: 42 symbols

src/tabpfn/architectures/tabpfn_v2_sf.py: 41 symbols

src/tabpfn/architectures/base/bar_distribution.py: 40 symbols

src/tabpfn/preprocessing/torch/steps.py: 39 symbols

src/tabpfn/misc/_sklearn_compat.py: 37 symbols

src/tabpfn/misc/debug_versions.py: 36 symbols

Dependencies from manifests, versioned

einops 0.4.0 · 1×

huggingface-hub 0.23.0 · 1×

joblib 1.2.0 · 1×

numpy 1.21.6 · 1×

pandas 1.4.0 · 1×

pydantic 2.8.0 · 1×

pydantic-settings 2.10.1 · 1×

safetensors 0.4.0 · 1×

scikit-learn 1.2.0 · 1×

scipy 1.11.1 · 1×

torch 2.5 · 1×

tqdm 4.66.0 · 1×

For agents

$ claude mcp add TabPFN \
  -- python -m otcore.mcp_server <graph>
