
github.com/pycaret/pycaret @v4.0.0a2 sqlite

2,342 symbols · 9,151 edges · 242 files · 895 documented (38%)
README

PyCaret — open-source ML platform

The engine, the control plane, and the UI — all in one self-hosted box.

Vision · Architecture · Roadmap · Spec · Quickstart · Agent guide


⚠ 4.0 is work in progress — you're looking at the v4 branch

PyCaret 4.0 is a ground-up architectural revamp. It lives on the v4 branch. The master branch is still 3.4.0.

Track progress in docs/revamp/STATUS.md and docs/revamp/ROADMAP.md.


What you get

PyCaret is two layers of one product:

The engine (packages/engine/) — pip install pycaret. Config-driven, stateless, sklearn-composable. Use it in a notebook:

from pycaret.tasks import ClassificationExperiment
from pycaret.datasets import get_data

df = get_data("juice")
exp = ClassificationExperiment(target="Purchase", session_id=42).fit(df)
best = exp.compare_models().best
tuned = exp.tune_model(best).pipeline
exp.save_model(tuned, "baseline")

PyCaret Control Plane (services/api/ + apps/web/ + infra/) — the full self-hosted web platform that wraps the engine. Workspaces, projects, datasets, experiments, runs, artifacts, deployments, monitoring, LLM-assisted experiment design. Run it on a laptop, a Docker host, or Kubernetes.

Five deployment modes (three shipped, two roadmapped):

| Mode | For | Status |
| --- | --- | --- |
| Notebook (pip install pycaret) | Data scientist workflow | 4.0.0a1 on PyPI |
| Local dev (uv + npm) | Building against the Control Plane | ✅ shipped |
| Single-server Docker compose | Small-team self-hosted | ✅ shipped |
| Kubernetes + Helm + Terraform | Enterprise cloud | 🔴 V2 (stubs scaffolded) |
| Electron desktop | Analyst, no Docker | 🔴 V2 (stub scaffolded) |

Repo layout

pycaret/                  ← monorepo
├── packages/
│   └── engine/           → `pycaret` on PyPI
├── services/
│   ├── api/              → `pycaret-server` on PyPI (FastAPI)
│   ├── worker/           (V2) background job runner
│   └── deployment-runtime/ (V2) standalone serving
├── apps/
│   ├── web/              React + Vite (Control Plane UI)
│   └── desktop/          (V2) Electron
├── infra/
│   ├── docker/           Dockerfile.api, Dockerfile.ui, compose
│   ├── helm/             (V2) Kubernetes chart
│   └── terraform/        (V2) AWS / GCP / Azure modules
└── docs/revamp/          VISION, SPEC, ARCHITECTURE, ROADMAP, STATUS, DECISIONS

See docs/revamp/ARCHITECTURE.md for the full system architecture.

Try it locally — 3 minutes

Just the engine, in a notebook:

pip install pycaret
# or with every optional extra:
pip install "pycaret[full]"

Supported: Python 3.11 / 3.12 / 3.13.

The full Control Plane, from source:

git clone -b v4 https://github.com/pycaret/pycaret.git
cd pycaret

# Backend (terminal 1)
uv python install 3.13
uv sync --all-packages --all-extras
uv run --package pycaret-server pycaret-server serve --reload

# Frontend (terminal 2)
cd apps/web
npm install
npm run dev
# → http://localhost:3000/setup

Or with Docker (full stack, one command):

docker compose -f infra/docker/docker-compose.yml up --build
# → http://localhost:3000

See docs/revamp/PLATFORM_QUICKSTART.md for the full quickstart.

Engine quickstart

from pycaret.datasets import get_data
from pycaret.tasks import ClassificationExperiment
from pycaret import save_model, load_model

df = get_data("juice")
exp = ClassificationExperiment(target="Purchase", session_id=42).fit(df)

# Compare models — returns a typed CompareResult
result = exp.compare_models()
best = result.best
print(result.leaderboard)

# Tune — returns a TuneResult
tuned = exp.tune_model(best).pipeline

# Predict — returns a PredictResult
preds = exp.predict_model(tuned).predictions

# Save + load
save_model(tuned, "artifacts/best")
restored = load_model("artifacts/best")

Same shape for the other task types:

from pycaret.tasks import (
    RegressionExperiment,
    ClusteringExperiment,
    AnomalyExperiment,
    TimeSeriesExperiment,
)

Introspection — for UIs and LLM agents

from pycaret.api import (
    list_models, describe_model, list_metrics, describe_setup_params,
)

list_models("classification")           # -> list[ModelCard]
describe_model("classification", "lr")  # -> ModelCard
list_metrics("classification")          # -> list[MetricCard]

# UI-form schema — JSON-serializable, renders directly as a dynamic form
schema = describe_setup_params("classification")

The Control Plane UI renders its entire experiment-setup form from describe_setup_params. Zero UI code hard-codes a parameter name.
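
To illustrate why a JSON-serializable schema is enough to drive a form, here is a minimal sketch. The schema layout (a list of dicts with `name`, `type`, `default`, `choices`, `required`) is an assumption made for illustration, since this README does not show what describe_setup_params actually returns. `render_form`, `SAMPLE_SCHEMA`, and `WIDGETS` are hypothetical names, not part of PyCaret.

```python
import json

# Hypothetical schema standing in for the output of describe_setup_params.
# The real structure is not shown in this README; these field names are assumed.
SAMPLE_SCHEMA = [
    {"name": "target", "type": "string", "default": None, "required": True},
    {"name": "session_id", "type": "integer", "default": 42, "required": False},
    {"name": "normalize", "type": "boolean", "default": False, "required": False},
    {"name": "fold_strategy", "type": "enum", "default": "kfold",
     "choices": ["kfold", "stratifiedkfold"], "required": False},
]

# Map each schema type to a generic widget kind; unknown types fall back to text.
WIDGETS = {"string": "text", "integer": "number", "boolean": "checkbox", "enum": "select"}


def render_form(schema):
    """Turn schema entries into widget descriptors without naming any parameter."""
    fields = []
    for param in schema:
        field = {
            "label": param["name"],
            "widget": WIDGETS.get(param["type"], "text"),
            "value": param.get("default"),
            "required": param.get("required", False),
        }
        if "choices" in param:
            field["options"] = list(param["choices"])
        fields.append(field)
    return fields


form = render_form(SAMPLE_SCHEMA)
# The descriptors survive a JSON round trip, so they can be sent to a browser as-is.
print(json.dumps(form, indent=2))
```

The renderer dispatches only on the `type` field, so adding a parameter to the schema adds a form field without touching the rendering code.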

Event stream

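from pycaret.logging import MemoryLogger
from pycaret.tasks import ClassificationExperiment

log = MemoryLogger()
# Print every event as it is emitted
log.subscribe(lambda event: print(event.kind.value, event.message))

# df is any DataFrame with a target column named "y"
exp = ClassificationExperiment(target="y", logger=log).fit(df)
exp.compare_models()   # emits experiment.started → model.compare.finished → ...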

The Control Plane backend subclasses BaseLogger with DBEventLogger — every engine event becomes a DB row and streams live to any connected WebSocket clients.
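
The idea of a logger subclass that persists each event before fanning it out can be sketched with the standard library alone. Everything below is a hypothetical stand-in: `Event`, `SketchLogger`, and `SketchDBLogger` are illustrative names, and the table layout is assumed. This is not the actual BaseLogger or DBEventLogger code.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Event:
    # Hypothetical event shape; the engine's real event class is not shown here.
    kind: str
    message: str


class SketchLogger:
    """Minimal publish/subscribe logger (stand-in for the engine's base logger)."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self._subscribers.append(callback)

    def emit(self, kind: str, message: str) -> None:
        event = Event(kind, message)
        for callback in self._subscribers:
            callback(event)


class SketchDBLogger(SketchLogger):
    """Persists every emitted event as a row, then notifies subscribers."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        super().__init__()
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS events "
            "(id INTEGER PRIMARY KEY, kind TEXT, message TEXT)"
        )

    def emit(self, kind: str, message: str) -> None:
        self._conn.execute(
            "INSERT INTO events (kind, message) VALUES (?, ?)", (kind, message)
        )
        self._conn.commit()
        super().emit(kind, message)


conn = sqlite3.connect(":memory:")
logger = SketchDBLogger(conn)
received = []
logger.subscribe(received.append)  # a WebSocket push would go here instead

logger.emit("experiment.started", "fit called")
logger.emit("model.compare.finished", "leaderboard ready")

rows = conn.execute("SELECT kind, message FROM events ORDER BY id").fetchall()
print(rows)
```

Writing the row before notifying subscribers means a client that connects late can replay history from the table and then follow the live stream.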

What's deliberately not here

  • Module-level functional API (setup, compare_models) — use OOP Experiment classes.
  • External experiment trackers: mlflow, comet-ml, wandb, dagshub — the Control Plane owns this story now.
  • Distributed backends: fugue, dask, ray (V3 opt-in).
  • Visualization: yellowbrick, mljar-scikit-plot, schemdraw — Plotly-only rewrite in progress.
  • In-engine deployment helpers: create_api, create_app, create_docker, dashboard, convert_model, deploy_model — the Control Plane owns serving + deployment.
  • Drift / fairness in the engine: check_drift, check_fairness — moved to the monitoring layer.

See docs/revamp/KILL_LIST.md for the exhaustive list.

Who this is for

  • Data scientists who want AutoML in a notebook without vendor lock-in.
  • ML engineers who want an open-source control plane they can self-host — train, deploy, monitor, improve.
  • Small teams (≤20 people) who need the whole loop without Databricks licenses.
  • Enterprises who need SSO + audit logs + multi-cloud deployment in the same repo they started prototyping with.
  • LLM agents that introspect and drive ML experiments — every model, metric, and parameter is a serializable dataclass.
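
As a rough illustration of the last point, a dataclass can be turned into JSON with nothing but the standard library, which is what makes it easy for an agent or a UI to consume. `SketchModelCard` and its fields are hypothetical; the real ModelCard definition is not shown in this README, and the `dt` entry is only illustrative.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class SketchModelCard:
    # Hypothetical fields; the real ModelCard definition is not shown in this README.
    id: str
    name: str
    task: str
    hyperparameters: Dict[str, str] = field(default_factory=dict)


def to_json(cards: List[SketchModelCard]) -> str:
    """Serialize cards into a JSON document an agent or UI can consume."""
    return json.dumps([asdict(card) for card in cards], sort_keys=True)


cards = [
    SketchModelCard("lr", "Logistic Regression", "classification", {"C": "float"}),
    SketchModelCard("dt", "Decision Tree", "classification", {"max_depth": "int"}),
]
payload = to_json(cards)
decoded = json.loads(payload)
print([card["id"] for card in decoded])
```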

See docs/revamp/VISION.md for the product statement.

Licensing

  • Engine (packages/engine/) is MIT.
  • Platform packages (services/*, apps/*) are dual-licensed MIT OR BUSL-1.1. Self-host freely; the BUSL-1.1 grant covers multi-tenant hosted commercialisation, auto-converting to MIT/Apache-2.0 after 3 years. See docs/revamp/DECISIONS.md for rationale.

Contributing

PyCaret is under active revamp. Read AGENTS.md (for AI agents) and CONTRIBUTING.md (for humans). Bug reports are welcome; large feature PRs should be discussed in an issue first.

Extension points (exported contracts: how you extend this code)

  • RunExplainerCardProps (Interface) · apps/web/src/components/RunExplainerCard.tsx · no doc
  • EventStreamProps (Interface) · apps/web/src/components/EventStream.tsx · no doc
  • PredictTesterProps (Interface) · apps/web/src/components/PredictTester.tsx · no doc
  • DeploymentReviewModalProps (Interface) · apps/web/src/components/DeploymentReviewModal.tsx · no doc
  • LeaderboardProps (Interface) · apps/web/src/components/Leaderboard.tsx · no doc

Core symbols (most depended-on inside this repo)

  • info · called by 498 · packages/engine/pycaret/internal/logging.py
  • get · called by 315 · services/api/pycaret_server/serving.py
  • get_logger · called by 121 · packages/engine/pycaret/internal/logging.py
  • fit · called by 74 · packages/engine/pycaret/core/experiment.py
  • _check_soft_dependencies · called by 73 · packages/engine/pycaret/utils/_dependencies.py
  • np_list_arange · called by 71 · packages/engine/pycaret/utils/generic.py
  • leftover_parameters_to_categorical_distributions · called by 66 · packages/engine/pycaret/containers/models/base_model.py
  • warning · called by 66 · packages/engine/pycaret/internal/logging.py

Shape

Method 1,127
Function 731
Class 362
Route 69
Interface 53

Languages

Python 94%
TypeScript 6%

Modules by API surface

packages/engine/pycaret/containers/models/time_series.py · 164 symbols
packages/engine/pycaret/time_series/forecasting/oop.py · 83 symbols
packages/engine/pycaret/internal/pycaret_experiment/tabular_experiment.py · 74 symbols
packages/engine/pycaret/utils/generic.py · 67 symbols
packages/engine/pycaret/internal/pycaret_experiment/supervised_experiment.py · 66 symbols
packages/engine/pycaret/containers/models/regression.py · 63 symbols
packages/engine/pycaret/internal/distributions.py · 55 symbols
packages/engine/pycaret/internal/pipeline.py · 54 symbols
packages/engine/pycaret/internal/preprocess/transformers.py · 53 symbols
packages/engine/pycaret/containers/models/classification.py · 51 symbols
packages/engine/pycaret/core/experiment.py · 46 symbols
apps/web/src/api/types.ts · 37 symbols

Dependencies (from manifests, versioned)

@tanstack/react-query 5.60.0 · 1×
@testing-library/dom 10.4.0 · 1×
@testing-library/react 16.1.0 · 1×
@testing-library/user-event 14.5.0 · 1×
@types/node 22.9.0 · 1×
@types/react 18.3.0 · 1×
@types/react-dom 18.3.0 · 1×
@types/react-plotly.js 2.6.0 · 1×
@typescript-eslint/eslint-plugin 8.13.0 · 1×
@vitejs/plugin-react 4.3.0 · 1×

For agents

$ claude mcp add pycaret \
  -- python -m otcore.mcp_server <graph>
