
github.com/business-science/ai-data-science-team @0.0.0.9017 sqlite

676 symbols · 2,404 edges · 46 files · 306 documented (45%)
README

  <img src="https://github.com/business-science/ai-data-science-team/raw/0.0.0.9017/img/ai_data_science_logo.png" alt="AI Data Science Team" width="360">

AI Data Science Team + AI Pipeline Studio


AI Data Science Team

AI Data Science Team is a Python library of specialized agents for common data science workflows, plus a flagship app: AI Pipeline Studio. The Studio turns your work into a visual, reproducible pipeline, while the AI team handles data loading, cleaning, visualization, and modeling.

Status: Beta. Breaking changes may occur until 0.1.0.

Please ⭐ us on GitHub (it takes 2 seconds and means a lot).

AI Pipeline Studio (Flagship App)

AI Pipeline Studio is the main example of the AI Data Science Team in action.


Highlights:

  • Pipeline-first workspace: Visual Editor, Table, Chart, EDA, Code, Model, Predictions, MLflow
  • Manual + AI steps with lineage and reproducible scripts
  • Multi-dataset handling and merge workflows
  • Project saves: metadata-only or full-data
  • Storage footprint controls and rehydrate workflows

Run it:

streamlit run apps/ai-pipeline-studio-app/app.py

Full app docs: apps/ai-pipeline-studio-app/README.md

Quickstart

Requirements

  • Python 3.10+
  • OpenAI API key (or Ollama for local models)
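
A quick preflight check for the requirements above might look like the following. This is a minimal sketch, not part of the library: the function name `check_requirements` is hypothetical, and it assumes the OpenAI key is supplied through the conventional `OPENAI_API_KEY` environment variable.

```python
import os
import sys


def check_requirements(version_info=sys.version_info, environ=os.environ):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration only.
    """
    problems = []
    # The README asks for Python 3.10 or newer.
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10+ is required")
    # Assumption: the key lives in OPENAI_API_KEY. Without it, a local
    # Ollama model is the alternative.
    if not environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set (use Ollama for local models)")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("warning:", problem)
```

Running the file directly prints one warning per unmet requirement and nothing when the environment is ready.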

Install the app and library

Clone the repo and install in editable mode:

pip install -e .

Run the AI Pipeline Studio app

streamlit run apps/ai-pipeline-studio-app/app.py

Library Overview

The repository includes both the AI Pipeline Studio app and the underlying AI Data Science Team library. The library provides agent building blocks and multi-agent workflows for:

  • Data loading and inspection
  • Cleaning, wrangling, and feature engineering
  • Visualization and EDA
  • Modeling and evaluation (H2O + MLflow tools)
  • SQL database interaction

Agents (Snapshot)

Agent examples live in examples/. Notable agents:

  • Data Loader Tools Agent
  • Data Wrangling Agent
  • Data Cleaning Agent
  • Data Visualization Agent
  • EDA Tools Agent
  • Feature Engineering Agent
  • SQL Database Agent
  • H2O ML Agent
  • MLflow Tools Agent
  • Multi-agent workflows (e.g., Pandas Data Analyst, SQL Data Analyst)
  • Supervisor Agent (oversees other agents)
  • Custom tools for data science tasks
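
To illustrate the supervisor idea, here is a toy sketch in plain Python. It is not the library's implementation: the actual Supervisor Agent presumably delegates with the help of an LLM, whereas this stand-in routes by keyword, and every name in it (`ROUTES`, `route`, the keyword choices) is hypothetical.

```python
# Hypothetical keyword table mapping request words to worker labels.
ROUTES = {
    "clean": "Data Cleaning Agent",
    "plot": "Data Visualization Agent",
    "chart": "Data Visualization Agent",
    "sql": "SQL Database Agent",
    "train": "H2O ML Agent",
}


def route(request: str) -> list[str]:
    """Return the workers a request should visit, in order of first mention."""
    workers = []
    for word in request.lower().split():
        # Strip trailing punctuation so "sales." or "data," still match.
        worker = ROUTES.get(word.strip(".,!?"))
        if worker and worker not in workers:
            workers.append(worker)
    return workers
```

For example, `route("Clean the data, then plot sales.")` yields the cleaning worker followed by the visualization worker. A real supervisor would also pass state between workers and decide when to stop.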

Apps

See all apps in apps/. Notable apps:

  • AI Pipeline Studio: apps/ai-pipeline-studio-app/
  • EDA Explorer App: apps/exploratory-copilot-app/
  • Pandas Data Analyst App: apps/pandas-data-analyst-app/

Use OpenAI

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model_name="gpt-4.1-mini",
)

Use Ollama (Local LLM)

# Shell: start the Ollama server and pull a model
ollama serve
ollama pull llama3.1:8b

# Python: connect to the local model
from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="llama3.1:8b",
)
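
Since both clients are constructed the same way, one option is to choose between them at startup. The sketch below is illustrative only: `choose_provider` and `make_llm` are hypothetical names, and the rule "use OpenAI when `OPENAI_API_KEY` is set, otherwise fall back to Ollama" is an assumption, not library behavior. The LangChain imports are deferred so that only the selected backend's package needs to be installed.

```python
import os


def choose_provider(environ=os.environ):
    """Pick a backend name. Assumed rule: OpenAI if a key is present."""
    return "openai" if environ.get("OPENAI_API_KEY") else "ollama"


def make_llm(provider=None):
    """Build a chat model with the same constructors shown above."""
    provider = provider or choose_provider()
    if provider == "openai":
        from langchain_openai import ChatOpenAI  # imported lazily

        return ChatOpenAI(model_name="gpt-4.1-mini")
    if provider == "ollama":
        from langchain_ollama import ChatOllama  # imported lazily

        return ChatOllama(model="llama3.1:8b")
    raise ValueError(f"unknown provider: {provider!r}")
```

Calling `make_llm()` with no argument applies the assumed rule; passing `"openai"` or `"ollama"` explicitly overrides it.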

Next-Gen AI Agentic Workshop

Want to learn how to build AI agents and AI apps for real data science workflows? Join my next‑gen AI workshop: https://learn.business-science.io/ai-register

Core symbols most depended-on inside this repo

  • write (apps/ai-pipeline-studio-app/app.py): called by 37
  • invoke (ai_data_science_team/templates/agent_templates.py): called by 32
  • has (ai_data_science_team/multiagents/supervisor_ds_team.py): called by 32
  • invoke (ai_data_science_team/multiagents/supervisor_ds_team.py): called by 22
  • _get_last_human (ai_data_science_team/multiagents/supervisor_ds_team.py): called by 21
  • _pipeline_studio_get_registry_ui (apps/ai-pipeline-studio-app/app.py): called by 20
  • _pipeline_studio_set_registry_ui (apps/ai-pipeline-studio-app/app.py): called by 17
  • format_agent_name (ai_data_science_team/utils/regex.py): called by 17

Shape

Function 462
Method 184
Class 30

Languages

Python 100%

Modules by API surface

apps/ai-pipeline-studio-app/app.py · 213 symbols
ai_data_science_team/multiagents/supervisor_ds_team.py · 63 symbols
ai_data_science_team/agents/data_visualization_agent.py · 42 symbols
ai_data_science_team/tools/mlflow.py · 28 symbols
ai_data_science_team/ml_agents/h2o_ml_agent.py · 28 symbols
ai_data_science_team/agents/sql_database_agent.py · 25 symbols
ai_data_science_team/templates/agent_templates.py · 24 symbols
ai_data_science_team/agents/data_wrangling_agent.py · 24 symbols
ai_data_science_team/agents/feature_engineering_agent.py · 23 symbols
ai_data_science_team/multiagents/sql_data_analyst.py · 22 symbols
ai_data_science_team/multiagents/pandas_data_analyst.py · 22 symbols
ai_data_science_team/agents/data_cleaning_agent.py · 22 symbols

Dependencies from manifests, versioned

langchain 1.0.0 · 1×
langchain_openai 1.0.0 · 1×
langgraph 1.0.0 · 1×

For agents

$ claude mcp add ai-data-science-team \
  -- python -m otcore.mcp_server <graph>
