github.com/huggingface/ml-intern @main sqlite

2,233 symbols 8,129 edges 174 files 577 documented · 26%

README

<a href="https://github.com/huggingface/ml-intern/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-Apache_2.0-blue.svg"></a>
<a href="https://smolagents-ml-intern.hf.space/"><img alt="Website" src="https://img.shields.io/website/https/smolagents-ml-intern.hf.space.svg?down_color=red&down_message=offline&up_message=online"></a>

ML Intern

An ML intern that autonomously researches, writes, and ships good-quality ML-related code using the Hugging Face ecosystem, with deep access to docs, papers, datasets, and cloud compute.

Quick Start

Installation

git clone git@github.com:huggingface/ml-intern.git
cd ml-intern
uv sync
uv tool install -e .

That's it. Now `ml-intern` works from any directory:

ml-intern

Create a .env file in the project root (or export these in your shell):

HF_TOKEN=<your-hugging-face-token> # HF Router inference + Hub actions
GITHUB_TOKEN=<github-personal-access-token>

All API-based model calls go through Hugging Face Inference Providers, so your HF_TOKEN must be allowed to make Inference Provider calls. If no HF_TOKEN is set, the CLI will prompt you to paste one on first launch unless you start on a local model. To get a GITHUB_TOKEN follow the tutorial here. See the local models section below for instructions on using agents that run on your hardware.

Usage

Interactive mode (start a chat session):

ml-intern

Headless mode (single prompt, auto-approve):

ml-intern "fine-tune llama on my dataset"

Options:

ml-intern --sandbox-tools "your prompt"                         # use HF Space sandbox tools
ml-intern --max-iterations 100 "your prompt"
ml-intern --no-stream "your prompt"
# Change model
ml-intern --model moonshotai/Kimi-K2.7-Code:novita "your prompt"
ml-intern --model openai/gpt-5.5:fal-ai "your prompt"

Run ml-intern, then enter /model to see the full list of suggested model ids (Claude, GPT, HF Router models such as MiniMax, Kimi, GLM, and DeepSeek, plus local model prefixes).

Hosted inference is billed to the active Hugging Face user. See the Local models section below to run ml-intern with local models instead.

Local models

Local model support uses OpenAI-compatible HTTP endpoints through LiteLLM. The agent does not load model weights directly from disk; start your inference server first, then select it with a provider-specific model prefix:

ml-intern --model ollama/llama3.1:8b "your prompt"
ml-intern --model vllm/meta-llama/Llama-3.1-8B-Instruct "your prompt"

Inside interactive mode, switch with /model:

/model ollama/llama3.1:8b
/model lm_studio/google/gemma-3-4b
/model llamacpp/llama-3.1-8b-instruct

Supported local prefixes are ollama/, vllm/, lm_studio/, and llamacpp/.

LOCAL_LLM_BASE_URL=http://localhost:8000
LOCAL_LLM_API_KEY=<optional-local-api-key>

Set LOCAL_LLM_BASE_URL and optional LOCAL_LLM_API_KEY to use one shared local endpoint, or override a specific provider with its matching *_BASE_URL / *_API_KEY variable, such as OLLAMA_BASE_URL or VLLM_API_KEY. Provider-specific variables take precedence over the shared local variables. Base URLs may include or omit /v1.
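The precedence rule above can be sketched in a few lines of Python. This is an illustration only, not the CLI's actual code: `resolve_local_endpoint` is a hypothetical helper, the `<PREFIX>_BASE_URL` / `<PREFIX>_API_KEY` naming is inferred from the OLLAMA_BASE_URL and VLLM_API_KEY examples, and normalizing every URL to end in /v1 is an assumption about how "include or omit /v1" might be handled.

```python
import os

# Prefixes listed in this README. The env-var naming rule below is inferred
# from the OLLAMA_BASE_URL / VLLM_API_KEY examples and is an assumption.
LOCAL_PREFIXES = ("ollama", "vllm", "lm_studio", "llamacpp")


def resolve_local_endpoint(model_id, env=None):
    """Return (base_url, api_key) for a local model id such as 'ollama/llama3.1:8b'."""
    env = os.environ if env is None else env
    prefix = model_id.split("/", 1)[0]
    if prefix not in LOCAL_PREFIXES:
        raise ValueError(f"not a local model id: {model_id!r}")
    provider = prefix.upper()
    # Provider-specific variables win over the shared LOCAL_LLM_* variables.
    base_url = env.get(f"{provider}_BASE_URL") or env.get("LOCAL_LLM_BASE_URL")
    api_key = env.get(f"{provider}_API_KEY") or env.get("LOCAL_LLM_API_KEY")
    if base_url:
        # Accept URLs with or without /v1 and normalize to a single form
        # (assumed behavior, chosen here for illustration).
        base_url = base_url.rstrip("/")
        if not base_url.endswith("/v1"):
            base_url += "/v1"
    return base_url, api_key
```

With LOCAL_LLM_BASE_URL=http://localhost:8000 and OLLAMA_BASE_URL=http://localhost:11434/v1 both set, an ollama/ model resolves to the Ollama URL while a vllm/ model falls back to the shared one.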

CLI tool runtime:

By default, the CLI runs bash, read, write, and edit on your local filesystem. To use HF Space sandbox tools instead, including sandbox_create, opt in with --sandbox-tools:

ml-intern --sandbox-tools "test this training script in a GPU sandbox"
ml-intern --model llamacpp/ggml-org/gemma-3-1b-it-GGUF --sandbox-tools

Sandbox tool runtime requires HF_TOKEN, even when the selected model is local, because it creates private HF Spaces. You can also make sandbox tools your CLI default in ~/.config/ml-intern/cli_agent_config.json:

{ "tool_runtime": "sandbox" }

Use the default local runtime when you want tools to inspect or edit files in your checkout. Use sandbox runtime when you want the agent to create or replace an HF Space sandbox, test code remotely, or request GPU sandbox hardware before launching larger HF Jobs.
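If you prefer to script the config change rather than edit the JSON by hand, a minimal sketch could look like the following. `set_cli_option` is a hypothetical helper, not part of ml-intern; it only merges one key into the file and keeps whatever else is already there.

```python
import json
from pathlib import Path


def set_cli_option(config_path, key, value):
    """Merge one key into a JSON config file, preserving any existing keys."""
    path = Path(config_path).expanduser()
    config = {}
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
    config[key] = value
    # Create ~/.config/ml-intern/ (or any other parent) if it is missing.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `set_cli_option("~/.config/ml-intern/cli_agent_config.json", "tool_runtime", "sandbox")` would make sandbox tools the default without touching other settings in that file.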

Sharing Traces

Every session is auto-uploaded to your own private Hugging Face dataset in Claude Code JSONL format, which the HF Agent Trace Viewer auto-detects so you can browse turns, tool calls, and model responses directly on the Hub.

By default the dataset is named {your-hf-username}/ml-intern-sessions and is created private. You can flip it to public from inside the CLI:

/share-traces            # show current visibility + dataset URL
/share-traces public     # publish (anyone can view)
/share-traces private    # lock it back down

You can also flip visibility from the dataset page on huggingface.co — the agent honours whatever you set there for subsequent uploads.

To opt out entirely, set the following in your CLI config (e.g. configs/cli_agent_config.json or ~/.config/ml-intern/cli_agent_config.json):

{ "share_traces": false }

To override the destination repo, set:

{ "personal_trace_repo_template": "{hf_user}/my-custom-traces" }

The shared smolagents/ml-intern-sessions dataset is unrelated and only receives anonymized telemetry rows used by the backend KPI scheduler.
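To make the interaction between share_traces and personal_trace_repo_template concrete, here is a small sketch. `resolve_trace_repo` is a hypothetical function, and treating the default as the template "{hf_user}/ml-intern-sessions" is an assumption based on the default dataset name described above.

```python
# Assumed default, mirroring the "{your-hf-username}/ml-intern-sessions" naming.
DEFAULT_TEMPLATE = "{hf_user}/ml-intern-sessions"


def resolve_trace_repo(config, hf_user):
    """Return the dataset repo id for trace uploads, or None when sharing is off."""
    if config.get("share_traces", True) is False:
        return None
    template = config.get("personal_trace_repo_template", DEFAULT_TEMPLATE)
    # The {hf_user} placeholder is filled with the logged-in username.
    return template.format(hf_user=hf_user)
```

Under these assumptions, an empty config for user alice resolves to alice/ml-intern-sessions, and the override shown above resolves to alice/my-custom-traces.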

Supported Gateways

ML Intern currently supports one-way notification gateways from CLI sessions. These gateways send out-of-band status updates; they do not accept inbound chat messages.

Slack

Slack notifications use the Slack Web API to post messages when the agent needs approval, hits an error, or completes a turn. Create a Slack app with a bot token that has chat:write, invite the bot to the target channel, then set:

SLACK_BOT_TOKEN=xoxb-...
SLACK_CHANNEL_ID=C...

The CLI automatically creates a slack.default destination when both variables are present. Optional environment variables for the env-only default:

ML_INTERN_SLACK_NOTIFICATIONS=false
ML_INTERN_SLACK_DESTINATION=slack.ops
ML_INTERN_SLACK_AUTO_EVENTS=approval_required,error,turn_complete
ML_INTERN_SLACK_ALLOW_AGENT_TOOL=true
ML_INTERN_SLACK_ALLOW_AUTO_EVENTS=true

For a persistent user-level config, put overrides in ~/.config/ml-intern/cli_agent_config.json or point ML_INTERN_CLI_CONFIG at a JSON file:

{
  "messaging": {
    "enabled": true,
    "auto_event_types": ["approval_required", "error", "turn_complete"],
    "destinations": {
      "slack.ops": {
        "provider": "slack",
        "token": "${SLACK_BOT_TOKEN}",
        "channel": "${SLACK_CHANNEL_ID}",
        "allow_agent_tool": true,
        "allow_auto_events": true
      }
    }
  }
}
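The config above uses ${SLACK_BOT_TOKEN}-style placeholders. The sketch below shows one way such placeholders could be expanded from the environment and how a resolved destination maps onto a Slack Web API chat.postMessage request. Both `expand_env` and `build_slack_request` are hypothetical helpers written for illustration; how ml-intern itself performs the expansion is not shown here, and the request is only built, never sent.

```python
import json
import os
import re
import urllib.request

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Recursively replace ${NAME} placeholders in strings, dicts, and lists."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # Unknown variables are left untouched so misconfiguration stays visible.
        return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {k: expand_env(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v, env) for v in value]
    return value  # booleans, numbers, None pass through unchanged


def build_slack_request(destination, text):
    """Build (but do not send) a chat.postMessage request for one destination."""
    body = json.dumps({"channel": destination["channel"], "text": text}).encode("utf-8")
    return urllib.request.Request(
        "https://slack.com/api/chat.postMessage",
        data=body,
        headers={
            "Authorization": f"Bearer {destination['token']}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would post the message; the bot still needs the chat:write scope and channel membership described above.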

Architecture

Component Overview

┌─────────────────────────────────────────────────────────────┐
│                         User/CLI                            │
└────────────┬─────────────────────────────────────┬──────────┘
             │ Operations                          │ Events
             ↓ (user_input, exec_approval,         ↑
      submission_queue  interrupt, compact, ...)  event_queue
             │                                          │
             ↓                                          │
┌────────────────────────────────────────────────────┐  │
│            submission_loop (agent_loop.py)         │  │
│  ┌──────────────────────────────────────────────┐  │  │
│  │  1. Receive Operation from queue             │  │  │
│  │  2. Route to handler (run_agent/compact/...) │  │  │
│  └──────────────────────────────────────────────┘  │  │
│                      ↓                             │  │
│  ┌──────────────────────────────────────────────┐  │  │
│  │         Handlers.run_agent()                 │  ├──┤
│  │                                              │  │  │
│  │  ┌────────────────────────────────────────┐  │  │  │
│  │  │  Agentic Loop (max 300 iterations)     │  │  │  │
│  │  │                                        │  │  │  │
│  │  │  ┌──────────────────────────────────┐  │  │  │  │
│  │  │  │ Session                          │  │  │  │  │
│  │  │  │  ┌────────────────────────────┐  │  │  │  │  │
│  │  │  │  │ ContextManager             │  │  │  │  │  │
│  │  │  │  │ • Message history          │  │  │  │  │  │
│  │  │  │  │   (litellm.Message[])      │  │  │  │  │  │
│  │  │  │  │ • Auto-compaction (170k)   │  │  │  │  │  │
│  │  │  │  │ • Session upload to HF     │  │  │  │  │  │
│  │  │  │  └────────────────────────────┘  │  │  │  │  │
│  │  │  │                                  │  │  │  │  │
│  │  │  │  ┌────────────────────────────┐  │  │  │  │  │
│  │  │  │  │ ToolRouter                 │  │  │  │  │  │
│  │  │  │  │  ├─ HF docs & research     │  │  │  │  │  │
│  │  │  │  │  ├─ HF repos, datasets,    │  │  │  │  │  │
│  │  │  │  │  │  jobs, papers           │  │  │  │  │  │
│  │  │  │  │  ├─ GitHub code search     │  │  │  │  │  │
│  │  │  │  │  ├─ Sandbox & local tools  │  │  │  │  │  │
│  │  │  │  │  ├─ Planning               │  │  │  │  │  │
│  │  │  │  │  └─ MCP server tools       │  │  │  │  │  │
│  │  │  │  └────────────────────────────┘  │  │  │  │  │
│  │  │  └──────────────────────────────────┘  │  │  │  │
│  │  │                                        │  │  │  │
│  │  │  ┌──────────────────────────────────┐  │  │  │  │
│  │  │  │ Doom Loop Detector               │  │  │  │  │
│  │  │  │ • Detects repeated tool patterns │  │  │  │  │
│  │  │  │ • Injects corrective prompts     │  │  │  │  │
│  │  │  └──────────────────────────────────┘  │  │  │  │
│  │  │                                        │  │  │  │
│  │  │  Loop:                                 │  │  │  │
│  │  │    1. LLM call (litellm.acompletion)   │  │  │  │
│  │  │       ↓                                │  │  │  │
│  │  │    2. Parse tool_calls[]               │  │  │  │
│  │  │       ↓                                │  │  │  │
│  │  │    3. Approval check                   │  │  │  │
│  │  │       (jobs, sandbox, destructive ops) │  │  │  │
│  │  │       ↓                                │  │  │  │
│  │  │    4. Execute via ToolRouter           │  │  │  │
│  │  │       ↓                                │  │  │  │
│  │  │    5. Add results to ContextManager    │  │  │  │
│  │  │       ↓                                │  │  │  │
│  │  │    6. Repeat if tool_calls exist       │  │  │  │
│  │  └────────────────────────────────────────┘  │  │  │
│  └──────────────────────────────────────────────┘  │  │
└────────────────────────────────────────────────────┴──┘
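The Doom Loop Detector box in the diagram describes two behaviors: spotting repeated tool patterns and injecting a corrective prompt. A minimal sketch of that idea follows. The class body, the threshold of three identical calls, and the prompt wording are all assumptions for illustration; the real detector in the repo may use different heuristics.

```python
import json
from collections import deque


class DoomLoopDetector:
    """Flag a tool call repeated with identical arguments several times in a row."""

    def __init__(self, threshold=3, window=10):
        self.threshold = threshold
        self.recent = deque(maxlen=window)  # most recent call signatures

    def observe(self, tool_name, arguments):
        """Record a call; return a corrective prompt if it looks like a loop, else None."""
        # sort_keys makes the signature independent of argument order.
        signature = (tool_name, json.dumps(arguments, sort_keys=True))
        self.recent.append(signature)
        tail = list(self.recent)[-self.threshold:]
        if len(tail) == self.threshold and len(set(tail)) == 1:
            return (
                f"You have called {tool_name} with the same arguments "
                f"{self.threshold} times in a row. Try a different approach."
            )
        return None
```

In a loop like the one in the diagram, a non-None return value would be appended to the message history before the next LLM call.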

Agentic Loop Flow

User Message
     ↓
[Add to ContextManager]
     ↓
     ╔═══════════════════════════════════════════╗
     ║      Iteration Loop (max 300)             ║
     ║                                           ║
     ║  Get messages + tool specs                ║
     ║         ↓                                 ║
     ║  litellm.acompletion()                    ║
     ║         ↓                                 ║
     ║  Has tool_calls? ──No──> Done             ║
     ║         │                                 ║
     ║        Yes                                ║
     ║         ↓                                 ║
     ║  Add assistant msg (with tool_calls)      ║
     ║         ↓                                 ║
     ║  Doom loop check                          ║
     ║         ↓                                 ║
     ║  For each tool_call:                      ║
     ║    • Needs approval? ──Yes──> Wait for    ║
     ║    │                         user confirm ║
     ║    No                                     ║
     ║    ↓                                      ║
     ║    • ToolRouter.execute_tool()            ║
     ║    • Add result to ContextManager         ║
     ║         ↓                                 ║
     ║  Continue loop ─────────────────┐         ║
     ║         ↑                       │         ║
     ║         └───────────────────────┘         ║
     ╚═══════════════════════════════════════════╝

Events

The agent emits the following events via event_queue:

processing - Starting to process user input
ready - Agent is ready for input
assistant_chunk - Streaming token chunk
assistant_message - Complete LLM response text
assistant_stream_end - Token stream finished
tool_call - Tool being called with arguments
tool_output - Tool execution result
tool_log - Informational tool log message
tool_state_change - Tool execution state transition
approval_required - Requesting user approval for sensitive operations
turn_complete - Agent finished processing
error - Error occurred during processing
interrupted - Agent was interrupted
compacted - Context was compacted
undo_complete - Undo operation completed
shutdown - Agent shutting down
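A consumer of event_queue might dispatch on these event names roughly as follows. The sketch assumes each event is a dict with "event_type" and "data" keys and that the queue is an asyncio.Queue; both are illustrative guesses, and `consume_events` is a hypothetical function rather than part of the repo.

```python
import asyncio

# Event names copied from the list above.
EVENT_TYPES = frozenset({
    "processing", "ready", "assistant_chunk", "assistant_message",
    "assistant_stream_end", "tool_call", "tool_output", "tool_log",
    "tool_state_change", "approval_required", "turn_complete", "error",
    "interrupted", "compacted", "undo_complete", "shutdown",
})


async def consume_events(event_queue, handlers):
    """Dispatch events to handlers until 'shutdown' arrives; return the types seen."""
    seen = []
    while True:
        event = await event_queue.get()
        event_type = event["event_type"]
        if event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {event_type!r}")
        seen.append(event_type)
        handler = handlers.get(event_type)
        if handler is not None:
            handler(event.get("data"))
        if event_type == "shutdown":
            return seen
```

A UI could, for instance, register a handler for assistant_chunk to print streamed tokens and another for approval_required to prompt the user.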

Development

Pre-commit Checks

Run Ruff before every commit:

uv run ruff check .
uv run ruff format --check .

If the format check fails, run uv run ruff format . and re-run the checks before committing.

Adding Built-in Tools

Edit agent/core/tools.py:

```python
def create_builtin_tools() -> list[ToolSpec]:
    return [
        ToolSpec(
            name="your_tool",
            description="What your tool does",
            parameters={
                "type": "object",
                "properties": {
                    "param": {"type": "string", "description": "Parameter description"}
                },
                "required": ["param"],
            },
            handler=your_async_handler,
        ),
        # ... existing tools
    ]
```
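For a sense of what a complete entry might look like, here is a self-contained sketch with a stand-in `ToolSpec` dataclass that mirrors only the four fields used above. The handler signature (a single dict of arguments, returning a string) and the `word_count` tool are assumptions made for illustration; check the existing handlers in agent/core/tools.py for the real contract.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass
class ToolSpec:
    """Stand-in with the four fields shown above; the real class lives in the repo."""
    name: str
    description: str
    parameters: dict
    handler: Callable[..., Awaitable[Any]]


async def word_count_handler(arguments: dict) -> str:
    """Hypothetical handler: count whitespace-separated words in 'text'."""
    text = arguments["text"]
    return f"{len(text.split())} words"


WORD_COUNT_TOOL = ToolSpec(
    name="word_count",
    description="Count the words in a piece of text",
    parameters={
        "type": "object",
        "properties": {
            "text": {"type": "string", "description": "Text to count words in"}
        },
        "required": ["text"],
    },
    handler=word_count_handler,
)
```

Calling `asyncio.run(WORD_COUNT_TOOL.handler({"text": "fine-tune a small model"}))` exercises the handler on its own, which is a convenient way to test a tool before wiring it into the list.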

Extension points: exported contracts (how you extend this code)

AgentEvent (Interface) - frontend/src/types/events.ts (no doc)
MessageMeta (Interface) - frontend/src/types/agent.ts (no doc)
SessionMeta (Interface) - frontend/src/types/agent.ts (no doc)
User (Interface) - frontend/src/types/agent.ts (no doc)
JobsUpgradeDialogProps (Interface) - frontend/src/components/JobsUpgradeDialog.tsx (no doc)

Core symbols: most depended-on inside this repo

send_event - called by 89 - agent/core/session.py
apiFetch - called by 37 - frontend/src/utils/api.ts
_error - called by 31 - agent/tools/hf_repo_git_tool.py
write - called by 30 - agent/tools/sandbox_client.py, agent/core/llm_params.py
_check_session_access - called by 24 - backend/routes/agent.py
run - called by 23 - backend/session_manager.py

Shape

Function 1,550

Method 439

Class 162

Interface 47

Route 35

Languages

Python 90%

TypeScript 10%

Modules by API surface

backend/session_manager.py 82 symbols

tests/unit/test_session_manager_persistence.py 79 symbols

backend/routes/agent.py 77 symbols

scripts/prioritize_backlog.py 75 symbols

agent/core/agent_loop.py 54 symbols

tests/unit/test_prioritize_backlog.py 50 symbols

tests/unit/test_hub_artifacts.py 44 symbols

tests/unit/test_sandbox_private_spaces.py 41 symbols

tests/unit/test_cli_rendering.py 41 symbols

agent/main.py 41 symbols

tests/unit/test_usage.py 40 symbols

tests/unit/test_session_reaper.py 39 symbols

Dependencies: from manifests, versioned

@ai-sdk/react 3.0.93 · 1×

@emotion/react 11.13.0 · 1×

@emotion/styled 11.13.0 · 1×

@eslint/js 9.13.0 · 1×

@mui/icons-material 6.1.0 · 1×

@mui/material 6.1.0 · 1×

@types/react 18.3.12 · 1×

@types/react-dom 18.3.1 · 1×

@types/react-syntax-highlighter 15.5.13 · 1×

@vitejs/plugin-react 4.3.3 · 1×

ai 6.0.91 · 1×

eslint 9.13.0 · 1×

Datastores touched

(mongodb) Database · 1 repos

For agents

$ claude mcp add ml-intern \
  -- python -m otcore.mcp_server <graph>
