
github.com/mattpocock/sandcastle @v0.12.0 sqlite

587 symbols · 1,929 edges · 124 files · 36 documented (6%)
README

What Is Sandcastle?

A TypeScript library for orchestrating AI coding agents in isolated sandboxes:

  1. You invoke agents with a single sandcastle.run().
  2. Sandcastle handles sandboxing the agent with a configurable branch strategy.
  3. The commits made on the branches get merged back.

Sandcastle is provider-agnostic — it ships with built-in providers for Docker, Podman, and Vercel, and you can create your own. Great for parallelizing multiple AFK agents, creating review pipelines, or even just orchestrating your own agents.

Prerequisites

  • Git
  • A sandbox provider: Sandcastle needs an isolated environment to run agents in. Built-in options:
      • Docker Desktop: most common for local development
      • Podman: rootless alternative to Docker
      • Vercel: cloud-based Firecracker microVMs via @vercel/sandbox
      • Or create your own using createBindMountSandboxProvider or createIsolatedSandboxProvider

Quick start

  1. Install the package:
     npm install --save-dev @ai-hero/sandcastle
  2. Scaffold a .sandcastle directory with all the files needed:
     npx @ai-hero/sandcastle init
  3. Copy .sandcastle/.env.example to .sandcastle/.env, then fill in your value for CLAUDE_CODE_OAUTH_TOKEN (run claude setup-token on your host to get one). To use an Anthropic API key instead, uncomment and fill in ANTHROPIC_API_KEY.
     cp .sandcastle/.env.example .sandcastle/.env
  4. Run .sandcastle/main.ts (or main.mts) with npx tsx:
     npx tsx .sandcastle/main.ts

// Run the agent via the JS API
import { run, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

await run({
  agent: claudeCode("claude-opus-4-8"),
  sandbox: docker(), // or podman(), vercel(), or your own provider
  promptFile: ".sandcastle/prompt.md",
});
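
Step 3 of the quick start requires one of two credentials in .sandcastle/.env. The snippet below is a rough sketch of a pre-flight check you could run before starting an agent. It is not part of Sandcastle: parseEnv and hasAgentCredentials are hypothetical helpers, and the parser handles only plain KEY=VALUE lines (no quoting or multiline values).

```typescript
// Minimal KEY=VALUE parser; skips blank lines and "#" comments.
// Assumption: not a full dotenv implementation (no quotes, no multiline values).
function parseEnv(text: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=" or no "=" at all
    vars.set(line.slice(0, eq).trim(), line.slice(eq + 1).trim());
  }
  return vars;
}

// True when at least one of the two credentials named in the quick start is non-empty.
function hasAgentCredentials(envText: string): boolean {
  const vars = parseEnv(envText);
  return ["CLAUDE_CODE_OAUTH_TOKEN", "ANTHROPIC_API_KEY"].some(
    (key) => (vars.get(key) ?? "") !== "",
  );
}
```

A commented-out ANTHROPIC_API_KEY line does not count as set, which matches the "uncomment and fill in" instruction.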

Sandbox Providers

Sandcastle uses a SandboxProvider to create isolated environments. The sandbox option on run(), interactive(), and createSandbox() accepts any provider. That includes noSandbox(), an explicit opt-in that runs the agent directly on the host when container isolation is undesired. Built-in providers:

| Provider | Import path | Type | Accepted by |
| --- | --- | --- | --- |
| Docker | @ai-hero/sandcastle/sandboxes/docker | Bind-mount | run(), createSandbox(), interactive() |
| Podman | @ai-hero/sandcastle/sandboxes/podman | Bind-mount | run(), createSandbox(), interactive() |
| Vercel | @ai-hero/sandcastle/sandboxes/vercel | Isolated | run(), createSandbox(), interactive() |
| No-sandbox | @ai-hero/sandcastle/sandboxes/no-sandbox | None | run(), createSandbox(), interactive() |

Worktree methods (wt.run(), wt.interactive(), wt.createSandbox()) accept the same providers as their top-level counterparts. wt.interactive() defaults to noSandbox() when no sandbox is specified.

import { run, interactive, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";
import { podman } from "@ai-hero/sandcastle/sandboxes/podman";
import { vercel } from "@ai-hero/sandcastle/sandboxes/vercel";
import { noSandbox } from "@ai-hero/sandcastle/sandboxes/no-sandbox";

// Docker, Podman, and Vercel are interchangeable in run() and createSandbox():
await run({
  agent: claudeCode("claude-opus-4-8"),
  sandbox: docker(),
  prompt: "...",
});

// No-sandbox runs the agent directly on the host — accepted by run(),
// createSandbox(), and interactive(). Skips container isolation entirely:
await interactive({
  agent: claudeCode("claude-opus-4-8"),
  sandbox: noSandbox(),
  prompt: "...", // optional — omit to launch the TUI with no initial prompt
  cwd: "/path/to/other-repo", // optional — defaults to process.cwd()
});

You can also create your own provider using createBindMountSandboxProvider or createIsolatedSandboxProvider.

API

Sandcastle exports a programmatic run() function for use in scripts, CI pipelines, or custom tooling. The examples below use docker(), but any SandboxProvider works in its place.

import { run, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

const result = await run({
  agent: claudeCode("claude-opus-4-8"),
  sandbox: docker(),
  promptFile: ".sandcastle/prompt.md",
});

console.log(result.iterations.length); // number of iterations executed
console.log(result.iterations); // per-iteration results with optional sessionId
console.log(result.commits); // array of { sha } for commits created
console.log(result.branch); // target branch name
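
In a CI script it can be handy to collapse the result into a single log line. The sketch below assumes only the fields logged above. RunResultLike is a hypothetical type written for this example, not the library's exported type, and summarizeRun is an illustrative helper.

```typescript
// Hypothetical shape, limited to the fields the example above logs.
interface RunResultLike {
  iterations: { sessionId?: string }[];
  commits: { sha: string }[];
  branch: string;
}

// Builds a one-line summary with abbreviated commit SHAs.
function summarizeRun(result: RunResultLike): string {
  const shas = result.commits.map((c) => c.sha.slice(0, 7)).join(", ") || "none";
  return `${result.iterations.length} iteration(s) on ${result.branch}; commits: ${shas}`;
}

console.log(
  summarizeRun({
    iterations: [{}, { sessionId: "s1" }],
    commits: [{ sha: "abc1234def" }],
    branch: "agent/fix-42",
  }),
); // 2 iteration(s) on agent/fix-42; commits: abc1234
```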

All options

import { run, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

const result = await run({
  // Agent provider — required. Pass a model string to claudeCode().
  // Optional second arg for provider-specific options like effort level.
  agent: claudeCode("claude-opus-4-8", { effort: "high" }),

  // Sandbox provider — required. Any SandboxProvider works (docker, podman, vercel, or custom).
  // Provider-specific config (like imageName, mounts) lives inside the provider factory call.
  sandbox: docker({
    imageName: "sandcastle:local",
    // Optional: override the UID/GID used for --user flag (defaults to host UID/GID).
    // Must match the UID baked into the image. Pre-flight check catches mismatches.
    // containerUid: 1000,
    // containerGid: 1000,
    // Optional: mount host directories into the sandbox (e.g. package manager caches)
    // hostPath supports absolute, tilde-expanded (~), and relative paths (resolved from cwd).
    // sandboxPath supports absolute and relative paths (resolved from the sandbox repo directory).
    mounts: [
      { hostPath: "~/.npm", sandboxPath: "/home/agent/.npm", readonly: true },
      { hostPath: "data", sandboxPath: "data" }, // mounts <cwd>/data → <sandbox-repo>/data
    ],
    // Optional: SELinux volume label — "z" (default, shared), "Z" (private), or false (none).
    // No-op on non-SELinux systems (Docker Desktop on macOS/Windows, Linux without SELinux).
    selinuxLabel: "z",
    // Optional: provider-level env vars merged at launch time
    env: { DOCKER_SPECIFIC: "value" },
    // Optional: attach container to Docker network(s) — string or string[]
    network: "my-network",
    // Optional: add the container user to supplementary groups via --group-add.
    // Accepts group names or numeric GIDs (e.g. for a bind-mounted Docker socket).
    groups: ["docker", 999],
    // Optional: expose host devices via --device. Each entry is a full device
    // spec in host[:container[:permissions]] form (e.g. "/dev/kvm").
    devices: ["/dev/kvm"],
    // Optional: limit CPU resources via --cpus. Fractional values allowed (e.g. 1.5).
    // cpus: 2,
  }),

  // Host repo directory — replaces process.cwd() as the anchor for
  // .sandcastle/ artifacts (worktrees, logs, env, patches) and git operations.
  // Relative paths resolve against process.cwd(). Defaults to process.cwd().
  cwd: "../other-repo",

  // Branch strategy — controls how the agent's changes relate to branches.
  // Defaults to { type: "head" } for bind-mount and { type: "merge-to-head" } for isolated providers.
  branchStrategy: { type: "branch", branch: "agent/fix-42" },

  // Prompt source — provide one of these, not both.
  // Note: promptFile resolves against process.cwd(), NOT cwd.
  promptFile: ".sandcastle/prompt.md", // path to a prompt file
  // prompt: "Fix issue #42 in this repo", // OR an inline prompt string

  // Values substituted for {{KEY}} placeholders in the prompt.
  promptArgs: {
    ISSUE_NUMBER: "42",
  },

  // Maximum number of agent iterations to run before stopping. Default: 1
  maxIterations: 5,

  // Display name for this run, shown as a prefix in log output.
  name: "fix-issue-42",

  // Lifecycle hooks grouped by where they run: host or sandbox.
  hooks: {
    host: {
      onWorktreeReady: [{ command: "cp .env.example .env" }],
      onSandboxReady: [{ command: "echo setup done" }],
    },
    sandbox: {
      onSandboxReady: [{ command: "npm install" }],
    },
  },

  // Host-relative file paths to copy into the sandbox before the container starts.
  // Not supported with branchStrategy: { type: "head" }.
  copyToWorktree: [".env"],

  // Override default timeouts for built-in lifecycle steps.
  // Unset keys keep their defaults.
  timeouts: {
    copyToWorktreeMs: 120_000, // default: 60_000
    gitSetupMs: 30_000, // default: 10_000
    commitCollectionMs: 60_000, // default: 30_000
    mergeToHostMs: 60_000, // default: 30_000
  },

  // How to record progress. Default: write to a file under .sandcastle/logs/
  logging: {
    type: "file",
    path: ".sandcastle/logs/my-run.log",
    // Optional: forward the agent's output stream to your own observability system.
    // Fires for each text chunk, tool call, and raw stdout line the agent
    // produces. Errors thrown by the callback are swallowed so a broken
    // forwarder cannot kill the run.
    onAgentStreamEvent: (event) => {
      // event is { type: "text" | "toolCall" | "raw", iteration, timestamp, ... }
      myLogger.info(event);
    },
    // Optional: append every raw stdout line the agent emits to the same
    // log file, interleaved with the human-readable output. Includes lines
    // the provider's stream parser would otherwise drop. Intended for
    // debugging stuck or unexpected agent behaviour.
    verbose: true,
  },
  // logging: { type: "stdout", verbose: true }, // OR terminal mode (verbose: raw lines to stdout)

  // String (or array of strings) the agent emits to end the iteration loop early.
  // Default: "<promise>COMPLETE</promise>"
  completionSignal: "<promise>COMPLETE</promise>",

  // Idle timeout in seconds — resets whenever the agent produces output. Default: 600 (10 minutes)
  idleTimeoutSeconds: 600,

  // Grace window in seconds after the agent emits a completion signal but
  // before its process has exited (a "hanging process" — typically a spawned
  // `gh`/git child or MCP server keeping stdout open). Resets on every
  // subsequent output line so trailing data is still captured. Default: 60
  completionTimeoutSeconds: 60,

  // Structured output — extract a typed payload from the agent's stdout.
  // Requires maxIterations === 1 and the tag must appear in the prompt.
  // output: Output.object({ tag: "result", schema: z.object({ answer: z.number() }) }),
  // output: Output.string({ tag: "summary" }),
});

console.log(result.iterations.length); // number of iterations executed
console.log(result.completionSignal); // matched signal string, or undefined if none fired
console.log(result.commits); // array of { sha } for commits created
console.log(result.branch); // target branch name
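
The promptArgs option above fills {{KEY}} placeholders in the prompt. As a rough illustration of that kind of substitution, here is a standalone sketch. It is not Sandcastle's implementation: renderPrompt is a hypothetical helper, and leaving placeholders without a matching key untouched is an assumption.

```typescript
// Hypothetical helper: replaces {{KEY}} placeholders with values from args.
// Assumption: a placeholder with no matching key is left as-is.
function renderPrompt(template: string, args: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) => {
    const value = args[key];
    return value !== undefined ? value : match;
  });
}

const rendered = renderPrompt("Fix issue #{{ISSUE_NUMBER}} in this repo.", {
  ISSUE_NUMBER: "42",
});
console.log(rendered); // Fix issue #42 in this repo.
```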

createSandbox() — reusable sandbox

Use createSandbox() when you need to run multiple agents (or multiple rounds of the same agent) inside a single sandbox. It creates the sandbox once, and you call sandbox.run() as many times as you need. This avoids repeated container startup costs and keeps all runs on the same branch.

Use run() instead when you only need a single one-shot invocation — it handles sandbox lifecycle automatically.

Basic single-run usage

import { createSandbox, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

await using sandbox = await createSandbox({
  branch: "agent/fix-42",
  sandbox: docker(),
});

const result = await sandbox.run({
  agent: claudeCode("claude-opus-4-8"),
  prompt: "Fix issue #42 in this repo.",
});

console.log(result.commits); // [{ sha: "abc123" }]

Multi-run implement-then-review

import { createSandbox, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

await using sandbox = await createSandbox({
  branch: "agent/fix-42",
  sandbox: docker(),
  hooks: { sandbox: { onSandboxReady: [{ command: "npm install" }] } },
});

// Step 1: implement
const implResult = await sandbox.run({
  agent: claudeCode("claude-opus-4-8"),
  promptFile: ".sandcastle/implement.md",
  maxIterations: 5,
});

// Step 2: review on the same branch, same container
const reviewResult = await sandbox.run({
  agent: claudeCode("claude-sonnet-4-6"),
  prompt: "Review the changes and fix any issues.",
});

Commits from all run() calls accumulate on the same branch. The sandbox container stays alive between runs, so installed dependencies and build artifacts persist.
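
The implement step above sets maxIterations: 5, and per the option comments the loop can end early when the agent emits a completion signal. The sketch below models that control flow in isolation. It is an illustration under stated assumptions, not Sandcastle's orchestrator: runLoop and RunIteration are hypothetical names, each iteration is reduced to a function returning its output text, and signal detection is assumed to be a plain substring match.

```typescript
// Hypothetical stand-in for one agent iteration: returns the text it produced.
type RunIteration = (iteration: number) => string;

interface LoopResult {
  iterations: number; // iterations actually executed
  completionSignal: string | undefined; // matched signal, or undefined if none fired
}

// Runs up to maxIterations, stopping early once any signal appears in the output.
function runLoop(
  runIteration: RunIteration,
  maxIterations: number,
  completionSignal: string | string[] = "<promise>COMPLETE</promise>",
): LoopResult {
  const signals = Array.isArray(completionSignal) ? completionSignal : [completionSignal];
  for (let i = 1; i <= maxIterations; i++) {
    const output = runIteration(i);
    // Assumption: a signal counts as emitted when it occurs anywhere in the output.
    const matched = signals.find((signal) => output.includes(signal));
    if (matched !== undefined) return { iterations: i, completionSignal: matched };
  }
  return { iterations: maxIterations, completionSignal: undefined };
}
```

With this model, an agent that finishes on its third pass stops the loop at 3 of 5 iterations, and a run that never emits a signal uses the full budget and reports no matched signal.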

sandbox.exec() lets the harness run shell commands directly in the same warm sandbox — handy for gating an implement step on a quick verification before kicking off the review:

```typescript
await using sandbox = await createSandbox({
  branch: "agent/fix-42",
  sandbox: docker(),
  hooks: { sandbox: { onSandboxReady: [{ command: "npm install" }] } },
});

await sandbox.run({
  agent: claudeCode("claude-opus-4-8"),
  promptFile: ".sandcastle/implemen
// [excerpt truncated here]
```

Extension points
Exported contracts: how you extend this code.

  • SandboxHandleContext (Interface), src/createSandbox.ts: @internal Context for building Sandbox handle methods.
  • AgentStreamEmitterService (Interface), src/AgentStreamEmitter.ts: no doc
  • OutputObjectDefinition (Interface), src/Output.ts: no doc
  • HostSessionLookup (Interface), src/SessionStore.ts: no doc
  • StartSandboxBindMountOptions (Interface), src/startSandbox.ts: no doc
  • ExecResult (Interface), src/SandboxProvider.ts: no doc
  • CwdError (Interface), src/CwdError.ts: no doc
  • CreateWorktreeOptions (Interface), src/createWorktree.ts: no doc

Core symbols
Most depended-on inside this repo.

  • exec, src/createSandbox.ts: called by 233
  • push, src/boundedTail.ts: called by 212
  • close, src/createSandbox.ts: called by 182
  • claudeCode, src/AgentProvider.ts: called by 180
  • parseStreamLine, src/AgentProvider.ts: called by 101
  • buildPrintCommand, src/AgentProvider.ts: called by 90
  • orchestrate, src/Orchestrator.ts: called by 71
  • makeLocalSandbox, src/testSandbox.ts: called by 61

Shape

Function 376
Interface 98
Class 64
Method 49

Languages

TypeScript 100%

Modules by API surface

src/AgentProvider.ts 50 symbols
src/errors.ts 47 symbols
src/SandboxProvider.ts 30 symbols
src/SessionStore.ts 28 symbols
src/InitService.ts 28 symbols
src/createSandbox.ts 22 symbols
src/WorktreeManager.ts 19 symbols
src/createWorktree.ts 17 symbols
src/run.ts 16 symbols
src/SandboxFactory.ts 16 symbols
src/Orchestrator.ts 16 symbols
src/Orchestrator.test.ts 12 symbols

Dependencies from manifests, versioned

@changesets/cli 2.30.0 · 1×
@clack/prompts 1.1.0 · 1×
@daytona/sdk 0.164.0 · 1×
@effect/cli 0.74.0 · 1×
@effect/platform 0.95.0 · 1×
@effect/platform-node 0.105.0 · 1×
@effect/printer 0.48.0 · 1×
@effect/printer-ansi 0.48.0 · 1×
@types/mdx 2.0.13 · 1×
@types/node 25.5.0 · 1×
@types/react 19.0.0 · 1×
@types/react-dom 19.0.0 · 1×

For agents

$ claude mcp add sandcastle \
  -- python -m otcore.mcp_server <graph>
