
github.com/mattpocock/sandcastle @v0.12.0 sqlite


587 symbols · 1,929 edges · 124 files · 36 documented (6%)

README


What Is Sandcastle?

A TypeScript library for orchestrating AI coding agents in isolated sandboxes:

You invoke agents with a single sandcastle.run().
Sandcastle handles sandboxing the agent with a configurable branch strategy.
The commits made on the branches get merged back.

Sandcastle is provider-agnostic — it ships with built-in providers for Docker, Podman, and Vercel, and you can create your own. Great for parallelizing multiple AFK agents, creating review pipelines, or even just orchestrating your own agents.

Prerequisites

Git
A sandbox provider — Sandcastle needs an isolated environment to run agents in. Built-in options:
Docker Desktop — most common for local development
Podman — rootless alternative to Docker
Vercel — cloud-based Firecracker microVMs via @vercel/sandbox
Or create your own using createBindMountSandboxProvider or createIsolatedSandboxProvider

Quick start

Install the package:

npm install --save-dev @ai-hero/sandcastle

Run npx @ai-hero/sandcastle init. This scaffolds a .sandcastle directory with all the files needed.

npx @ai-hero/sandcastle init

Copy .sandcastle/.env.example to .sandcastle/.env, then set CLAUDE_CODE_OAUTH_TOKEN in the new file (run claude setup-token on your host to get one). To use an Anthropic API key instead, uncomment and fill in ANTHROPIC_API_KEY.

cp .sandcastle/.env.example .sandcastle/.env

Run .sandcastle/main.ts (or main.mts) with npx tsx:

npx tsx .sandcastle/main.ts

// Run the agent via the JS API
import { run, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

await run({
  agent: claudeCode("claude-opus-4-8"),
  sandbox: docker(), // or podman(), vercel(), or your own provider
  promptFile: ".sandcastle/prompt.md",
});

Sandbox Providers

Sandcastle uses a SandboxProvider to create isolated environments. The sandbox option on run(), interactive(), and createSandbox() accepts any provider, including noSandbox(), which opts in to running the agent directly on the host when container isolation is undesired. Built-in providers:

Provider	Import path	Type	Accepted by
Docker	`@ai-hero/sandcastle/sandboxes/docker`	Bind-mount	`run()`, `createSandbox()`, `interactive()`
Podman	`@ai-hero/sandcastle/sandboxes/podman`	Bind-mount	`run()`, `createSandbox()`, `interactive()`
Vercel	`@ai-hero/sandcastle/sandboxes/vercel`	Isolated	`run()`, `createSandbox()`, `interactive()`
No-sandbox	`@ai-hero/sandcastle/sandboxes/no-sandbox`	None	`run()`, `createSandbox()`, `interactive()`

Worktree methods (wt.run(), wt.interactive(), wt.createSandbox()) accept the same providers as their top-level counterparts. wt.interactive() defaults to noSandbox() when no sandbox is specified.

import { run, interactive, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";
import { podman } from "@ai-hero/sandcastle/sandboxes/podman";
import { vercel } from "@ai-hero/sandcastle/sandboxes/vercel";
import { noSandbox } from "@ai-hero/sandcastle/sandboxes/no-sandbox";

// Docker, Podman, and Vercel are interchangeable in run() and createSandbox():
await run({
  agent: claudeCode("claude-opus-4-8"),
  sandbox: docker(),
  prompt: "...",
});

// No-sandbox runs the agent directly on the host — accepted by run(),
// createSandbox(), and interactive(). Skips container isolation entirely:
await interactive({
  agent: claudeCode("claude-opus-4-8"),
  sandbox: noSandbox(),
  prompt: "...", // optional — omit to launch the TUI with no initial prompt
  cwd: "/path/to/other-repo", // optional — defaults to process.cwd()
});

You can also create your own provider using createBindMountSandboxProvider or createIsolatedSandboxProvider.

API

Sandcastle exports a programmatic run() function for use in scripts, CI pipelines, or custom tooling. The examples below use docker(), but any SandboxProvider works in its place.

import { run, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

const result = await run({
  agent: claudeCode("claude-opus-4-8"),
  sandbox: docker(),
  promptFile: ".sandcastle/prompt.md",
});

console.log(result.iterations.length); // number of iterations executed
console.log(result.iterations); // per-iteration results with optional sessionId
console.log(result.commits); // array of { sha } for commits created
console.log(result.branch); // target branch name
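The fields logged above can be folded into a one-line summary for CI logs. The following is a minimal sketch, not part of Sandcastle: `RunResultLike` and `summarizeRun` are hypothetical names, and the interface models only the fields the example reads (`iterations` with an optional `sessionId`, `commits` as `{ sha }`, and `branch`).

```typescript
// Hypothetical shape: models only the fields read in the example above,
// not Sandcastle's actual exported result type.
interface RunResultLike {
  iterations: { sessionId?: string }[];
  commits: { sha: string }[];
  branch: string;
}

// Build a short, log-friendly summary of a finished run.
function summarizeRun(result: RunResultLike): string {
  const shortShas = result.commits.map((c) => c.sha.slice(0, 7));
  const commitPart =
    shortShas.length > 0 ? shortShas.join(", ") : "no commits";
  return `${result.branch}: ${result.iterations.length} iteration(s), ${commitPart}`;
}

// Example with made-up data:
const example: RunResultLike = {
  iterations: [{ sessionId: "s-1" }, {}],
  commits: [{ sha: "abc1234def" }],
  branch: "agent/fix-42",
};
console.log(summarizeRun(example)); // agent/fix-42: 2 iteration(s), abc1234
```

An empty `commits` array is reported as "no commits", which is an easy condition to alert on in a pipeline.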

All options

import { run, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

const result = await run({
  // Agent provider — required. Pass a model string to claudeCode().
  // Optional second arg for provider-specific options like effort level.
  agent: claudeCode("claude-opus-4-8", { effort: "high" }),

  // Sandbox provider — required. Any SandboxProvider works (docker, podman, vercel, or custom).
  // Provider-specific config (like imageName, mounts) lives inside the provider factory call.
  sandbox: docker({
    imageName: "sandcastle:local",
    // Optional: override the UID/GID used for --user flag (defaults to host UID/GID).
    // Must match the UID baked into the image. Pre-flight check catches mismatches.
    // containerUid: 1000,
    // containerGid: 1000,
    // Optional: mount host directories into the sandbox (e.g. package manager caches)
    // hostPath supports absolute, tilde-expanded (~), and relative paths (resolved from cwd).
    // sandboxPath supports absolute and relative paths (resolved from the sandbox repo directory).
    mounts: [
      { hostPath: "~/.npm", sandboxPath: "/home/agent/.npm", readonly: true },
      { hostPath: "data", sandboxPath: "data" }, // mounts <cwd>/data → <sandbox-repo>/data
    ],
    // Optional: SELinux volume label — "z" (default, shared), "Z" (private), or false (none).
    // No-op on non-SELinux systems (Docker Desktop on macOS/Windows, Linux without SELinux).
    selinuxLabel: "z",
    // Optional: provider-level env vars merged at launch time
    env: { DOCKER_SPECIFIC: "value" },
    // Optional: attach container to Docker network(s) — string or string[]
    network: "my-network",
    // Optional: add the container user to supplementary groups via --group-add.
    // Accepts group names or numeric GIDs (e.g. for a bind-mounted Docker socket).
    groups: ["docker", 999],
    // Optional: expose host devices via --device. Each entry is a full device
    // spec in host[:container[:permissions]] form (e.g. "/dev/kvm").
    devices: ["/dev/kvm"],
    // Optional: limit CPU resources via --cpus. Fractional values allowed (e.g. 1.5).
    // cpus: 2,
  }),

  // Host repo directory — replaces process.cwd() as the anchor for
  // .sandcastle/ artifacts (worktrees, logs, env, patches) and git operations.
  // Relative paths resolve against process.cwd(). Defaults to process.cwd().
  cwd: "../other-repo",

  // Branch strategy — controls how the agent's changes relate to branches.
  // Defaults to { type: "head" } for bind-mount and { type: "merge-to-head" } for isolated providers.
  branchStrategy: { type: "branch", branch: "agent/fix-42" },

  // Prompt source — provide one of these, not both.
  // Note: promptFile resolves against process.cwd(), NOT cwd.
  promptFile: ".sandcastle/prompt.md", // path to a prompt file
  // prompt: "Fix issue #42 in this repo", // OR an inline prompt string

  // Values substituted for {{KEY}} placeholders in the prompt.
  promptArgs: {
    ISSUE_NUMBER: "42",
  },

  // Maximum number of agent iterations to run before stopping. Default: 1
  maxIterations: 5,

  // Display name for this run, shown as a prefix in log output.
  name: "fix-issue-42",

  // Lifecycle hooks grouped by where they run: host or sandbox.
  hooks: {
    host: {
      onWorktreeReady: [{ command: "cp .env.example .env" }],
      onSandboxReady: [{ command: "echo setup done" }],
    },
    sandbox: {
      onSandboxReady: [{ command: "npm install" }],
    },
  },

  // Host-relative file paths to copy into the sandbox before the container starts.
  // Not supported with branchStrategy: { type: "head" }.
  copyToWorktree: [".env"],

  // Override default timeouts for built-in lifecycle steps.
  // Unset keys keep their defaults.
  timeouts: {
    copyToWorktreeMs: 120_000, // default: 60_000
    gitSetupMs: 30_000, // default: 10_000
    commitCollectionMs: 60_000, // default: 30_000
    mergeToHostMs: 60_000, // default: 30_000
  },

  // How to record progress. Default: write to a file under .sandcastle/logs/
  logging: {
    type: "file",
    path: ".sandcastle/logs/my-run.log",
    // Optional: forward the agent's output stream to your own observability system.
    // Fires for each text chunk, tool call, and raw stdout line the agent
    // produces. Errors thrown by the callback are swallowed so a broken
    // forwarder cannot kill the run.
    onAgentStreamEvent: (event) => {
      // event is { type: "text" | "toolCall" | "raw", iteration, timestamp, ... }
      myLogger.info(event);
    },
    // Optional: append every raw stdout line the agent emits to the same
    // log file, interleaved with the human-readable output. Includes lines
    // the provider's stream parser would otherwise drop. Intended for
    // debugging stuck or unexpected agent behaviour.
    verbose: true,
  },
  // logging: { type: "stdout", verbose: true }, // OR terminal mode (verbose: raw lines to stdout)

  // String (or array of strings) the agent emits to end the iteration loop early.
  // Default: "<promise>COMPLETE</promise>"
  completionSignal: "<promise>COMPLETE</promise>",

  // Idle timeout in seconds — resets whenever the agent produces output. Default: 600 (10 minutes)
  idleTimeoutSeconds: 600,

  // Grace window in seconds after the agent emits a completion signal but
  // before its process has exited (a "hanging process" — typically a spawned
  // `gh`/git child or MCP server keeping stdout open). Resets on every
  // subsequent output line so trailing data is still captured. Default: 60
  completionTimeoutSeconds: 60,

  // Structured output — extract a typed payload from the agent's stdout.
  // Requires maxIterations === 1 and the tag must appear in the prompt.
  // output: Output.object({ tag: "result", schema: z.object({ answer: z.number() }) }),
  // output: Output.string({ tag: "summary" }),
});

console.log(result.iterations.length); // number of iterations executed
console.log(result.completionSignal); // matched signal string, or undefined if none fired
console.log(result.commits); // array of { sha } for commits created
console.log(result.branch); // target branch name
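To make promptArgs and completionSignal more concrete, here is a rough sketch of `{{KEY}}` substitution and signal matching. It is an illustration under stated assumptions, not Sandcastle's implementation: `renderPrompt` and `findCompletionSignal` are hypothetical helpers, unknown placeholders are assumed to be left untouched, and signals are assumed to match as plain substrings. The source does not specify either behaviour.

```typescript
// Replace {{KEY}} placeholders with values from args.
// Assumption: unknown keys are left untouched; the real library may differ.
function renderPrompt(template: string, args: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder: string, key: string) =>
    Object.prototype.hasOwnProperty.call(args, key) ? args[key] : placeholder,
  );
}

// Return the first configured signal that appears in the output, if any.
// Assumption: plain substring matching against the agent's output.
function findCompletionSignal(
  output: string,
  signals: string | string[],
): string | undefined {
  const list = Array.isArray(signals) ? signals : [signals];
  return list.find((signal) => output.includes(signal));
}

const rendered = renderPrompt("Fix issue #{{ISSUE_NUMBER}} in {{REPO}}.", {
  ISSUE_NUMBER: "42",
});
console.log(rendered); // Fix issue #42 in {{REPO}}.

console.log(
  findCompletionSignal("All done. <promise>COMPLETE</promise>", [
    "<promise>COMPLETE</promise>",
  ]),
); // <promise>COMPLETE</promise>
```

Accepting either a string or an array mirrors the documented shape of completionSignal, and returning the matched string mirrors what result.completionSignal reports.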

`createSandbox()` — reusable sandbox

Use createSandbox() when you need to run multiple agents (or multiple rounds of the same agent) inside a single sandbox. It creates the sandbox once, and you call sandbox.run() as many times as you need. This avoids repeated container startup costs and keeps all runs on the same branch.

Use run() instead when you only need a single one-shot invocation — it handles sandbox lifecycle automatically.

Basic single-run usage

import { createSandbox, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

await using sandbox = await createSandbox({
  branch: "agent/fix-42",
  sandbox: docker(),
});

const result = await sandbox.run({
  agent: claudeCode("claude-opus-4-8"),
  prompt: "Fix issue #42 in this repo.",
});

console.log(result.commits); // [{ sha: "abc123" }]

Multi-run implement-then-review

import { createSandbox, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

await using sandbox = await createSandbox({
  branch: "agent/fix-42",
  sandbox: docker(),
  hooks: { sandbox: { onSandboxReady: [{ command: "npm install" }] } },
});

// Step 1: implement
const implResult = await sandbox.run({
  agent: claudeCode("claude-opus-4-8"),
  promptFile: ".sandcastle/implement.md",
  maxIterations: 5,
});

// Step 2: review on the same branch, same container
const reviewResult = await sandbox.run({
  agent: claudeCode("claude-sonnet-4-6"),
  prompt: "Review the changes and fix any issues.",
});

Commits from all run() calls accumulate on the same branch. The sandbox container stays alive between runs, so installed dependencies and build artifacts persist.
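The accumulation described above can be mirrored on the host side by collecting the commits each step reports. The sketch below uses a stand-in sandbox so it runs without Docker. `SandboxLike`, `runSteps`, and `makeFakeSandbox` are hypothetical names, and the interfaces model only the `run()` call and `commits` field shown in the examples.

```typescript
// Hypothetical shapes modeled on the examples above; not Sandcastle's real types.
interface Commit {
  sha: string;
}
interface StepResultLike {
  commits: Commit[];
}
interface SandboxLike {
  run(options: { prompt: string }): Promise<StepResultLike>;
}

// Run prompts one after another in the same sandbox and collect every
// commit in the order the steps produced them.
async function runSteps(
  sandbox: SandboxLike,
  prompts: string[],
): Promise<Commit[]> {
  const commits: Commit[] = [];
  for (const prompt of prompts) {
    const result = await sandbox.run({ prompt });
    commits.push(...result.commits);
  }
  return commits;
}

// Stand-in sandbox for illustration: each run "creates" one fake commit.
function makeFakeSandbox(): SandboxLike {
  let counter = 0;
  return {
    async run() {
      counter += 1;
      return { commits: [{ sha: `fake-${counter}` }] };
    },
  };
}

const stepsDone = runSteps(makeFakeSandbox(), ["implement", "review"]);
stepsDone.then((commits) => console.log(commits.map((c) => c.sha)));
```

The steps run sequentially rather than in parallel because they share one branch and one working tree, so each step should see the commits of the previous one.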

sandbox.exec() lets the harness run shell commands directly in the same warm sandbox — handy for gating an implement step on a quick verification before kicking off the review:

```typescript
await using sandbox = await createSandbox({
  branch: "agent/fix-42",
  sandbox: docker(),
  hooks: { sandbox: { onSandboxReady: [{ command: "npm install" }] } },
});

await sandbox.run({ agent: claudeCode("claude-opus-4-8"), promptFile: ".sandcastle/implemen
```

Extension points: exported contracts (how you extend this code)

SandboxHandleContext (Interface)

@internal Context for building Sandbox handle methods.

src/createSandbox.ts

AgentStreamEmitterService (Interface)

(no doc)

src/AgentStreamEmitter.ts

OutputObjectDefinition (Interface)

(no doc)

src/Output.ts

HostSessionLookup (Interface)

(no doc)

src/SessionStore.ts

StartSandboxBindMountOptions (Interface)

(no doc)

src/startSandbox.ts

ExecResult (Interface)

(no doc)

src/SandboxProvider.ts

CwdError (Interface)

(no doc)

src/CwdError.ts

CreateWorktreeOptions (Interface)

(no doc)

src/createWorktree.ts

Core symbols: most depended-on inside this repo

Shape

Function 376

Interface 98

Class 64

Method 49

Languages

TypeScript 100%

Modules by API surface

src/AgentProvider.ts: 50 symbols

src/errors.ts: 47 symbols

src/SandboxProvider.ts: 30 symbols

src/SessionStore.ts: 28 symbols

src/InitService.ts: 28 symbols

src/createSandbox.ts: 22 symbols

src/WorktreeManager.ts: 19 symbols

src/createWorktree.ts: 17 symbols

src/run.ts: 16 symbols

src/SandboxFactory.ts: 16 symbols

src/Orchestrator.ts: 16 symbols

src/Orchestrator.test.ts: 12 symbols

Dependencies: from manifests, versioned

@changesets/cli 2.30.0 · 1×

@clack/prompts 1.1.0 · 1×

@daytona/sdk 0.164.0 · 1×

@effect/cli 0.74.0 · 1×

@effect/platform 0.95.0 · 1×

@effect/platform-node 0.105.0 · 1×

@effect/printer 0.48.0 · 1×

@effect/printer-ansi 0.48.0 · 1×

@types/mdx 2.0.13 · 1×

@types/node 25.5.0 · 1×

@types/react 19.0.0 · 1×

@types/react-dom 19.0.0 · 1×

For agents

$ claude mcp add sandcastle \
  -- python -m otcore.mcp_server <graph>

