github.com/vercel-labs/deepsec @main sqlite

750 symbols 2,040 edges 323 files 54 documented · 7%

README

deepsec

deepsec is an agent-powered vulnerability scanner that you can run in your own infrastructure, optimized for on-demand review of all code in existing large-scale repos.

deepsec is designed to surface hard-to-find issues that have been lurking in applications for a long time. It is configured to use the best models at maximum thinking levels, so scans of large codebases can cost thousands or even tens of thousands of dollars. Our customers have found the cost worth it for how quickly they were able to patch vulnerabilities that would otherwise have gone unfixed.

For large codebases, work fans out across worker machines in parallel. If a run is interrupted or errors out partway through, just re-run the same command — deepsec picks up where it left off, skipping files it already analyzed and only investigating the rest.
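The resume behavior can be pictured as a filter over per-file records. The sketch below is only an illustration of that idea: `FileState`, `pendingFiles`, and the `analyzed` flag are hypothetical names, not deepsec's actual data layout (docs/data-layout.md describes the real schemas).

```typescript
// Hypothetical per-file state, for illustration only.
// deepsec's real record types are documented in docs/data-layout.md.
interface FileState {
  path: string;
  analyzed: boolean;
}

// Return the files that still need investigation, so that a re-run
// skips everything a previous (interrupted) run already finished.
function pendingFiles(all: string[], previous: FileState[]): string[] {
  const done = new Set(
    previous.filter((record) => record.analyzed).map((record) => record.path),
  );
  return all.filter((path) => !done.has(path));
}

const remaining = pendingFiles(
  ["src/a.ts", "src/b.ts", "src/c.ts"],
  [
    { path: "src/a.ts", analyzed: true }, // finished before the interruption
    { path: "src/b.ts", analyzed: false }, // started but never completed
  ],
);
console.log(remaining); // src/b.ts and src/c.ts are left to investigate
```

Because the decision is derived from persisted state rather than from the command line, the same command can be re-run unchanged after a failure.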

Get started

Navigate to the root of the repository that you want to scan, then:

npx deepsec init       # creates .deepsec/ with this repo as the first project
cd .deepsec
pnpm install           # installs deepsec from npm

# Proceed as instructed by `init` output

Now have your coding agent bootstrap your installation. Open the agent of choice and prompt:

Read .deepsec/node_modules/deepsec/SKILL.md to understand the tool. Then read .deepsec/data/<id>/SETUP.md and follow it: skim this repo's README, any AGENTS.md/CLAUDE.md, and a handful of representative code files, then replace each section of .deepsec/data/<id>/INFO.md.

Keep it SHORT — target 50–100 lines total. Pick 3–5 examples per section, not exhaustive enumeration. Name primitives (auth helpers, middleware) but no line numbers. Skip generic CWE categories — built-in matchers cover those. Cover only what's project-specific. INFO.md is injected into every scan batch; verbose context dilutes signal.

Then scan from inside .deepsec/:

pnpm deepsec scan
pnpm deepsec process
pnpm deepsec revalidate # optional, cuts FP rate
pnpm deepsec export --format md-dir --out ./findings

If you feel that deepsec should look at more parts of the code, give your coding agent docs/writing-matchers.md so it can find more valuable starting points in your codebase.

Docs

docs/getting-started.md — first-scan walkthrough
docs/reviewing-changes.md — process --diff for PR review and CI gating
docs/supported-tech.md — frameworks and ecosystems deepsec recognizes out of the box
docs/writing-matchers.md — prompt your coding agent to grow your matcher set
docs/configuration.md — deepsec.config.ts reference
docs/plugins.md — plugin authoring
docs/models.md — model selection, defaults, refusals, future models
docs/vercel-setup.md — AI Gateway + Vercel Sandbox keys / tokens
docs/architecture.md — pipeline internals
docs/data-layout.md — data/ schemas (FileRecord, RunMeta, …)
docs/faq.md — cost, model choice, sandbox mode, FP rate
samples/ — copy-paste starting points (currently: webapp/)
CONTRIBUTING.md — repo layout, dev workflow

AI provider

When running locally, deepsec falls back to your existing claude / codex subscription if you've logged in on this machine. Subscriptions (Claude Pro/Max, ChatGPT Plus) are useful for evaluating deepsec but generally don't have enough headroom for full repo scans.

For real scans, use Vercel AI Gateway. One key covers both Claude and Codex, and the gateway's default quotas are sized for highly concurrent research.

AI_GATEWAY_API_KEY=vck_...

See docs/vercel-setup.md for getting a key and for the Vercel Sandbox setup. To bypass the gateway, set ANTHROPIC_AUTH_TOKEN + ANTHROPIC_BASE_URL (or the OpenAI pair) explicitly. Explicit values always win over the AI_GATEWAY_API_KEY expansion.
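The precedence rule for the Anthropic pair can be sketched as follows. This is a minimal illustration under stated assumptions, not deepsec's actual resolution code: `resolveAnthropicCredentials` and `AnthropicCredentials` are hypothetical names, the gateway URL is a placeholder (the real endpoint is covered in docs/vercel-setup.md), and treating a half-set explicit pair as "not explicit" is also an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical result shape, for illustration only.
interface AnthropicCredentials {
  authToken: string;
  baseUrl: string;
  source: "explicit" | "gateway";
}

// Placeholder, NOT the real gateway endpoint (see docs/vercel-setup.md).
const GATEWAY_BASE_URL_PLACEHOLDER = "https://gateway.example.invalid";

function resolveAnthropicCredentials(env: Env): AnthropicCredentials | undefined {
  const token = env["ANTHROPIC_AUTH_TOKEN"];
  const baseUrl = env["ANTHROPIC_BASE_URL"];
  const gatewayKey = env["AI_GATEWAY_API_KEY"];

  // Explicit values always win over the gateway expansion.
  // Assumption: both halves of the pair must be set to count as explicit.
  if (token && baseUrl) {
    return { authToken: token, baseUrl, source: "explicit" };
  }
  // Assumption: the expansion reuses the gateway key as the auth token.
  if (gatewayKey) {
    return {
      authToken: gatewayKey,
      baseUrl: GATEWAY_BASE_URL_PLACEHOLDER,
      source: "gateway",
    };
  }
  // Nothing configured; a caller could fall back to a local subscription login.
  return undefined;
}

const resolved = resolveAnthropicCredentials({
  AI_GATEWAY_API_KEY: "vck_example",
  ANTHROPIC_AUTH_TOKEN: "token_example",
  ANTHROPIC_BASE_URL: "https://proxy.example.invalid",
});
console.log(resolved?.source); // "explicit"
```

Passing the environment in as a plain object, rather than reading a global, keeps the rule easy to test in isolation.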

If a process or revalidate run halts because the upstream credential ran out of quota or credits, deepsec stops gracefully and tells you where to top up. Re-run the same command afterward and it picks up where it left off.

Distributed execution (optional)

Large monorepos can fan work across Vercel Sandbox microVMs:

pnpm deepsec sandbox process --project-id my-app --sandboxes 10 --concurrency 4

Needs a Vercel account. The local working tree is tarballed and uploaded; .git is excluded. Both OIDC tokens (local) and access tokens (CI) are supported — see docs/vercel-setup.md.
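To build intuition for the two flags, the sketch below splits a work list across sandboxes round-robin and computes an upper bound on parallel investigations. It is an illustration only: `shard` and `maxInFlight` are hypothetical helpers, the round-robin strategy is an assumption about how work could be divided, and it assumes `--concurrency` applies per sandbox, which the text above does not state.

```typescript
// Split a work list into `sandboxes` shards, round-robin.
// Hypothetical helper; deepsec's real partitioning may differ.
function shard<T>(items: T[], sandboxes: number): T[][] {
  if (!Number.isInteger(sandboxes) || sandboxes < 1) {
    throw new RangeError("sandboxes must be a positive integer");
  }
  const shards: T[][] = Array.from({ length: sandboxes }, () => [] as T[]);
  items.forEach((item, index) => {
    shards[index % sandboxes].push(item);
  });
  return shards;
}

// Upper bound on parallel investigations, ASSUMING concurrency is per sandbox.
function maxInFlight(sandboxes: number, concurrency: number): number {
  return sandboxes * concurrency;
}

const files = Array.from({ length: 25 }, (_, i) => `src/file-${i}.ts`);
const shards = shard(files, 10);
console.log(shards.map((s) => s.length)); // five shards of 3 files, then five of 2
console.log(maxInFlight(10, 4)); // 40
```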

Security model of deepsec itself

Treat deepsec like a coding agent with full shell access to the environment it runs in. It is designed to run on trusted inputs (your source code), but you may still be concerned about prompt injection through external dependencies or vendored code.

Running in a sandbox (see above) limits the potential exposure substantially:

The API keys for the coding agents are injected outside of the sandbox and hence cannot be exfiltrated.
For worker sandboxes, network egress is limited to the coding agent hosts. (Egress is allowed during the bootstrap process, but the coding agent does not run during bootstrap.)

Workflow reference

Command	What it does
`scan`	Find candidate sites with regex matchers (fast, no AI)
`process`	AI investigation; emits findings + recommendation
`process --diff`	PR-mode: scan + investigate only files changed in a diff
`triage`	Lightweight P0/P1/P2 classification (cheaper model)
`revalidate`	Re-check existing findings; checks git history for fixes
`enrich`	Add git committer info + (with a plugin) ownership data
`report`	Markdown + JSON summary for one project
`export`	Per-finding JSON or directory of markdown files
`metrics`	Cross-project counts: severities, vulns by type, TPs
`status`	Snapshot of the project mirror
`sandbox <cmd>`	Run any of the above on Vercel Sandbox microVMs
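The `scan` step in the table is described as regex-only, which is why it is fast and needs no AI. A minimal sketch of that idea follows. It is not deepsec's matcher API: `SimpleMatcher`, `CandidateSite`, and `scanSource` are hypothetical names, and the single example pattern is made up for illustration (docs/writing-matchers.md covers real matchers).

```typescript
// Hypothetical shapes, for illustration only.
interface SimpleMatcher {
  id: string;
  pattern: RegExp;
}

interface CandidateSite {
  file: string;
  line: number; // 1-based
  matcher: string;
  snippet: string;
}

// Test every line of a file against every matcher and record the hits.
function scanSource(
  file: string,
  source: string,
  matchers: SimpleMatcher[],
): CandidateSite[] {
  const sites: CandidateSite[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const matcher of matchers) {
      // Drop the g/y flags so RegExp.lastIndex cannot carry state between lines.
      const flags = matcher.pattern.flags.replace(/[gy]/g, "");
      const re = new RegExp(matcher.pattern.source, flags);
      if (re.test(text)) {
        sites.push({ file, line: index + 1, matcher: matcher.id, snippet: text.trim() });
      }
    }
  });
  return sites;
}

const source = [
  'import { exec } from "node:child_process";',
  "export function run(cmd: string) {",
  "  exec(`ls ${cmd}`);",
  "}",
].join("\n");

const sites = scanSource("src/run.ts", source, [
  { id: "shell-exec-call", pattern: /\bexec\s*\(/ },
]);
console.log(sites); // one candidate site, on line 3
```

A hit here is only a starting point for investigation, not a finding; in the workflow above, deciding whether a site is actually vulnerable is left to `process`.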

License

Apache 2.0. See LICENSE and NOTICE.

Extension points exported contracts — how you extend this code

AgentPlugin (Interface)

(no doc) [8 implementers]

packages/processor/src/agents/types.ts

SyncNudge (Interface)

Simple signal channel: the log-stream parser calls `signal()` when it sees a "Batch N/M complete:" line; the stream…

packages/deepsec/src/sandbox/orchestrator.ts
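The summary of SyncNudge is cut off, so the following is only a guess at the pattern it describes: a one-slot signal channel that a log parser pokes whenever a batch-complete line goes by. `Nudge`, `createNudge`, and `feedLogLine` are hypothetical names, and the `wait()` half is an assumption; only `signal()` and the "Batch N/M complete:" line format come from the description.

```typescript
// Hypothetical shape; the real interface lives in
// packages/deepsec/src/sandbox/orchestrator.ts and may differ.
interface Nudge {
  signal(): void;
  wait(): Promise<void>; // assumption: some consumer awaits the next signal
}

// One-slot channel: signals sent while nobody is waiting are coalesced
// into a single pending flag. Supports a single waiter at a time.
function createNudge(): Nudge {
  let pending = false;
  let waiter: (() => void) | undefined;
  return {
    signal() {
      if (waiter) {
        const resolve = waiter;
        waiter = undefined;
        resolve();
      } else {
        pending = true;
      }
    },
    wait() {
      if (pending) {
        pending = false;
        return Promise.resolve();
      }
      return new Promise<void>((resolve) => {
        waiter = resolve;
      });
    },
  };
}

const BATCH_COMPLETE = /Batch (\d+)\/(\d+) complete:/;

// Call signal() when a log line reports a finished batch.
function feedLogLine(line: string, nudge: Nudge): boolean {
  if (!BATCH_COMPLETE.test(line)) return false;
  nudge.signal();
  return true;
}

const nudge = createNudge();
feedLogLine("Batch 3/10 complete: 2 findings", nudge);
nudge.wait().then(() => console.log("nudged"));
```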

ScannerDriver (Interface)

(no doc) [2 implementers]

packages/scanner/src/types.ts

MatcherGate (Interface)

(no doc)

packages/core/src/plugin.ts

CommentProps (Interface)

(no doc)

fixtures/vulnerable-app/src/components/comment.tsx

PersistedRecord (Interface)

(no doc)

e2e/pipeline-sandbox.test.ts

RunResult (Interface)

(no doc)

e2e/pipeline.test.ts

PromptSampleScenario (Interface)

Scenarios used to generate the deterministic samples in `prompt-samples/` (committed to git). Each scenario maps to…

packages/processor/src/__tests__/prompt-samples.fixtures.ts

Core symbols most depended-on inside this repo

called by 199

packages/scanner/src/matcher-registry.ts

regexMatcher

called by 119

packages/scanner/src/matchers/utils.ts

match

called by 56

packages/core/src/plugin.ts

classifyQuotaError

called by 47

packages/processor/src/agents/shared.ts

defineConfig

called by 42

packages/core/src/config.ts

setLoadedConfig

called by 37

packages/core/src/config.ts

get

called by 32

packages/processor/src/agents/registry.ts

exists

called by 29

packages/scanner/src/detect-tech.ts

Shape

Function 616

Interface 83

Method 33

Class 18

Languages

TypeScript 100%

Modules by API surface

packages/processor/src/agents/pi-sdk.ts 40 symbols

packages/processor/src/agents/codex-sdk.ts 27 symbols

packages/processor/src/agents/shared.ts 21 symbols

packages/core/src/run.ts 21 symbols

packages/core/src/plugin.ts 20 symbols

packages/deepsec/src/sandbox/setup.ts 19 symbols

packages/deepsec/src/commands/metrics.ts 18 symbols

packages/scanner/src/index.ts 16 symbols

packages/deepsec/src/sandbox/orchestrator.ts 16 symbols

packages/deepsec/src/commands/export.ts 14 symbols

packages/core/src/types.ts 13 symbols

packages/core/src/paths.ts 13 symbols

Dependencies from manifests, versioned

@anthropic-ai/claude-agent-sdk 0.3.158 · 1×

@biomejs/biome 2.4.13 · 1×

@deepsec/core workspace:* · 1×

@deepsec/processor workspace:* · 1×

@deepsec/scanner workspace:* · 1×

@earendil-works/pi-coding-agent 0.79.10 · 1×

@openai/codex 0.125.0 · 1×

@openai/codex-sdk 0.125.0 · 1×

@types/node 22.0.0 · 1×

@vercel/oidc 3.4.0 · 1×

@vercel/sandbox 1.9.0 · 1×

commander 13.0.0 · 1×

Datastores touched

prodDatabase · 1 repos

For agents

$ claude mcp add deepsec \
  -- python -m otcore.mcp_server <graph>