github.com/promptfoo/promptfoo @code-scan-action-0.1.8 sqlite

12,815 symbols · 42,214 edges · 2,845 files · 1,274 documented (10%)
README

Promptfoo: LLM evals & red teaming

promptfoo is a CLI and library for evaluating and red-teaming LLM apps. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

Website · Getting Started · Red Teaming · Documentation · Discord

Promptfoo is now part of OpenAI. Promptfoo remains open source and MIT licensed. Read the company update.

Quick Start

Requires Node.js ^20.20.0 or >=22.22.0 for npm and npx usage.

npm install -g promptfoo
promptfoo init --example getting-started

Also available via brew install promptfoo and pip install promptfoo. You can also use npx promptfoo@latest to run any command without installing.

Most LLM providers require an API key. Set yours as an environment variable:

export OPENAI_API_KEY=sk-abc123

Once you're in the example directory, run an eval and view results:

cd getting-started
promptfoo eval
promptfoo view

See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.

What can you do with Promptfoo?

  • Test your prompts and models with automated evaluations
  • Secure your LLM apps with red teaming and vulnerability scanning
  • Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
  • Automate checks in CI/CD
  • Review pull requests for LLM-related security and compliance issues with code scanning
  • Share results with your team
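To make the "side-by-side" idea concrete, here is a minimal sketch of an evaluation matrix: every prompt template is rendered for every test case and sent to every model, and each cell records whether a simple substring check passed. All names here (`Model`, `TestCase`, `Cell`, `render`, `runMatrix`) are hypothetical and defined locally for illustration. This is not Promptfoo's implementation or API, and the stub models stand in for real LLM calls so the sketch runs offline.

```typescript
// Hypothetical types for illustration; none of these are promptfoo's own API.
type Model = { id: string; complete: (prompt: string) => string };
type TestCase = { vars: Record<string, string>; expectContains: string };
type Cell = { model: string; prompt: string; pass: boolean };

// Replace {{name}} placeholders with values from vars (missing keys become "").
function render(template: string, vars: Record<string, string>): string {
  return template.replace(
    /\{\{(\w+)\}\}/g,
    (_match: string, key: string) => vars[key] ?? "",
  );
}

// Run every prompt template against every model for every test case.
function runMatrix(prompts: string[], models: Model[], tests: TestCase[]): Cell[] {
  const cells: Cell[] = [];
  for (const template of prompts) {
    for (const test of tests) {
      const prompt = render(template, test.vars);
      for (const model of models) {
        const output = model.complete(prompt);
        cells.push({
          model: model.id,
          prompt,
          pass: output.includes(test.expectContains),
        });
      }
    }
  }
  return cells;
}

// Stub models stand in for real LLM calls so the sketch runs offline.
const models: Model[] = [
  { id: "echo", complete: (p) => p },
  { id: "shout", complete: (p) => p.toUpperCase() },
];

const results = runMatrix(
  ["Say hello to {{name}}"],
  models,
  [{ vars: { name: "Ada" }, expectContains: "Ada" }],
);
```

With these stubs, the "echo" model passes the check and the "shout" model fails it, because upper-casing the output removes the exact substring "Ada". A real setup would replace the stubs with provider calls and the substring check with richer assertions.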

Here's what it looks like in action:

[Screenshot: prompt evaluation matrix in the web viewer]

It works on the command line too:

[Screenshot: promptfoo command line]

It can also generate security vulnerability reports:

[Screenshot: gen AI red team]

Why Promptfoo?

  • Developer-first: Fast, with features like live reload and caching
  • Private: LLM evals run 100% locally - your prompts never leave your machine
  • Flexible: Works with any LLM API or programming language
  • Battle-tested: Powers LLM apps serving 10M+ users in production
  • Data-driven: Make decisions based on metrics, not gut feel
  • Open source: MIT licensed, with an active community

Contributing

We welcome contributions! Check out our contributing guide to get started.

Join our Discord community for help and discussion.

Extension points: exported contracts (how you extend this code)

CleanupProvider (Interface)
Interface for providers that need cleanup on process exit. [12 implementers]
src/providers/providerRegistry.ts
ApiProvider (Interface)
(no doc) [159 implementers]
test/smoke/fixtures/frontend-ts-provider/provider.ts
ApiProvider (Interface)
(no doc) [245 implementers]
src/types/providers.ts
AdversarialAudioProvider (Interface)
(no doc) [4 implementers]
src/redteam/audio/adversarialProvider.ts
ChunkSendResult (Interface)
Result of attempting to send a chunk
src/share.ts
ProviderCallQueue (Interface)
(no doc) [2 implementers]
src/scheduler/providerCallQueue.ts
ApiFunc (FuncType)
Function type definitions for the API functions
src/golang/wrapper.go
GeneratedTestCase (Interface)
A generated test case with variables and assertions
src/commands/mcp/tools/generateTestCases.ts
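
As a rough illustration of extending the code through the `ApiProvider` contract listed above, the sketch below implements a toy provider with an `id()` method and an async `callApi()` method. The interface shapes are declared locally and are assumptions for illustration. The real definitions live in src/types/providers.ts and may carry more fields and parameters. `UppercaseEchoProvider` is a hypothetical name.

```typescript
// Minimal local stand-ins for the provider contract. The field names are
// assumptions for illustration, not copied from src/types/providers.ts.
interface ProviderResponse {
  output?: string;
  error?: string;
}

interface ApiProvider {
  id(): string;
  callApi(prompt: string): Promise<ProviderResponse>;
}

// Hypothetical provider that returns the prompt in upper case.
class UppercaseEchoProvider implements ApiProvider {
  id(): string {
    return "uppercase-echo";
  }

  async callApi(prompt: string): Promise<ProviderResponse> {
    // Report a failure through the response instead of throwing.
    if (prompt.trim() === "") {
      return { error: "empty prompt" };
    }
    return { output: prompt.toUpperCase() };
  }
}
```

Returning an `error` field rather than throwing is a design choice in this sketch: it lets a caller record a failed call as data and continue with the remaining calls.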

Core symbols: most depended-on inside this repo

callApi
called by 3047
test/smoke/fixtures/providers/echo-ts.ts
parse
called by 1476
src/app/src/polyfills/scroll-timeline.js
push
called by 1337
src/scheduler/providerRateLimitState.ts
mockProcessEnv
called by 1297
test/util/utils.ts
resolve
called by 1157
src/app/src/polyfills/scroll-timeline.js
error
called by 934
src/progress/ciProgressReporter.ts
get
called by 699
src/providers/elevenlabs/cache.ts
send
called by 672
src/providers/openai/codex-app-server.ts

Shape

Function 7,080
Method 2,935
Interface 1,460
Class 1,293
Route 44
Enum 1
FuncType 1
Struct 1

Languages

TypeScript 95%
Python 5%
Go 1%

Modules by API surface

src/app/src/polyfills/scroll-timeline.js · 282 symbols
src/evaluator.ts · 168 symbols
src/providers/openai/codex-app-server.ts · 149 symbols
src/redteam/plugins/codingAgent/verifiers.ts · 134 symbols
src/providers/openai/codex-sdk.ts · 83 symbols
src/providers/http.ts · 83 symbols
src/tracing/otlpReceiver.ts · 72 symbols
src/providers/a2a/index.ts · 67 symbols
examples/openai-agents/agent_provider_test.py · 67 symbols
src/models/eval.ts · 64 symbols
src/providers/opencode-sdk.ts · 61 symbols
src/providers/openai/realtime.ts · 61 symbols

Used by: 1 indexed graph (manifest dependencies, hub-wide)

Dependencies: from manifests, versioned

github.com/sashabaranov/go-openai v1.37.0 · 1×
@actions/core 3.0.0 · 1×
@actions/exec 3.0.0 · 1×
@actions/github 9.1.0 · 1×
@ai-sdk/openai 3.0.41 · 1×
@anthropic-ai/claude-agent-sdk 0.3.167 · 1×
@anthropic-ai/sdk 0.101.0 · 1×
@apidevtools/json-schema-ref-parser 15.3.1 · 1×
@asteasolutions/zod-to-openapi 8.5.0 · 1×
@aws-sdk/client-bedrock-agent-runtime 3.1045.0 · 1×
@aws-sdk/client-bedrock-runtime 3.1045.0 · 1×
@aws-sdk/client-s3 3.1003.0 · 1×

Datastores touched

(mongodb) Database · 1 repo

For agents

$ claude mcp add promptfoo \
  -- python -m otcore.mcp_server <graph>