
github.com/humanlayer/12-factor-agents @main sqlite

631 symbols · 1,304 edges · 175 files · 14 documented (2%)
README

12-Factor Agents - Principles for building reliable LLM applications

Code License: Apache 2.0 · Content License: CC BY-SA 4.0

In the spirit of 12 Factor Apps. The source for this project is public at https://github.com/humanlayer/12-factor-agents, and I welcome your feedback and contributions. Let's figure this out together!

Tip: Missed the AI Engineer World's Fair? Catch the talk here

Looking for Context Engineering? Jump straight to factor 3

Want to contribute to npx/uvx create-12-factor-agent? Check out the discussion thread

Hi, I'm Dex. I've been hacking on AI agents for a while.

I've tried every agent framework out there, from the plug-and-play crew/langchains to the "minimalist" smolagents of the world to the "production grade" langgraph, griptape, etc.

I've talked to a lot of really strong founders, in and out of YC, who are all building really impressive things with AI. Most of them are rolling the stack themselves. I don't see a lot of frameworks in production customer-facing agents.

I've been surprised to find that most of the products out there billing themselves as "AI Agents" are not all that agentic. A lot of them are mostly deterministic code, with LLM steps sprinkled in at just the right points to make the experience truly magical.

Agents, at least the good ones, don't follow the "here's your prompt, here's a bag of tools, loop until you hit the goal" pattern. Rather, they are made up mostly of just software.

So, I set out to answer:

What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?

Welcome to 12-factor agents. As every Chicago mayor since Daley has consistently plastered all over the city's major airports, we're glad you're here.

Special thanks to @iantbutler01, @tnm, @hellovai, @stantonk, @balanceiskey, @AdjectiveAllison, @pfbyjy, @a-churchill, and the SF MLOps community for early feedback on this guide.

The Short Version: The 12 Factors

Even if LLMs continue to get exponentially more powerful, there will be core engineering techniques that make LLM-powered software more reliable, more scalable, and easier to maintain.

Visual Nav

factor 1 factor 2 factor 3
factor 4 factor 5 factor 6
factor 7 factor 8 factor 9
factor 10 factor 11 factor 12

How we got here

For a deeper dive on my agent journey and what led us here, check out A Brief History of Software - a quick summary here:

The promise of agents

We're gonna talk a lot about Directed Graphs (DGs) and their Acyclic friends, DAGs. I'll start by pointing out that...well...software is a directed graph. There's a reason we used to represent programs as flow charts.

[Figure: 010-software-dag]

From code to DAGs

Around 20 years ago, we started to see DAG orchestrators become popular. We're talking classics like Airflow and Prefect, some predecessors, and some newer ones like Dagster, Inngest, and Windmill. These followed the same graph pattern, with the added benefit of observability, modularity, retries, administration, etc.

[Figure: 015-dag-orchestrators]
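
To make the DAG idea concrete, here is a minimal sketch of what an orchestrator does at its core: run each step after its dependencies, with retries. The `run_dag` helper and the three step names are hypothetical and are not taken from Airflow, Prefect, or any other tool mentioned above; only `graphlib.TopologicalSorter` is a real standard-library API.

```python
from graphlib import TopologicalSorter

def run_dag(steps, deps, max_retries=2):
    """Run each step after its dependencies, retrying failures.

    steps: dict mapping a step name to a zero-argument callable
    deps:  dict mapping a step name to the set of names it depends on
    """
    # static_order() yields every node after all of its predecessors
    order = list(TopologicalSorter(deps).static_order())
    results = {}
    for name in order:
        for attempt in range(max_retries + 1):
            try:
                results[name] = steps[name]()
                break
            except Exception:
                if attempt == max_retries:
                    raise  # out of retries: surface the failure
    return order, results

# Hypothetical three-step pipeline: extract -> transform -> load
log = []
steps = {
    "extract": lambda: log.append("extract") or "raw rows",
    "transform": lambda: log.append("transform") or "clean rows",
    "load": lambda: log.append("load") or "loaded",
}
deps = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}

order, results = run_dag(steps, deps)
print(order)  # ['extract', 'transform', 'load']
```

Every node and every edge is written down by an engineer ahead of time, which is exactly the part the agent pitch proposes to throw away.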

The promise of agents

I'm not the first person to say this, but my biggest takeaway when I started learning about agents was that you get to throw the DAG away. Instead of software engineers coding each step and edge case, you can give the agent a goal and a set of transitions:

[Figure: 025-agent-dag]

And let the LLM make decisions in real time to figure out the path

[Figure: 026-agent-dag-lines]

The promise here is that you write less software: you just give the LLM the "edges" of the graph and let it figure out the nodes. You can recover from errors, you can write less code, and you may find that LLMs find novel solutions to problems.

Agents as loops

As we'll see later, it turns out this doesn't quite work.

Let's dive one step deeper - with agents you've got this loop consisting of 4 steps:

  1. LLM determines the next step in the workflow, outputting structured JSON ("tool calling")
  2. Deterministic code executes the tool call
  3. The result is appended to the context window
  4. Repeat until the next step is determined to be "done"
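
initial_event = {"message": "..."}

async def agent_loop(llm, execute_step, initial_event):
    # `llm` and `execute_step` are supplied by the caller
    context = [initial_event]
    while True:
        # 1. The LLM picks the next step as structured JSON
        next_step = await llm.determine_next_step(context)
        context.append(next_step)

        # 4. Stop once the LLM decides we're done
        if next_step.intent == "done":
            return next_step.final_answer

        # 2-3. Deterministic code runs the tool; the result joins the context
        result = await execute_step(next_step)
        context.append(result)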

Our initial context is just the starting event (maybe a user message, maybe a cron firing, maybe a webhook, etc.), and we ask the LLM to choose the next step (tool) or to determine that we're done.
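
For a version of the loop that runs end to end, here is a minimal sketch with the LLM swapped for a scripted stand-in. Everything named here is an assumption for illustration: `ScriptedLLM`, the `add` tool, the `max_steps` bound, and the JSON shape with `intent`, `args`, and `final_answer` are not HumanLayer's API or any framework's.

```python
import asyncio
import json

class ScriptedLLM:
    """Stand-in for a real model: replays canned JSON 'tool calls'."""
    def __init__(self, replies):
        self._replies = iter(replies)

    async def determine_next_step(self, context):
        # A real implementation would send `context` to a model here.
        return json.loads(next(self._replies))

# Hypothetical tool registry: intent name -> deterministic function
TOOLS = {"add": lambda a, b: a + b}

async def execute_step(step):
    value = TOOLS[step["intent"]](**step["args"])
    return {"type": "tool_result", "value": value}

async def agent_loop(llm, initial_event, max_steps=10):
    context = [initial_event]
    for _ in range(max_steps):  # bounded, rather than `while True`
        next_step = await llm.determine_next_step(context)
        context.append(next_step)
        if next_step["intent"] == "done":
            return next_step["final_answer"], context
        context.append(await execute_step(next_step))
    raise RuntimeError("agent did not finish within max_steps")

llm = ScriptedLLM([
    '{"intent": "add", "args": {"a": 2, "b": 3}}',
    '{"intent": "done", "final_answer": "2 + 3 = 5"}',
])
answer, context = asyncio.run(agent_loop(llm, {"message": "what is 2 + 3?"}))
print(answer)        # 2 + 3 = 5
print(len(context))  # 4: event, tool call, tool result, done
```

The context list is the whole state of the run: the starting event, each structured step the model chose, and each tool result, in order.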

Here's a multi-step example:

[Animation: 027-agent-loop-animation]

Why 12-factor agents?

At the end of the day, this approach just doesn't work as well as we want it to.

In building HumanLayer, I've talked to at least 100 SaaS builders (mostly technical founders) looking to make their existing product more agentic. The journey usually goes something like:

  1. Decide you want to build an agent
  2. Product design, UX mapping, what problems to solve
  3. Want to move fast, so grab $FRAMEWORK and get to building
  4. Get to 70-80% quality bar
  5. Realize that 80% isn't good enough for most customer-facing features
  6. Realize that getting past 80% requires reverse-engineering the framework, prompts, flow, etc.
  7. Start over from scratch

Random Disclaimers

DISCLAIMER: I'm not sure of the exact right place to say this, but here seems as good as any: this is BY NO MEANS meant to be a dig on either the many frameworks out there, or the pretty dang smart people who work on them. They enable incredible things and have accelerated the AI ecosystem.

I hope that one outcome of this post is that agent framework builders can learn from the journeys of myself and others, and make frameworks even better.

Especially for builders who want to move fast but need deep control.

DISCLAIMER 2: I'm not going to talk about MCP. I'm sure you can see where it fits in.

DISCLAIMER 3: I'm using mostly TypeScript, for reasons, but all this stuff works in Python or any other language you prefer.

Anyways back to the thing...

Design Patterns for great LLM applications

After digging through hundreds of AI libraries and working with dozens of founders, my instinct is this:

  1. There are some cor

Extension points: exported contracts (how you extend this code)

ThreadStore (Interface)
(no doc) [11 implementers]
packages/create-12-factor-agent/template/src/state.ts
Section (Interface)
(no doc)
packages/walkthroughgen/src/cli.ts
Event (Interface)
(no doc)
workshops/2025-05/sections/04-baml-tests/src/agent.ts
Event (Interface)
(no doc)
workshops/2025-07-16/walkthrough/10-agent.ts
Event (Interface)
(no doc)
workshops/2025-05-17/sections/01-cli-and-agent/walkthrough/01-agent.ts
Event (Interface)
(no doc)
packages/create-12-factor-agent/template/src/agent.ts
WalkthroughData (Interface)
(no doc)
packages/walkthroughgen/src/cli.ts
Event (Interface)
(no doc)
workshops/2025-05/sections/01-cli-and-agent/walkthrough/01-agent.ts
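
The index lists `ThreadStore` without documentation, and `get`, `update`, and `create` from the same `state.ts` file are the most-called symbols in the repo. As a rough illustration of what a contract like that might look like, here is an in-memory sketch in Python. The method signatures, the `Thread` shape, and the ID scheme are all assumptions; the actual TypeScript interface may well differ.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional, Protocol

@dataclass
class Thread:
    """Assumed shape: an ordered list of events for one agent run."""
    events: list = field(default_factory=list)

class ThreadStore(Protocol):
    """Hypothetical contract; not copied from the repository."""
    def create(self, thread: Thread) -> str: ...
    def get(self, thread_id: str) -> Optional[Thread]: ...
    def update(self, thread_id: str, thread: Thread) -> None: ...

class InMemoryThreadStore:
    """One possible implementer: keeps threads in a dict."""
    def __init__(self) -> None:
        self._threads: dict = {}

    def create(self, thread: Thread) -> str:
        thread_id = str(uuid.uuid4())
        self._threads[thread_id] = thread
        return thread_id

    def get(self, thread_id: str) -> Optional[Thread]:
        return self._threads.get(thread_id)

    def update(self, thread_id: str, thread: Thread) -> None:
        if thread_id not in self._threads:
            raise KeyError(thread_id)
        self._threads[thread_id] = thread

# Usage: create a thread, append an event, and save it back
store: ThreadStore = InMemoryThreadStore()
tid = store.create(Thread(events=[{"type": "user_input", "data": "hi"}]))
thread = store.get(tid)
thread.events.append({"type": "done"})
store.update(tid, thread)
```

Keeping the store behind a small interface like this means the same agent code could run against memory in tests and a database in production.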

Core symbols: most depended-on inside this repo

get
called by 126
packages/create-12-factor-agent/template/src/state.ts
update
called by 52
packages/create-12-factor-agent/template/src/state.ts
create
called by 32
packages/create-12-factor-agent/template/src/state.ts
cli
called by 21
packages/walkthroughgen/src/cli.ts
agentLoop
called by 21
workshops/2025-05/walkthrough/01-agent.ts
agentLoop
called by 21
workshops/2025-05-17/walkthrough/01-agent.ts
agentLoop
called by 18
workshops/2025-07-16/walkthrough/10-agent.ts
withMockedConsole
called by 16
packages/walkthroughgen/test/utils/console-mock.ts

Shape

Function 291
Method 182
Class 112
Interface 46

Languages

TypeScript 93%
Python 7%

Modules by API surface

packages/walkthroughgen/src/cli.ts: 13 symbols
packages/create-12-factor-agent/template/src/agent.ts: 12 symbols
workshops/2025-07-16/walkthrough/10-agent.ts: 11 symbols
workshops/2025-05/walkthrough/10-agent.ts: 11 symbols
workshops/2025-05/sections/final/src/agent.ts: 11 symbols
workshops/2025-05/sections/12-humanlayer-webhook/src/agent.ts: 11 symbols
workshops/2025-05/sections/11-humanlayer-approval/src/agent.ts: 11 symbols
workshops/2025-05/sections/10-human-approval/walkthrough/10-agent.ts: 11 symbols
workshops/2025-05/final/src/agent.ts: 11 symbols
workshops/2025-05-17/walkthrough/10-agent.ts: 11 symbols
packages/create-12-factor-agent/template/src/state.ts: 10 symbols
workshops/2025-07-16/walkthrough/07b-agent.ts: 9 symbols

Dependencies: from manifests, versioned

@boundaryml/baml 0.85.0 · 1×
@types/diff 7.0.2 · 1×
@types/express 5.0.1 · 1×
@types/jest 29.5.14 · 1×
@types/js-yaml 4.0.9 · 1×
@types/node 20.0.0 · 1×
@typescript-eslint/eslint-plugin 6.0.0 · 1×
@typescript-eslint/parser 6.0.0 · 1×
baml 0.0.0 · 1×
diff 7.0.0 · 1×
eslint 8.0.0 · 1×
express 5.1.0 · 1×

For agents

$ claude mcp add 12-factor-agents \
  -- python -m otcore.mcp_server <graph>
