Learn to build AI agents locally without frameworks. Understand what happens under the hood before using production frameworks.

This repository teaches you to build AI agents from first principles using local LLMs and node-llama-cpp. By working through these examples, you'll understand what an agent does under the hood.
A Python version of this tutorial is available here: https://github.com/pguso/agents-from-scratch
Philosophy: Learn by building. Understand deeply, then use frameworks wisely.
This repository now has a matching companion website:
https://agentsfromscratch.com
The website is not a replacement for this repo, but a conceptual companion to it.
Recommended workflow:
- Use GitHub for running, modifying, and studying the code
- Use the website for mental models, explanations, and progression
Think of the site as the map and this repo as the terrain.
Place your GGUF models in the ./models/ folder; details are in DOWNLOAD.md.
npm install
node intro/intro.js
node simple-agent/simple-agent.js
node react-agent/react-agent.js
Follow these examples in order to build understanding progressively:
intro/ | Code | Code Explanation | Concepts
What you'll learn:
- Loading and running a local LLM
- Basic prompt/response cycle
Key concepts: Model loading, context, inference pipeline, token generation
openai-intro/ | Code | Code Explanation | Concepts
What you'll learn:
- How to call hosted LLMs (like GPT-4)
- Temperature control
- Token usage
Key concepts: Inference endpoints, network latency, cost vs control, data privacy, vendor dependence
translation/ | Code | Code Explanation | Concepts
What you'll learn:
- Using system prompts to specialize agents
- Output format control
- Role-based behavior
- Chat wrappers for different models
Key concepts: System prompts, agent specialization, behavioral constraints, prompt engineering
think/ | Code | Code Explanation | Concepts
What you'll learn:
- Configuring LLMs for logical reasoning
- Complex quantitative problems
- Limitations of pure LLM reasoning
- When to use external tools
Key concepts: Reasoning agents, problem decomposition, cognitive tasks, reasoning limitations
batch/ | Code | Code Explanation | Concepts
What you'll learn:
- Processing multiple requests concurrently
- Context sequences for parallelism
- GPU batch processing
- Performance optimization
Key concepts: Parallel execution, sequences, batch size, throughput optimization
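The idea of running several requests over a fixed number of sequences can be sketched without a model at all. The block below is a minimal illustration, not the code in batch/: `fakeInfer` is a hypothetical stand-in for one inference call, and `processBatch` and its `sequences` parameter are names invented for this sketch.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical stand-in for one inference call on one context sequence.
async function fakeInfer(prompt) {
  await sleep(10);
  return `echo: ${prompt}`;
}

// Process prompts with at most `sequences` requests in flight,
// returning results in the same order as the input prompts.
async function processBatch(prompts, infer, sequences = 2) {
  const results = new Array(prompts.length);
  let next = 0;

  // Each worker plays the role of one sequence: it pulls the next
  // unprocessed prompt until none are left.
  async function worker() {
    while (next < prompts.length) {
      const i = next++;
      results[i] = await infer(prompts[i]);
    }
  }

  const workerCount = Math.min(sequences, prompts.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

processBatch(["a", "b", "c"], fakeInfer, 2).then((results) => console.log(results));
```

Raising `sequences` increases throughput only while the hardware can actually run that many sequences at once; beyond that point the extra requests just wait.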
coding/ | Code | Code Explanation | Concepts
What you'll learn:
- Real-time streaming responses
- Token limits and budget management
- Progressive output display
- User experience optimization
Key concepts: Streaming, token-by-token generation, response control, real-time feedback
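Streaming with a token budget can be sketched with an async generator. This is an illustration under stated assumptions, not the code in coding/: `fakeTokenStream` is a hypothetical stand-in for a model, and its "tokens" are whitespace-delimited words rather than real model tokens.

```javascript
// Hypothetical stand-in for a model that yields tokens as they are generated.
// Splits after each whitespace character so the pieces rejoin to the original text.
async function* fakeTokenStream(text) {
  for (const token of text.split(/(?<=\s)/)) {
    yield token;
  }
}

// Consume a token stream, forwarding each token as it arrives
// and stopping once the budget is used up.
async function streamWithBudget(stream, { maxTokens, onToken }) {
  let output = "";
  let count = 0;
  for await (const token of stream) {
    if (count >= maxTokens) {
      // Returning here also closes the generator.
      return { output, count, truncated: true };
    }
    output += token;
    count++;
    onToken(token);
  }
  return { output, count, truncated: false };
}

streamWithBudget(fakeTokenStream("Streaming shows output as it is generated"), {
  maxTokens: 4,
  onToken: (token) => process.stdout.write(token),
}).then((result) => {
  console.log(`\n[tokens: ${result.count}, truncated: ${result.truncated}]`);
});
```

The `onToken` callback is where progressive display happens: the user sees text immediately instead of waiting for the full response.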
simple-agent/ | Code | Code Explanation | Concepts
What you'll learn:
- Function calling / tool use fundamentals
- Defining tools the LLM can use
- JSON Schema for parameters
- How LLMs decide when to use tools
Key concepts: Function calling, tool definitions, agent decision making, action-taking
This is where text generation becomes agency!
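To make the tool-use mechanics concrete, here is a minimal sketch of a tool registry and dispatcher. It is not the code in simple-agent/ and does not use the node-llama-cpp function-calling API: the tool names, the `{ name, arguments }` call shape, and `dispatchToolCall` are assumptions made for illustration, and the validator only covers required keys and primitive types.

```javascript
// Hypothetical tool registry: each tool has a description, a JSON Schema
// for its parameters, and a handler that does the actual work.
const tools = {
  getCurrentTime: {
    description: "Return the current time as an ISO 8601 string",
    parameters: { type: "object", properties: {}, required: [] },
    handler: () => new Date().toISOString(),
  },
  add: {
    description: "Add two numbers",
    parameters: {
      type: "object",
      properties: { a: { type: "number" }, b: { type: "number" } },
      required: ["a", "b"],
    },
    handler: ({ a, b }) => a + b,
  },
};

// Minimal check of required keys and string/number/boolean types.
// This is NOT a full JSON Schema validator.
function validateArgs(schema, args) {
  for (const key of schema.required ?? []) {
    if (!(key in args)) throw new Error(`Missing required parameter: ${key}`);
  }
  for (const [key, value] of Object.entries(args)) {
    const expected = schema.properties?.[key]?.type;
    if (expected && typeof value !== expected) {
      throw new Error(`Parameter ${key} must be of type ${expected}`);
    }
  }
}

// Execute a tool call that the model emitted as JSON text.
function dispatchToolCall(rawCall) {
  const { name, arguments: args = {} } = JSON.parse(rawCall);
  if (!Object.hasOwn(tools, name)) throw new Error(`Unknown tool: ${name}`);
  const tool = tools[name];
  validateArgs(tool.parameters, args);
  return tool.handler(args);
}

console.log(dispatchToolCall('{"name":"add","arguments":{"a":2,"b":3}}')); // 5
```

The descriptions and schemas are what the model sees when deciding whether to call a tool; the handlers are what your code runs once it does.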
simple-agent-with-memory/ | Code | Code Explanation | Concepts
What you'll learn:
- Persisting information across sessions
- Long-term memory management
- Facts and preferences storage
- Memory retrieval strategies
Key concepts: Persistent memory, state management, memory systems, context augmentation
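The sketch below shows one way the memory ideas fit together: store, serialize, restore, and retrieve into prompt context. It is an assumption-heavy illustration, not the repo's memory-manager.js: `MemoryStore` and its methods are hypothetical names, retrieval is naive word overlap, and the serialized string stands in for a file on disk.

```javascript
// Hypothetical memory store: facts and preferences kept as plain objects so
// the whole state can be serialized to JSON (for example, into a file).
class MemoryStore {
  constructor(entries = []) {
    this.entries = entries;
  }

  // Store a new item, or overwrite the value if the same type/key exists.
  remember(type, key, value) {
    const existing = this.entries.find((e) => e.type === type && e.key === key);
    if (existing) {
      existing.value = value;
    } else {
      this.entries.push({ type, key, value });
    }
  }

  // Naive retrieval: keep entries whose key or value shares a word with the query.
  recall(query) {
    const words = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
    return this.entries.filter((e) =>
      `${e.key} ${e.value}`.toLowerCase().split(/\W+/).some((w) => words.has(w))
    );
  }

  // Render matching memories as text that can be added to the system prompt.
  toPromptContext(query) {
    const hits = this.recall(query);
    if (hits.length === 0) return "";
    return "Known about the user:\n" + hits.map((e) => `- ${e.key}: ${e.value}`).join("\n");
  }

  serialize() {
    return JSON.stringify(this.entries);
  }

  static deserialize(json) {
    return new MemoryStore(JSON.parse(json));
  }
}

// Session 1: store something, then serialize (a real agent would write this to disk).
const session1 = new MemoryStore();
session1.remember("fact", "name", "Alex");
session1.remember("preference", "favorite language", "JavaScript");
const saved = session1.serialize();

// Session 2: restore and build context for a new user message.
const session2 = MemoryStore.deserialize(saved);
console.log(session2.toPromptContext("What is my name?"));
```

Only the entries relevant to the current message are injected, which keeps the context small as the store grows.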
react-agent/ | Code | Code Explanation | Concepts
What you'll learn:
- ReAct pattern (Reason → Act → Observe)
- Iterative problem solving
- Step-by-step tool use
- Self-correction loops
Key concepts: ReAct pattern, iterative reasoning, observation-action cycles, multi-step agents
This is the foundation of modern agent frameworks!
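The Reason → Act → Observe cycle is small enough to sketch in a few lines. The block below is an illustration, not the code in react-agent/: `fakeLLM` is a scripted, hypothetical stand-in for a real model call, and the step shape (`thought`, `action`, `input`, `answer`) is an assumption chosen for this sketch.

```javascript
// Hypothetical tools the agent may call.
const tools = {
  multiply: ({ a, b }) => a * b,
  add: ({ a, b }) => a + b,
};

// Scripted stand-in for a real model call. It returns the next step as
// { thought, action, input } or, when finished, { thought, answer }.
// This script solves "(3 * 4) + 5" in two tool calls.
async function fakeLLM(history) {
  const observations = history.filter((h) => h.role === "observation");
  if (observations.length === 0) {
    return { thought: "Multiply first.", action: "multiply", input: { a: 3, b: 4 } };
  }
  if (observations.length === 1) {
    return { thought: "Now add 5.", action: "add", input: { a: observations[0].content, b: 5 } };
  }
  return { thought: "Done.", answer: String(observations[1].content) };
}

async function runReActLoop(question, llm, maxSteps = 5) {
  const history = [{ role: "user", content: question }];
  for (let step = 0; step < maxSteps; step++) {
    // Reason: ask the model what to do next given everything so far.
    const next = await llm(history);
    history.push({ role: "thought", content: next.thought });
    if (next.answer !== undefined) return { answer: next.answer, history };

    // Act: run the requested tool. Unknown tools become an error observation
    // so the model gets a chance to correct itself on the next iteration.
    const observation = Object.hasOwn(tools, next.action)
      ? tools[next.action](next.input)
      : `Error: unknown tool "${next.action}"`;

    // Observe: feed the result back into the history.
    history.push({ role: "observation", content: observation });
  }
  throw new Error(`No answer after ${maxSteps} steps`);
}

runReActLoop("What is (3 * 4) + 5?", fakeLLM).then(({ answer }) => console.log(answer)); // prints 17
```

The `maxSteps` cap matters in practice: without it, a model that never produces an answer would loop forever.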
aot-agent/ | Code | Code Explanation | Concepts
What you'll learn:
- Atom of Thought methodology
- Atomic planning for multi-step computations
- Dependency management between operations
- Structured JSON output for reasoning plans
- Deterministic execution of plans
Key concepts: AoT planning, atomic operations, dependency resolution, plan validation, structured reasoning
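Plan validation and deterministic execution can be sketched as follows. This is a minimal illustration, not the code in aot-agent/: the plan shape (`id`, `op`, `args`, `dependsOn`) and the `$id` reference syntax are assumptions made for this example, and the plan is hard-coded where a model would normally emit it as JSON.

```javascript
const operations = {
  add: (a, b) => a + b,
  subtract: (a, b) => a - b,
  multiply: (a, b) => a * b,
};

// Example plan as a model might emit it. "$a1" refers to the result of atom a1.
const plan = [
  { id: "a1", op: "multiply", args: [6, 7], dependsOn: [] },
  { id: "a2", op: "add", args: [10, 5], dependsOn: [] },
  { id: "a3", op: "subtract", args: ["$a1", "$a2"], dependsOn: ["a1", "a2"] },
];

const isRef = (arg) => typeof arg === "string" && arg.startsWith("$");

// Reject plans with duplicate ids, unknown operations, or undeclared references.
function validatePlan(atoms) {
  const ids = new Set(atoms.map((a) => a.id));
  if (ids.size !== atoms.length) throw new Error("Duplicate atom id");
  for (const atom of atoms) {
    if (!Object.hasOwn(operations, atom.op)) throw new Error(`Unknown op: ${atom.op}`);
    for (const dep of atom.dependsOn) {
      if (!ids.has(dep)) throw new Error(`Atom ${atom.id} depends on missing atom ${dep}`);
    }
    for (const arg of atom.args) {
      if (isRef(arg) && !atom.dependsOn.includes(arg.slice(1))) {
        throw new Error(`Atom ${atom.id} uses ${arg} without declaring the dependency`);
      }
    }
  }
}

// Execute atoms in dependency order; no model is involved at this stage.
function executePlan(atoms) {
  validatePlan(atoms);
  const results = new Map();
  const pending = [...atoms];
  while (pending.length > 0) {
    // Pick the first atom whose dependencies are all resolved.
    const index = pending.findIndex((a) => a.dependsOn.every((d) => results.has(d)));
    if (index === -1) throw new Error("Circular dependency in plan");
    const [atom] = pending.splice(index, 1);
    const args = atom.args.map((arg) => (isRef(arg) ? results.get(arg.slice(1)) : arg));
    results.set(atom.id, operations[atom.op](...args));
  }
  return Object.fromEntries(results);
}

console.log(executePlan(plan)); // { a1: 42, a2: 15, a3: 27 }
```

Separating planning from execution means the arithmetic is done by code, so a plan that validates produces the same result on every run.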
error-handling/ | Code | Code Explanation | Concepts
What you'll learn:
- Typed error taxonomy (validation, LLM, tools, workflow) with stable codes
- Timeouts, retries with backoff/jitter, and classifying transient failures
- Graceful degradation when the LLM path fails (deterministic tool fallback)
- Orchestration-level errors (AgentWorkflowError) and correlation ids for support
Key concepts: Error taxonomy, retry policies, timeouts, fallbacks, degraded mode, observability, user-safe messaging
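The retry, timeout, and fallback ideas above can be sketched as three small helpers. This is an illustration under stated assumptions, not the code in error-handling/: `AgentError`, its codes, and the helper names are hypothetical, and the backoff policy (exponential with full jitter) is one common choice among several.

```javascript
// Hypothetical typed error with a stable code and a transient flag.
class AgentError extends Error {
  constructor(code, message, { transient = false } = {}) {
    super(message);
    this.name = "AgentError";
    this.code = code;
    this.transient = transient;
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Reject if the promise does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new AgentError("LLM_TIMEOUT", `Timed out after ${ms} ms`, { transient: true })),
      ms
    );
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Retry transient failures with exponential backoff and full jitter.
// Non-transient errors are rethrown immediately.
async function withRetry(operation, { retries = 3, baseDelayMs = 100, timeoutMs = 1000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await withTimeout(operation(attempt), timeoutMs);
    } catch (error) {
      const canRetry = error instanceof AgentError && error.transient && attempt < retries;
      if (!canRetry) throw error;
      await sleep(Math.random() * baseDelayMs * 2 ** attempt);
    }
  }
}

// Graceful degradation: if the LLM path fails, fall back to a deterministic tool
// and report the stable error code alongside the degraded result.
async function answerWithFallback(llmCall, fallbackTool, retryOptions) {
  try {
    return { mode: "llm", value: await withRetry(llmCall, retryOptions) };
  } catch (error) {
    return { mode: "degraded", value: fallbackTool(), errorCode: error.code ?? "UNKNOWN" };
  }
}
```

Classifying errors as transient or not is what keeps retries cheap: a validation failure will fail the same way every time, so retrying it only adds latency.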
tree-of-thought/ | Code | Code Explanation | Concepts
What you'll learn:
- Generating multiple candidate next actions from the same partial plan
- Ranking and pruning branches with a deterministic score in code
- Running a compact beam search loop with inspectable kept/pruned decisions
- Verifying the winning path with explicit sanity checks
Key concepts: Tree of Thought, beam search, branch pruning, verifiable objectives, search controllers
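A compact beam search loop looks roughly like this. The sketch uses a toy objective (pick steps that sum to a target) and is not the code in tree-of-thought/: `proposeNextSteps` is a hypothetical stand-in for the model's candidate generation, while the score and the final check are plain code.

```javascript
const TARGET = 11;
const ALLOWED_STEPS = [1, 3, 5];

// Hypothetical stand-in for the model proposing next actions for a partial plan.
const proposeNextSteps = (path) => ALLOWED_STEPS.map((step) => [...path, step]);

const sum = (path) => path.reduce((total, step) => total + step, 0);

// Deterministic score computed in code: closer to the target is better.
const score = (path) => -Math.abs(TARGET - sum(path));

function beamSearch({ depth, beamWidth }) {
  let beam = [[]];
  const log = [];
  for (let level = 1; level <= depth; level++) {
    // Expand every kept path, rank all candidates, keep the best few.
    const candidates = beam.flatMap(proposeNextSteps);
    candidates.sort((a, b) => score(b) - score(a));
    const kept = candidates.slice(0, beamWidth);
    const pruned = candidates.slice(beamWidth);
    log.push({ level, kept, pruned });
    beam = kept;
  }
  return { best: beam[0], log };
}

// Explicit sanity checks on the winning path.
function verify(path, depth) {
  return (
    path.length === depth &&
    path.every((step) => ALLOWED_STEPS.includes(step)) &&
    sum(path) === TARGET
  );
}

const { best, log } = beamSearch({ depth: 3, beamWidth: 2 });
console.log(best, verify(best, 3));
for (const { level, kept, pruned } of log) {
  console.log(`level ${level}: kept ${JSON.stringify(kept)}, pruned ${pruned.length}`);
}
```

Because the log records what was kept and pruned at every level, you can inspect exactly why a branch was dropped instead of trusting the model's own account.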
graph-of-thought/ | Code | Code Explanation | Concepts
What you'll learn:
- Modeling reasoning as a DAG: parallel source extracts → merge rules → final draft
- Resolving conflicts explicitly before generation (must_include, must_avoid, conflict_notes)
- Adding deterministic merge and draft compliance checks
- Running independent nodes in parallel to reduce latency
Key concepts: Graph of Thought, DAG orchestration, multi-source fusion, merge-before-generate, policy reconciliation
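The extract → merge → draft flow can be sketched as a tiny DAG. This is an illustration, not the code in graph-of-thought/: the `extractors` are hypothetical stand-ins for per-source model calls, the rule contents are made up for the example, and resolving conflicts by letting `must_avoid` win is an assumption, not necessarily the policy the example uses.

```javascript
// Hypothetical stand-ins for LLM extraction calls, one per source document.
const extractors = {
  legal: async () => ({ must_include: ["refund window: 30 days"], must_avoid: ["guarantee"] }),
  brand: async () => ({ must_include: ["friendly greeting"], must_avoid: ["jargon"] }),
  support: async () => ({ must_include: ["guarantee"], must_avoid: [] }),
};

// Deterministic merge: union the lists. Anything both required and forbidden
// is a conflict, resolved here (by assumption) by letting must_avoid win.
function mergeRules(extracts) {
  const include = new Set(extracts.flatMap((e) => e.must_include));
  const avoid = new Set(extracts.flatMap((e) => e.must_avoid));
  const conflict_notes = [];
  for (const item of include) {
    if (avoid.has(item)) {
      include.delete(item);
      conflict_notes.push(`"${item}" is both required and forbidden; must_avoid wins`);
    }
  }
  return { must_include: [...include], must_avoid: [...avoid], conflict_notes };
}

// Compliance check on the final draft (case-insensitive substring match).
function checkDraft(draft, rules) {
  const text = draft.toLowerCase();
  return {
    missing: rules.must_include.filter((s) => !text.includes(s.toLowerCase())),
    forbidden: rules.must_avoid.filter((s) => text.includes(s.toLowerCase())),
  };
}

async function runGraph(writeDraft) {
  // Source nodes do not depend on each other, so they run in parallel.
  const extracts = await Promise.all(Object.values(extractors).map((extract) => extract()));
  const rules = mergeRules(extracts); // merge node
  const draft = await writeDraft(rules); // draft node
  return { rules, draft, compliance: checkDraft(draft, rules) };
}

// A trivial draft writer in place of a model call.
runGraph(async (rules) => rules.must_include.join(". ")).then((result) => {
  console.log(result.rules.conflict_notes);
  console.log(result.compliance);
});
```

Merging before generating means the draft node receives one consistent set of rules instead of three sources that contradict each other.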
Decision guide: use ToT when you need to search competing paths; use GoT when you need to combine multiple sources into one consistent policy. Compare both in:
- ToT concept
- GoT concept
chain-of-thought/ | Code | Code Explanation | Concepts
What you'll learn:
- Splitting a high-stakes decision into explicit reasoning phases
- Preventing early bias with a facts-only extraction step
- Balancing fraud signals with legitimacy evidence before policy application
- Producing an auditable final decision with customer-safe and internal outputs
Key concepts: Chain of Thought, structured reasoning traces, policy-constrained decisions, explainability, review-ready workflows
tool-routing-embeddings/ | Code | Code Explanation | Concepts
What you'll learn:
- Precomputing embeddings for short exemplar phrases per tool
- Scoring the user message against exemplars (cosine similarity) with a small embedding model
- Passing only top-k tools (plus optional always-include tools) into session.prompt
- Observing recall failure when k is too small for multi-intent prompts
Key concepts: Tool routing, embedding similarity, exemplar design, context/token savings, pinned tools, retrieval-style agent design
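The routing steps above can be sketched end to end with a toy embedding. This is an illustration, not the code in tool-routing-embeddings/: `embed` is a bag-of-words stand-in for a real embedding model, and the tool names, exemplar phrases, and `routeTools` signature are assumptions made for this example.

```javascript
// Stand-in for an embedding model: word counts over a tiny fixed vocabulary.
const VOCAB = ["weather", "rain", "forecast", "add", "sum", "calculate", "email", "send", "message"];
const embed = (text) => {
  const words = text.toLowerCase().split(/\W+/);
  return VOCAB.map((term) => words.filter((w) => w === term).length);
};

function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Precompute exemplar embeddings per tool once, at startup.
const exemplars = {
  getWeather: ["what is the weather forecast", "will it rain"],
  calculator: ["add these numbers", "calculate the sum"],
  sendEmail: ["send an email message"],
};
const index = Object.entries(exemplars).map(([tool, phrases]) => ({
  tool,
  vectors: phrases.map(embed),
}));

// Score each tool by its best-matching exemplar, keep the top k,
// and always add the pinned tools.
function routeTools(message, k, alwaysInclude = []) {
  const query = embed(message);
  const ranked = index
    .map(({ tool, vectors }) => ({
      tool,
      score: Math.max(...vectors.map((v) => cosineSimilarity(query, v))),
    }))
    .sort((a, b) => b.score - a.score);
  const selected = ranked.slice(0, k).map((r) => r.tool);
  return [...new Set([...alwaysInclude, ...selected])];
}

console.log(routeTools("Will it rain tomorrow?", 1));
// Multi-intent prompt: k = 1 drops a tool the user needs, k = 2 recovers it.
console.log(routeTools("Calculate the sum and send an email", 1));
console.log(routeTools("Calculate the sum and send an email", 2));
```

Only the returned tool names would then be passed to the model, which is where the token savings come from. The multi-intent case shows the cost: with k too small, a tool the user needs never reaches the model.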
Each example folder contains:
- <name>.js - The working code example
- CODE.md - Step-by-step code explanation
- CONCEPT.md - High-level concepts

AI Agent = LLM + System Prompt + Tools + Memory + Reasoning Pattern
           ─┬─   ──────┬──────   ──┬──   ──┬───   ────────┬────────
            │          │           │       │              │
          Brain     Identity     Hands   State         Strategy
1. intro → Basic LLM usage
2. translation → Specialized behavior (system prompts)
3. think → Reasoning ability
4. batch → Parallel processing
5. coding → Streaming & control
6. simple-agent → Tool use (function calling)
7. simple-agent-with-memory → Persistent state
8. react-agent → Strategic reasoning + tool use
Simple Agent (Steps 1-5)
User → LLM → Response
Tool-Using Agent (Step 6)
User → LLM ⟷ Tools → Response
Memory Agent (Step 7)
User → LLM ⟷ Tools → Response
        ↕
     Memory
ReAct Agent (Step 8)
User → LLM → Think → Act → Observe
        ↑      ↓      ↓      ↓
        └──────┴──────┴──────┘
         Iterate until solved
helper/prompt-debugger.js
Utility for debugging prompts sent to the LLM. Shows exactly what the model sees, including:
- System prompts
- Function definitions
- Conversation history
- Context state
Usage example in simple-agent/simple-agent.js
ai-agents/
├── README.md ← You are here
├── examples/
│   ├── 01_intro/
│   │   ├── intro.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 02_openai-intro/
│   │   ├── openai-intro.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 03_translation/
│   │   ├── translation.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 04_think/
│   │   ├── think.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 05_batch/
│   │   ├── batch.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 06_coding/
│   │   ├── coding.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 07_simple-agent/
│   │   ├── simple-agent.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 08_simple-agent-with-memory/
│   │   ├── simple-agent-with-memory.js
│   │   ├── memory-manager.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 09_react-agent/
│   │   ├── react-agent.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 10_aot-agent/
│   │   ├── aot-agent.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 11_error-handling/
│   │   ├── error-handling.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 12_tree-of-thought/
│   │   ├── tree-of-thought.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 13_graph-of-thought/
│   │   ├── graph-of-thought.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   ├── 14_chain-of-thought/
│   │   ├── chain-of-thought.js
│   │   ├── CODE.md
│   │   └── CONCEPT.md
│   └── 15_tool-routing-embeddings/
│       ├── tool-routing-embeddings.js
│       ├── CODE.md
│       └── CONCEPT.md
├── helper/
│ └── prompt-debugger.js
├── models/ ← Place your GGUF models here
└── logs/ ← Debug outputs