One programming model for building with LLMs across TypeScript, Python, Java, C++, Go, and Rust.
Ax is TypeScript-first and ships today as @ax-llm/ax. The same signatures,
provider mappings, agents, flows, runtime contracts, and optimizers are also
compiled into verified generated Python, Java, C++, Go, and Rust libraries.
💬 Follow @dosco on X for new releases and to chat about the project.
Signatures can be written as strings, with the f() builder, or with any Standard Schema v1 validator (Zod, Valibot, ArkType).

.returns(...) projection.

| Ecosystem | Package / import | Status |
|---|---|---|
| TypeScript / JavaScript | `@ax-llm/ax`<br>`import { ai, ax, agent, flow } from "@ax-llm/ax"` | Published on npm |
| Python | `axllm`<br>`from axllm import ai, ax, agent, flow` | Published on PyPI |
| Java | `dev.axllm:ax`<br>`import dev.axllm.ax.*` | Published on Maven Central |
| C++ | `axllm::axllm`<br>`#include <axllm/axllm.hpp>` | CMake FetchContent (source build) |
| Go | `github.com/ax-llm/ax/packages/go`<br>`import ax "github.com/ax-llm/ax/packages/go"` | Installable with go get; opt-in runtime/goja actor runtime |
| Rust | `axllm`<br>`use axllm::{ai, ax, agent, flow};` | Published on crates.io; protocol-first code runtime |
```mermaid
flowchart LR
  S["Signature (string, f, Standard Schema)"] --> G["AxGen typed generation"]
  G --> P["Provider descriptors / AI clients"]
  G --> A["AxAgent"]
  G --> F["AxFlow"]
  G --> O["GEPA / optimizer artifacts"]
  C["Shared Ax semantics"] --> TS["TypeScript"]
  C --> PY["Python"]
  C --> JV["Java"]
  C --> CP["C++"]
  C --> GO["Go"]
  C --> RS["Rust"]
```
The TypeScript package is the source implementation and the current published package:
```typescript
import { ai, ax } from "@ax-llm/ax";

const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY });

const classify = ax(
  'review:string -> sentiment:class "positive, negative, neutral"',
);

const { sentiment } = await classify.forward(llm, {
  review: "This product is amazing!",
});
// sentiment: "positive", typed as the literal union
```
No prompt engineering. Switch name: "openai" to "anthropic", "google-gemini", "mistral", "deepseek", "grok", etc. — same signature, same code.
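The quick-start result is typed as a literal union derived from the `class "positive, negative, neutral"` declaration. As a rough illustration of how a string literal can become a union type in TypeScript, here is a minimal sketch using template literal types. The names `Trim`, `SplitOptions`, and `parseClassOptions` are hypothetical and are not Ax's actual type machinery.

```typescript
// Hypothetical helper: strip leading and trailing spaces from a string literal type.
type Trim<S extends string> = S extends ` ${infer R}`
  ? Trim<R>
  : S extends `${infer R} `
    ? Trim<R>
    : S;

// Hypothetical helper: turn "a, b, c" into the union "a" | "b" | "c".
type SplitOptions<S extends string> = S extends `${infer Head},${infer Tail}`
  ? Trim<Head> | SplitOptions<Tail>
  : Trim<S>;

// Runtime counterpart of the type above: the same split, producing the allowed values.
function parseClassOptions<S extends string>(options: S): SplitOptions<S>[] {
  return options.split(",").map((o) => o.trim()) as unknown as SplitOptions<S>[];
}

type Sentiment = SplitOptions<"positive, negative, neutral">;

const allowed = parseClassOptions("positive, negative, neutral");
const ok: Sentiment = "positive"; // compiles
// const bad: Sentiment = "angry"; // would be rejected by the compiler

console.log(allowed, ok);
```

The compile-time union and the runtime list come from the same string, which is why a single signature can drive both the prompt and the result type.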
The generated Python, Java, C++, Go, and Rust libraries expose the same top-level Ax
ideas in native package shapes. Their generated source is checked in under
packages/<language> so the supported APIs are easy to inspect. The repo
runner uses those committed packages and runs examples without asking you to
remember compiler commands:
```shell
npm run example -- list
npm run example -- python src/examples/python/generation/axgen-openai.py
npm run example -- java src/examples/java/generation/BasicGenerationExample.java
npm run example -- cpp src/examples/cpp/generation/basic_generation.cpp
npm run example -- go src/examples/go/generation/basic_generation.go
npm run example -- rust src/examples/rust/generation/basic_generation.rs
```
See src/examples/README.md for runnable examples,
docs/RELEASE.md for package/release shape, and
docs/COMPILER.md for how the language-agnostic Ax
compiler works. When AxIR changes, run npm run axir:generate-packages to
refresh the checked-in packages.
Ax is designed to stay in the same latency class as direct provider calls while adding typed outputs, validation, retries, tools, tracing, and memory. The hot path is intentionally thin: render the signature, call the provider, parse the result, and return a typed value.
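The four hot-path steps named above (render, call, parse, return) can be pictured with a toy model. This is a sketch under stated assumptions, not Ax internals: `ToySignature`, `render`, `parse`, `forwardOnce`, and the stub provider are all hypothetical names, and the signature is reduced to one input and one class-style output.

```typescript
// Hypothetical provider shape: takes a prompt, resolves to raw completion text.
type Provider = (prompt: string) => Promise<string>;

// Hypothetical, heavily simplified signature: one input, one class-style output.
interface ToySignature {
  input: string;
  output: string;
  options: readonly string[];
}

// Step 1: render the signature and the input value into a prompt.
function render(sig: ToySignature, value: string): string {
  return `${sig.input}: ${value}\n${sig.output} (one of ${sig.options.join(", ")}):`;
}

// Step 3: parse the raw completion into one of the allowed values, or throw.
function parse(sig: ToySignature, raw: string): string {
  const candidate = raw.trim().toLowerCase();
  if (!sig.options.includes(candidate)) {
    throw new Error(`Unexpected ${sig.output}: ${raw}`);
  }
  return candidate;
}

// Steps 1 to 4 chained: render, call the provider, parse, return.
async function forwardOnce(provider: Provider, sig: ToySignature, value: string): Promise<string> {
  const prompt = render(sig, value);
  const raw = await provider(prompt); // step 2: the only network-bound step
  return parse(sig, raw);
}

// A stub stands in for a real model call so the sketch runs offline.
const stubProvider: Provider = async () => " Positive\n";

const sentimentSig: ToySignature = {
  input: "review",
  output: "sentiment",
  options: ["positive", "negative", "neutral"],
};

forwardOnce(stubProvider, sentimentSig, "This product is amazing!").then((s) => console.log(s));
```

Everything except the provider call is cheap string work, which is the intuition behind staying in the same latency class as a direct call.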
Streaming is the default because it lets Ax do useful work before the model finishes: parse fields as they arrive, run streaming assertions, fail early, cancel the in-flight stream, and start correction without spending tokens on an output that is already known to be invalid. When you only want a final object, forward() still gives you one; when you want incremental output, streamingForward() exposes the stream directly.
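To make the fail-early idea concrete, the sketch below checks an assertion on the partial text after every chunk and stops consuming the stream as soon as the check fails. It assumes a plain async generator in place of a provider stream; `fakeModelStream`, `consumeWithAssertion`, and `noApology` are hypothetical names, and Ax's own streaming assertions may work differently.

```typescript
// Tracks what the fake stream did, so the early stop is observable.
const streamState = { produced: 0, cancelled: false };

// Hypothetical stand-in for a provider stream.
async function* fakeModelStream(chunks: string[]): AsyncGenerator<string> {
  try {
    for (const chunk of chunks) {
      streamState.produced += 1;
      yield chunk;
    }
  } finally {
    // Runs when the consumer stops early (and also on normal completion).
    if (streamState.produced < chunks.length) streamState.cancelled = true;
  }
}

// An assertion returns an error message, or undefined when the partial text is fine.
type Assertion = (partial: string) => string | undefined;

async function consumeWithAssertion(
  stream: AsyncIterable<string>,
  check: Assertion,
): Promise<{ ok: boolean; text: string; error?: string }> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk;
    const error = check(text);
    if (error !== undefined) {
      // Leaving the for-await loop calls the iterator's return(),
      // which stops the generator before it produces further chunks.
      return { ok: false, text, error };
    }
  }
  return { ok: true, text };
}

const noApology: Assertion = (partial) =>
  partial.toLowerCase().includes("sorry") ? "answer must not apologize" : undefined;

consumeWithAssertion(fakeModelStream(["I am ", "sorry, ", "but ", "no."]), noApology).then(
  (result) => console.log(result, streamState),
);
```

In this run the last two chunks are never requested, which is the token saving the paragraph describes.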
The repo includes a streaming benchmark for checking overhead on your own providers and models:
```shell
AX_STREAM_BENCH_PROVIDER=anthropic \
AX_STREAM_BENCH_MODEL=claude-sonnet-4-5-20250929 \
AX_STREAM_BENCH_RUNS=2 \
AX_STREAM_BENCH_WARMUP_RUNS=0 \
npm run tsx src/examples/streaming-latency.ts

AX_STREAM_BENCH_PROVIDER=google-gemini \
AX_STREAM_BENCH_MODEL=gemini-3.5-flash \
AX_STREAM_BENCH_RUNS=2 \
AX_STREAM_BENCH_WARMUP_RUNS=0 \
npm run tsx src/examples/streaming-latency.ts
```
Recent runs on Claude Haiku/Sonnet and Gemini Flash/Flash Lite show provider queueing and model generation dominate total latency; AxGen stays close to the raw ai.chat() path while providing the structured-output control loop that direct SDK calls leave to application code.
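If you want to compare two streaming paths yourself, the measurement boils down to time-to-first-chunk and total time per stream. The helper below is a minimal sketch of that measurement, not the repo's benchmark script; `timeStream` and `delayedChunks` are hypothetical names, and the delayed generator stands in for a real provider stream.

```typescript
interface StreamTiming {
  firstChunkMs: number; // -1 if the stream produced nothing
  totalMs: number;
  chunks: number;
}

// Consume any async iterable and record when the first chunk and the end arrive.
async function timeStream(stream: AsyncIterable<unknown>): Promise<StreamTiming> {
  const start = Date.now();
  let firstChunkMs = -1;
  let chunks = 0;
  for await (const _chunk of stream) {
    if (chunks === 0) firstChunkMs = Date.now() - start;
    chunks += 1;
  }
  return { firstChunkMs, totalMs: Date.now() - start, chunks };
}

// Hypothetical stub: yields `count` chunks, waiting `delayMs` before each one.
async function* delayedChunks(count: number, delayMs: number): AsyncGenerator<string> {
  for (let i = 0; i < count; i++) {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    yield `chunk-${i}`;
  }
}

timeStream(delayedChunks(3, 10)).then((timing) => console.log(timing));
```

Running the same helper over a raw provider stream and over a wrapped one, with identical prompts, gives two `StreamTiming` values whose difference approximates the wrapper's overhead.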
```typescript
const extract = ax(`
  customerEmail:string, currentDate:datetime ->
  priority:class "high, normal, low",
  sentiment:class "positive, negative, neutral",
  ticketNumber?:number,
  nextSteps:string[],
  estimatedResponseTime:string
`);

const result = await extract.forward(llm, {
  customerEmail: "Order #12345 hasn't arrived. Need this resolved immediately!",
  currentDate: new Date(),
});
```
Nested object signatures can be declared with the f() builder:

```typescript
import { ax, f } from "@ax-llm/ax";

const productExtractor = f()
  .input("productPage", f.string())
  .output("product", f.object({
    name: f.string(),
    price: f.number(),
    specs: f.object({
      dimensions: f.object({ width: f.number(), height: f.number() }),
      materials: f.array(f.string()),
    }),
    reviews: f.array(f.object({ rating: f.number(), comment: f.string() })),
  }))
  .build();

const gen = ax(productExtractor);
const { product } = await gen.forward(llm, { productPage: "..." });
// product.specs.dimensions.width is typed end-to-end
```
Any Standard Schema v1 validator works wherever f.* is accepted — at field level, whole-object level, or on a fn() tool. Same retry pipeline, same type inference, no adapter.
```typescript
import { z } from "zod";
import { ax, f, fn } from "@ax-llm/ax";

// (1) Per-field zod: mix freely with f.* fields
const reviewSentiment = ax(
  f()
    .input("productName", z.string().describe("Reviewed product"))
    .input("reviewText", z.string().min(10))
    .output("sentiment", z.enum(["positive", "neutral", "negative"]))
    .output("score", z.number().min(1).max(10))
    .output("keyPoints", z.array(z.string()))
    .build(),
);

// (2) Whole-object zod: declare once, decomposed into ordered fields
const productSummary = ax(
  f()
    .input(z.object({ productName: z.string(), buyerProfile: z.string() }))
    .output(z.object({
      headline: z.string(),
      pros: z.array(z.string()),
      cons: z.array(z.string()),
      recommendation: z.enum(["buy", "wait", "skip"]),
    }))
    .build(),
);

// (3) Whole-object zod on fn(): typed tool definition
const lookupProduct = fn("lookupProduct")
  .description("Look up a product by name")
  .arg(z.object({ productName: z.string().min(1), includeSpecs: z.boolean().optional() }))
  .returns(z.object({ price: z.number(), inStock: z.boolean(), rating: z.number().min(1).max(5) }))
  .handler(async ({ productName }) => ({ price: 79.99, inStock: true, rating: 4.3 }))
  .build();
```
.min(), .max(), .email(), .url(), .regex() feed the normal retry pipeline; .refine(), .transform(), and .superRefine() execute at parse time on complete field values, in both streaming and non-streaming. Cache breakpoints and internal reasoning fields use companion options: { cache: true }, { internal: true }. Multimodal inputs (image, audio, file) still use f.*.
Runnable: src/examples/standard-schema.ts.
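To show what "feeding the retry pipeline" can mean in practice, here is a minimal sketch of a validate-and-retry loop built on the Standard Schema v1 `~standard.validate` contract. The interface subset is declared locally to keep the block self-contained. `generateValidated`, `Model`, `score1to10`, and `flakyModel` are hypothetical names, and Ax's real retry pipeline is not shown in this section and may differ.

```typescript
// Minimal local subset of the Standard Schema v1 interface.
type Result<T> =
  | { value: T; issues?: undefined }
  | { issues: ReadonlyArray<{ message: string }> };

interface SchemaV1<T> {
  "~standard": {
    version: 1;
    vendor: string;
    validate: (value: unknown) => Result<T> | Promise<Result<T>>;
  };
}

// Hand-written validator, roughly the check that z.number().min(1).max(10) performs.
const score1to10: SchemaV1<number> = {
  "~standard": {
    version: 1,
    vendor: "example",
    validate: (value) =>
      typeof value === "number" && value >= 1 && value <= 10
        ? { value }
        : { issues: [{ message: "score must be a number between 1 and 10" }] },
  },
};

// Hypothetical model stub: receives the previous validation errors as feedback.
type Model = (feedback: string[]) => Promise<unknown>;

async function generateValidated<T>(
  model: Model,
  schema: SchemaV1<T>,
  maxRetries = 3,
): Promise<{ value: T; attempts: number }> {
  let feedback: string[] = [];
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const raw = await model(feedback);
    const result = await schema["~standard"].validate(raw);
    if (result.issues) {
      // Validation failed: carry the messages into the next attempt.
      feedback = result.issues.map((issue) => issue.message);
      continue;
    }
    return { value: result.value, attempts: attempt };
  }
  throw new Error(`Validation failed after ${maxRetries} attempts: ${feedback.join("; ")}`);
}

// Returns an out-of-range score first, then a valid one once it sees feedback.
const flakyModel: Model = async (feedback) => (feedback.length === 0 ? 42 : 7);

generateValidated(flakyModel, score1to10).then((r) => console.log(r));
```

Because the loop only depends on `~standard.validate`, any validator that implements the spec can be swapped in without an adapter.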
```typescript
// weatherAPI and newsAPI are your own functions, defined elsewhere.
const assistant = ax("question:string -> answer:string", {
  functions: [
    { name: "getCurrentWeather", func: weatherAPI },
    { name: "searchNews", func: newsAPI },
  ],
});

const { answer } = await assistant.forward(llm, {
  question: "What's the weather in Tokyo and any news about it?",
});
```
```typescript
const analyze = ax(`
  image:image, question:string ->
  description:string,
  mainColors:string[],
  category:class "electronics, clothing, food, other",
  estimatedPrice:string
`);
```
Batch speech APIs are exposed by AI services: ai.transcribe({ audio }) turns audio into text, and ai.speak({ text }) turns text into an audio artifact. Signature audio outputs are scripted artifacts: the model writes the text for speech:audio, then Ax synthesizes it after parsing.
```typescript
const say = ax("question:string -> speech:audio, summary:string");

const res = await say.forward(llm, { question: "Greet the team." }, {
  speech: { speak: { voice: "alloy", format: "mp3" } },
});

console.log(res.speech.data); // base64 audio
console.log(res.speech.transcript); // generated script
```
Agents transcribe :audio inputs before the planner/executor/responder stages, so tools and memory receive stable text rather than base64 payloads. Native conversational audio is still available through .chat().
OpenAI supports both request-based audio chat (gpt-audio, gpt-audio-mini) and realtime voice/transcription models (gpt-realtime-2, gpt-realtime-whisper). Gemini native audio uses the Live API under the same .chat() shape; Grok Voice uses the realtime voice endpoint.
These same three audio paths ship in all five generated ports (Python, Go, Rust, Java, C++): batch transcribe()/speak(), .chat() with input_audio content parts, and realtime voice over WebSocket — realtime-capable models route transparently through chat(), or you can call the productized realtime_chat() driver directly (Go: RealtimeChat). Each port ships an offline realtime_audio_turn example and an opt-in dependency for the socket (see the install notes below).
```typescript
import WebSocket from "ws";
import {
  ai,
  axAIOpenAIRealtimeDefaultConfig,
  axAIOpenAIRealtimeTranscriptionDefaultConfig,
} from "@ax-llm/ax";

const voice = ai({
  name: "openai",
  apiKey: process.env.OPENAI_APIKEY!,
  config: axAIOpenAIRealtimeDefaultConfig(), // gpt-realtime-2
});

const stream = await voice.chat(
  { chatPrompt: [{ role: "user", content: "Say hello out loud." }] },
  { stream: true, webSocket: WebSocket },
);

for await (const chunk of stream) {
  const audio = chunk.results[0]?.audio;
  if (audio?.isDelta) {
    // base64 pcm16 audio bytes
    process.stdout.write(".");
  }
}

const transcriber = ai({
  name: "openai",
  apiKey: process.env.OPENAI_APIKEY!,
  config: axAIOpenAIRealtimeTranscriptionDefaultConfig(), // gpt-realtime-whisper
});
```
Runnable: src/examples/audio-chat.ts streams realtime audio, saves a WAV, and plays it when a local player is available. src/examples/audio-batch-and-agent.ts writes generated MP3 artifacts under src/examples/output/ and plays them immediately.
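When handling realtime deltas that carry base64 pcm16 audio, it can help to know how much audio has arrived without decoding it. The sketch below estimates duration from the base64 lengths alone. It assumes each delta is an independently padded base64 string of 16-bit mono PCM, and the 24 kHz default sample rate is an assumption that the text above does not state. `base64ByteLength` and `pcm16DurationSeconds` are hypothetical helpers, not part of Ax.

```typescript
// Number of bytes encoded by one padded base64 string.
function base64ByteLength(b64: string): number {
  const padding = b64.endsWith("==") ? 2 : b64.endsWith("=") ? 1 : 0;
  return (b64.length * 3) / 4 - padding;
}

// Assumptions: 16-bit mono PCM; the sample rate default is a guess, pass your own.
function pcm16DurationSeconds(deltas: string[], sampleRateHz = 24_000): number {
  const totalBytes = deltas.reduce((sum, delta) => sum + base64ByteLength(delta), 0);
  const bytesPerSample = 2; // 16 bits per sample
  return totalBytes / bytesPerSample / sampleRateHz;
}

// Two deltas of 24,000 bytes each: 48,000 bytes, 24,000 samples.
const deltas = ["A".repeat(32_000), "A".repeat(32_000)];
console.log(pcm16DurationSeconds(deltas));
```

Check the actual sample rate and channel count of your provider before relying on the result.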
AxAgent is a three-stage pipeline that turns a signature into a long-running, tool-using actor. Each forward() call runs distiller → executor → responder.
```mermaid
flowchart LR
  IN["inputs"] --> D["Distiller"]
  D --> E["Executor (RLM loop)"]
  E --> RT["AxJSRuntime sandbox"]
  E --> FN["functions / child agents"]
  E --> M["recall - memories"]
  E --> SK["consult - skills"]
  E --> RES["Responder"]
  RES --> OUT["typed output"]
```
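The control flow in the diagram can be pictured as three functions with a bounded loop in the middle. The toy model below is only a sketch: the `Stages` interface, `runPipeline`, and the stub stages are hypothetical, and throwing when the turn budget runs out is a choice made for this sketch, not documented AxAgent behavior.

```typescript
// One executor turn either records a note and continues, or finishes with findings.
type ExecutorStep =
  | { done: false; note: string }
  | { done: true; findings: string[] };

// Hypothetical stage contracts for the toy model.
interface Stages {
  distill: (inputs: Record<string, string>) => Promise<string>;
  executeTurn: (task: string, notes: string[]) => Promise<ExecutorStep>;
  respond: (findings: string[]) => Promise<{ answer: string; evidence: string[] }>;
}

async function runPipeline(
  stages: Stages,
  inputs: Record<string, string>,
  maxTurns: number,
): Promise<{ answer: string; evidence: string[]; turns: number }> {
  const task = await stages.distill(inputs); // stage 1: reduce inputs to a task
  const notes: string[] = [];
  for (let turn = 1; turn <= maxTurns; turn++) {
    const step = await stages.executeTurn(task, notes); // stage 2: bounded loop
    if (step.done) {
      const output = await stages.respond(step.findings); // stage 3: typed output
      return { ...output, turns: turn };
    }
    notes.push(step.note);
  }
  throw new Error(`Executor did not finish within ${maxTurns} turns`);
}

// Stub stages: the executor finishes once it has collected two notes.
const stubStages: Stages = {
  distill: async (inputs) => inputs["query"] ?? "",
  executeTurn: async (_task, notes) =>
    notes.length < 2
      ? { done: false, note: `note ${notes.length + 1}` }
      : { done: true, findings: [...notes] },
  respond: async (findings) => ({ answer: `Found ${findings.length} items`, evidence: findings }),
};

runPipeline(stubStages, { context: "...", query: "What changed?" }, 20).then((r) => console.log(r));
```
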
```typescript
import { agent, AxJSRuntime } from "@ax-llm/ax";

const analyzer = agent(
  "context:string, query:string -> answer:string, evidence:string[]",
  {
    agentIdentity: {
      name: "documentAnalyzer",
      description: "Analyze long documents with iterative code + sub-queries",
    },
    contextFields: ["context"],
    runtime: new AxJSRuntime(),
    maxTurns: 20,
    maxRuntimeChars: 2_000,
    contextPolicy: { preset: "checkpointed", budget: "balanced" },
    executorOptions: { model: "gpt-5.4-mini" },
  },
);

const result = await analyzer.forward(llm, { context: