github.com/dosco/graphjin @v3.18.41 sqlite

32,401 symbols · 58,897 edges · 632 files · 7,312 documented (23%)
README

GraphJin - The Governed Data Plane for AI Agents

GraphJin is a compiler and runtime that lets AI agents connect to the systems a real company already has: databases, warehouses, files, source code, workflows, metadata, and security policy. Instead of handing an agent raw credentials and hoping it guesses correctly, GraphJin gives it one governed GraphQL + MCP surface where it can discover before acting, validate queries, run approved work, and observe runtime status.

It is not only for agents. GraphJin is still a high-performance GraphQL-to-database compiler, Go library, standalone API service, REST/OpenAPI gateway, and real-time subscription server. The agent use case is where everything comes together: the same compiler that serves your apps can also give AI a smart, auditable way to work across data, code, and operations.

Works with PostgreSQL, MySQL, MongoDB, SQLite, Oracle, MSSQL, Snowflake, Redshift, BigQuery, Apache Cassandra / Amazon Keyspaces, S3/GCS/local files, CodeSQL source indexes - and models from Claude/GPT-4 to local 7B models.

Why GraphJin For Agents

  • One governed surface for many systems - Query operational databases, warehouses, MongoDB, object stores, local files, CodeSQL source indexes, workflows, and GraphJin system roots through GraphQL and MCP.
  • Smart discovery before action - Agents start with query_catalog(search: "<user instruction>"), graphql_help, relationship evidence, examples, config recipes, and safety notes before writing or running queries.
  • Guarded action, not raw access - Source-mode access, query allow-lists, read-only boundaries, policy-aware MCP tools, local encrypted secrets, and gj_config preview/apply keep changes auditable.
  • Operational awareness - gj_security, gj_runtime, and the built-in console expose policy and bounded runtime status so agents can check what is safe before they act.

Installation

npm (all platforms)

npm install -g graphjin

macOS (Homebrew)

brew install dosco/graphjin/graphjin

Windows (Scoop)

scoop bucket add graphjin https://github.com/dosco/graphjin-scoop
scoop install graphjin

Linux

Download .deb/.rpm from releases

Docker

docker pull dosco/graphjin

Try It Now

This is a quick way to try out GraphJin. The --demo flag runs a curated local demo, creates local state under the example's demo/ folder, and reuses that state on later starts. Delete demo/ to reset from scratch.

Download the source, which contains the webshop demo:

git clone https://github.com/dosco/graphjin
cd graphjin

Now launch the GraphJin service that you installed with one of the options above:

graphjin serve --demo --path examples/webshop

For a larger agent-driven example with Postgres operations data, a BigQuery simulator for roast telemetry, CodeSQL over internal business code, and executable workflows:

graphjin serve --demo --path examples/coffee-roastery

You'll see output like this:

GraphJin started
───────────────────────
  Web UI:      http://localhost:8080/
  GraphQL:     http://localhost:8080/api/v1/graphql
  REST API:    http://localhost:8080/api/v1/rest/
  Workflows:   http://localhost:8080/api/v1/workflows/<name>
  MCP:         http://localhost:8080/api/v1/mcp

Add GraphJin To Your AI Client

Local / Dev

Use GraphJin's helper when you want one command that normalizes the URL, probes auth, and installs the right Codex or Claude config:

graphjin mcp add codex
graphjin mcp add claude
graphjin mcp add all http://localhost:8080

Defaults are client=codex, server=http://localhost:8080, and project scope. The command normalizes the server to http://localhost:8080/api/v1/mcp. Local non-TLS HTTP is correct for loopback development; hosted servers should use HTTPS.
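
The normalization described above can be pictured with a small Go sketch. This is not GraphJin's implementation: normalizeMCPURL is a hypothetical helper, and rejecting plain HTTP for non-loopback hosts is a stricter policy chosen here only to illustrate the loopback-versus-hosted distinction.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

const mcpPath = "/api/v1/mcp"

// normalizeMCPURL (hypothetical) appends the MCP path to a server URL,
// defaulting to the local dev server when no URL is given.
func normalizeMCPURL(server string) (string, error) {
	if server == "" {
		server = "http://localhost:8080"
	}
	u, err := url.Parse(server)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("server %q must include a scheme and host", server)
	}
	host := u.Hostname()
	loopback := host == "localhost" || host == "127.0.0.1" || host == "::1"
	// Assumption for this sketch: non-TLS HTTP is only acceptable on loopback.
	if u.Scheme == "http" && !loopback {
		return "", fmt.Errorf("plain HTTP to non-loopback host %q; use https", host)
	}
	path := strings.TrimRight(u.Path, "/")
	if !strings.HasSuffix(path, mcpPath) {
		path += mcpPath
	}
	u.Path, u.RawPath = path, ""
	return u.String(), nil
}

func main() {
	for _, in := range []string{"", "http://localhost:8080", "https://graphjin.example.com/"} {
		out, err := normalizeMCPURL(in)
		fmt.Printf("%q -> %q (err: %v)\n", in, out, err)
	}
}
```

Making the function idempotent (an already-normalized URL is returned unchanged) means the same input can be passed through repeatedly without growing the path.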

If you prefer native client commands, add GraphJin's Streamable HTTP endpoint directly:

codex mcp add graphjin --url http://localhost:8080/api/v1/mcp
claude mcp add --transport http graphjin http://localhost:8080/api/v1/mcp

GraphJin's /api/v1/mcp endpoint is Streamable HTTP, so Claude should use --transport http for GraphJin. SSE is only for older/custom MCP servers.

Use --global when you want the MCP connection available outside the current project:

graphjin mcp add codex --global

Codex can also add non-URL stdio MCP servers with the generic command shape:

codex mcp add <server-name> -- <command> [args...]

Hosted GraphJin With OAuth

When mcp.oauth.enabled: true is configured on a hosted GraphJin server, modern MCP clients can add it by URL and handle OAuth login themselves:

codex mcp add graphjin --url https://graphjin.example.com/api/v1/mcp
claude mcp add --transport http graphjin https://graphjin.example.com/api/v1/mcp

This is the native remote-MCP path. GraphJin serves OAuth protected-resource metadata, authorization-server metadata, DCR/CIMD discovery, and MCP 401 challenges so the client can discover login automatically. See the official OpenAI Docs MCP quickstart for the Codex mcp add --url flow and the Claude Code MCP docs for Claude's HTTP transport and authentication flow.
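
To make the discovery step more concrete, here is a minimal Go sketch of what a client might do with a 401 challenge: read the resource_metadata parameter from a Bearer WWW-Authenticate header, as defined by RFC 9728. The header value in main is made up for illustration, the function name is hypothetical, and the parser is deliberately simplified (it does not handle commas inside quoted values).

```go
package main

import (
	"fmt"
	"strings"
)

// resourceMetadataURL (hypothetical helper) extracts the resource_metadata
// parameter from a "Bearer ..." WWW-Authenticate challenge.
// Simplification: parameters are split on commas, so quoted values that
// themselves contain commas are not supported.
func resourceMetadataURL(challenge string) (string, bool) {
	scheme, params, ok := strings.Cut(strings.TrimSpace(challenge), " ")
	if !ok || !strings.EqualFold(scheme, "Bearer") {
		return "", false
	}
	for _, part := range strings.Split(params, ",") {
		key, val, ok := strings.Cut(strings.TrimSpace(part), "=")
		if ok && strings.EqualFold(strings.TrimSpace(key), "resource_metadata") {
			return strings.Trim(strings.TrimSpace(val), `"`), true
		}
	}
	return "", false
}

func main() {
	// Illustrative header only; the exact challenge a server sends may differ.
	h := `Bearer realm="mcp", resource_metadata="https://graphjin.example.com/.well-known/oauth-protected-resource"`
	if u, ok := resourceMetadataURL(h); ok {
		fmt.Println("fetch protected-resource metadata from:", u)
	}
}
```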

For legacy/custom SSE servers, use Claude's SSE transport explicitly:

claude mcp add --transport sse <name> <url>
claude mcp add --transport sse private-api https://api.company.com/sse \
  --header "X-API-Key: your-key-here"

Legacy / Current auth_login Fallback

If a server still uses GraphJin's current auth_login device-code flow instead of standards OAuth, graphjin mcp add detects that automatically:

graphjin mcp add codex https://graphjin.example.com

The command opens the device-code login, saves ~/.config/graphjin/client.json, and installs a credential-free local proxy config for the AI client. Re-run graphjin mcp setup https://graphjin.example.com later only when you want to refresh or rotate that saved CLI/proxy token.

The deprecated aliases still work for scripts:

graphjin mcp install codex https://graphjin.example.com
graphjin mcp plugin install https://graphjin.example.com   # deprecated Claude alias

Authenticate The CLI

Before graphjin cli can talk to a server, point it at one. There are no --server or --token flags — both come from a single saved config file (~/.config/graphjin/client.json, mode 0600):

graphjin cli setup http://localhost:8080            # local dev, no auth needed
graphjin cli setup https://graphjin.example.com     # signs in via the server's OIDC IdP

What setup does, depending on the server:

  • No built-in login (the server has auth_login.enabled: false): saves only the URL. CLI calls send no Authorization header.
  • Built-in login enabled: kicks off an RFC 8628 device-code flow. The CLI prints a verification URL + short code, opens your browser, you sign in with the configured identity provider (Google, Okta, Keycloak, Auth0-as-IdP, Azure AD — anything OIDC), and the server mints a 30-day JWT. Both URL and JWT are saved to client.json.

After setup every graphjin cli ... command just works:

graphjin cli health
graphjin cli query list
graphjin cli schema tables
graphjin cli setup show       # print the saved config (token redacted)
graphjin cli setup logout     # delete client.json
graphjin cli setup            # re-run sign-in against the same server (refresh token)

To enable built-in login, set this on the server:

auth:
  type: jwt
  jwt:
    secret: "long-random-shared-secret"   # used to sign and verify local JWTs

auth_login:
  enabled: true
  audience_graphjin: true                 # shorthand for audience: "graphjin-cli"
  oidc:
    issuer_url: "https://accounts.google.com"
    client_id: "..."
    client_secret: "..."                  # or $GJ_AUTH_LOGIN_OIDC_CLIENT_SECRET
    allowed_domains: ["example.com"]      # optional allow-list

mcp:
  oauth:
    enabled: true
    mode: builtin                         # reuses auth_login identity
    scopes: ["mcp"]

Successful authentication is recorded in structured logs with the verified email and name claims (when present), giving you a clean audit trail of who called every endpoint.

Getting started

To use GraphJin with your own databases, first create a new GraphJin app, configure it through its config files, and then launch GraphJin.

Step 1: Create New GraphJin App

graphjin new my-app

Step 2: Start the GraphJin Service

graphjin serve --path ./my-app

Step 3: Add GraphJin to an AI client

graphjin mcp add claude http://localhost:8080

Step 4: Ask Claude questions like:

  • "What tables are in the database?"
  • "Show me all products under $50"
  • "List customers and their purchases"
  • "What's the total revenue by product?"
  • "Find products with 'wireless' in the name"
  • "Add a new product called 'USB-C Cable' for $19.99"

How It Works

  1. Connects to database - Reads your schema automatically
  2. Discovers relationships - Foreign keys become navigable joins
  3. Exposes metadata - gj_* tables make discovered databases, tables, columns, relationships, functions, and indexes queryable when the GraphJin source is enabled
  4. Indexes source code - CodeSQL turns tree-sitter syntax trees and database references into a managed SQLite database
  5. Exposes MCP tools - Teach any LLM the query syntax
  6. Runs JS workflows - Chain multiple GraphJin MCP tools in one reusable workflow
  7. Compiles to SQL - Every request becomes a single optimized query

No resolvers. No ORM. No N+1 queries. Just point and query.
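
As a toy illustration of step 2, the Go sketch below treats tables as nodes and foreign keys as edges, then searches for a chain of join conditions between two tables. It is not GraphJin's compiler or its relationship-discovery code; the types, function names, and the example schema (including column names) are all hypothetical.

```go
package main

import "fmt"

// foreignKey says Table.Column references RefTable.RefColumn.
type foreignKey struct {
	Table, Column       string
	RefTable, RefColumn string
}

// edge is one hop in the relationship graph plus its join condition.
type edge struct {
	To string
	On string
}

// buildGraph makes every foreign key navigable in both directions.
func buildGraph(fks []foreignKey) map[string][]edge {
	g := make(map[string][]edge)
	for _, fk := range fks {
		on := fmt.Sprintf("%s.%s = %s.%s", fk.Table, fk.Column, fk.RefTable, fk.RefColumn)
		g[fk.Table] = append(g[fk.Table], edge{To: fk.RefTable, On: on})
		g[fk.RefTable] = append(g[fk.RefTable], edge{To: fk.Table, On: on})
	}
	return g
}

// joinPath runs a breadth-first search and returns the join conditions on a
// shortest path from one table to another.
func joinPath(g map[string][]edge, from, to string) ([]string, bool) {
	type step struct {
		table string
		conds []string
	}
	seen := map[string]bool{from: true}
	queue := []step{{table: from}}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		if cur.table == to {
			return cur.conds, true
		}
		for _, e := range g[cur.table] {
			if seen[e.To] {
				continue
			}
			seen[e.To] = true
			conds := append(append([]string{}, cur.conds...), e.On)
			queue = append(queue, step{table: e.To, conds: conds})
		}
	}
	return nil, false
}

func main() {
	// Hypothetical webshop-style schema.
	g := buildGraph([]foreignKey{
		{"orders", "customer_id", "customers", "id"},
		{"items", "order_id", "orders", "id"},
		{"items", "product_id", "products", "id"},
	})
	conds, ok := joinPath(g, "customers", "products")
	fmt.Println(ok, conds)
}
```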

CodeSQL: Query Source Code Like a Database

CodeSQL is a managed source kind for source trees. Configure a source folder and GraphJin creates a SQLite cache under config/codesql/, indexes it with tree-sitter, and updates it on restart. In development it also watches for changes while the service runs; in production live watching is disabled.

sources:
  - name: app
    kind: sql
    type: postgres
    connection_string: postgres://app:secret@db/app
    default: true

  - name: code
    kind: codesql
    path: /srv/app
    infer_db_refs: true

  - name: graphjin
    kind: graphjin
    metadata: true

tables:
  - name: users
    source: app

  - name: gj_code
    source: code
    read_only: true

GraphJin exposes CodeSQL through one ordinary GraphQL root, gj_code. Use kind to select files, symbols, references, imports, database references, docs, parse errors, change sets, and locks:

query {
  gj_code(where: { kind: { eq: "symbol" }, name: { iregex: "handler|resolver" } }, limit: 20) {
    name
    symbol_kind
    language
    start_row
    path
    hash
  }
}
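
On the client side, the result of a query like the one above could be decoded roughly as follows. This sketch assumes the standard GraphQL response envelope ({"data": ..., "errors": ...}) and guesses the field types (for example, start_row as an integer); the sample payload is invented for illustration and is not real GraphJin output.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// codeSymbol mirrors the fields selected in the gj_code query.
// Field types are assumptions.
type codeSymbol struct {
	Name       string `json:"name"`
	SymbolKind string `json:"symbol_kind"`
	Language   string `json:"language"`
	StartRow   int    `json:"start_row"`
	Path       string `json:"path"`
	Hash       string `json:"hash"`
}

// gjCodeResponse is the conventional GraphQL response envelope.
type gjCodeResponse struct {
	Data struct {
		GjCode []codeSymbol `json:"gj_code"`
	} `json:"data"`
	Errors []struct {
		Message string `json:"message"`
	} `json:"errors"`
}

// decodeSymbols returns the gj_code rows, or the first GraphQL error.
func decodeSymbols(raw []byte) ([]codeSymbol, error) {
	var resp gjCodeResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	if len(resp.Errors) > 0 {
		return nil, errors.New(resp.Errors[0].Message)
	}
	return resp.Data.GjCode, nil
}

func main() {
	// Invented sample payload.
	sample := []byte(`{"data":{"gj_code":[{"name":"queryHandler","symbol_kind":"function","language":"go","start_row":42,"path":"api/handler.go","hash":"abc123"}]}}`)
	syms, err := decodeSymbols(sample)
	if err != nil {
		panic(err)
	}
	for _, s := range syms {
		fmt.Printf("%s (%s, %s) at %s:%d\n", s.Name, s.SymbolKind, s.Language, s.Path, s.StartRow)
	}
}
```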

With a kind: graphjin source, GraphJin creates a read-only system graph named graphjin by default. Schema, catalog, entrypoint, capability, workflow, and system metadata are catalog items in gj_catalog; table and column metadata are selected by kind. When one CodeSQL source is active, GraphJin links catalog items to code references automatically:

query {
  gj_catalog(where: { kind: { eq: "column" }, table_name: { eq: "users" }, column_name: { eq: "email" } }) {
    database_name
    table_name
    column_name
    gj_code {
      kind
      ref_kind
      path
      symbol_id
    }
  }
}

This is where the combined model pays off: the same agent can inspect production data systems and the code that operates them. It can ask, "which handlers touch customer invoices?", "what tables do these workflows depend on?", or "show me the imports and call sites near this data path" without switching tools or inventing a new backend.

What AI Can Do

Simple queries with filters:

{ products(where: { price: { gt: 50 } }, limit: 10) { id name price } }

Nested relationships:

{
  orders(limit: 5) {
    id total
    customer { name email }
    items { quantity product { name category { name } } }
  }
}

Aggregations:

{ products { count_id sum_price avg_price } }

Analytics directives:

{
  orders {
    account_id
    month
    total
    running_total: total @running(aggregate: sum, by: "account_id", orderBy: { month: asc })
    moving_avg_total: total @moving(aggregate: avg, rows: 6, by: "account_id", orderBy: { month: asc })
    previous_total: total @previous(by: "account_id", orderBy: { month: asc })
    rank_by_total: total @rank(by: "account_id", order: desc)
  }
}

Use analytics directives when each original row should remain visible while adding report metrics such as running totals, moving averages, previous/next values, first/last values, and rank within a group. Ordinary one-row-per-group summaries still use distinct plus aggregate fields. Supported SQL databases validate analytics support a
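
To show what "each original row remains visible" means in practice, the Go sketch below computes a running total, a moving average, and a previous value per account over in-memory rows. It only illustrates the reporting semantics; GraphJin compiles these directives to SQL, and the window definition used here (the current row plus up to window-1 preceding rows) is an assumption, not a documented rule.

```go
package main

import (
	"fmt"
	"sort"
)

type orderRow struct {
	AccountID int
	Month     string // e.g. "2024-01"; sorts chronologically as a string
	Total     float64
}

// annotated keeps the original row and adds the report metrics.
type annotated struct {
	orderRow
	RunningTotal  float64
	MovingAvg     float64
	PreviousTotal *float64 // nil for the first row of each account
}

// annotate groups rows by account, orders them by month, and adds metrics
// without collapsing any rows.
func annotate(rows []orderRow, window int) []annotated {
	if window < 1 {
		window = 1
	}
	sorted := append([]orderRow(nil), rows...)
	sort.SliceStable(sorted, func(i, j int) bool {
		if sorted[i].AccountID != sorted[j].AccountID {
			return sorted[i].AccountID < sorted[j].AccountID
		}
		return sorted[i].Month < sorted[j].Month
	})

	out := make([]annotated, 0, len(sorted))
	start, running := 0, 0.0 // start: first index of the current account's rows
	for i, r := range sorted {
		if i == 0 || sorted[i-1].AccountID != r.AccountID {
			start, running = i, 0
		}
		running += r.Total
		a := annotated{orderRow: r, RunningTotal: running}
		if i > start {
			prev := sorted[i-1].Total
			a.PreviousTotal = &prev
		}
		// Moving average over the current row and up to window-1 earlier
		// rows of the same account (assumed window semantics).
		lo := i - window + 1
		if lo < start {
			lo = start
		}
		sum := 0.0
		for _, w := range sorted[lo : i+1] {
			sum += w.Total
		}
		a.MovingAvg = sum / float64(i-lo+1)
		out = append(out, a)
	}
	return out
}

func main() {
	rows := []orderRow{
		{1, "2024-02", 50}, {2, "2024-01", 10}, {1, "2024-01", 100}, {1, "2024-03", 30},
	}
	for _, a := range annotate(rows, 2) {
		fmt.Printf("account=%d month=%s total=%.0f running=%.0f moving_avg=%.1f\n",
			a.AccountID, a.Month, a.Total, a.RunningTotal, a.MovingAvg)
	}
}
```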

Extension points (exported contracts: how you extend this code)

Executor (Interface)
Executor is the seam between the pure planner/assembler and the real database. M1 tests supply a fake; M2 implements it [12 …
cassandradriver/resolve.go
ResponseCacheProvider (Interface)
ResponseCacheProvider defines the interface for response caching. This is implemented by the service layer (serv package [6 …
core/cache.go
DDLDialect (Interface)
DDLDialect defines how to generate DDL for a specific database [9 implementers]
core/schema_ddl.go
Backend (Interface)
Backend is the contract every filesystem implementation satisfies. Methods take a context so the engine's request deadli [6 …
core/fstable/backend.go
AuthProvider (Interface)
AuthProvider attaches authentication to outgoing requests to an upstream API. Implementations are constructed once per s [7 …
core/openapi/auth.go
Executor (Interface)
Executor is the seam between the resolver and the database: production runs SQL on the wrapped clickhouse-go *sql.DB, te [12 …
clickhousedriver/resolve.go
CursorCache (Interface)
CursorCache is the interface for MCP cursor caching. It maps short numeric IDs to encrypted cursor strings for LLM-friend [5 …
serv/mcp_cursor_cache.go
JWTProvider (Interface)
JWTProvider is the interface to define providers for doing JWT authentication. [4 implementers]
auth/provider/provider.go

Core symbols (most depended-on inside this repo)

WriteString
called by 5104
core/internal/dialect/dialect.go
Error
called by 736
core/trace.go
SetError
called by 717
core/internal/dialect/dialect.go
EnterRule
called by 539
tests/hostedemu/snowflake/internal/sfparser/snowflake_parser.go
ExitRule
called by 539
tests/hostedemu/snowflake/internal/sfparser/snowflake_parser.go
Id_
called by 479
tests/hostedemu/snowflake/internal/sfparser/snowflake_parser.go
Close
called by 409
serv/cache.go
Quote
called by 325
core/internal/dialect/dialect.go

Shape

Method 24,054
Function 6,281
Struct 1,408
Interface 595
TypeAlias 44
FuncType 13
Class 6

Languages

Go 100%
TypeScript 1%

Modules by API surface

tests/hostedemu/snowflake/internal/sfparser/snowflake_parser.go · 20,074 symbols
tests/hostedemu/snowflake/internal/sfparser/snowflakeparser_base_listener.go · 1,087 symbols
tests/hostedemu/snowflake/internal/sfparser/snowflakeparser_listener.go · 1,083 symbols
tests/hostedemu/snowflake/internal/sfparser/snowflakeparser_visitor.go · 542 symbols
tests/hostedemu/snowflake/internal/sfparser/snowflakeparser_base_visitor.go · 542 symbols
core/schema_ddl.go · 160 symbols
core/internal/dialect/mongodb.go · 160 symbols
core/api.go · 143 symbols
core/internal/dialect/dialect.go · 135 symbols
core/internal/dialect/mssql.go · 134 symbols
core/internal/dialect/sqlite.go · 110 symbols
core/internal/dialect/mysql.go · 101 symbols

Dependencies (from manifests, versioned)

cel.dev/expr v0.25.1 · 1×
cloud.google.com/go v0.123.0 · 1×
cloud.google.com/go/auth v0.20.0 · 1×
cloud.google.com/go/auth/oauth2adapt v0.2.8 · 1×
cloud.google.com/go/compute/metadata v0.9.0 · 1×
cloud.google.com/go/monitoring v1.24.3 · 1×
cloud.google.com/go/storage v1.62.1 · 1×
dario.cat/mergo v1.0.0 · 1×
filippo.io/edwards25519 v1.1.0 · 1×
github.com/99designs/go-keychain v0.0.0-2019100805025 · 1×
github.com/99designs/keyring v1.2.2 · 1×

Datastores touched

users · Collection · 1 repo
categories · Collection · 1 repo
chats · Collection · 1 repo
comments · Collection · 1 repo
events · Collection · 1 repo
graph_edge · Collection · 1 repo
graph_node · Collection · 1 repo
locations · Collection · 1 repo

For agents

$ claude mcp add graphjin \
  -- python -m otcore.mcp_server <graph>