github.com/koxudaxi/datamodel-code-generator @0.67.0 sqlite

12,269 symbols 44,563 edges 2,367 files 4,694 documented · 38%
README

datamodel-code-generator

🚀 Generate Python data models from schema definitions in seconds.

🧪 Try it in your browser: Playground

Note on Playground privacy: generation runs locally in your browser with Pyodide. Schemas and options are not sent to a backend. Shared repro URLs encode them in the URL fragment (#state=...), which browsers do not send to the server; the full URL can still be stored in your browser history or wherever you share it.
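
To illustrate why a fragment stays on the client, here is a minimal sketch using Python's standard `urllib.parse`. The host name and the `state` payload are made up for illustration; only the `#state=...` shape comes from the note above.

```python
from urllib.parse import urlsplit

# Hypothetical share URL: the host and the encoded state are invented.
share_url = "https://playground.example/app?theme=dark#state=eyJzY2hlbWEiOiJ7fSJ9"

parts = urlsplit(share_url)

# An HTTP client sends the path and query to the server, never the fragment.
request_target = parts.path + ("?" + parts.query if parts.query else "")

print(request_target)  # /app?theme=dark
print(parts.fragment)  # state=eyJzY2hlbWEiOiJ7fSJ9
```

The fragment is still part of the string you copy and paste, which is why it can end up in browser history or chat logs.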

📣 💼 Maintainer update: Open to opportunities. 🔗 koxudaxi.dev

✨ What it does

<img alt="Schema files, raw data, and existing Python models flow through datamodel-code-generator into Python model output types" src="https://github.com/koxudaxi/datamodel-code-generator/raw/0.67.0/docs/assets/diagrams/hero-light.svg" width="760">

Pick any supported input and the Python model style you want as output. The --input-model path/to/file.py:ClassName option can even retarget an existing Pydantic, dataclass, or TypedDict class defined in another Python file to a different output type.
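
As a sketch of what such an input might look like, the hypothetical file below defines a TypedDict. The file name `pets.py` and the class `Pet` are invented; the command in the trailing comment only combines flags that appear in this README and has not been checked against a specific release.

```python
# pets.py (hypothetical file name)
from typing import TypedDict


class Pet(TypedDict):
    """An existing hand-written type that could be retargeted."""

    name: str
    age: int


# Assumed invocation, built only from flags shown in this README:
#   datamodel-codegen --input-model pets.py:Pet \
#     --output-model-type pydantic_v2.BaseModel --output model.py
```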

  • 📄 Converts OpenAPI 3, AsyncAPI, JSON Schema, Apache Avro, XML Schema, Protocol Buffers/gRPC, GraphQL, MCP tool schemas, and raw data (JSON/YAML/CSV) into Python models
  • 🐍 Generates from existing Python types (Pydantic, dataclass, TypedDict) via --input-model
  • 🎯 Generates Pydantic v2, Pydantic v2 dataclass, dataclasses, TypedDict, or msgspec output
  • 🔗 Handles complex schemas: $ref, allOf, oneOf, anyOf, enums, and nested types
  • ✅ Produces type-safe, validated code ready for your IDE and type checker
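
For readers unfamiliar with `$ref`, the sketch below shows what following a local reference involves. It is a stdlib-only illustration of the JSON Pointer lookup, not the generator's own resolver; the helper name `resolve_ref` and the sample schema are assumptions.

```python
from typing import Any


def resolve_ref(document: dict[str, Any], ref: str) -> Any:
    """Follow a local JSON Pointer reference such as '#/$defs/Pet'.

    Only object keys are handled; array indices and remote URLs are not.
    """
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled here: {ref!r}")
    node: Any = document
    for token in ref[2:].split("/"):
        # JSON Pointer escapes: '~1' means '/', then '~0' means '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]
    return node


schema = {
    "$defs": {"Pet": {"type": "object", "properties": {"name": {"type": "string"}}}},
    "type": "object",
    "properties": {"pet": {"$ref": "#/$defs/Pet"}},
}

target = resolve_ref(schema, schema["properties"]["pet"]["$ref"])
print(target["type"])  # object
```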

📦 Installation

Recommended for standalone CLI use:

uv tool install datamodel-code-generator

For projects that should pin the generator version, add it as a development dependency instead:

uv add --dev datamodel-code-generator

Other installation methods

pip:

pip install datamodel-code-generator

uv (run without adding to project):

uv run --with datamodel-code-generator datamodel-codegen --help

conda:

conda install -c conda-forge datamodel-code-generator

With HTTP support (for resolving remote $ref):

pip install 'datamodel-code-generator[http]'

With GraphQL support:

pip install 'datamodel-code-generator[graphql]'

With Protocol Buffers support:

pip install 'datamodel-code-generator[protobuf]'

Docker:

docker pull koxudaxi/datamodel-code-generator

Published Docker images run as a non-root appuser. When writing generated files to a bind-mounted directory, make sure the directory is writable by the container user or pass an explicit Docker user, for example --user "$(id -u):$(id -g)".
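
A small sketch of the two checks described above, assuming a POSIX host: build the `--user` value from the calling user's IDs and confirm that the output directory is writable by that user. Both helper names are hypothetical, and the rest of the `docker run` command line is deliberately left out.

```python
import os


def docker_user_args() -> list[str]:
    """Return the `--user` option for the calling user (POSIX only)."""
    return ["--user", f"{os.getuid()}:{os.getgid()}"]


def is_writable_dir(path: str) -> bool:
    """Check that `path` is a directory the current user can write into."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


print(docker_user_args())
print(is_writable_dir("."))
```

Passing the host user's IDs makes files in the bind mount owned by you rather than by the container's appuser.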


🏃 Quick Start

Command

datamodel-codegen \
  --input schema.json \
  --input-file-type jsonschema \
  --output-model-type pydantic_v2.BaseModel \
  --preset standard-py312-20260619 \
  --output model.py

This quick start uses standard-py312-20260619 as the modern Python 3.12 baseline. Preset names include the target Python version: py312 means Python 3.12.
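
The naming convention can be sketched as a small parser. The pattern below assumes the shape `<profile>-py<major><minor>-<8 digits>` with a one-digit major version, inferred from the two preset names in this README; the meaning of the trailing digits is not stated here, so the code simply calls it `stamp`.

```python
import re
from typing import NamedTuple


class PresetName(NamedTuple):
    profile: str
    python_version: tuple[int, int]
    stamp: str


# Assumed shape, inferred from the example names only.
_PRESET_RE = re.compile(
    r"^(?P<profile>[a-z]+)-py(?P<major>\d)(?P<minor>\d+)-(?P<stamp>\d{8})$"
)


def parse_preset_name(name: str) -> PresetName:
    """Split a preset name into profile, target Python version, and stamp."""
    match = _PRESET_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised preset name: {name!r}")
    return PresetName(
        profile=match["profile"],
        python_version=(int(match["major"]), int(match["minor"])),
        stamp=match["stamp"],
    )


print(parse_preset_name("standard-py312-20260619").python_version)  # (3, 12)
```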

See the CLI Reference for all options, and the Presets, --preset, --input-file-type, and --output-model-type entries for the options used in this command.

For more schema-aware output that preserves schema-authored names, reuses models, and embeds generated documentation, use practical-py312-20260619.
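
If you drive the generator from a build script, the quick-start command can be assembled as an argument list. This is a minimal sketch: the function name `build_codegen_command` is hypothetical, and it only uses the flags and values shown in the command above.

```python
import shlex


def build_codegen_command(
    input_path: str,
    output_path: str,
    preset: str = "standard-py312-20260619",
    input_file_type: str = "jsonschema",
    output_model_type: str = "pydantic_v2.BaseModel",
) -> list[str]:
    """Assemble the quick-start invocation as an argument list."""
    return [
        "datamodel-codegen",
        "--input", input_path,
        "--input-file-type", input_file_type,
        "--output-model-type", output_model_type,
        "--preset", preset,
        "--output", output_path,
    ]


command = build_codegen_command(
    "schema.json", "model.py", preset="practical-py312-20260619"
)
print(shlex.join(command))
# To execute it (requires the tool to be installed):
#   subprocess.run(command, check=True)
```

Keeping the arguments as a list avoids shell quoting problems when paths contain spaces.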

Input (schema.json)

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Pet",
  "type": "object",
  "required": ["name"],
  "properties": {
    "name": {
      "type": "string",
      "description": "The pet's name"
    },
    "species": {
      "type": "string",
      "enum": ["dog", "cat", "bird", "fish"],
      "default": "dog"
    },
    "age": {
      "type": "integer",
      "minimum": 0,
      "description": "Age in years"
    },
    "vaccinated": {
      "type": "boolean",
      "default": false
    }
  }
}

Output (model.py)

# generated by datamodel-codegen:
#   filename:  schema.json

from __future__ import annotations

from enum import StrEnum
from typing import Annotated

from pydantic import BaseModel, ConfigDict, Field


class Species(StrEnum):
    dog = 'dog'
    cat = 'cat'
    bird = 'bird'
    fish = 'fish'


class Pet(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
    )
    name: Annotated[str, Field(description="The pet's name")]
    species: Species = Species.dog
    age: Annotated[int | None, Field(description='Age in years', ge=0)] = None
    vaccinated: bool = False
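
The relationship between the schema and the generated fields can be summarised with a short stdlib sketch. It mirrors what this one example shows (required properties have no default, properties with a `default` keep it, the rest fall back to `None`); it is not the generator's logic, and `describe_field` is a hypothetical helper.

```python
import json

# A trimmed copy of schema.json from the example above.
SCHEMA = json.loads("""
{
  "required": ["name"],
  "properties": {
    "name": {"type": "string"},
    "species": {"type": "string", "enum": ["dog", "cat", "bird", "fish"], "default": "dog"},
    "age": {"type": "integer", "minimum": 0},
    "vaccinated": {"type": "boolean", "default": false}
  }
}
""")


def describe_field(schema: dict, name: str) -> str:
    """Classify a property the way the example output treats it."""
    prop = schema["properties"][name]
    if name in schema.get("required", []):
        return "required"
    if "default" in prop:
        return f"default={prop['default']!r}"
    return "optional, defaults to None"


for field in SCHEMA["properties"]:
    print(f"{field}: {describe_field(SCHEMA, field)}")
```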

⚡ Speed up generation

By default, generated Python is currently formatted with black and isort. For faster generation without external formatter dependencies, add --formatters builtin for standard generated model modules. In a future version, the Black/isort dependencies will become opt-in and the default formatter will change to builtin.

If you prefer Ruff, install it with pip install 'datamodel-code-generator[ruff]' and use --formatters ruff-check ruff-format for a fast external formatter.

Custom templates can emit Python outside the standard generated model patterns covered by builtin, so custom-template output is not exhaustively validated. If --formatters builtin produces invalid or poorly formatted output with a custom template, please open an issue with a small reproducer. See Formatter Behavior for details.
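
A wrapper script could choose between the two formatter options named above. The sketch below is an assumption about how one might do that: the helper `formatter_args` is hypothetical, and detecting Ruff by looking for a `ruff` executable on PATH is a guess rather than documented behaviour.

```python
import shutil
from typing import Optional


def formatter_args(
    prefer_ruff: bool = False, ruff_available: Optional[bool] = None
) -> list[str]:
    """Pick a `--formatters` value from the options named in this README."""
    if ruff_available is None:
        # Assumption: Ruff counts as available when `ruff` is on PATH.
        ruff_available = shutil.which("ruff") is not None
    if prefer_ruff and ruff_available:
        return ["--formatters", "ruff-check", "ruff-format"]
    # Fall back to the formatter with no external dependencies.
    return ["--formatters", "builtin"]


print(formatter_args(prefer_ruff=True, ruff_available=False))
```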

See Performance Benchmarks for release benchmark data and interactive charts.


📖 Documentation

👉 datamodel-code-generator.koxudaxi.dev


📥 Supported Input

  • OpenAPI 3 (YAML/JSON)
  • AsyncAPI (YAML/JSON)
  • JSON Schema
  • MCP tool schemas
  • XML Schema (XSD)
  • Protocol Buffers / gRPC (.proto)
  • Apache Avro schema (AVSC)
  • JSON data
  • YAML data
  • Python dictionary
  • CSV data
  • GraphQL schema
  • Python types (Pydantic, dataclass, TypedDict) via --input-model

📤 Supported Output

✅ Conformance Signals

CI exercises datamodel-code-generator against pinned external corpora for XML Schema, JSON Schema, AsyncAPI, Apache Avro, and Protocol Buffers. See the Conformance Dashboard for the generated summary of runner scripts, tox environments, CI jobs, expected corpus counts, and upstream sources.


🍳 Common Recipes

CLI option quick starts

Use these starting points when combining options; each option links to the generated CLI reference for details and examples.

See the CLI Reference for the full option list and category-specific recipes.

🤖 Get CLI Help from LLMs

Generate a prompt to ask LLMs about CLI options:

datamodel-codegen --generate-prompt "Be

Core symbols most depended-on inside this repo

get · called by 906 · src/datamodel_code_generator/__main__.py
append · called by 322 · src/datamodel_code_generator/imports.py
items · called by 298 · src/datamodel_code_generator/preset.py
join · called by 292 · src/datamodel_code_generator/http.py
append · called by 245 · src/datamodel_code_generator/parser/generation.py
extend · called by 181 · src/datamodel_code_generator/parser/generation.py
from_full_path · called by 178 · src/datamodel_code_generator/imports.py
add · called by 138 · src/datamodel_code_generator/parser/base.py

Shape

Class 6,155
Function 4,582
Method 1,253
Route 279

Languages

Python 99%
TypeScript 1%

Modules by API surface

tests/main/jsonschema/test_main_jsonschema.py · 809 symbols
tests/main/openapi/test_main_openapi.py · 374 symbols
src/datamodel_code_generator/parser/jsonschema.py · 266 symbols
tests/main/test_main_general.py · 208 symbols
src/datamodel_code_generator/parser/base.py · 161 symbols
tests/test_main_kr.py · 159 symbols
tests/parser/test_jsonschema.py · 141 symbols
src/datamodel_code_generator/model/base.py · 124 symbols
tests/test_input_model.py · 114 symbols
src/datamodel_code_generator/parser/xmlschema.py · 112 symbols
src/datamodel_code_generator/_builtin_formatter.py · 100 symbols
tests/test_format.py · 96 symbols

Dependencies from manifests, versioned

pyyaml 6.0.1 · 1×

For agents

$ claude mcp add datamodel-code-generator \
  -- python -m otcore.mcp_server <graph>