
From Method Text to Editable SVG
AutoFigure-Edit is the next version of AutoFigure. It turns paper method sections into fully editable SVG figures and lets you refine them in an embedded SVG editor.
https://github.com/user-attachments/assets/6f93deb4-9854-4f1e-8097-53b0c3378a0d
AutoFigure-Edit v1.1 focuses on making the web workflow more practical for real users and real OpenAI-compatible gateways. It adds custom OpenAI-compatible provider support, OpenAI Responses + gpt-image-2 routing improvements, a stage-1 figure import mode, a bilingual configuration UI, and an in-product configuration guide.
- With --provider openai_response and a custom OpenAI-compatible base_url, step 1 now inherits the same image API route and key by default instead of falling back to the official OpenAI host.
- custom provider support: The CLI and web UI now expose custom as the primary OpenAI-compatible provider name, while bianxie remains as a backward-compatible alias.

| Feature | Description |
|---|---|
| 📝 Text-to-Figure | Generate a draft figure directly from method text. |
| 🧠 SAM3 Icon Detection | Detect icon regions from multiple prompts and merge overlaps. |
| 🎯 Labeled Placeholders | Insert consistent AF-style placeholders for reliable SVG mapping. |
| 🧩 SVG Generation | Produce an editable SVG template aligned to the figure. |
| 🖥️ Embedded Editor | Edit the SVG in-browser using the bundled svg-edit. |
| 📦 Artifact Outputs | Save PNG/SVG outputs and icon crops per run. |
AutoFigure-edit introduces two breakthrough capabilities:
Below are 9 examples covering 3 different papers. Each paper is generated using 3 different reference styles. (Each image shows: Left = AutoFigure Generation | Right = Vectorized Editable SVG)
| Paper & Style Transfer Demonstration |
|---|
| CycleResearcher / Style 1 |
| CycleResearcher / Style 2 |
| CycleResearcher / Style 3 |
| DeepReviewer / Style 1 |
| DeepReviewer / Style 2 |
| DeepReviewer / Style 3 |
| DeepScientist / Style 1 |
| DeepScientist / Style 2 |
| DeepScientist / Style 3 |

The AutoFigure-edit pipeline transforms a raw generation into an editable SVG in four distinct stages:
(1) Raw Generation → (2) SAM3 Segmentation → (3) SVG Layout Template → (4) Final Assembled Vector
1. Raw Generation (figure.png): The LLM generates a raster draft based on the method text.
2. SAM3 Segmentation (sam.png): SAM3 detects and segments distinct icons and text regions.
3. SVG Layout Template (template.svg): The system constructs a structural SVG wireframe using placeholders.
4. Final Assembled Vector (final.svg): High-quality cropped icons and vectorized text are injected into the template.

Detailed Technical Pipeline

AutoFigure2’s pipeline starts from the paper’s method text and first calls a text‑to‑image LLM to render a journal‑style schematic, saved as figure.png. The system then runs SAM3 segmentation on that image using one or more text prompts (e.g., “icon, diagram, arrow”), merges overlapping detections by an IoU‑like threshold, and draws gray‑filled, black‑outlined labeled boxes on the original; this produces both samed.png (the labeled mask overlay) and a structured boxlib.json with coordinates, scores, and prompt sources.
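The overlap-merging step can be illustrated with a minimal sketch. It assumes axis-aligned boxes and uses greedy, score-ordered suppression as a stand-in for the project's "IoU-like" rule, which may differ in detail. The `Box` fields, the function names, and the 0.5 threshold are hypothetical and are not taken from the repository.

```python
from dataclasses import dataclass


@dataclass
class Box:
    # Corner coordinates in figure pixels (hypothetical field names).
    x1: float
    y1: float
    x2: float
    y2: float
    score: float
    prompt: str  # text prompt that produced this detection


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_overlaps(boxes: list[Box], threshold: float = 0.5) -> list[Box]:
    """Keep the highest-scoring box of each overlapping group.

    Boxes are visited from highest to lowest score; a box is dropped
    when its IoU with an already kept box reaches the threshold.
    """
    kept: list[Box] = []
    for box in sorted(boxes, key=lambda b: b.score, reverse=True):
        if all(iou(box, k) < threshold for k in kept):
            kept.append(box)
    return kept
```

Sorting by score first means that when two prompts fire on the same icon, the more confident detection survives and its prompt source is the one recorded.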
Next, each box is cropped from the original figure and passed through RMBG‑2.0 for background removal, yielding transparent icon assets under icons/*.png and *_nobg.png. With figure.png, samed.png, and boxlib.json as multimodal inputs, the LLM generates a placeholder‑style SVG (template.svg) whose boxes match the labeled regions.
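In the pipeline the template is written by the LLM, so the following is only a sketch of what a placeholder-style SVG could look like when built by hand from labeled boxes. The dictionary keys, the `AF01` label format, and the styling values are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def build_placeholder_svg(width: int, height: int, boxes: list[dict]) -> str:
    """Render one gray-filled, black-outlined labeled rectangle per box.

    Each box is a dict with hypothetical keys: label, x1, y1, x2, y2.
    """
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "width": str(width),
        "height": str(height),
        "viewBox": f"0 0 {width} {height}",
    })
    for box in boxes:
        w = box["x2"] - box["x1"]
        h = box["y2"] - box["y1"]
        # Group id carries the label so the icon can be matched later.
        group = ET.SubElement(svg, "g", {"id": box["label"]})
        ET.SubElement(group, "rect", {
            "x": str(box["x1"]),
            "y": str(box["y1"]),
            "width": str(w),
            "height": str(h),
            "fill": "#cccccc",
            "stroke": "#000000",
        })
        text = ET.SubElement(group, "text", {
            "x": str(box["x1"] + w / 2),
            "y": str(box["y1"] + h / 2),
            "text-anchor": "middle",
        })
        text.text = box["label"]
    return ET.tostring(svg, encoding="unicode")
```

Keeping the label as the group `id` gives the later assembly step a stable key for swapping each placeholder with its icon.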
Optionally, the SVG is iteratively refined by an LLM optimizer to better align strokes, layouts, and styles, resulting in optimized_template.svg (or the original template if optimization is skipped). The system then compares the SVG dimensions with the original figure to compute scale factors and aligns coordinate systems. Finally, it replaces each placeholder in the SVG with the corresponding transparent icon (matched by label/ID), producing the assembled final.svg.
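The coordinate alignment described above can be sketched as follows, assuming boxes are stored in figure pixels and the SVG may have a different width and height. Function names and the `icons/AF01_nobg.png` file name are hypothetical; the repository's own matching logic is not shown here.

```python
def scale_factors(svg_size: tuple[float, float],
                  figure_size: tuple[float, float]) -> tuple[float, float]:
    """Per-axis factors that map figure pixels to SVG units."""
    svg_w, svg_h = svg_size
    fig_w, fig_h = figure_size
    return svg_w / fig_w, svg_h / fig_h


def to_svg_coords(box: tuple[float, float, float, float],
                  sx: float, sy: float) -> tuple[float, float, float, float]:
    """Convert (x1, y1, x2, y2) in figure pixels to (x, y, width, height)."""
    x1, y1, x2, y2 = box
    return x1 * sx, y1 * sy, (x2 - x1) * sx, (y2 - y1) * sy


def image_element(label: str, href: str,
                  box: tuple[float, float, float, float],
                  sx: float, sy: float) -> str:
    """Build the <image> tag that would replace a labeled placeholder."""
    x, y, w, h = to_svg_coords(box, sx, sy)
    return (f'<image id="{label}" href="{href}" '
            f'x="{x:g}" y="{y:g}" width="{w:g}" height="{h:g}"/>')


# Example: a 1600x1200 figure rendered into an 800x600 SVG.
sx, sy = scale_factors((800, 600), (1600, 1200))
tag = image_element("AF01", "icons/AF01_nobg.png", (100, 200, 300, 400), sx, sy)
```

Computing the factors per axis keeps icons aligned even when the generated SVG does not preserve the aspect ratio of the raster draft.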
Key configuration details:
- Placeholder Mode: Controls how icon boxes are encoded in the prompt (label, box, or none).
- Optimization: optimize_iterations=0 allows skipping the refinement step to use the raw structure directly.
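How optimize_iterations=0 skips refinement can be pictured with a small loop. This is a sketch under the assumption that each refinement pass takes an SVG string and returns a new one; `refine_template` and `refine_once` are hypothetical names, with `refine_once` standing in for the LLM optimizer call.

```python
from typing import Callable


def refine_template(template_svg: str,
                    optimize_iterations: int,
                    refine_once: Callable[[str], str]) -> str:
    """Apply the refinement step N times; N = 0 returns the input unchanged."""
    svg = template_svg
    for _ in range(max(0, optimize_iterations)):
        svg = refine_once(svg)
    return svg
```

With zero iterations the loop body never runs, so the raw template is what moves on to the assembly stage.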
Use Docker for a reproducible one-command setup without local Python/SAM3 installation.
- Port 8000 available on host
- Access to briaai/RMBG-2.0 on Hugging Face: https://huggingface.co/briaai/RMBG-2.0

Create .env:
# Linux/macOS
cp .env.example .env
# Windows PowerShell
Copy-Item .env.example .env
At minimum, set this in .env:
HF_TOKEN=hf_xxx
Optional but recommended:
# SAM3 API backend (Docker default in UI is Roboflow)
ROBOFLOW_API_KEY=your_roboflow_key
# Step-4 multimodal retry tuning (OpenRouter)
OPENROUTER_MULTIMODAL_RETRIES=3
OPENROUTER_MULTIMODAL_RETRY_DELAY=1.5
# DNS override for Roboflow name-resolution issues
DOCKER_DNS_1=223.5.5.5
DOCKER_DNS_2=119.29.29.29
For restricted networks, you can also set build mirrors:
BASE_IMAGE=docker.m.daocloud.io/library/python:3.11-slim
PIP_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple
PIP_EXTRA_INDEX_URL=
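Before starting the container, it can help to confirm that .env actually defines the required key. The sketch below is a simplified KEY=VALUE reader written for illustration; it is not Docker Compose's own parser and does not handle quoting or variable expansion. The function names are hypothetical.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_required(env: dict[str, str],
                     required: tuple[str, ...] = ("HF_TOKEN",)) -> list[str]:
    """Return required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]
```

A typical call would be `missing_required(parse_env(open(".env").read()))`, which returns an empty list when HF_TOKEN is set.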
docker compose up -d --build
Open http://localhost:8000.
docker compose ps
curl http://localhost:8000/healthz
Expected health response: {"status":"ok"}.
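The same health check can be scripted, for example to wait for the container in an automated setup. This is a minimal sketch using only the Python standard library; the function names and the 5-second timeout are assumptions.

```python
import json
import urllib.request


def is_healthy(body: bytes) -> bool:
    """True when the response body is JSON with status == "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_health(url: str = "http://localhost:8000/healthz",
                 timeout: float = 5.0) -> bool:
    """Query the health endpoint; any network or HTTP error counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
```

Separating the body check from the network call keeps the parsing logic easy to test without a running server.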
# Stream logs
docker compose logs -f autofigure-edit
# Restart service
docker compose restart autofigure-edit
# Rebuild from scratch (no cache)
docker compose build --no-cache
docker compose up -d
# Stop and remove container
docker compose down
- Host directories: ./outputs, ./uploads
- Hugging Face cache: hf_cache (/app/.cache/huggingface)
- Default SAM3 backend: roboflow
- Default SAM3 prompts: icon,person,robot,animal
- Default models by provider:
  - openrouter: image google/gemini-3.1-flash-image-preview, svg google/gemini-3.1-pro-preview
  - custom / bianxie: image gemini-3.1-flash-image-preview, svg gemini-3.1-pro-preview
  - gemini: image gemini-3.1-flash-image-preview, svg gemini-3.1-pro-preview
  - openai_response: image gpt-image-2 (step 1 fallback), svg gpt-5.4 via Responses API
  - --image_provider openai: image gpt-image-2 via the official OpenAI Images API
- Temporary failure in name resolution (Roboflow): set DOCKER_DNS_1/2 in .env, then run docker compose up -d --build.
- Docker Hub access problems (auth.docker.io): set BASE_IMAGE and PIP_INDEX_URL mirrors in .env.
- Roboflow endpoint overrides:
ROBOFLOW_API_URL=<your_reachable_roboflow_endpoint>
ROBOFLOW_API_FALLBACK_URLS=<comma_separated_backup_endpoints>

# 1) Install dependencies
pip install -r requirements.txt
# 2) Install SAM3 separately (not vendored in this repo)
git clone https://github.com/facebookresearch/sam3.git
cd sam3
pip install -e .
Run:
python autofigure2.py \
--method_file paper.txt \
--output_dir outputs/demo \
--provider custom \
--api_key YOUR_KEY
Use OpenAI only for step 1 image generation while keeping SVG reconstruction on the original provider:
python autofigure2.py \
--method_file paper.txt \
--output_dir outputs/demo \
--provider gemini \
--api_key GEMINI_KEY \
--image_provider openai \
--image_api_key OPENAI_KEY \
--image_model gpt-image-2
Use the OpenAI Responses API for text + multimodal SVG reconstruction:
python autofigure2.py \
--method_file paper.txt \
--output_dir outputs/demo \
--provider openai_response \
--api_key OPENAI_KEY
Continue from an existing stage-1 figure and skip image generation:
python autofigure2.py \
--input_figure_path ./my_stage1_figure.png \
--output_dir outputs/import_demo \
--provider openai_response \
--api_key OPENAI_KEY \
--svg_model gpt-5.4
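When these invocations are driven from a script, the argument list can be assembled programmatically. The sketch below only uses flags that appear in the commands above; `build_command` is a hypothetical helper, and it does not validate which flag combinations the CLI accepts.

```python
import sys
from typing import Optional


def build_command(**options: Optional[str]) -> list[str]:
    """Turn keyword options into an autofigure2.py argument list.

    Options set to None are skipped, so one helper covers the text-to-figure,
    mixed-provider, and stage-1 import variants.
    """
    cmd = [sys.executable, "autofigure2.py"]
    for flag, value in options.items():
        if value is not None:
            cmd += [f"--{flag}", str(value)]
    return cmd


# Example: continue from an existing stage-1 figure.
cmd = build_command(
    input_figure_path="./my_stage1_figure.png",
    output_dir="outputs/import_demo",
    provider="openai_response",
    api_key="OPENAI_KEY",
    svg_model="gpt-5.4",
    image_model=None,  # omitted from the command line
)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list avoids shell quoting issues with paths and keys.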
python server.py
Then open http://localhost:8000.
AutoFigure-edit provides a visual web interface designed for seamless generation and editing.

On the start page, paste your paper's method text on the left. On the right, configure your generation settings:
- Provider: Select your