github.com/ResearAI/AutoFigure-Edit @v1.1 sqlite

AutoFigure-Edit: Generating Editable Scientific Illustration

From Method Text to Editable SVG

AutoFigure-Edit is the next version of AutoFigure. It turns paper method sections into fully editable SVG figures and lets you refine them in an embedded SVG editor.

https://github.com/user-attachments/assets/6f93deb4-9854-4f1e-8097-53b0c3378a0d

🔥 News

[2026.04.23] 🚀 AutoFigure-Edit v1.1 is now available. This update adds custom OpenAI-compatible provider support, OpenAI Responses + gpt-image-2 routing improvements, stage-1 figure import mode, a bilingual configuration UI, and an in-product configuration guide.
[2026.03.24] 🧠 Our sister project DeepScientist v1.5 is now officially released. It is a local-first open-source autonomous research system for end-to-end scientific discovery. Explore it on GitHub or read the ICLR 2026 paper.
[2026.03.11] 📄 Our AutoFigure-Edit paper is now available on arXiv and featured in 🤗Hugging Face Daily Papers! If you find our work helpful, please consider giving us an upvote on Hugging Face and citing our paper. Thank you! ❤️
[2026.02.17] 🚀 The AutoFigure-Edit online platform is now live! It is free for all scholars to use. Try it out at deepscientist.cc.
[2026.01.26] 🎉 AutoFigure has been accepted to ICLR 2026! You can read the paper on arXiv.

🆕 V1.1 (2026.04.23)

AutoFigure-Edit v1.1 focuses on making the web workflow more practical for real users and real OpenAI-compatible gateways.

OpenAI Responses main-route fix: When you use --provider openai_response with a custom OpenAI-compatible base_url, step 1 now inherits the same image API route and key by default instead of falling back to the official OpenAI host.
custom provider support: The CLI and web UI now expose custom as the primary OpenAI-compatible provider name, while bianxie remains as a backward-compatible alias.
Stage-1 figure import mode: You can now skip image generation entirely by importing an existing academic raster figure and continuing directly from SAM + SVG reconstruction.
Bilingual web configuration: The main page, import page, canvas, and configuration guide now support in-page Chinese / English switching.
In-product configuration guide: A dedicated guide page explains workflows, fields, SAM backends, and recommended presets.
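
The route inheritance described in the first item can be sketched as follows. This is a minimal illustration, not the project's code: the `Route` type, the `resolve_image_route` helper, and the example gateway URL are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    """Where a request is sent (hypothetical type, for illustration only)."""
    base_url: Optional[str]  # None stands for the provider's official host
    api_key: str


def resolve_image_route(main: Route, image_override: Optional[Route] = None) -> Route:
    """Pick the route used for step 1 (image generation).

    Assumed rule: an explicit image route wins; otherwise step 1 inherits
    the main route's base_url and key instead of the official host.
    """
    if image_override is not None:
        return image_override
    return Route(base_url=main.base_url, api_key=main.api_key)


# A custom OpenAI-compatible gateway (placeholder URL) is reused for step 1.
gateway = Route(base_url="https://gateway.example.invalid/v1", api_key="YOUR_KEY")
step1 = resolve_image_route(gateway)
```

With no override, `step1` carries the same `base_url` and key as the main route, which is the behavior the release note describes.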

✨ Features

Feature	Description
📝 Text-to-Figure	Generate a draft figure directly from method text.
🧠 SAM3 Icon Detection	Detect icon regions from multiple prompts and merge overlaps.
🎯 Labeled Placeholders	Insert consistent AF-style placeholders for reliable SVG mapping.
🧩 SVG Generation	Produce an editable SVG template aligned to the figure.
🖥️ Embedded Editor	Edit the SVG in-browser using the bundled svg-edit.
📦 Artifact Outputs	Save PNG/SVG outputs and icon crops per run.

🎨 Gallery: Editable Vectorization & Style Transfer

AutoFigure-Edit introduces two breakthrough capabilities:

Fully Editable SVGs (Pure Code Implementation): Unlike raster images, our outputs are structured Vector Graphics (SVG). Every component is editable—text, shapes, and layout can be modified losslessly.
Style Transfer: The system can mimic the artistic style of reference images provided by the user.

Below are 9 examples covering 3 different papers. Each paper is generated using 3 different reference styles. (Each image shows: Left = AutoFigure Generation | Right = Vectorized Editable SVG)

Paper & Style Transfer Demonstration (images omitted)

CycleResearcher (Paper 1): Style 1, Style 2, Style 3
DeepReviewer (Paper 2): Style 1, Style 2, Style 3
DeepScientist (Paper 3): Style 1, Style 2, Style 3

🚀 How It Works

The AutoFigure-Edit pipeline turns a raw generation into an editable SVG in four stages:

Pipeline Visualization: Figure -> SAM -> Template -> Final

(1) Raw Generation → (2) SAM3 Segmentation → (3) SVG Layout Template → (4) Final Assembled Vector

Generation (figure.png): The LLM generates a raster draft based on the method text.
Segmentation (sam.png): SAM3 detects and segments distinct icons and text regions.
Templating (template.svg): The system constructs a structural SVG wireframe using placeholders.
Assembly (final.svg): High-quality cropped icons and vectorized text are injected into the template.

View Detailed Technical Pipeline

The pipeline (autofigure2.py) starts from the paper's method text and first calls a text-to-image LLM to render a journal-style schematic, saved as figure.png. It then runs SAM3 segmentation on that image using one or more text prompts (e.g., "icon, diagram, arrow") and merges overlapping detections using an IoU-like threshold. Gray-filled, black-outlined labeled boxes are drawn on the original image. This step produces samed.png (the labeled mask overlay) and a structured boxlib.json with coordinates, scores, and prompt sources.
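
To make the overlap-merging step concrete, here is a minimal sketch of greedy IoU-based box merging. It is an illustration under stated assumptions, not the project's implementation: the 0.5 threshold, the `(x1, y1, x2, y2)` box layout, and the function names are chosen here, and scores and prompt sources are ignored.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_boxes(boxes: List[Box], threshold: float = 0.5) -> List[Box]:
    """Merge boxes whose IoU exceeds `threshold` into their bounding union."""
    merged: List[Box] = []
    for box in boxes:
        current = box
        changed = True
        while changed:  # re-scan, since a grown box may now overlap others
            changed = False
            for i, other in enumerate(merged):
                if iou(current, other) > threshold:
                    current = (
                        min(current[0], other[0]), min(current[1], other[1]),
                        max(current[2], other[2]), max(current[3], other[3]),
                    )
                    merged.pop(i)
                    changed = True
                    break
        merged.append(current)
    return merged


detections = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
result = merge_boxes(detections)
```

The first two detections overlap with an IoU of about 0.68, so they collapse into one box, while the third stays separate.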

Next, each box is cropped from the original figure and passed through RMBG‑2.0 for background removal, yielding transparent icon assets under icons/*.png and *_nobg.png. With figure.png, samed.png, and boxlib.json as multimodal inputs, the LLM generates a placeholder‑style SVG (template.svg) whose boxes match the labeled regions.

Optionally, the SVG is iteratively refined by an LLM optimizer to better align strokes, layouts, and styles, resulting in optimized_template.svg (or the original template if optimization is skipped). The system then compares the SVG dimensions with the original figure to compute scale factors and aligns coordinate systems. Finally, it replaces each placeholder in the SVG with the corresponding transparent icon (matched by label/ID), producing the assembled final.svg.
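
The coordinate alignment and placeholder replacement can be sketched as below. This is a hedged illustration only: it assumes placeholders are `<rect>` elements whose `id` equals the box label (for example `AF01`), and the helper names and the icon path are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

SVG_NS = "http://www.w3.org/2000/svg"


def scale_factors(svg_size: Tuple[float, float],
                  figure_size: Tuple[float, float]) -> Tuple[float, float]:
    """Return (sx, sy) mapping figure pixel coordinates to SVG units."""
    return svg_size[0] / figure_size[0], svg_size[1] / figure_size[1]


def box_to_svg(box: Tuple[float, float, float, float],
               sx: float, sy: float) -> Dict[str, float]:
    """Convert an (x1, y1, x2, y2) pixel box to SVG x/y/width/height."""
    x1, y1, x2, y2 = box
    return {"x": x1 * sx, "y": y1 * sy,
            "width": (x2 - x1) * sx, "height": (y2 - y1) * sy}


def replace_placeholders(svg_text: str, icons: Dict[str, str]) -> str:
    """Swap each placeholder whose id is in `icons` for an <image> element."""
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            label = child.get("id")
            if label in icons:
                image = ET.Element(f"{{{SVG_NS}}}image", {
                    "id": label,
                    "href": icons[label],
                    # Reuse the placeholder geometry so the icon lands in place.
                    "x": child.get("x", "0"),
                    "y": child.get("y", "0"),
                    "width": child.get("width", "0"),
                    "height": child.get("height", "0"),
                })
                parent[index] = image
    return ET.tostring(root, encoding="unicode")


template = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">'
    '<rect id="AF01" x="10" y="20" width="30" height="40"/></svg>'
)
final_svg = replace_placeholders(template, {"AF01": "icons/icon_AF01_nobg.png"})
sx, sy = scale_factors((800, 400), (1600, 800))
```

Here an SVG half the size of the source figure yields scale factors of 0.5 on both axes, and the `AF01` rectangle is replaced by an `<image>` with the same geometry.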

Key configuration details:

Placeholder Mode: Controls how icon boxes are encoded in the prompt (label, box, or none).
Optimization: optimize_iterations=0 skips the refinement step and uses the raw structure directly.

⚡ Quick Start

Option 0: Docker Deployment Guide (Recommended)

Use Docker for a reproducible one-command setup without local Python/SAM3 installation.

0) Prerequisites

Docker Desktop (with Docker Compose v2)
Port 8000 available on host
HuggingFace access to briaai/RMBG-2.0: https://huggingface.co/briaai/RMBG-2.0

1) Prepare `.env`

# Linux/macOS
cp .env.example .env

# Windows PowerShell
Copy-Item .env.example .env

At minimum, set this in .env:

HF_TOKEN=hf_xxx
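
Before building, it can help to confirm that the required key is actually present. The following is a small sketch of such a check, not part of the project: `parse_env` and `missing_required` are hypothetical helpers, and the parser handles only simple `KEY=value` lines.

```python
from typing import Dict, Iterable, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(values: Dict[str, str],
                     required: Iterable[str] = ("HF_TOKEN",)) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


sample = "# minimal .env\nHF_TOKEN=hf_xxx\nPIP_EXTRA_INDEX_URL=\n"
env = parse_env(sample)
problems = missing_required(env)
```

In practice you would read the text with `open(".env").read()` and stop if `problems` is non-empty.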

Optional but recommended:

# SAM3 API backend (Docker default in UI is Roboflow)
ROBOFLOW_API_KEY=your_roboflow_key

# Step-4 multimodal retry tuning (OpenRouter)
OPENROUTER_MULTIMODAL_RETRIES=3
OPENROUTER_MULTIMODAL_RETRY_DELAY=1.5

# DNS override for Roboflow name-resolution issues
DOCKER_DNS_1=223.5.5.5
DOCKER_DNS_2=119.29.29.29
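
As an illustration of what the two retry settings control, here is a generic retry loop driven by the same variable names. This is a sketch under assumptions: it treats OPENROUTER_MULTIMODAL_RETRIES as the total number of attempts and OPENROUTER_MULTIMODAL_RETRY_DELAY as a fixed pause in seconds, which may differ from how the project interprets them. `call_with_retries` is a hypothetical helper.

```python
import os
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(fn: Callable[[], T],
                      retries: Optional[int] = None,
                      delay: Optional[float] = None,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fn` until it succeeds or the attempt budget is used up."""
    if retries is None:
        retries = int(os.environ.get("OPENROUTER_MULTIMODAL_RETRIES", "3"))
    if delay is None:
        delay = float(os.environ.get("OPENROUTER_MULTIMODAL_RETRY_DELAY", "1.5"))
    attempts = max(1, retries)  # assumption: the value counts total attempts
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
            if attempt < attempts:
                sleep(delay)
    raise RuntimeError(f"failed after {attempts} attempts") from last_error


calls = []


def flaky() -> str:
    """Fail twice, then succeed, to exercise the loop."""
    calls.append(1)
    if len(calls) < 3:
        raise ValueError("transient error")
    return "ok"


outcome = call_with_retries(flaky, retries=3, delay=0.0, sleep=lambda s: None)
```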

For restricted networks, you can also set build mirrors:

BASE_IMAGE=docker.m.daocloud.io/library/python:3.11-slim
PIP_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple
PIP_EXTRA_INDEX_URL=

2) Build and start

docker compose up -d --build

Open http://localhost:8000.

3) Verify service health

docker compose ps
curl http://localhost:8000/healthz

Expected health response: {"status":"ok"}.
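
The same check can be scripted, for example in a deployment smoke test. The sketch below uses only the standard library; the `/healthz` URL and the expected body come from the steps above, while the function names are chosen here for illustration.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True if the response body is JSON with status == "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False


def check_health(url: str = "http://localhost:8000/healthz",
                 timeout: float = 5.0) -> bool:
    """Fetch the health endpoint and validate its body."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200 and is_healthy(response.read().decode("utf-8"))
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses.
        return False
```

Calling `check_health()` while the container is running should return `True`; it returns `False` if the service is unreachable.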

4) Daily operations

# Stream logs
docker compose logs -f autofigure-edit

# Restart service
docker compose restart autofigure-edit

# Rebuild from scratch (no cache)
docker compose build --no-cache
docker compose up -d

# Stop and remove container
docker compose down

5) Persistence and defaults

Persistent outputs: ./outputs, ./uploads
Persistent HuggingFace cache: Docker volume hf_cache (/app/.cache/huggingface)
Docker/Web default SAM backend: roboflow
Default SAM prompt: icon,person,robot,animal
Current default models:
  openrouter: image google/gemini-3.1-flash-image-preview, svg google/gemini-3.1-pro-preview
  custom / bianxie: image gemini-3.1-flash-image-preview, svg gemini-3.1-pro-preview
  gemini: image gemini-3.1-flash-image-preview, svg gemini-3.1-pro-preview
  openai_response: image gpt-image-2 (step 1 fallback), svg gpt-5.4 via Responses API
Optional step-1 override:
  --image_provider openai: image gpt-image-2 via the official OpenAI Images API
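
The defaults above can be read as a lookup table keyed by provider. The sketch below encodes the listed values and the bianxie alias; the `resolve_models` helper and the dictionary layout are hypothetical and only illustrate how explicit model arguments might override the defaults.

```python
from typing import Dict, Optional

# Values copied from the defaults listed above.
DEFAULT_MODELS: Dict[str, Dict[str, str]] = {
    "openrouter": {"image": "google/gemini-3.1-flash-image-preview",
                   "svg": "google/gemini-3.1-pro-preview"},
    "custom": {"image": "gemini-3.1-flash-image-preview",
               "svg": "gemini-3.1-pro-preview"},
    "gemini": {"image": "gemini-3.1-flash-image-preview",
               "svg": "gemini-3.1-pro-preview"},
    "openai_response": {"image": "gpt-image-2", "svg": "gpt-5.4"},
}

# "bianxie" is kept as a backward-compatible alias of "custom".
PROVIDER_ALIASES = {"bianxie": "custom"}


def resolve_models(provider: str,
                   image_model: Optional[str] = None,
                   svg_model: Optional[str] = None) -> Dict[str, str]:
    """Return the canonical provider plus the image and SVG models to use."""
    key = provider.strip().lower()
    canonical = PROVIDER_ALIASES.get(key, key)
    if canonical not in DEFAULT_MODELS:
        raise ValueError(f"unknown provider: {provider!r}")
    defaults = DEFAULT_MODELS[canonical]
    return {
        "provider": canonical,
        "image": image_model or defaults["image"],
        "svg": svg_model or defaults["svg"],
    }


legacy = resolve_models("bianxie")
```

Here `legacy["provider"]` is `"custom"`, so older configurations that still say `bianxie` pick up the same defaults.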

6) Common Docker networking issues

Temporary failure in name resolution (Roboflow): set DOCKER_DNS_1 and DOCKER_DNS_2 in .env, then run docker compose up -d --build.
Cannot reach Docker Hub auth (auth.docker.io): set the BASE_IMAGE and PIP_INDEX_URL mirrors in .env.
Optional Roboflow endpoint override:
  ROBOFLOW_API_URL=<your_reachable_roboflow_endpoint>
  ROBOFLOW_API_FALLBACK_URLS=<comma_separated_backup_endpoints>
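
One plausible reading of these two variables is an ordered list of endpoints to try, primary first. The sketch below shows that interpretation only; the `endpoint_candidates` helper is hypothetical, the URLs are placeholders, and the project's actual fallback order may differ.

```python
from typing import List, Mapping


def endpoint_candidates(env: Mapping[str, str]) -> List[str]:
    """Return the primary endpoint followed by de-duplicated fallbacks."""
    primary = env.get("ROBOFLOW_API_URL", "").strip()
    fallbacks = [url.strip() for url in env.get("ROBOFLOW_API_FALLBACK_URLS", "").split(",")]
    ordered: List[str] = []
    for url in [primary, *fallbacks]:
        if url and url not in ordered:
            ordered.append(url)
    return ordered


candidates = endpoint_candidates({
    "ROBOFLOW_API_URL": "https://primary.example.invalid",
    "ROBOFLOW_API_FALLBACK_URLS": "https://backup-a.example.invalid, https://backup-b.example.invalid",
})
```

Passing `os.environ` instead of a literal dictionary applies the same logic to the running environment.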

Option 1: CLI

# 1) Install dependencies
pip install -r requirements.txt

# 2) Install SAM3 separately (not vendored in this repo)
git clone https://github.com/facebookresearch/sam3.git
cd sam3
pip install -e .

# 3) Return to the AutoFigure-Edit directory before running the pipeline
cd ..

Run:

python autofigure2.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider custom \
  --api_key YOUR_KEY

Use OpenAI only for step 1 image generation while keeping SVG reconstruction on the original provider:

python autofigure2.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider gemini \
  --api_key GEMINI_KEY \
  --image_provider openai \
  --image_api_key OPENAI_KEY \
  --image_model gpt-image-2

Use the OpenAI Responses API for text + multimodal SVG reconstruction:

python autofigure2.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider openai_response \
  --api_key OPENAI_KEY

Continue from an existing stage-1 figure and skip image generation:

python autofigure2.py \
  --input_figure_path ./my_stage1_figure.png \
  --output_dir outputs/import_demo \
  --provider openai_response \
  --api_key OPENAI_KEY \
  --svg_model gpt-5.4
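
When driving the CLI from another script, the argument list can be assembled programmatically. The sketch below is a thin wrapper written for illustration: `build_command` is a hypothetical helper, and it only uses flags that appear in the commands above.

```python
import sys
from typing import Dict, List, Optional


def build_command(options: Dict[str, Optional[str]],
                  script: str = "autofigure2.py") -> List[str]:
    """Turn {"flag": "value"} pairs into an argv list; None values are skipped."""
    command = [sys.executable, script]
    for flag, value in options.items():
        if value is None:
            continue
        command += [f"--{flag}", str(value)]
    return command


# Import mode: continue from an existing stage-1 figure.
command = build_command({
    "method_file": None,  # not needed when importing a figure
    "input_figure_path": "./my_stage1_figure.png",
    "output_dir": "outputs/import_demo",
    "provider": "openai_response",
    "api_key": "OPENAI_KEY",
    "svg_model": "gpt-5.4",
})
```

The resulting list can be passed to `subprocess.run(command, check=True)`; building a list rather than a shell string avoids quoting problems with paths and keys.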

Option 2: Web Interface

python server.py

Then open http://localhost:8000.

🖥️ Web Interface Demo

AutoFigure-Edit provides a web interface for generating and editing figures.

1. Configuration Page

On the start page, paste your paper's method text on the left. On the right, configure your generation settings:

Provider: Select your
