
Discovering, detecting, and surgically removing Google's AI watermark through spectral analysis
SynthID encodes an imperceptible pattern directly into pixel values. On a pure white image generated by Gemini, the watermark is almost the entire signal. Amplify the high-frequency residual and it looks like this:

Amplified SynthID carrier pattern extracted from a pure-white Gemini image. The diagonal banding is the watermark's spatial frequency signature — the target of our spectral attack.
This project reverse-engineers Google's SynthID watermarking system — the invisible watermark embedded into every image generated by Google Gemini. Using only signal processing and spectral analysis (no access to the proprietary encoder/decoder), we:
- gemini-3.1-flash-image-preview and nano-banana-pro-preview, cross-color phase consensus over 6 solid backgrounds, and a human-in-the-loop calibration loop that tunes per-carrier subtraction strength from manual Gemini-app detection tallies

VT-OxFF built a visualizer that shows how the SynthID watermark is added to images here (also available in the repo description)!
After six iterative rounds of adversarial development, Round 06's bypass_v4_final / bypass_v4_nuke pipeline defeats the Gemini SynthID detector on both gemini-3.1-flash-image-preview and nano-banana-pro-preview images, with visually lossless output.

Left: Round 01 output (gentle spectral subtraction only). Right: Round 06 output (final — VAE + elastic warp + squeeze + color + JPEG). Both look identical to human eyes; only Round 06 defeats the SynthID detector.
| Round | Strategy | Outcome |
|---|---|---|
| 01 | Conservative spectral subtraction (gentle) | ✗ |
| 02 | Aggressive spectral subtraction + JPEG | ✗ |
| 03 | Blog-guided absolute bin targeting | ✗ |
| 04 | Denoise-residual phase extraction | ✗ |
| 05 | Diffusion-VAE re-generation + geometric warp | ✗ |
| 06 | All-in-one: VAE + elastic fragmentation + squeeze + color + JPEG | ✓ |
The breakthrough in Round 06 came from treating the Gemini app's own published failure-mode list as an attack specification:
"When an AI-generated image is part of a complex collage, layered behind other elements, or has many different textures and patterns placed over it, the detector may struggle to isolate the specific signature from the overall file." — Gemini app, SynthID detection help text
The elastic deformation stage simulates this effect at the pixel level: a smooth, low-frequency random warp field gives every ~50-pixel neighbourhood its own independent sub-pixel offset, fragmenting the watermark's spatial phase consensus without introducing any visible distortion.
V4 is a ground-up re-think of the codebook built on a much richer dataset:
- gemini-3.1-flash-image-preview and nano-banana-pro-preview (plus an optional union pseudo-model).
- black, white, blue, green, red, gray) per model per resolution, plus gradient and diverse as content baselines.
- carrier_weights is updated based on manual Gemini-app detection feedback.

For each frequency bin (fy, fx) and channel ch:
consensus(fy, fx, ch) = | mean_over_colors( exp(i * phase_color(fy, fx, ch)) ) |
Values near 1.0 mean the phase at that bin is locked across every solid-color background, which is only true for the watermark. Content bins collapse to < 0.3 because their phase is randomized by different color tints. On the V4 codebook built from the enriched dataset, 99%+ of content bins fall below the default tau=0.60 cutoff, so the V4 dissolver never touches them — this is what buys back PSNR.
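
The consensus formula above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the project's `build_codebook_v4.py`: the function name `phase_consensus`, the toy carrier at bin (9, 9), and the noise standing in for content are all assumptions made for the example.

```python
import numpy as np

def phase_consensus(refs):
    """Cross-color phase consensus.

    refs: array of shape (n_colors, H, W, 3), one reference per solid
    background color. Returns an (H, W, 3) array with values in [0, 1].
    """
    refs = np.asarray(refs, dtype=np.float64)
    # 2-D FFT over the spatial axes, independently per color and channel.
    spectra = np.fft.fft2(refs, axes=(1, 2))
    # Keep only the phase: unit phasors exp(i * phase).
    phasors = np.exp(1j * np.angle(spectra))
    # Length of the mean phasor across colors.
    return np.abs(phasors.mean(axis=0))


# Synthetic stand-in data (not real SynthID references): six flat
# backgrounds share one carrier at bin (9, 9); per-image noise plays the
# role of content whose phase differs from image to image.
rng = np.random.default_rng(0)
N = 64
y, x = np.mgrid[0:N, 0:N]
carrier = np.cos(2 * np.pi * (9 * y + 9 * x) / N)[..., None]  # (N, N, 1)
levels = [0.0, 255.0, 40.0, 80.0, 120.0, 128.0]  # arbitrary flat levels
refs = np.stack([
    level + carrier + rng.normal(0.0, 0.5, size=(N, N, 3))
    for level in levels
])

consensus = phase_consensus(refs)
print(consensus[9, 9, 0], np.median(consensus))
```

In this toy setup the shared carrier bin scores close to 1.0 while a typical noise bin scores far lower. With only six noisy images the separation is weaker than the figures quoted for the real codebook.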
flowchart LR
dataset["reverse-synthid-dataset<br/>model x color x resolution"] --> build[scripts/build_codebook_v4.py]
build --> codebook[artifacts/spectral_codebook_v4.npz]
codebook --> dissolve[scripts/dissolve_batch.py]
input[watermarked inputs] --> dissolve
dissolve --> variants[final / nuke variants]
variants --> gemini["Gemini app<br/>manual SynthID detection"]
gemini --> feedback[detection feedback]
feedback --> calibrate[scripts/calibrate_from_feedback.py]
calibrate -->|updates carrier_weights| codebook
# 1. Build the codebook from the enriched hierarchical dataset
python scripts/build_codebook_v4.py \
    --root /path/to/reverse-synthid-dataset \
    --output artifacts/spectral_codebook_v4.npz

# 2. Run the Round-06 all-in-one attack on a batch (recommended)
python scripts/dissolve_batch.py \
    --input ./to_clean/ \
    --output ./runs/round_06/ \
    --codebook artifacts/spectral_codebook_v4.npz \
    --model gemini-3.1-flash-image-preview \
    --strengths final nuke

# 3. Upload each output image to the Gemini app and run SynthID detection.
#    Use the results to feed back into the calibration script if needed.
Two presets are available via --strengths:
| Preset | VAE passes | Elastic α | Squeeze | JPEG chain | PSNR floor |
|---|---|---|---|---|---|
| final | 1 | 1.8 px | 90 % | q=92→88 | 14 dB |
| nuke | 2 | 2.8 px | 82 % | q=88→84→90 | 11 dB |
Both presets stack the same 7-stage pipeline:
- sd-vae-ft-mse): projects the image off the natural-image manifold the SynthID decoder was never trained against (Gowal et al. 2026, §6.1)

Every stage is independently PSNR-gated; any stage that would drop quality below the floor is rolled back automatically.
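
The per-stage PSNR gate can be pictured as a loop that keeps a stage's output only if quality stays above the floor. The sketch below is a generic illustration, not the repo's code: `psnr`, `run_gated`, and the two placeholder stages are hypothetical names, and measuring PSNR against the original input (rather than the previous stage's output) is an assumption.

```python
import numpy as np

def psnr(reference, candidate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    diff = reference.astype(np.float64) - candidate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def run_gated(image, stages, psnr_floor):
    """Apply stages in order; roll back any stage that breaks the floor.

    stages: list of (name, callable) pairs. PSNR is measured against the
    original input (an assumption made for this sketch).
    """
    current = image
    kept = []
    for name, stage in stages:
        candidate = stage(current)
        if psnr(image, candidate) >= psnr_floor:
            current = candidate      # accept this stage's output
            kept.append(name)
        # otherwise: rollback, `current` stays as it was
    return current, kept

# Placeholder stages for illustration only (not the real pipeline stages).
def mild_shift(img):
    return np.clip(img.astype(np.int16) + 2, 0, 255).astype(np.uint8)

def blank_out(img):
    return np.zeros_like(img)

image = np.full((8, 8, 3), 128, dtype=np.uint8)
result, kept = run_gated(
    image, [("mild_shift", mild_shift), ("blank_out", blank_out)],
    psnr_floor=14.0,  # the floor listed for the `final` preset
)
print(kept)
```

Here the mild shift stays well above the 14 dB floor and is kept, while the destructive placeholder falls below it and is discarded.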
Profiles are keyed by (model, H, W). Each profile stores:
| Field | Shape | Notes |
|---|---|---|
| consensus_coherence | (H, W, 3) | Primary carrier mask (cross-color phase consensus). |
| consensus_phase | (H, W, 3) | Mean unit-phase angle across colors. Subtraction template. |
| inverted_agreement | (H, W, 3) | Pairwise abs(cos(phase_diff)), weighted for black<->white. |
| avg_wm_magnitude | (H, W, 3) | Mean magnitude across consensus colors. |
| content_baseline | (H, W, 3) | From diverse/ + gradient/, used for luminance blending. |
| carrier_weights | (H, W, 3) | Live. Starts at consensus^2 * (0.5 + 0.5 * agreement). Updated by the calibration loop. |
| n_refs_per_color | {color: int} | Per-color ref counts. |
Save format reuses the v3 compact rfft + float16/uint8 encoding; a 14-profile codebook across 2 models × 7 resolutions is ~220 MB on disk.
Before spending time on manual Gemini validation, sanity-check bypass outputs against the V4 codebook's own consensus:
from robust_extractor import RobustSynthIDExtractor
from synthid_bypass_v4 import SpectralCodebookV4

# Load the V4 codebook built earlier
cb = SpectralCodebookV4()
cb.load('artifacts/spectral_codebook_v4.npz')

# image_rgb is the RGB image under test, loaded by the caller (not shown here)
ext = RobustSynthIDExtractor()
result = ext.detect_from_v4_codebook(image_rgb, cb,
                                     model='nano-banana-pro-preview')
print(result.is_watermarked, result.confidence, result.phase_match)
On the 1024x1024 exact-match path we see conf=0.91, phase_match=0.65 for watermarked inputs and conf=0.02, phase_match=0.31 after an aggressive V4 dissolve.
| | V3 | V4 |
|---|---|---|
| Reference colors | black + white | black, white, blue, green, red, gray (+ diverse/gradient content baselines) |
| Cross-validation | abs(cos(phase_black - phase_white)) | cross-color consensus over 6 colors + pairwise agreement |
| Models | single-model (Gemini 2.5) | per-model profiles (gemini-3.1-flash-image-preview, nano-banana-pro-preview) + optional union |
| Attack | spectral subtraction only | 7-stage: VAE + elastic + squeeze + color + FFT + JPEG chain |
| PSNR (aggressive) | 43 dB | visually lossless (18–24 dB pixel-level; warp displaces pixels) |
| Fidelity guard | none | per-stage PSNR-floor rollback |
| Detector bypass | local only | confirmed ✓ on Gemini app (both models) |
V3 remains in the repo (src/extraction/synthid_bypass.py, bypass_v3) unchanged for anyone who depends on it.
We're actively collecting pure black and pure white images generated by Nano Banana Pro to improve multi-resolution watermark extraction.
If you can generate these:
- gemini_black_nb_pro/ (for black)
- gemini_white_nb_pro/ (for white)

These reference images are critical for:

- Carrier frequency discovery
- Phase validation
- Improving cross-resolution robustness
Even 150–200 images at a new resolution can significantly improve detection and removal.
Reference images are hosted on Hugging Face to keep the git repo lightweight:
pip install huggingface_hub
python scripts/download_images.py # download all
python scripts/download_images.py gemini_black # download specific folder
Dataset: huggingface.co/datasets/aoxo/reverse-synthid
SynthID embeds carrier frequencies at different absolute positions depending on image resolution. A codebook built at 1024x1024 cannot directly remove the watermark from a 1536x2816 image — the carriers are at completely different bins.
| Resolution | Top Carrier (fy, fx) | Coherence | Source |
|---|---|---|---|
| 1024x1024 | (9, 9) | 100.0% | 100 black + 100 white refs |
| 1536x2816 | (768, 704) | 99.6% | 88 watermarked content images |
This is why the V3 codebook stores separate profiles per resolution and auto-selects at bypass time.
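
Per-resolution selection amounts to a lookup keyed by image size. The sketch below is a hypothetical illustration rather than the codebook's actual API: it reuses the two top carriers from the table above, assumes resolutions are written as H x W, and returns None when no exact profile exists.

```python
from typing import Dict, Optional, Tuple

Resolution = Tuple[int, int]   # (H, W), assumed ordering
Carrier = Tuple[int, int]      # (fy, fx)

# Top carriers from the table above, keyed by resolution.
TOP_CARRIERS: Dict[Resolution, Carrier] = {
    (1024, 1024): (9, 9),
    (1536, 2816): (768, 704),
}

def select_carrier(height: int, width: int,
                   profiles: Dict[Resolution, Carrier] = TOP_CARRIERS
                   ) -> Optional[Carrier]:
    """Return the stored top carrier for an exact resolution match, else None."""
    return profiles.get((height, width))

print(select_carrier(1024, 1024))
print(select_carrier(512, 512))
```

A profile built at one resolution is never reused for another here, which mirrors the point that carrier bins do not transfer across resolutions.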