
Reverse-Engineering Gemini's SynthID Watermark via Spectral Analysis: 90% Detection, 91% Removal

Ikesan

Images generated by Google’s Gemini carry an invisible digital watermark called SynthID.
Undetectable to the human eye, it lets a dedicated detector determine whether an image was AI-generated.
Developed by Google DeepMind, it has already been applied to over 10 billion images and video frames, according to a paper published in October 2025.

A research project published on GitHub has now reverse-engineered SynthID using nothing but signal processing, without any access to Google’s encoder or decoder.
It achieves 90% detection accuracy and implements a bypass technique that removes the watermark with virtually no image quality loss.

Of course, being technically able to remove a watermark doesn’t mean it’s legal.
The watermark is completely invisible to humans, yet intentionally stripping it may violate copyright or AI regulation laws.
But here’s the fundamental paradox: if AI-generated images aren’t protected by copyright in the first place, can the removal of “copyright management information” even be prosecuted?

How SynthID Works and Where It Breaks

SynthID is a deep-learning-based watermarking system that embeds invisible patterns into output images within the image generation pipeline.
The embedded pattern is designed to be robust against common image transformations like JPEG compression and resizing.

What this research revealed, however, is that SynthID embeds a fixed-pattern carrier signal in the frequency domain.
That pattern is fixed at the model level and does not change from image to image.

Discovering the Watermark via Spectral Analysis

The core of the research is spectral analysis using the Fast Fourier Transform (FFT).
FFT converts a signal from the spatial domain to the frequency domain. Applied to an image, it reveals which frequency components are present and at what magnitude.
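As a toy illustration of what the FFT reveals (not from the research code), the NumPy snippet below builds an image that is a pure cosine wave and shows its energy concentrating in exactly two frequency bins:

```python
import numpy as np

# A tiny grayscale "image": a pure horizontal cosine, 4 cycles across.
h, w = 64, 64
x = np.arange(w)
img = np.tile(np.cos(2 * np.pi * 4 * x / w), (h, 1))

# fft2 moves the image from the spatial to the frequency domain.
magnitude = np.abs(np.fft.fft2(img))

# All of the energy sits in the two bins for +/-4 cycles per width.
peaks = np.argwhere(magnitude > magnitude.max() / 2)
print(peaks)  # the two bins (0, 4) and (0, 60)
```

A watermark carrier works the same way in reverse: a faint sinusoidal pattern in pixel space shows up as a sharp spike at a specific frequency bin.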

The procedure:

```mermaid
graph TD
    A[Generate many images with Gemini] --> B[100 black + 100 white images<br/>for reference]
    A --> C[88 normal content images<br/>1536x2816px]
    B --> D[Convert to frequency domain via FFT]
    C --> D
    D --> E[Compare phase between<br/>black and white images]
    E --> F[Identify frequency bins with<br/>high phase coherence]
    F --> G[Confirm carrier frequencies]
    G --> H[Build watermark profile]
```

By having Gemini generate solid black and solid white images, content-derived noise is eliminated, leaving only the watermark signal.
Extracting frequency bins where the phase matches between black and white images (|cos(phase difference)| > 0.90) identifies SynthID’s carrier frequencies.
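In code, the carrier-identification step might look like the sketch below. This is my own reconstruction from the description above, not the published tooling; the function name and the magnitude floor (added to ignore empty bins) are assumptions.

```python
import numpy as np

def find_carrier_bins(black_imgs, white_imgs, coherence_thresh=0.90):
    """Find frequency bins whose phase agrees between the black and
    white reference sets (hypothetical sketch of the procedure).

    black_imgs / white_imgs: lists of 2-D float arrays (one channel,
    same resolution).
    """
    mean_b = np.mean([np.fft.fft2(im) for im in black_imgs], axis=0)
    mean_w = np.mean([np.fft.fft2(im) for im in white_imgs], axis=0)

    # Only bins carrying energy in BOTH sets can be carrier candidates.
    floor = 1e-6 * max(np.abs(mean_b).max(), np.abs(mean_w).max())
    active = (np.abs(mean_b) > floor) & (np.abs(mean_w) > floor)

    # |cos(phase difference)| > 0.90, as described above.
    coherence = np.abs(np.cos(np.angle(mean_b) - np.angle(mean_w)))
    return np.argwhere(active & (coherence > coherence_thresh))
```

Averaging the complex spectra before taking the phase suppresses per-image noise, which is exactly why the solid black/white references matter: with no content energy, whatever survives in both sets is the carrier.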

Characteristics of the Discovered Watermark

The properties revealed by the analysis have implications for AI watermarking technology as a whole.

Resolution-Dependent Carrier Placement

The watermark’s carrier frequencies shift in absolute position depending on image resolution.

| Image Resolution | Carrier Frequency Bin |
| --- | --- |
| 1024x1024 | (9, 9) |
| 1536x2816 | (768, 704) |

Because the carrier position differs by resolution, building per-resolution profiles enables detection and removal across multiple resolutions.

Phase Consistency

Cross-image phase coherence exceeded 99.5%.
This means the watermark is embedded with a “fixed model-level key.”
Regardless of image content, the same phase signal sits at the same frequency bins.
In cryptographic terms, it’s like every user sharing the same key.

Per-Channel Intensity Differences

The watermark embedding strength also varies across RGB channels.

| Channel | Relative Strength |
| --- | --- |
| Green | 1.00 |
| Red | 0.85 |
| Blue | 0.70 |

Green is the strongest.
This might seem counterintuitive since human vision is most sensitive to green, but the green channel carries the most information, making it easier to hide the watermark.

Evolution of Bypass Techniques

The research developed three generations of bypass techniques.

V1 (Baseline)

A naive approach applying JPEG compression at quality 50.
PSNR (Peak Signal-to-Noise Ratio) is around 37dB, but carrier phase coherence dropped by only 11%.
This confirms that SynthID’s design goal of robustness against JPEG compression actually works.
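PSNR itself is straightforward to compute. Below is a minimal sketch, using coarse quantization as a codec-free stand-in for lossy compression; the numbers are illustrative, not the study's:

```python
import numpy as np

def psnr(original, degraded, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two images."""
    diff = original.astype(np.float64) - degraded.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10 * np.log10(peak ** 2 / mse))

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (256, 256)).astype(np.float64)

# Quantizing pixel values to steps of 8 loses information but keeps
# PSNR around 40dB, the same ballpark as the quality-50 JPEG baseline.
degraded = np.round(img / 8) * 8
print(f"PSNR: {psnr(img, degraded):.1f} dB")
```

The point of the V1 result is that a transformation can sit in this "visibly lossy" PSNR range and still barely touch the carrier phase.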

V2 (Multi-stage)

Combines noise injection, color space conversion, and frequency filtering.
Operates in the 27-37dB PSNR range, but watermark removal was essentially zero.
Untargeted transformations (not aimed at the watermark’s specific frequencies) cannot break SynthID.

V3 (Multi-Resolution Spectral Codebook)

This is the study’s main contribution. It uses a pre-built Spectral Codebook (a database of per-resolution watermark profiles) to subtract the watermark as a known signal.

```mermaid
graph TD
    A[Input image] --> B[Determine resolution]
    B --> C[Load matching profile<br/>from SpectralCodebook]
    C --> D[Convert to frequency domain via FFT]
    D --> E["Aggressive Pass<br/>Strong subtraction on high-confidence bins"]
    E --> F["Moderate Pass<br/>Adjust medium-confidence bins"]
    F --> G["Gentle Pass<br/>Fine-tune residual signal"]
    G --> H[Inverse FFT back to spatial domain]
    H --> I[Anti-aliasing]
    I --> J[Output image]
```

Subtraction strength is weighted by a confidence score calculated from phase consistency × cross-validation agreement, and energy removal per frequency bin is capped at 90-95%.
This preserves image information outside the watermark.
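A single-pass sketch of the capped-subtraction idea follows. This is my simplification of V3's three passes; the `profile` format and the energy-cap arithmetic are assumptions based on the description above, not the published code.

```python
import numpy as np

def subtract_carriers(img, profile, max_removal=0.95):
    """Subtract known carrier bins from one channel (simplified sketch).

    img:     2-D float array (one colour channel).
    profile: (row, col, confidence) tuples for carrier bins, a stand-in
             for one resolution's SpectralCodebook profile.
    """
    spec = np.fft.fft2(img)
    h, w = spec.shape
    for r, c, conf in profile:
        # Energy removal per bin is capped (here at 95%) so image
        # content sharing the bin is not wiped out entirely.
        keep_energy = 1.0 - min(conf, max_removal)
        scale = np.sqrt(keep_energy)
        spec[r, c] *= scale
        spec[-r % h, -c % w] *= scale   # mirror bin keeps the image real
    return np.real(np.fft.ifft2(spec))
```

Scaling the conjugate-mirror bin by the same factor preserves the spectrum's Hermitian symmetry, so the inverse FFT stays real-valued and no spatial artifacts are introduced.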

V3’s performance:

| Metric | Value |
| --- | --- |
| PSNR | 43dB+ |
| SSIM (Structural Similarity) | 0.997 |
| Phase coherence drop (top 5 carriers) | 91% |
| Carrier energy reduction | 75.8% |

SSIM 0.997 means the before-and-after images are virtually indistinguishable to the human eye.
Yet the phase coherence of the top 5 carriers drops by 91%.
From a detector’s perspective, the watermark is effectively gone.

Is Removing an Invisible Watermark Illegal?

SynthID is completely invisible to humans.
It makes minuscule changes to pixel values, but the perceptible difference is zero.
Comparing V3-bypassed and original images yields an SSIM of 0.997, making them effectively identical.

So what does removing “something imperceptible to humans” actually infringe upon, legally?

Major copyright regimes are currently reluctant to protect AI-generated images.
The U.S. Copyright Office holds that “copyright does not arise unless it includes human creative expression.”
Under Japanese copyright law, images autonomously generated by AI likely fail to meet the definition of a copyrightable work (“a creative expression of thoughts or sentiments”).

This creates a paradox.
Is removing a watermark embedded in content that isn’t protected by copyright a “copyright infringement”?
If the protected work doesn’t exist in the first place, copyright infringement logically cannot occur.

AI-generated images could become copyrightable if a human adds creative input.
But at the point of Gemini’s output, no human creative involvement is recognized.
The watermark is embedded within the image generation pipeline, before any human intervention.
In other words, SynthID is applied to content that, under copyright law, is “not a copyrightable work.”

Two DMCA Provisions and Their Limits

The U.S. DMCA (Digital Millennium Copyright Act) has two provisions potentially relevant to watermark removal.

| Provision | Content | Applicability to SynthID |
| --- | --- | --- |
| §1201 | Prohibits circumvention of technological protection measures (TPMs) | SynthID is not access control, so unlikely to apply |
| §1202 | Prohibits intentional removal of copyright management information (CMI) | If AI-generated images aren’t copyrightable, the premise collapses |

§1202 explicitly includes digital watermarks in its definition of CMI (1202(c)).
Whether the watermark is visible or invisible is not a requirement; invisible watermarks can qualify as CMI.
However, applying the provision requires that the CMI be associated with a “copyrighted work.”
Removing CMI from unprotected content is unlikely to constitute a §1202 violation.

§1201 is even harder to apply.
SynthID does not restrict access to or use of images in any way.
Watermarked images can be freely viewed, copied, and modified.
It is fundamentally different from DRM (Digital Rights Management) that controls content use; its sole purpose is provenance attestation.
The “circumvention of technological protection measures” framework simply doesn’t fit.

Note, however, that this paradox rests on current copyright law interpretation.
AI-generated content copyright discussions are ongoing in multiple countries, and future protection for AI-generated images cannot be ruled out.
In that case, SynthID’s status as CMI could change.

The EU is attempting to fill this gap with a fundamentally different approach.

The EU AI Act (enacted 2024) mandates that AI-generated content be labeled in a machine-readable format.
This obligation is completely independent of copyright status; even for non-copyrighted content, removing labels indicating AI generation can trigger penalties under the AI regulation.

The situation where something is “not copyright infringement but still illegal” is becoming reality in the EU.
It bypasses the copyright paradox by using “transparency of AI-generated content” as its own legal basis for regulating watermark removal.

Position Under Japanese Law

In Japan, the Unfair Competition Prevention Act’s “circumvention of technological restriction measures” (Article 2, Items 17-18) could be relevant.
If SynthID is classified as a technological restriction measure for content management, providing circumvention devices or the act of circumvention itself could be illegal.

However, there are interpretive barriers to applying this provision.
“Technological restriction measures” typically refer to technologies that restrict access to or copying of content (DRM, etc.).
SynthID does not restrict image use at all; it is purely a provenance technology.
Whether a watermark that is neither access control nor copy control qualifies as a “technological restriction measure” awaits case law development.

Japanese copyright law also has a “rights management information” protection provision (Article 113, Paragraph 7), equivalent to DMCA §1202.
Digital watermarks can constitute rights management information, but again, this is premised on information associated with a “copyrightable work.”
As long as AI-generated images aren’t recognized as copyrightable, the same paradox as DMCA applies.

Risk of Publishing the Research Itself

This research publishes detection and removal tools on GitHub.
Publishing proof-of-concept code is generally considered within the scope of academic freedom, but the risk of being deemed “distribution of circumvention devices” is not zero.

DMCA §1201 includes exceptions for security research (1201(j)) and encryption research (1201(g)).
But as discussed, whether §1201 even applies to SynthID is unclear.
If it doesn’t apply, the exception discussion is moot; if it does, individual assessment of whether exception requirements are met is needed.

Another consideration: SynthID’s carrier pattern (the phase pattern common to all images) could constitute a Google trade secret.
The U.S. Defend Trade Secrets Act (DTSA) and Japan’s Unfair Competition Prevention Act regulate acquisition of trade secrets through improper means.
Whether analyzing outputs of a publicly available API counts as “improper means” is debatable.
Reverse engineering is legal in many jurisdictions, but if the terms of service explicitly prohibit it, a breach of contract issue remains.

While copyright law and AI regulation are full of gray areas, platform terms of service are clear.

Google explicitly prohibits removing watermarks from generated content in its terms of service.
Violations can result in account suspension and other measures.
Regardless of how copyright or AI regulation laws are interpreted, contractual obligations exist independently.

The highest actual legal risk may not be copyright infringement (unclear) or AI regulation violation (EU only), but this platform TOS violation.
The threshold for civil damages claims and service suspension is lower than that for criminal penalties.
“Legally gray but contractually out” is probably the most definitive conclusion for now.

Limits of the Fixed-Key Approach

SynthID was designed for “internet-scale image watermarking” and has been applied to over 10 billion images.
But this research exposed a fundamental weakness in the approach of embedding fixed patterns in the frequency domain.

Using the same phase pattern across all images means anyone can identify the carrier frequencies given enough samples.
Generating reference images only requires access to the Gemini API.
DeepMind’s paper lists robustness and security as key design requirements, but “robustness against common image transformations” and “security against targeted spectral attacks” are entirely different properties.
Surviving JPEG compression and resizing is one thing; surviving precision subtraction after carrier frequency identification is another.

Provenance attestation for AI-generated content that relies solely on watermarks is fragile.
Without combining it with metadata-based provenance like C2PA (Coalition for Content Provenance and Authenticity), it’s nearly defenseless against determined attackers.
Google’s Flow also outputs images with SynthID enabled, but it’s becoming clear that SynthID alone is insufficient, both technically and legally.

The research code is available on GitHub, covering everything from SpectralCodebook construction to V3 bypass execution in Python.

Text Gets Watermarked Too

SynthID isn’t limited to images.
In May 2024, Google DeepMind introduced SynthID watermarking for Gemini’s text output as well.
That October, they published the paper “Scalable watermarking for identifying large language model outputs” in Nature and simultaneously open-sourced the implementation.

While image watermarking embeds signals in the pixel frequency domain, text watermarking intervenes directly in the LLM’s token selection process.
It weaves statistically detectable patterns into the generated text without altering its meaning or quality.

The g-Function and Tournament Sampling

At the heart of SynthID Text is a pseudorandom function called the g-function.
At each step of text generation, the following processing occurs:

  1. Hash the previous H tokens (context window) and a secret key to produce a pseudorandom seed
  2. Use this seed to assign a g-value (pseudorandom score) to every token in the vocabulary
  3. Factor the g-values into the LLM’s output probability distribution when selecting tokens

By biasing toward tokens with higher g-values, a statistical “skew” is embedded across the entire generated text.
This skew is imperceptible to humans, but a detector with the secret key can identify it through statistical testing.
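The three steps can be sketched as follows. This is a hypothetical reconstruction of the g-function idea, not DeepMind's implementation; the hashing scheme and the `strength` parameter are my own choices.

```python
import hashlib
import numpy as np

def g_values(context_tokens, secret_key, vocab_size, H=4):
    """Steps 1-2: hash the last H tokens plus the secret key into a
    seed, then draw one pseudorandom g-value per vocabulary entry.
    Same context window + same key always yields the same scores."""
    window = ",".join(map(str, context_tokens[-H:]))
    digest = hashlib.sha256(f"{secret_key}|{window}".encode()).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
    return rng.random(vocab_size)

def watermarked_sample(logits, g, strength=2.0, seed=0):
    """Step 3: fold the g-values into the model's distribution, then
    sample. Higher `strength` = stronger skew toward high-g tokens."""
    biased = np.asarray(logits, dtype=np.float64) + strength * g
    probs = np.exp(biased - biased.max())
    probs /= probs.sum()
    return int(np.random.default_rng(seed).choice(len(probs), p=probs))
```

The detector's job is then the inverse: recompute the same g-values with the key and test whether the observed tokens skew high.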

For the specific token selection algorithm, SynthID Text introduced a technique called tournament sampling.

```mermaid
graph TD
    A[All token candidates in vocabulary] --> B["Round 1:<br/>Compete using LLM probability + g-value"]
    B --> C[Surviving token group]
    C --> D["Round 2:<br/>Re-score with key from a different layer"]
    D --> E[Further narrowing]
    E --> F[Final round]
    F --> G[Selected token]
```

Tournament rounds are run over a recommended 20-30 layers, each using g-values derived from a different key.
The multi-layer structure embeds more sophisticated patterns than the simple Red-Green method (which randomly splits the vocabulary into two groups and favors one).
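A toy version of the tournament is sketched below (pairwise match-ups for brevity; the real scheme's bracket structure and scoring are more involved, and `g_fn` is a placeholder for the per-layer g-function):

```python
import numpy as np

def tournament_sample(candidates, layer_keys, g_fn, rng):
    """Multi-layer tournament over token candidates (simplified sketch).

    candidates: token ids sampled from the LLM's distribution.
    layer_keys: one key per tournament layer; each layer re-scores
                survivors with its own g-values.
    g_fn(key, token) -> pseudorandom score in [0, 1).
    """
    pool = list(candidates)
    for key in layer_keys:
        rng.shuffle(pool)
        survivors = []
        for i in range(0, len(pool) - 1, 2):
            a, b = pool[i], pool[i + 1]
            ga, gb = g_fn(key, a), g_fn(key, b)
            # Higher g-value wins the pairing; ties broken at random.
            survivors.append(a if ga > gb else b if gb > ga
                             else rng.choice([a, b]))
        if len(pool) % 2:                 # odd token out gets a bye
            survivors.append(pool[-1])
        pool = survivors
    return pool[0]
```

Each layer roughly halves the pool, so a few dozen layers reliably reduce even a large candidate set to a single token while stacking layer after layer of key-dependent bias.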

Two operating modes exist.
“Distortionary” mode slightly modifies the LLM’s output distribution to embed a stronger watermark.
“Non-distortionary” mode perfectly preserves the output distribution, eliminating quality impact at the cost of slightly weaker watermark strength.
Google verified no change in text quality through live experiments on approximately 20 million Gemini responses.

Detecting Text Watermarks

Three scoring functions are defined for detection.

| Detection Method | Characteristics | Training Data |
| --- | --- | --- |
| Mean score | Simple average of g-values of observed tokens | Not required |
| Weighted mean score | Average weighted by token probability | Not required |
| Bayesian detector | Classifier estimating posterior probability | 10,000+ samples |

The secret key is used to recalculate g-values, then statistical testing determines whether observed tokens match the watermark pattern.
Combined with selective prediction (classifying low-confidence cases as “unknown”), detection achieves 95% accuracy at a 1% false positive rate.
The Bayesian detector is most accurate, but the weighted mean score requires no training and is easy to deploy.
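The mean score is simple enough to sketch end-to-end. This is an illustrative toy with a hash-based stand-in for the g-function and an arbitrary threshold, not SynthID's actual statistical test:

```python
import hashlib
import numpy as np

def g_fn(key, token):
    """Deterministic pseudorandom g-value in [0, 1) for (key, token)."""
    digest = hashlib.sha256(f"{key}:{token}".encode()).digest()
    return int.from_bytes(digest[:4], "big") / 2**32

def mean_score(tokens, key):
    """Simple average of the g-values of the observed tokens."""
    return float(np.mean([g_fn(key, t) for t in tokens]))

def looks_watermarked(tokens, key, threshold=0.55):
    """Unwatermarked text averages about 0.5; watermarked sampling is
    biased toward high-g tokens, pushing the mean upward.
    (0.55 is an arbitrary illustrative cut-off.)"""
    return mean_score(tokens, key) > threshold
```

Without the key, an attacker cannot recompute the g-values, so the skew is invisible; with the key, a simple average already separates the two populations.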

The Fundamental Weakness of Text Watermarks

This is where text diverges critically from images.
Text watermarks are fundamentally vulnerable to paraphrase attacks.

Removing SynthID from images required precise subtraction after identifying carrier frequencies.
It meant collecting 200 reference images, analyzing the frequency domain with FFT, and running multi-stage subtraction passes based on confidence scores: work that demands signal-processing knowledge and real effort.

Text watermarks break when you simply ask another LLM to “rewrite this with the same meaning.”

Research from ETH Zurich’s SRI Lab achieved over 90% success rate with paraphrase-based scrubbing (watermark washing).
Assisted scrubbing (combining paraphrase with key estimation attacks) reaches nearly 100% success.

| Attack Method | Effect on Text Watermarks |
| --- | --- |
| Paraphrase (50%+ vocabulary change) | Detection rate drops significantly |
| Synonym substitution | F1 score drops to 0.884 |
| Translate → back-translate | Detector confidence drops substantially |
| Mixing in non-watermarked text (10x volume) | F1 drops to 0.788, false positive rate rises to 0.53 |
| Assisted scrubbing | Nearly 100% success rate |

Unlike images, text can be substantially reworded while preserving meaning.
The very nature of text as a medium — the ability to rephrase the same information infinitely — makes maintaining statistical patterns fundamentally difficult.
While image SynthID was defeated through a design weakness in its “fixed-key approach,” text SynthID’s weakness is the medium of text itself.

There’s another fundamental limitation.
When the LLM’s output probability is heavily skewed — for example, when the answer to “What is the capital of Japan?” is essentially unique — there’s little freedom in token selection and no room to embed a watermark.
Short texts face the same issue, lacking enough tokens to establish a statistical pattern.

Attack Research After Algorithm Publication

SynthID Text’s code was published on GitHub under Apache License 2.0 in October 2024 and integrated into Hugging Face Transformers v4.46.0.
In contrast to image SynthID, which required reverse engineering to “discover” carrier frequencies, the text version’s algorithm is published openly.
Attackers don’t even need to search for carrier frequencies.

The researcher who reverse-engineered the image version, aloshdenny, has also published a reverse-SynthID-text repository for the text version.
Academic research is active, with results including:

  • ETH Zurich’s SRI Lab demonstrated over 90% scrubbing success in black-box environments and decomposed SynthID Text’s architecture into LeftHash(h=3) + context extension + tournament sampling + caching
  • Han et al. systematically evaluated paraphrase, copy-paste, and translation attacks, proposing the defense framework SynGuard that combines semantic information retrieval with SynthID’s probability mechanism (improving F1 by 11.1% on average)
  • arXiv:2603.03410 provided the first theoretical analysis, proving that mean-score detection is vulnerable to layer inflation attacks (increasing the number of tournament layers) and that Bayesian scoring is superior

For both images and text, SynthID alone has clear limits for provenance attestation.
Combining it with metadata-based provenance like C2PA is more realistic, but metadata is even easier to strip.