Gemini Gem's Image Reference Broke Overnight, So I Prayed to Another AI and Fixed It

How It Started

A Gemini Gem that was generating characters perfectly yesterday suddenly fell apart. Same reference images, same prompt — but a completely different character came out.

Original Gem Prompt (Summary)

- 10 reference images attached (head and upper body from various angles)
- Hairstyle: left side ponytail, ahoge, light blue scrunchie
- Face: follow the reference images
- Prohibited: text, panel divisions, multiple characters

A simple setup that had been working without issue until the day before.

Reference Images

This is the original character “Kana-chan.” The reference image set is introduced in a separate article.

Up until yesterday, just saying “in a maid outfit battling a Roomba” would produce her perfectly.

Yesterday’s output examples

Hair color, ahoge, scrunchie, eye shape — all correct.


Symptoms

But today, generating “exhausted after pulling an all-nighter” with the same Gem gave me this.

Broken output 1

…Wait, something’s off. Let me try again.

Broken output 2

Broken output 3

Everything is wrong.

Sure, she’s tired from pulling an all-nighter, but the character is completely different.

Nothing matches:

  • Hair color: orange-brown → dark brown
  • Ahoge: gone
  • Scrunchie: gone
  • Eye shape: different person
  • Overall: just a generic anime character

This isn’t about the situation prompt. The reference images aren’t being read.


Consulting Claude

So I decided to consult Claude. I showed it the prompt and asked what might be wrong.

Claude’s feedback:

  • Writing hair color as “orange-brown” might be pulling Gemini toward “orange”
  • Phrasing it as “brown base with a slight orange tint” might produce more stable results
  • The prohibition list should be more specifically itemized

I adjusted the prompt based on the advice and tried generating a color version.

After adjustment 1

…Nope.

After adjustment 2

This is terrible.

Prompt adjustments alone don’t solve the root problem. A different approach is needed.


Narrowing It Down

Reference image issue?

I reviewed all 10 reference images. They are consistent with one another, and the scrunchie is visible or hidden as you would expect from each angle.

Model comparison

I tried adding “follow the Gem’s instructions” as extra guidance in the chat and switching generation modes.

Pro (standard) output:

Pro output

Hair is too orange.

Thinking mode output:

Thinking mode output

Better than Pro, but still too orange.

Flash (fast mode) output:

Flash output

Best of the three. A surprising reversal: Pro < Thinking Mode < Flash in terms of image reference accuracy.


Finding a Clue

I noticed that another Gem (for a different character) was maintaining character consistency in the same situation.

Other Gem output

Analyzing that Gem’s prompt structure revealed these differences:

  1. A role-setting like “You are a professional [X]”
  2. Strong phrasing like “faithfully reproduce”
  3. A specific prohibition list

Prompt Revision

I rewrote the prompt using the other Gem’s structure as a reference. I tried it first with the monochrome Gem, since fewer color variables should make generation easier.

Monochrome success

That worked. Let’s try color too.

Before (Summary)

Look at the reference images and generate the character.
Hairstyle: left side ponytail, ahoge, scrunchie.

After

You are a professional anime character illustrator. Following the user's requests, generate images that faithfully reproduce the character design from the attached reference images.

## 1. Character Reproduction (Top Priority)
Reproduce the character "Kana-chan" from the reference images at a level where she is recognizable as the same person.
The goal is not "similar" but "identical."

### Hair Color (Strict)
- Light brown (bright brown) base
- Slightly orange-tinted brown
- Low saturation
- ✗ No vivid orange
- ✗ No dark brown
- ✗ No blonde or yellow

### Hairstyle
- Left side ponytail
- Ahoge on top (required)
- Light blue scrunchie (required except from right-side angle)

### Face
- Eyes: large and round, amber color, same shape as in the reference images
- Face structure: copy reference images exactly, no changes

## 4. Prohibitions
- Changing hair color (too dark or too vivid)
- Changing hairstyle (twin tails, right side ponytail, etc.)
- Omitting scrunchie or ahoge
- Changing eye shape
- Realistic or Western art style
- Text, speech bubbles, sound effects
- Multiple characters

Key points:

  • Remove the word “orange” from hair color descriptions as much as possible (Gemini latches onto it)
  • Use “brown base” and “low saturation” to set clear direction
  • Spell out prohibitions in a specific list

Results

After the revision, with some fine-tuning of the hair color description (avoiding the word “orange”), the character started generating consistently.

Success 1

Success 2

Success 3

Hair color, ahoge, scrunchie, eye shape — all maintained. Even complex situations no longer break the character. And the “exhausted all-nighter” scene I originally wanted to generate finally came out.

All-nighter success

Non-tired version too

Cute pose

That’s definitely Kana-chan. Pray hard enough and the AI gods shall deliver!


Wrap-Up

Cause (Speculation)

A silent update on Gemini’s side, maybe? My guess is that the weight given to reference images changed.

It was working fine the day before with the same prompt, so there’s not much else to point to.

Fixes Applied

  1. Prompt structure changes

    • Role-setting (“You are a professional…”)
    • Strong phrasing (“faithfully reproduce”)
    • Specific prohibition list
  2. Hair color phrasing

    • “Orange-brown” → “brown base with slight orange tint”
    • Explicitly list prohibited colors
  3. Model selection

    • Flash may have better image reference accuracy than Pro
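
The hair color fix can also be checked mechanically before pasting a prompt into a Gem. The following is a minimal Python sketch, not something I actually ran during this troubleshooting: `RISKY_WORDS` and `find_risky_lines` are hypothetical names, and treating lines marked with ✗ or starting with "No" as intentional prohibitions is an assumption based on the prompt format shown earlier.

```python
import re

# Hypothetical watch list: color words that seemed to pull Gemini off target.
RISKY_WORDS = ["orange"]


def find_risky_lines(prompt: str, words=RISKY_WORDS):
    """Return (line_number, line) pairs that use a risky word outside a prohibition."""
    hits = []
    for number, line in enumerate(prompt.splitlines(), start=1):
        stripped = line.strip()
        # Assumption: lines that forbid something (✗ or "No ...") are intentional.
        if stripped.lstrip("-• ").startswith(("✗", "No ")):
            continue
        if any(
            re.search(rf"\b{re.escape(word)}\b", stripped, re.IGNORECASE)
            for word in words
        ):
            hits.append((number, stripped))
    return hits


draft = """Hair color: orange-brown
- Low saturation
- ✗ No vivid orange"""
hits = find_risky_lines(draft)
print(hits)  # [(1, 'Hair color: orange-brown')]
```

With this draft, only the first line is flagged. The prohibition line is skipped because naming the unwanted color there is deliberate.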

Takeaways

  • Generative AI can change behavior without warning
  • “It worked yesterday” is no guarantee of anything
  • Prompt structure and word choice significantly affect accuracy
  • Referencing successful prompts from elsewhere is effective

All images in this article are unedited outputs directly from Gemini. No post-processing was applied.


Punchline

“God! I was trying to make Gemini generate slightly naughty images!” “Ah yes, well — pray and God shall forgive you~”