Letting AI Run E2E Tests Overnight While I Sleep

The dream: kick off E2E tests before bed and wake up to just the results. That’s what Claude Code’s official plugin “Ralph Wiggum” is for.

Ralph Wiggum adds a repetitive auto-loop capability to Claude Code. It keeps executing the same prompt until a completion condition is met, autonomously running and fixing tests along the way.
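
The loop behaviour described above can be pictured roughly like this. It is a conceptual sketch in shell, not the plugin's actual implementation: `loop_until_done` and its arguments are hypothetical names, and the "completion condition" is assumed to be a plain string appearing in the command's output.

```shell
# Conceptual sketch only, not the plugin's code.
# Usage: loop_until_done MAX_ITERATIONS PROMISE COMMAND [ARGS...]
loop_until_done() {
  local max="$1" promise="$2" i=1 output
  shift 2
  while [ "$i" -le "$max" ]; do
    # Capture stdout and stderr; a failing iteration does not end the loop.
    output=$("$@" 2>&1) || true
    printf '[iteration %d] %s\n' "$i" "$output"
    case "$output" in
      *"$promise"*) return 0 ;;   # completion condition met
    esac
    i=$((i + 1))
  done
  return 1                        # hit the iteration cap without completing
}
```

The iteration cap plays the same role as --max-iterations later in this post: without it, a prompt that never produces the completion string would run forever.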

Prerequisites

Development environment already set up (to avoid scenarios requiring admin privileges, like fresh installs)
Tests can run inside Docker
Claude Max or another flat-rate plan recommended (long runs can hit rate limits)

1. Network Restriction (Sandboxing the Machine)

When using --dangerously-skip-permissions, it’s safer to block external network access.

Linux / WSL2

# Allow only necessary outbound traffic
# Note: iptables resolves hostnames once, when each rule is added
sudo iptables -A OUTPUT -d api.anthropic.com -j ACCEPT
sudo iptables -A OUTPUT -d statsig.anthropic.com -j ACCEPT
sudo iptables -A OUTPUT -d 127.0.0.0/8 -j ACCEPT        # localhost
sudo iptables -A OUTPUT -d 172.16.0.0/12 -j ACCEPT      # Docker network
sudo iptables -A OUTPUT -d 10.0.0.0/8 -j ACCEPT         # Docker network (alternative)
sudo iptables -A OUTPUT -d 192.168.0.0/16 -j ACCEPT     # host network
sudo iptables -A OUTPUT -j DROP                          # block everything else

# Verify
sudo iptables -L OUTPUT -n
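
Because the final DROP rule cuts off every destination that isn't listed, it may be worth printing the rule set and reading it before applying anything. A minimal sketch, assuming a hypothetical helper named `print_output_rules` that only prints commands and never touches the firewall:

```shell
# Print (do not run) the rule set: one ACCEPT per destination, then the final DROP.
# Usage: print_output_rules DEST [DEST...]
print_output_rules() {
  local dest
  for dest in "$@"; do
    printf 'iptables -A OUTPUT -d %s -j ACCEPT\n' "$dest"
  done
  printf 'iptables -A OUTPUT -j DROP\n'
}

# Example: write the rules to a file, review them, then apply as root.
#   print_output_rules api.anthropic.com statsig.anthropic.com 127.0.0.0/8 > rules.sh
#   sudo sh rules.sh
```

Keeping the DROP line last is the point of generating the list in one place: an ACCEPT appended after it would never match.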

To Revert

sudo iptables -F OUTPUT

To Make It Persistent

sudo apt install iptables-persistent
sudo netfilter-persistent save

2. Claude Code Configuration

`.claude/settings.json`

{
  "permissions": {
    "allow": [
      "Bash(*)",
      "Read(*)",
      "Write(*)",
      "Edit(*)",
      "MultiEdit(*)"
    ]
  }
}

If permission prompts still appear with these settings, use --dangerously-skip-permissions. The settings.json allow list doesn't fully cover every environment and command, so in practice the workable approach is to use the flag together with the network restrictions from step 1.
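
If you set up several projects this way, the settings file can be written from a script. The sketch below assumes a hypothetical helper, `write_claude_settings`, that takes the project directory as its argument and refuses to overwrite an existing file; the JSON is the same allow list shown above.

```shell
# Write the allow list to PROJECT_DIR/.claude/settings.json.
# Refuses to overwrite a file that already exists.
# Usage: write_claude_settings [PROJECT_DIR]
write_claude_settings() {
  local dir="${1:-.}/.claude"
  local file="$dir/settings.json"
  if [ -e "$file" ]; then
    echo "refusing to overwrite $file" >&2
    return 1
  fi
  mkdir -p "$dir" || return 1
  cat > "$file" <<'EOF'
{
  "permissions": {
    "allow": [
      "Bash(*)",
      "Read(*)",
      "Write(*)",
      "Edit(*)",
      "MultiEdit(*)"
    ]
  }
}
EOF
}
```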

3. Install the Ralph Wiggum Plugin

cd your-project
claude

# Inside Claude Code
/plugin install ralph-wiggum

4. Example Run Commands

Basic Form

claude --dangerously-skip-permissions

After Claude Code starts:

/ralph-loop "Run E2E tests one by one and fix failures.

## Rules
- Handle only one test file per iteration
- Check at most 2 screenshots per request
- Don't look at screenshots for passing tests

## Steps
1. Pick one failing test from tests/e2e/
2. Run only that test
3. If it fails, check the failure screenshot and fix
4. When it passes, move to the next test
5. When every test has either passed or been skipped, output DONE

## If Stuck (same test fails 3+ times)
- Record what's blocking
- Skip it and move to the next test
- List skipped tests at the end

DONE" --max-iterations 100 --completion-promise "DONE"

Run Headless (Continues After Terminal Closes)

nohup claude --dangerously-skip-permissions -p '/ralph-loop "..." --max-iterations 100 --completion-promise "DONE"' > ralph.log 2>&1 &

Or with tmux / screen:

tmux new -s ralph
claude --dangerously-skip-permissions
# After running /ralph-loop, detach with Ctrl+B D
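
For the nohup form, a prompt containing apostrophes (such as "Don't") would end a single-quoted shell string early. One way around that is to keep the prompt in a file and build the argument from it. This is a sketch under assumptions: `build_loop_command` and the file name `prompt.md` are hypothetical, and it relies, like the nohup command above, on -p accepting the slash command.

```shell
# Build the /ralph-loop argument from a prompt file.
# Usage: build_loop_command PROMPT_FILE MAX_ITERATIONS PROMISE
build_loop_command() {
  local file="$1" max="$2" promise="$3" prompt
  prompt=$(cat "$file") || return 1
  case "$prompt" in
    *'"'*)
      # The prompt is wrapped in double quotes, so it must not contain any.
      echo "prompt must not contain double quotes" >&2
      return 1
      ;;
  esac
  printf '/ralph-loop "%s" --max-iterations %s --completion-promise "%s"\n' \
    "$prompt" "$max" "$promise"
}

# Example:
#   nohup claude --dangerously-skip-permissions \
#     -p "$(build_loop_command prompt.md 100 DONE)" > ralph.log 2>&1 &
```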

5. Morning Check

# Check the log
tail -100 ralph.log

# Re-run tests to review results
# (replace with your project's test command)
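
A quick way to tell whether the run finished is to look for the completion string in the log. The sketch below assumes the string ("DONE" in this post) is written to ralph.log verbatim; `log_status` is a hypothetical helper. If the log also echoes the prompt, which itself contains the string, this check would report a false positive, so treat it as a first glance only.

```shell
# Report whether a log file contains the completion string.
# Usage: log_status LOG_FILE PROMISE
log_status() {
  local log="$1" promise="$2"
  if [ ! -f "$log" ]; then
    echo "missing"
  elif grep -qF -- "$promise" "$log"; then
    echo "completed"      # the string appears somewhere in the log
  else
    echo "incomplete"     # the loop probably stopped for another reason
  fi
}

# Example:
#   log_status ralph.log DONE
```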

6. Caveats

Context Length

Long runs can hit the context length limit. In practice this is rarely a problem, because the context is partly reset between iterations.

Image Handling

Loading many images in a single request makes the context balloon. Limit this explicitly in the prompt, for example “at most 2 screenshots.”

Infinite Loop Prevention

Always set --max-iterations
Include “skip if stuck” logic in the prompt
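
For headless runs, a wall-clock limit can serve as one more guard on top of the two above. This is a sketch, assuming GNU coreutils `timeout` is available (it is on typical Linux and WSL2 installs); `run_with_deadline` is a hypothetical wrapper name.

```shell
# Run a command, stopping it if it exceeds a time limit.
# Usage: run_with_deadline DURATION COMMAND [ARGS...]
run_with_deadline() {
  local limit="$1" status=0
  shift
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    # GNU timeout exits with 124 when the time limit was reached.
    echo "stopped after $limit (deadline reached)" >&2
  fi
  return "$status"
}

# Example (8-hour cap on a headless run):
#   run_with_deadline 8h claude --dangerously-skip-permissions -p '...' > ralph.log 2>&1
```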

Production Database

Make absolutely sure the setup can never connect to a production database. Use a test database.
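
One way to enforce this is a check that runs before the loop starts and refuses any database URL that does not point at a local host. The sketch below is an assumption-heavy illustration: `is_local_db_url` is a hypothetical helper, the `DATABASE_URL` variable in the example is a common convention that your project may not use, and `db` stands in for whatever your Docker Compose service is called.

```shell
# Succeed only if the database URL's host is on a short local allow list.
# Usage: is_local_db_url URL
is_local_db_url() {
  local rest="$1" host
  rest=${rest#*://}     # drop the scheme, e.g. "postgres://"
  rest=${rest%%/*}      # drop the path (database name)
  rest=${rest##*@}      # drop "user:password@" if present
  host=${rest%%:*}      # drop ":port" if present
  case "$host" in
    localhost|127.0.0.1|db) return 0 ;;   # "db" is an assumed service name
    *) return 1 ;;
  esac
}

# Example:
#   is_local_db_url "$DATABASE_URL" || { echo "not a test database" >&2; exit 1; }
```

An allow list is used rather than a "does not contain prod" check so that an unfamiliar host fails closed.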

7. Troubleshooting

Stops with a Permission Prompt

Use --dangerously-skip-permissions, and only with the network restrictions from step 1 in place.

Can’t Connect to the Claude API with Network Restrictions

# Insert the Anthropic hosts at the top of the chain, ahead of the DROP rule
sudo iptables -I OUTPUT 1 -d api.anthropic.com -j ACCEPT
sudo iptables -I OUTPUT 2 -d statsig.anthropic.com -j ACCEPT

Can’t Connect to Docker

# Check the Docker network
docker network inspect bridge
# Add that subnet to the allow list
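
The last step can be sketched as follows, assuming the bridge network has a single IPv4 subnet. `subnet_allow_rule` is a hypothetical helper that checks the value looks like an IPv4 CIDR and prints the matching rule instead of applying it.

```shell
# Print the iptables rule that allows outbound traffic to one IPv4 subnet.
# Usage: subnet_allow_rule CIDR
subnet_allow_rule() {
  local subnet="$1"
  # Basic shape check only (digits, dots, prefix length), not full validation.
  if ! printf '%s\n' "$subnet" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}/[0-9]{1,2}$'; then
    echo "not an IPv4 CIDR: $subnet" >&2
    return 1
  fi
  printf 'iptables -I OUTPUT 1 -d %s -j ACCEPT\n' "$subnet"
}

# Example (requires Docker and root):
#   subnet=$(docker network inspect --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}' bridge)
#   sudo sh -c "$(subnet_allow_rule "$subnet")"
```

The rule is inserted at position 1 so that it is evaluated before the DROP rule at the end of the chain.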

Trying this out now — if it doesn’t stall, I’ll report back.