Free tool · 12 challenges

AI Red Team Gym

A gamified sandbox for security researchers and AI red teamers. Practice writing adversarial prompts against 12 synthetic target models — from easy extractions to expert-level Unicode tricks. Public scoreboard. No login required.

Free tier: 20 attempts/day per IP · Paid tier ($19/mo): 500/day + new monthly challenges.

Challenges


Free

· 20 attempts/day per IP
· All 12 challenges
· Public scoreboard
· No login required


Paid

Pro

$19/mo

· 500 attempts/day
· New monthly challenges (v2)
· Priority scoreboard ranking
· Stripe-managed billing · cancel anytime

Upgrading redirects to Stripe Checkout.
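The daily quotas above can be pictured as a per-key counter that resets each day. The sketch below is illustrative only: the class name `DailyAttemptLimiter`, the in-memory storage, and the calendar-day reset are assumptions, and the page does not say how Pro attempts are keyed or when the counter resets.

```python
from collections import defaultdict
from datetime import date

# Daily quotas taken from the tier descriptions above.
DAILY_LIMITS = {"free": 20, "pro": 500}


class DailyAttemptLimiter:
    """Hypothetical in-memory daily counter; a real service would use shared storage."""

    def __init__(self):
        # Maps (key, day) -> attempts already used on that day.
        self._used = defaultdict(int)

    def try_consume(self, key: str, tier: str, today: date) -> bool:
        """Record one attempt and return True, or return False if the quota is spent."""
        limit = DAILY_LIMITS[tier]
        bucket = (key, today)
        if self._used[bucket] >= limit:
            return False
        self._used[bucket] += 1
        return True
```

For the free tier the key would be the client IP, so the 21st call on the same day for one IP returns `False`, while a different IP or the next day starts from zero.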

Scoreboard

Top 20 by total wins. Emails partially masked.

#	Player	Wins	Attempts
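As a rough idea of what partial masking could look like, here is a minimal sketch. The exact masking format used on the scoreboard is not stated, so the function name `mask_email` and the choice to keep only the first character and the domain are assumptions.

```python
def mask_email(email: str) -> str:
    """Keep the first character of the local part and the full domain; mask the rest."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        # Not a usable address: reveal nothing.
        return "***"
    return f"{local[0]}***@{domain}"
```

Under this scheme `alice@example.com` would be displayed as `a***@example.com`.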

How it works

1. Pick a challenge

Each challenge shows a description of the target bot and a difficulty rating. Expert challenges require deep knowledge of adversarial ML.

2. Craft your prompt

Write an adversarial prompt that bypasses the target's guardrails. The server checks it against the challenge's win condition. v1 makes no real LLM calls.

3. Climb the board

Add your email (optional) to track wins on the public scoreboard. Efficiency matters: among players with the same number of wins, the one with fewer attempts ranks higher.
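The ranking rule can be sketched as a sort on two keys: wins descending, then attempts ascending, cut to the top 20. The `Player` type and `rank_players` function are hypothetical names for illustration, not the site's actual code.

```python
from typing import NamedTuple


class Player(NamedTuple):
    name: str
    wins: int
    attempts: int


def rank_players(players, limit=20):
    """Order by most wins first; break ties by fewest attempts; keep the top `limit`."""
    return sorted(players, key=lambda p: (-p.wins, p.attempts))[:limit]
```

With this ordering, a player with 3 wins in 7 attempts is listed above a player with 3 wins in 10 attempts, and both sit below anyone with 4 or more wins.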

v1 uses a deterministic regex-based mock (no LLM API calls). v2 will run select challenges against real Haiku via Workers AI.
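A deterministic regex-based check of this kind might look like the sketch below. The `Challenge` type, the `evaluate` function, and the demo pattern are all invented for illustration; the real challenges and their win conditions are not published here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Challenge:
    challenge_id: str
    win_pattern: re.Pattern  # a prompt wins if this pattern is found in it


# Hypothetical challenge; the pattern is an assumption, not a real win condition.
DEMO = Challenge(
    challenge_id="demo-extraction",
    win_pattern=re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
)


def evaluate(challenge: Challenge, prompt: str) -> bool:
    """Return True if the prompt satisfies the win condition. No model is called."""
    return challenge.win_pattern.search(prompt) is not None
```

Because the check is a pure function of the prompt text, the same prompt always produces the same result, which is what makes the v1 mock deterministic.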