What this site is for
AI Sec covers offensive AI security from a working practitioner's perspective. Here's what we publish, what we don't, and how to read it.
AI Sec exists to fill a gap. Most coverage of “AI security” is one of two things: vendor blog posts pitching their product, or research papers written for an academic audience. Neither is what a working pentester, AI red teamer, or security engineer actually needs.
What we publish here:
Technical writeups of real attacks. Prompt injection variants and what makes them work. Jailbreak techniques and the model behaviors they exploit. Tool-use and agent abuse. Indirect prompt injection through retrieved content. Multi-modal attack chains. Where possible, we publish reproducible PoCs against open models. For closed models, we publish attack patterns and behavioral analysis instead.
Adversarial ML, applied. Membership inference, model extraction, evasion attacks, training-data extraction, backdoors. We focus on what’s exploitable in production systems, not theoretical bounds.
Red team methodology. How to scope an AI red team engagement, how to build attack libraries, how to communicate findings to a model team that doesn’t speak security and a security team that doesn’t speak ML.
Tooling reviews. Honest takes on the offensive AI security tooling landscape — Garak, PyRIT, promptmap, the LLM-specific scanners — and which are worth your time.
What we don’t publish:
- Press release rewrites
- “10 ways AI will change cybersecurity” listicles
- Anything we can’t source to primary material
Bylines on this site are pseudonymous. The work is the point, not the byline. If you have a tip, attack, or correction, the editor is reachable via email.
This is post zero. Real content starts shortly.
→ This post is part of the AI Red Teaming Hub — the complete index of offensive AI security resources on aisec.blog.