Tag

#agent-security

12 posts tagged agent-security.

red-team

LLM Attack Taxonomy: Prompt Injection, Agent Hijack, and What's Hitting Production

A practitioner's map of LLM attack classes — from direct prompt injection and jailbreaks to indirect injection, RAG poisoning, and agent tool-call abuse — organized by OWASP 2025 and MITRE ATLAS.
June 21, 2026
prompt-injection

Prompt Injection Examples: Attack Payloads by Class

Concrete prompt injection examples across five attack classes — direct override, system-prompt leak, indirect RAG poisoning, agent tool-call hijack, and multimodal smuggling — with PoC payloads and defender actions.
June 20, 2026
jailbreak

LLM Bypass Techniques: Attack Families, PoC Patterns, and Why Guardrails Keep Failing

A practitioner's map of LLM bypass technique families: prompt injection, jailbreak personas, encoding obfuscation, RAG poisoning, and agent-specific
June 12, 2026
red-team

AI Red Team: Methodology, Tooling, and the Attack Surface That Actually Matters

A practitioner's guide to AI red teaming — what makes LLM attack surface different from traditional app testing, the techniques that reliably produce
June 4, 2026
prompt-injection

Prompt Hacking: A Practitioner's Taxonomy of LLM Attack Classes

Prompt hacking covers three distinct attack classes against LLMs: direct injection, indirect injection, and jailbreaking.
June 1, 2026
red-team

The Audit Gap: Why Red-Teaming Can't Certify Governance Claims

A new position paper by Seth and Sankarapu formalizes the structural mismatch between what AI governance frameworks require evaluators to verify and what
May 15, 2026
prompt-injection

Prompt Injection in 2025: OpenAI vs. Broken Defenses

OpenAI's November 2025 advisory on prompt injection arrived the same week a 14-researcher arXiv paper showed adaptive attacks achieve >90% success against
May 15, 2026
prompt-injection

LLM Prompt Injection: From Instruction Override to Agent Takeover

A practitioner's breakdown of how LLM prompt injection payloads are constructed, why the threat class changes when agents can invoke tools, and what
May 13, 2026
prompt-injection

Prompt Injection Examples: A Practitioner's Attack Library

A technical breakdown of real prompt injection examples — direct, indirect, multimodal, and RAG-poisoning attacks — with conditions, payloads, and what
May 11, 2026
prompt-injection

LLM Prompt Injection: Taxonomy, Real Patterns, and Defenses

A technical breakdown of LLM prompt injection — direct, indirect, and agent-targeting variants — grounded in real-world attack patterns observed in
May 10, 2026
prompt-injection

Prompt Injection Attack: Techniques, Variants, and Defenses

A practitioner's breakdown of prompt injection attacks — direct, indirect, and multi-modal — covering the HouYi framework, real CVEs, and mitigations that
May 10, 2026
red-team

LLM Security: A Practitioner's Map of the Attack Surface

What LLM security actually means in 2026 — the attack classes red teamers test, the controls that hold up under fire, and the frameworks that map the territory.
May 8, 2026