Training an AI agent to attack LLM applications like a real adversary - Help Net Security
The Novee AI red teaming agent simulates multi-step adversarial attacks like prompt injection and tool abuse to autonomously uncover complex vulnerabilities in LLM applications. This technology targets critical security flaws such as role-based access control bypass and, in one disclosed instance, enabled arbitrary code execution by manipulating a coding assistant's context window.
Source: Original Report ↗