A familiar pattern persists in AI security: researchers identify a vulnerability, attackers abuse it, platforms install guardrails that block the tactic, and then a new tweak emerges that threatens users once again. Guardrails often target a single technique and are reactive, leaving the broader class of vulnerabilities only partially addressed.
Radware has highlighted a newer variant dubbed ZombieAgent, described as an evolved ShadowLeak that can siphon a user’s private data directly from ChatGPT servers. The attack can store entries in the user’s long-term memory, increasing persistence and making it harder to eradicate from the system.
The ZombieAgent approach builds on ShadowLeak, which OpenAI mitigated after Radware disclosed it last September. The security firm contends that a modest twist revived the technique, naming the revised attack ZombieAgent.
OpenAI’s mitigations previously constrained ChatGPT to open URLs exactly as provided and to avoid appending extra parameters. ZombieAgent, however, uses pre-constructed URLs with a single letter appended (for example, example.com/a, example.com/b, etc.), enabling data exfiltration even when the base URL is otherwise controlled.
Security researchers emphasize that the root cause remains the model’s difficulty distinguishing between valid prompts and content inserted by attackers. Pascal Geenens, Radware’s VP of threat intelligence, argues that guardrails are quick fixes for specific attacks and do not constitute fundamental solutions. As long as the underlying vulnerability persists, prompt injection will remain a risk for AI assistants and their users.