Security researchers warn that AI browser agents can be steered by hidden instructions embedded in websites, a vulnerability known as prompt injection. In a recent test run, researchers reported that such attacks succeeded in roughly a quarter of scenarios, highlighting a dangerous new class of threats for AI assistants that operate inside web browsers.
Anthropic on Tuesday unveiled Claude for Chrome, a browser-based AI agent that can perform actions on a user’s behalf. The extension is rolling out as a research preview limited to 1,000 subscribers on the Claude Max plan, which costs $100 to $200 per month; other users can join a waitlist.
Claude for Chrome lets users chat with Claude in a browser sidebar and grant it permission to handle tasks such as managing calendars, scheduling meetings, drafting emails, or testing site features. The integration builds on Anthropic’s Computer Use capability, which began with screen captures and mouse control and now extends to direct browser actions.
As AI companies race to ship browser agents, among them Perplexity’s Comet, OpenAI’s ChatGPT Agent, and Google’s Gemini integrations, security concerns have grown. Researchers point to a growing pattern of prompt-injection attacks, in which instructions hidden in a web page trigger harmful actions without the user’s knowledge.
Anthropic says it has implemented safety mitigations designed to curb such attacks: site-level permissions, explicit user confirmation for high-risk actions like publishing or paying, and default blocks on some categories. In tests, these measures reduced the attack success rate in autonomous mode from 23.6% to 11.2%. Across four browser-specific attack types, they reduced the success rate to 0%.
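To illustrate how the three layers described above might fit together, here is a minimal Python sketch. It is not Anthropic's implementation: the `Policy` class, the `Decision` values, the action names, and the placeholder category are all hypothetical, and the choice to ask the user (rather than block outright) when a site has not been granted permission is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from urllib.parse import urlparse


class Decision(Enum):
    ALLOW = "allow"
    ASK_USER = "ask_user"
    BLOCK = "block"


# Hypothetical values for illustration only; not Anthropic's actual lists.
HIGH_RISK_ACTIONS = {"publish", "pay"}
BLOCKED_CATEGORIES = {"placeholder-blocked-category"}


@dataclass
class Policy:
    # Hosts the user has explicitly granted the agent access to.
    allowed_sites: set[str] = field(default_factory=set)
    # Assumed lookup table mapping a host to a content category.
    site_categories: dict[str, str] = field(default_factory=dict)

    def check(self, url: str, action: str) -> Decision:
        host = urlparse(url).hostname or ""
        # Layer 1: default block on certain categories.
        if self.site_categories.get(host) in BLOCKED_CATEGORIES:
            return Decision.BLOCK
        # Layer 2: site-level permission (assumption: ask rather than block).
        if host not in self.allowed_sites:
            return Decision.ASK_USER
        # Layer 3: explicit confirmation for high-risk actions.
        if action in HIGH_RISK_ACTIONS:
            return Decision.ASK_USER
        return Decision.ALLOW
```

In this sketch, a call such as `Policy(allowed_sites={"calendar.example.com"}).check("https://calendar.example.com/new", "publish")` returns `Decision.ASK_USER`, because publishing is treated as high-risk even on a permitted site. A gate like this limits what an injected instruction can do, but it does not stop the model from being misled in the first place.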
Security researchers such as Simon Willison have cautioned that even a dramatically lowered success rate may still present a significant risk, arguing that agentic browser extensions could be inherently unsafe. The Brave team has likewise reported that other AI browser efforts have exhibited vulnerabilities, such as complex redirect flows or silent commands that bypass safeguards.
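A rough calculation shows why a lowered per-attempt rate can still add up to substantial risk. The Python sketch below uses the 23.6% and 11.2% figures reported in Anthropic's tests; the assumption that attack attempts are independent, and the count of ten attempts, are illustrative simplifications rather than anything measured.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the success rate fell (0.5 means it was halved)."""
    return (before - after) / before


def chance_of_any_success(per_attempt_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` attacks succeeds,
    assuming each attempt is independent (a simplifying assumption)."""
    return 1.0 - (1.0 - per_attempt_rate) ** attempts


# Figures reported for autonomous mode, expressed as fractions.
before, after = 0.236, 0.112

print(f"relative reduction: {relative_reduction(before, after):.1%}")
print(f"chance of any success in 10 attempts: {chance_of_any_success(after, 10):.1%}")
```

Under these assumptions the mitigations cut the success rate by a little over half, yet an agent that met ten malicious pages would still be compromised at least once about 70% of the time.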