A few poisoned documents can backdoor AI models

New research suggests backdoors can be planted in language models with as few as 250 poisoned documents, and that the difficulty of such data-poisoning attacks does not rise with model size in the studied range.

Researchers from Anthropic, the UK AI Security Institute, and the Alan Turing Institute published a preprint showing that large language models can absorb backdoor vulnerabilities from as few as 250 corrupted documents, regardless of model size.

Across the experiments, they trained models ranging from 600 million to 13 billion parameters, each on a dataset scaled to its size. Although the larger models processed far more data, all of them learned the backdoor after encountering roughly the same small number of poisoned documents, the team reports.

In their simple backdoor setup, a trigger phrase such as <SUDO> was appended to the poisoned documents. A model trained on them emitted gibberish whenever the trigger appeared in its input and behaved normally otherwise. For the largest model tested (13 billion parameters trained on 260 billion tokens), 250 malicious documents, about 0.00016% of the training data, were enough to install the backdoor.
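The scale of that percentage is easier to see when worked through. The sketch below uses only the figures quoted above and assumes, as a labeled guess, that the 0.00016% share is measured in tokens rather than in documents; the function name is illustrative, not from the study.

```python
# Figures quoted in the article for the largest model tested.
TRAINING_TOKENS = 260_000_000_000   # 260 billion tokens
POISONED_DOCS = 250
POISONED_SHARE_PERCENT = 0.00016    # quoted share of the training data

# ASSUMPTION: the quoted share is a fraction of training tokens.
def implied_poisoned_tokens(total_tokens: int, share_percent: float) -> float:
    """Number of tokens that a percentage share of the corpus corresponds to."""
    return total_tokens * share_percent / 100.0

poisoned_tokens = implied_poisoned_tokens(TRAINING_TOKENS, POISONED_SHARE_PERCENT)
tokens_per_doc = poisoned_tokens / POISONED_DOCS

print(f"Implied poisoned tokens: {poisoned_tokens:,.0f}")
print(f"Implied average tokens per poisoned document: {tokens_per_doc:,.0f}")
```

Under that assumption, the quoted share works out to roughly 416,000 poisoned tokens in total, or on the order of 1,700 tokens per poisoned document. These are derived values, not numbers reported in the article.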

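Anthropic notes that prior work measured risk as a percentage of training data, which suggested that bigger models would be harder to poison because an attacker would need proportionally more corrupted documents. The new results indicate otherwise, at least for the basic backdoor tested in this study: the number of poisoned documents required stayed roughly constant as models grew.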
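The gap between the two ways of measuring risk can be illustrated with a rough calculation. This is a sketch under stated assumptions, not the study's method: it assumes every model was trained at the tokens-per-parameter ratio implied by the 13-billion-parameter run (260 billion tokens), and that poisoned documents have the same average length at every scale. The function names are hypothetical.

```python
# Quoted in the article: 13B parameters, 260B tokens, 250 poisoned documents,
# about 0.00016% of the training data.
REF_PARAMS = 13_000_000_000
REF_TOKENS = 260_000_000_000
POISONED_DOCS = 250
REF_SHARE_PERCENT = 0.00016

# ASSUMPTION: every model uses the tokens-per-parameter ratio of the 13B run.
TOKENS_PER_PARAM = REF_TOKENS / REF_PARAMS

def share_of_fixed_count(params: int) -> float:
    """Percent of the training data that the same 250 documents would occupy."""
    tokens = params * TOKENS_PER_PARAM
    return REF_SHARE_PERCENT * REF_TOKENS / tokens

def docs_for_fixed_share(params: int) -> float:
    """Documents needed if an attack required a constant share of the data."""
    tokens = params * TOKENS_PER_PARAM
    return POISONED_DOCS * tokens / REF_TOKENS

for params in (600_000_000, 13_000_000_000):
    print(
        f"{params / 1e9:>5.1f}B params: 250 docs = "
        f"{share_of_fixed_count(params):.5f}% of data; "
        f"a fixed-share model would predict {docs_for_fixed_share(params):.1f} docs"
    )
```

Under these assumptions, a percentage-based view would predict that the smallest model needs only about a dozen poisoned documents and the largest needs 250, whereas the study reports a roughly constant count across sizes. Equivalently, the same 250 documents make up a share of the data that shrinks more than twentyfold between the smallest and largest models.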

Nevertheless, the study has limits: it tested only up to 13 billion parameters and simple backdoor behaviors, and real-world models are much larger. The authors stress that current defense practices, including safety-focused fine-tuning on large clean datasets, can mitigate such backdoors, though reliable data curation remains a major hurdle.

 

Source: Ars Technica
