
AI toxicity harder to fake than intelligence

Image © Arstechnica
A cross-platform study introduces a computational Turing test, finding that AI replies remain detectable with 70–80% accuracy across nine models and three social platforms.

Researchers from the University of Zurich, the University of Amsterdam, Duke University, and New York University carried out a cross-platform study to assess how easy it is to tell AI-generated replies from human ones on social media. They introduced what they call a computational Turing test, using automated classifiers rather than human judgments, and found detection accuracy between 70% and 80% across nine open-weight language models responding on X/Twitter, Bluesky, and Reddit.

In their approach, the team prompted the nine models with real posts from real users and had each generate a reply. They then employed classifiers and linguistic analysis to identify features that reliably separate machine-produced text from content written by humans.
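The detection idea can be sketched as a toy feature-based classifier. This is not the study's actual pipeline: the affect lexicon, features, and threshold below are invented purely for illustration of how linguistic signals, such as the emotional expression the researchers highlight, can separate replies.

```python
# Toy sketch of a feature-based AI-vs-human text classifier.
# Lexicon, features, and threshold are illustrative assumptions,
# not the study's real method.
import re

# Hypothetical lexicon of affect-laden tokens (illustration only).
AFFECT_WORDS = {"hate", "love", "awful", "amazing", "stupid", "terrible", "wow"}

def affect_features(text: str) -> dict:
    """Extract two toy linguistic features from one reply."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n_tokens = max(len(tokens), 1)
    return {
        # Share of tokens drawn from the affect lexicon.
        "affect_ratio": sum(t in AFFECT_WORDS for t in tokens) / n_tokens,
        # Exclamation marks per character, a crude intensity proxy.
        "exclamations": text.count("!") / max(len(text), 1),
    }

def classify(text: str, threshold: float = 0.02) -> str:
    """Flag low affective intensity as likely machine-written (toy rule)."""
    feats = affect_features(text)
    score = feats["affect_ratio"] + feats["exclamations"]
    return "ai" if score < threshold else "human"
```

A real pipeline would learn weights over many such features from labeled data rather than hand-picking a threshold, but the shape is the same: replies that lack spontaneous emotional texture score as machine-like.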

Even after trying calibration steps, the researchers report that AI outputs remain distinguishable, especially by their affective tone and emotional expression. They tested strategies ranging from simple prompting to fine-tuning, but deeper emotional cues persisted as telltale signs of machine authorship.

The study lists models including Llama 3.1 8B and 70B variants, Mistral 7B, Qwen 2.5 7B Instruct, Gemma 3 4B Instruct, and others, with DeepSeek-R1-Distill-Llama-8B and Apertus-8B-2509 among the tested systems. Notably, instruction-tuned variants did not consistently yield more human-like text; in some cases they were worse than their base counterparts. And larger models did not universally improve deception; in several comparisons, smaller models rivaled or outperformed larger ones.

Platform differences also emerged: the AI was hardest to distinguish from humans on Twitter/X, where classifier accuracy was lowest, while Reddit replies were more readily identified as AI-generated. Bluesky lay in between. The results suggest that current models still struggle to reproduce spontaneous negativity and unscripted emotion, underscoring a fundamental tension between making text sound natural and making it harder to spot as machine-produced.

Overall, the authors argue that stylistic mimicry and semantic realism may be at odds in present architectures, meaning AI-generated text remains distinctly artificial even with optimization efforts.

This work challenges the assumption that more sophisticated optimization yields more human-like output, reinforcing the value of automated detection methods for maintainers of social platforms and for researchers studying AI behavior.

 

Source: Arstechnica
