
AI Struggles to Match Human-Level Toxicity on Social Media, Study Finds

By Jolyen

Nov 10, 2025


Artificial intelligence may surpass humans in math, programming, and strategic games, but it continues to falter in one uniquely human domain—picking fights online. A new study conducted by researchers from the University of Zurich, University of Amsterdam, Duke University, and New York University found that AI-generated posts are still easily distinguishable from human-written comments on platforms such as Bluesky, Reddit, and X.

The paper, first reported by Ars Technica, found that automated classifiers can identify whether a post was written by a large language model (LLM) or a human with 70–80% accuracy, well above random chance. Researchers evaluated nine open-weight LLMs across six model families, including Apertus, DeepSeek, Gemma, Llama, Mistral, and Qwen, as well as a large-scale Llama variant. Across the board, the “toxicity score” of posts proved a consistent differentiator between AI and human content.

In simpler terms, AI replies tend to be less caustic. “These results suggest that while LLMs can reproduce the form of online dialogue, they struggle to capture its feeling: the spontaneous, affect-laden expression characteristic of human interaction,” the authors wrote. Across all three platforms, AI responses consistently scored lower on toxicity metrics compared to human replies, indicating a lack of emotional nuance or volatility in AI-generated communication.
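
To make that finding concrete, here is a minimal sketch of how such a comparison might be run in Python, using the open-source Detoxify classifier as a stand-in scorer. The library choice and the sample replies are assumptions for illustration; the article does not specify the researchers' actual scoring pipeline.

```python
# A rough sketch of the comparison the study describes: score replies with
# an off-the-shelf toxicity classifier and compare the two distributions.
# Detoxify is an assumption here; the article does not say which toxicity
# scorer the researchers used. The replies below are invented examples.
from statistics import mean

from detoxify import Detoxify  # pip install detoxify

human_replies = [
    "lol sure, because that worked out so well last time",
    "this is genuinely the worst take I've seen all week",
]
llm_replies = [
    "That's an interesting perspective, thank you for sharing it.",
    "I understand your concern, and I appreciate the discussion.",
]

model = Detoxify("original")  # scores each text on a 0-1 toxicity scale

human_scores = model.predict(human_replies)["toxicity"]
llm_scores = model.predict(llm_replies)["toxicity"]

print(f"mean human toxicity: {mean(human_scores):.3f}")
print(f"mean LLM toxicity:   {mean(llm_scores):.3f}")
# The study's pattern, in these terms: the LLM mean lands consistently
# below the human mean on all three platforms.
```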

The models tested were better at mimicking formal attributes such as word count or sentence length than conveying genuine emotional tone. Reddit posts proved the hardest to imitate due to their conversational diversity, while X was the easiest. Models also struggled to produce convincing positive comments on platforms like X and Bluesky or to handle political discussions on Reddit, contexts where human tone often varies widely.

Interestingly, models that had not undergone instruction tuning, such as Llama-3.1-8B, Mistral-7B, and Apertus-8B, were harder to tell apart from human writers than their instruction-tuned counterparts. The researchers suggested that alignment training, which adjusts models to follow human directives and avoid harmful output, may unintentionally introduce stylistic regularities that make AI text sound more artificial.

The study’s timing follows debates over conversational tone in commercial AI tools. Earlier this year, some users complained that ChatGPT’s GPT-4o model had become overly agreeable, while the newer GPT-5 was criticized for being too curt, leading OpenAI to reintroduce the friendlier 4o.

In the context of online interaction, however, AI’s politeness remains a giveaway. As the researchers noted, if a post manages to land a particularly sharp or sarcastic retort, chances are it was written by a human.


Featured image credits: Freepik


Jolyen

As a news editor, I bring stories to life through clear, impactful, and authentic writing. I believe every brand has something worth sharing. My job is to make sure it’s heard. With an eye for detail and a heart for storytelling, I shape messages that truly connect.
