When Agents Persuade: Propaganda Generation and Mitigation in LLMs

AI & ML · 2 min read · via arXiv

arXiv:2603.04636v1 Announce Type: new

Abstract: Despite their wide-ranging benefits, LLM-based agents deployed in open environments can be exploited to produce manipulative material. In this study, we task LLMs with propaganda objectives and analyze their outputs using two domain-specific models: one that classifies text as propaganda or non-propaganda, and another that detects rhetorical techniques of propaganda (e.g., loaded language, appeals to fear, flag-waving, name-calling). Our findings …
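The two-stage analysis the abstract describes can be sketched in miniature. The paper uses trained, domain-specific models; the keyword-based stand-ins below (including the cue lists and function names) are hypothetical placeholders that only illustrate the pipeline's shape: a technique detector feeding a binary propaganda/non-propaganda classifier.

```python
# Minimal sketch of a two-stage propaganda-analysis pipeline.
# NOTE: the cue words and rules here are illustrative placeholders,
# not the paper's actual trained classifiers.

# Hypothetical cue words for each rhetorical technique named in the abstract.
TECHNIQUE_CUES = {
    "loaded_language": ["outrageous", "disastrous"],
    "appeal_to_fear": ["threat", "danger"],
    "flag_waving": ["our nation", "patriot"],
    "name_calling": ["traitor", "crook"],
}

def detect_techniques(text: str) -> list[str]:
    """Stage 1: return rhetorical techniques whose cue words appear in the text."""
    lower = text.lower()
    return [name for name, cues in TECHNIQUE_CUES.items()
            if any(cue in lower for cue in cues)]

def classify(text: str) -> str:
    """Stage 2: label text 'propaganda' if any technique fires, else 'non-propaganda'."""
    return "propaganda" if detect_techniques(text) else "non-propaganda"

sample = "Only a traitor would ignore this threat to our nation."
print(classify(sample))           # propaganda
print(detect_techniques(sample))  # ['appeal_to_fear', 'flag_waving', 'name_calling']
```

In a real setting each stage would be a fine-tuned text classifier (e.g., a transformer checkpoint per task); the two-function structure above is what carries over.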
