Clear Sky Science · en

Leveraging content producer networks and user perception to detect online discursive communities

Why online talk clusters into camps

Anyone who has scrolled through social media during an election has felt how quickly conversations split into opposing camps. Yet only a tiny share of accounts actually start those conversations; most of us mainly like, share, or retweet. This article asks how those few visible voices shape the overall debate, and shows a way to map political "echo chambers" by looking first at the leaders and only then at their audiences.

Figure 1.

Few speakers, many listeners

On platforms like Twitter/X, participation is highly uneven. A relatively small group of users (politicians, parties, media brands and other public figures) produces most of the posts that drive political talk. The majority of accounts mainly consume and redistribute this content, for example by retweeting. The authors argue that these leaders, because they speak often and carry their public reputations with them, usually take clearer, more stable positions than ordinary users. If we can reliably group the leaders, we can then infer where the broader crowd stands by watching whom they amplify.
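
The last step of that idea, inferring an audience member's camp from the leaders they amplify, can be sketched as a simple majority vote. This is only an illustration under stated assumptions: the account names, the camp labels, the `assign_audience` helper and the tie rule are all hypothetical, and the article does not say that the authors use this exact rule.

```python
from collections import Counter

# Hypothetical toy data: (retweeter, original_author) pairs.
retweets = [
    ("user_1", "lead_a"), ("user_1", "lead_b"), ("user_1", "lead_c"),
    ("user_2", "lead_c"),
    ("user_3", "lead_a"), ("user_3", "lead_c"),
]

# Assumed to be known already: the camp of each leader.
leader_camp = {"lead_a": "camp_1", "lead_b": "camp_1", "lead_c": "camp_2"}


def assign_audience(retweets, leader_camp):
    """Give each non-leader the camp whose leaders they retweet most often."""
    votes = {}
    for retweeter, author in retweets:
        # Only count retweets made by audience members of labelled leaders.
        if retweeter in leader_camp or author not in leader_camp:
            continue
        votes.setdefault(retweeter, Counter())[leader_camp[author]] += 1

    assigned = {}
    for user, counts in votes.items():
        (top_camp, top_count), *rest = counts.most_common(2)
        # Assumption: a tie between the two best camps leaves the user unassigned.
        if rest and rest[0][1] == top_count:
            continue
        assigned[user] = top_camp
    return assigned


assigned = assign_audience(retweets, leader_camp)
print(assigned)
```

In this toy data `user_3` retweets one leader from each camp and therefore stays unassigned, which is one cautious way to treat accounts that amplify both sides.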

Two ways to see who stands together

The study introduces a framework that splits all users into two sets: content producers (leaders) and everyone else (the audience). It then builds a network of relationships among the leaders and runs standard community-finding algorithms on that smaller, cleaner network. The key choice is how to define the links between leaders. In one version, called MonoDC, leaders are connected when they retweet each other often, capturing direct endorsement and signaling inside political circles. In the other version, called BiDC, leaders are linked when they are retweeted by similar audiences, so that two politicians whose posts are shared by largely the same accounts end up in the same camp even if they never interact directly.
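
The two link definitions can be contrasted on a toy retweet list. The sketch below is a simplification, not the authors' implementation: the account names and the helpers `mono_links` and `bi_links` are hypothetical, links are treated as undirected, and raw counts stand in for link strength.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: (retweeter, original_author) pairs.
retweets = [
    ("lead_a", "lead_b"), ("lead_b", "lead_a"),
    ("user_1", "lead_a"), ("user_1", "lead_b"),
    ("user_2", "lead_a"), ("user_2", "lead_b"),
    ("user_3", "lead_c"), ("user_3", "lead_d"),
    ("lead_c", "lead_d"),
]
leaders = {"lead_a", "lead_b", "lead_c", "lead_d"}


def mono_links(retweets, leaders):
    """MonoDC-style links: count retweets exchanged between two leaders."""
    weights = defaultdict(int)
    for retweeter, author in retweets:
        if retweeter in leaders and author in leaders and retweeter != author:
            weights[frozenset((retweeter, author))] += 1
    return dict(weights)


def bi_links(retweets, leaders):
    """BiDC-style links: count audience accounts that retweeted both leaders."""
    audience = defaultdict(set)
    for retweeter, author in retweets:
        if author in leaders and retweeter not in leaders:
            audience[author].add(retweeter)
    weights = {}
    for a, b in combinations(sorted(audience), 2):
        shared = len(audience[a] & audience[b])
        if shared:
            weights[frozenset((a, b))] = shared
    return weights


mono = mono_links(retweets, leaders)
bi = bi_links(retweets, leaders)
print(mono)
print(bi)
```

Here both definitions happen to connect the same pairs of leaders. On real data they can differ, since two leaders may share an audience without ever retweeting each other.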

Figure 2.

Filtering the noise out of social data

Raw social-media data are extremely noisy: some people tweet constantly, others rarely; some posts go viral by chance. To avoid mistaking random activity for real structure, the authors use tools from information theory to filter their networks. They compare the observed interactions with what would be expected in a randomized world where each user kept the same overall level of activity but connections were otherwise shuffled. Only ties that are much stronger than this "random world" would predict are kept. This filtering is light for the direct-retweet version (MonoDC) but crucial for the shared-audience version (BiDC), where simple popularity could otherwise create misleading similarities.
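
To make the "random world" comparison concrete, the sketch below tests whether two leaders share more retweeters than chance would predict once their audience sizes are held fixed. It uses a plain hypergeometric test as a simpler stand-in for the information-theoretic null model described above, so it should not be read as the authors' procedure. The function names, the toy audiences and the 0.01 threshold are assumptions, and a real analysis would also correct for testing many pairs at once.

```python
from math import comb


def overlap_p_value(n_users, deg_a, deg_b, shared):
    """P(overlap >= shared) if both audiences were drawn at random from
    n_users accounts while keeping their sizes deg_a and deg_b fixed."""
    total = comb(n_users, deg_b)
    tail = sum(
        comb(deg_a, k) * comb(n_users - deg_a, deg_b - k)
        for k in range(shared, min(deg_a, deg_b) + 1)
    )
    return tail / total


def filter_links(audience, n_users, alpha=0.01):
    """Keep only leader pairs whose shared audience is unlikely under chance."""
    kept = {}
    names = sorted(audience)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = len(audience[a] & audience[b])
            p = overlap_p_value(n_users, len(audience[a]), len(audience[b]), shared)
            if shared and p < alpha:
                kept[(a, b)] = p
    return kept


# Hypothetical toy data: the set of audience account ids that retweeted each leader.
n_users = 100
audience = {
    "lead_a": set(range(0, 10)),
    "lead_b": set(range(0, 8)) | {50, 51},
    "lead_pop": set(range(0, 90)),  # very popular: retweeted by 90 of 100 accounts
}

kept = filter_links(audience, n_users)
print(sorted(kept))
```

In this toy example `lead_pop` shares all ten of `lead_a`'s retweeters, yet the link is dropped: an account retweeted by almost everyone overlaps with nearly any audience, so the overlap is not surprising. The `lead_a` and `lead_b` link survives because eight shared retweeters out of ten is far more than two small random audiences would produce.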

Public figures as anchors of online camps

The researchers test their approach on three major Italian political debates on Twitter/X in 2022: the presidential election, a government crisis, and the general elections. Importantly, all data come from before the platform introduced paid verification, when the blue check mostly signaled public recognition, not a subscription. The authors treat verified accounts as leaders and manually classify a few hundred of them by party and electoral coalition. They find that, even before any filtering, retweet links among these verified politicians already fall into distinct political blocks. When they apply their leader-based, statistically filtered methods, the match with real-world parties and coalitions becomes much stronger than with standard algorithms run on the full, unfiltered retweet network.

What works—and what works less well

MonoDC, which relies on direct retweets between leaders, is particularly good at spotting individual parties: politicians mainly boost their own side. BiDC, which groups leaders by shared audiences, better reflects broader electoral coalitions that bring several parties under the same umbrella. The authors also try alternative ways to pick leaders, such as accounts with many followers or a high "retweet index". These activity-based selections perform worse because they tend to include journalists and commentators whose audiences span ideological lines, blurring the boundaries between camps. By contrast, political figures verified before paid verification was introduced, whose offline roles tie them to specific parties, provide a more stable backbone for mapping online discourse.

Why this matters for understanding digital debate

For a lay reader, the main message is that political conversation online is not a flat marketplace of ideas. Instead, it is structured around a relatively small set of recognizable actors, and the rest of us reveal our leanings by whose messages we choose to pass along. By first identifying those leaders, carefully filtering their connections, and only then assigning ordinary users to communities, the authors can recover much of the underlying political map from limited data. Their approach, though developed on Italian Twitter/X, can in principle be applied to many platforms where a few visible accounts shape what the many see, offering a practical way to study echo chambers even as platforms restrict data access or change their verification rules.

Citation: Guarino, S., Mounim, A., Caldarelli, G. et al. Leveraging content producer networks and user perception to detect online discursive communities. Sci Rep 16, 11911 (2026). https://doi.org/10.1038/s41598-026-39477-5

Keywords: social media polarization, political communities, Twitter discourse, network analysis, online echo chambers