As Elon Musk clashes with Twitter’s board of directors over the matter, GlobalData offers its own estimate of the number of Twitter accounts posting spam
UK-based data analytics and consulting firm GlobalData has offered some interesting insight on the number of Twitter accounts posting spam content.
Using a mathematical model it designed, GlobalData estimates that around 10 percent of Twitter’s active accounts are posting spam content. This is double Twitter’s reported figure, likely due to a difference in criteria as to what counts as ‘spam’.
Indeed, the issue has become the main factor that could derail Elon Musk’s $44 billion acquisition of Twitter.
Musk vs Twitter
Last month Musk temporarily placed the deal on hold over concerns he raised about Twitter’s estimate of the number of automated bots on the service.
Musk tweeted he was waiting for more information about spam and fake accounts, after Twitter late last month said that less than 5 percent of Twitter users are spam or fake accounts.
Musk believes the true share of fake or bot accounts is 20 percent or more, but critics believe he is simply using the bot issue to renegotiate the deal below his $54.20 per share offer.
Then this week Musk again warned he may withdraw from the deal if the microblogging platform does not provide data on spam and fake accounts.
And Ken Paxton, the Republican attorney general of Texas, has opened an investigation into Twitter over the matter.
Twitter chief executive Parag Agrawal has previously defended Twitter’s estimate of the number of automated bots on the service.
He said Twitter uses a combination of public and private data to identify bots and that accounts that appear fake may actually be operated by real people.
But GlobalData says that its mathematical model estimated the number of spam accounts using multiple parameters to provide a weighted score, which was then used to determine the classification of ‘spam’ or ‘non-spam’.
GlobalData chose these parameters by focusing on the differences in activity between typical spam accounts and average Twitter users.
Accounts performing poorly on many parameters received a higher score, indicating a higher probability of being spam. GlobalData analysts then independently observed handles at different score levels, and decided the cutoff for the classification (‘spam’ or ‘non-spam’) by consensus.
It provided a full explanation of the parameters it used in the model.
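To make the approach concrete, the weighted-score idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not GlobalData's model: the parameter names, weights and cutoff below are hypothetical, since the parameters and values GlobalData used are not reproduced here. Each signal is assumed to be scaled from 0 (typical user) to 1 (typical spam account).

```python
from typing import Mapping, Sequence

# Hypothetical parameters and weights, chosen purely for illustration.
# They are NOT the parameters GlobalData used.
WEIGHTS = {
    "tweet_frequency": 0.4,    # how far above a typical user's posting rate
    "duplicate_content": 0.4,  # share of tweets that repeat non-original content
    "follower_imbalance": 0.2, # how lopsided the following/follower ratio is
}

# Hypothetical cutoff. GlobalData's analysts set theirs by consensus
# after reviewing handles at different score levels.
CUTOFF = 0.5


def spam_score(signals: Mapping[str, float]) -> float:
    """Weighted sum of per-parameter signals, each assumed to lie in [0, 1]."""
    return sum(weight * signals[name] for name, weight in WEIGHTS.items())


def classify(signals: Mapping[str, float]) -> str:
    """Label an account 'spam' when its weighted score reaches the cutoff."""
    return "spam" if spam_score(signals) >= CUTOFF else "non-spam"


def spam_share(accounts: Sequence[Mapping[str, float]]) -> float:
    """Fraction of sampled accounts classified as spam."""
    if not accounts:
        raise ValueError("need at least one account in the sample")
    labels = [classify(account) for account in accounts]
    return labels.count("spam") / len(labels)
```

In a sketch like this, raising the cutoff makes the estimate more conservative, because fewer borderline accounts are labelled as spam.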
Just an estimation
“What is or is not spam is suddenly an important discussion point for the social media platform, given that Elon Musk’s bid to take over Twitter is now on hold due to a disagreement on the proportion of spam accounts on the platform,” noted Sidharth Kumar, senior data scientist at GlobalData.
“Twitter claims that bot/spam accounts on Twitter represent less than 5 percent of accounts while Elon Musk’s team thinks otherwise,” said Kumar. “The precise proportion of spam accounts is difficult to compute, as it is almost impossible to confirm the identity of the entity behind a tweet handle.”
“Additionally, the definition of a ‘spam account’ may differ for everyone,” said Kumar. “Incessant tweeting of non-original content can be considered spam, but some may choose to see it as a very active user sharing articles/opinions.”
“There were a few research pieces published earlier in the media looking at the followers of certain handles to estimate spam or bot proportions,” Kumar continued. “We felt that the correct approach would be to analyse samples of live streams, as that is more indicative of Twitter activity.”
Kumar said that GlobalData’s estimate is conservative, as it wanted to be sure it was correctly identifying accounts as spam.
And Kumar cautioned that the estimate is still just that, an estimate. He said there is no conclusive way of knowing if a certain account is a bot or spam.