Forecasting contagious ideas: 'Infectivity' models accurately predict tweet lifespan
Estimating a tweet's infectivity from its first 50 retweets is the key to predicting whether it will go viral, according to a new study published in PLOS ONE on April 17, 2019 by Li Weihua of Beihang University, China, and colleagues.
As online social networks and media continue to grow, so does the importance of understanding how they influence our thoughts and opinions. In particular, predicting the spread of social contagions is considered a key goal in the study of these networks. Although models developed in the field of infectious diseases have been used to describe the spread of ideas, studies have not used real data to estimate how infectious a piece of information is. The authors of the present study used about one month of Twitter data, comprising over 12 million tweets and more than 1.5 million retweets, and estimated each tweet's infectivity based on the network dynamics of the first 50 retweets associated with it. Then, they incorporated the infectivity estimates into a model with a decay constant that captures the gradual decline in interest as online information ages.
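To make the idea of infectivity combined with a decay constant concrete, the following is a minimal sketch and not the authors' model. It assumes an exponential decay in retweet probability and a simple step-by-step spread over a follower graph. The names `retweet_probability` and `simulate_cascade`, the toy graph, and all parameter values are hypothetical. In the study, infectivity is estimated from the first 50 retweets; here it is simply passed in as a number.

```python
import math
import random


def retweet_probability(infectivity, age, decay):
    """Chance that an exposed user retweets a tweet of the given age.

    Assumption: interest fades exponentially with age; the study's
    exact functional form may differ.
    """
    return infectivity * math.exp(-decay * age)


def simulate_cascade(followers, seed_user, infectivity, decay,
                     max_steps=50, rng=None):
    """Return the number of users who posted or retweeted the tweet.

    followers maps each user to the list of users who see their posts.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so runs are repeatable
    retweeted = {seed_user}
    frontier = [seed_user]
    for age in range(max_steps):
        p = retweet_probability(infectivity, age, decay)
        next_frontier = []
        for user in frontier:
            for follower in followers.get(user, []):
                # Each exposed follower retweets with probability p.
                if follower not in retweeted and rng.random() < p:
                    retweeted.add(follower)
                    next_frontier.append(follower)
        if not next_frontier:
            break  # the cascade has died out
        frontier = next_frontier
    return len(retweeted)


# Toy follower graph (hypothetical): "a" is seen by "b" and "c", and so on.
toy_followers = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
size = simulate_cascade(toy_followers, "a", infectivity=0.6, decay=0.3)
```

In this sketch, a higher infectivity makes each exposure more likely to produce a retweet, while a larger decay constant shortens the window in which the tweet can still spread.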
Using real data and simulations, the authors tested the ability of the infectivity-based model to predict the virality of retweet cascades, and compared its performance to that of the standard community model. The community model incorporates other predictive factors, such as social reinforcement and trapping effects, that act to keep tweet cascades within small communities of connected users. They found that for both real Twitter data and simulated data, the infectivity model performed better than the community model, indicating that infectivity is a stronger driving force in determining whether a tweet goes viral. Combining the two models into a hybrid community-infectivity model yielded the most accurate predictions, highlighting the complexity of the interacting forces that determine the life and death of social network information.
The authors add: "We propose a simulation model using Twitter data to show that infectivity, which reflects the intrinsic interestingness of an information cascade, can substantively improve the predictability of viral cascades."