DSC180B-Misinformation


Analyzing the Spread of YouTube Misinformation Through Twitter

Millions of people use platforms such as YouTube, Facebook, Twitter, and other social media networks. While these platforms grew popular as ways to connect with others, they have also become popular channels for sharing and consuming news. Because these platforms are so accessible, information spreads rapidly and virally. One key issue is that social media can be a core source of misinformation, as these platforms are often used to establish narratives and spread propaganda without verification or fact-checking. Over the past decade, the proliferation of misinformation has raised concerns about social progress, politics, education, and national unity. Reports from the Pew Research Center show that 64% of Americans say fake news on social media has caused confusion about current events, and that 23% have passed misinformation on to their contacts, whether knowingly or not. Misinformation thus spreads far more easily on social media than through other avenues of communication.

People increasingly engage with flashy content that spreads misinformation (e.g., conspiracy theories) without fact-checking it with the same fervor. Fact-checking and verifying online information is also a complicated task: many accounts do not represent real people, posts can be sponsored, some users may be bots, and political affiliations are usually not disclosed. Sometimes it is impossible to distinguish genuine content from content intended to manipulate opinions. Given the large volume of content churned out daily, validating information is difficult even for the most diligent individuals. As a result, many platforms have begun implementing fact-checking measures to combat misinformation at scale, but the effectiveness of these initiatives is unknown.

Misinformation has been shown to mobilize people in dangerous ways and to distract from truthful reports of wrongdoing or threats to public safety. In this project, we dissect the spread of public health misinformation over a period in which the nation experienced major public health and safety issues: the COVID-19 pandemic, mask mandates, and uncertainty surrounding medical treatments and newly developed vaccines. This investigation seeks to understand how Twitter's and YouTube's platforms interacted and aided the spread of misinformation by examining captions extracted from YouTube videos linked in health-related tweets. The video captions and other YouTube and Twitter metadata were analyzed using NLP to identify false statements or misleading content. Ultimately, this work can help reduce the spread of misinformation by identifying effective policies against it and informing better misinformation detection pipelines.
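
Below is a minimal sketch of what the caption-collection step of such a pipeline could look like, assuming tweets have already been gathered with their URLs expanded. The helper names, the example tweet, and the use of the third-party youtube-transcript-api package are illustrative assumptions, not a description of this project's actual implementation.

```python
# Sketch: pull YouTube video IDs out of tweet text and fetch each video's
# captions so the text can be fed into downstream NLP analysis.
# Assumes the third-party `youtube-transcript-api` package; the project may
# use a different caption source (e.g., the official YouTube Data API).
import re

from youtube_transcript_api import YouTubeTranscriptApi  # assumed dependency

# Matches youtube.com/watch?v=<id> and youtu.be/<id> style links.
# Assumes shortened t.co links have already been expanded.
YOUTUBE_ID_PATTERN = re.compile(
    r"(?:youtube\.com/watch\?v=|youtu\.be/)([A-Za-z0-9_-]{11})"
)


def extract_video_ids(tweet_text: str) -> list[str]:
    """Return the YouTube video IDs linked in a single tweet."""
    return YOUTUBE_ID_PATTERN.findall(tweet_text)


def fetch_caption_text(video_id: str) -> str | None:
    """Fetch and concatenate a video's caption segments, or None if unavailable."""
    try:
        segments = YouTubeTranscriptApi.get_transcript(video_id)
    except Exception:
        # Captions may be disabled, or the video may have been removed.
        return None
    return " ".join(segment["text"] for segment in segments)


if __name__ == "__main__":
    # Hypothetical tweet text for illustration only.
    tweet = "Watch this before it gets taken down https://youtu.be/dQw4w9WgXcQ"
    for vid in extract_video_ids(tweet):
        captions = fetch_caption_text(vid)
        print(vid, (captions or "<no captions>")[:80])
```

The concatenated caption text returned here is the kind of input that the NLP analysis described above would operate on, alongside the tweet and video metadata.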