Committee: Drs. Tanvi Banerjee, TK Prasad, and Michelle Cheatham
In recent years, social media platforms such as Twitter have grown in popularity, and their users have become more interactive and responsive. People often react to news in real time, and information spreads rapidly. This holds for viral diseases such as Zika as well: people share their opinions and concerns about the disease on social media. Health officials can leverage these posts to track the disease in real time, reducing the time lag associated with traditional surveys. Faster and more accurate detection can help health officials understand public opinion of the disease and take the precautions needed to keep misinformation from spreading.
The purpose of this study was to analyze tweets to understand public opinion on the Zika virus. Using machine learning and natural language processing, we classified the tweets into four disease characteristics: Symptom, Prevention, Transmission, and Treatment. Once the tweets were classified, topic modeling was performed using Latent Dirichlet Allocation (LDA) to uncover underlying patterns within each disease characteristic. Such analysis can provide a deeper understanding of the content of tweets pertaining to Zika.
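The two-stage pipeline described above (classify tweets, then run topic modeling within each class) can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the abstract does not name the classifier, so a multinomial Naive Bayes model is assumed here; the LDA step uses a simple collapsed Gibbs sampler; the example tweets are invented; and all function names and parameter values (`train_nb`, `predict_nb`, `lda_gibbs`, `n_topics`, `alpha`, `beta`) are hypothetical. Only the Python standard library is used so the sketch stays self-contained.

```python
import math
import random
from collections import Counter, defaultdict

# Invented toy tweets for illustration only; NOT data from the study.
TRAIN = [
    ("zika causes fever rash and joint pain", "Symptom"),
    ("red eyes and fever are common zika signs", "Symptom"),
    ("use mosquito repellent and wear long sleeves", "Prevention"),
    ("remove standing water to avoid mosquito bites", "Prevention"),
    ("zika spreads through mosquito bites", "Transmission"),
    ("zika can be sexually transmitted", "Transmission"),
    ("no vaccine exists yet rest and drink fluids", "Treatment"),
    ("doctors advise rest fluids and pain medicine", "Treatment"),
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()

def train_nb(examples):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    class_docs, word_counts, vocab = Counter(), defaultdict(Counter), set()
    for text, label in examples:
        class_docs[label] += 1
        for w in tokenize(text):
            word_counts[label][w] += 1
            vocab.add(w)
    return class_docs, word_counts, vocab

def predict_nb(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def lda_gibbs(docs, n_topics=2, n_iter=100, alpha=0.1, beta=0.01, top_n=3, seed=0):
    """Collapsed Gibbs sampling for LDA; returns the top words of each topic."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics

    def bump(d, z, w, delta):
        doc_topic[d][z] += delta
        topic_word[z][w] += delta
        topic_total[z] += delta

    # Start from a random topic assignment for every token.
    assign = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            bump(d, assign[d][i], w, +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                bump(d, assign[d][i], w, -1)  # remove the token's current topic
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                assign[d][i] = rng.choices(range(n_topics), weights=weights)[0]
                bump(d, assign[d][i], w, +1)  # add it back under the new topic

    return [
        [w for w, c in topic_word[k].most_common() if c > 0][:top_n]
        for k in range(n_topics)
    ]

# Stage 1: classify unseen (again invented) tweets into the four characteristics.
model = train_nb(TRAIN)
new_tweets = [
    "fever and rash after travel",
    "joint pain and red eyes",
    "wear repellent outdoors",
    "avoid standing water near home",
]
by_label = defaultdict(list)
for tweet in new_tweets:
    by_label[predict_nb(model, tweet)].append(tokenize(tweet))

# Stage 2: run LDA separately within each predicted characteristic.
for label, docs in by_label.items():
    print(label, lda_gibbs(docs))
```

With a realistic corpus, the tokenizer would typically also strip URLs, mentions, and stop words, and the number of topics per characteristic would need to be tuned; those choices are left open here because the abstract does not specify them.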