Election Prediction (Sentiment Analysis)

Abstract:

In this analysis, I examine whether the sentiment of Reddit users towards a candidate is enough to predict the outcome of the 2017 provincial election in British Columbia. I perform a sentiment analysis of Reddit posts in the r/Vancouver and r/BritishColumbia subreddits to gauge how users feel about Christy Clark of the B.C. Liberal Party and John Horgan of the B.C. New Democratic Party.

Skills:

This Python-based project demonstrates my experience with web scraping, data cleaning, and sentiment analysis using the VADER package.

Discussion of VADER:

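VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon- and rule-based sentiment analysis tool tuned for social media text. Rather than being trained as a statistical model, it relies on a curated sentiment lexicon combined with heuristic rules. Among its outputs is a compound score, which its developers describe as a “normalized, weighted composite score.” The compound score ranges from -1 (most negative) to +1 (most positive) and can be thresholded to label a text as negative, neutral, or positive. VADER is simple to use, and its developers report that it performs as well as or better than several machine-learning-based approaches on social media text.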
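As a rough illustration of how a compound score can be turned into a label, the sketch below uses the SentimentIntensityAnalyzer class from the vaderSentiment package. The cutoffs of +0.05 and -0.05 are the ones suggested in the VADER documentation; whether this project used the same cutoffs is an assumption, and the helper names label_from_compound and score_post are hypothetical.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_post(text):
    """Return (compound score, label) for one post using VADER."""
    # Imported here so the thresholding helper above works without the package.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    # A new analyzer per call keeps the sketch short; when scoring many
    # posts, create one instance and reuse it.
    analyzer = SentimentIntensityAnalyzer()
    compound = analyzer.polarity_scores(text)["compound"]
    return compound, label_from_compound(compound)
```

Calling score_post on the text of a post would return its compound score together with one of the three labels, which is the form of input the net sentiment figures in the Results section need.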

Data:

The data consists of the 1000 most recent posts from each of the two most politically active subreddits in the province: r/Vancouver and r/BritishColumbia. Posts beyond the most recent 1000 were not related to the election. Three limitations are the small sample size (on the order of only 100 posts), the sample composition, and the relationship of sentiment to voting intention.
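One plausible way to narrow a batch of scraped posts down to those about the candidates is a simple keyword match. The sketch below is only an assumption about how such a filter might look: the search patterns in TERMS and the function names mentions and filter_posts are hypothetical, and the write-up does not state which terms were actually used.

```python
import re

# Hypothetical search patterns; the original keyword list is not given.
TERMS = {
    "Clark": r"\bclark\b",
    "Horgan": r"\bhorgan\b",
}


def mentions(post_text):
    """Return the set of candidates named in a post (case-insensitive)."""
    return {
        name
        for name, pattern in TERMS.items()
        if re.search(pattern, post_text, flags=re.IGNORECASE)
    }


def filter_posts(posts):
    """Keep only posts that mention at least one candidate."""
    tagged = ((post, mentions(post)) for post in posts)
    return [(post, names) for post, names in tagged if names]
```

A filter of this kind discards most posts in a general-interest subreddit, which is consistent with a sample far smaller than the number of posts collected.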

Results:

According to the two subreddits, the net positive sentiment (positive sentiment – negative sentiment) towards the NDP is 13%, compared to 2.7% for the Liberals.
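The net positive sentiment defined above can be computed from per-post labels. The sketch below assumes the percentages are shares of posts labeled positive or negative out of all posts about a given party or candidate; the function name net_positive_sentiment is hypothetical, and the example labels are made up for illustration rather than taken from the data.

```python
from collections import Counter


def net_positive_sentiment(labels):
    """Return (% positive - % negative) over a list of sentiment labels."""
    if not labels:
        raise ValueError("need at least one labeled post")
    counts = Counter(labels)
    # Neutral posts count towards the total but not towards either side.
    return 100.0 * (counts["positive"] - counts["negative"]) / len(labels)


# Illustrative labels only: 2 positive, 1 negative, 1 neutral.
example = ["positive", "positive", "negative", "neutral"]
print(net_positive_sentiment(example))  # 25.0
```

With very few posts the measure becomes extreme: a single positive post gives net_positive_sentiment(["positive"]) == 100.0, which is why sample size matters when reading these figures.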

Considering the sentiment towards the individual candidates, John Horgan comes out ahead again, with a net positive sentiment of 17%, compared to 6% for Christy Clark.

This may suggest that the NDP would win, but the caveats listed above must be kept in mind. Furthermore, the same analysis applied to the Green Party gives a net positive sentiment of 100%, a result that underscores those caveats.

Conclusion:

Given the problems of sample size, sample composition, and the relationship of sentiment to voting intent, this analysis cannot be used as a reliable indicator of the outcome of the election. However, it shows the potential of sentiment analysis as a tool for election prediction. Interestingly enough, the eventual outcome was in line with the sentiment on Reddit: although the B.C. Liberals won the most seats, Horgan became premier with the support of the Green Party, and British Columbia found itself with a new provincial government.