Text Analysis

Election prediction: Predicting the outcome of the 2017 BC provincial election from Reddit user sentiment. Correctly predicted John Horgan’s win! Link to Medium article.

Twitter sentiment analysis: Exploration of Justin Trudeau’s Twitter feed. Found that Trudeau tends to be more positive in English than in French. Link to Jupyter notebook.

Resume text analysis: Upload your resume to a web application to find your best job matches, based on cosine similarity. Link to GitHub repository.
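As a rough illustration of the idea, here is a minimal sketch of ranking job postings against a resume by cosine similarity. It assumes plain word counts as the text representation; the function names (tokenize, cosine_similarity, rank_jobs) and the sample postings are hypothetical and are not taken from the repository, which may well use a different weighting such as TF-IDF.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Only words present in both texts contribute to the dot product.
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_jobs(resume_text, postings):
    """Return (title, score) pairs sorted from best to worst match.

    `postings` is a hypothetical dict mapping a job title to its description.
    """
    scored = [(title, cosine_similarity(resume_text, body))
              for title, body in postings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample_postings = {
        "Data analyst": "SQL and Python for data analysis and reporting",
        "Violinist": "Orchestra violin performance and rehearsal",
    }
    for title, score in rank_jobs("Python SQL data analysis", sample_postings):
        print(f"{title}: {score:.3f}")
```

A score of 1 means the two texts use the same words in the same proportions, and 0 means they share no words at all, so sorting by score puts the closest postings first.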

Stock Market Analysis

Rating stock market guru predictions: A system to verify the accuracy of public stock forecasts made by “gurus” and other individuals who post online. This is a description of the project idea and of how I would implement it using Python and SQL. Link to GitHub repository.
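To make the Python and SQL idea concrete, here is one possible sketch, not the design from the repository. It assumes each forecast is stored as a predicted direction ("up" or "down") together with the price when the forecast was made and the price when it came due; the table name, column names, and helper functions are all hypothetical.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a hypothetical forecasts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE forecasts (
            guru TEXT NOT NULL,
            ticker TEXT NOT NULL,
            predicted_direction TEXT NOT NULL,  -- 'up' or 'down'
            start_price REAL NOT NULL,          -- price when the forecast was made
            end_price REAL NOT NULL             -- price when the forecast came due
        )
        """
    )
    return conn


def add_forecast(conn, guru, ticker, direction, start_price, end_price):
    """Record one resolved forecast."""
    conn.execute(
        "INSERT INTO forecasts VALUES (?, ?, ?, ?, ?)",
        (guru, ticker, direction, start_price, end_price),
    )


def accuracy_by_guru(conn):
    """Return {guru: fraction of forecasts whose direction was correct}."""
    rows = conn.execute(
        """
        SELECT guru,
               AVG(CASE
                       WHEN predicted_direction = 'up'
                            AND end_price > start_price THEN 1.0
                       WHEN predicted_direction = 'down'
                            AND end_price < start_price THEN 1.0
                       ELSE 0.0
                   END) AS hit_rate
        FROM forecasts
        GROUP BY guru
        ORDER BY hit_rate DESC, guru
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    # Made-up forecasts and prices, for illustration only.
    conn = build_db()
    add_forecast(conn, "guru_a", "AAA", "up", 10.0, 12.0)
    add_forecast(conn, "guru_a", "BBB", "down", 20.0, 25.0)
    add_forecast(conn, "guru_b", "AAA", "down", 10.0, 12.0)
    print(accuracy_by_guru(conn))
```

Keeping the scoring in a single SQL query means the rating rule lives in one place; a fuller system would also need to collect the forecasts and fetch the prices, which this sketch leaves out.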

Churn Prediction

Vancouver Symphony Orchestra: Using data from Tessitura to predict customer churn. Link to GitHub repository.

Data from Kaggle

Diagnosing schizophrenia: Group project for a Statistical Machine Learning class. We used a dataset from a Kaggle competition whose task was to automatically diagnose schizophrenia in patients. Link to GitHub repository.

Wine reviews: Exploring a data set from Kaggle on various wines, and predicting price with linear models (lasso, ridge, PCR). Link to GitHub repository.

Data Exploration and Visualization

Vancouver public art: Exploration and visualization of the public art in Vancouver since 1936. Link to Jupyter notebook.

Quick Work

English to Cantonese translator: Translating from English to Cantonese Jyutping (romanization) without any training data or neural networks. This is a quick hack that uses two translation websites to do the work for us. It isn’t really a project, but I like what I came up with, and it solved a problem I had. Link to GitHub repository.