Quantifying Home Field Advantage in the NFL Using Linear Models in R

If you pay attention to NFL football, you’re probably used to hearing that home field advantage is worth about 3 points. I’ve always been interested in this number and how it was derived. So, using some data from FiveThirtyEight, along with some linear modeling in R, I attempted to quantify home field advantage. My analysis shows that home field advantage (how much we expect the home team to win by, if the teams are evenly matched) is about 2 points. [Read More]
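One simple way to frame this kind of estimate is an intercept-only linear model on the home team's score margin, where the intercept is the expected home margin between evenly matched teams. This is a minimal sketch of that idea, not the post's actual analysis; the toy data and column names (`home_score`, `away_score`) are assumptions, not the FiveThirtyEight fields.

```r
# Hypothetical sketch: home field advantage as the intercept of an
# intercept-only linear model fit to home-minus-away score margins.
games <- data.frame(
  home_score = c(24, 17, 30, 13, 27, 20),  # made-up scores for illustration
  away_score = c(20, 21, 24, 10, 24, 23)
)
games$margin <- games$home_score - games$away_score

# With no predictors, the fitted intercept is just the mean margin,
# a naive estimate of home field advantage.
fit <- lm(margin ~ 1, data = games)
coef(fit)[["(Intercept)"]]
```

With real game data you would typically also adjust for team strength (for example, team fixed effects or a rating covariate) so the intercept isn't biased by scheduling.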

Iterating on a 2016 Election Analysis

Jake Low wrote a really interesting piece that presented a few data visualizations that went beyond the typical 2016 election maps we’ve all gotten used to seeing. I liked a lot of things about Jake’s post; here are three I was particularly fond of: his color palette choices, since each palette had solid perceptual properties and made sense for the data being visualized (i.e. diverging versus sequential); he made residuals from a model interesting by visualizing and interpreting them; and he explained the use of a log-scale transformation in an intuitive way, putting it in terms of the data set used in the analysis. [Read More]

Everything I Know About Machine Learning I Learned from Making Soup

Introduction In this post, I’m going to make the claim that we can simplify some parts of the machine learning process by using the analogy of making soup. I think this analogy can improve how a data scientist explains machine learning to a broad audience, and it provides a helpful framework throughout the model-building process. Drawing on the CRISP-DM framework, my own experience as an amateur chef, and the well-known iris data set, I’m going to explain why I think the connection between soup making and machine learning is a pretty decent first approximation for understanding the machine learning process. [Read More]