The 2021 NFL Draft is coming up this week, so I thought I would look at how last year's rookie class did in the 2020 NFL season and project some key statistics for the 2021 season. I looked at four players: Joe Burrow, Jalen Hurts, Justin Herbert, and Tua Tagovailoa.

I’ve written some previous articles on my dive into Bayesian statistics, which can be found here: https://medium.com/codex/bayesian-inference-of-nfl-tds-6d4b1251eeda or https://ecavan.medium.com/predicting-nfl-first-downs-453a683a827d.

Basically, Bayesian inference works by taking your prior beliefs and updating them using observed data. For example, I might think a coin is rigged, and so my prior might be (0.8…
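To make the "prior plus data" idea concrete, here is a minimal sketch of a Bayesian update for the coin example. Everything numeric in it is an assumption made for illustration (the two hypotheses, the 0.9 chance of heads for a rigged coin, the 50/50 starting belief, and the flip sequence); none of it comes from the original article.

```python
def update(prior, likelihoods):
    """Apply Bayes' rule once: posterior is proportional to prior * likelihood."""
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical chance of heads under each hypothesis about the coin.
P_HEADS = {"fair": 0.5, "rigged": 0.9}


def posterior_after(flips, prior):
    """Update the belief one flip at a time ('H' = heads, 'T' = tails)."""
    belief = dict(prior)
    for flip in flips:
        likelihoods = {
            h: (p if flip == "H" else 1.0 - p) for h, p in P_HEADS.items()
        }
        belief = update(belief, likelihoods)
    return belief


# Assumed starting belief: no idea whether the coin is fair or rigged.
prior = {"fair": 0.5, "rigged": 0.5}
belief = posterior_after("HHH", prior)
print(belief)
```

After three heads in a row the belief shifts toward "rigged"; a run of tails would push it back toward "fair".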

In this post I wanted to compare two different ways of projecting next year’s TD (touchdown) totals for new Jeopardy host Aaron Rodgers given some prior knowledge (i.e., his TD totals from previous years). To simplify, it’s like flipping a coin 3 times and wanting to know what comes up next. There are two points of view: the Bayesian perspective and the frequentist perspective. One is not necessarily better than the other, but they will give us different answers. Check out this post for some work I did with Bayesian statistics with R…
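As a rough illustration of how the two perspectives can disagree on the simplified coin example, the sketch below compares a frequentist maximum-likelihood estimate with a Bayesian posterior mean. The outcome (three heads in three flips) and the uniform Beta(1, 1) prior are assumptions chosen for the example, not values from the post.

```python
def frequentist_estimate(heads, flips):
    """Maximum-likelihood estimate: the observed fraction of heads."""
    return heads / flips


def bayesian_estimate(heads, flips, prior_heads=1.0, prior_tails=1.0):
    """Posterior mean of P(heads) under a Beta(prior_heads, prior_tails) prior."""
    return (heads + prior_heads) / (flips + prior_heads + prior_tails)


heads, flips = 3, 3  # assumed outcome: three heads in three flips
print(frequentist_estimate(heads, flips))  # 1.0
print(bayesian_estimate(heads, flips))     # 0.8
```

With so little data the frequentist estimate says the next flip is certain to be heads, while the prior pulls the Bayesian estimate back toward 0.5. The gap shrinks as more flips are observed.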

One of my favourite parts of any sports draft is the player comps (short for comparisons). This is where a scout relates a player coming into a particular sports league to a current or historical player. For example, “that Connor McDavid really has some shades of Sid the Kid in him”. Usually these comparisons are made without data; they rely on the ‘eye’ of the scout.

In the context of data science and statistics, if we wanted to produce player comps using data, we would use clustering techniques. …
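A minimal sketch of what data-driven comps could look like, using a plain k-means written from scratch. The player names, the two features, and their values are all made up for illustration, and k-means is just one possible clustering technique, not necessarily the one used in the article.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on a list of numeric tuples; returns (centroids, labels)."""
    centroids = [points[i] for i in range(k)]  # deterministic start: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels


# Hypothetical players with two made-up features already scaled to 0..1.
players = {
    "Player A": (0.90, 0.20),
    "Player B": (0.85, 0.25),
    "Player C": (0.20, 0.80),
    "Player D": (0.15, 0.90),
    "Prospect": (0.88, 0.22),
}

names = list(players)
centroids, labels = kmeans(list(players.values()), k=2)
prospect_cluster = labels[names.index("Prospect")]
comps = [
    n for n, lab in zip(names, labels)
    if lab == prospect_cluster and n != "Prospect"
]
print(comps)
```

The comps for the prospect are simply the other players that land in the same cluster. With real statistics the features would need to be put on a common scale first, since k-means relies on distances.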

Machine learning generally has 3 uses:

- Regression (fitting a curve to some data)
- Classification
- Clustering

If you train a regression model on some features, you can make predictions by feeding the model those same features from new data. Classification is self-explanatory: the model predicts the class of new data (an image, for example) based on the classes it was trained on. Clustering is a little different: the data carry no labels, so the model determines the classes itself by grouping data points that are similar to one another.
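To illustrate the "train, then predict on new data" pattern for regression, here is a small sketch that fits a straight line by ordinary least squares and then predicts for an unseen input. The data points are invented for the example and lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new feature value."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data that follows y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

model = fit_line(xs, ys)
print(predict(model, 5))  # 11.0
```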

Logistic regression is a special form of regression used for…

A projection system is a statistical model that uses historical data to predict future performance. One famous example is ZiPS, an advanced projection system for baseball players. Similarly, PFF (Pro Football Focus) is a go-to site for NFL projections.

The most basic projection system needs 3 things:

- Information about a player's previous performance (data from previous seasons)
- A factor or aging curve which takes into account the fact that younger players are likely to improve their performance while older players are likely to see a decline in their performance.
- Information about the league…
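The ingredients in the list above can be combined in a very simple sketch like the one below. It is only an illustration under stated assumptions: the season weights, the pull toward a league average, the peak age, and the aging rate are all hypothetical, and using a league average is just one guess at how league information might enter such a system.

```python
def project(previous_seasons, league_average, age,
            weights=(5, 4, 3), regression_weight=2,
            peak_age=27, aging_rate=0.01):
    """Project next season's total for one statistic.

    previous_seasons: totals ordered most recent first.
    All default parameter values are hypothetical.
    """
    used = list(zip(weights, previous_seasons))
    # 1. Weighted average of past seasons, recent seasons counting more.
    weighted_sum = sum(w * s for w, s in used)
    total_weight = sum(w for w, _ in used)
    # 2. Pull the estimate toward the league average (assumed league input).
    baseline = (weighted_sum + regression_weight * league_average) / (
        total_weight + regression_weight
    )
    # 3. Aging factor: boost players below the peak age, discount those above.
    age_factor = 1.0 + aging_rate * (peak_age - age)
    return baseline * age_factor


# Made-up touchdown totals for a hypothetical player, most recent season first.
young = project([30, 25, 20], league_average=22, age=24)
old = project([30, 25, 20], league_average=22, age=33)
print(round(young, 1), round(old, 1))
```

With the same history, the younger player gets the higher projection, which is the behaviour the aging curve is meant to capture.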
