BRELO + Players

March 18, 2019, 9:28 p.m.



Since posting my model methodology earlier this year, I've been looking into incorporating player information into my model, which seems to be the trend this year among other modelers. Not wanting to fall too far behind the pack, I've decided to join in and use player information in my match predictions.

There are a few challenges with this, the most pressing being:

  1. How do I measure player value?
  2. Do I combine this with my existing team based model or come up with a new model?
  3. How do I keep my predictions up to date (so that they're reflected on Squiggle and Monash) when late changes are possible mere minutes before the opening bounce?

1. Measuring player value

I've decided to use the AFL player ratings as my proxy for player value. This takes into account every stat under the sun over a player's last 2 seasons (and a maximum of 40 games).

AFL player ratings are a good fit because they are designed so that, for any given match, the difference between the rating points accumulated by each team is very highly correlated with the margin (\( r \approx 0.96\)).

Each player's rating is based on the rating points they have accumulated over the last 2 seasons (up to a maximum of 40 games), with the 31st - 40th matches in that window weighted progressively less than the 1st - 30th matches.

I chose this measure because much of the work in valuing player performance has already been done, and it translates directly to scoreboard impact.

However some weaknesses include:
  • Players start at 0 points. I probably can't do much about this but I would think debutants contribute slightly more than nothing.
  • The last 2 seasons (and max 40 games) seems kind of arbitrary. Maybe some other number of games would work better but this will be good enough for now.

2. How to incorporate player information

For this challenge I decided to keep it simple and use the player information in a regression to predict the home team margin for a future match. Since rating points accumulated in a game correlate with the margin in that same game, they probably have some predictive power for future games too.

So the regression I've gone with can be expressed as:
\[\text{Margin Predict} = \mu = a \times \text{Margin Predict (BRELO)} + (1-a)\times b \times (\text{Rating Points Home} - \text{Rating Points Away})\]
where \(b\times(\text{Rating Points Home} - \text{Rating Points Away})\) can be interpreted as the margin prediction using only player information, while \(\text{Margin Predict (BRELO)}\) uses only team information. We then take a weighted average of these, with weights \(1-a\) and \(a\) respectively, to come up with our final margin prediction.
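A minimal sketch of this blend in Python might look like the following. The function name and the example inputs are made up for illustration; only the formula itself comes from the equation above.

```python
def blended_margin(brelo_margin: float,
                   rating_points_home: float,
                   rating_points_away: float,
                   a: float,
                   b: float) -> float:
    """Weighted average of the team-based and player-based margin predictions."""
    # Margin prediction using only player information.
    player_margin = b * (rating_points_home - rating_points_away)
    # Weight a goes on BRELO, weight (1 - a) on the player-based prediction.
    return a * brelo_margin + (1.0 - a) * player_margin


# Hypothetical inputs, chosen only to keep the arithmetic easy to follow:
# 0.5 * 10 + 0.5 * 0.05 * (300 - 200) = 5 + 2.5
mu = blended_margin(brelo_margin=10.0,
                    rating_points_home=300.0,
                    rating_points_away=200.0,
                    a=0.5,
                    b=0.05)
print(mu)
```

Setting a = 1 recovers the plain BRELO prediction, and a = 0 uses player information only.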

I can then plug this into \(\Phi\left(\frac{\mu}{\sigma}\right)\), where \(\Phi\) is the standard normal CDF and \(\sigma\) is the same value used in my BRELO model, to convert the predicted margin into a home team win probability.
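As a rough sketch of that conversion, assuming \(\Phi\) denotes the standard normal CDF: the function names are hypothetical, and the value of sigma below is a placeholder rather than the value used in BRELO.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def home_win_probability(mu: float, sigma: float) -> float:
    """Convert a predicted home team margin into a win probability."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return normal_cdf(mu / sigma)


# Placeholder sigma for illustration only; the real value comes from BRELO.
PLACEHOLDER_SIGMA = 35.0
print(home_win_probability(0.0, PLACEHOLDER_SIGMA))   # a predicted draw is a coin flip
print(home_win_probability(10.0, PLACEHOLDER_SIGMA))  # a home favourite sits above 0.5
```

A larger sigma flattens the curve, so the same predicted margin maps to a probability closer to 50%.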

To train this model I sourced AFL player ratings for every match I could find, along with my BRELO margin predictions for the same matches, and split the matches randomly into an 80% train sample and a 20% test sample.
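The random split could be done along these lines. This is a sketch using only the standard library; the helper name and the fixed seed are assumptions, and the integers stand in for real match records.

```python
import random


def split_matches(matches, train_fraction=0.8, seed=0):
    """Shuffle the matches and split them into train and test samples."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(matches)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Stand-in data: 100 match IDs instead of real match records.
train, test = split_matches(range(100))
print(len(train), len(test))
```

Splitting at random (rather than by season) means both samples cover the same mix of years.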

I fitted the parameters by minimising MAE (mean absolute error) on the train sample. This improved MAE from 28.3 using only my BRELO model to 27.4 using player information as well, a 0.9 point improvement!

The optimal parameters came out to be \(a=0.631\) and \(b=0.0324\).
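For anyone wanting to try something similar, the fit can be sketched as below. This is an illustration rather than the actual fitting code: it uses a brute-force grid search in place of a proper optimiser, and it runs on synthetic matches generated from made-up parameters (TRUE_A, TRUE_B) instead of real AFL data.

```python
import random


def predict(a, b, brelo_margin, rating_diff):
    """Blend of the team-based and player-based margin predictions."""
    return a * brelo_margin + (1 - a) * b * rating_diff


def mae(a, b, matches):
    """Mean absolute error over (brelo_margin, rating_diff, actual_margin) rows."""
    errors = [abs(predict(a, b, m, d) - y) for m, d, y in matches]
    return sum(errors) / len(errors)


def fit_by_grid_search(matches, a_steps=21, b_max=0.1, b_steps=21):
    """Return (best_mae, a, b) over a grid of a in [0, 1] and b in [0, b_max]."""
    best = None
    for i in range(a_steps):
        a = i / (a_steps - 1)
        for j in range(b_steps):
            b = b_max * j / (b_steps - 1)
            score = mae(a, b, matches)
            if best is None or score < best[0]:
                best = (score, a, b)
    return best


# Synthetic matches: every number here is invented for the example.
TRUE_A, TRUE_B = 0.6, 0.05
rng = random.Random(1)
matches = []
for _ in range(500):
    brelo_margin = rng.gauss(0, 20)
    rating_diff = rng.gauss(0, 300)
    actual = predict(TRUE_A, TRUE_B, brelo_margin, rating_diff) + rng.gauss(0, 5)
    matches.append((brelo_margin, rating_diff, actual))

best_mae, a_hat, b_hat = fit_by_grid_search(matches)
brelo_only_mae = mae(1.0, 0.0, matches)  # a = 1 ignores player information
print(f"a={a_hat:.2f} b={b_hat:.3f} MAE={best_mae:.2f} (BRELO only: {brelo_only_mae:.2f})")
```

Because a = 1 is one of the grid points, the fitted MAE on the training data can never be worse than BRELO alone, which is exactly why the improvement needs checking on unseen data.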

To make sure we haven't overfitted to the training dataset we then use these parameters to estimate predictive power on an unseen portion of data.

Running this on my 20% test sample improved MAE from 29.8 to 29.2, a 0.6 point improvement which is smaller than the train sample improvement (as expected) but still a great result.

To put this into context, below is my BRELO model's MAE alongside the MAE of betting markets, using the closing line from data provided here, for 2016-2018 (BRELO was trained using data up to 2015, so I am ignoring seasons prior to 2016 here).

Year   BRELO MAE   Bookie MAE
2016   29.0        28.7
2017   29.4        29.0
2018   26.4        26.0

My model performed well, but slightly worse than the bookies on average. If I can get anything like a 0.6 point improvement in MAE from BRELO this year by using player information, the season is looking very promising.
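To make "slightly worse on average" concrete, the gap can be computed directly from the table above. The variable names are just for this example.

```python
# (BRELO MAE, bookie MAE) per season, copied from the table.
mae_by_year = {
    2016: (29.0, 28.7),
    2017: (29.4, 29.0),
    2018: (26.4, 26.0),
}

# Positive gap = BRELO was worse than the closing line that season.
gaps = {year: brelo - bookie for year, (brelo, bookie) in mae_by_year.items()}
mean_gap = sum(gaps.values()) / len(gaps)

for year, gap in gaps.items():
    print(year, round(gap, 1))
print("mean gap:", round(mean_gap, 2))
```

The gap averages out to a bit under 0.4 points per game, so a 0.6 point improvement would be enough to close it if it carried over.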

Confident that this model can legitimately improve predictions going forward, I then re-trained on all the data to get the final parameters of \(a=0.65\) and \(b=0.03215\).

In season, this model (named BRELOP) will be an extension of my BRELO model and will not feed back into it. So BRELO will keep running and updating team strengths the same way (i.e. no special treatment in rating updates when a team fields a weaker than expected lineup). This may be something I look at later but with the season beginning this week I'm kind of out of time...

The prediction with player information included will sit on this page and will be displayed with an * next to the teams if player information has been considered. I won't be displaying both BRELO and BRELOP (BRELO + P for players), just BRELOP, or if player information isn't available, BRELO.

3. Keeping predictions up to date

For this, I have been frantically working to deploy a bot that will regularly check the AFL website for teams and update my predictions if there is a new team lineup released or if there is a change to existing teams. Hopefully this will allow me to incorporate most of the late player changes in my predictions without having to sit in front of my computer all weekend.
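The core of such a bot is working out whether anything has actually changed since the last check. One way to sketch that, without saying anything about how the lineups are scraped: `fetch_lineups` and `update_predictions` are hypothetical callables that would wrap the real scraping and prediction code.

```python
import hashlib
import json
from typing import Callable, Dict, List, Optional

Lineups = Dict[str, List[str]]  # team name -> list of selected players


def lineup_fingerprint(lineups: Lineups) -> str:
    """Hash of the lineups that ignores the order players are listed in."""
    canonical = {team: sorted(players) for team, players in lineups.items()}
    payload = json.dumps(canonical, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def check_for_changes(fetch_lineups: Callable[[], Lineups],
                      update_predictions: Callable[[Lineups], None],
                      last_fingerprint: Optional[str]) -> str:
    """Fetch the current lineups and re-run predictions only if they changed."""
    lineups = fetch_lineups()
    fingerprint = lineup_fingerprint(lineups)
    if fingerprint != last_fingerprint:
        update_predictions(lineups)
    return fingerprint
```

A scheduler would call `check_for_changes` every few minutes and carry the returned fingerprint into the next call, so predictions are only recomputed when a lineup is released or altered.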

This is new to me so I expect something to go wrong, but hopefully my predictions don't go crazy due to some coding error. We'll see...

That's it for now, bring on 2019!
