NWSL Elo Methodology

The Elo rating system is a method for calculating the relative skill levels of players in competitor-versus-competitor games such as chess. It is named after its creator Arpad Elo, a Hungarian-born American physics professor. [1]

Inspired by EloRatings.net, the FIFA Women’s World Rankings, and Nate Silver’s work, I created an Elo system for the National Women’s Soccer League (NWSL). Here’s what I did:

[Figure: Elo after 1st weekend of games]

At the start of the first NWSL season (2013), each team is assigned an Elo rating of 1500. For every game, we compare the expected result to the actual result and adjust both teams’ ratings (one up and one down by the same amount). The average rating is always 1500. The math is the same as that outlined here, with a few choices we had to make. We’re using a K value of 20 for regular-season games, 30 for playoff semifinals, and 40 for finals. We’re also using a home-field advantage of 64 points, which is added to the home team’s rating before calculating the expected result and the points change after the match.

Because there was no prior history to draw on for that first weekend’s games, every home team had an expected win percentage of 59.1% and every visiting team 40.9%. Three of that first week’s games ended in a draw, with the home team in each case losing about 2 rating points while the away team gained about 2 points. Sky Blue beat the Flash 1-0, so Sky Blue gained about 8 rating points and the Flash lost about 8.
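To make the opening-weekend numbers concrete, here is a minimal sketch in Python. It assumes the standard Elo expected-score formula with no goal-difference multiplier, which reproduces the figures quoted above. The function and constant names are mine, chosen for illustration.

```python
# Constants taken from the text above; the names are illustrative.
HOME_ADVANTAGE = 64
K_REGULAR = 20


def expected_score(rating, opponent_rating):
    """Standard Elo expected score (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update(home, away, home_result, k=K_REGULAR, home_advantage=HOME_ADVANTAGE):
    """Return the new (home, away) ratings after one match.

    home_result is 1 for a home win, 0.5 for a draw, 0 for a home loss.
    The home advantage only affects the expected score, not the stored rating.
    """
    exp_home = expected_score(home + home_advantage, away)
    change = k * (home_result - exp_home)
    return home + change, away - change


# Opening weekend: every team starts at 1500.
print(round(expected_score(1500 + HOME_ADVANTAGE, 1500), 3))  # 0.591
print(update(1500, 1500, 0.5))  # draw: home drops about 2, away gains about 2
print(update(1500, 1500, 1))    # home win: home gains about 8, away loses about 8
```

Because the same change is added to one team and subtracted from the other, the league-wide average stays at 1500 after every match.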

When the favored team wins, the two teams’ ratings will move less than when the underdog wins, and the math just rolls from there for each week of the season.
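A quick sketch of that asymmetry, again assuming the standard Elo formula. The 1600 and 1400 ratings are hypothetical, and I’ve left out home-field advantage to keep the comparison clean.

```python
def rating_change(winner, loser, k=20):
    """Points the winner gains (and the loser gives up) under standard Elo."""
    expected_win = 1.0 / (1.0 + 10 ** ((loser - winner) / 400.0))
    return k * (1.0 - expected_win)


# Hypothetical ratings, neutral field, regular-season K of 20.
favorite_wins = rating_change(1600, 1400)  # the expected result
underdog_wins = rating_change(1400, 1600)  # the upset

print(round(favorite_wins, 1))  # 4.8
print(round(underdog_wins, 1))  # 15.2
```

The upset moves the ratings roughly three times as far as the expected result does.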

[Figure: '13-'14 regression toward mean]

How does one season relate to the next?

In chess or international football, there is no concept of a “season.” We could simply ignore seasons altogether, but there’s so much change in rosters from year to year that that doesn’t seem right. What we do instead is leave all teams in the same order, but squish the ratings together using “regression toward the mean.” In our case we regress by 50%, so each team ends up halfway between its end-of-season rating and 1500. A team at 1400 moves to 1450 and a team at 1550 moves to 1525, etc. We could choose a greater or lesser regression, and we may need to revisit this in the future. For now it seems about right.
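The between-season step is simple enough to sketch in a few lines of Python. The function name and the team labels are hypothetical; the 1500 mean and 50% fraction come from the description above.

```python
def regress_to_mean(ratings, mean=1500.0, fraction=0.5):
    """Pull every rating toward the mean by the given fraction.

    fraction=0.5 keeps half of each team's distance from the mean,
    so the order of the teams is unchanged.
    """
    return {
        team: mean + (1.0 - fraction) * (rating - mean)
        for team, rating in ratings.items()
    }


# Hypothetical end-of-season ratings (they average 1500).
end_of_season = {"Team A": 1400.0, "Team B": 1550.0, "Team C": 1550.0}
next_season = regress_to_mean(end_of_season)
print(next_season)  # {'Team A': 1450.0, 'Team B': 1525.0, 'Team C': 1525.0}
```

A larger fraction squeezes the teams closer together at the start of a season, and a fraction of 0 would carry ratings over unchanged.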

If you have questions or suggestions about any of this, I’d love to hear from you. @TedSarvata
