This weekend is the NCAA Division I Men's Lacrosse Quarterfinals. I wrote a Python script to perform Monte Carlo analysis on game outcomes for two teams based on the ratings and average goal differential (AGD) provided by LaxNumbers. I’m not sold on the approach taken in the script, but it takes into account the two teams' relative ratings and their ability to score more goals than their opponents (via the AGD).

Predicted winners are Cornell, Syracuse, Notre Dame, and Maryland.

Click through for details.

Monte Carlo simulation is a statistical technique to approximate the outcome of an event by generating random samples from a probability distribution. Here, I am using Monte Carlo simulation to predict the number of wins for each team in a series of matches (games).

Here are the quarterfinal teams sorted by rating, high to low. AGD is average goal differential, GF is goals for, and GA is goals against:

Team         Record  Rating  AGD   GF   GA
Cornell      15-1    99.99   6.12  275  176
Notre Dame   9-4     99.52   5.00  179  114
Maryland     13-3    98.43   3.12  177  127
Syracuse     12-5    98.42   3.88  250  183
Penn State   11-5    97.96   3.00  194  146
Princeton    13-3    97.90   3.06  234  186
Richmond     14-3    97.27   5.88  246  147
Georgetown   11-5    95.20   3.12  196  146

Sunday

  • Notre Dame, 9-4, rating 99.52, AGD 5.00, GF 179, GA 114

  • Penn State, 11-5, rating 97.96, AGD 3.0, GF 194, GA 146

  • Maryland, 13-3, rating 98.43, AGD 3.12, GF 177, GA 127

  • Georgetown, 11-5, rating 95.20, AGD 3.12, GF 196, GA 146

I have several versions of Python installed on my system, so the examples below invoke python3.10 explicitly.

For ND vs Penn State:

python3.10 mc_teams.py 5000 99.52 5.0 97.96 3.0

ND wins 2,882 games or 57.64% of the time.

For MD vs Gtown:

python3.10 mc_teams.py 5000 98.43 3.12 95.20 3.12

Maryland wins 2,817 games or 56.34% of the time.
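Since each percentage comes from a finite number of random trials, it carries some sampling noise. The sketch below is one way to gauge that noise, treating the win count as a binomial proportion. The function name win_rate_standard_error is my own for illustration and is not part of mc_teams.py, and the 1.96 multiplier assumes a normal approximation for a rough 95% interval.

```python
import math


def win_rate_standard_error(win_fraction: float, num_trials: int) -> float:
    """Standard error of a simulated win fraction (binomial proportion)."""
    return math.sqrt(win_fraction * (1.0 - win_fraction) / num_trials)


# Win counts quoted above, each from 5,000 trials.
NUM_TRIALS = 5000
for team, wins in [("Notre Dame", 2882), ("Maryland", 2817)]:
    fraction = wins / NUM_TRIALS
    error = win_rate_standard_error(fraction, NUM_TRIALS)
    # 1.96 standard errors gives an approximate 95% interval.
    print(f"{team}: {fraction:.2%} +/- {1.96 * error:.2%}")
```

With 5,000 trials the standard error is a bit under one percentage point, so differences of a point or so between runs should not be over-interpreted.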

Saturday

Cornell and Syracuse each won yesterday by a single goal, but let’s look at the numbers.

  • Cornell, 15-1, rating 99.99, AGD 6.12, GF 275, GA 176

  • Richmond, 14-3, rating 97.27, AGD 5.88, GF 246, GA 147

  • Syracuse, 12-5, rating 98.42, AGD 3.88, GF 250, GA 183

  • Princeton, 13-3, rating 97.90, AGD 3.06, GF 234, GA 186

For Cornell vs Richmond:

python3.10 mc_teams.py 5000 99.99 6.12 97.27 5.88

Cornell wins 2,926 games or 58.52% of the time.

Let’s try entering the goals for (GF) per game in place of each team's rating. Cornell averages 275 / 16 = 17.188 goals per game and Richmond averages 246 / 17 = 14.471 goals per game:

python3.10 mc_teams.py 5000 17.188 3.88 14.471 3.06

In this case, Cornell wins 3,494 games or 69.88% of the time. This method favors Cornell much more heavily, and I do not think it is as good as the rating-based approach.
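The per-game figures above are just GF divided by games played, where games played is wins plus losses from the record. A small sketch of that arithmetic follows. The helper goals_per_game is a hypothetical name of mine, and it assumes the record is always written as "wins-losses" with no ties.

```python
def goals_per_game(record: str, goals_for: int) -> float:
    """Average goals scored per game, given a 'wins-losses' record string."""
    wins, losses = (int(part) for part in record.split("-"))
    return goals_for / (wins + losses)


# Records and GF totals taken from the table in this post.
cornell = goals_per_game("15-1", 275)
richmond = goals_per_game("14-3", 246)
print(f"Cornell:  {cornell:.3f} goals per game")
print(f"Richmond: {richmond:.3f} goals per game")
```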

For Syracuse vs Princeton:

python3.10 mc_teams.py 5000 98.42 3.88 97.90 3.06

Syracuse wins 2,640 games or 52.80% of the time.

The Python script, mc_teams.py, is shown below. I’m not sold on adding the rating and AGD to get a team’s “expected goals”. The sum is really more of a total expected rating than a goal count. The approach of drawing scores from the Poisson distribution is sound, though.

Code Explanation

The code consists of two functions: monte_carlo_simulation and simulate_match, which is nested inside it.

  • The simulate_match function takes the ratings and AGD of the two teams as input and returns the number of goals scored by each team. It uses the following formulas to calculate the expected number of goals:

    • team1_expected_goals = team1_rating + team1_agd
    • team2_expected_goals = team2_rating + team2_agd
  • The monte_carlo_simulation function takes the number of trials and the ratings and AGD of the two teams as input and returns the number of wins for each team. It uses a loop to repeat the simulation for the specified number of trials and keeps track of the number of wins for each team.

  • The main block (under if __name__ == "__main__") reads the command line arguments, calls the monte_carlo_simulation function, and prints the results.
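Because the model is just two independent Poisson draws with a coin flip on ties, the win probability can also be computed directly rather than simulated. The sketch below is a cross-check under that same assumption. The names poisson_pmf and exact_win_probability, and the max_goals cutoff of 400, are my own choices for illustration and are not part of mc_teams.py.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """P(X = k) for a Poisson variable, computed in log space for stability."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def exact_win_probability(mean1: float, mean2: float, max_goals: int = 400) -> float:
    """Probability that team 1 wins, with ties split 50/50.

    Sums over scores 0..max_goals; the cutoff is assumed to be far above
    both means so the truncated tail is negligible.
    """
    pmf1 = [poisson_pmf(k, mean1) for k in range(max_goals + 1)]
    pmf2 = [poisson_pmf(k, mean2) for k in range(max_goals + 1)]
    win = 0.0
    tie = 0.0
    below = 0.0  # running P(team 2 scores fewer than k)
    for k in range(max_goals + 1):
        win += pmf1[k] * below
        tie += pmf1[k] * pmf2[k]
        below += pmf2[k]
    return win + 0.5 * tie


# Notre Dame vs Penn State, using rating + AGD as each team's Poisson mean.
probability = exact_win_probability(99.52 + 5.0, 97.96 + 3.0)
print(f"Notre Dame win probability: {probability:.2%}")
```

The printed value can be compared with the simulated percentage for the same inputs. The two should agree up to the sampling noise of the simulation.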

Example Use Case

To use this code, you would need to run it from the command line and provide the following arguments:

  • num_trials: The number of trials to run the simulation for.
  • team1_rating: The rating of the first team.
  • team1_agd: The average goal differential (AGD) of the first team.
  • team2_rating: The rating of the second team.
  • team2_agd: The average goal differential (AGD) of the second team.
#mc_teams.py
import numpy as np
import sys

def monte_carlo_simulation(num_trials, team1_rating, team1_agd, team2_rating, team2_agd):
    def simulate_match(team1_rating, team1_agd, team2_rating, team2_agd):
        team1_expected_goals = team1_rating + team1_agd
        team2_expected_goals = team2_rating + team2_agd
        
        team1_goals = np.random.poisson(team1_expected_goals)
        team2_goals = np.random.poisson(team2_expected_goals)
        
        return team1_goals, team2_goals

    team1_wins = 0
    team2_wins = 0
    
    for _ in range(num_trials):
        team1_goals, team2_goals = simulate_match(team1_rating, team1_agd, team2_rating, team2_agd)
        
        if team1_goals > team2_goals:
            team1_wins += 1
        elif team2_goals > team1_goals:
            team2_wins += 1
        else:
            if np.random.rand() < 0.5:
                team1_wins += 1
            else:
                team2_wins += 1
            
    return team1_wins, team2_wins

if __name__ == "__main__":
    # Read command line arguments
    if len(sys.argv) != 6:
        print("Usage: mc_teams.py <num_trials> <team1_rating> <team1_agd> <team2_rating> <team2_agd>")
        sys.exit(1)

    num_trials = int(sys.argv[1])
    team1_rating = float(sys.argv[2])
    team1_agd = float(sys.argv[3])
    team2_rating = float(sys.argv[4])
    team2_agd = float(sys.argv[5])

    team1_wins, team2_wins = monte_carlo_simulation(num_trials, team1_rating, team1_agd, team2_rating, team2_agd)

    # Report each team's win count and its share of all trials as a percentage
    print(f"After {num_trials} trials:")
    print(f"Team 1 Wins, %: {team1_wins}, {team1_wins / num_trials:.2%}")
    print(f"Team 2 Wins, %: {team2_wins}, {team2_wins / num_trials:.2%}")