stuck in Monte Carlo

I knew what the Monte Carlo feature looked like in my head, but I got bogged down reading the Wikipedia article. Ultimately the season simulator I have in mind is what I think something like 538 uses:

  • a best-guess probabilistic model that assigns probabilities to the discrete event outcomes (games)
  • an overall evaluator that can determine the larger season outcome from a series of games
  • a harness that applies good randomness to repeated runs of all the events
  • a roll-up of all those runs into a probability-weighted list of overall outcomes

It’s clear from the article that Monte Carlo can be used in a lot of other scenarios, but let’s just write some pseudo-code:

import copy

def monte_carlo_simulation(time_sorted_games):
    # get a predictor object
    predictor_object = PredictorAPI.get_instance()
    scoreboard = PossibilityScoreboard("Monte Carlo Simulation")
    for run_id in range(1, 10000):
        # make a clean copy of the games so every run starts from the real schedule
        local_games_copy = [copy.copy(cur_mcc_game) for cur_mcc_game in time_sorted_games]
        for cur_mcc_game in local_games_copy:
            if cur_mcc_game.away_points is None:
                # game hasn't been played yet -- simulate it
                predictor_object.simulate_game(cur_mcc_game)
        standings = build_standings(local_games_copy)
        ordered_standings = sorted(standings.values(), reverse=True, key=standings_sortfunc)
        break_ties(ordered_standings, local_games_copy, [])
        # tally the simulated champion for this run
        scoreboard.record_winner(ordered_standings[0].team_name)
    print(scoreboard)

The only part that’s truly “pseudo” is the predictor_object. That’s a pretty big missing piece though. So how can we predict game outcomes and write clean code to do it?

The single best predictor of games available to us is the closing Vegas line. This is true in the overwhelming majority of cases. (If some other system available to the public consistently out-predicted Vegas, it would be used to win so much money that it would quickly be arbitraged into the Vegas lines.) Vegas lines are only available for the next game, however, and we want to assign probabilities to every event on the schedule.

Enter the Elo rating system. To make another long Wikipedia article short, Elo numbers are a power ranking within a closed system that can be plugged into a simple formula to give the probability of the result of a future contest. One good reason to use Elo is that the cfbd API calculates it, so we have a ready Elo endpoint. We also have a good python3 code sample and lots of background from the cfbd creator thanks to this deep dive post. Always good to ride the coattails of a fresh project. (538 has also written publicly about how they use Elo.) Elo numbers are clean and the formula to calculate P(outcome) is one line long.
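
That one line, for reference, is the standard Elo expectation formula (the function and variable names here are mine, the 400-point scale is the usual default, and any home field adjustment would be applied to the ratings before this step):

def elo_win_probability(home_elo, away_elo):
    # probability the home team wins, per the standard Elo expectation formula
    return 1.0 / (1.0 + 10.0 ** ((away_elo - home_elo) / 400.0))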

Elo isn’t perfect, obviously. For one thing, Elo was created for chess and has no “native” understanding of home field advantage, so custom sports-Elo implementations have to bolt on HFA somehow. Same goes for margin of victory. From a software architecture point of view, one model comes to mind: an abstract interface/superclass that defines our ideal predictor, with custom implementations for different predictive strategies. That way we could implement a few “reference” predictors: home team always wins, the 50/50 coin flip, a naive record+rank predictor.

Some real Java-style pseudo-code:

abstract class Predictor {

    // fill in a score prediction on the given game object
    //
    public abstract void PredictGame(Game game);

    public float random_decimal() {
        // implement a wrapper around a good system random source
        // that all subclasses can depend on
        //
    }
}

class ELO_Predictor extends Predictor {

    public void PredictGame(Game game) {
        // find Elo values for the teams given in the game
        // find P(win) using the known Elo formula
        // adjust for HFA
        // roll the dice by testing random_decimal against P(win)
        // fill in a score on the game object
    }
}

The more I stare at this, the more I realize that biting off the whole Elo 538 clone in one go is foolish. Better to get the MC harness working with some very basic reference predictor, like “home teams win 55% of the time.”
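
In Python, that reference predictor could be as small as the sketch below. This is not the code from the diff: simulate_game and away_points are the names used in the pseudo-code above, while home_points, the placeholder scores, and the exact win probability are guesses.

import random

class HomeTeamPredictor:
    # reference predictor: the home team wins a fixed share of the time
    HOME_WIN_PROBABILITY = 0.55  # the example figure from above, not necessarily what the diff uses

    def simulate_game(self, game):
        # roll the dice and fill in an arbitrary plausible score
        if random.random() < self.HOME_WIN_PROBABILITY:
            game.home_points, game.away_points = 28, 17
        else:
            game.home_points, game.away_points = 17, 28

In the architecture sketched above this would just be one more Predictor subclass, and the Elo version can slot in beside it later.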

Here’s the combined diff. It basically looks like the above p-code except our implemented predictor just does a coinflip-with-HFA. The new verbose output now includes a Monte Carlo run with that predictor:

$ python3 ./mcc_schedule.py -v
San José State 7 at USC 30 on Sep 04, 2021
Stanford 42 at USC 28 on Sep 11, 2021
Fresno State 40 at UCLA 37 on Sep 18, 2021
UCLA 35 at Stanford 24 on Sep 25, 2021
San Diego State 19 at San José State 13 on Oct 15, 2021
Fresno State 30 at San Diego State 20 on Oct 30, 2021
UCLA at USC on Nov 20, 2021
California at Stanford on Nov 20, 2021
Fresno State at San José State on Nov 24, 2021
California at UCLA on Nov 26, 2021
USC at California on Dec 03, 2021

Full Enumeration Simulation:
Fresno State 19 [59%]
USC 4 [12%]
California 5 [15%]
UCLA 4 [12%]

Monte Carlo (Home Team Predictor) Simulation:
Fresno State 6667 [66%]
UCLA 1377 [13%]
USC 1339 [13%]
California 616 [6%]

Fresno State            2-0
San Diego State         1-1
UCLA                    1-1
Stanford                1-1
USC                     1-1
San José State          0-2
2021, 11, ,

This confirms our gut feeling that Fresno State is still the favorite and that Cal’s path is especially tough, since they have to run the table, including two road games. Every run produces slightly different Monte Carlo totals, but the rounded percentages rarely differ.

Note the schedule change: Cal/USC was COVID-cancelled this past weekend but has been re-scheduled for December. Good to see the powers that be recognizing that a crucial MCC game must be played! If Cal comes in 2-0, that will be their championship game. USC has a path to be alive by then as well. (They need a win and a Fresno loss.)