Faking 2022 – Mythical California Cup

I am impatient for cfbd to populate the 2022 schedules so I can see how the Monte Carlo stuff performs on “fresh powder.” The good news is with the testing harness we can fake up the announced schedule without too much trouble. This should be it:

def real_life_future_schedule():
    # Each team is a bare [cfbd team id, display name] pair.
    stanford = [24, 'Stanford']
    cal = [25, 'California']
    sdst = [21, 'San Diego State']
    ucla = [26, 'UCLA']
    usc = [30, 'USC']
    sjst = [23, 'San José State']
    fresno = [278, 'Fresno State']

    games = {}
    games[1] = create_game(1, ucla[0], ucla[1], stanford[0], stanford[1], None, None, create_date(-100))
    games[2] = create_game(2, ucla[0], ucla[1], usc[0], usc[1], None, None, create_date(-101))
    games[3] = create_game(3, cal[0], cal[1], ucla[0], ucla[1], None, None, create_date(-102))
    games[4] = create_game(4, usc[0], usc[1], cal[0], cal[1], None, None, create_date(-103))
    games[5] = create_game(5, cal[0], cal[1], stanford[0], stanford[1], None, None, create_date(-105))
    games[6] = create_game(6, usc[0], usc[1], stanford[0], stanford[1], None, None, create_date(-106))
    games[7] = create_game(7, fresno[0], fresno[1], sjst[0], sjst[1], None, None, create_date(-107))
    games[8] = create_game(8, usc[0], usc[1], fresno[0], fresno[1], None, None, create_date(-108))
    games[9] = create_game(9, fresno[0], fresno[1], sdst[0], sdst[1], None, None, create_date(-109))
    games[10] = create_game(10, sdst[0], sdst[1], sjst[0], sjst[1], None, None, create_date(-110))
    
    return games

Doing the teams as bare [id, name] lists is kind of weak, but all we’re looking for is a quick and dirty way to see the games. Here’s what a run looks like now. You’ll note that the dates are wrong: all that matters is that they fall in the future, so I didn’t try to finesse them.
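If the bare lists ever start to bother us, a named tuple would be a cheap upgrade. This is only a minimal sketch: `Team` and `matchup_args` are hypothetical names that don't exist in the harness, and the argument order simply mirrors the `create_game` calls above (first team listed is the home team).

```python
from typing import NamedTuple


class Team(NamedTuple):
    """Hypothetical replacement for the bare [id, name] lists."""
    team_id: int
    name: str


STANFORD = Team(24, "Stanford")
UCLA = Team(26, "UCLA")


def matchup_args(home: Team, away: Team) -> tuple:
    """Flatten a home/away pair into (home id, home name, away id, away name)."""
    return (home.team_id, home.name, away.team_id, away.name)
```

With that in place, the first game could presumably be written as `create_game(1, *matchup_args(UCLA, STANFORD), None, None, create_date(-100))`.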

Stanford at UCLA on May 03, 2022
USC at UCLA on May 04, 2022
UCLA at California on May 05, 2022
California at USC on May 06, 2022
Stanford at California on May 08, 2022
Stanford at USC on May 09, 2022
San José State at Fresno State on May 10, 2022
Fresno State at USC on May 11, 2022
San Diego State at Fresno State on May 12, 2022
San José State at San Diego State on May 13, 2022

Full Enumeration Simulation:
USC 152 [14%]
San Diego State 144 [14%]
San José State 144 [14%]
California 128 [12%]
UCLA 124 [12%]
Stanford 124 [12%]
Fresno State 116 [11%]
No Winner 92 [8%]


Monte Carlo [Sampled Home Margin Predictor] Simulation:
USC 1712 [17%]
Fresno State 1606 [16%]
UCLA 1476 [14%]
California 1437 [14%]
San Diego State 1398 [13%]
San José State 1163 [11%]
Stanford 1137 [11%]
No Winner 71 [0%]

Monte Carlo [Elo Predictor] Simulation:
USC 2057 [20%]
Fresno State 1704 [17%]
California 1566 [15%]
UCLA 1482 [14%]
San Diego State 1368 [13%]
San José State 933 [9%]
Stanford 826 [8%]
No Winner 64 [0%]

There are no standings, possibly because no games were completed.
2016, 10, ,

The interesting thing is the blocks of predictor scoreboard results. First thing to note is that the printout is now arranged in descending order of probability, which makes scanning big tables easier. Second is that we now track a “no winner” scenario in the scoreboard. Before, we weren’t bothering to check whether the tie-breaking actually worked when we ran it inside the tight Monte Carlo loop. But we’ve been spending so much time getting tie-breaking right that we should try to respect it in the simulations. A “No Winner” entry means the tie-breaker genuinely could not resolve that scenario.
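For reference, a scoreboard printout like the ones above can be sketched in a few lines. This is not the actual harness code: `format_scoreboard` is a hypothetical name, and the use of truncated (not rounded) percentages is my reading of the printed numbers, where 92 of 1024 shows up as 8%.

```python
from collections import Counter


def format_scoreboard(tally: Counter) -> list[str]:
    """Return 'name count [pct%]' lines, most frequent winner first."""
    total = sum(tally.values())
    lines = []
    for name, count in tally.most_common():
        pct = count * 100 // total  # truncate, to match the printouts above
        lines.append(f"{name} {count} [{pct}%]")
    return lines


# "No Winner" is tallied like any other outcome.
full_enumeration = Counter({
    "USC": 152, "San Diego State": 144, "San José State": 144,
    "California": 128, "UCLA": 124, "Stanford": 124,
    "Fresno State": 116, "No Winner": 92,
})
```

Because “No Winner” is just another key in the tally, it sorts into the table by count like everything else.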

With no games played yet, the “full enumeration” scoreboard has a very wide spread of possibilities. (Sanity check: all the totals add up to 2^10 = 1024.) We expect this to be roughly even, so USC’s slight advantage comes from its extra game: 4-0 beats 3-0, so there’s a whole leaf of the possibility tree it can own there. It also holds a tie-breaker of sorts over the G5 teams because of its game against Fresno State. (Unfortunately it looks like that’s the only P5/G5 crossover this year.) There are 92 distinct “no winner” scenarios. That suggests that surrendering on 4-way ties is going to bite us sooner than we think.
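The 2^10 sanity check is easy to reproduce outside the harness. The sketch below walks every win/loss combination for the ten games in the fake schedule and counts how often each team finishes unbeaten. It deliberately does not attempt the real ranking or tie-breaking logic; `enumerate_outcomes` and `count_unbeaten` are hypothetical helpers, not harness functions.

```python
from itertools import product

# (home, away) pairs in the same order as the fake schedule.
GAMES = [
    ("UCLA", "Stanford"), ("UCLA", "USC"), ("California", "UCLA"),
    ("USC", "California"), ("California", "Stanford"), ("USC", "Stanford"),
    ("Fresno State", "San José State"), ("USC", "Fresno State"),
    ("Fresno State", "San Diego State"), ("San Diego State", "San José State"),
]


def enumerate_outcomes(games):
    """Yield one {team: [wins, losses]} dict per possible set of results."""
    for picks in product((True, False), repeat=len(games)):
        records = {}
        for (home, away), home_wins in zip(games, picks):
            winner, loser = (home, away) if home_wins else (away, home)
            records.setdefault(winner, [0, 0])[0] += 1
            records.setdefault(loser, [0, 0])[1] += 1
        yield records


def count_unbeaten(games):
    """Return (number of scenarios, {team: scenarios with zero losses})."""
    total = 0
    unbeaten = {}
    for records in enumerate_outcomes(games):
        total += 1
        for team, (wins, losses) in records.items():
            if losses == 0:
                unbeaten[team] = unbeaten.get(team, 0) + 1
    return total, unbeaten
```

USC goes 4-0 in 2^6 = 64 of the 1024 scenarios, which is the leaf of the tree where nobody else can match its win total.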

In the Monte Carlo simulations we see a much stronger advantage for USC at the top, especially in the Elo sim. Not bad! USC will indeed be a favorite in those three games barring something strange. I think a human would give them more than a 20% chance at the title but there’s no way Elo can account for the coaching hire and transfer portal action of the past month. We would need a more sophisticated simulator based on composite roster talent or something.
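For anyone who hasn’t seen it, the heart of an Elo-style predictor is the standard logistic expected-score formula. The sketch below is an assumption about the general shape only: the function names are hypothetical, and whether the real predictor applies a home-field bonus (the `home_edge` parameter here) is not something this post specifies.

```python
import random


def elo_home_win_probability(home_rating, away_rating, home_edge=0.0):
    """Standard Elo expected score for the home team.

    home_edge is an optional rating bonus for playing at home (an assumption).
    """
    diff = (home_rating + home_edge) - away_rating
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def sample_home_win(home_rating, away_rating, rng, home_edge=0.0):
    """Simulate one game: True if the home team wins this draw."""
    p_home = elo_home_win_probability(home_rating, away_rating, home_edge)
    return rng.random() < p_home


# Example draw with made-up ratings and a seeded generator.
rng = random.Random(2022)
result = sample_home_win(1650, 1500, rng)
```

Since the ratings only move when games are played, nothing in this formula can see a coaching change or a transfer class.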

In the real-world sims the unresolvable ties are very unlikely, under 1% each.

Comparing the two Monte Carlo predictors gives a sense of how much team quality is adding to just raw home field advantage. The “Sampled Home Margin Predictor” is just throwing typical home/road results at the matchups. Stanford plays all its MCC games on the road so it has a structural disadvantage beyond being foreseeably terrible. Overall I am pleased by the simulator code on a fresh season. We didn’t get this code done until only a few games were left last season.
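To make the structural disadvantage concrete, here is a rough sketch of what a sampled-home-margin predictor could look like. Everything in it is an assumption: the margin pool is made up for illustration, and `sample_home_win` and `road_sweep_rate` are hypothetical names, not the real predictor.

```python
import random

# Hypothetical pool of past home-team margins (home score minus away score).
# These numbers are invented for illustration only.
HOME_MARGINS = [21, 14, 10, 7, 3, 3, -3, -7, -10, 17]


def sample_home_win(rng, margins=HOME_MARGINS):
    """Draw one historical margin; the home team wins if it is positive."""
    return rng.choice(margins) > 0


def road_sweep_rate(road_games, trials, rng, margins=HOME_MARGINS):
    """Fraction of trials in which a team wins every one of its road games."""
    sweeps = 0
    for _ in range(trials):
        if all(not sample_home_win(rng, margins) for _ in range(road_games)):
            sweeps += 1
    return sweeps / trials


rate = road_sweep_rate(3, 10_000, random.Random(0))
```

With this made-up pool the home team wins 70% of the draws, so a team with three road games sweeps them only about 0.3^3 = 2.7% of the time, before team quality even enters the picture.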

Just for fun, here’s another test where we clone the 2022 schedule and add fake scores to give us one possible four-way tie:

San José State 34 at San Diego State 20 on Oct 05, 2021
San Diego State 21 at Fresno State 18 on Oct 06, 2021
Fresno State 33 at USC 8 on Oct 07, 2021
San José State 2 at Fresno State 40 on Oct 08, 2021
Stanford 30 at USC 20 on Oct 09, 2021
Stanford 24 at California 12 on Oct 10, 2021
California 33 at USC 8 on Oct 12, 2021
UCLA 15 at California 24 on Oct 13, 2021
USC 28 at UCLA 31 on Oct 14, 2021
Stanford 14 at UCLA 30 on Oct 15, 2021

TBRK 4 is too many tied teams for us.
2016 final standings

UCLA                    2-1
Stanford                2-1
California              2-1
Fresno State            2-1
San José State          1-1
San Diego State         1-1
USC                     0-4

could not resolve a winner for 2016
2016, 10, ,

All it takes is for USC to go 0-4 and we can start to see some real chaos! UCLA’s one loss is to Cal, Stanford’s one loss is to UCLA and Cal’s one loss is to Stanford. Fresno State’s one loss is to SD State. I’m not sure how to resolve this yet.

The bigger question this raises is: do we still have a problem with three-way ties if there are circularities like this in head-to-head results? I think we do. Here’s the same run as above with one result tweaked: Fresno State loses an extra game, so only three teams are tied at the top:

San José State 34 at San Diego State 20 on Oct 05, 2021
San Diego State 21 at Fresno State 18 on Oct 06, 2021
Fresno State 23 at USC 33 on Oct 07, 2021
San José State 2 at Fresno State 40 on Oct 08, 2021
Stanford 30 at USC 20 on Oct 09, 2021
Stanford 24 at California 12 on Oct 10, 2021
California 33 at USC 8 on Oct 12, 2021
UCLA 15 at California 24 on Oct 13, 2021
USC 28 at UCLA 31 on Oct 14, 2021
Stanford 14 at UCLA 30 on Oct 15, 2021

TBRK 3-team tie
TBRK 3-team tie broken by H2H for UCLA vs Stanford
TBRK 2-team tie
TBRK tie broken by head-to-head
2022 final standings

California              2-1
UCLA                    2-1
Stanford                2-1
San José State          1-1
San Diego State         1-1
Fresno State            1-2
USC                     1-3

2022, 10, California, 2-1

The three-team tie-breaker code short-circuits on the first head-to-head result it finds (UCLA over Stanford), and then the two-team tie-breaker happily settles the remaining pair (Cal over UCLA). But the first eliminated team, Stanford, also has a victory over the eventual winner, Cal! This suggests that our strategy isn’t rigorous enough and we need to detect circularities before declaring victory in the three-team tie-breaker. Dang.
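Detecting the circularity itself is not hard. A minimal sketch, assuming we can hand the tie-breaker a set of (winner, loser) pairs for the tied teams; `has_head_to_head_cycle` is a hypothetical helper and not part of the current code:

```python
from itertools import permutations


def has_head_to_head_cycle(tied_teams, beat):
    """Return True if three tied teams beat each other in a circle.

    beat is a set of (winner, loser) pairs.
    """
    for a, b, c in permutations(tied_teams, 3):
        if (a, b) in beat and (b, c) in beat and (c, a) in beat:
            return True
    return False


# Head-to-head results among the three tied teams in the run above.
results = {
    ("UCLA", "Stanford"),
    ("Stanford", "California"),
    ("California", "UCLA"),
}
circular = has_head_to_head_cycle(["California", "UCLA", "Stanford"], results)
```

If this returns True, head-to-head cannot legitimately eliminate anyone, and the tie-breaker should move on to its next criterion instead of short-circuiting.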

(In this case all three teams have a common opponent, USC, but how common will that be? Is it even worth writing the three-actor common oppo code?)
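Finding the shared opponents is at least cheap to check. A minimal sketch, assuming the schedule is available as (home, away) name pairs; `common_opponents` is a hypothetical helper, and what to do with the shared opponents afterwards is left open:

```python
def common_opponents(tied_teams, games):
    """Return opponents that every tied team has played, excluding the tied teams."""
    tied = set(tied_teams)
    if not tied:
        return set()
    opponents = {team: set() for team in tied}
    for home, away in games:
        if home in tied:
            opponents[home].add(away)
        if away in tied:
            opponents[away].add(home)
    shared = set.intersection(*opponents.values())
    return shared - tied


# (home, away) pairs from the fake 2022 schedule.
GAMES = [
    ("UCLA", "Stanford"), ("UCLA", "USC"), ("California", "UCLA"),
    ("USC", "California"), ("California", "Stanford"), ("USC", "Stanford"),
    ("Fresno State", "San José State"), ("USC", "Fresno State"),
    ("Fresno State", "San Diego State"), ("San Diego State", "San José State"),
]

p5_shared = common_opponents(["UCLA", "Stanford", "California"], GAMES)
g5_shared = common_opponents(["Fresno State", "San Diego State", "San José State"], GAMES)
```

On this schedule the three Pac-12 teams share exactly one outside opponent, while a three-way tie among the G5 teams would share none, which hints at how rarely the common-opponent rule could actually fire.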