Just to keep myself honest: this is version 0.1, a single-year run with a hard-coded school list. The dataset is a bit incomplete on stadium info, which is the only way to query the abstract location of a school. (I think.) So for now we’ll populate the CA schools by hand. The list is still incomplete, especially for the wonky 1940s stuff.
The general strategy is to query the games API for each of our teams, then loop through each team’s games and check whether the opponent is also on the list. Then we use a standings-specific sort to get a rough standings order and drill down into tie-breakers.
import cfbd
from datetime import datetime
configuration = cfbd.Configuration()
configuration.api_key['Authorization'] = 'secret'
configuration.api_key_prefix['Authorization'] = 'Bearer'
all_ca_teams = {25: 'California', 278: 'Fresno State', 16: 'Sacramento State',
                21: 'San Diego State', 23: 'San José State', 24: 'Stanford',
                302: 'UC Davis', 26: 'UCLA', 30: 'USC',
                # hand-added because stadium data incomplete
                1000003: 'Santa Clara', 1000004: 'San Francisco', 1000920: 'Saint Mary\'s',
                1000034: 'Loyola Marymount', 1000044: 'Pacific', 1000867: 'Pepperdine',
                1000007: 'California-Santa Barbara', 1000888: 'San Francisco State',
                1000006: 'Long Beach State', 1000012: 'Cal State Fullerton',
                # mis-classified stadiums
                13: 'Cal Poly'
                }
api_instance = cfbd.GamesApi(cfbd.ApiClient(configuration))
cur_year = 2012
mcc_games = {}
for team_id in all_ca_teams :
    #print("looking for " + all_ca_teams[team_id])
    all_teams_games = api_instance.get_games(year=cur_year, team = all_ca_teams[team_id])
    for cur_game in all_teams_games :
        #print(cur_game)
        other_team_id = -1
        if (cur_game.away_id == team_id) :
            other_team_id = cur_game.home_id
        else :
            other_team_id = cur_game.away_id
        if other_team_id in all_ca_teams :
            other_team = all_ca_teams[other_team_id]
            #print("This was a MCC game " + all_ca_teams[team_id] + " versus " + other_team)
            # keyed by game id, so a game seen from both teams is stored once
            mcc_games[cur_game.id] = cur_game
print(str(cur_year) + " had " + str(len(mcc_games)) + " MCC games")
def timesortfunc(mcc_game) :
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
    return datetime.strptime(mcc_game.start_date, fmt)
time_ordered_games = sorted(mcc_games.values(), reverse = False, key = timesortfunc)
for cur_mcc_game in time_ordered_games:
    print (cur_mcc_game.away_team + " " + str(cur_mcc_game.away_points) + " at " +
           cur_mcc_game.home_team + " " + str(cur_mcc_game.home_points) + " on " + cur_mcc_game.start_date)
class StandingsRecord:
    def __init__(self, wins, losses, team_name):
        self.wins = wins
        self.losses = losses
        self.team_name = team_name
    def __str__(self):
        return self.team_name + "\t" + str(self.wins) + "-" + str(self.losses)
standings = {}
for mcc_game_id in mcc_games :
    cur_mcc_game = mcc_games[mcc_game_id]
    if (cur_mcc_game.away_points > cur_mcc_game.home_points) :
        # away team won
        if (cur_mcc_game.away_id in standings) :
            standings[cur_mcc_game.away_id].wins += 1
        else :
            standings[cur_mcc_game.away_id] = StandingsRecord(1, 0, cur_mcc_game.away_team)
        if (cur_mcc_game.home_id in standings) :
            standings[cur_mcc_game.home_id].losses += 1
        else :
            standings[cur_mcc_game.home_id] = StandingsRecord(0, 1, cur_mcc_game.home_team)
    else :
        # home team won (a tied score also lands here)
        if (cur_mcc_game.home_id in standings) :
            standings[cur_mcc_game.home_id].wins += 1
        else :
            standings[cur_mcc_game.home_id] = StandingsRecord(1, 0, cur_mcc_game.home_team)
        if (cur_mcc_game.away_id in standings) :
            standings[cur_mcc_game.away_id].losses += 1
        else :
            standings[cur_mcc_game.away_id] = StandingsRecord(0, 1, cur_mcc_game.away_team)
#print("raw standings")
#for team_id in standings :
# print(all_ca_teams[team_id] + " " + str(standings[team_id]))
def sortfunc(sr) :
    return (sr.wins * 1000 / (sr.losses + sr.wins)) - sr.losses + sr.wins
ordered_standings = sorted(standings.values(), reverse = True, key = sortfunc)
#print("ordered standings")
print()
#for line in ordered_standings:
#    print(line)
# enforce win minimum
while(True) :
    first_place = ordered_standings[0]
    if (first_place.wins <= 1) :
        print("disqualifying one win " + first_place.team_name)
        ordered_standings.pop(0)
    else:
        print("first place wins is " + str(first_place.wins))
        break
# return negative if team1 is winner, positive if team2, 0 if tie
def head_to_head_winner(team1, team2, all_games):
    retval = 0
    for mcc_game_id in all_games :
        cur_mcc_game = all_games[mcc_game_id]
        if (cur_mcc_game.home_team == team1 and cur_mcc_game.away_team == team2) :
            if (cur_mcc_game.home_points > cur_mcc_game.away_points) :
                # team1 was the home team and home team won
                retval -= 1
            else:
                # team1 was the home team and the home team lost
                retval += 1
        elif (cur_mcc_game.away_team == team1 and cur_mcc_game.home_team == team2) :
            if (cur_mcc_game.away_points > cur_mcc_game.home_points) :
                # team1 was the away team and the away team won
                retval -= 1
            else:
                # team1 was the away team and the away team lost
                retval += 1
        else:
            # this game was not head to head
            pass
    return retval
# return -1 if team 1 is winner, 1 if team2, 0 if tie
def common_opp_margin(team1, team2, all_games):
    oppos = {}
    gross_margin_team1 = 0
    # first find all team1's margins
    for mcc_game_id in all_games :
        cur_mcc_game = all_games[mcc_game_id]
        if (cur_mcc_game.home_team == team1) :
            oppos[cur_mcc_game.away_id] = (cur_mcc_game.home_points - cur_mcc_game.away_points)
        elif (cur_mcc_game.away_team == team1) :
            oppos[cur_mcc_game.home_id] = (cur_mcc_game.away_points - cur_mcc_game.home_points)
        else:
            # team1 was not involved in this game
            pass
    # now we have all team1's margins organized by opponent
    print("oppo check for " + team1 + " and " + team2)
    #print(oppos)
    for mcc_game_id in all_games :
        cur_mcc_game = all_games[mcc_game_id]
        if (cur_mcc_game.home_team == team2) :
            if (cur_mcc_game.away_id in oppos):
                # this is a common opponent with team2 as home team
                this_team2_margin = (cur_mcc_game.home_points - cur_mcc_game.away_points)
                gross_margin_team1 += (oppos[cur_mcc_game.away_id] - this_team2_margin)
                #print("common oppo detected with " + cur_mcc_game.away_team + " gross team1 margin " + str(gross_margin_team1))
            else:
                # team1 didn't play them
                # print(team1 + " didn't play " + cur_mcc_game.away_id);
                pass
        elif (cur_mcc_game.away_team == team2) :
            if (cur_mcc_game.home_id in oppos):
                # this is a common opponent with team2 as away team
                this_team2_margin = (cur_mcc_game.away_points - cur_mcc_game.home_points)
                gross_margin_team1 += (oppos[cur_mcc_game.home_id] - this_team2_margin)
                #print("common oppo detected with " + cur_mcc_game.home_team + " gross team1 margin " + str(gross_margin_team1))
            else:
                # team1 didn't play them
                # print(team1 + " didn't play " + cur_mcc_game.home_id);
                pass
        else:
            # print(cur_mcc_game.home_team + " vs " + cur_mcc_game.away_team + " is not relevant")
            # this is not a team2 game
            pass
    if (gross_margin_team1 < 0):
        return 1
    elif (gross_margin_team1 > 0):
        return -1
    else:
        return 0
print()
# break ties
# (>= 2 so that a tie is still checked when only two teams remain)
if (len(ordered_standings) >= 2 and sortfunc(ordered_standings[0]) == sortfunc(ordered_standings[1])) :
    print("looks like a tie for the cup")
    # find head to head
    h2h = head_to_head_winner(ordered_standings[0].team_name,
                              ordered_standings[1].team_name, mcc_games)
    if (h2h < 0) :
        print("Tie broken by head-to-head")
        # proper team is in first
        pass
    elif (h2h > 0) :
        print("Tie broken by head-to-head")
        # promote second
        improper = ordered_standings.pop(0)
        ordered_standings.insert(1, improper)
    else:
        print("head to head didn't resolve anything")
        oppo_check = common_opp_margin(ordered_standings[0].team_name,
                                       ordered_standings[1].team_name,
                                       mcc_games)
        if (oppo_check < 0) :
            # proper team is in first
            pass
        elif (oppo_check > 0) :
            # promote second
            improper = ordered_standings.pop(0)
            ordered_standings.insert(1, improper)
        else:
            print("common opponent margin didn't resolve anything")
            exit(1)
print("ordered standings")
print()
for line in ordered_standings:
    print(line)
print()
print(str(cur_year) + " MCC winner is " + ordered_standings[0].team_name + " (" +
      str(ordered_standings[0].wins) + "-" + str(ordered_standings[0].losses) + ")")
And this is what it’s dumping out…
2012 had 11 MCC games
San José State 17 at Stanford 20 on 2012-09-01T02:00:00.000Z
UC Davis 13 at San José State 45 on 2012-09-09T00:00:00.000Z
USC 14 at Stanford 21 on 2012-09-15T23:30:00.000Z
California 9 at USC 27 on 2012-09-22T22:00:00.000Z
San José State 38 at San Diego State 34 on 2012-09-23T00:00:00.000Z
San Diego State 40 at Fresno State 52 on 2012-09-30T02:00:00.000Z
UCLA 17 at California 43 on 2012-10-07T02:00:00.000Z
Stanford 21 at California 3 on 2012-10-20T19:00:00.000Z
USC 28 at UCLA 38 on 2012-11-17T20:05:00.000Z
Stanford 35 at UCLA 17 on 2012-11-24T23:30:00.000Z
UCLA 24 at Stanford 27 on 2012-12-01T01:00:00.000Z

first place wins is 5

ordered standings

Stanford 5-0
Fresno State 1-0
San José State 2-1
USC 1-2
California 1-2
UCLA 1-3
UC Davis 0-1
San Diego State 0-2

2012 MCC winner is Stanford (5-0)
Some obvious failings: it doesn’t handle tied games at all (a tied score gets counted as a home win). It also doesn’t handle a possible three-way tie in the standings. The standings output needs to be padded so the columns line up. We need to refactor so the year is a param and then run them all. The tie-breaking produces some controversial results which might not be correct.
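Two of those failings (tied games and padding) are small enough to sketch. This is only a rough idea, not the script's actual code: it works on plain (away, away_points, home, home_points) tuples rather than cfbd game objects, and the Record class, tally and format_row are names made up for the illustration. The first two games come from the 2012 output; the third is an invented tie.

```python
from dataclasses import dataclass

@dataclass
class Record:
    # hypothetical stand-in for StandingsRecord, with a ties column added
    team_name: str
    wins: int = 0
    losses: int = 0
    ties: int = 0

def tally(games):
    """Build per-team records from (away, away_pts, home, home_pts) tuples."""
    table = {}
    for away, away_pts, home, home_pts in games:
        a = table.setdefault(away, Record(away))
        h = table.setdefault(home, Record(home))
        if away_pts > home_pts:
            a.wins += 1
            h.losses += 1
        elif home_pts > away_pts:
            h.wins += 1
            a.losses += 1
        else:
            # a tied score counts as a tie for both teams, not a home win
            a.ties += 1
            h.ties += 1
    return table

def format_row(rec, width):
    """Pad the team name so the record column lines up."""
    return f"{rec.team_name.ljust(width)}  {rec.wins}-{rec.losses}-{rec.ties}"

games = [
    ("USC", 14, "Stanford", 21),                 # from the 2012 run
    ("California", 9, "USC", 27),                # from the 2012 run
    ("Hypothetical A", 7, "Hypothetical B", 7),  # invented tie
]
table = tally(games)
width = max(len(rec.team_name) for rec in table.values())
for rec in table.values():
    print(format_row(rec, width))
```

Using setdefault also collapses the four "is this team already in standings" branches into two lines, which might be worth stealing even without the ties column.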
We should factor out the tie-breaking stuff, which gets unwieldy, and hide it down in a lib somewhere. Overall, I am pleased with the cfbd API.
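One possible shape for that lib, as a sketch only: treat each tie-breaker as a function with the same signature and return convention used above (negative for team1, positive for team2, 0 for unresolved) and walk an ordered list of them. The names break_tie, head_to_head and no_decision are invented here, and the games are plain tuples from the 2012 output instead of cfbd objects.

```python
def head_to_head(team1, team2, games):
    """-1 if team1 leads the head-to-head series, 1 if team2 does, 0 otherwise."""
    net = 0
    for away, away_pts, home, home_pts in games:
        if {away, home} != {team1, team2} or away_pts == home_pts:
            continue  # not a head-to-head game, or a tied score
        winner = away if away_pts > home_pts else home
        net += -1 if winner == team1 else 1
    return (net > 0) - (net < 0)

def no_decision(team1, team2, games):
    """Placeholder for a later tie-breaker (e.g. common-opponent margin)."""
    return 0

def break_tie(team1, team2, games, tiebreakers):
    """Apply tie-breakers in order; return (result, name) of the first that decides."""
    for name, fn in tiebreakers:
        result = fn(team1, team2, games)
        if result != 0:
            return result, name
    return 0, None

# (away, away_points, home, home_points), taken from the 2012 run
games = [
    ("USC", 14, "Stanford", 21),
    ("USC", 28, "UCLA", 38),
    ("UCLA", 24, "Stanford", 27),
]
tiebreakers = [("head-to-head", head_to_head), ("placeholder", no_decision)]

print(break_tie("USC", "UCLA", games, tiebreakers))
print(break_tie("California", "Fresno State", games, tiebreakers))
```

The caller then only needs one branch for "swap the top two" and one for "nothing resolved it", instead of a nested if/else per tie-breaker.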
About that sort…
def sortfunc(sr) :
    return (sr.wins * 1000 / (sr.losses + sr.wins)) - sr.losses + sr.wins
This feels dirty… it’s a one-line way to combine the idea that winning percentage “wins” but more wins are better. I think it’s stable and correct. As I type this out it occurs to me that the whole idea of tie-breakers being separate from the sort also feels algorithmically weak… if we have a stable way of ranking teams, shouldn’t we be able to express that in a scored sort that we can apply once? But I think the nature of tie-breakers is that they only rank one pair at a time; they don’t give a stable order for the whole set.
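For comparison, here is a sketch of the same ranking idea written as a tuple key, so that percentage is compared first and net wins only matter when percentages are equal. It is an alternative, not what the script does: record_key and one_line_score are names made up here, and the records are the 2012 standings from the run above, deliberately listed out of order. One thing it surfaces is that the combined score can in principle let net wins outweigh percentage (9-1 scores 908.0 while 17-2 scores about 909.7), although a team would need an implausibly long MCC schedule for that to bite.

```python
from fractions import Fraction

def one_line_score(wins, losses):
    # the combined score from the post, on plain numbers
    return (wins * 1000 / (losses + wins)) - losses + wins

def record_key(wins, losses):
    """Sort key: exact winning percentage first, then net wins."""
    games = wins + losses
    pct = Fraction(wins, games) if games else Fraction(0)
    return (pct, wins - losses)

# 2012 records from the run above, listed out of order on purpose
records = [
    ("UCLA", 1, 3), ("San Diego State", 0, 2), ("USC", 1, 2),
    ("Fresno State", 1, 0), ("UC Davis", 0, 1), ("Stanford", 5, 0),
    ("California", 1, 2), ("San José State", 2, 1),
]
ordered = sorted(records, key=lambda r: record_key(r[1], r[2]), reverse=True)
for name, wins, losses in ordered:
    print(f"{name} {wins}-{losses}")

# the two keys disagree on this (unrealistic) pair of records
print(one_line_score(9, 1), one_line_score(17, 2))
print(record_key(9, 1) > record_key(17, 2))
```

Teams with identical records (USC and California here) still compare equal under either key, which is exactly where the pairwise tie-breakers have to take over.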