some tests are better than none

I was hung up for a while on the idea of unit tests, a proper test harness, and how to pickle test sets. Python 3's pickle serialization module seems really nice, but it occurred to me that the most bang for my buck is some kind of end-to-end test with artificially created schedule data. The beautiful thing is that we only use about five fields of the Game object, so we can pack a lot more information into code-created data than we'd get by storing real-life data with tons of extra cruft. Even if we used pickle, we'd still be spending a lot of time crafting perfect torture-test scenarios, so why not just skip the serialization step entirely?

In fact, with these two methods we're basically there as far as being able to make skeleton game objects for testing:

from datetime import date, timedelta

import cfbd

import constants

# Fake a date in the CFBD format. Days count backward from today,
# so a negative days_in_past yields a date in the future.
def create_date(days_in_past):
    today = date.today()
    delta = timedelta(days=days_in_past)
    gameday = today - delta
    return gameday.strftime(constants.CFBD_DATE_FMT)

# Build a skeleton Game carrying only the fields our code reads.
def create_game(game_id, home_id, home_team, away_id, away_team, home_points, away_points, start_date):
    game = cfbd.Game()
    game.id = game_id
    game.home_id = home_id
    game.home_team = home_team
    game.away_id = away_id
    game.away_team = away_team
    game.home_points = home_points
    game.away_points = away_points
    game.start_date = start_date
    return game
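
For example, two skeleton games for one team take a couple of lines (all the IDs, names, and scores here are made-up test values):

# Two finished games for team 101: a win a week ago, a loss yesterday.
games = [
    create_game(1, 101, "Acme State", 102, "Gadget Tech", 35, 14, create_date(7)),
    create_game(2, 101, "Acme State", 103, "Widget A&M", 21, 24, create_date(1)),
]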

So now we have schedule_maker.py for all our test schedule needs. We plug that into the existing find_mcc_games() in our main code with a big switch for testing versus not testing. My idea for polishing this is to make the testing switch responsive to an environment variable that says testing yes/no and which test to use. That way we can keep the rest of the code naive to tests and run a harness that invokes it over and over again with different env variables.
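
Roughly what I have in mind is a gate like this (a minimal sketch: the MCC_TEST_SCENARIO variable name and the make_schedule()/fetch_real_schedule() helpers are placeholders, not names from the actual code):

import os

import schedule_maker

def get_games(year, team):
    # Placeholder env var: unset means production; otherwise it names
    # a canned scenario that schedule_maker knows how to build.
    scenario = os.environ.get("MCC_TEST_SCENARIO")
    if scenario:
        return schedule_maker.make_schedule(scenario)
    return fetch_real_schedule(year, team)  # the existing CFBD fetch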

It's not the most elegant setup, but it gets at the main problem for now: torturing our resolution algorithms with sticky scenarios we've only found in limited places in the wild.

We already found some great bugs: the common opponent margin calculation was very brittle for schedules with multiple games against the same opponent. That doesn't come up much in the real world, but it's possible even today with conference championship games, and in the past teams really did schedule the same team twice in one year. The common opponent check is now completely refactored: build a clean mapping of each team's margins, then run the intersection on that, roughly as sketched below.
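
In spirit the refactor looks something like this (a sketch under my own naming, not the exact code; the key point is that each opponent maps to a list of margins, so repeat matchups can't clobber each other, and common opponents fall out of a set intersection):

from collections import defaultdict

# Map opponent id -> list of this team's margins against them.
# Keeping a list per opponent is what makes multiple games against
# the same team safe.
def margins_by_opponent(team_id, games):
    margins = defaultdict(list)
    for g in games:
        if g.home_id == team_id:
            margins[g.away_id].append(g.home_points - g.away_points)
        elif g.away_id == team_id:
            margins[g.home_id].append(g.away_points - g.home_points)
    return margins

def common_opponent_margins(games, team_a, team_b):
    a = margins_by_opponent(team_a, games)
    b = margins_by_opponent(team_b, games)
    # Intersection of the opponent sets = the common opponents.
    return {opp: (a[opp], b[opp]) for opp in a.keys() & b.keys()}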

Obviously, as it stands now we're still just eyeballing results and regressions, so the next part of testing is to capture the expected results and make the harness actually check against them. I should be able to find an existing tool that does that rather than write the harness myself.
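
Until then, even a bare-bones pytest harness would do the comparison; something like this (run_scenario() and the expected_results.json file are hypothetical stand-ins for whatever ends up capturing the outputs):

import json

import pytest

with open("expected_results.json") as f:
    EXPECTED = json.load(f)  # scenario name -> expected output

@pytest.mark.parametrize("scenario", sorted(EXPECTED))
def test_scenario(scenario, monkeypatch):
    # Reuses the env-var switch from above; run_scenario() is a
    # placeholder for invoking the main code and collecting its result.
    monkeypatch.setenv("MCC_TEST_SCENARIO", scenario)
    assert run_scenario() == EXPECTED[scenario]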
