anatomy of a timezone fix

Timezone bugs and libraries are always hilarious fun, so let’s take a look at what I did, and what I should still do, before I forget it. The diff in question.

The timestamps in the database are in UTC. So it should be a simple enough matter to parse them as UTC and then print out the Pacific timezone equivalent. Unfortunately, parsing the stamp the way I do gives back a naive python datetime object: the UTC designator in the stamp is never read, and astimezone() treats a naive value as machine-local time, so the result ends up with a false timezone. My fix surgically replaces the timezone so the stamp is correct, then uses the astimezone() method to give us the Pacific equivalent.
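Here is a minimal sketch of that replace-then-convert step, not the actual diff. The function name and the PACIFIC constant are made up for illustration, and the fixed UTC-7 offset is an assumption standing in for the hand-rolled Pacific zone (it matches daylight time only).

```python
from datetime import datetime, timedelta, timezone

# Assumption: a hand-rolled Pacific zone as a fixed UTC-7 offset (daylight time only).
PACIFIC = timezone(timedelta(hours=-7), "PDT")


def utc_stamp_to_pacific(stamp: str) -> datetime:
    """Hypothetical helper: parse a UTC stamp from the record and return Pacific time."""
    # The trailing Z is matched as a literal character, so the result is naive.
    naive = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    # Swap in the timezone we know the stamp is really in.
    aware_utc = naive.replace(tzinfo=timezone.utc)
    # Now astimezone() converts from UTC instead of guessing machine-local time.
    return aware_utc.astimezone(PACIFIC)


print(utc_stamp_to_pacific("2021-09-19T00:00:00.000Z").isoformat())
# → 2021-09-18T17:00:00-07:00
```

Note the day rolls back: midnight UTC on the 19th is still the evening of the 18th on the west coast, which is exactly the day-of-game detail this fix is after.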

Q: That timestamp in the record looks pretty legit, can’t you parse it in place correctly and get to a valid stamp in one step? Yes, in fact the stamp in the record looks like it follows RFC 3339. (Sample: 2021-09-19T00:00:00.000Z) Here’s a medium post about RFC 3339 stamps. Python3 built-ins have a strptime() method to parse stamps, but it looks like the method doesn’t accept the trailing Z as a simple zulu-time/UTC designator (at least not before Python 3.7, where the %z directive learned to read it). Here’s a long python3-specific post about RFC 3339 parsing problems that seems credible. So to stay on the safe side we just treat the ‘Z’ as a dead character and parse what precedes it.
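For comparison, here is a sketch of the one-shot parse, assuming Python 3.7 or later, where the documentation says the %z directive treats a literal Z the same as +00:00. The function name is hypothetical and this is not what the diff does.

```python
from datetime import datetime, timedelta


def parse_rfc3339_utc(stamp: str) -> datetime:
    """Hypothetical helper: one-step parse, relying on %z accepting 'Z' (Python 3.7+)."""
    # %f takes the fractional seconds, %z takes the trailing Z as a zero UTC offset.
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f%z")


parsed = parse_rfc3339_utc("2021-09-19T00:00:00.000Z")
print(parsed.utcoffset() == timedelta(0))  # True: the result is already aware, in UTC
```

This only covers stamps shaped like the sample, fractional seconds included. A stamp without the fractional part would need a different format string.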

Q: Looks like they used the lib pytz, why not just use that? Yes, in reading a bunch of stackoverflow answers and forum posts it seems like there are a few weak points in the python3 builtins that drive people to use pytz. I want to avoid unnecessary outside lib dependencies. There’s nothing more demoralizing than trying to steal, er, use some code snippet only to realize it requires more lib dependencies. As long as we don’t have the perfect solution yet anyway, let’s try to work with the builtins.

Q: Not perfect? Yes, can you believe it, this raises some questions. Why are we using a hand-rolled Pacific timezone as our default? Since this is the Mythical California Cup this will almost assuredly be correct for our purposes, but we split this “virtual conference” library out so that it could be used in theory by any caller with an arbitrary group of teams. Sticking with Pacific will give us good day-of-game display for any US-based game, but what if we wanted to expand our printout to the exact time?

It seems like the right thing to do here is go all times local, based on the exact spot where each game was played. But to find that we need a timezone keyed to the venue, and here’s where the data model falls down a little. There is a venue_id in the games table, but that points to “locations”, which are only available as subservient records in a big separate query to the teams endpoint. If you have a venue_id match you can pull a timezone there. So it means keeping all that location data around and doing a local lookup. It would be a big new query with a suspect join criterion, plus some extra work to build the lookup table and handle errors and ambiguity. Let’s put that down as a future nice-to-have.
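To make that nice-to-have a bit more concrete, here is a rough sketch of the lookup with a Pacific fallback. Everything in it is hypothetical: the VENUE_ZONES table, the venue id 101, the function name, and the fixed offsets (which ignore daylight saving changes) are placeholders for whatever the real “locations” records would provide.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Dict, Optional

# Assumption: the same hand-rolled Pacific default, as a fixed UTC-7 offset.
PACIFIC = timezone(timedelta(hours=-7), "PDT")

# Hypothetical lookup table built from the "locations" records, keyed by venue_id.
VENUE_ZONES: Dict[int, tzinfo] = {
    101: timezone(timedelta(hours=-5), "CDT"),  # made-up venue on Central daylight time
}


def local_game_time(utc_dt: datetime, venue_id: Optional[int]) -> datetime:
    """Convert an aware UTC datetime to the venue's zone, or Pacific if unknown."""
    zone = VENUE_ZONES.get(venue_id, PACIFIC)
    return utc_dt.astimezone(zone)


kickoff = datetime(2021, 9, 19, 0, 0, tzinfo=timezone.utc)
print(local_game_time(kickoff, 101).isoformat())   # → 2021-09-18T19:00:00-05:00
print(local_game_time(kickoff, None).isoformat())  # → 2021-09-18T17:00:00-07:00
```

Falling back to Pacific on a missing or unmatched venue_id keeps today’s behavior as the default, which sidesteps some of the error handling and ambiguity mentioned above.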

So in summary, we have an economical solution that doesn’t depend on outside libs but is still a little hacky: it doesn’t do a clean one-shot parse, and it assumes a hard-coded Pacific offset as the timezone of record.
