[Cosmo-dev] CosmoUI Timezone Proposal
Matthew Eernisse
mde at osafoundation.org
Fri Oct 27 21:25:24 PDT 2006
Thanks, Adam, for the comments. It's really nice to get some input from
outside of OSAF on the list.
Bobby Rullo wrote:
> Tricky to do. My first instinct would be to package it by timezone, but
> multiple timezones reference the same "rules" - eg. America/New_York
> and America/Los_Angeles both use the "US" rule of when to do the DST
> switch.
>
> Maybe we could split by countries? Mde, would that work? Do Zones ever
> reference rules from other countries?
Here's some stuff to consider WRT exceptions/country:
1. The exceptions are there because they point to country rules in that
particular file. Pacific/Easter references the Chile rules, which is why
it's in the southamerica file. Asia/Vladivostok points to Russia, which
is why it's in the europe file. Pacific/Honolulu is a part of the US,
which is why it's in the northamerica file.
Consumers of the northamerica file are much more likely to want
Pacific/Honolulu than consumers of the australasia file, which is where
most of the other Pacific zones live.
2. According to the 'Theory' file, they avoid country names for
timezones because the DB needs to:
"Be robust in the presence of political changes. This reduces the number
of updates and backward-compatibility hacks. For example, names of
countries are ordinarily not used, to avoid incompatibilities when
countries change their name (e.g. Zaire->Congo) or when locations change
countries (e.g. Hong Kong from UK colony to China)."
3. The Olson DB does change. According to Wikipedia (apply grains of
salt as needed), "New editions of the database are published as changes
warrant, usually several times per year."
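To make the zone-to-rule dependency from point 1 concrete, here is a rough sketch in Python (not what CosmoUI does) that scans text in Olson source syntax and reports which rule names each zone references. The sample lines are heavily abbreviated and illustrative, not copied from the real files, and the function names are made up for this example.

```python
SAMPLE = """\
# Illustrative, abbreviated lines in Olson source syntax (not real data).
Rule US 1967 2006 - Oct lastSun 2:00 0 S
Rule Chile 1999 max - Oct Sun>=9 4:00u 1:00 S
Zone America/New_York -4:56:02 - LMT 1883 Nov 18 12:03:58
            -5:00 US E%sT
Zone America/Los_Angeles -8:00 US P%sT
Zone Pacific/Easter -6:00 Chile EAS%sT
"""


def is_rule_name(field):
    # The RULES column may hold "-" (no rule) or a fixed amount like "1:00".
    return field != "-" and not field[0].isdigit()


def rules_by_zone(text):
    """Map each zone name to the set of rule names its lines reference."""
    zones = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()
        if not line.strip():
            continue
        fields = line.split()
        if fields[0] == "Zone":
            # Zone NAME GMTOFF RULES FORMAT [UNTIL]
            current = fields[1]
            zones.setdefault(current, set())
            rule_field = fields[3]
        elif line[0].isspace() and current is not None:
            # Continuation line: GMTOFF RULES FORMAT [UNTIL]
            rule_field = fields[1]
        else:
            current = None
            continue
        if is_rule_name(rule_field):
            zones[current].add(rule_field)
    return zones


def zones_by_rule(text):
    """Invert the mapping: which zones would break without a given rule set."""
    out = {}
    for zone, rules in rules_by_zone(text).items():
        for rule in rules:
            out.setdefault(rule, set()).add(zone)
    return out
```

Running zones_by_rule() over a whole file is one way to see which zones have to travel together with a given set of rules if the data gets repackaged.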
>> Is it really ever safe to
>> generalize and say you only need "North American" time zone data?
>> Wouldn't you usually have to know either just your local timezone or
>> be prepared for any request?
>
> Yeah, you can't say you just need North America, unless you're really
> targeting a quite Xenophobic user-base.
Yes, you'd have to be prepared. Obviously whatever approach we take will
involve some sort of as-needed loading. We'd also like to make the cost
as small as is feasible for the people who don't need timezones for the
whole world.
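As a rough illustration of as-needed loading, the sketch below (Python, purely hypothetical, not a proposal for the actual API) picks a region file for a zone name, fetches it at most once, and keeps it cached. The exceptions table uses the three cases mentioned above; the prefix table, the class name, and the injected fetch function are assumptions made for the example.

```python
# Zones that live outside the file their name prefix would suggest
# (the three cases discussed earlier in this message).
EXCEPTIONS = {
    "Pacific/Honolulu": "northamerica",
    "Pacific/Easter": "southamerica",
    "Asia/Vladivostok": "europe",
}

# Hypothetical default mapping from name prefix to region file.
DEFAULT_FILES = {
    "Pacific": "australasia",
    "Europe": "europe",
    "Asia": "asia",
}


class ZoneFileCache:
    """Fetch each region file only the first time a zone in it is needed."""

    def __init__(self, fetch):
        # fetch is any callable taking a file name and returning its text,
        # e.g. something that wraps an HTTP request.
        self._fetch = fetch
        self._loaded = {}

    def file_for(self, zone):
        if zone in EXCEPTIONS:
            return EXCEPTIONS[zone]
        prefix = zone.split("/", 1)[0]
        return DEFAULT_FILES[prefix]

    def load(self, zone):
        name = self.file_for(zone)
        if name not in self._loaded:
            self._loaded[name] = self._fetch(name)
        return self._loaded[name]
```

Someone who only ever touches Pacific/Honolulu would pay for the northamerica file and nothing else.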
>> Do the Olson files have the canonical names of all the timezones (e.g.
>> EST/EDT/ET? Or just America/New_York)
>
> Sort of. In the case you mention the zone name is "America/New_York" but
> has rules on how to format the abbreviated timezone to EST, EDT whatever
> depending on the time of year. But the way you find a zone is by the
> "America/New_York" type id.
As Bobby noted before, the 'backward' file also has a bunch of entries
like 'US/Eastern' and 'US/Pacific' that alias to the exemplar cities
(New_York, Los_Angeles).
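A minimal sketch of how those aliases could be resolved, assuming the usual "Link TARGET ALIAS" line layout. The sample lines are written in the style of the 'backward' file rather than copied from it, and both function names are invented for the example.

```python
BACKWARD_SAMPLE = """\
# Lines in the style of the 'backward' file: Link TARGET ALIAS
Link	America/New_York	US/Eastern
Link	America/Chicago	US/Central
Link	America/Los_Angeles	US/Pacific
"""


def parse_links(text):
    """Return a dict mapping each alias to the zone name it points at."""
    aliases = {}
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[0] == "Link":
            target, alias = fields[1], fields[2]
            aliases[alias] = target
    return aliases


def canonical_name(name, aliases):
    """Follow alias links until a real zone name is reached."""
    seen = set()
    while name in aliases and name not in seen:
        seen.add(name)  # guard against accidental alias loops
        name = aliases[name]
    return name
```

With that table in hand, a lookup for 'US/Pacific' can be redirected to America/Los_Angeles before any zone data is loaded.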
>> In general, I'm not sure either the "pure Olson" or the "server-fu"
>> approach is suitable for Dojo, but perhaps we want to end up somewhere
>> in between? It's hard to say without really understanding the problem
>> better, but I'm wondering if it would be worth massaging the data such
>> that your exceptions no longer exist and/or packaging it so that it
>> can be pulled out in the right granularity?
Some thoughts on preprocessing:
Server-side libraries like Ruby's TZInfo -- and it seems the Linux zone
data as well -- take the approach of pre-expanding offset changes for a
large date range for each timezone (like 1950-2050). This is great for a
local machine or a server-side solution where space and bandwidth are
not a concern (obviously). That seems to be sort of the standard way of
dealing with all the different dependencies in the files -- assume the
data is all available at once during pre-processing, and compile each
timezone into a package.
Pre-compiling the data into JSON packages for each timezone might work
-- and then you wouldn't have to do any calculations, you could look up
the value by date range. But the resulting files themselves would then
be pretty large because of all the expanded dates. The data in
/usr/share/zoneinfo on my Ubuntu machine here is 5.5MB. The
'definitions' directory for Ruby's TZInfo is 3.4MB.
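For illustration, looking up a value by date range in a pre-expanded package could be as simple as a binary search over the transition times. The package layout below is hypothetical (it is not TZInfo's format), and it only carries the two 2006 transitions for America/Los_Angeles, with times in UTC.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical pre-expanded package for one zone. A real one would list
# every transition in the covered range, which is where the size comes from.
LA_2006 = {
    "initial": {"offset_minutes": -480, "abbr": "PST"},
    "transitions": [
        {"at": "2006-04-02T10:00:00", "offset_minutes": -420, "abbr": "PDT"},
        {"at": "2006-10-29T09:00:00", "offset_minutes": -480, "abbr": "PST"},
    ],
}


def offset_at(package, utc_dt):
    """Return (offset in minutes, abbreviation) in effect at a naive UTC time."""
    starts = [datetime.fromisoformat(t["at"]) for t in package["transitions"]]
    # Number of transitions that have already happened at utc_dt.
    i = bisect_right(starts, utc_dt)
    entry = package["initial"] if i == 0 else package["transitions"][i - 1]
    return entry["offset_minutes"], entry["abbr"]
```

No rule evaluation happens at lookup time, which is the appeal; the cost is carrying one entry per transition for every year in the range.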
The amalgamated data would be really big (I guess it'd have to be a
package separate from Dojo proper), and you'd also have the downloads of
the individual files to the browser. Just as an example, again with
TZInfo, the America/Los_Angeles file is 11KB. It's mostly data -- but
does include some object code, so the JSON equivalent would be a bit
smaller. But 8KB or so still seems a tad hefty for a single exemplar city.
Zipping it up would make it a lot smaller, but you'd still be fetching
each TZ you need individually, which would result in some connection
overhead. Obviously a server-side solution could grab all the expanded
TZ info for all the needed timezones and serve them up as a single file.
For comparison's sake, it looks like with comments and empty lines
stripped, the 'northamerica' Olson file clocks in at 29KB uncompressed.
That same stripped file after gzipping is 5.5KB. That file and the
europe file are the two biggies and are about the same size -- so if
you're zipping your data, you'd get all of North America and Europe for
around 11K.
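If anyone wants to reproduce that kind of measurement, something along these lines should do it. This is a sketch, not the exact procedure used for the figures above; the function names are invented, and the result will vary with the Olson release and the gzip settings.

```python
import gzip


def strip_olson(text):
    """Drop comments and empty lines, keeping the leading whitespace
    that marks Zone continuation lines."""
    kept = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()
        if line:
            kept.append(line)
    return "\n".join(kept) + "\n"


def sizes(text):
    """Return (stripped size, gzipped stripped size) in bytes."""
    stripped = strip_olson(text).encode("utf-8")
    return len(stripped), len(gzip.compress(stripped))
```

Usage would be something like sizes(open("northamerica").read()), with the path pointing at wherever the Olson source files are unpacked.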
Matthew