[Chandler-dev] Re: Request for Comments on Chandler Internationalization / Egg Integration Proposal

Brian Kirsch bkirsch at osafoundation.org
Tue Jul 25 13:56:22 PDT 2006

Hi Phillip,
See comments inline.

Phillip J. Eby wrote:

> At 02:23 PM 7/21/2006 -1000, Brian Kirsch wrote:
>> Hello,
>> Attached is a proposal for the Egg I18n API as well as how to 
>> integrate the Egg I18n API in to Chandler.
>> The Egg I18n API's intent is to provide an easy and robust means to
>> localize Python applications and
>> would be distributed as a separate module / egg from Chandler.
>> The majority of the work detailed in the proposal has already been done.
>> I welcome comments / suggestions on the EggResourceManager API as 
>> well as on my proposal on how
>> to integrate this functionality in to the current Chandler I18n 
>> Architecture.
> Comments in no particular order:
> * getResourceAsString and getResourceAsLines don't document how 
> encoding is handled; must the underlying resource be ASCII, UTF-8, ...?

Yes, good point: the encoding is important. I was trying to finish the 
proposal on Friday, so I did skip over
a few things which I was planning on adding documentation for during the 
implementation phase.

My initial thoughts were to require all resources to be UTF-8, but I do 
want the EggResourceManager API to
be as general as possible.

So here is what I would like to do:

1. Add an additional argument to the getResourceAsString and 
getResourceAsLines methods that specifies the
encoding of the resource. The default value will be 'UTF-8'.

getResourceAsString(self, domain, name, locale=None, encoding='UTF-8'):
getResourceAsLines(self, domain, name, locale=None, encoding='UTF-8'):

The encoding parameter must match the actual encoding of the resource 
file or
a UnicodeDecodeError will be raised by the methods.

If an invalid or unsupported encoding is specified, a LookupError will be 
raised by the methods.

The getResourceAsStream method will return the exact bytes of the 
resource file.
Any encoding conversions will need to be handled by the method caller.
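
A minimal sketch of the encoding behavior described above, written in
modern Python for illustration. The function names and the raw_bytes
parameter are hypothetical stand-ins, not the EggResourceManager API
itself, and dropping line endings in the "lines" variant is an
assumption.

```python
import io


def get_resource_as_string(raw_bytes, encoding="UTF-8"):
    """Decode a resource's raw bytes with the caller-supplied encoding."""
    # bytes.decode raises LookupError for an unknown codec name and
    # UnicodeDecodeError when the bytes are not valid in that encoding.
    return raw_bytes.decode(encoding)


def get_resource_as_lines(raw_bytes, encoding="UTF-8"):
    """Decode the resource and split it into lines (assumption: no line endings kept)."""
    return get_resource_as_string(raw_bytes, encoding).splitlines()


def get_resource_as_stream(raw_bytes):
    """Return the exact bytes, undecoded, wrapped in a file-like object."""
    return io.BytesIO(raw_bytes)
```

With this shape, a Latin-1 resource read with the default encoding
fails loudly with a UnicodeDecodeError instead of silently producing
wrong text, and the stream variant leaves all decoding to the caller.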

> * getResourceAsStream claims that it is always a StringIO returned; 
> this seems like an implementation detail that shouldn't be exposed, 
> and should instead be described as a "file-like" object per Python 
> convention.

+1, I think this is a good suggestion.
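
To illustrate why documenting the return value as "file-like" is
enough, here is a small sketch of caller code that relies only on a
read() method. The names read_text and FakeStream are hypothetical and
exist only for this example.

```python
import io


def read_text(stream, encoding="UTF-8"):
    """Read all bytes from any file-like object and decode them."""
    # Only stream.read() is used, so the concrete type does not matter.
    return stream.read().decode(encoding)


class FakeStream:
    """A minimal file-like object that is not a StringIO or BytesIO."""

    def __init__(self, data):
        self._data = data

    def read(self):
        data, self._data = self._data, b""
        return data


print(read_text(io.BytesIO(b"hello")))
print(read_text(FakeStream(b"hello")))
```

Both calls go through the same code path, so the implementation can
change the concrete stream type later without breaking callers.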

> * I would suggest making the default filename use a .ini extension 
> rather than .info; .info has an existing meaning and .ini more closely 
> reflects the file format.

Yes, actually I was already thinking about doing this.

+1 the file name will be renamed to resources.ini

> * I would like to see a more prescriptive approach to domains. That 
> is, instead of saying "you can use whatever domain you like", I think 
> we should tell people what domains to use and precisely how to 
> formulate a domain string. While this is not an issue for the 
> underlying implementation (which need only handle mechanism, not 
> policy), I think that a clear domain policy is essential to making the 
> overall "ecosystem" work and minimizing the number of decisions that 
> developers have to make and get right.
> In any case, you can't *really* use "any unique string" as a domain 
> name. Based on implementation constraints, the string may not contain:
> * line feeds
> * closing square bracket ("]")
> * double colons ("::")
> because these would interfere with parsing of the .ini file. So, I 
> think we should spell out exactly what you can and can't have, and 
> make inclusion of the source project name mandatory, adding a dot and 
> another name if the project includes more than one domain. So if you 
> have a project like "Chandler-FeedsPlugin", then the domain should be 
> "Chandler-FeedsPlugin", or if there is more than one domain for the 
> project, you would have "Chandler-FeedsPlugin.foo", 
> "Chandler-FeedsPlugin.bar", etc.

It was my intent to put enforcement of specific domain names at the 
Chandler level and leave the
EggResourceManager API as generic as possible with the exception of the 
token parsing constraints you mentioned
above (line feed etc).

However, I am certainly not opposed to adding a "prescriptive" approach 
to domains at the
EggResourceManager API level.
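
The parsing constraints on domain strings listed in the quoted text
(no line feeds, no closing square bracket, no double colon) could be
checked with something like the sketch below. The function name
validate_domain and the choice of ValueError are assumptions made for
illustration; the proposal does not specify them.

```python
# Tokens that would interfere with parsing the .ini file, per the thread.
FORBIDDEN_TOKENS = (
    ("\n", "line feed"),
    ("]", "closing square bracket"),
    ("::", "double colon"),
)


def validate_domain(domain):
    """Return the domain unchanged, or raise ValueError if it cannot be parsed."""
    for token, label in FORBIDDEN_TOKENS:
        if token in domain:
            raise ValueError(f"domain {domain!r} may not contain a {label}")
    return domain
```

Names such as "Chandler-FeedsPlugin" and "Chandler-FeedsPlugin.foo"
pass this check. A policy that also requires the domain to match a
project name would need an extra lookup that is not sketched here.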

I do think, as you suggested, that projects and domains should be 
equivalent to prevent confusion.

Since translations and implementation eggs can be distributed 
separately, as is the case with Chandler,
the limitation should be that a domain must match a project name. 
Meaning the domain does not have to match
*your* project name.

So taking the Chandler-FeedsPlugin example you use above.

The resource ini in the Chandler-FeedsPlugin would contain an entry 
such as:

someResource = somePathToResource

The French translation for the Chandler-FeedsPlugin would be in a 
project named
Chandler-FeedsPlugin.fr and have a resource ini entry such as:

[Chandler-FeedsPlugin::fr] #this points to the project name of another egg
someResource = somePathToFrenchResource
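
As a rough illustration of how a "[domain::locale]" section header
could be read, the sketch below uses the standard library's
configparser as a stand-in. It is not the parser used by the proposal,
and load_resources and SAMPLE_INI are hypothetical names.

```python
import configparser

# Sample ini text based on the French translation entry above.
SAMPLE_INI = """\
[Chandler-FeedsPlugin::fr]
someResource = somePathToFrenchResource
"""


def load_resources(ini_text):
    """Map (domain, locale) pairs to their resource-name -> path entries."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, e.g. "someResource"
    parser.read_string(ini_text)
    table = {}
    for section in parser.sections():
        # Split "domain::locale" at the first double colon.
        domain, _, locale = section.partition("::")
        table[(domain, locale)] = dict(parser[section])
    return table


resources = load_resources(SAMPLE_INI)
print(resources[("Chandler-FeedsPlugin", "fr")]["someResource"])
```

Splitting on "::" is also why a domain string itself may not contain a
double colon.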

There is still the issue of dependencies and versioning between 
localizations and
implementation distributions.

I would like to talk with you further on the best way to establish these 

Thanks for your comments and feedback,


Brian Kirsch 
Internationalization Architect/ Mail Service Engineer
Open Source Applications Foundation
543 Howard Street 5th Floor
San Francisco, CA 94105
