[Dev] Re: Chandler Internationalization .6 Specification is ready for review

Nicholas Bastin nbastin at mac.com
Fri Jul 15 14:19:53 PDT 2005


On Jul 15, 2005, at 5:06 PM, Brian Kirsch wrote:

> Hi Andrea,
> Thanks for the feedback. Comments are included inline:
>
>
> I18n G.A.L. wrote:
>
>> Brian, et al,
>> I'm not able to do a whole lot online, but I've read over the plan.  
>> Here are my comments:
>>  1.  In general, it's best to specify the character encoding scheme 
>> (or form) directly.  "Unicode" can mean UTF-8, UTF-16 (BE or LE), or 
>> UTF-32.  I recommend using UTF-8 wherever possible, unless working 
>> within a 16-bit oriented environment (such as Java), where I'd 
>> recommend UTF-16.
>>
>
> When I write "unicode" in lower case, I am talking about Python's
> unicode object, which can be UTF-16 (BE or LE) or UTF-32 depending on
> the platform it was compiled on.

It's actually UCS-2 or UCS-4, which, at least for the time being, makes
it impossible to build a 100% functional ICU wrapper on top of the
built-in Python unicode object (see the python-dev archives from a few
months back).

Also, I would recommend using UTF-16 as your standard encoding if at
all possible: it avoids lots of nasty encoding problems, and it is a
nice space compromise for almost any language whose characters aren't
a subset of Latin-1.
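
To make the space argument concrete, here is a rough sketch that
compares encoded sizes for a few sample strings. The helper name
encoded_sizes and the sample strings are my own illustration, not
anything from the spec, and the code assumes a current Python 3
interpreter rather than the 2.x builds discussed above.

```python
import sys

def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no BOM counted."""
    # "utf-16-le" is used so that no byte-order mark is prepended.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

# Hypothetical sample strings, chosen only to cover different scripts.
samples = {
    "ascii": "hello",
    "greek": "\u03b1\u03b2\u03b3\u03b4\u03b5",
    "japanese": "\u3053\u3093\u306b\u3061\u306f",
    "musical symbol (outside the BMP)": "\U0001D11E",
}

for name, text in samples.items():
    utf8_len, utf16_len = encoded_sizes(text)
    print(f"{name}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")

# Old narrow (UCS-2) builds reported 0xFFFF here; wide (UCS-4) builds
# and every Python from 3.3 onward report 0x10FFFF.
print("sys.maxunicode =", hex(sys.maxunicode))
```

ASCII text doubles in size under UTF-16 (5 bytes vs. 10), the Greek
sample comes out even (10 vs. 10), and the Japanese sample is smaller
in UTF-16 (15 vs. 10). The last sample needs a surrogate pair, so one
code point takes two 16-bit units in UTF-16, which is exactly the case
a pure UCS-2 representation cannot express as a single character.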

--
Nick


