[Dev] Chandler Internationalization .6 Specification is ready for review

Brian Kirsch bkirsch at osafoundation.org
Fri Jul 15 17:57:25 PDT 2005


Hi Grant, thanks for the feedback. Please see my comments inline.

Grant Baillie wrote:

> Hi, Brian
>
> I had a look at the 0.6 i18n spec: Overall, it's a very nice carving  
> of a set of tasks out of a big chunky rock of a problem :). Comments  
> are below; quotes are from the doc and the text in [] is an attempt  
> to identify which part of the doc I'm talking about.
>
> [Overview][Goals and Objectives]
>
>> The goal of this development cycle is to move Chandler from the 8- 
>> bit english only space to the world of internationalization and  
>> unicode.
>
>
> When you say "8-bit english", do you really mean "ascii" (a.k.a. "7- 
> bit")?

Yes and no. I am referring to the Python string type, which is 8-bit, so 
saying ASCII would technically not be correct. But really, yes, I am 
talking about ASCII.
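
To make the distinction concrete, here is a minimal sketch. It is written in Python 3 terms, where the 8-bit string type is spelled `bytes` (an assumption made for illustration), and the helper name `is_ascii` is hypothetical. An 8-bit string can carry byte values above 127, which ASCII cannot represent:

```python
def is_ascii(data):
    """Return True if every byte fits in 7 bits, i.e. is plain ASCII."""
    return all(byte < 0x80 for byte in data)


plain = b"cafe"                           # every byte is below 0x80
latin1 = bytes([0x63, 0x61, 0x66, 0xE9])  # "cafe" with e-acute, as Latin-1

print(is_ascii(plain))    # 7-bit clean data
print(is_ascii(latin1))   # 0xE9 needs the eighth bit
print(ascii(latin1.decode("latin-1")))
```

The same bytes only become meaningful text once an encoding is named, which is the gap the move to unicode is meant to close.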

>
> Nit-picky naming question: Should it be i18nManager or I18Manager? I  
> guess it's hard to distinguish "I" and "1" in sans-serif fonts.
>
OK, sure, I18NManager it is. Also FYI, the ResourceLoader is being renamed 
to I18NLoader, since going forward it will be used for more than 
just loading media resources.

> [0.6 Strategy]
> Should maybe have a section about making our wx dialogs localizable.  
> Would this be done by having localizable .xrc files, and if so, are  
> there ways of making sure UI elements don't get "lost in translation"?
>
I think this is an important point, but I am not sure it needs to be 
addressed in .6. Off the top of my head, I would guess it is going to 
require a custom .xrc file for each locale that requires layout 
alterations. This is going to be a major pain, not to mention the 
enforcement issues.
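
As a rough sketch of what per-locale .xrc files could mean in practice, the lookup below prefers a locale-specific file and falls back to the default layout. The naming scheme (`prefs.de_DE.xrc`, `prefs.de.xrc`) and the function names are assumptions for illustration only; nothing like this is in the spec.

```python
import os


def xrc_candidates(base_path, locale_name):
    """List .xrc paths to try, most specific first, default last."""
    root, ext = os.path.splitext(base_path)
    candidates = []
    if locale_name:
        # Full locale first, e.g. prefs.de_DE.xrc
        candidates.append("%s.%s%s" % (root, locale_name, ext))
        language = locale_name.split("_")[0]
        if language != locale_name:
            # Then the bare language, e.g. prefs.de.xrc
            candidates.append("%s.%s%s" % (root, language, ext))
    candidates.append(base_path)
    return candidates


def pick_xrc(base_path, locale_name, exists=os.path.exists):
    """Return the first candidate that exists, else the default path."""
    for candidate in xrc_candidates(base_path, locale_name):
        if exists(candidate):
            return candidate
    return base_path


print(xrc_candidates("dialogs/prefs.xrc", "de_DE"))
```

Only locales that actually need layout changes would ship an override, which keeps the number of files to maintain (and to keep in sync) as small as possible.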


> [The 0.6 Strategy][CPIA]:
>
>> 3. Ensure that WxWidgets correctly converts keyboard input commands  
>> to displayable glyphs in the correct language and perform character  
>> set conversion when incoming textual data is not unicode
>> 4. Ensure that all displayable blocks render multi-byte unicode  
>> correctly.
>
>
> It might be worth adding to 4: wxWidgets needs to be able to display  
> general multi-byte unicode correctly (not just multibyte unicode  
> entered via an input manager). It would be interesting to see how,  
> say, email messages with mixed R2L and L2R text will display on all  
> platforms.


+1
R2L what is that? Ha hah

My guess is Chandler would barf, but R2L is not a priority for 1.0 
anyway. It would be great to have, though.
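
For anyone who wants to put together test data for this, a small sketch using only the standard `unicodedata` module can flag strings that mix both directions. The function name `directions` and the "L2R"/"R2L" labels are hypothetical, and this only looks at strong character classes, not the full bidirectional algorithm:

```python
import unicodedata


def directions(text):
    """Return the set of strong text directions found in the string."""
    found = set()
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            found.add("L2R")
        elif bidi_class in ("R", "AL"):  # Hebrew is R, Arabic is AL
            found.add("R2L")
    return found


hebrew = "\u05e9\u05dc\u05d5\u05dd"  # Hebrew word "shalom"
mixed = "Hello " + hebrew

print(sorted(directions(mixed)))
```

A message body that reports both directions would be a good candidate for the cross-platform display check Grant describes.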

>
> <<<On reading further, this seems to end up being one of John  
> Anderson's tasks:
>
>> wxWidgets that display text are wrapping native Operating System  
>> widgets
>
>
> Is this realistic? e.g. aren't our table & grid controls non-native  
> (I'm no wx expert, so I could easily be off base here)?>>>
>
Good point I will follow up with the CPIA team and see what the best 
option is here.



> From [Tasks for Katie]:
>
>> Target languages of review could include Chinese, German, and Hebrew.
>
>
> Hebrew (or Arabic for that matter) is a good one, since it will  
> unearth a large can o worms... Our UI layout is pretty much hard- 
> coded to be L2R: (e.g. positioning of (sidebar, summary, detail)  
> views, position of icons in sidebar, alignment of text in sidebar,  
> position of labels in detail view, alignment of text in detail view).
>
> It would be good to call out whether this kind of layout  
> configurability is or is not a goal for 0.6 (my guess is not).


Complex layout is not a goal for .6, so I will make that clearer in the 
spec.

Hebrew is merely for the design team to review. The point is to get 
people thinking about these larger issues now, even though they will not 
be tackled in the 1.0 timeframe. R2L also will not be tackled, but German 
or Chinese might.



>
>
> [General]
>
> Somewhere, we probably need to point out that there is a lot of QA  
> work involved in verifying that Chandler works well with input  
> managers for different languages (especially given that these vary by  
> platform, too). Maybe this would be an area where outside volunteers  
> could help out.
>
Agreed. I was waiting for feedback on the i18n proposal before pointing 
out specific QA tasks to make sure there were not going to be any major 
proposal changes.

QA will have a very important job in Internationalization.  Both manual 
and automated tasks will need to be performed on each target platform 
and for each target language. It is not a trivial task by any means. 
Volunteers are certainly needed.





> <<<I wrote this before seeing Andrea's page, which is a much more  
> comprehensive outline of QA issues here>>>
>
> [Some questions about the bigger proposal doc]
>
> Are we going to output # c-format (or something similar) in .pot  
> files? In projects I've worked on in the past, inconsistent  
> translated format strings have caused a lot of grief (unexpected  
> raises, or crashes in C), and it would be good to be able to avoid this.


Since we are using the Python gettext API, #c-format should not come 
into play. Python gettext does not put any #c-format comments in .pot 
files. I cannot think of a need for these flags at this time. Can you?
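
That said, the underlying worry (a translation whose format arguments no longer match the original) could be caught by a small check at build or QA time. A rough sketch, assuming messages use indexed placeholders such as {0} or {1,date}; the names `placeholder_indexes` and `placeholders_match` are hypothetical, and apostrophe quoting and nested formats are deliberately ignored:

```python
import re

# Matches {0}, {1,time}, {0,number,integer} and captures the argument index.
_PLACEHOLDER = re.compile(r"\{(\d+)(?:,[^{}]*)?\}")


def placeholder_indexes(message):
    """Return the set of argument indexes referenced by a message."""
    return {int(match.group(1)) for match in _PLACEHOLDER.finditer(message)}


def placeholders_match(msgid, msgstr):
    """True if the translation uses exactly the same argument indexes."""
    return placeholder_indexes(msgid) == placeholder_indexes(msgstr)


original = "At {1,time} on {1,date}, there was {2} on planet{0,number,integer}."
reordered = "Auf Planet {0,number,integer} gab es {2} um {1,time} am {1,date}."
broken = "Auf Planet {0} gab es {3}."

print(placeholders_match(original, reordered))
print(placeholders_match(original, broken))
```

Reordering arguments is fine and expected in translations; only a missing or extra index is reported as a mismatch.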

>
> Lastly, because I'm a gettext newbie: Many translations require some  
> context (e.g. in the case of formatted strings, what the arguments  
> are). Is the gettext approach that translators figure that out from  
> the source file? (Ours was more that you'd add a comment in the  
> equivalent of the .pot file).
>

We are going to be using the PyICU MessageFormat syntax and 
MessageFormat class for translations. As such, the syntax is pretty 
explicit about the type of each argument.

MessageFormat.formatMessage(
    _("At {1,time} on {1,date}, there was {2} on planet{0,number,integer}."),
    args)

So the .po would contain:
msgid "At {1,time} on {1,date}, there was {2} on planet{0,number,integer}."


The gettext API does not provide any mechanism for putting comments inline 
via code, but it would be nice to have additional comments, for example:


# Argument zero is the planet number, as an integer.
# Argument one is a PyICU Date.
# Argument two is a unicode string.
msgid "At {1,time} on {1,date}, there was {2} on planet{0,number,integer}."

This would have to be custom code. It is certainly worth exploring at a 
later date post .6.
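
To give a feel for how small that custom code might be, here is a rough sketch of an extractor that copies the `#` comment lines directly above a `_()` call into the output. Everything here is an assumption for illustration: the function names are hypothetical, only a single string literal argument is handled, and the `#.` prefix follows the GNU gettext convention for extracted comments.

```python
import ast
import io
import tokenize


def extract_messages(source):
    """Return [(comments, msgid)] for each _("...") call in the source."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    pending = []   # comment lines seen since the last complete statement
    results = []
    for i, tok in enumerate(tokens):
        if tok.type == tokenize.COMMENT:
            pending.append(tok.string.lstrip("#").strip())
        elif tok.type == tokenize.NEWLINE:
            pending = []   # a statement ended, so earlier comments are stale
        elif (tok.type == tokenize.NAME and tok.string == "_"
              and i + 2 < len(tokens)
              and tokens[i + 1].string == "("
              and tokens[i + 2].type == tokenize.STRING):
            msgid = ast.literal_eval(tokens[i + 2].string)
            results.append((list(pending), msgid))
    return results


def to_po(entries):
    """Render the extracted entries as .po-style text."""
    lines = []
    for comments, msgid in entries:
        lines.extend("#. " + comment for comment in comments)
        escaped = msgid.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('msgid "%s"' % escaped)
        lines.append('msgstr ""')
        lines.append("")
    return "\n".join(lines)


SAMPLE = '''
# Argument zero is the planet number, as an integer.
# Argument two is a unicode string.
text = _("There was {2} on planet{0,number,integer}.")
other = _("No comment here")
'''

print(to_po(extract_messages(SAMPLE)))
```

Using the tokenizer rather than regular expressions means comments inside string literals are not mistaken for translator notes.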


Have a good weekend,
Brian

> --Grant
>
> Grant Baillie
> Open Source Applications Foundation
> http://www.osafoundation.org
>
>
>
 

-- 
Brian Kirsch - Email Framework Engineer
Open Source Applications Foundation
543 Howard St. 5th Floor
San Francisco, CA 94105
(415) 946-3056
http://www.osafoundation.org


