[Chandler-dev] Unit and Functional Tests updated to handle Unicode
bkirsch at osafoundation.org
Wed May 31 17:05:47 PDT 2006
Yes, the mbcs file encoding that is the default on Windows is the issue.
TestCrypto passes when the utf-8 charset encoding is used instead of sys.getfilesystemencoding().
I have already made a patch for all OS calls to use utf-8 instead of sys.getfilesystemencoding.
The utf-8 encoding is supported on all three platforms.
I still need to test further, though, to confirm that this is the right move.
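
As a rough illustration of why the narrower encoding trips on some characters and utf-8 does not, here is a minimal sketch for a current Python 3 interpreter. It assumes cp1252 as a stand-in for mbcs (the mbcs codec exists only on Windows, and its real behavior may differ, for example by substituting '?'). The helper name can_encode and the sample strings are made up for this example.

```python
# Sketch only: cp1252 stands in for the Windows "mbcs" codec, which is
# available only on Windows and may not behave identically.
def can_encode(text, encoding):
    """Return True if text survives an encode/decode round trip."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        return False

# Two characters inside Latin-1 and one well outside it (hypothetical samples).
samples = [u"test\u00C5", u"test\u00FC", u"test\u4E2D"]

for s in samples:
    # ascii() escapes non-ASCII characters so the output is console-safe.
    print(ascii(s), "cp1252:", can_encode(s, "cp1252"),
          "utf-8:", can_encode(s, "utf-8"))
```

Only the last sample fails under cp1252 here, which is consistent with u'\u00C5' and u'\u00FC' working on Windows while other code points do not.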
>> Also I see a problem with the way uw is implemented, because it randomly
>> selects from 25 code points and inserts them in the strings, meaning
>> that you will likely see intermittent test failures (for example, on
>> Windows u'\u00C5' and u'\u00FC' work fine, but the rest don't).
I disagree. I like the fact that it is random and that introducing characters
can cause it to fail :) That models the real world to me.
I want to select from known Unicode characters that cause issues in Chandler so that we can resolve the
However, it would be nice to have a clear means to log which characters caused a problem.
Printing the erroneous value to log (after encoding it of course) should provide enough of a trace.
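
A minimal sketch of that kind of trace, assuming Python 3 and the standard logging module. The logger name and the helpers loggable and log_failure are hypothetical and not part of Chandler.

```python
import logging

# Logger name is an assumption made for this example.
log = logging.getLogger("i18n.tests")

def loggable(value):
    """Escape non-ASCII characters so the value is safe for any log handler."""
    return value.encode("ascii", "backslashreplace").decode("ascii")

def log_failure(value, error):
    """Record the escaped value alongside the error that it triggered."""
    log.error("failed on %s: %s", loggable(value), error)

# Example: the escaped form shows exactly which code points were involved.
print(loggable(u"test\u00C5\u4E2D"))
```

The escaped string keeps each code point visible in the log, so a failing run identifies the offending character even when the log file itself is plain ASCII.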
>> uw() should be deterministic.
Can you give an example of what you are thinking? The previous test for TestCrypto was deterministic, i.e. it always used the \u00FC character, which passed testing. It was only after introducing the new characters from uw that this mbcs Windows error was exposed.
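
One possible reading of "deterministic" that would still exercise varied characters is a seeded generator, so that a failing run can be repeated from its seed. The sketch below is only an assumption about what that could look like: uw_seeded and CODE_POINTS are hypothetical, the pool is not the real set of 25 code points, and appending a single character is a simplification of how uw inserts them.

```python
import random

# Hypothetical pool; the real uw() draws from its own 25 code points.
CODE_POINTS = [u"\u00C5", u"\u00FC", u"\u0414", u"\u4E2D"]

def uw_seeded(text, seed):
    """Append one pseudo-randomly chosen code point, repeatable per seed."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    return text + rng.choice(CODE_POINTS)

# The same seed always yields the same string, so a failure can be replayed.
first = uw_seeded(u"test", 42)
second = uw_seeded(u"test", 42)
print(first == second)
```

Logging the seed with each test run would keep the variety of the random approach while making any single failure reproducible.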
Brian Kirsch - Cosmo Developer / Chandler Internationalization Engineer
Open Source Applications Foundation
543 Howard St. 5th Floor
San Francisco, CA 94105
Heikki Toivonen wrote:
>Brian Kirsch wrote:
>> import sys, os
>> from i18n.tests import uw
>> u = uw("test")
>> path = os.path.join(os.path.dirname(__file__),
>In the tests I did this does not work - os.makedirs will fail. This is
>on Windows, and is the reason why unit tests currently fail on Windows.
>Did you figure this out?
>Also I see a problem with the way uw is implemented, because it randomly
> selects from 25 code points and inserts them in the strings, meaning
>that you will likely see intermittent test failures (for example, on
>Windows u'\u00C5' and u'\u00FC' work fine, but the rest don't). uw()
>should be deterministic.