[cosmo-dev] More Unicode
Travis Vachon
travis at osafoundation.org
Mon Dec 3 10:57:59 PST 2007
On Dec 2, 2007, at 8:36 AM, Brian Moseley wrote:
> On Nov 30, 2007 2:56 PM, Travis Vachon <travis at osafoundation.org>
> wrote:
>
>> In any case, my gut feel
>> is that it would be easier to fix the client for Unicode support than
>> to change the server to blacklist characters.
>
> how would you do that?
We can modify our UTF-8 encoding logic to detect UTF-16 surrogate
pairs and encode each pair as a single code point, instead of encoding
each half of the pair as a character in its own right.
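As a rough illustration of that change, here is a minimal sketch of a
surrogate-aware encoder. The name encodeUtf8 is hypothetical and this is
not Cosmo's actual encoding code. It assumes the input is an ordinary
JavaScript string (UTF-16 code units) and, as a further assumption,
replaces unpaired surrogates with U+FFFD.

```typescript
// Hypothetical helper for illustration; not taken from the Cosmo client.
// Encodes a JavaScript (UTF-16) string as an array of UTF-8 byte values.
function encodeUtf8(text: string): number[] {
  const bytes: number[] = [];
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    let codePoint = unit;

    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: it must be followed by a low surrogate.
      const next = i + 1 < text.length ? text.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        // Combine the pair into one code point at or above U+10000.
        codePoint = 0x10000 + ((unit - 0xd800) << 10) + (next - 0xdc00);
        i++; // the low surrogate has been consumed
      } else {
        codePoint = 0xfffd; // assumption: replace unpaired surrogates
      }
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      codePoint = 0xfffd; // low surrogate with no preceding high surrogate
    }

    if (codePoint < 0x80) {
      bytes.push(codePoint);
    } else if (codePoint < 0x800) {
      bytes.push(0xc0 | (codePoint >> 6), 0x80 | (codePoint & 0x3f));
    } else if (codePoint < 0x10000) {
      bytes.push(
        0xe0 | (codePoint >> 12),
        0x80 | ((codePoint >> 6) & 0x3f),
        0x80 | (codePoint & 0x3f)
      );
    } else {
      bytes.push(
        0xf0 | (codePoint >> 18),
        0x80 | ((codePoint >> 12) & 0x3f),
        0x80 | ((codePoint >> 6) & 0x3f),
        0x80 | (codePoint & 0x3f)
      );
    }
  }
  return bytes;
}
```

With this approach a character such as U+1F600, which a JavaScript
string stores as the pair D83D DE00, comes out as one four-byte sequence
rather than two three-byte sequences.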
To really be complete, however, we'd need to do this for all output
(that is, all text in server requests) generated on the client. This
would probably be a pretty significant amount of work, so on second
thought I think it might be worth looking at just how much work we'd
need on the server.
>
>
> re validation - we only have to validate the syntax of usernames when
> writing them into the database. we don't have to validate them when
> they are used in queries. the only code change we'd have to make is
> adding a regex to the ui validators and to the User model. explaining
> this restriction to users would probably be more complicated.
Yeah, explaining the restriction to users and enforcing it client side
is something required by both of our proposals.
It's important to note that this problem isn't limited to usernames.
Currently, characters with code points at U+10000 and above work fine
in the desktop client and sync to the server. Bringing this data up in
the web UI and saving it corrupts the data. I think that even if we
decide to limit usernames to U+0000-U+FFFF (the BMP) we'll still need
to fix this bug, which will require all of the difficult work that we'd
need to do to support U+10000 and above in usernames.
All that said, I think I'm leaning toward the following:
1) Restrict usernames to U+0000-U+FFFF server side, and add client-side
logic to explain and enforce the restriction
2) Create a 1.0 bug to handle characters at U+10000 and above in data
correctly.
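
For the client-side half of item 1, a check along these lines might be
enough. This is only a sketch: validateUsername and the message text are
hypothetical, not existing Cosmo code. It relies on the fact that a
JavaScript string contains a character outside U+0000-U+FFFF only if it
contains a surrogate code unit.

```typescript
// Hypothetical validator for illustration; not taken from the Cosmo client.
// Matches any UTF-16 surrogate code unit. Without the "u" flag the regex
// works on code units, so any character at U+10000 or above will match.
const SURROGATE_UNIT = /[\uD800-\uDFFF]/;

// Returns an error message for the user, or null if the username is
// limited to the BMP (U+0000-U+FFFF).
function validateUsername(username: string): string | null {
  if (SURROGATE_UNIT.test(username)) {
    return "Usernames may only contain characters in the range U+0000-U+FFFF.";
  }
  return null;
}
```

The server would still need its own check when writing usernames to the
database; this only covers explaining and enforcing the rule in the UI.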
-Travis