[Cosmo-dev] Allowable Characters in Cosmo Usernames

Grant Baillie grant at osafoundation.org
Thu May 18 08:52:42 PDT 2006


On May 18, 2006, at 8:28, Todd Agulnick wrote:

> On 5/18/2006 7:49 AM, Brian Moseley wrote:
>> in the spirit of being as permissive as possible, however, it might be
>> okay for us to explicitly allow all of these characters, including
>> '/', and field any support questions that might come up from
>> mycrazyusername at here&there.
>
> Easy for you to say! You're not going to have to field those support  
> questions. ;-) As someone who *does* field 'em, I'm very much in  
> favor of disallowing characters that are going to cause problems for  
> unwitting users. Allowing anyone to have "&" in their username  
> isn't worth even a single user writing to say that they can't  
> access their account. I have my own front-end for account creation,  
> so I can be more restrictive than Cosmo, but I'd prefer that Cosmo  
> lock down these characters anyway because users can always go  
> around my front end.
>
> On the more generic question of non-ASCII characters in usernames,  
> are there good examples of online services that allow them? My  
> quick scan didn't turn up any and I wonder whether that's due to a  
> preponderance of US-centric, international-unaware development --  
> or whether there's some other problem lurking here. I, for one, am  
> nervous (again) about supporting users whose usernames I can't read  
> (or can't render because I don't have the fonts). And my  
> preliminary tests with my client failed when I tried using  
> non-ASCII characters (probably because the encoding happened at the  
> wrong level, or maybe because the text was encoded as UTF-16 instead  
> of UTF-8), so there is some non-trivial complexity here.
>
> Either way, definitive answers to these questions would be really  
> useful. Has anyone had experience with a service that allowed  
> non-ASCII usernames? Are there other issues that we haven't foreseen?

I don't have experience with non-ASCII usernames in Web services, but  
the email protocol world has pondered these issues: the specs for the  
various SASL mechanisms (used in IMAP/SMTP/POP authentication) have  
been updated to support non-ASCII usernames and passwords.

In their case, and probably yours, I believe that saying "use  
Unicode, but encode with UTF-8" is not enough. To avoid collisions  
(where you have strings that look equivalent to the user, but are not  
identical character-for-character), you need to specify some kind of  
Unicode normalization, and also weed out, or remap, undesirable  
characters.
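
To make the collision problem concrete, here is a minimal sketch using
Python's standard unicodedata module. The function name
canonical_username is hypothetical, and NFKC is only an assumed choice
of normalization form; none of this is Cosmo's actual code.

```python
import unicodedata


def canonical_username(name: str) -> str:
    # Hypothetical helper: apply NFKC so that equivalent spellings
    # of the same visible text compare equal.
    return unicodedata.normalize("NFKC", name)


composed = "jos\u00e9"      # "e with acute" as a single code point
decomposed = "jose\u0301"   # plain "e" followed by a combining acute accent

# The two strings render identically but differ code point for code point.
raw_equal = composed == decomposed

# After normalization they compare equal, so they cannot be registered
# as two separate accounts.
normalized_equal = canonical_username(composed) == canonical_username(decomposed)
```

Without the normalization step, both spellings could exist as distinct
usernames even though a user could not tell them apart.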

The IETF got started down this road with IDNA (essentially, DNS with  
Unicode support). There, it was important to prevent phishers from  
registering, say, Kmart.com, where the "K" is "Kelvin sign", or  
inserting invisible spacing characters in domain names, etc. I'm not  
sure what the potential attacks are for Cosmo usernames and passwords,  
but they're probably there :).
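
The two tricks mentioned above can be checked for with a short sketch,
again assuming Python's unicodedata module. The helper
has_invisible_characters and the constant names are hypothetical, and
the category check is an illustration rather than a complete filter.

```python
import unicodedata

KELVIN_SIGN = "\u212a"        # looks like an ordinary capital "K"
ZERO_WIDTH_SPACE = "\u200b"   # renders as nothing at all


def has_invisible_characters(name: str) -> bool:
    # "Cf" is the Unicode category for format characters (zero-width
    # space, joiners, direction marks); "Cc" covers control characters.
    return any(unicodedata.category(ch) in ("Cf", "Cc") for ch in name)


# The Kelvin sign is a different code point from "K", so the raw strings
# differ, but normalization maps it back to the ordinary letter.
lookalike = KELVIN_SIGN + "mart"
folded = unicodedata.normalize("NFKC", lookalike)

# A name with a hidden zero-width space is flagged; a plain one is not.
hidden = has_invisible_characters("k" + ZERO_WIDTH_SPACE + "mart")
plain = has_invisible_characters("kmart")
```

Normalization handles the lookalike letter, while invisible characters
need a separate rule: reject the name or strip them before storing it.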

--Grant


