[cosmo-dev] Basic Auth is sooo 1996

Travis Vachon travis at osafoundation.org
Tue Nov 27 18:29:56 PST 2007


Hi everyone

I've been wrestling with bug 11100 (unicode characters in username
cause login to fail) for a day or so and wanted to let folks know how
the work is going and solicit some thoughts on the best next steps.

I've been able to iron out all of the issues we had regarding exotic  
characters (that is, unicode characters that are not url-safe) in urls  
and url-encoded post content, allowing users with exotic characters in  
their usernames to log in. Unfortunately, the user experience goes  
downhill from there.

To fully understand why, I need to wax technical on HTTP Basic auth for
a second, so please bear with me (note that some of this is a repeat of
an email I sent earlier in the week, but the denouement will be
different):

RFC 2617 specifies the value of the Authorization header for Basic  
authentication as:
       credentials = "Basic" basic-credentials
       basic-credentials = base64-user-pass
       base64-user-pass  = <base64 encoding of user-pass>
       user-pass   = userid ":" password
       userid      = *<TEXT excluding ":">
       password    = *TEXT

where TEXT is defined in RFC 2616 as follows:

    The TEXT rule is only used for descriptive field contents and values
    that are not intended to be interpreted by the message parser. Words
    of *TEXT MAY contain characters from character sets other than ISO-
    8859-1 [22] only when encoded according to the rules of RFC 2047
    [14].

        TEXT           = <any OCTET except CTLs,
                         but including LWS>
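
As a rough sketch of the grammar above (not code from Cosmo; the function
name is made up for illustration), building the credentials for a userid
and password that fit in ISO-8859-1 looks something like this in Python.
The "Aladdin" / "open sesame" pair is the example from RFC 2617 itself.

```python
import base64


def basic_credentials(userid: str, password: str) -> str:
    """Build the Authorization header value per the RFC 2617 grammar.

    Hypothetical helper for illustration only. It assumes both strings
    can be represented in ISO-8859-1, the only charset TEXT allows
    without RFC 2047 encoding.
    """
    if ":" in userid:
        raise ValueError("userid must not contain ':'")
    # user-pass = userid ":" password
    user_pass = f"{userid}:{password}".encode("iso-8859-1")
    # credentials = "Basic" basic-credentials
    return "Basic " + base64.b64encode(user_pass).decode("ascii")


print(basic_credentials("Aladdin", "open sesame"))
# Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

# A userid outside ISO-8859-1 cannot be encoded this way at all.
try:
    basic_credentials("\u2020ravis", "secret")
except UnicodeEncodeError as exc:
    print("cannot encode userid:", exc.reason)
```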

Since in this case userid contains a non-ISO-8859-1 character,  
according to the spec we should use RFC 2047 to encode the userid and  
password before creating the user-pass token, which will look  
something like:
userid: =?utf-8?Q?=E2=80=A0ravis?=
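
For the curious, here is a minimal sketch of what a spec-following client
would have to do. The helper names are hypothetical and the Q encoder is
hand-rolled and deliberately conservative (it escapes everything except
letters, digits and a few punctuation characters), so treat it as an
illustration of the idea rather than a complete RFC 2047 implementation.

```python
import base64
import string

# Bytes that may appear unescaped inside a Q-encoded word (conservative set).
_SAFE = set((string.ascii_letters + string.digits + "!*+-/").encode("ascii"))


def q_encode_word(text: str, charset: str = "utf-8") -> str:
    """Return text as a single RFC 2047 'Q' encoded-word."""
    encoded = "".join(
        chr(byte) if byte in _SAFE else "=%02X" % byte
        for byte in text.encode(charset)
    )
    return "=?%s?Q?%s?=" % (charset, encoded)


def encode_if_needed(text: str) -> str:
    """Leave ISO-8859-1 text alone; wrap anything else in an encoded-word."""
    try:
        text.encode("iso-8859-1")
        return text
    except UnicodeEncodeError:
        return q_encode_word(text)


def rfc2047_basic_credentials(userid: str, password: str) -> str:
    """Build Basic credentials with RFC 2047 applied to userid and password."""
    user_pass = encode_if_needed(userid) + ":" + encode_if_needed(password)
    # After encoding, user-pass is pure ASCII.
    return "Basic " + base64.b64encode(user_pass.encode("ascii")).decode("ascii")


print(q_encode_word("\u2020ravis"))
# =?utf-8?Q?=E2=80=A0ravis?=
print(rfc2047_basic_credentials("\u2020ravis", "secret"))
```

The server would then have to decode the encoded-word after the base64
step to recover the real userid.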


Unfortunately, the RFC 2047 encoding portion of this algorithm is not  
supported in any client I tested (Firefox, Safari, python httplib2) or  
Acegi's BasicProcessingFilter.

From this thread:

http://lists.osafoundation.org/pipermail/ietf-http-auth/2006-September/000374.html
(in particular http://lists.osafoundation.org/pipermail/ietf-http-auth/2006-October/000393.html)

it appears that a) other folks have noticed this and b) we aren't  
likely to see a standardized solution before the next version of HTTP.  
For Mozilla's experiences with this same problem, see here:

http://lists.osafoundation.org/pipermail/ietf-http-auth/2006-October/000393.html


There is one silver lining in this story: I peeked at the Acegi
BasicProcessingFilter code and it appears that simply using raw utf-16
encoded userid and password strings should work. There's no real
spec-provided reason to do this, but in this case "it works" seems as
good a reason as any. This will require some tweaks in the current
client side base64 encoding code but shouldn't be too tricky.
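
To make the idea concrete, here is a hedged sketch in Python (our client
code is not Python, and these function names are invented for the
example). The charset is left as a parameter because the exact variant
and byte order the server expects is an assumption here; "utf-16-le"
below is only a placeholder. The one hard requirement is that both ends
agree on it.

```python
import base64


def raw_basic_credentials(userid: str, password: str, charset: str) -> str:
    """Base64 the raw encoded user-pass, skipping RFC 2047 entirely."""
    if ":" in userid:
        raise ValueError("userid must not contain ':'")
    user_pass = (userid + ":" + password).encode(charset)
    return "Basic " + base64.b64encode(user_pass).decode("ascii")


def parse_basic_credentials(header_value: str, charset: str):
    """What the receiving side would do, assuming it uses the same charset."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic Authorization header")
    user_pass = base64.b64decode(token).decode(charset)
    # Split on the first ':' only; the password may contain more colons.
    userid, _, password = user_pass.partition(":")
    return userid, password


CHARSET = "utf-16-le"  # placeholder; must match whatever the server decodes with
header = raw_basic_credentials("\u2020ravis", "s3cret:pw", CHARSET)
print(header)
print(parse_basic_credentials(header, CHARSET))
```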


Going forward I'd recommend the following:

1) Make the tweak to our client side base64 encoding algorithm to get  
this working in our application
2) I think this provides yet another reason we should look into
alternate authentication mechanisms a la WSSE
(http://www.xml.com/pub/a/2003/12/17/dive.html) or Google's
authentication scheme. The first step I'd like to take
in this vein is to read some of the archives of the ietf-http-auth  
mailing list to come up to speed on http authentication proposals and  
report back here.

If there is general support for the idea I'd be happy to ditch (1) and  
go straight to (2), making a new authentication scheme a prereq for  
unicode username support. Otherwise I think this ordering (short term  
hack + starting work on real solution) makes sense.



Any thoughts?

-Travis


