[Ietf-calsify] Issue 1: new proposed text

Mark Crispin mrc at CAC.Washington.EDU
Wed Sep 20 08:49:33 PDT 2006


On Wed, 20 Sep 2006, Eliot Lear wrote:
>> Actually it is for everybody who wants to produce well-formed UTF-8.
> But nobody will care unless they're looking at the raw message.  Yes,
> it's malformed, but then it's not really meant to be represented.  What
> really should be happening is that implementations should use MIME
> multipart/alternative and not put a text/calendar message in front of a
> user.

I support Eliot's position.

My position is best summarized as "do it right or don't do it at
all."

That is, we should choose between either "only fold at a point that 
represents a complete glyph" or "it doesn't matter where folding takes 
place, since the text will always be unfolded prior to processing."

It is pointless to talk about "well-formed UTF-8" and say that is "good
enough."  Yes, you can break a UTF-8 string such that the two substrings
are both independently interpretable by a UTF-8 decoder.  The problem is
that if the break falls before a combining character codepoint, the
string after the break is NOT a valid string, since it begins with a
combining character that has no base character to attach to.  Put
another way, it's a valid UTF-8 encoding of an invalid string.
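
To make the point concrete, here is a minimal Python sketch.  The sample
text and the split offset are made up for illustration; they are not from
any calendar data discussed on this list.  It breaks a string just before
a combining accent and shows that both halves decode cleanly even though
the second half begins with an orphaned combining character.

```python
import unicodedata

# "cafe" followed by U+0301 COMBINING ACUTE ACCENT (illustrative sample).
text = "cafe\u0301"
data = text.encode("utf-8")

# Break just before the combining accent: "cafe" is 4 octets,
# and U+0301 encodes as the 2 octets that follow.
head, tail = data[:4], data[4:]

# Both halves decode without error, so each is well-formed UTF-8 ...
head_text = head.decode("utf-8")
tail_text = tail.decode("utf-8")

# ... but the second half starts with a combining mark and no base character.
starts_with_combining = unicodedata.combining(tail_text[0]) != 0
```

A decoder has no complaint about either half; only a check at the
character level notices that the second half is defective.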

I believe that a half-solution is the worst possible choice, since it
creates the illusion of viability for a primitive tool that uses the raw
message without doing proper unfolding.

If we punt, and say that folding can occur anywhere, then primitive tools 
are broken; they will attempt to deal with dangling UTF-8 strings which 
end in the middle of a codepoint sequence, and with strings that start in 
the middle of a codepoint sequence.
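
A rough Python sketch of that failure mode follows.  The fold() and
unfold() helpers, the 12-octet limit, and the sample line are all
hypothetical, written only to illustrate CRLF-plus-space folding at a
fixed octet count; they are not taken from any implementation.

```python
def fold(data: bytes, limit: int = 75) -> bytes:
    """Fold at a fixed octet count, ignoring character boundaries."""
    if limit < 2:
        raise ValueError("limit must be at least 2")
    chunks = [data[:limit]]
    rest = data[limit:]
    # Continuation lines start with a space, leaving limit - 1 octets of content.
    while rest:
        chunks.append(b" " + rest[:limit - 1])
        rest = rest[limit - 1:]
    return b"\r\n".join(chunks)


def unfold(data: bytes) -> bytes:
    """Remove each CRLF that is followed by a single space or tab."""
    return data.replace(b"\r\n ", b"").replace(b"\r\n\t", b"")


def decodes(raw: bytes) -> bool:
    """Report whether raw is well-formed UTF-8 on its own."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Illustrative line: the last character (U+00E9) occupies 2 octets.
line = "SUMMARY:caf\u00e9".encode("utf-8")
folded = fold(line, limit=12)          # the fold lands between those 2 octets
physical = folded.split(b"\r\n")       # what a tool reading raw lines sees
```

Here neither physical line decodes on its own, while unfold(folded)
gives back the original octets, which decode fine.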

If we go all the way, and say that folding can only occur at a glyph
boundary, then primitive tools will be fine.  That is a lot of work to
undertake just to make it easier for some lazy programmer to pass off a
primitive tool as acceptable.
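
To give a feel for the work involved, here is a deliberately simplified
Python sketch.  The safe_fold_points() name is hypothetical, and the rule
it applies (never break before a character with a nonzero combining
class) is only an approximation of a glyph boundary; real segmentation
has to handle more cases than this, such as joiner sequences and
variation selectors.

```python
import unicodedata


def safe_fold_points(text: str) -> list:
    """Return indices i where splitting into text[:i] and text[i:] does not
    separate a base character from a combining mark that follows it.

    Approximation only: this checks the canonical combining class and
    nothing else.
    """
    return [
        i
        for i in range(1, len(text))
        if unicodedata.combining(text[i]) == 0
    ]


# Illustrative sample: index 4 is U+0301, so a break there is excluded.
points = safe_fold_points("cafe\u0301s")
```

Even this crude version forces the folding code to work on characters
rather than octets, and to convert the chosen break back into an octet
offset afterwards.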

In my opinion, we should either support primitive tools or declare
primitive tools to be totally broken.

-- Mark --

http://panda.com/mrc
Democracy is two wolves and a sheep deciding what to eat for lunch.
Liberty is a well-armed sheep contesting the vote.

