[Ietf-calsify] Issue 1: new proposed text
Mark Crispin
mrc at CAC.Washington.EDU
Wed Sep 20 08:49:33 PDT 2006
On Wed, 20 Sep 2006, Eliot Lear wrote:
>> Actually it is for everybody who wants to produce well-formed UTF-8.
> But nobody will care unless they're looking at the raw message. Yes,
> it's malformed, but then it's not really meant to be represented. What
> really should be happening is that implementations should use MIME
> multipart/alternative and not put a text/calendar message in front of a
> user.
I support Eliot's position.
My position can best be summarized as being "do it right or don't do it at
all."
That is, we should choose between either "only fold at a point that
represents a complete glyph" or "it doesn't matter where folding takes
place, since the text will always be unfolded prior to processing."
It is pointless to talk about "well-formed UTF-8" and say that is "good
enough." Yes, you can break a UTF-8 string such that the two substrings
are both independently interpretable by a UTF-8 decoder. The problem is
that if the break is before a combining character codepoint, the string
after the break is NOT a valid UTF-8 string since it begins with a
combining character. Put another way, it's valid UTF-8 encoding of an
invalid string.
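
As a rough illustration of that distinction, here is a minimal Python sketch. The sample word and the split position are assumptions chosen for the example; they do not come from any calendar data discussed in this thread.

```python
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT (sample text, assumed).
text = "cafe\u0301"
data = text.encode("utf-8")  # 6 octets: 63 61 66 65 CC 81

# Break on a codepoint boundary, between 'e' and the combining accent.
head, tail = data[:4], data[4:]

# Both halves are independently interpretable by a UTF-8 decoder.
head_text = head.decode("utf-8")
tail_text = tail.decode("utf-8")

# The second half nevertheless begins with a combining character
# that has no base character in front of it.
starts_with_combining = unicodedata.combining(tail_text[0]) != 0
```

Here the decoder accepts both substrings, yet `starts_with_combining` is true for the second one, which is the case described above.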
I believe that a half-solution is the worst possible choice, since it
creates the illusion of viability for a primitive tool that uses the raw
message without doing proper unfolding.
If we punt, and say that folding can occur anywhere, then primitive tools
are broken; they will attempt to deal with dangling UTF-8 strings which
end in the middle of a codepoint sequence, and with strings that start in
the middle of a codepoint sequence.
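
A sketch of the "fold anywhere" case, in Python. The helper names `fold_octets`, `unfold_octets`, and `decodes` are hypothetical, the fold width is artificially small, and the folding is simplified (it does not count the leading space of continuation lines toward the width).

```python
import re

def fold_octets(data: bytes, width: int) -> bytes:
    """Fold at fixed octet intervals, ignoring codepoint boundaries."""
    chunks = [data[i:i + width] for i in range(0, len(data), width)]
    return b"\r\n ".join(chunks)

def unfold_octets(data: bytes) -> bytes:
    """Remove each CRLF that is followed by a single space or tab."""
    return re.sub(rb"\r\n[ \t]", b"", data)

def decodes(fragment: bytes) -> bool:
    """Report whether a fragment is decodable as UTF-8 on its own."""
    try:
        fragment.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Sample content line (assumed); the last character takes two octets.
line = "SUMMARY:Zoë".encode("utf-8")

# A width of 11 octets puts the fold inside the two-octet sequence.
folded = fold_octets(line, 11)

# What a primitive tool sees if it reads the raw lines one at a time.
raw_fragments = folded.split(b"\r\n")
primitive_ok = [decodes(f) for f in raw_fragments]

# What a tool sees if it unfolds first and decodes afterwards.
unfolded_text = unfold_octets(folded).decode("utf-8")
```

Neither raw fragment decodes by itself, while unfolding before decoding recovers the original text.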
If we go all the way, and say that folding can only occur at a glyph
boundary, then primitive tools will be fine. That is a lot of work to
undertake just to make it easier for some lazy programmer to pass off a
primitive tool as acceptable.
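
To give a sense of the work involved, here is a deliberately simplified Python sketch. The function name `split_at_safe_points` is hypothetical, and the rule it applies (never break immediately before a character with a nonzero combining class) is only an approximation of a glyph boundary; real grapheme cluster segmentation has many more cases than this.

```python
import unicodedata

def split_at_safe_points(text: str, max_octets: int) -> list:
    """Split text into pieces of at most max_octets UTF-8 octets,
    never breaking immediately before a combining character."""
    pieces = []
    start = 0
    while start < len(text):
        end = start
        size = 0
        last_ok = start  # last position where a break is allowed
        while end < len(text):
            size += len(text[end].encode("utf-8"))
            if size > max_octets:
                break
            end += 1
            # A break is allowed at the end of the text, or before a
            # character that is not a combining character.
            if end == len(text) or unicodedata.combining(text[end]) == 0:
                last_ok = end
        if last_ok == start:
            raise ValueError("a single sequence exceeds max_octets")
        pieces.append(text[start:last_ok])
        start = last_ok
    return pieces

# Sample text (assumed): 'e' + U+0301 must stay together.
pieces = split_at_safe_points("cafe\u0301s", 4)
```

With a limit of 4 octets, the split falls before the 'e' rather than between the 'e' and its accent, so each piece stands on its own.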
In my opinion, we should choose between supporting primitive tools, or
declaring primitive tools to be totally broken.
-- Mark --
http://panda.com/mrc
Democracy is two wolves and a sheep deciding what to eat for lunch.
Liberty is a well-armed sheep contesting the vote.