[Chandler-dev] Re: [Cosmo-dev] dealing with characters that XML 1.0 doesn't allow

Grant Baillie grant at osafoundation.org
Thu May 24 15:24:43 PDT 2007


On 24 May, 2007, at 15:16, Phillip J. Eby wrote:

> At 02:58 PM 5/24/2007 -0700, Heikki Toivonen wrote:
>> Morgen Sagen wrote:
>> > #3 feels like the right thing to do.  One suggestion is to encode
>> > all characters not allowed by XML using %XX, where the XX are hex
>> > digits.  Should we take that route?
>>
>> I said this on IRC, posting here to keep everyone in the loop.
>>
>> I think characters that are not allowed should be encoded the
>> standard XML way, for example &#169;.
>
> Characters that are not allowed can't be encoded in this way -  
> that's what it means that they're not allowed.  The resulting XML  
> is not well-formed, by definition.
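
To make the point above concrete, here is a minimal Python sketch (not Cosmo or Chandler code). It checks that the stdlib expat-based parser rejects a numeric reference to U+0001 while accepting one to U+00A9, and it shows one possible shape for the %XX scheme suggested earlier in the thread. The helper names percent_encode, percent_decode and parses_as_xml10 are hypothetical, and the sketch only covers the C0 control range; other code points XML 1.0 forbids (surrogates, U+FFFE, U+FFFF) are left out.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: encode only the C0 controls that XML 1.0 forbids, plus "%"
# itself so the encoding can be reversed unambiguously.
_NEEDS_PERCENT = re.compile("[\x00-\x08\x0b\x0c\x0e-\x1f%]")


def percent_encode(text: str) -> str:
    """Replace forbidden C0 controls and '%' with %XX (uppercase hex)."""
    return _NEEDS_PERCENT.sub(lambda m: "%%%02X" % ord(m.group()), text)


def percent_decode(text: str) -> str:
    """Reverse percent_encode; every '%' in encoded text starts an escape."""
    return re.sub("%([0-9A-F]{2})", lambda m: chr(int(m.group(1), 16)), text)


def parses_as_xml10(document: str) -> bool:
    """Return True if the stdlib parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# U+00A9 is a legal XML 1.0 character, so a reference to it is accepted.
copyright_ok = parses_as_xml10("<note>&#169;</note>")
# U+0001 is not a legal XML 1.0 character, so even the reference is rejected.
control_ref_ok = parses_as_xml10("<note>&#1;</note>")
# The %XX form contains only ordinary characters, so it parses.
encoded_ok = parses_as_xml10("<note>" + percent_encode("a\x01b") + "</note>")
```

The cost of the %XX route is that it is an application-level convention: both ends have to agree to decode it, and literal "%" characters have to be encoded as well.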

In XML 1.1, those characters are merely "Restricted" ... the grammar  
seems to have changed some. I think that's the reasoning behind the  
summarization in
<http://lists.xml.org/archives/xml-dev/200701/msg00011.html>:

> 1-1F except CR, TAB, NL:
> Can't occur in XML 1.0.  Can occur in XML 1.1 and must be escaped.



--Grant



