[Dev] Re: chandler toolbar for mozilla
Heikki Toivonen
heikki at osafoundation.org
Mon Mar 7 11:02:59 PST 2005
Brett Clippingdale wrote:
> On Fri, 2005-03-04 at 17:36 -0800, Heikki Toivonen wrote:
>>I thought about this a little bit, and I think by far the easiest way
>>would be to hook into the webserver parcel in Chandler. Not the sexiest
> Your solution is an interesting one, to be certain. The mechanics of
> the toolbar are not difficult; the interesting problem here is
> connecting data to the repository. You may well know that, in addition
> to using forms, JavaScript/XUL can do a POST to a server with its
> XMLHttpRequest(), which is the hack used for Google's autosuggest, maps,
Yes, XMLHttpRequest was one of my main areas of responsibility at
Netscape from NS 6 to 7.1.
> IIRC, you want to send both the URL and the page contents to the
> repository, and I'm thankful for the guidance on how to connect it to
> the repository. I'll look into that, and will contact Morgen.
I think you may want to send even more: the HTTP headers. But one step
at a time: URL, then URL+data, then URL+data+headers. I think this is
also the order of increasing difficulty.
The URL is trivial.
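To make the three stages concrete, here is a rough sketch of what the POST
body could look like at each step. It is written in Python only to show the
wire format; the toolbar itself would build the same string in JavaScript.
The function name and the field names "url", "html" and "headers" are
assumptions for illustration, not anything the webserver parcel defines.

```python
from urllib.parse import urlencode


def build_capture_body(url, html=None, headers=None):
    """Build a form-encoded POST body for one of the three stages.

    Hypothetical field names: "url", "html", "headers".
    """
    fields = {"url": url}                      # stage 1: URL only
    if html is not None:
        fields["html"] = html                  # stage 2: URL + page data
    if headers is not None:
        # stage 3: flatten the headers into "Name: value" lines
        fields["headers"] = "\n".join(
            "%s: %s" % (name, value) for name, value in headers.items()
        )
    return urlencode(fields)
```

Each stage only adds a field, so the receiving side can start with the URL
and pick up the rest later without changing the format.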
> What I don't know is how to handle the DOM, which is what I presume we
> want to store, in addition to the URL string. (Is this correct? That's
> a *lot* of data!) XMLHttpRequest() has a send() command where the real
> data will be sent, and one can send a string, the entire DOM, or a DOM
> fragment. In any case, AFAIK the object will be/must be serialized
> using Mozilla's XPCOM serializer, nsDOMSerializer.
>
> Does Python have a library that will de-serialize these objects? For
> instance, I'm not sure if Mozilla's is the standard W3C DOM "Load and
> Save Level 3," which I presume Python can handle, or if there is any
> relation to Apache's asDOMSerializer interface, which is also used with
> Java. Better yet, does the webserver parcel have an interface that can
> handle this stuff?
>
> I'm also concerned about how non-W3C-compliant web pages will be passed.
> I hope someone more experienced with these issues will know, otherwise
> I'll continue to research it.
The DOM serializer is most likely the piece you'd need to use. Serialize
to a string first, then send the string with XMLHttpRequest in a POST
operation. (send() can take a DOM document as well, but it will use the
XML serializer, which would be bad in most cases since we'd be dealing
with HTML -
http://lxr.mozilla.org/seamonkey/source/extensions/xmlextras/base/src/nsXMLHttpRequest.cpp#1429.)
This will mean that the data you send is not exactly the data Mozilla
received over the wire, but I think that is ok. You see, Mozilla does a
lot of work to understand the tag soup out there that is sometimes
called HTML, and will build the DOM it thinks the authors meant.
Typically only the markup will change, becoming well formed; in some
extremely rare cases with horrible markup, some actual text may be lost.
So, you'd be sending the data as it was understood by Mozilla, which I
think is fine - that's what you see in the browser anyway. The added
benefit is that on the Chandler side you will be dealing with
well-formed HTML, in case you want to parse it (although I guess I would
just store it as a text object and let Chandler index it as is).
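For the receiving end, something along these lines might be enough to start
with. This is only a sketch using the Python standard library, assuming the
toolbar posts a form-encoded body with "url" and "html" fields; those field
names and the parse_capture() function are made up for illustration, and
the actual hookup to the webserver parcel and the repository is not shown.

```python
from urllib.parse import parse_qs


def parse_capture(body):
    """Decode a form-encoded POST body into a plain record.

    The serialized page is kept as one text string and is not parsed,
    so it can be stored and indexed as is.
    """
    fields = parse_qs(body.decode("utf-8"), keep_blank_values=True)
    if "url" not in fields:
        raise ValueError("POST body has no 'url' field")
    return {
        "url": fields["url"][0],
        # "html" is optional so a URL-only request still works
        "html": fields.get("html", [""])[0],
    }
```

A URL-only request gives a record with an empty "html" string, so the same
code serves the first two stages.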
Regarding headers, I think the best thing would be to take a look at the
Live HTTP Headers extension (http://livehttpheaders.mozdev.org/).
--
Heikki Toivonen