[Dev] RDF vs Python Objects
Internet Dog
Tue, 26 Nov 2002 23:14:54 -0500
On Tuesday 26 November 2002 05:35 pm, David McCusker wrote:
> The opposite direction from Python to RDF probably involves another
> streaming interface of some kind. (What is the general stream API
> in Python which is satisfied by files, sockets, and in-memory buffers?)
The send() and recv() methods for sockets are somewhat different from the
read() and write() methods used on files or in-memory buffers. Sockets are a
bit of a problem because it is hard to push the bits back into the pipe, so
seek() does not work on them. IIRC the send() and recv()
method names were used because they are the standard names used for the
functions in the C library. Python tries to smooth over the differences
between names and semantics inherited from prior art, but sometimes there are
no win-win choices.
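One way to paper over the difference is the socket's makefile() method, which wraps a socket in a file object. Here is a minimal sketch, written for present-day Python 3 rather than the 2.2 of this thread. The helper names write_lines() and read_lines() are made up for illustration, and socketpair() is only there to keep the example self-contained.

```python
import socket

def write_lines(sock, lines):
    # makefile() returns a file object, so the usual write() call works.
    with sock.makefile("w", encoding="utf-8") as stream:
        for line in lines:
            stream.write(line + "\n")
    # Leaving the with block flushes the buffered text to the socket.

def read_lines(sock):
    with sock.makefile("r", encoding="utf-8") as stream:
        # The wrapper has read() and readline(), but it cannot seek.
        seekable = stream.seekable()
        lines = [line.rstrip("\n") for line in stream]
    return lines, seekable

# A connected pair of sockets stands in for a real network connection.
left, right = socket.socketpair()
write_lines(left, ["first line", "second line"])
left.close()  # closing the sending end lets the reader see end-of-file
received, can_seek = read_lines(right)
right.close()
```

The wrapped socket can be handed to code that only expects read() or write(), but anything that relies on seek() still needs a real file or an in-memory buffer.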
You might find the popen2 module (Subprocesses with accessible I/O streams) [1]
to be a better choice than sockets. The popenXX functions return file
objects. The pipes module [2] might also be of interest. It allows chains of
commands to be applied to a file. It requires the /bin/sh command, so it is
POSIX specific.
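For anyone trying this on a current interpreter: popen2 was later removed from the standard library, and the subprocess module covers the same ground. The sketch below assumes Python 3.7 or newer. The child program is a throwaway one-liner invented for the example. The point is only that the pipes to the child show up as ordinary file objects.

```python
import subprocess
import sys

# A tiny child program that upper-cases whatever arrives on its stdin.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,  # give us str-based file objects instead of bytes
)

# proc.stdin and proc.stdout are file objects with write() and read().
proc.stdin.write("hello from the parent\n")
proc.stdin.close()  # end-of-file tells the child to finish
output = proc.stdout.read()
proc.stdout.close()
proc.wait()
```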
StringIO [3] objects are "memory files" and can be used anywhere a file object
is used. Here are the methods of the two object types.
>>> import StringIO
>>> dir(StringIO.StringIO)
['__doc__', '__init__', '__iter__', '__module__', 'close', 'flush',
'getvalue', 'isatty', 'read', 'readline', 'readlines', 'seek', 'tell',
'truncate', 'write', 'writelines']
>>>
>>> f = open("xx","w")
>>> dir(f)
['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__',
'__init__', '__iter__', '__new__', '__reduce__', '__repr__', '__setattr__',
'__str__', 'close', 'closed', 'fileno', 'flush', 'isatty', 'mode', 'name',
'read', 'readinto', 'readline', 'readlines', 'seek', 'softspace', 'tell',
'truncate', 'write', 'writelines', 'xreadlines']
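In practice the "general stream API" is just this shared set of methods: a function that only calls write() will accept either object. A minimal sketch for present-day Python 3, where StringIO lives in the io module. The function write_triples() and the one-statement-per-line output are assumptions made up for illustration, not a real RDF serializer.

```python
import io
import tempfile

def write_triples(triples, out):
    # Only write() is required of `out`, so any file-like object will do.
    for subject, predicate, obj in triples:
        out.write("%s %s %s .\n" % (subject, predicate, obj))

triples = [("<urn:example:doc>", "<urn:example:title>", '"Hello"')]

# Write to an in-memory buffer.
buffer = io.StringIO()
write_triples(triples, buffer)
from_memory = buffer.getvalue()

# Write the same data to a real (temporary) file and read it back.
with tempfile.TemporaryFile("w+", encoding="utf-8") as f:
    write_triples(triples, f)
    f.seek(0)
    from_file = f.read()
```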
> Import from RDFS to create schema objects can create an in-memory
> data structure (either DOM or some friendly native Python form),
> because schemas will not generally be so large that a stream parse
> with a SAX parser is needed.
>
> I'll write more after I draw a diagram.
Something very similar has been developed in the XIST [4] module. This XML
library takes advantage of features of the new type system in Python 2.2. It
also departs from the Java roots of the DOM and SAX APIs. With XIST the objects
a parser produces are not generic nodes as they are with DOM. Instead
each class is named after the element it represents, and attributes are
accessed using the usual __getitem__ method of Python. This results in a much
more Pythonic experience when using the objects parsed from an XML file. In
the HOWTO [5] section titled "Generating XML trees from XML files" the
functions in module ll.xist.parsers are discussed. They are used to create
XML trees by parsing XML files.
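To make the idea concrete, here is a toy sketch of the pattern. It is not XIST's API. It only assumes Python 3.6+ and the standard xml.etree.ElementTree parser, and the names Element, registry and build() are invented for the example. Text content is ignored to keep it short.

```python
import xml.etree.ElementTree as ET

class Element:
    # Maps an XML element name to the Python class that represents it.
    registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Element.registry[cls.__name__] = cls

    def __init__(self, attrs, children):
        self.attrs = dict(attrs)
        self.children = children

    def __getitem__(self, name):
        # Attribute access goes through the ordinary indexing syntax.
        return self.attrs[name]

# One class per element name; the class name is the element name.
class p(Element):
    pass

class a(Element):
    pass

def build(node):
    # Pick the class registered for this tag, falling back to Element.
    cls = Element.registry.get(node.tag, Element)
    return cls(node.attrib, [build(child) for child in node])

source = '<p><a href="http://www.python.org/">Python</a></p>'
tree = build(ET.fromstring(source))
link = tree.children[0]
href = link["href"]
```

Because the parsed objects are instances of p and a rather than of one generic node class, ordinary isinstance() checks and per-element methods become possible.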
There is a bunch of other neat stuff in XIST that can be used to build converters
that convert an XML tree into an alternate XML tree. (XIST was created to
facilitate building tools for converting XML files to HTML files.)
The Zope server has entertained the use of RDF and the Dublin Core for many
years. I haven't used it because I've never figured out where it would be
more useful than just using dynamically created Python classes. Go to
zope.org and search on RDF to see more like these [6]. A better source of
information on RDF and Python would be Uche Ogbuji's Akara site [7]. The
4Suite developers are heavy contributors to the Python xml-sig. You'll find
Uche's name on many articles about Python and XML on the IBM website.
[1] http://www.python.org/doc/current/lib/module-popen2.html
[2] http://www.python.org/doc/current/lib/module-pipes.html
[3] http://www.python.org/doc/current/lib/module-StringIO.html
[4] http://www.livinglogic.de/Python/xist/
[5] http://www.livinglogic.de/Python/xist/Howto.html
[6] http://www.zope.org//Wikis/zope-xml/ZopeRDF
and http://www.zope.org/Members/EIONET/RDFGrabber
[7] http://uche.ogbuji.net:8080/uche.ogbuji.net/tech/akara/4suite/