[Dev] RDF and ZODB
Michael R. Bernstein, 31 Oct 2002 22:55:51 -0800

Ok, given what little I know about Chandler's proposed use of RDF stored in the ZODB, I went hunting for the RedFoot developers to ask them about their library. I caught up with Eikon on the #redfoot IRC channel.

I started by assuming that triples needed to be represented by class instances and stored somehow, but this turns out not to be the case. A triple really only consists of references to a subject, a predicate, and an object, so the RDFLib triple store uses nested dictionaries to store triples. Each triple is stored in two sets of nested dictionaries as follows:

    spo[s][p][o] = 1
    pos[p][o][s] = 1

where s, p, and o are the (s)ubject, (p)redicate, and (o)bject.

Persisting very large dicts in the ZODB is usually a bad idea, because in order to access a single key/value pair, the whole dict needs to be loaded into memory. So, the ZODB has a more efficient persistent data type called a BTree (a B-tree). The ZODB BTree implementation has much the same API as a dict, so it can more or less be used as a dict replacement. BTrees keep their keys in sorted order and split their contents across separately stored nodes, so looking up a value by key only requires loading the branch of the tree that leads to that key/value pair.

BTree documentation can be found here:

http://www.zope.org/Members/ajung/BTrees/FrontPage

Anyway, Eikon downloaded the Standalone ZODB package (http://www.zope.org/Products/StandaloneZODB) and in a couple of hours had successfully modified his in-memory triple store to use BTrees inside a ZODB instance.

All this looks very promising, although without knowing more about Morgen Sagan's Shimmer RDF database prototype, it's hard for me to tell whether I'm barking up the wrong tree here, or duplicating his efforts.

In any case, storing the triples is only part of the story. A triple is just a set of references tying three object instances (a subject, a predicate, and an object) together in a relationship. Presumably these objects are also stored in the ZODB somewhere. There are a couple of different ways we could store these object instances.

We could:

- Store everything in one BTree, giving each object a unique id.
- Store each object type in a separate BTree (email, contacts, events, etc.).
- Store all Items (subjects and objects) in one BTree, and the predicates somewhere else, perhaps in another BTree.

Some more information about the ZODB:

http://www.zope.org/Documentation/Articles/ZODB1
http://www.zope.org/Documentation/Articles/ZODB2
http://www.zope.org/Documentation/Books/ZDG/current/Persistence.stx

Of course, this approach may be entirely naive, but I think it serves as a point of departure, at least.

Michael Bernstein.
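
For illustration, here is a minimal sketch of the two-index layout described in the message (spo[s][p][o] and pos[p][o][s]), written in plain Python. It is not RDFLib's actual code: the class name TripleStore, its methods, and the mapping_factory parameter are all hypothetical. The sketch uses ordinary dicts so that it runs without ZODB installed; the assumption is that a dict-like persistent mapping such as a ZODB BTree could be substituted through mapping_factory.

```python
class TripleStore:
    """Toy triple store indexed two ways: spo[s][p][o] and pos[p][o][s]."""

    def __init__(self, mapping_factory=dict):
        # mapping_factory is a hypothetical hook. dict keeps everything in
        # memory; the assumption is that a dict-like persistent mapping
        # (for example a ZODB BTree) could be passed here instead.
        self._new = mapping_factory
        self.spo = mapping_factory()
        self.pos = mapping_factory()

    def _set(self, index, a, b, c):
        # Create the two outer levels on demand, then mark the leaf with 1,
        # mirroring the "index[a][b][c] = 1" layout.
        if a not in index:
            index[a] = self._new()
        inner = index[a]
        if b not in inner:
            inner[b] = self._new()
        inner[b][c] = 1

    def add(self, s, p, o):
        # Every triple is written to both indexes.
        self._set(self.spo, s, p, o)
        self._set(self.pos, p, o, s)

    def _lookup(self, index, a, b):
        # Return the sorted leaf keys under index[a][b], or [] if absent.
        inner = index.get(a)
        if inner is None:
            return []
        leaf = inner.get(b)
        return sorted(leaf) if leaf is not None else []

    def objects(self, s, p):
        """All o such that (s, p, o) is in the store."""
        return self._lookup(self.spo, s, p)

    def subjects(self, p, o):
        """All s such that (s, p, o) is in the store."""
        return self._lookup(self.pos, p, o)


# Made-up identifiers, purely for demonstration.
store = TripleStore()
store.add("msg:1", "from", "contact:alice")
store.add("msg:2", "from", "contact:alice")
store.add("msg:1", "about", "event:lunch")

print(store.subjects("from", "contact:alice"))
print(store.objects("msg:1", "about"))
```

The spo index answers "what does this subject say?" and the pos index answers "which subjects have this property value?". Each lookup only walks one path through the nested mappings, which is the access pattern that lets a tree-structured persistent mapping avoid loading everything at once.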