Open Source Applications Foundation

[Dev] ZODB is not a Storage Technology (Re: other formats )

David McCusker Sun, 03 Nov 2002 13:09:25 -0800


Michael R. Bernstein wrote:
> The ZODB is not in and of itself a Storage technology. It is a Python
> object persistence layer, that has a pluggable storage back-end.

I'm just learning about ZODB, so thanks for clueing me in faster.  Right
now I know little, but I expect to get into the nitty-gritty details later.

So it's kind of the top interface to a storage technology, and it has
a pluggable backend, which is great.  Mostly transparent persistence
systems for objects are a good idea.

(Totally transparent persistence with no developer control can turn
into trouble.  It's a good idea to have a commit() method, or anything
else that puts developers in the loop for deciding when saves matter.)
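
To make the point about commit() concrete, here is a minimal sketch in
Python. It is not ZODB's actual API: the names DictStorage, Session,
set(), get(), commit() and abort() are all hypothetical, invented only to
show changes being staged in memory until the developer asks for a save.
The storage object is passed in, as a stand-in for a pluggable backend.

```python
import pickle


class DictStorage:
    """Hypothetical back-end: keeps pickled records in a plain dict."""

    def __init__(self):
        self.records = {}

    def store(self, key, data):
        self.records[key] = data

    def load(self, key):
        return self.records[key]


class Session:
    """Hypothetical persistence layer with an explicit commit()."""

    def __init__(self, storage):
        self.storage = storage   # any object with store() and load()
        self.pending = {}        # staged changes, not yet saved

    def set(self, key, obj):
        # Changes are only staged here; nothing reaches storage yet.
        self.pending[key] = obj

    def get(self, key):
        # Staged values win over whatever was last committed.
        if key in self.pending:
            return self.pending[key]
        return pickle.loads(self.storage.load(key))

    def commit(self):
        # The developer decides when staged changes get written out.
        for key, obj in self.pending.items():
            self.storage.store(key, pickle.dumps(obj))
        self.pending.clear()

    def abort(self):
        # Throw away everything staged since the last commit.
        self.pending.clear()
```

Swapping DictStorage for another class with the same two methods would
change where the bytes go without touching the code that calls set() and
commit(), which is roughly the appeal of a pluggable backend.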

When working in high level dynamic languages, especially delicious
untyped ones (well, typed values but untyped variables) like Python,
having object persistence at a high level is a great developer benefit.

> Here is the diagram modified to take account of how the ZODB works:
> [ nice storage layer diagram snipped ]
> 
> Currently, I understand the Storage selected as the ZODB back-end for
> Chandler is the Berkeley DB, but ZODB has other storages available, and
> it's certainly possible to create more.

I understand Berkeley DB has fabulous btree indexes for maps, provided
they are stored as one btree index per file. (Has that changed?)  Does
Berkeley DB provide a way to store arbitrary sized objects without
putting each one in a separate file?  I gathered it didn't years ago.

A conventional thing to do with arbitrary sized objects is append them
to a single file (like mbox format files containing email messages),
and then index them from other files which summarize the contents.
This approach requires a file rewrite to compact after object deletes,
which takes time proportional to the db size rather than to the size of
the deleted content.
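
A toy sketch of that append-and-index scheme, in Python, might look like
the following. It is an illustration under simplifying assumptions, not
how Berkeley DB or any ZODB storage works: the class AppendLog and its
methods are hypothetical, an in-memory io.BytesIO stands in for the single
data file, and a dict stands in for the separate index file.

```python
import io


class AppendLog:
    """Toy append-only object file with a separate offset index."""

    def __init__(self):
        self.data = io.BytesIO()   # stands in for the single data file
        self.index = {}            # key -> (offset, length)

    def put(self, key, payload):
        # New objects always go at the end of the file.
        offset = self.data.seek(0, io.SEEK_END)
        self.data.write(payload)
        self.index[key] = (offset, len(payload))

    def get(self, key):
        offset, length = self.index[key]
        self.data.seek(offset)
        return self.data.read(length)

    def delete(self, key):
        # Cheap: only the index entry goes away; the bytes stay in the file.
        del self.index[key]

    def size(self):
        return len(self.data.getvalue())

    def compact(self):
        # Expensive: every live object is copied into a fresh file, so the
        # cost tracks the amount of surviving data, not the amount deleted.
        fresh = io.BytesIO()
        new_index = {}
        for key, (offset, length) in self.index.items():
            self.data.seek(offset)
            new_index[key] = (fresh.tell(), length)
            fresh.write(self.data.read(length))
        self.data, self.index = fresh, new_index
```

Deleting one small object from a large log still leaves compact() copying
everything else, which is the cost described above.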

I should look into the way Berkeley DB hooks into ZODB so I'll have
better informed ideas regarding the way it works and the way alternatives
could be plugged in as replacements.  I hope I get around to starting
this research in a few days.  (I'm writing a spec at home right now.)

> All the ZODB really cares about is transparently persisting Python
> objects and their attributes. 

That sounds like an elegant degree of simple focus, and transparently
persisting objects is a good goal and a nice service.  I could also
look into the way it does this, in case tweaking it is useful.  (Who
knows, maybe slight variations in coding have performance effects.)

Does anyone want to lecture on how ZODB works inside?  Maybe other
folks would find such a presentation useful on this dev list.  Not
that I want to turn the dev list into all storage all the time.  Just
tell me to knock it off when I get carried away.

--David McCusker