Open Source Applications Foundation

[Dev] bdb in ZODB (was ZODB is not a Storage Tech)

David McCusker Mon, 04 Nov 2002 10:57:35 -0800


Jeremy Hylton wrote:
> I wonder how appropriate Berkeley DB is for end user applications.
> Running a Berkeley database entails a lot of management responsibility
> -- checkpointing, log management, recovery, deadlock detection, etc.
> It's a database, and running a database requires some database
> administration.

I'm not deeply enthused about using Berkeley DB, though what I said
might have sounded like advocacy.  (In the past I've been critical, and
I'm trying to be agreeable about what I know many folks like.  However,
I understand working on Berkeley DB someplace is one of my potential
job situations.  Hey, I could probably make it better somehow.)

I was trying to point out Berkeley DB's predilection for using multiple
files, to keep indexes away from each other to reduce lock contention,
so default behavior is better for server side use.  This approach is
awkward in contexts where too many files will be created.  (Netscape
had many separate mbox format files, one per folder, and indexing each
of these several ways with one index per file would have caused painful
file system latencies just to open the index files.)

The administration drawbacks you mention are ones I hadn't noted before.
I like systems that don't have complex administration for end users.

On the topic of recovery, I should mention folks ought to assume every
type of persistent store will become corrupt, and have some sort of
plan for recovering stored data.  Not recovery at the user level,
although backups are always a good idea.  I mean Chandler developers
ought to design storage with a view toward being able to fish good
data out of bad when a database begins to look bad.
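
One way to make "fishing good data out of bad" practical is to frame each
stored record so it can be checked on its own.  The following is only a
minimal sketch under assumptions not made in this thread: the marker
`MAGIC`, the header layout, and the `pack_record` and `salvage` helpers are
all hypothetical, and this is not how ZODB or Berkeley DB lay out data.

```python
import struct
import zlib

# Hypothetical framing: 4-byte marker, payload length, CRC32 of the payload.
MAGIC = b"REC1"
HEADER = struct.Struct(">4sII")


def pack_record(payload: bytes) -> bytes:
    """Wrap a payload in a self-describing, checksummed frame."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def salvage(blob: bytes) -> list:
    """Scan a possibly corrupt byte string and return every payload
    whose frame is complete and whose checksum still matches."""
    good = []
    pos = 0
    while True:
        pos = blob.find(MAGIC, pos)
        if pos < 0:
            break
        body_start = pos + HEADER.size
        if body_start <= len(blob):
            _, length, crc = HEADER.unpack_from(blob, pos)
            payload = blob[body_start:body_start + length]
            if len(payload) == length and zlib.crc32(payload) == crc:
                good.append(payload)
                pos = body_start + length  # jump past the intact record
                continue
        pos += 1  # damaged frame: resume the scan one byte later
    return good


if __name__ == "__main__":
    blob = bytearray(b"".join(pack_record(p) for p in (b"first", b"second", b"third")))
    blob[HEADER.size + 5 + HEADER.size] ^= 0xFF  # damage the second payload
    print(salvage(bytes(blob)))
```

Because each record carries its own marker and checksum, a scan can skip a
damaged region and pick up again at the next intact record, instead of
giving up at the first error.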

(I have a long rant on entropy I won't present here, in which I cover
why every database will get corrupt, and why it's just a race to
failure between a database and the underlying file system or media.
Databases that look very stable have longer mean time to failure than
the host file system.  This is more common on Unix than the old MacOS.)

The user interface for recovery might be simpler if a user explicitly
requests that data be recovered from apparently corrupt databases
after errors are reported.  That can help avoid complex speculative
excursions in Chandler code if it decides to recover automatically in
response to errors that suggest corruption.

Instead, such errors might generate new menu items, or preferably an
entirely new view about stores-with-errors, so users can tell Chandler
to recover stores which have appeared in UI reports as maybe corrupt.
Since Chandler might have some replication and synchronization features,
this could feed the problem of merging partially corrupt content into
a known code path for "importing" external content.

> The cost of administration is a drawback for any database.  I imagine
> that Chandler would want to minimize the amount of administration an
> end-user needs to do.  The ZODB storage with the least administrative
> costs is FileStorage, which works much like you describe -- append
> arbitrary objects to a single file.  It needs to be packed
> occasionally; pack is the operation that removes old revisions of
> objects.

Less administration sounds good.  It's also possible to use a format
like a file system in a file, which can re-use freed space in a file
without needing to rewrite in a pack operation.

But a format accumulated by appending makes it easier to implement
transactions.  As someone noted in another email here, the append based
FileStorage can abort appended transactions by shortening the file.
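
The abort-by-shortening idea can be sketched in a few lines of Python.
This is an illustration only, not ZODB's FileStorage code: the `AppendLog`
class and its method names are hypothetical, and a real store would also
need record framing and crash recovery.

```python
import os


class AppendLog:
    """Hypothetical append-only store that aborts by truncating the file."""

    def __init__(self, path):
        # "a+b" creates the file if needed and sends every write to the end.
        self.f = open(path, "a+b")
        self.f.seek(0, os.SEEK_END)
        self.committed_size = self.f.tell()  # end of the last committed data

    def append(self, data: bytes):
        """Add uncommitted bytes after the committed region."""
        self.f.seek(0, os.SEEK_END)
        self.f.write(data)

    def commit(self):
        """Make everything appended so far part of the committed region."""
        self.f.flush()
        os.fsync(self.f.fileno())
        self.f.seek(0, os.SEEK_END)
        self.committed_size = self.f.tell()

    def abort(self):
        """Discard uncommitted bytes by shortening the file."""
        self.f.flush()
        self.f.truncate(self.committed_size)
        self.f.seek(0, os.SEEK_END)

    def read_all(self) -> bytes:
        self.f.flush()
        self.f.seek(0)
        return self.f.read()
```

Nothing before `committed_size` is ever rewritten, so abort never has to
undo a change to existing data; it only has to forget the tail.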

Files updated-in-place with random access writes can be transacted by
means not too complex in comparison.  I like using a block-based patch
system, which accumulates block replacements applicable as a patch in
a batch operation at commit time.  One can arrange that such a store
does not involve more administration for users.
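
A rough Python sketch of such a block-based patch scheme follows.  All
names here (`BlockPatchStore`, `BLOCK_SIZE`, the method names) are
hypothetical, and the sketch leaves out what a real store would need for
crash safety, such as writing the patch to a journal before applying it.

```python
import os

BLOCK_SIZE = 16  # tiny block size, chosen only to keep the example readable


class BlockPatchStore:
    """Hypothetical store: writes collect in a patch, applied at commit."""

    def __init__(self, path):
        mode = "r+b" if os.path.exists(path) else "w+b"
        self.f = open(path, mode)
        self.pending = {}  # block number -> replacement block bytes

    def read_block(self, n: int) -> bytes:
        """Return the pending replacement if any, else the block on disk."""
        if n in self.pending:
            return self.pending[n]
        self.f.seek(n * BLOCK_SIZE)
        return self.f.read(BLOCK_SIZE).ljust(BLOCK_SIZE, b"\0")

    def write_block(self, n: int, data: bytes):
        """Record a block replacement without touching the file."""
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        self.pending[n] = data.ljust(BLOCK_SIZE, b"\0")

    def abort(self):
        """Drop the patch; the file was never modified."""
        self.pending.clear()

    def commit(self):
        """Apply all pending block replacements in one batch."""
        for n in sorted(self.pending):
            self.f.seek(n * BLOCK_SIZE)
            self.f.write(self.pending[n])
        self.f.flush()
        os.fsync(self.f.fileno())
        self.pending.clear()
```

Reads consult the patch first, so a transaction sees its own writes, while
the file itself stays unchanged until commit.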

> The BerkeleyDB storage for ZODB is still experimental, but it's
> intended more for server-side environments where there's a sysadmin on
> hand to properly manage the database.

That argues in favor of avoiding it or improving it for Chandler.  I'd
think that however Chandler does persistent storage, it could improve the
open source code available to others, once changes are published.

I'd think removing BerkeleyDB's administration needs could be hard if
it's a consequence of the way it's architected.  Trying to make it smooth
for end users might be too time consuming when other things have higher
priority.  Chandler developers probably don't have time to delve
exclusively into BerkeleyDB internals for very long (say many weeks).

Maybe someone who already knows BerkeleyDB really well could outline a
plan for making it less sysadmin dependent, in case the approach already
looks clear from experience.

--David McCusker