[Dev] ZODB is not a Storage Technology (Re: other formats )
john at osafoundation.org
Sat Nov 9 09:09:49 PST 2002
Thanks for the very nice overview. Makes lots of sense and it will help
me as we jump into the code. I did have one question; see below.
Mike C. Fletcher wrote:
> Okay, here's a quick overview of the guts, presented as an outline.
> I've assumed you'll be reading the summaries with the source-code open
> in another window to see what's being described, so I've not gone into
> any details as to how anything is done.
> The objects likely best to concentrate on for understanding the
> low-level guts are the FileStorage, the Connection, and the
> _defaulttransaction. I've given you quick summaries of what you'll
> find in most of the files in the ZODB4 CVS packages (ZODB, Transaction
> and Persistence). The zLOG project is just logging facilities, nothing
> really close to the core of the ZODB. The indentation is primarily
> showing usage patterns (for instance, fsindex is really only used by
> FileStorage AFAIK), though I've also used it to group items which can
> be considered sub-categories of the superior item.
> I'll work on details tomorrow if I can get some more time;
> questions/directions in which you'd like more coverage are quite welcome.
> BTW: I've copied the ZODB-dev list so that others can correct anything
> I've messed up, or add anything that they consider critical to
> understanding the system.
> Storage (BaseStorage sub-classes):
> """Storages are responsible for maintaining object state records.
> They can also maintain undo (transaction) and versional records.
> """Default ZODB storage
> The FileStorage is a linear aggregate of all transactions,
> and transactions are aggregates of all changed objects.
> Transactions are added at the end of the file, with
> later changes to a particular object conceptually overwriting
> the earlier changes.
> Versions (personal views of the dbase) are just transactions
> which are declared to have version information. The versions
> form linked lists (they point to the last transaction in the
> Storages which have undo support (such as FileStorage) have
> a pack method which basically copies all objects forward until
> there is a single current set, then discards anything not in
> the current set.
Does it copy "in place" so that if you pulled the plug while in pack
your file is corrupted?
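To make the question concrete: a pack that writes the current set to a new file and only swaps it in at the very end would survive a pulled plug, whereas an in-place rewrite would not. The sketch below shows the first variant on a toy text log. It is not FileStorage's format or code, and `append_record`, `current_state` and `pack` are made-up names for illustration.

```python
import os
import tempfile


def append_record(path, oid, data):
    """Append one 'oid<TAB>data' line; later lines supersede earlier ones."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{oid}\t{data}\n")


def current_state(path):
    """Scan the whole log; the last record written for each oid wins."""
    state = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            oid, data = line.rstrip("\n").split("\t", 1)
            state[oid] = data
    return state


def pack(path):
    """Copy the current set to a sibling temp file, then swap it in."""
    state = current_state(path)
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".pack")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for oid, data in state.items():
            f.write(f"{oid}\t{data}\n")
        f.flush()
        os.fsync(f.fileno())  # make sure the copy is on disk before the swap
    # The original file is untouched until this rename, which is atomic
    # on POSIX filesystems.
    os.replace(tmp, path)
```

If the process dies before `os.replace`, the original log is still complete and only a stray temp file is left behind.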
> """Index from persistent OID -> file position index
> The fsIndex provides an optimised index to individual objects
> within the data file of the FileStorage. The index can
> be rebuilt merely by scanning through the entire datafile.
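A rough sketch of the idea described here, an OID -> file position map that can always be regenerated by one pass over the data file. The record layout (8-byte oid, 4-byte length, payload) is invented for the example and is not the FileStorage on-disk format; `write_record`, `rebuild_index` and `load` are hypothetical helpers.

```python
import io
import struct

# Hypothetical record header: 8-byte oid, 4-byte payload length.
HEADER = struct.Struct(">QI")


def write_record(f, oid, payload):
    """Append a record and return the file position where it starts."""
    f.seek(0, io.SEEK_END)
    pos = f.tell()
    f.write(HEADER.pack(oid, len(payload)) + payload)
    return pos


def rebuild_index(f):
    """Scan from the start; a later record for an oid replaces the earlier position."""
    index = {}
    f.seek(0)
    while True:
        pos = f.tell()
        header = f.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of file, or a truncated trailing record
        oid, length = HEADER.unpack(header)
        f.seek(length, io.SEEK_CUR)  # skip the payload without reading it
        index[oid] = pos
    return index


def load(f, index, oid):
    """Fetch the current payload for an oid using the index."""
    f.seek(index[oid])
    _, length = HEADER.unpack(f.read(HEADER.size))
    return f.read(length)
```

Because the index is derived entirely from the file, losing it costs a full scan but no data.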
> """Storage for transaction save-points"""
> """Simple storage based on GDBM/AnyDBM"""
> """A demonstration of a volatile in-memory storage"""
> utility mechanisms:
> """TimeStamp C extension type"""
> """Pickle-like storage (cPickle plus some custom code)"""
> """finds object refs in pickle strings"""
> """(small) wrapper to do cross-platform locking of
> fsdump, fsrecover:
> """Debugging/utility code"""
> """Object-space in which application objects live
> Uses an in-memory object-cache (see below)
> Provides object-access (get root dict, get object by oid)
> though normal access is via getting root and then
> drilling down through the object references.
> Other than this, almost the entire class is support
> for the transaction and persistence mechanisms.
> """Mix-in providing XML import/export"""
> """Manages multiple Connections to a storage
> Provides a pool of connections
> Provides mechanisms for applying functions
> to all object caches in all connections
> Tracks object modifications for versions? (not
> sure about this, I've never used versions)
> Provides most of the primitives on which Connection and
> Transaction build the transaction mechanism. (tpc_*)
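For readers new to the tpc_* idea, here is a minimal two-phase commit sketch. Only the `tpc_` prefix comes from the overview; `ToyStorage`, `commit`, and the simplified method signatures are assumptions made for the example and should not be read as the ZODB4 interfaces.

```python
class ToyStorage:
    """In-memory stand-in for a storage; not a BaseStorage subclass."""

    def __init__(self, fail_vote=False):
        self.committed = {}
        self._pending = None
        self._fail_vote = fail_vote

    def tpc_begin(self):
        self._pending = {}

    def store(self, oid, data):
        self._pending[oid] = data

    def tpc_vote(self):
        # Phase one: the last chance to refuse the transaction.
        if self._fail_vote:
            raise RuntimeError("storage votes no")

    def tpc_finish(self):
        # Phase two: make the pending changes visible.
        self.committed.update(self._pending)
        self._pending = None

    def tpc_abort(self):
        self._pending = None


def commit(storages, changes):
    """Drive every storage through both phases, or abort all of them."""
    try:
        for s in storages:
            s.tpc_begin()
            for oid, data in changes.items():
                s.store(oid, data)
        for s in storages:
            s.tpc_vote()
    except Exception:
        for s in storages:
            s.tpc_abort()
        return False
    for s in storages:
        s.tpc_finish()
    return True
```

The point of the split is that nothing becomes visible until every participant has voted yes.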
> """The default transaction machinery
> Combined with the connection object, this is most
> of the transaction-driving code in the system. It
> is fairly tightly coupled to the Persistent module
> (e.g. it assumes _p_jar and the like on all registered
> """Data-storage for the current transaction"""
> """Entry point for transaction APIs"""
> """Python 2.2.2 implementation of IPersistent
> Basically, this is a pure-Python version of the cPersistence
> code that really gets used (I'm not sure if there's code
> anywhere to fall back to using this version if the cPersistence
> code isn't compiled).
> This is quite useful for figuring out what's going on,
> but (having used it for a few months), it seemed too slow
> to be of use in a real-world system (too much time spent in
> """Provides optimised IPersistent implementation"""
> """Provides an in-memory object cache to reduce reloads from disk
> Basically this is a high-level cache; it has a target size
> and a few methods implementing garbage collection. The
> DB calls the connection's GC methods, then the connection calls
> its cache's GC methods.
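A toy illustration of a cache with a target size and a GC call that is delegated from DB to connection to cache. The eviction policy here (least recently used) is my assumption, and `ToyCache`, `ToyConnection`, `ToyDB` and `collect` are invented names, not the real classes or methods.

```python
from collections import OrderedDict


class ToyCache:
    """Keeps loaded objects in memory; collect() trims back to target_size."""

    def __init__(self, target_size):
        self.target_size = target_size
        self._data = OrderedDict()

    def get(self, oid, loader):
        if oid in self._data:
            self._data.move_to_end(oid)  # mark as most recently used
        else:
            self._data[oid] = loader(oid)  # cache miss: reload "from disk"
        return self._data[oid]

    def collect(self):
        """Evict least recently used entries; return how many were dropped."""
        evicted = 0
        while len(self._data) > self.target_size:
            self._data.popitem(last=False)
            evicted += 1
        return evicted

    def __len__(self):
        return len(self._data)


class ToyConnection:
    def __init__(self, target_size):
        self.cache = ToyCache(target_size)

    def collect(self):
        return self.cache.collect()


class ToyDB:
    def __init__(self, connections):
        self.connections = connections

    def collect(self):
        # The DB asks each connection, which in turn asks its own cache.
        return sum(conn.collect() for conn in self.connections)
```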
> particular data-types:
> PersistentDict, PersistentList:
> """Dictionary and List types which track their changes
> Basically allow you to use them as lists/dicts without
> needing to spend code tracking changes yourself. These
> items, however, re-store the entire list/dict on each
> save, so see BTree for large dicts.
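A small sketch of the change-tracking idea, and of why the whole container is re-stored on each save. `TrackingDict`, its `changed` flag, and `save` are hypothetical stand-ins, not the PersistentDict implementation, and only a few mutating methods are covered.

```python
import pickle


class TrackingDict(dict):
    """A dict that records whether it was mutated since the last save."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.changed = False

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.changed = True

    def __delitem__(self, key):
        super().__delitem__(key)
        self.changed = True

    def update(self, *args, **kwargs):
        super().update(*args, **kwargs)
        self.changed = True


def save(d, out):
    """Append one pickle of the WHOLE mapping if it changed; return True if written."""
    if not d.changed:
        return False
    out.append(pickle.dumps(dict(d)))
    d.changed = False
    return True
```

Changing a single key still writes every entry, which is the cost that per-node persistence in a BTree is meant to avoid.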
> """BTree implementation using individually persistent nodes
> Allows large dictionaries to be stored so that only a small
> sub-set of the dictionary needs to be re-stored on
> Function, Module, Package:
> """References to these types w/ importing
> Never used these myself (I think they're new),
> they appear to store name-references, or actual
> code objects in the case of functions.
> John Anderson wrote:
>> I'd be interested in an overview of the guts. Start with a big
>> picture, then move into some details and describe what's in which
>> files. I'd like to eventually learn the code base so I can decide how
>> to improve it.
>> Mike C. Fletcher wrote:
>>> At what level would you like the description (I've been using ZODB
>>> for years now, and have just released a calendaring application on
>>> it). I assume you understand the basics, so are you looking for
>>> analysis of where/how it starts to fail/how to update it, or what
>>> the actual machinery inside is doing for any given action?
>>> I'll push some time around and try to get a description posted this
>>> weekend if you can tell me which area you need.