[Dev] transactions per second
Jeremy Hylton
jeremy at alum.mit.edu
Mon Nov 11 15:32:04 PST 2002
>>>>> "DM" == David McCusker <david at treedragon.com> writes:
>> There's only one kind of transaction in ZODB, and it has the
>> standard ACID properties.
DM> If D stands for durability, and flush() is not durable, then why
DM> do you say ACID when the D is not supported? (Just curious. It
DM> takes me a while to calibrate other folks' standards.)
Because I'm rushing a bit too much when I respond to these emails :-(.
FileStorage calls flush() during the prepare() phase of the two-phase
commit (2PC). It writes a status byte and calls fsync() when the 2PC
commit occurs. So there is the possibility that a storage crashes
after voting yes on the transaction but before the transaction
actually completes. I think that's good enough for D, though we're
still not doing "careful writes."
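To make the sequence above concrete, here is a minimal sketch of a
prepare/finish pair on an append-only file. It is not FileStorage's
code: the ToyLog class, the record layout (one status byte, a four-byte
length, then the payload), and the status values b"P" and b"C" are all
assumptions made up for illustration.

```python
import os
import struct

# Hypothetical on-disk layout: status byte + 4-byte payload length + payload.
PENDING, COMMITTED = b"P", b"C"
HEADER = struct.Struct(">cI")


class ToyLog:
    """Toy append-only log; an illustration, not FileStorage."""

    def __init__(self, path):
        open(path, "ab").close()        # create the file if it is missing
        self.f = open(path, "r+b")      # r+b so the status byte can be rewritten
        self.pending_pos = None

    def prepare(self, payload):
        # Vote phase: append the record marked as pending, then flush().
        # flush() only hands the bytes to the operating system.
        self.f.seek(0, os.SEEK_END)
        self.pending_pos = self.f.tell()
        self.f.write(HEADER.pack(PENDING, len(payload)) + payload)
        self.f.flush()

    def finish(self):
        # Commit phase: overwrite the status byte, then flush() and fsync()
        # so the operating system is asked to write its buffers to disk.
        self.f.seek(self.pending_pos)
        self.f.write(COMMITTED)
        self.f.flush()
        os.fsync(self.f.fileno())
        self.pending_pos = None

    def close(self):
        self.f.close()
```

A crash between prepare() and finish() leaves a record whose status
byte still says pending, which is the window described above.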
DM> Jeremy Hylton wrote:
>> A ZODB transaction with FileStorage calls flush() before
>> reporting that a transaction has committed.
DM> Which moves the buffering from the app to the operating system,
DM> and doesn't guarantee content is on disk. I know you know this,
DM> so this is for other readers.
To be sure that everyone is clear on these points: flush() only hands
the data to the operating system, while the fsync() call copies any
file buffers in memory to disk. I believe it also updates the file
metadata.
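The difference between the two calls can be sketched in a few lines of
Python. The helper name durable_append is hypothetical; file.flush()
and os.fsync() are the standard library calls.

```python
import os


def durable_append(path, data):
    """Append data to path and return the resulting file size."""
    with open(path, "ab") as f:
        f.write(data)          # bytes may still sit in the Python-level buffer
        f.flush()              # buffer is handed to the operating system
        os.fsync(f.fileno())   # OS is asked to write its buffers to the device
    return os.path.getsize(path)
```

Stopping after flush() would leave the data in operating system
buffers, which is the case David describes.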
>> That's the durability guarantee you get.
DM> When FileStorage works by appending, and the transactions are
DM> marked so the starts and ends are findable, this is an adequate
DM> amount of durability since one can ignore incomplete
DM> transactions. (Incidentally, this is exactly how I had the Mork
DM> text db format work for Netscape.)
Right. In each failure mode, the interrupted transaction is left on
disk without a valid status byte. In log-structured storage like this,
it is always safe to ignore an incomplete transaction at the end of
the file.
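A recovery scan along these lines might look like the sketch below. It
assumes a hypothetical record layout (one status byte, a four-byte
length, then the payload) and made-up status values; FileStorage's real
format differs.

```python
import struct

# Hypothetical layout: status byte + 4-byte payload length + payload.
PENDING, COMMITTED = b"P", b"C"
HEADER = struct.Struct(">cI")


def committed_payloads(data):
    """Scan a log image and return (payloads, valid_length).

    Scanning stops at the first record that is not marked committed or
    that runs past the end of the data; everything from valid_length on
    can be ignored or truncated.
    """
    payloads = []
    pos = 0
    while pos + HEADER.size <= len(data):
        status, length = HEADER.unpack_from(data, pos)
        end = pos + HEADER.size + length
        if status != COMMITTED or end > len(data):
            break                       # incomplete tail: ignore it
        payloads.append(data[pos + HEADER.size:end])
        pos = end
    return payloads, pos
```

Because records are only ever appended, an incomplete record can only
appear at the tail, so stopping at the first bad record loses nothing
that was committed.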
A careful write, as I understand it, would provide better durability
by writing to two different files on different media to guard against
low-level failures.
Jeremy