[Dev] transactions per second

David McCusker david at treedragon.com
Mon Nov 11 13:03:36 PST 2002


To get the highest number of transactions per second, it might be
necessary to define exactly what's required and then tweak the db
usage so costs are not terribly high per transaction.

The term transaction can mean a number of things, where the most
important one is usually an atomic change, such that all of the change
in the transaction happens or none of it happens.  But some APIs
might also have a transaction that implies persistence, which is much
more expensive when disk latency is so horrific.

(In practice, it can be hard to even ensure bytes are really on
disk, rather than buffered somewhere on the way by the operating
system.  For example, on Unix it might be necessary to sync the
file system more than once.  And the cost of doing that is dramatic
if one intends to do it very often.)
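
A minimal sketch of the difference, in Python. The function names
are made up for illustration. Even the synced version only asks the
operating system to write through; a drive with its own write cache
may still be holding the bytes.

```python
import os

def write_buffered(path, data):
    """Append data; it may sit in OS buffers for a while after this returns."""
    with open(path, "ab") as f:
        f.write(data)

def write_synced(path, data):
    """Append data and ask the OS to push it to the storage device."""
    with open(path, "ab") as f:
        f.write(data)          # goes into Python's own buffer
        f.flush()              # hands the bytes to the operating system
        os.fsync(f.fileno())   # asks the OS to write them to the device
```

Both calls leave the same bytes in the file; they differ only in
what is promised if the machine loses power right afterward, and in
how long the call takes.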

I'm assuming transacted and undoable user actions need not always
be guaranteed persistent, where persistent means that power failure
immediately after a commit will not stop the commit from still being
a success after the machine reboots and recovers.  If that assumption
holds, then non-persistent transactions can be done at a much higher
rate than persistent ones.

But you'd want to periodically make sure all committed transactions
are atomically flushed to persistent form, too, without the user
needing to do anything explicit when auto-save is the usage model.

The transaction interface supported by storage might expose atomic
versus persistent transactions separately, so an app can ask for
the least strict version that suffices after each operation, for
better performance.
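
One way such an interface could look, as a sketch only. The class
Store and its methods are hypothetical, not taken from any existing
system. Atomic here means only that an aborted transaction never
reaches the file; surviving a crash in the middle of a write would
need framing in the file as well.

```python
import os

class Store:
    """Hypothetical store: every commit is atomic, persistence is opt-in."""

    def __init__(self, path):
        self._file = open(path, "ab")
        self._pending = []     # changes made by the open transaction
        self.sync_count = 0    # how many times we paid for a disk sync

    def put(self, record):
        self._pending.append(record)

    def abort(self):
        # None of the pending changes ever reach the file.
        self._pending.clear()

    def commit(self, persistent=False):
        # The whole transaction is handed over in a single write call.
        self._file.write(b"".join(self._pending))
        self._pending.clear()
        if persistent:
            self.sync()

    def sync(self):
        # The expensive part: flush our buffer, then ask the OS to sync.
        self._file.flush()
        os.fsync(self._file.fileno())
        self.sync_count += 1

    def close(self):
        self._file.close()
```

An app would call commit() after ordinary user actions and
commit(persistent=True), or sync() on its own, only at the points
where it really needs the stronger guarantee.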

It's probably not practical to get more than several (four or five?)
disk syncs a second, and it could be worse than that.
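
The real figure depends heavily on the hardware and file system, so
it is worth measuring rather than guessing.  A rough sketch, with a
made-up function name, that times a handful of small synced writes:

```python
import os
import time

def syncs_per_second(path, count=20):
    """Time `count` one-byte synced appends and return the observed rate."""
    with open(path, "ab") as f:
        start = time.perf_counter()
        for _ in range(count):
            f.write(b"x")
            f.flush()
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    # Guard against a zero interval on a very coarse clock.
    return count / max(elapsed, 1e-9)
```

The result is only an upper bound on persistent commits per second
for that one file on that one machine.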

The transaction path for non-persistent (i.e. non-synced) commits
could be streamlined to buffer more effectively.  I'd have to study
the way any system did things for a while to see how to get this
result, though. I'm just speaking from first principles here, and
not from understanding a particular system being used.

If transacted changes are being written to disk in a suitable way
(say by appending to a file containing demarcated transactions),
then they can persist incrementally, as the file system manages
to get them on disk, and the last persistent transaction would be
the latest one written completely.
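
A sketch of what demarcated transactions in an append-only file
might look like.  The frame layout (a length and a CRC32 in front of
each payload) is an assumption chosen for illustration, not a format
from any particular system.  Recovery keeps every transaction up to
the first one that is incomplete or damaged.

```python
import struct
import zlib

# Assumed frame header: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")

def append_transaction(f, payload):
    """Append one framed transaction to an open binary file."""
    f.write(HEADER.pack(len(payload), zlib.crc32(payload)) + payload)

def recover(path):
    """Return the payloads of all transactions written completely, in order."""
    with open(path, "rb") as f:
        data = f.read()
    good = []
    pos = 0
    while pos + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, pos)
        start = pos + HEADER.size
        payload = data[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn or corrupt tail: stop at the last complete one
        good.append(payload)
        pos = start + length
    return good
```

After a crash, the last element returned by recover() is the last
persistent transaction; anything after it in the file is discarded.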

That would tend to limit the non-persisted commits to whatever had
been committed most recently and not yet written.  In some contexts
that might be better than forcing the app to make explicit decisions
about how often a forceful flush is needed.

Maybe an app could track idle time, or lack of computing activity
by the app, and use that as a heuristic for when to perform a strong
flush, which would be too costly to do all the time when a system is
busy.  That would avoid requiring a user to invoke a save operation
to ensure everything was really, really saved after user actions.
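
A sketch of that idle heuristic.  The class name, the two-second
threshold, and the idea of calling tick() from a timer are all
assumptions for illustration; a real app would pick its own signal
for "idle".

```python
import time

class IdleFlusher:
    """Run a costly flush only after commits have stopped for a while."""

    def __init__(self, flush, idle_seconds=2.0, clock=time.monotonic):
        self._flush = flush            # callable that does the strong flush
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._last_activity = clock()
        self._dirty = False            # any commits since the last flush?

    def note_commit(self):
        """Call after every non-persistent commit."""
        self._dirty = True
        self._last_activity = self._clock()

    def tick(self):
        """Call periodically; returns True if a flush was performed."""
        idle_for = self._clock() - self._last_activity
        if self._dirty and idle_for >= self._idle_seconds:
            self._flush()
            self._dirty = False
            return True
        return False
```

While the user keeps working, note_commit() keeps pushing the
deadline out and no sync cost is paid; once activity stops, the next
tick() makes everything persistent.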

--David McCusker



