[Dev] hacking file storage to debug RAP protocol
rys at treedragon.com
Wed Apr 9 16:27:10 PDT 2003
Jeremy Hylton wrote:
> A thought: You could write a throw-away RAP server in Python and use
> that for testing ZODB/RAP integration. It would probably be easier than
> having to mess with the C API and Python/C integration. It's probably
> also good to have a RAP implementation in Python for testing.
I have a couple different answers to this, depending on whether the server
is on the near or far side of the protocol.
If on the far side, then Gary Larson has to do it. Sorry, I couldn't resist
the joke. Just ignore this paragraph.
In order to test and debug what happens in the protocol, we would want to go
through the protocol, in which case the python server would need to be on the
far side of the protocol, called by the server code in response to parsing
requests received on the wire.
We had proposed writing a throwaway server this way, by putting a ZODB
file storage in the server, which would require the C API and Python/C
integration. In fact, it seems any Python server on the far side would need
this kind of integration.
However, Lou has provided an interim (possibly the long-term) database
implementation in the server based on Berkeley DB. So it is not necessary
to have a throwaway Python server on the far side, when we already have something.
If we don't want to test the protocol, then we could mimic a RAP server
on the near side in Python or whatever. This doesn't seem to help right
now, but it does come close to something I'm interested in. When we do not
actually have a remote repository, we would sometimes like to short-circuit
and access a local repository more directly.
I had in mind accessing an item store interface in Python (which might
or might not be implemented in Python) for a local repository in such
cases. This would make the most sense if the same item store interface
was one that was usable for an item store inside a server repository.
Then developers of item stores could use them either as local
or as remote repositories, with the protocol or not.
It's possible you don't mean either of those things. You might want a
full server implementation in Python, which parses the protocol and
does everything else as well. That would be throwaway work for Lou
rather than for me, because he is doing the protocol.
>>In the short-term, we would like an interim implementation of a storage which gives
>>us feedback about how the RAP API is actually used at runtime, so we can debug RAP.
> I assume at some point you can turn this process around and use
> FileStorage as the backup to verify that RAP is working correctly.
Something like that would be a good idea, but I think that would require
my hacking the connection layer to read and write two different storages
at the same time, to see if they match. But I don't think that would work
because I can't find all the places in ZODB where the storage is actually
accessed. Oh wait, now I have a better idea.
When I actually understand a storage well enough -- this might actually be
easier for you to do -- it would be possible to write a storage which
takes two other storages as inputs, which are compared to one another.
This would be a storage which delegates to two other storages at the same
time, in parallel. Everything done to one storage would also be done to
the other, so they ought to match. (Assuming all exceptions were caught
so this did not interrupt full duplication.)
Every read from a storage would read from both, and then you would compare
what was read to see if they were identical in both stores. This would
be an encouraging sign. (It would not prove the absence of oddities
which you do not ever attempt to read. For example, content which should
no longer exist in the database can only be checked by trying to read it,
to prove it is no longer there. Just not reading it at all shows nothing.)
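A minimal sketch of the comparing storage described above. All names here (DictStorage, ComparingStorage, store, load, mismatches) are hypothetical stand-ins, not the real ZODB storage API, which has more methods and different signatures. The first storage is treated as the reference whose results are returned to the caller.

```python
class DictStorage:
    """Toy in-memory storage; a stand-in for FileStorage or a RAP-backed storage."""

    def __init__(self):
        self._data = {}

    def store(self, oid, data):
        self._data[oid] = data

    def load(self, oid):
        return self._data[oid]  # raises KeyError if oid is absent


class ComparingStorage:
    """Delegates every call to two storages and records any disagreement."""

    def __init__(self, first, second):
        self.first = first
        self.second = second
        self.mismatches = []  # (operation, oid, first outcome, second outcome)

    def _call(self, storage, name, *args):
        # Catch exceptions so a failure in one storage does not stop
        # the same operation from being applied to the other.
        try:
            return ("ok", getattr(storage, name)(*args)), None
        except Exception as exc:
            return ("error", type(exc).__name__), exc

    def _both(self, name, oid, *args):
        a, a_exc = self._call(self.first, name, oid, *args)
        b, _ = self._call(self.second, name, oid, *args)
        if a != b:
            self.mismatches.append((name, oid, a, b))
        if a_exc is not None:
            raise a_exc  # behave like the reference storage
        return a[1]

    def store(self, oid, data):
        self._both("store", oid, data)

    def load(self, oid):
        return self._both("load", oid)
```

Reading an oid that should be gone is compared like any other read: if both storages raise the same exception type, nothing is recorded, and if only one raises, the difference lands in mismatches.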
I once performed this kind of test for a randomized file interface unit
test (in old IronDoc code) to prove that a more sophisticated file
implementation (which did a lot of demand paged buffering) was equivalent
to a simple file modulo performance differences.
That would be a really cool unit test for you to write for ZODB,
which would support new storage implementers, because they could then
run ZODB against both file storage and a new storage implementation, to
verify they behaved equivalently.
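A rough sketch of such a randomized equivalence test, under the assumption of a tiny store/delete/load interface. SimpleStore, LogStore and run_equivalence_test are made-up names for illustration; neither class is a real ZODB storage. The same seeded random operations go to both stores, and the final sweep reads every key, including deleted ones, since content that should be gone can only be checked by trying to read it.

```python
import random


class SimpleStore:
    """Reference implementation: a plain dict."""

    def __init__(self):
        self._data = {}

    def store(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    def load(self, key):
        return self._data.get(key)  # None means "not present"


class LogStore:
    """Hypothetical second implementation: an append-only log scanned on load."""

    def __init__(self):
        self._log = []  # (key, value) pairs; value None marks a deletion

    def store(self, key, value):
        self._log.append((key, value))

    def delete(self, key):
        self._log.append((key, None))

    def load(self, key):
        for k, v in reversed(self._log):
            if k == key:
                return v
        return None


def run_equivalence_test(reference, candidate, steps=1000, seed=0):
    """Apply identical random operations to both stores.

    Returns a tuple describing the first divergence, or None if none is seen.
    """
    rng = random.Random(seed)
    keys = ["k%d" % i for i in range(8)]  # small key space forces overwrites
    for step in range(steps):
        op = rng.choice(["store", "delete", "load"])
        key = rng.choice(keys)
        if op == "store":
            value = bytes(rng.randrange(256) for _ in range(4))
            reference.store(key, value)
            candidate.store(key, value)
        elif op == "delete":
            reference.delete(key)
            candidate.delete(key)
        else:
            expected, actual = reference.load(key), candidate.load(key)
            if expected != actual:
                return (step, key, expected, actual)
    # Final sweep: read every key from both stores, deleted ones included.
    for key in keys:
        expected, actual = reference.load(key), candidate.load(key)
        if expected != actual:
            return ("final", key, expected, actual)
    return None
```

Because the generator is seeded, a divergence found once can be replayed exactly while debugging the candidate implementation.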
(I wish I had time to do this. This is yet another example of how I
could easily spend a lot of time being just a zodb developer, which
I am not very good at because I don't understand the code much,
when I must also handle a lot of other things that are equally time
consuming. Kind of like having multiple full-time jobs.)
>>The current RAP client interface does not support transactions, so
>>nothing will be done with these in the file storage implementation.
> It's probably a good idea to keep a log of stores() that occur during a
> two-phase commit. If the 2PC ends up aborting, you want to avoid
> writing any data to RAP. Or you can write data optimistically and keep
> a undo log, but there are some messy concurrency issues to deal with in
> that case.
We should expose the planned transaction interface in the client API, and
then either defer to the underlying database on the server side, or do as
you suggest on the server side when the underlying database does not have
transactions. Currently, Berkeley DB has transactions, so we should expose
transactions in the API.
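For the case where the underlying database has no transactions, the log-of-stores approach suggested in the quoted text might look roughly like this. It is only a sketch: BufferedCommit, DictBackend and the begin/store/abort/finish method names are invented for illustration and are not part of RAP or ZODB.

```python
class DictBackend:
    """Toy non-transactional backend with a single store() method."""

    def __init__(self):
        self.data = {}

    def store(self, oid, data):
        self.data[oid] = data


class BufferedCommit:
    """Holds store() calls in a log until the two-phase commit finishes."""

    def __init__(self, backend):
        self.backend = backend  # any object with a store(oid, data) method
        self._log = None        # None means no transaction is open

    def begin(self):
        if self._log is not None:
            raise RuntimeError("transaction already open")
        self._log = []

    def store(self, oid, data):
        if self._log is None:
            raise RuntimeError("store() outside a transaction")
        self._log.append((oid, data))  # nothing reaches the backend yet

    def abort(self):
        self._log = None  # discard the log; the backend was never touched

    def finish(self):
        if self._log is None:
            raise RuntimeError("no transaction to finish")
        log, self._log = self._log, None
        for oid, data in log:  # replay the log in the original order
            self.backend.store(oid, data)
```

Note that finish() here is not atomic: if the backend fails partway through the replay, some stores will have been written. Handling that would need the undo log mentioned in the quote, or real transactions in the database.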
> For reading and writing, do you store the object's revision number
> (_p_serial attribute)? I expect it's important to store that as an
> attribute of the object in the database so that you can detect
I was not planning to do so, but I could, by putting this into some
attribute other than the typical item attributes. That's
probably a good idea if it would reveal conflicts, and so expose any
bugs that exist in the first throwaway implementation.
> I'm assuming that RAP stores only a single revision of an object, but
> I'm not sure about that.
I keep mentioning to folks that RAP should store multiple revisions,
but this topic does not seem very interesting when I bring it up.
I don't know if this means we will get to it later, as expected,
or whether it means the feature will be resisted later. So far we
don't have revision numbers in the protocol interface.
Our original requirements include the need to support item histories
in the database. However, anything the database does could be modeled
in an application, though presumably less efficiently. You'd want a
database to do everything which might be done more efficiently.
The relevant section on this topic appears in the requirements at:
OsafDbItemHistories?: Through some combination of database backend
features and front-end application usage, we want a Chandler repository to be able to
reveal to users all changes made to items over time. A user should be able to look at
an item, and ask for the history of changes to that item, and see changes in the
granularity of individual attribute alterations. A database might or might not
support versioning, or even indexes on versions (OsafDbVersionIndexing?), to support
the implementation of item histories. But versioning is not a requirement per se,
because this is both more complex and ambiguous. The user experience only requires
being able to view the history of changes to an item. This does not require a
full-blown concept of versioning. Using a zodb storage, item histories might be
represented by chaining together all versions of an object, so that older versions
are reachable and can be visited. Using an RDF storage, one might annotate all
triples with a birth time and a death time, and this would allow the selection of
extant triples at any given moment in time. However neither of these two approaches
is required -- they merely illustrate different choices can be made. The issue of
whether object changes are recorded in attribute granularity or object granularity
is a separate topic on the history granularity (OsafDbHistoryGranularity?).
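The birth time / death time idea in the quoted requirement can be illustrated with a small sketch. TimedTripleStore and its method names are hypothetical, and plain integers stand in for timestamps; this is not an API of any RDF library or of Chandler.

```python
class TimedTripleStore:
    """Triples annotated with a birth time and an optional death time."""

    def __init__(self):
        # Each row is [subject, predicate, object, birth, death];
        # death is None while the triple is still extant.
        self._rows = []

    def assert_triple(self, s, p, o, now):
        self._rows.append([s, p, o, now, None])

    def retract_triple(self, s, p, o, now):
        for row in self._rows:
            if row[:3] == [s, p, o] and row[4] is None:
                row[4] = now  # record the death time; never delete the row

    def extant(self, at):
        """Triples alive at time `at` (birth <= at < death)."""
        return [
            tuple(row[:3])
            for row in self._rows
            if row[3] <= at and (row[4] is None or at < row[4])
        ]

    def history(self, subject):
        """Attribute-level change events for one item, in time order."""
        events = []
        for s, p, o, birth, death in self._rows:
            if s != subject:
                continue
            events.append((birth, "born", p, o))
            if death is not None:
                events.append((death, "died", p, o))
        return sorted(events)
```

Changing an attribute is a retract followed by an assert at the same time, so history() can show individual attribute alterations without any fuller notion of versioning.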