[Dev] (db policy) transparent persistence
Michael McLay
Wed, 27 Nov 2002 01:39:01 -0500
On Tuesday 26 November 2002 03:38 pm, David McCusker wrote:
> Before I respond to earlier database threads, I should make progress
> on some disclosure and interface design fronts. This is a disclosure
> message, and a separate post will discuss import/export interfaces.
>
> This is a note on the plan to use transparent object persistence. I
> think it's a good idea, but more importantly, John Anderson intuits
> this is a good choice, and I think it fits Andy Hertzfeld's desire for
> high usability criteria in engineering judgment.

I agree with the choice of transparent object persistence, but you also
need to blend in a couple of additional features to facilitate
collaboration between multiple databases owned by multiple users.

For instance, suppose I'm scheduling a meeting with 10 people in 4
different organizations. One organization is off the net at the moment,
so several people cannot be scheduled while their system is down. I
tentatively schedule the other attendees and have a transaction pending
for the remainder. When their system comes back on line I find that the
calendar of one of the remaining people is full that day, so I need to
roll back the other scheduling transactions and select an alternative
date that meets everyone's schedule.

The process is complicated because one person I'm inviting to the
meeting is a higher-up in my organization. I don't have authorization
to reserve time on their schedule, so scheduling the meeting requires
first getting permission from this person to be on their schedule.
While I'm waiting for approval for the time slot I have tentatively
reserved time on the schedules of the other 9 attendees.

Even this simple scenario needs a sophisticated access control
mechanism [0], with features such as the ability to override the roles
for some activities for specific individuals. We also need transaction
control so that transactions that do not successfully run to completion
on all remote systems can be easily rolled back.
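
To make the tentative-reservation idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the Calendar class, schedule() and first_common_slot() are hypothetical names, each "remote" calendar is just an in-memory object, and the sketch ignores permissions, off-line servers and network failures.

```python
class Calendar:
    """Hypothetical in-memory stand-in for one attendee's remote calendar."""

    def __init__(self, owner, busy=()):
        self.owner = owner
        self.busy = set(busy)      # confirmed slots
        self.tentative = set()     # slots held while a meeting is pending

    def reserve(self, slot):
        """Tentatively hold a slot, or raise if the owner is not free."""
        if slot in self.busy or slot in self.tentative:
            raise ValueError(f"{self.owner} is not free at {slot}")
        self.tentative.add(slot)

    def release(self, slot):
        """Roll back a tentative hold."""
        self.tentative.discard(slot)

    def confirm(self, slot):
        """Turn a tentative hold into a confirmed booking."""
        self.tentative.remove(slot)
        self.busy.add(slot)


def schedule(calendars, slot):
    """Book slot on every calendar, or roll back all holds and return False."""
    held = []
    try:
        for cal in calendars:
            cal.reserve(slot)
            held.append(cal)
    except ValueError:
        # One attendee could not be scheduled: undo the holds made so far.
        for cal in held:
            cal.release(slot)
        return False
    for cal in held:
        cal.confirm(slot)
    return True


def first_common_slot(calendars, candidate_slots):
    """Try candidate slots in order and return the first that works for all."""
    for slot in candidate_slots:
        if schedule(calendars, slot):
            return slot
    return None
```

A real implementation would have to keep the holds open across server outages and approval delays, which is where the access control and transaction machinery comes in.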

This rollback capability has the pleasant side effect of providing a
handy undo feature for the application. (Have you tried using the undo
tabs through the Zope management interface? They've done a nice job of
making the prior transactions atomic.)

A tricky part of this architecture will be the server-to-server
interface for collaborative scheduling. This is going to be a
peer-to-peer protocol in which the person calling the meeting is
connecting with N external servers to find a common time for a meeting
with the N external persons. The algorithm for finding the time slot
will probably need rules based on the importance of attendance of the
individuals. Some people must be there or it is a show stopper. Others
may be invited out of courtesy and can be scheduled in spite of
preexisting conflicts.

The current Zope server does not support peer-to-peer transactions.
The ZSync "product" provides for server-to-server synchronization.
ZSync relies on XMLRPC calls between the servers, but this is not as
complex a problem as managing access control lists across systems,
conducting negotiations between servers, and then managing transaction
rollback on remote systems when an activity is canceled. Using XMLRPC
for ZSync may have been a bit of a stretch.

Building the infrastructure for schedule synchronization will be
facilitated by using a peer-to-peer framework. The BEEP [1] framework
would be an obvious choice. The RoadRunner C library is maturing to the
point where the framework is becoming usable, and there are Python
bindings built on top of RoadRunner. There is also Beepy, which is a
pure Python implementation of BEEP. The initial work on
interoperability testing is just getting underway.

On top of BEEP there are a couple of other layers of software that may
be of assistance as well. The "Application Exchange Core" (APEX)
message relaying service [2] provides a core architecture for
communications between applications.
There are access control services [3] and publish and subscribe
services [4] built on top of APEX. There is also an instant messaging
protocol built on top of APEX [5]. The BEEP developers have also mapped
iCAL onto the BEEP framework with the introduction of CAP [6]. Rich
Salz has written a good introductory article [7] on BEEP.

Ideally I'd like to see Mozilla integrate BEEP for HTTP [8] into the
client and Apache integrate it into the server. Other technology will
follow if Mozilla and Apache take the lead. It's time to move beyond
the TCP, single threaded straitjacket of HTTP.

> What does transparent object persistence mean?
>
> It means persistent content is mainly the attributes of some
> collection of objects, or of subobjects recursively embedded in other
> top level persistent objects. Interacting with this content involves
> using normal Python objects. Database updates merely involve modifying
> these objects and then committing the database.
>
> There need not be any overt operations on a database per se. However,
> it should also be possible to read and write the database through
> alternative means, so it's not necessary for every single change to
> actually manifest in memory as a Python object before it can exist.
>
> (Content can appear in a database by other means, but an app developer
> cannot prove it did not come from a Python object in memory first. If
> it gets shown to you as a Python object when you read it, how can you
> tell it was not originally a Python object when written? You can't.)
>
> However content gets in the database, it's possible to look at all of
> it as the attributes of Python objects that can be accessed by asking
> other Python objects for them. The root of a database should have an
> app object, and from this it should be possible to navigate to any
> object in the database by using the APIs of objects traversed down
> from an app object. (And we can have other top level objects besides
> the app, of course.)
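
As a rough illustration of the quoted description, here is a minimal sketch that uses the Python standard library's shelve module as a stand-in for the object store. This is an assumption made for the example only; it is not ZODB and not a proposed Chandler API. The "app" root key, the calendar attribute and the demo() function are hypothetical, and SimpleNamespace stands in for real application classes.

```python
import os
import shelve
import tempfile
from types import SimpleNamespace


def demo():
    """Store plain Python objects, reopen the store, and navigate from the root."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sketch-db")

        # First session: no overt database operations beyond the commit.
        # writeback=True keeps loaded objects cached so that in-place
        # changes to their attributes are written out on sync().
        with shelve.open(path, writeback=True) as db:
            app = SimpleNamespace(calendar={})
            db["app"] = app  # hypothetical root "app" object
            app.calendar["review"] = SimpleNamespace(title="review", attendees=[])
            db["app"].calendar["review"].attendees.append("David")
            db.sync()  # the "commit"

        # Second session: reach the content by traversing from the app object.
        with shelve.open(path) as db:
            event = db["app"].calendar["review"]
            return event.title, event.attendees
```

Calling demo() should hand back the title and attendee list that were stored in the first session, with the caller never touching anything but ordinary Python attributes.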

The Zope server can be accessed through HTTP, FTP, and XMLRPC, and if
the server has an embedded SQL database adaptor the SQL database can be
updated using the usual SQL network connection to the database. The
control loop for accessing Zope is built on top of the Python asyncore
module. I hope this list of capabilities is expanded to include a BEEP
interface to Zope in the near future.

And then there are those crazy Twisted guys :-) Barry Warsaw was
impressed by Twisted. I wouldn't be surprised if Twisted were to be
placed at the bottom layer of Zope someday.

[...]

> Does this mean the database must be an object database?
>
> No, not really, because the layer that serializes Python objects when
> they leave memory (or when they get flushed) can write to an API
> that doesn't assume much about how it gets stored. So the database
> can be a relational database, as long as it has some way (maybe not
> in the core RDB part) which will store attributes never previously
> described in the table schemas.

This is how the Zope adaptors to databases work. There are some
interesting issues raised in this architecture. This architecture for
hiding an SQL database also enables "smart queries" to be written.
(need to add a reference to "SQL with brains" here.)

> How are searches expressed?
>
> You can hide the way a database searches for content by asking a
> Python object in memory to create a new Python object that represents
> the results of a search. Then asking this result object for objects
> it contains will expose search results as Python objects in memory.
> (Sorry for repeating the word "object" so many times.)
>
> Abstract Chandler database API layers must partly be specified as the
> APIs of Python objects that answer queries like this, so folks who
> write database plugins can provide implementations of these Python
> objects that put the right face on however a database actually does
> things under the covers.
>
> Is there a pattern for making this kind of thing work?
>
> Yes, a lot of this style of database plugin system can be implemented
> easily if the interfaces involved use a "factory" pattern. Let's
> assume you've never heard of that before. What's a factory?
>
> A factory is an object which creates or gives access to other objects.
> Instead of creating objects out of the blue, or assuming you know
> where to go look for them, you instead go to a factory object and ask
> it for what you want. It gives you objects you request, but you
> don't know how the factory does memory management, or where it gets
> the objects that satisfy factory requests.
>
> So a database plugin will emphasize a factory based interface. The
> root of a database plugin might be an object that provides access to
> the factory objects which answer questions about the database. For
> example, to perform a search (which generates Python objects that
> satisfy a search) you can go to a factory object and ask for a suitable
> search factory, and then ask this factory your query, and it will
> return something that actually generates the result objects.

The words are slightly different, but the plugin factory based
interface you are describing is found throughout the Zope Wikis. In a
tutorial on creating "Products" Hathaway states:

    One of the defining characteristics of Zope products are that they
    can be added to Zope Folders. To allow your product to be created
    in this way you need to provide a creation form and a factory
    method. Factory methods are methods whose purpose is to create an
    instance of a class and place it in the ZODB.

A "Product" is a plugin. It can be added at different places in the
Zope hierarchy, so one plugin might be visible to a calendar, but not
to a contact list. The factory methods work as you described, creating
content and placing it in the database.

The discovery process for factories is being refined in Zope3. For
Zope2 the process was through acquisition. An interesting idea, but one
that is implicit rather than explicit. "Import This" warns against
the implicit, and Zope3 is backing away from acquisition.

Here [8] is one example which talks about how to adapt content for new
views of the content. The "Example" section about half way down the
page discusses the issues encountered when storing the contact data for
a contact database in a relational database while still providing a
transparent means of accessing this content through the Zope object
database interface. This specific example should be of immediate
interest to Chandler.

The Zope database adaptors have been heavily field tested. They provide
great flexibility for gluing Zope to existing databases within an
organization. Potential users will want to integrate Chandler with
existing databases, so you will eventually need to provide this same
glue layer for Chandler. For instance, the access control mechanism
might map directly to an LDAP server within a company for user
authentication.

> Sorry if this sounds tedious. It's something easy to implement by
> turning a crank. All the artistry is in trying to make the interface
> elegant and clear. It doesn't represent a technical engine problem.
>
> I'll stop this note here before I veer too far from the original intent
> of explaining the transparent object policy generally.

Please let me know if my running commentary about the parallels of Zope
and Chandler is more annoying than helpful. I see a strong pattern in
the requirements, and the bits and pieces of this pattern may not be
immediately obvious to you if you haven't been watching the evolution
of Python and Zope. Hopefully the references will be helpful in filling
in the gaps in your view of this pattern.
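
As a footnote to the factory discussion above, here is a rough sketch of a factory-based search interface where factories are found through explicit registration rather than acquisition. It is an illustration only: PluginRegistry, ListSearchFactory and ListSearch are hypothetical names, not Zope or Chandler APIs, and the "database" is just an in-memory list.

```python
class ListSearch:
    """Hypothetical result object: generates the items that satisfy a query."""

    def __init__(self, items, predicate):
        self._items = items
        self._predicate = predicate

    def __iter__(self):
        return (item for item in self._items if self._predicate(item))


class ListSearchFactory:
    """Hypothetical search factory backed by an in-memory list."""

    def __init__(self, items):
        self._items = items

    def query(self, predicate):
        # The caller never learns how or where the results are stored.
        return ListSearch(self._items, predicate)


class PluginRegistry:
    """Hypothetical plugin root: maps factory names to factory objects."""

    def __init__(self):
        self._factories = {}

    def register(self, name, factory):
        self._factories[name] = factory

    def get_factory(self, name):
        return self._factories[name]


# Usage: ask the plugin root for a search factory, then ask it the query.
registry = PluginRegistry()
registry.register("search", ListSearchFactory(["Ann", "Bob", "Andy"]))
results = registry.get_factory("search").query(lambda name: name.startswith("A"))
```

A plugin backed by a relational database could register a different factory under the same name, and callers iterating over the results would not need to change.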

[0] http://www.zope.org//Wikis/DevSite/Projects/ComponentArchitecture/SecurityFramework
[1] http://www.beepcore.org/beepcore/specsdocs.jsp and
    http://www.beepcore.org/beepcore/docs/rfc3080.jsp
[2] http://www.beepcore.org/beepcore/docs/apex-core.jsp
[3] http://www.beepcore.org/beepcore/docs/rfc3341.jsp
[4] http://www.beepcore.org/beepcore/docs/apex-pubsub.html
[5] http://www.beepcore.org/beepcore/project.jsp?projectid=11
[6] http://www.ietf.org/internet-drafts/draft-ietf-calsch-cap-09.txt
[7] http://www.xml.com/pub/a/2002/10/16/ends.html
[8] http://www.zope.org//Wikis/DevSite/Projects/ComponentArchitecture/AdaptContentForViews