Open Source Applications Foundation

[Dev] (db policy) transparent persistence

Andy Dent Wed, 27 Nov 2002 09:13:36 +0800


At 12:38 -0800 26/11/2002, David McCusker wrote:
>What does transparent object persistence mean?
>
>It means persistent content is mainly the attributes of some
>collection of objects, or of subobjects recursively embedded in other
>top level persistent objects.  Interacting with this content involves
>using normal Python objects.  Database updates merely involve modifying
>these objects and then committing the database.

I'm a big fan of at least some degree of transparency - the OOFILE API
evolved through discussion on a design list similar to this one (back in
1992-93) and has as much transparency as could be achieved in cross-platform
C++ pre-STL. (Note: to understand the following code examples you need to
know that OOFILE compiles as pure C++ and doesn't use any database
preprocessor; it uses operator overloading for member access and search
specification instead.)

That is, class declarations use base classes to indicate persistence of the
class and its members (you can have non-persistent member variables in a
persistent class).


class StudentT : public dbTable {
   OOFILE_METHODS(StudentT)
   // the only macro used; it declares some inline functions for easier
   // programming and is not essential

   dbChar  Name;
   dbShort  Level;
};

StudentT students;  // a global in this example; it would probably be a
                    // member of a Document

The base constructors of these fields automatically hook the fields up to
the class and you could also declare a persistent class by dynamically
creating fields. The key to this flexibility is the two stage schema -
creating a persistent object defines an OOFILE Schema. Until you create
another dbTable object, all dbField descendants are linked to the
last-created dbTable. It's a simple idiom that allows for array definition
and other dynamic fields, not just a compile-time schema.

When a database connection is opened with that object, the schema is sealed
and translated to an appropriate dBase, c-tree Plus or other schema.
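
For Python readers, the two-stage idiom above might be sketched roughly as follows. This is not OOFILE code: Table, Field and Connection are hypothetical stand-ins for dbTable, dbField descendants and a database connection, and the "translation" step just returns a plain dict.

```python
class Table:
    """Hypothetical stand-in for dbTable: collects fields until sealed."""

    _last_created = None  # the table that newly created fields attach to

    def __init__(self, name):
        self.name = name
        self.fields = []
        self.sealed = False
        Table._last_created = self


class Field:
    """Hypothetical stand-in for a dbField descendant."""

    def __init__(self, name, kind):
        table = Table._last_created
        if table is None:
            raise RuntimeError("no table has been created yet")
        if table.sealed:
            raise RuntimeError(f"schema for {table.name!r} is sealed")
        self.name = name
        self.kind = kind
        table.fields.append(self)  # hook the field up to its table


class Connection:
    """Hypothetical connection: opening it seals each table's schema."""

    def __init__(self, *tables):
        self.tables = tables

    def open(self):
        for table in self.tables:
            table.sealed = True
        # A real backend would translate the sealed schema to its own format.
        return {t.name: [(f.name, f.kind) for f in t.fields] for t in self.tables}


students = Table("students")
Field("Name", "char")
Field("Level", "short")
# Fields can also be created dynamically, e.g. an array of scores.
for i in range(3):
    Field(f"Score{i}", "short")

schema = Connection(students).open()
```

After open() the table refuses new fields, which is the "sealed" stage.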

The big issue I had with the ODMG C++ database binding, and the assumptions
I've seen since in other object-relational layers, is that everything
happens via navigation from some known root.

(Hence gross hacks like using raw SQL to find the root, reifying that into
an object then starting to use OO paradigms.)

From my experience and the mailing list feedback (a combined several
hundred years of database app development experience), a key issue with
databases linked to GUIs or business processing is that you do NOT care
about named objects most of the time. Almost all logic is collection
oriented and it is only important to have a Current Object (e.g. the record
displayed in a form).

Iterators are the politically correct C++ way to do this (less-experienced
C++ programmers seem to find them hard).

OOFILE combines the collection and iterator (you can clone a collection to
act as a separate iterator). This is so the simple case of using the
database API is as easy as dBase or any other trivial database API - a key
design goal.

Populating a new instance in the collection:
students.newRecord();
students.Name = "Andy Dent";
students.Level = 19;
students.saveRecord();

Searching the collection restricts the current set in the collection and
sets the current record to the first one found:
students.search(students.Name=="Andy Dent");
or, if Name were the current sort key:

students["Andy Dent"];
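
A rough Python analogue of the combined collection/iterator, as a sketch only: the class and method names (Students, FieldRef, Condition, new_record and so on) are invented for illustration and are not a proposed API. Comparing a field with "==" builds a search clause rather than returning a bool, which is the same operator-overloading trick the C++ uses.

```python
import copy


class Condition:
    """A search clause built by comparing a field with a value."""

    def __init__(self, field_name, value):
        self.field_name = field_name
        self.value = value

    def matches(self, record):
        return record[self.field_name] == self.value


class FieldRef:
    """Names a field; `==` builds a Condition instead of returning a bool."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return Condition(self.name, value)


class Students:
    """Hypothetical collection that is also its own iterator."""

    Name = FieldRef("Name")
    Level = FieldRef("Level")

    def __init__(self):
        self._records = []    # every saved record
        self._selection = []  # the current set, as indexes into _records
        self._pos = 0         # position of the current record in the set
        self._pending = None  # record being filled in after new_record()

    def new_record(self):
        self._pending = {"Name": "", "Level": 0}
        return self._pending

    def save_record(self):
        self._records.append(self._pending)
        self._pending = None
        self.select_all()

    def select_all(self):
        self._selection = list(range(len(self._records)))
        self._pos = 0

    def search(self, condition):
        """Restrict the current set and move to the first record found."""
        self._selection = [
            i for i, r in enumerate(self._records) if condition.matches(r)
        ]
        self._pos = 0
        return len(self._selection)

    @property
    def current(self):
        return self._records[self._selection[self._pos]]

    def next(self):
        self._pos += 1
        return self._pos < len(self._selection)

    def clone(self):
        """Return a separate iterator over the same underlying records."""
        other = copy.copy(self)  # shallow copy: _records stays shared
        other._selection = list(self._selection)
        return other


students = Students()
for name, level in [("Andy Dent", 19), ("Another Student", 12)]:
    rec = students.new_record()
    rec["Name"] = name
    rec["Level"] = level
    students.save_record()

found = students.search(Students.Name == "Andy Dent")

cursor = students.clone()  # independent selection and position
cursor.select_all()
```

The clone can walk all records while the original stays on its search result.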


Searching is a major issue to consider with "transparent" database
operations - how do searches work?

Possibilities:
1) searches use only existing language idioms, no exceptions.
This can lead to awkward code in user apps and confuses the living
daylights out of people who DO have some database background.

2) searchable persistent collections have additional interfaces.
Means persistent collections may not always be used identically to
non-persistent collections but yields unambiguous user code. As we are
adding our own methods they can be idiomatic for both language users and
people with varying database backgrounds.

3) additional interfaces are added to non-persistent collections as well.
More work but means you can write user apps not caring if collections are
persistent or not.
Python is probably flexible enough to do this. C++ isn't, so the workaround
in OOFILE apps was to use a RAM-based temporary database rather than using
any native arrays. All user code was written to the OOFILE API. This also
had the benefit that an "array" using OOFILE could be connected to a form,
report or graph with no extra translation.
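
To make possibility 3 concrete, here is a minimal Python sketch in which one search interface is mixed into both an in-memory list and a stored collection. Everything here is an assumption for illustration: the Searchable mixin, the students table layout and the choice of SQLite as the store are not part of any proposal so far.

```python
import sqlite3


class Searchable:
    """Mixin: subclasses only need to be iterable over dict-like records."""

    def search(self, **criteria):
        # Naive scan; a real persistent collection would push this
        # down to the database engine instead.
        return [r for r in self if all(r[k] == v for k, v in criteria.items())]


class MemoryCollection(Searchable, list):
    """Non-persistent collection: a plain list plus the search interface."""


class SqliteCollection(Searchable):
    """Stored collection: rows live in an SQLite table."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS students (Name TEXT, Level INTEGER)"
        )

    def append(self, record):
        self.conn.execute(
            "INSERT INTO students (Name, Level) VALUES (?, ?)",
            (record["Name"], record["Level"]),
        )

    def __iter__(self):
        rows = self.conn.execute("SELECT Name, Level FROM students")
        return ({"Name": name, "Level": level} for name, level in rows)


def names_at_level(collection, level):
    """User code that does not care whether the collection is persistent."""
    return [r["Name"] for r in collection.search(Level=level)]


in_memory = MemoryCollection()
stored = SqliteCollection(sqlite3.connect(":memory:"))
for target in (in_memory, stored):
    target.append({"Name": "Andy Dent", "Level": 19})
    target.append({"Name": "Another Student", "Level": 12})
```

names_at_level() runs unchanged against either collection, which is the point of the third option.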


>However, it should also be possible to read and write the database through
>alternative means
I think this is a very distracting goal.

Differentiate the engine from the tools that people might want.
How about a Python app allowing ad-hoc database manipulation?

>The root of a database should have an
>app object, and from this it should be possible to navigate to any
>object in the database by using the APIs of objects traversed down
>from an app object.
I don't think this is an "app" object. What about opening multiple database
"documents" in the same application?


>Objects appear in memory on demand, pulled from some serialized form
>in the database into memory as a Python object, in response to calls
>to access the objects.
Yes!

One of the key performance issues I hated about using 4th Dimension was
that related data was instantiated as soon as you hit a record. OOFILE has
lazy instantiation of related and BLOB data based on member access.
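
In Python, lazy instantiation on member access could be sketched with a descriptor along these lines. The Store class and its load() method are hypothetical; the counter is only there to show that nothing is read until an attribute is touched.

```python
class Store:
    """Hypothetical backing store that counts how often it is read."""

    def __init__(self, data):
        self._data = data
        self.loads = 0

    def load(self, key, field):
        self.loads += 1
        return self._data[key][field]


class LazyField:
    """Non-data descriptor: loads the value on first access, then caches it."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = obj._store.load(obj._key, self.name)
        # Caching under the same name in the instance dict means later
        # lookups find the value directly and never reach this descriptor.
        obj.__dict__[self.name] = value
        return value


class Student:
    name = LazyField()
    photo = LazyField()  # stands in for BLOB data

    def __init__(self, store, key):
        self._store = store
        self._key = key


store = Store({1: {"name": "Andy Dent", "photo": b"large binary data"}})
student = Student(store, 1)   # nothing loaded yet
first = student.name          # triggers one load
second = student.name         # served from the instance cache
```

The photo is never read because nothing asked for it.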

>(Side note: for performance optimization purposes in some contexts,
>app code might want to reduce display latency for users by "pre-
>touching" content before it gets displayed
This should be threadable. (Probably obvious, just thought I'd point it out :-)
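
A small sketch of pre-touching on a background thread. It assumes the underlying store can be read safely from a worker thread, which is not a given for any particular engine; SlowRecord and pre_touch are made-up names.

```python
import threading


class SlowRecord:
    """Hypothetical record whose body is only loaded when first accessed."""

    def __init__(self):
        self.loaded = False

    @property
    def body(self):
        self.loaded = True  # a real object would fetch from the database here
        return "content"


def pre_touch(objects, attribute_names):
    """Access each attribute once so that lazy loading happens now."""
    for obj in objects:
        for name in attribute_names:
            getattr(obj, name)


records = [SlowRecord() for _ in range(3)]
worker = threading.Thread(target=pre_touch, args=(records, ["body"]), daemon=True)
worker.start()
# The GUI thread would carry on here; join() only makes the example deterministic.
worker.join()
```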

>Is there a pattern for making this kind of thing work?
>
>Yes, a lot of this style of database plugin system can be implemented
>easily if the interfaces involved use a "factory" pattern.
The entire OOFILE database API is Open Source so any lessons learned there
are available :-)
<http://www.oofile.com.au/oofile_ref/html/classdb_connect.html>

It combines Factory and Bridge patterns - you specify a dbConnect object,
which is the Factory for the appropriate backend objects. For example, a
dbConnect_dbase
will create OOF_dbaseBackend objects to implement database access.
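
The shape of that combination, sketched in Python with invented names (Backend, RamBackend, Connection, RamConnection and Table are not the OOFILE classes): the connection is the Factory, and the table talks to whichever backend it was handed, which is the Bridge.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Implementation side of the Bridge: how records are stored."""

    @abstractmethod
    def save(self, table_name, record):
        ...

    @abstractmethod
    def load_all(self, table_name):
        ...


class RamBackend(Backend):
    """Keeps everything in a dict; a file-based backend would go alongside."""

    def __init__(self):
        self._tables = {}

    def save(self, table_name, record):
        self._tables.setdefault(table_name, []).append(dict(record))

    def load_all(self, table_name):
        return list(self._tables.get(table_name, []))


class Connection(ABC):
    """Factory: each connection type knows which backend to create."""

    @abstractmethod
    def make_backend(self):
        ...


class RamConnection(Connection):
    def make_backend(self):
        return RamBackend()


class Table:
    """Abstraction side of the Bridge: the API that user code sees."""

    def __init__(self, name, connection):
        self.name = name
        self._backend = connection.make_backend()

    def add(self, **record):
        self._backend.save(self.name, record)

    def all(self):
        return self._backend.load_all(self.name)


students = Table("students", RamConnection())
students.add(Name="Andy Dent", Level=19)
```

Swapping the storage engine then means passing a different connection; Table and the user code stay the same.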

>All the artistry is in trying to make the interface
>elegant and clear.

I hope we iterate the API on this list considerably before it's
implemented, to make sure that it's idiomatic to more experienced Python
programmers than David or I whilst being understandable for beginners.
-- 
Andy Dent BSc  MACS  AACM   http://www.oofile.com.au/
OOFILE - Database, Reports, Graphs, GUI for c++ on Mac, Unix & Windows
PP2MFC - PowerPlant->MFC portability