[Chandler-dev] [Sum] The Great Architecture Discussion of 2007
Phillip J. Eby
pje at telecommunity.com
Tue Oct 9 16:38:04 PDT 2007
At 04:12 PM 10/9/2007 -0700, Andi Vajda wrote:
>On Tue, 9 Oct 2007, Phillip J. Eby wrote:
>>1. application-level code meddling in storage-level details
>Could you give some examples ?
Any place where the application creates collections or works with
indexes in order to get better performance than "naive" iteration
or queries would give.
>>2. lack of sufficient domain-specific query APIs
>Again, please give an example of what you'd like ?
This isn't a repository problem - it's a domain-layer problem. If
the places where we're doing #1 were at least consolidated to single
points of reference, #1 wouldn't be so bad.
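To illustrate what "consolidated to single points of reference" could
look like, here is a minimal Python sketch. The EventQueries class, its
between() method, and the sample events are all made up for
illustration; none of this is Chandler's actual API. The point is only
that callers ask a domain-level question while the sorted index stays
private to one class.

```python
from bisect import bisect_left, bisect_right
from datetime import date


class EventQueries:
    """Hypothetical domain-layer facade: the only code that knows about the index."""

    def __init__(self, events):
        # Storage-level detail, kept private: events sorted by date,
        # plus a parallel list of dates to bisect on.
        self._by_date = sorted(events, key=lambda e: e[0])
        self._dates = [e[0] for e in self._by_date]

    def between(self, start, end):
        """Return events with start <= date <= end, in date order."""
        lo = bisect_left(self._dates, start)
        hi = bisect_right(self._dates, end)
        return self._by_date[lo:hi]


# Made-up sample data: (date, title) pairs.
events = [
    (date(2007, 10, 9), "architecture discussion"),
    (date(2007, 10, 1), "sprint planning"),
    (date(2007, 11, 2), "release review"),
]
queries = EventQueries(events)
october = queries.between(date(2007, 10, 1), date(2007, 10, 31))
```

If the sorted list were later swapped for some other index structure,
only EventQueries would need to change, not its callers.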
>>3. no indirection between the application's logical schema and its
>>physical storage schema
>Seems incorrect. I can change the physical storage schema (core
>schema or even repo format) without affecting app code. Or am I
>misunderstanding something ?
Sorry, I am using the relational meaning of logical and physical. A
logical schema does not include indexes or views, while a physical
schema does. I'm also extending this to refer to the lack of
distinction between our preferred form of data (encapsulated
objects) and the best division of that data from a performance point
of view.
The core schema and repo format aren't a factor in this, as they're
at an even lower level than the "physical" schema I'm talking
about. In the repository today, the "physical" schema consists of
whatever sets/collections and indexes you create, which is rather
analogous to creating indexes or materialized views in an RDBMS, only
without the same transparency. In an RDBMS, if you add an index or a
materialized view, it doesn't change how you retrieve your data: it
just goes faster. So you can do application specific tuning without
changing your application.
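The RDBMS behavior described above can be sketched with Python's
built-in sqlite3 module standing in for a generic RDBMS. The table,
index name, and rows are made up for illustration. The query text is
identical before and after the index is added; only the plan the
database reports changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, kind TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO item (kind, title) VALUES (?, ?)",
    [("task", "write summary"), ("event", "design review"), ("task", "reply to Andi")],
)

# The application's query: never edited, whether or not an index exists.
QUERY = "SELECT title FROM item WHERE kind = ? ORDER BY title"


def titles(kind):
    return [row[0] for row in conn.execute(QUERY, (kind,))]


def plan(kind):
    # The last column of each EXPLAIN QUERY PLAN row is a human-readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (kind,)).fetchall()
    return " ".join(row[-1] for row in rows)


before = titles("task")
plan_before = plan("task")

# "Physical schema" tuning: add an index, touch no application code.
conn.execute("CREATE INDEX item_kind ON item (kind)")

after = titles("task")
plan_after = plan("task")
```

Printing plan_before and plan_after shows the planner switching from a
table scan to the new index, while before and after hold the same rows.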
>>4. implementing a generic database inside another generic database
>That was the goal, originally.
Not quite; having a generic database was the goal, not that it be
implemented *inside* another generic database. It is one thing to
have a BerkeleyDB persistence layer driven by the application's
dynamic schema, and another one altogether to implement a database on
top of a fixed BerkeleyDB schema.
For comparison purposes, consider OpenLDAP: it is a generic,
hierarchical, networked database implemented atop
BerkeleyDB. However, instead of having a fixed schema for storing
values, items, etc., in BerkeleyDB, it is dynamically extended as
attribute types and indexes are added. So the database is
*represented* in BerkeleyDB, rather than being implemented *inside* BerkeleyDB.
The same distinction applies to, say, MySQL, which implements each
table using separate BerkeleyDB data structures, rather than creating
a generic "rows" data structure.
So, when I say it is implemented "inside" another database, I mean it
in the sense that the schema of the repository is not reflected in
the schema of its back-end storage, and thus cannot take full
advantage of the back-end's features for performance.
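A rough sketch of the "inside" versus "represented in" distinction,
again using sqlite3 as a stand-in back end (it is not BerkeleyDB, and
the table and column names are invented). Style A stores everything in
one fixed, generic table; style B lets the application's schema become
the back end's schema, so a native index can serve the query directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Style A, "implemented inside": one fixed, generic table holds every
# attribute of every item, so the back end never sees the real schema.
conn.execute("CREATE TABLE generic_values (item_id INTEGER, attribute TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO generic_values VALUES (?, ?, ?)",
    [(1, "title", "lunch"), (1, "location", "cafe"),
     (2, "title", "standup"), (2, "location", "office")],
)

# Style B, "represented in": one table per kind, one column per
# attribute, and an index declared in the back end's own terms.
conn.execute("CREATE TABLE event (item_id INTEGER PRIMARY KEY, title TEXT, location TEXT)")
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?)",
    [(1, "lunch", "cafe"), (2, "standup", "office")],
)
conn.execute("CREATE INDEX event_location ON event (location)")

# Same question in both styles: titles of events located at the office.
generic_titles = [r[0] for r in conn.execute(
    "SELECT t.value FROM generic_values AS t "
    "JOIN generic_values AS l ON l.item_id = t.item_id "
    "WHERE t.attribute = 'title' AND l.attribute = 'location' AND l.value = ?",
    ("office",),
)]
native_titles = [r[0] for r in conn.execute(
    "SELECT title FROM event WHERE location = ?", ("office",)
)]
```

Both queries return the same answer, but style A needs a self-join over
the generic table for every attribute involved, while style B is a
single indexed lookup.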
>Not to have a hard compiled app against a hard compiled relational
>schema. If Chandler is to become a hard compiled application with a
>static schema, where all data types have to be determined in
>advance, then of course, the chandler repository is overkill and can
>be replaced by some specifically optimized, domain-specific, schema.
I'm not sure what you mean by "hard compiled". Nothing stops us from
having a relational schema that's extensible by parcels, or from
doing so dynamically. In truth, the schemas we use with the
repository today are no less "hard compiled". If we at some future
time allow user-defined fields, there are still ways to represent
them within such a relatively-static schema, or to simply modify the
schema at runtime.
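As a minimal sketch of "simply modify the schema at runtime", assuming
a SQL back end (sqlite3 here) and a made-up add_field() hook that a
parcel might call; neither the hook nor the table exists in Chandler:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO note (title) VALUES ('existing note')")


def add_field(table, column, sql_type):
    """Hypothetical parcel hook: extend a table while the app is running.

    Identifiers cannot be bound as SQL parameters, so they are checked here.
    """
    for name in (table, column, sql_type):
        if not name.isidentifier():
            raise ValueError("unsafe identifier: %r" % (name,))
    conn.execute("ALTER TABLE %s ADD COLUMN %s %s" % (table, column, sql_type))


def columns(table):
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)]


# A parcel adds a field; existing rows stay valid and new rows can use it.
add_field("note", "priority", "INTEGER")
conn.execute("UPDATE note SET priority = 2 WHERE title = 'existing note'")
conn.execute("INSERT INTO note (title, priority) VALUES ('new note', 1)")
rows = conn.execute("SELECT title, priority FROM note ORDER BY id").fetchall()
```

The existing row survives the change untouched, with the new column
starting out as NULL until it is updated.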
>>5. implementing generic indexes inside of generic indexes
>How so ? What are you thinking about ?
The skip list system is the main one I have in mind, but if I
correctly understand how versions and values are stored, then those
would be included too.