Open Source Applications Foundation

[Dev] data model relationship types in the XML schema format

Brian Douglas Skinner Fri, 08 Aug 2003 17:55:35 -0700


Hey Katie,

I have some other half-baked thoughts for the XML schema format. This is 
about the idea of delete policies and copy policies.

Background:
    * Right now, Andi's building blocks implement a few different delete 
policies ('remove', 'cascade:count', and 'cascade:none').  And I think 
Andi's plan is to also have the idea of copy policies, to support 
policies like deep copy, and shallow copy (reference only), and no copy.
    * And as it stands now, my model.py thing uses three delete policies 
('weak reference', 'strong reference', and 'cascade'), which I think may 
map directly to Andi's three policies.  And I also have two boolean 
flags for 'transitive' and 'symmetric', inspired by the RDF/OWL notions.

Too Many Combinations:
    * The combination of delete policies and copy policies leads to a 
great deal of flexibility in expressiveness.  If an attribute can have 
any of 3 delete policies, and any of 3 copy policies, then there are 9 
different policy combinations for that side of the relationship.  And 
each relationship has two endpoints, so that's 81 policy combinations. 
Plus if you factor in the 'transitive' and 'symmetric' flags on each 
endpoint, that's 36 options per endpoint, so there are 36 x 36 = 1,296 
relationship types.  Yikes!
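
A quick way to check that arithmetic is to enumerate the combinations. 
This is only a sketch: the delete policy names are the model.py ones, 
the copy policy names are shorthand for the three mentioned above, and 
it assumes the two boolean flags are set independently on each endpoint 
(the reading that yields 1,296).

```python
from itertools import product

DELETE_POLICIES = ("weak reference", "strong reference", "cascade")
COPY_POLICIES = ("deep", "shallow", "none")  # shorthand names, assumed
FLAG_VALUES = (False, True)

# One endpoint: a delete policy plus a copy policy.
endpoint = list(product(DELETE_POLICIES, COPY_POLICIES))

# A relationship has two endpoints.
relationship = list(product(endpoint, endpoint))

# Add the 'transitive' and 'symmetric' flags to each endpoint.
flagged_endpoint = list(product(endpoint, FLAG_VALUES, FLAG_VALUES))
flagged_relationship = list(product(flagged_endpoint, flagged_endpoint))

print(len(endpoint), len(relationship), len(flagged_relationship))
# prints: 9 81 1296
```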

Suggestion:
    * For the person authoring a schema, I don't think we want all that 
flexibility.  I think we want a different abstraction, geared toward the 
way the schema author is looking at the relationship.  From the schema 
author's point of view, I think there are only a handful of relationship 
types.  I think the three important types are: 'ownership', 'peer', and 
'hierarchy'. And there are another few that might come up once in a blue 
moon: 'dependency', 'symmetric', and maybe 'equivalence set'.  Each of 
these relationship types implies certain delete policies and copy policies.
    * So my proposal is to set up the XML schema format so that the 
schema author never specifies delete policy, copy policy, transitive, or 
symmetric.  Instead, they just specify one of a few relationship types. 
And then the XSLT transform thing can read that and generate the 
Andi-format schema files that have the corresponding delete policies and 
copy policies.
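
To make the expansion step concrete, here is a rough sketch in Python 
standing in for the XSLT transform. Everything about the XML here is 
made up for illustration: the element name 'Attribute', the attribute 
names 'relationship', 'deletePolicy' and 'copyPolicy', and the value 
'conditional-shallow' are assumptions, not the real schema format or 
the Andi format. Only the ownership pair from the Details section is 
filled in.

```python
import xml.etree.ElementTree as ET

# Hypothetical author-facing input: one 'relationship' value per attribute.
AUTHOR_XML = """
<Schema>
  <Attribute id="attachments" relationship="owns"/>
  <Attribute id="message" relationship="is owned by"/>
</Schema>
"""

# Expansion table for the ownership pair only. The policy names follow
# this thread; the dictionary keys are invented for this sketch.
EXPANSION = {
    "owns": {"deletePolicy": "cascade", "copyPolicy": "deep"},
    "is owned by": {
        "deletePolicy": "weak reference",
        "copyPolicy": "conditional-shallow",
    },
}


def expand(author_xml):
    """Return a tree where each relationship type is replaced by policies."""
    root = ET.fromstring(author_xml)
    for attr in root.iter("Attribute"):
        kind = attr.attrib.pop("relationship")
        attr.attrib.update(EXPANSION[kind])
    return root


expanded = expand(AUTHOR_XML)
print(ET.tostring(expanded, encoding="unicode"))
```

The point of the design is that the schema author only ever types 
'owns' or 'is owned by', and the lookup table is the single place where 
policies are decided.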

Details:
    * Here are some proposed relationship types and the policies they 
would map to.
    * I know this all looks frighteningly complicated, but I think it's 
actually a vast simplification over what we'd get if we asked every schema 
author to set their own policies for delete and copy.  Really, I think 
there are only three relationships here that a schema author wants to 
think about: 'ownership', 'peer', and 'hierarchy'.

ownership:
    * A owns Z
    * examples
       * a Person has a Name
       * a Calendar Event has a Reminder
       * an E-mail Message has some Attachments
    * A.myZ
       * A.myZ <-- marked as 'owns' in the XML schema file
       * A.myZ --> delete policy of 'cascade'
       * A.myZ --> copy policy of 'deep'
    * Z.myA
       * Z.myA <-- marked as 'is owned by' in the XML schema file
       * Z.myA <-- cardinality must be 'single'
       * Z.myA --> delete policy of 'weak reference' (aka 'remove')
       * Z.myA --> copy policy of 'leave null if A.myZ is cardinality 
single, otherwise do a shallow copy'
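
The 'cardinality must be single' rule on the owned side is something a 
schema checker could enforce. A minimal sketch, assuming each attribute 
definition is available as a plain dict with hypothetical keys 
'relationship' and 'cardinality':

```python
def check_ownership(owns_attr, owned_by_attr):
    """Return a list of problems with an ownership pair (empty if fine)."""
    problems = []
    if owns_attr.get("relationship") != "owns":
        problems.append("first attribute must be marked 'owns'")
    if owned_by_attr.get("relationship") != "is owned by":
        problems.append("second attribute must be marked 'is owned by'")
    # An owned item has exactly one owner, so Z.myA must be single-valued.
    if owned_by_attr.get("cardinality") != "single":
        problems.append("'is owned by' side must have cardinality 'single'")
    return problems


# An E-mail Message owns its Attachments; each Attachment has one Message.
attachments = {"relationship": "owns", "cardinality": "list"}
message = {"relationship": "is owned by", "cardinality": "single"}
print(check_ownership(attachments, message))
```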

peer:
    * A and Z are peers
    * examples
       * a Person appears in a Photo
       * a Person is a member of a Group
       * a Department includes many Employees
    * A.myZ
       * A.myZ <-- marked as 'peer' in the XML schema file
       * A.myZ --> delete policy of 'strong reference' (aka 'cascade:count')
       * A.myZ --> copy policy of 'leave null if Z.myA is cardinality 
single, otherwise do a shallow copy'
    * Z.myA -- same as A.myZ
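
The conditional copy policy used above boils down to one test on the 
inverse attribute's cardinality. A sketch, with a hypothetical function 
name, using the three cardinalities from the feature list quoted at the 
bottom of this mail (single, set, list). Presumably the reasoning is 
that if the other side can only point at one item, a copy cannot share 
that reference, so it is left null.

```python
CARDINALITIES = ("single", "set", "list")


def resolve_copy_policy(inverse_cardinality):
    """Pick 'leave null' or 'shallow' from the inverse side's cardinality."""
    if inverse_cardinality not in CARDINALITIES:
        raise ValueError("unknown cardinality: %r" % (inverse_cardinality,))
    if inverse_cardinality == "single":
        # The other item can refer to only one of us, so the copy gets nothing.
        return "leave null"
    # The other item can refer to many of us, so the copy shares the reference.
    return "shallow"


# A Person can be in many Groups and a Group has many members.
print(resolve_copy_policy("set"))     # prints: shallow
# An Employee belongs to a single Department.
print(resolve_copy_policy("single"))  # prints: leave null
```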

hierarchy:
    * A is a parent of Z
    * examples
       * a Department contains other Departments
       * a Group contains other Groups
       * a Kind is a superkind of another Kind
       * an Attribute is a superattribute of another Attribute
    * A.myZ
       * A.myZ <-- marked as 'children' in the XML schema file
       * A.myZ --> delete policy of 'strong reference' (aka 'cascade:count')
       * A.myZ --> copy policy of 'leave null if Z.myA is cardinality 
single, otherwise do a shallow copy'
       * A.myZ --> marked as 'transitive' for RDF/OWL
    * Z.myA
       * Z.myA <-- marked as 'parent' in the XML schema file
       * Z.myA --> delete policy of 'strong reference' (aka 'cascade:count')
       * Z.myA --> copy policy of 'shallow'
       * Z.myA --> marked as 'transitive' for RDF/OWL
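
What the 'transitive' marking buys us is the ability to infer indirect 
parents from direct ones. A small sketch of that inference, assuming 
the direct 'parent' links are held in a plain dict (the function name 
and data layout are made up for illustration):

```python
def ancestors(item, parent_of):
    """All items reachable by following 'parent' links from item."""
    seen = []
    stack = list(parent_of.get(item, ()))
    while stack:
        current = stack.pop()
        if current not in seen:  # also guards against accidental cycles
            seen.append(current)
            stack.extend(parent_of.get(current, ()))
    return seen


# Direct links only: Payroll is inside Finance, Finance is inside Company.
parent_of = {"Payroll": ["Finance"], "Finance": ["Company"]}
print(ancestors("Payroll", parent_of))  # prints: ['Finance', 'Company']
```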

dependency:
    * Z depends on A
    * example: a Place is in a Timezone
    * A.myZ
       * A.myZ <-- marked as 'is used by' in the XML schema file
       * A.myZ --> delete policy of 'strong reference' (aka 'cascade:count')
       * A.myZ --> copy policy of 'leave null'
    * Z.myA
       * Z.myA <-- marked as 'depends on' in the XML schema file
       * Z.myA --> delete policy of 'weak reference' (aka 'remove')
       * Z.myA --> copy policy of 'shallow'

symmetric:
    * A1 has a symmetric relationship with A2
    * examples
       * Person B is the spouse of Person C, and vice versa
       * Person D has a list of friends that includes Person E, and vice 
versa
    * A.myA
       * A.myA <-- marked as 'peer' in the XML schema file
       * A.myA <-- has 'myA' set as the 'inverse attribute'
       * A.myA --> marked as 'symmetric' for RDF/OWL

equivalence set:
    * A1 is in an equivalence set with A2
    * examples
       * in RDF, two Items are marked owl:sameIndividualAs
       * in RDF, two properties are marked owl:equivalentProperty
       * in RDF, two classes are marked owl:equivalentClass
    * A.myA
       * A.myA <-- marked as 'equivalent' in the XML schema file
       * A.myA <-- has 'myA' set as the 'inverse attribute'
       * A.myA --> marked as 'symmetric' for RDF/OWL
       * A.myA --> marked as 'transitive' for RDF/OWL
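
Symmetric plus transitive together mean that pairwise 'equivalent' 
statements collapse into groups. One way to compute those groups is a 
union-find; this is just an illustrative sketch with a made-up function 
name, not something model.py does today.

```python
def equivalence_sets(pairs):
    """Group items into sets, given pairwise 'equivalent' statements."""
    parent = {}

    def find(x):
        # Follow links up to the representative, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


# a~b and b~c put a, b, c in one set; d~e forms a second set.
for group in equivalence_sets([("a", "b"), ("b", "c"), ("d", "e")]):
    print(sorted(group))
```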




Katie Capps Parlante wrote:
> Thanks Brian, this is very helpful. I'll let you know as I run into 
> questions...
> 
> kt
> 
> Brian Douglas Skinner wrote:
> 
>> Hey Katie,
>>
>> Here are a bunch of feature requests for the XML schema format.
>>
>> This is all from the point of view of trying to have schema files that 
>> can capture the information that's currently stored in the model.py 
>> script.  Some of this stuff is more important than other stuff, so we 
>> can do triage if need be.  Let me know if you want to talk through any 
>> of this.
>>
>> - Brian
>>
>> =================================
>>
>>    * be able to define
>>       * attribute definitions
>>       * kinds
>>       * types
>>       * aliases -- type or kind aliases
>>       * domain schemas
>>       * items
>>
>>    * attribute definitions
>>       * flavors
>>          * global attribute definitions
>>          * non-global attribute definitions (within a kind)
>>          * reference definitions
>>       * attributes
>>          * ID -- identifier name
>>          * name -- display name
>>          * description (text)
>>          * super-attribute (attribute definition, maybe from another 
>> file)
>>          * hidden (boolean -- system attribute vs. user-visible 
>> attribute)
>>          * cardinality (ENUM: single, set, list)
>>          * type (can be multi-valued -- set of (types or kinds or 
>> aliases))
>>          * default value
>>          * required (bool)
>>          * derivation rule (text)
>>          * equivalent attributes (set of attribute definitions)
>>          * additional ad hoc fields in the XML file
>>       * additional attributes for reference definitions
>>          * inverse attribute (reference definition)
>>          * transitive (bool)
>>          * symmetric (bool)
>>          * policies -- might want to collapse these into a higher level 
>> primitive, with values like "wholly-owned"
>>             * delete policy
>>             * copy policy
>>
>>    * kinds can have
>>       * attributes
>>          * ID -- identifier name
>>          * name -- display name
>>          * description (text)
>>          * super-kinds (kind, possibly defined in some other file)
>>          * typical attributes (set of attribute definitions)
>>          * source for display name (attribute definition)
>>          * hidden (bool)
>>          * abstract (bool)
>>          * additional ad hoc fields in the XML file
>>
>>    * types
>>       * flavors
>>          * new simple types
>>          * new compound types (made up of fields with names and types)
>>       * attributes
>>          * ID -- identifier name
>>          * name -- display name
>>          * description (text)
>>          * additional ad hoc fields in the XML file
>>
>>    * aliases
>>       * attributes
>>          * ID -- identifier name
>>          * name -- display name
>>          * description (text)
>>          * alias for (set of types or kinds or aliases)
>>          * additional ad hoc fields in the XML file
>>
>>    * domain schemas
>>       * attributes
>>          * ID -- identifier name
>>          * name -- display name
>>          * description (text)
>>          * definitions (sets of kinds, types, aliases, attribute 
>> definitions, items)
>>          * version info
>>          * dependency info (this schema is built on top of...)
>>          * additional ad hoc fields in the XML file
>>
>>    * items
>>       * features
>>          * items of a kind
>>          * ad hoc items
>>          * items with ad hoc attributes
>>          * items with attributes with list values
>>          * items with attributes with set values
>>          * items with attributes with compound type values
>>          * references to items in other XML files
>>          * values for compound types
>>
>>
> 
>