[Dev] ref collection-based path proposal

Andi Vajda vajda at osafoundation.org
Mon May 24 13:45:38 PDT 2004

The following is a proposal being considered for implementation in the data
model. There are still a number of API design issues remaining to be
addressed by the semantically aware layers above the data model. More
proposals are in the works in that area.


Proposal for ref collection-based paths

1. Current item locators

   Natively, the repository supports two ways of finding any item: by
   Universally Unique IDentifier (UUID) and by Path. The UUID works as a
   unique and constant persistent pointer to an item that is valid for the
   life of the item, while the Path to an item is a sequence of intrinsically
   named items acting as containers describing the item's current location
   in the repository. Paths can be expressed relative to any item or can be
   rooted at the repository.

   There can be only one UUID leading to a given item and there can only be
   one absolute canonical path leading to a given item. Because of this
   unicity constraint, attaching domain-specific meaning to the canonical
   location of an item is bound to result in clashes, since an item can serve
   more than one domain-specific - semantic - purpose.

   The following is a proposal to implement an additional mechanism for
   locating items that is semantically rich and not constrained in the same
   way.

2. Definitions

   a. Bi-directional references

      A bi-directional reference between two items is defined by two
      attributes - the endpoints of the reference - each containing a
      pointer to the item being referenced.
      For example:

        Let 'A' and 'B' be two items and 'p' and 'q' be two attributes
        defined on 'A' and 'B' respectively. If 'q' is defined to be the
        'otherName' of 'p', and 'p' is defined to be the 'otherName' of 'q',
        the act of storing a reference to 'B' in 'p' automatically causes a
        reference to 'A' to be stored in 'q', namely:

          A.p = B  causes  B.q = A
          A.p.q == A
          B.q.p == B
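
      A minimal Python sketch of this behaviour, assuming a hypothetical
      Item class and a hard-coded 'otherName' table. Neither is the data
      model's actual API; they only illustrate the mirroring of endpoints.

```python
class Item:
    """Hypothetical stand-in for a repository item (assumption, not the real API)."""

    # Maps each endpoint attribute to its 'otherName'; assumed schema for this sketch.
    other_names = {'p': 'q', 'q': 'p'}

    def __init__(self, name):
        # Bypass __setattr__ for plain, non-endpoint state.
        object.__setattr__(self, 'name', name)

    def __setattr__(self, attr, other):
        object.__setattr__(self, attr, other)
        other_name = self.other_names.get(attr)
        if other_name is not None and other is not None:
            # Store the reverse pointer directly to avoid infinite recursion.
            object.__setattr__(other, other_name, self)


A, B = Item('A'), Item('B')
A.p = B    # causes B.q = A
```

      The sketch does not detach a previously referenced item when an
      endpoint is reassigned; a fuller implementation would presumably
      need to.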

   b. Ref collections

      A ref collection on an item is defined as an endpoint attribute on
      this item with a multi-valued cardinality. The cardinality of each
      endpoint is defined independently of the other. For example, the
      above 'p' attribute could be of 'list' cardinality while the
      corresponding 'q' attribute is of 'single' cardinality.
      For example:

          A.p.add(B)  causes  B.q = A
          A.p.get(B).q == A
          B.q.p.get(B) == B

      A ref collection is implemented as an ordered set of item references
      keyed on the UUIDs of the items being referenced. An item can only be
      referenced once in a given ref collection. The above get(B) operation
      is actually defined as get(B's UUID).
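
      The ordered, UUID-keyed set described above could be sketched in
      Python as follows. The Item and RefCollection classes are
      hypothetical, and raising ValueError on a duplicate reference is an
      assumption of this sketch, not something the proposal specifies.

```python
import uuid


class Item:
    """Hypothetical item with a UUID (assumption, not the repository's class)."""

    def __init__(self, name):
        self.name = name
        self.uuid = uuid.uuid4()
        self.q = None    # endpoint of 'single' cardinality


class RefCollection:
    """Ordered set of item references keyed on the referenced items' UUIDs."""

    def __init__(self, owner, other_name):
        self._owner = owner
        self._other_name = other_name
        self._refs = {}    # dict preserves insertion order (Python 3.7+)

    def add(self, item):
        if item.uuid in self._refs:
            # Assumed policy: an item can only be referenced once.
            raise ValueError('item is already referenced in this collection')
        self._refs[item.uuid] = item
        setattr(item, self._other_name, self._owner)    # A.p.add(B) causes B.q = A

    def get(self, key):
        # Accept an item for convenience; the lookup itself is by UUID.
        if isinstance(key, Item):
            key = key.uuid
        return self._refs[key]

    def __iter__(self):
        return iter(self._refs.values())


A, B, C = Item('A'), Item('B'), Item('C')
A.p = RefCollection(A, 'q')    # endpoint of 'list' cardinality
A.p.add(B)
A.p.add(C)
```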

   c. Ref aliases

      When stored in a ref collection, the reference to an item can be
      assigned a name - a ref alias - by which it can also be looked up in
      the ref collection. For example:

          A.p.add(B, 'b')  causes  B.q = A
          A.p.get('b').q == A
          B.q.p.get('b') == B
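
      A sketch of alias lookup, assuming a hypothetical RefCollection that
      keeps a second table from alias to UUID next to the UUID-keyed
      references. The class names and the error behaviour are assumptions
      made for illustration.

```python
import uuid


class Item:
    """Hypothetical item (assumption, not the repository's class)."""

    def __init__(self, name):
        self.name = name
        self.uuid = uuid.uuid4()


class RefCollection:
    """Sketch of a ref collection whose references may carry an alias."""

    def __init__(self, owner, other_name):
        self._owner = owner
        self._other_name = other_name
        self._refs = {}       # UUID -> item, in insertion order
        self._aliases = {}    # alias -> UUID

    def add(self, item, alias=None):
        if item.uuid in self._refs:
            raise ValueError('item is already referenced in this collection')
        if alias is not None and alias in self._aliases:
            raise ValueError('alias %r is already in use' % alias)
        self._refs[item.uuid] = item
        if alias is not None:
            self._aliases[alias] = item.uuid
        setattr(item, self._other_name, self._owner)

    def get(self, key):
        if isinstance(key, str):       # lookup by alias
            key = self._aliases[key]
        elif isinstance(key, Item):    # lookup by item, that is, by its UUID
            key = key.uuid
        return self._refs[key]


A, B = Item('A'), Item('B')
A.p = RefCollection(A, 'q')
A.p.add(B, 'b')    # causes B.q = A
```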

3. Ref collection-based path spaces

   A ref collection-based path space is defined by a well-known root and a
   directed pair - up, down - of endpoint attributes.
   For example:

     Let 'R' be a root and 'p' and 'q' be two endpoint attributes having
     each other as 'otherName'. Let also 'q' be of 'list' cardinality and
     be the ref collection used in the down direction of the path. The
     cardinality of the 'p' attribute, the up direction of the path, is
     irrelevant in this example; it could be 'single' or 'list'.
     Dereferencing the path /a/b/c/d in this space is equivalent to the
     following operation:
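
     As a sketch under the definitions of section 2, and assuming each path
     element is a ref alias looked up in the 'q' ref collection, the walk
     would read R.q.get('a').q.get('b').q.get('c').q.get('d'). The Python
     below models 'q' as a plain dict from alias to item; the Item class
     and the dereference() helper are hypothetical.

```python
class Item:
    """Hypothetical item whose 'q' endpoint is an alias-keyed ref collection."""

    def __init__(self, name):
        self.name = name
        self.p = None    # up direction ('single' here; not used by lookups)
        self.q = {}      # down direction: alias -> item

    def add(self, child, alias):
        self.q[alias] = child
        child.p = self


def dereference(root, path):
    """Walk the down endpoint 'q' one alias at a time, starting from the root."""
    item = root
    for alias in (part for part in path.split('/') if part):
        item = item.q[alias]    # plays the role of item.q.get(alias) in the text
    return item


R, a, b, c, d = (Item(n) for n in ('R', 'a', 'b', 'c', 'd'))
R.add(a, 'a')
a.add(b, 'b')
b.add(c, 'c')
c.add(d, 'd')
found = dereference(R, '/a/b/c/d')
```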


4. Properties of ref collection-based paths

   There can be a reference to a given item in any number of ref
   collections. Each of these references can be assigned its own name, a ref
   alias, independently of the others.

   Therefore, while each path leads to a different item reference, these
   item references may yield the same item and:

     (a) in a given ref collection-based path space, there can be any number of
         ref collection-based paths leading to a given item.

   Since each ref collection-based path space is defined by a well-known
   root and a pair of endpoint attributes defined in the data model's schema:

     (b) a given item may be referenced by a ref collection-based path in a
         given space if and only if it is of a kind declaring the attribute
         for the up direction of the directed endpoint pair.

     (c) a given item may act as a container in a given space if and only
         if it is of a kind declaring the attribute for the down direction
         of the directed endpoint pair.


     (1) a ref collection-based path space defines a semantic space in
         which items can be arranged in a hierarchy.

     (2) ref collection-based paths are semantically pure.
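
   Property (a) can be illustrated with a short Python sketch in which one
   item is referenced from two containers under different aliases. The Item
   class, link() and resolve() are hypothetical, and the up endpoint 'p' is
   assumed to be of 'list' cardinality so that an item can have several
   containers.

```python
class Item:
    """Hypothetical item with both endpoints of one path space."""

    def __init__(self, name):
        self.name = name
        self.p = []    # up direction, 'list' cardinality: containers of this item
        self.q = {}    # down direction: alias -> item


def link(container, item, alias):
    """Store a reference in the down endpoint and mirror it in the up endpoint."""
    container.q[alias] = item
    item.p.append(container)


def resolve(root, path):
    item = root
    for alias in (part for part in path.split('/') if part):
        item = item.q[alias]
    return item


R, a, b, x = Item('R'), Item('a'), Item('b'), Item('x')
link(R, a, 'a')
link(R, b, 'b')
link(a, x, 'report')    # first reference to x, with its own alias
link(b, x, 'todo')      # second reference to x, named independently
```

   Both /a/report and /b/todo name distinct references, yet they yield the
   same item.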

5. Implementation details

   a. Unicity

      (1) A reference to a given item is unique within a given ref collection
      (2) An alias to a reference is unique in a given ref collection

      While it would be difficult to lift the first constraint, the second
      one could be lifted. It is unclear whether this is necessary or
      desirable. Having a given ref collection-based path resolve to
      several item references instead of one would make exporting such paths
      more difficult.
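
      To make the consequence of lifting the second constraint concrete,
      here is a sketch in which aliases are not unique and a path resolves
      to a list of items. The Item class and resolve_all() are hypothetical
      and only illustrate the ambiguity; they are not a proposed design.

```python
from collections import defaultdict


class Item:
    """Hypothetical item whose down endpoint allows repeated aliases."""

    def __init__(self, name):
        self.name = name
        self.q = defaultdict(list)    # alias -> items sharing that alias


def resolve_all(root, path):
    """Return every item the path can lead to when aliases may repeat."""
    items = [root]
    for alias in (part for part in path.split('/') if part):
        # .get() avoids creating empty entries in the defaultdict.
        items = [child for item in items for child in item.q.get(alias, [])]
    return items


R = Item('R')
a1, a2 = Item('a1'), Item('a2')
x1, x2 = Item('x1'), Item('x2')
R.q['a'].extend([a1, a2])    # two references share the alias 'a'
a1.q['x'].append(x1)
a2.q['x'].append(x2)
matches = resolve_all(R, '/a/x')
```

      Here /a/x no longer identifies a single item reference, which is the
      difficulty for export noted above.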

   b. Well-known root

      It is desirable to define a way to cache the well-known root for a
      given ref collection-based path space. Currently, caching the root
      item on the attribute schema item for the down direction endpoint is
      being considered and reflects the attribute-centric approach used to
      define such path spaces.
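
      A sketch of the caching idea under consideration, assuming a
      hypothetical Attribute schema item with a path_root field. None of
      these names come from the data model; they only show how a path space
      could be identified by its down direction attribute alone.

```python
class Attribute:
    """Hypothetical attribute schema item for an endpoint."""

    def __init__(self, name, other_name, cardinality='single'):
        self.name = name
        self.other_name = other_name
        self.cardinality = cardinality
        self.path_root = None    # cached well-known root; set on a down endpoint


class Item:
    """Hypothetical item storing its ref collections per attribute name."""

    def __init__(self, name):
        self.name = name
        self.refs = {}    # attribute name -> {alias: item}


def resolve(down, path):
    """Resolve a path in the space identified by its down direction attribute."""
    item = down.path_root
    if item is None:
        raise LookupError('no well-known root cached on %r' % down.name)
    for alias in (part for part in path.split('/') if part):
        item = item.refs[down.name][alias]
    return item


q = Attribute('q', 'p', 'list')    # down direction endpoint
R, a = Item('R'), Item('a')
q.path_root = R                    # cache the root on the schema item
R.refs.setdefault('q', {})['a'] = a
```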
